Combining minds: Harnessing social collaboration for sensemaking Aniket Kittur Ph.D. | UCLA Post-doc | Carnegie Mellon
7 4 Halford et al., 1998; Miller, 1956
Economy “financial products so complex that, to this day, few people understand how they work, or what the consequences of their imploding value will be.” Salon.com, 2008
Government "If a nation expects to be ignorant and free in a state of civilization, it expects what never was and never will be." Jefferson, 1816
Finding Filtering Understanding Integrating Deciding Pirolli & Card, 2005; Russell et al., 1993; Takayama & Card, 2008
Research overview Learning abstract concepts Modeling human memory Augmenting individual sensemaking
Research overview Understanding social collaboration Augmenting social collaboration systems
Large-scale social collaboration • Advantages • Solve problems too large for individual cognition • Work of individuals benefits the group • Aggregating decisions leads to better outcomes (Benkler, 2002; Golder & Huberman, 2006; Grudin, 1994; Raymond, 1999)
History Sir Francis Galton
Online collective intelligence • Predicting: Iowa Electronic Market • Filtering: Digg, Reddit • Organizing: del.icio.us • Recommending: Netflix, Amazon product reviews
Common assumptions • Independent judgments • Automatic aggregation
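The two assumptions above can be illustrated with a minimal sketch: each judge guesses on their own, and the guesses are combined by a fixed rule with no discussion. The numbers and the `aggregate` helper below are made up for illustration and do not come from the talk.

```python
from statistics import mean, median

def aggregate(judgments):
    """Combine independent numeric judgments with a fixed rule (no coordination)."""
    return {"mean": mean(judgments), "median": median(judgments)}

# Hypothetical guesses of a quantity whose true value is 100 (made-up data).
true_value = 100
guesses = [70, 85, 95, 100, 105, 120, 132]

crowd = aggregate(guesses)

# Compare the error of the aggregate with the average error of a single judge.
individual_errors = [abs(g - true_value) for g in guesses]
crowd_error = abs(crowd["mean"] - true_value)

print("crowd estimate:", crowd)
print("mean individual error:", mean(individual_errors))
print("crowd error:", crowd_error)
```

In this toy data the aggregate is closer to the true value than a typical individual guess, because over- and under-estimates partly cancel. Nothing in the sketch requires the judges to talk to one another.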
Complex information processing • Independent judgments and automatic aggregation are not enough • Scientists collaborating on a new discovery • Detectives cooperating to track a serial killer • Volunteers writing an encyclopedia • Need to coordinate, build consensus • Coordination is the norm, not the exception
Research question How do we harness the power of the crowd for complex tasks that involve coordination?
Why study Wikipedia? • May have thousands of individuals involved in a single sensemaking task • Integrating many conflicting sources into an article • Many tasks require high coordination • Planning an article • Building consensus on what should be included • Organizing and structuring • Resolving conflicts • Achieving neutral point of view • Full history available (200+ million edits, 2.5+ TB)
Roadmap • Understanding coordination • Characterizing coordination [CHI 07] • Coordination and quality [CSCW 08] • Augmenting social collaboration • Conflict [CHI 07][VAST 07] • Trust [CHI 08][CSCW 08] • Future directions Collaborators: Robert Kraut (CMU), Bryant Lee (CMU), Bryan Pendleton (CMU), Ed Chi (PARC), Bongwon Suh (PARC)
Coordination costs • Increasing contributors incurs process losses (Boehm, 1981; Steiner, 1972) • Diminishing returns with added people (Hill, 1982; Sheppard, 1993) • Super-linear increase in communication pairs • Linear increase in added work • In the extreme, costs may exceed benefits to quality (Brooks, 1975) • The better coordination is supported, the greater the benefit from adding people • “Adding manpower to a late software project makes it later” (Brooks, 1975)
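The contrast between super-linear communication pairs and linear added work can be made concrete with a small calculation. This sketch assumes every pair of contributors may need to communicate, so the number of pairs is n(n-1)/2, and that each person adds one unit of work. Both are simplifying assumptions, not figures from the talk.

```python
def communication_pairs(n):
    """Number of distinct pairs among n contributors: n * (n - 1) / 2."""
    return n * (n - 1) // 2

def added_work(n, work_per_person=1):
    """Total work contributed if each person adds a fixed amount (assumed)."""
    return n * work_per_person

# Work grows in proportion to n; potential communication pairs grow with n squared.
for n in (2, 5, 10, 50, 100):
    print(f"{n:>4} people: work={added_work(n):>4}  pairs={communication_pairs(n):>5}")
```

Going from 10 to 100 people multiplies the work by 10 but the potential communication pairs by 110, which is the intuition behind the diminishing returns cited above.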
Research question To what degree are editors in Wikipedia working independently versus coordinating?
Research infrastructure • Analyzed entire history of Wikipedia • Every edit to every article • Large dataset (as of 2008) • 10+ million pages • 200+ million revisions • 2.5+ TB • Used distributed processing • Hadoop distributed filesystem • Map/reduce to process data in parallel • Reduced analysis time from weeks to hours
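The map/reduce pattern mentioned above can be sketched in plain Python. This is not the Hadoop API and not the pipeline used in the study. The record fields (`year`, `namespace`) and the function names are hypothetical, and the shuffle phase that Hadoop performs across machines is simulated here with an in-memory dictionary.

```python
from collections import defaultdict

def map_revision(rev):
    """Map step: emit one ((year, namespace), 1) pair per revision record."""
    yield (rev["year"], rev["namespace"]), 1

def reduce_counts(key, values):
    """Reduce step: sum all counts emitted for a single key."""
    return key, sum(values)

def run_mapreduce(revisions):
    """Single-process stand-in for map -> shuffle -> reduce."""
    groups = defaultdict(list)  # simulates the shuffle: group values by key
    for rev in revisions:
        for key, value in map_revision(rev):
            groups[key].append(value)
    return dict(reduce_counts(k, v) for k, v in groups.items())

# Made-up revision records for illustration only.
revisions = [
    {"year": 2005, "namespace": "article"},
    {"year": 2005, "namespace": "article"},
    {"year": 2005, "namespace": "user_talk"},
    {"year": 2006, "namespace": "article"},
]

counts = run_mapreduce(revisions)
print(counts)
```

Because each map call looks at one revision and each reduce call looks at one key, the same logic can be spread over many machines, which is what makes the approach suitable for a revision history of this size.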
Types of work • Direct work: editing articles • Indirect work: user talk, creating policy • Maintenance work: reverts, vandalism
Less direct work • Decrease in proportion of edits to article page 70%
More indirect work • Increase in proportion of edits to user talk 8%
More indirect work • Increase in proportion of edits to user talk • Increase in proportion of edits to policy pages 11%
More maintenance work • Increase in proportion of edits that are reverts 7%
More wasted work • Increase in proportion of edits that are reverts • Increase in proportion of edits reverting vandalism 1-2%
Global level • Coordination costs are growing • Less direct work (articles) • More indirect work (article talk, user, procedure) • More maintenance work (reverts, vandalism) Kittur, Suh, Pendleton, & Chi, 2007
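A minimal sketch of how shares of direct, indirect, and maintenance work might be computed from labelled edits. The `WORK_TYPE` mapping is a simplification assumed for illustration (in practice a revert is itself an edit to some page), and the two edit samples are made-up data, not the Wikipedia measurements.

```python
from collections import Counter

# Hypothetical mapping from edit kind to the work types named in the slides.
WORK_TYPE = {
    "article": "direct",
    "user_talk": "indirect",
    "policy": "indirect",
    "revert": "maintenance",
}

def work_shares(edit_kinds):
    """Return the fraction of edits falling into each work type."""
    counts = Counter(WORK_TYPE[kind] for kind in edit_kinds)
    total = sum(counts.values())
    return {work_type: n / total for work_type, n in counts.items()}

# Made-up samples standing in for an early and a later period.
early = ["article"] * 9 + ["user_talk"]
late = ["article"] * 7 + ["user_talk", "policy", "revert"]

print("early:", work_shares(early))
print("late: ", work_shares(late))
```

Comparing the shares across periods, rather than raw edit counts, is what lets a shift toward indirect and maintenance work show up even while total activity is growing.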
Roadmap • Understanding coordination • Characterizing coordination [CHI 07] • Coordination and quality [CSCW 08] • Augmenting social collaboration • Conflict [CHI 07][VAST 07] • Trust [CHI 08][CSCW 08] • Future directions Collaborators: Robert Kraut (CMU), Bryant Lee (CMU), Bryan Pendleton (CMU), Ed Chi (PARC), Bongwon Suh (PARC)
Coordination types • Explicit coordination • Direct communication among editors planning and discussing article • Implicit coordination • Division of labor and workgroup structure • Concentrating work in core group of editors Leavitt, 1951; March & Simon, 1958; Malone, 1987; Rouse et al., 1992; Thompson, 1967
Explicit coordination: “Music of Italy” readability
Coordination types • Explicit coordination • Direct communication among editors planning and discussing article • Implicit coordination • Division of labor and workgroup structure • Concentrating work in core group of editors Leavitt, 1951; March & Simon, 1958; Malone, 1987; Rouse et al., 1992; Thompson, 1967
Implicit coordination: “Music of Italy” TUF-KAT: Set scope and structure
Implicit coordination: “Music of Italy” Filling in by many contributors
Implicit coordination: “Music of Italy” Restructuring by Jeffmatt
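One way the concentration of work in a core group of editors could be quantified is with an inequality measure over edits per editor. The sketch below uses the Gini coefficient. The slides do not name a metric, so this choice is an assumption, and the edit counts are made up.

```python
def gini(edit_counts):
    """Gini coefficient of edits per editor: 0 = evenly spread, near 1 = concentrated."""
    xs = sorted(edit_counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-weighted formula on the sorted values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return (2 * weighted) / (n * total) - (n + 1) / n

# Made-up edit counts per editor for two hypothetical articles.
evenly_spread = [5, 5, 5, 5]
core_editor = [0, 0, 0, 20]

print("evenly spread:", gini(evenly_spread))
print("one core editor:", gini(core_editor))
```

An article where a few editors set the scope and do most of the restructuring would score high on such a measure, while one filled in by many equal contributors would score low.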
Research question • What factors lead to improved quality? • Adding editors • Explicit coordination (communication) • Implicit coordination (concentration)
Wilkinson & Huberman, 2007 • Examined featured articles vs. non-featured articles • Controlling for PageRank (i.e., popularity) • Featured articles = more edits, more editors • More work, more people => better outcomes
Difficulties with generalizing results • Cross-sectional analysis • Reverse causation: articles that become featured may subsequently attract more people • Coarse quality metrics • Fewer than 2,000 out of >2,000,000 articles are featured • Stringent, non-representative peer-review process