
Data at Work: Supporting Sharing in Science and Engineering

Data at Work: Supporting Sharing in Science and Engineering (Birnholtz & Bietz, 2003). Adam Worrall, LIS 6269 Seminar in Information Science, 3/30/2010. Data and data sharing. Information science needs “a better understanding of the use of data in practice” (p. 339)


Presentation Transcript


  1. Data at Work: Supporting Sharing in Science and Engineering (Birnholtz & Bietz, 2003) Adam Worrall LIS 6269 Seminar in Information Science 3/30/2010

  2. Data and data sharing
     • Information science needs “a better understanding of the use of data in practice” (p. 339)
     • Data fundamentally “different from documents” (p. 339)
     • Data sharing important (pp. 339-340)
     • “Openness” of scientific process
     • Confirm findings, replicate results
     • Build on previous work
     • Large data sets require distributed collaboration
     • Collaboratories, e-science

  3. Data sharing problems
     • Collaborating and sharing of data should be encouraged
     • But it “is not easy” to do so (p. 340)
     • Why?
     • Lack of willingness to share, trust others
     • Competition for “revenue” (p. 345)
     • Restrictions imposed by commercial interests
     • Trust of sources
     • Trust of others; will they use data well? (see also Van House, 2003)

  4. Data sharing problems
     • Reasons (continued)
     • Problems with finding shared data
     • Negotiate access
     • Difficulties interpreting and using shared data
     • How collected?
     • How analyzed?
     • What format?
     • Metadata
     • Format, encoding, controlled vocabularies, etc.
     • Data quality (see also Stvilia et al., 2008; Wand & Wang, 1996)
     • “Tacit” knowledge of data (p. 340)

  5. Methodology
     • Three disciplines
     • Earthquake engineering
     • HIV / AIDS research
     • Space physics
     • Observation and interviews of all three, surveys of earthquake engineers
     • Inductive, grounded approach
     • Claimed they made “no assumptions about the purpose of data” (p. 340)

  6. Data dimensions
     • Two dimensions identified (p. 341)
     • “news” vs. “confirmation”
     • Confirm existing or expected results
     • Something unexpected needing further exploration
     • Something not fitting expected / prevailing model
     • “streams” vs. “events”
     • Longitudinal vs. cross-sectional
     • Context for data may change
     • Rate of data different
     • Different disciplines, different data use

  7. Data’s role in scientific communities
     • Defines boundaries between communities
     • Experimental, deductive
     • More possessive of data
     • Theoretical, inductive
     • More interested in sharing data
     • More interested in using shared data
     • Increasing blurring of boundaries in some fields
     • Provides gateway into communities
     • Access to data, knowledge about data is “valuable resource” (p. 343)
     • Those who control data and knowledge, and access to it, act as “gatekeepers of the field” (p. 343)

  8. Data’s role in scientific communities
     • Indicates status in community
     • Using one’s own data “seen as ‘better’” than using public data (p. 344)
     • “Analyzing somebody else’s data … arguably ‘counts’ for less” (p. 344)
     • Higher quality data means better reputation
     • For researchers, research groups, and institutions
     • Enables indoctrination into community
     • Students often work with collecting, managing data
     • Degree of sharing of responsibilities differs between fields, sometimes by seniority in field

  9. Categories of data uses (p. 345)
     • Identified with an eye to “revenue” from use
     • Benefits: reputation, publications, funding, etc.
     • “A scientist’s data set is her [or his] castle”
     • Researcher wants to and is able to use data to solve a particular problem or question
     • Will increase revenue
     • “With a little help from my friends”
     • Researcher wants to use data, but needs to collaborate with others in order to do so successfully
     • Data can be shared privately
     • Limited risk (but still some risk)
     • Will increase revenue

  10. Categories of data uses (p. 345)
      • “One scientist’s junk is another one’s treasure”
      • Researcher has no interest in using the data for a particular problem, but others do have interest
      • Sharing data will slightly increase revenue
      • May not be worth risk of losing other revenues
      • “D’oh!”
      • Researcher has not thought of a use, but it would be relevant to them and help them with a problem or question
      • Sharing data could be embarrassing, decrease revenue

  11. Categories of data use
      • Researchers will be less willing to share data unless incentives high, risks low
      • Data sharing follows social networks
      • Provide facilities for communication around abstractions of data sets
      • Encourage sharing and collaboration (category 2)
      • Extend researcher’s social network
      • Reduce risks of embarrassment (category 4)
      • Preliminary abstractions allow questions / comments before they are embarrassing
      • Increase incentives and benefits (categories 2 & 3)
      • Beyond boundaries of researcher’s community

  12. Recommendations and conclusions
      • Efforts to support “social interaction around data abstractions and the data themselves” should be made (p. 346)
      • Metadata should be augmented through “the sharing of supplementary materials” (i.e. abstractions) (p. 346)
      • Consideration of the “social and scientific roles of data” and how to support them necessary in future research (p. 346)
      • Better understanding of data abstractions needed (p. 347)

  13. Issues with study and article
      • Bias towards natural sciences
      • Social scientists may use, share data differently
      • Only three disciplines studied; others may differ further
      • Generally coherent, but some parts hard to follow
      • Indoctrination examples appeared similar, despite what authors termed “critical” distinction (p. 344)
      • Promised “three aspects of the way data are used” but only discussed two dimensions (p. 341)
      • Limitations only discussed briefly

  14. Questions, comments?
