ICPSR’s Approach to Data Citation and Persistent Identifiers. Mary Vardigan Assistant Director, ICPSR Workshop on Persistent Identifiers in the Social Sciences -- Bonn, Germany February 1, 2011. ICPSR’s use of data citations and persistent identifiers Ways that ICPSR encourages good practice
ICPSR’s Approach to Data Citation and Persistent Identifiers Mary Vardigan, Assistant Director, ICPSR Workshop on Persistent Identifiers in the Social Sciences -- Bonn, Germany, February 1, 2011
ICPSR’s use of data citations and persistent identifiers Ways that ICPSR encourages good practice Issues to be resolved Future directions Today’s Presentation
ICPSR has been providing citations to its data since 1990 Citations based on “Cataloging Machine-Readable Data Files” by Sue Dodd, American Library Association, 1982 ICPSR’s Use of Citations
Content Creator/Principal Investigator Title Distributor [ICPSR] Distribution place and date ICPSR study number Version number Materials designation [Computer file] DOI What Makes Up an ICPSR Citation?
Schneider, Barbara, and Linda J Waite. The 500 Family Study [1998-2000: United States] [Computer file]. ICPSR04549-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2008-05-30. doi:10.3886/ICPSR04549 Example
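The citation elements listed above can be assembled mechanically. The following is a minimal Python sketch, not ICPSR's own code: the class name `StudyCitation` and its field names are hypothetical, and the punctuation is inferred only from the single example citation shown.

```python
from dataclasses import dataclass


@dataclass
class StudyCitation:
    """Hypothetical container for the elements of an ICPSR-style citation."""
    creators: str      # pre-formatted creator / principal investigator names
    title: str
    study_id: str      # study number plus version, e.g. "ICPSR04549-v1"
    place: str         # distribution place
    distributor: str
    date: str          # distribution date
    doi: str
    designation: str = "Computer file"  # materials designation

    def format(self) -> str:
        # Element order and punctuation follow the example citation only.
        return (
            f"{self.creators}. {self.title} [{self.designation}]. "
            f"{self.study_id}. {self.place}: {self.distributor} [distributor], "
            f"{self.date}. doi:{self.doi}"
        )


citation = StudyCitation(
    creators="Schneider, Barbara, and Linda J Waite",
    title="The 500 Family Study [1998-2000: United States]",
    study_id="ICPSR04549-v1",
    place="Ann Arbor, MI",
    distributor="Inter-university Consortium for Political and Social Research",
    date="2008-05-30",
    doi="10.3886/ICPSR04549",
)
print(citation.format())
```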
ICPSR started assigning DOIs in 2008 DOIs apply at the study or collection level (a study can have multiple datasets) DOIs are of the form: doi:10.3886/ICPSR04549 DOIs resolve to the study homepage (metadata record) ICPSR’s Use of DOIs
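As an illustration of the DOI form described above, this Python sketch derives a study-level DOI from a study identifier such as ICPSR04549-v1. The function names are hypothetical, the identifier pattern is inferred from the one example in these slides, and the resolver address assumes the public doi.org proxy.

```python
import re

ICPSR_PREFIX = "10.3886"  # DOI prefix taken from the example DOI


def parse_study_id(study_id: str) -> tuple[str, int]:
    """Split an identifier like 'ICPSR04549-v1' into study number and version."""
    match = re.fullmatch(r"ICPSR(\d+)-v(\d+)", study_id)
    if match is None:
        raise ValueError(f"unrecognized study identifier: {study_id!r}")
    return match.group(1), int(match.group(2))


def study_doi(study_number: str) -> str:
    """Build the study-level DOI; the version is not part of this form."""
    return f"{ICPSR_PREFIX}/ICPSR{study_number}"


def resolver_url(doi: str) -> str:
    """Assumption: the DOI is resolved through the doi.org proxy."""
    return f"https://doi.org/{doi}"


number, version = parse_study_id("ICPSR04549-v1")
doi = study_doi(number)
print(doi, version, resolver_url(doi))
```

Because the DOI applies at the study level, every dataset and version of study 04549 maps to the same DOI in this sketch.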
ICPSR uses the CrossRef service, “the official DOI® link registration agency for scholarly and professional publications” ICPSR pays a modest annual Publisher Fee (based on publishing revenues) and pays 6 cents per DOI To begin assigning DOIs, in 2008 ICPSR sent CrossRef an XML file containing metadata on all 7,000+ ICPSR studies New DOIs are now obtained weekly How ICPSR Obtains DOIs
ICPSR runs script to create XML metadata in CrossRef format: Contributors and their roles Title Publication date Update date Study number DOI URL Weekly Process
ICPSR submits XML file to register new DOIs CrossRef sends email confirming the file is correct At that point, the DOI has an associated URL on the ICPSR Web site Weekly Process, continued
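The weekly metadata step could look roughly like the Python sketch below. It is only an illustration: the element names (`deposit`, `study`, `contributor`, and so on) are invented for readability and are not the CrossRef deposit schema, and the URL and update date in the sample record are placeholders.

```python
import xml.etree.ElementTree as ET


def build_deposit(studies: list[dict]) -> str:
    """Serialize study records to XML using illustrative element names."""
    root = ET.Element("deposit")
    for study in studies:
        record = ET.SubElement(root, "study", number=study["number"])
        contributors = ET.SubElement(record, "contributors")
        for name, role in study["contributors"]:
            ET.SubElement(contributors, "contributor", role=role).text = name
        for field in ("title", "publication_date", "update_date", "doi", "url"):
            ET.SubElement(record, field).text = study[field]
    return ET.tostring(root, encoding="unicode")


sample = {
    "number": "04549",
    "contributors": [
        ("Schneider, Barbara", "Principal Investigator"),
        ("Waite, Linda J", "Principal Investigator"),
    ],
    "title": "The 500 Family Study [1998-2000: United States]",
    "publication_date": "2008-05-30",
    "update_date": "2008-05-30",                 # placeholder value
    "doi": "10.3886/ICPSR04549",
    "url": "https://example.org/studies/04549",  # placeholder study homepage
}
xml_text = build_deposit([sample])
print(xml_text)
```

A real submission would follow the registration agency's published schema and would be generated from the archive's database rather than from a hand-written record.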
Registration could happen in a script-driven manner through an API This would happen without human intervention ICPSR database could communicate with the CrossRef database with DOIs registered automatically Alternative Process
Journals are requiring that authors provide PIDs to data they analyzed for their articles Authors are coming to ICPSR for DOIs pre-publication, generally depositing data into the Publication-Related Archive Requests for DOIs
Bibliography of Data-Related Literature includes 60,000 citations to publications based on ICPSR data Two-way linking: Studies link to publications, Bibliography links back to studies Widely used DOIs for data would make searching for and harvesting related publications much easier Encouraging Good Practice
ICPSR provides RIS export for data citations into bibliographic citation software ICPSR highlights the data citation and DOI in several places Making Citations and DOIs More Prominent
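For readers unfamiliar with RIS, a data citation exported in that format might be produced as in the following Python sketch. The mapping of citation elements to RIS tags (`TY`, `AU`, `TI`, `PY`, `PB`, `CY`, `DO`, `ER`) is an assumption made for illustration and may differ from what ICPSR's export actually emits; `to_ris` is a hypothetical helper.

```python
def to_ris(record: dict) -> str:
    """Render one data citation as an RIS record (illustrative tag mapping)."""
    lines = ["TY  - DATA"]                       # RIS reference type for data files
    for author in record["authors"]:
        lines.append(f"AU  - {author}")          # one AU line per author
    lines.append(f"TI  - {record['title']}")
    lines.append(f"PY  - {record['year']}")
    lines.append(f"PB  - {record['distributor']}")
    lines.append(f"CY  - {record['place']}")
    lines.append(f"DO  - {record['doi']}")
    lines.append("ER  - ")                       # end-of-record marker
    return "\n".join(lines) + "\n"


example = {
    "authors": ["Schneider, Barbara", "Waite, Linda J"],
    "title": "The 500 Family Study [1998-2000: United States]",
    "year": "2008",
    "distributor": "Inter-university Consortium for Political and Social Research",
    "place": "Ann Arbor, MI",
    "doi": "10.3886/ICPSR04549",
}
ris_text = to_ris(example)
print(ris_text)
```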
ICPSR has a project with Thomson Reuters to display data linkages in Web of Knowledge Full and summary records in Web of Knowledge will link to related data when appropriate ICPSR is providing a periodic data feed of datasets and related publications to TR TR is integrating data feeds from others including UK Data Archive Working with Vendors to Promote Links to Data
On behalf of the Data-PASS partners, ICPSR wrote to professional associations in sociology, political science, and economics Letters urged them to raise the standards for data citations in their journals Professional associations are in a position to set standards for their members and for journal editors (including copy editors) Influencing Journals
Approach was to point to the variety of ways that data were cited in specific journal issues The letter stressed the importance of citing data the same way that publications are cited and the value of persistent identifiers Organizations discussed the letters at recent national meetings American Sociological Review just revised its Notice to Contributors to reflect the importance of data citations and DOIs More on Influencing Journals
ICPSR worked with EndNote (owned by Thomson Reuters) to ensure that data citations display correctly The result is that “Dataset” is now a Reference Type in EndNote. Zotero also needs adjustment for datasets Updating Citation Software
ICPSR has joined DataCite as an associate member ICPSR has joined ORCID – Open Researcher and Contributor ID. ORCID aims to create a central registry of unique identifiers for individual researchers ICPSR is heading up an IASSIST special interest group on data citation (SIGDC) Working with the Community
IASSIST SIGDC has proposed a session as part of a data citation track including DataCite: Tracking Data Reuse: Motivations, Methods, and Obstacles -- Heather Piwowar, NESCent, University of British Columbia Building Data Citations for Discovery – Hailey Mooney, Michigan State University, and Mark Newton, Purdue University ICPSR’s Efforts to Encourage Data Citation -- Elizabeth Moss, Inter-university Consortium for Political and Social Research (ICPSR) Reactor Panel from SIGDC IASSIST Session
With the community, address situations when data resources have multiple distributors (and multiple DOIs) Implement versioning in DOIs Address level of granularity for DOIs Move to DataCite Issues to Resolve
Eurobarometer 72.2 (Nuclear Energy, Corruption, Gender Equality, Healthcare, and Civil Protection) DOI: doi:10.4232/1.10009 Principal Investigator: Antonis Papacostas Publication Agent: GESIS - Leibniz-Institut für Sozialwissenschaften Papacostas, Antonis. Eurobarometer 72.2: Nuclear Energy, Corruption, Gender Equality, Healthcare, and Civil Protection, September-October 2009 [Computer file]. ICPSR28186-v1. Cologne, Germany: GESIS/Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributors], 2010-07-19. doi:10.3886/ICPSR28186 Multiple DOIs for “Same” Data
“CrossRef only registers DOIs for Definitive Works… but not for Duplicative Works, as defined in the CrossRef Glossary. …CrossRef does not permit multiple DOIs to be assigned to certain closely related versions of a work… Where a CrossRef member has content which is substantially Duplicative of Definitive Works, the member must … retrieve the DOIs of Definitive Works for display in such substantially Duplicative Works and must link from the substantially Duplicative Works to the Definitive Works.” From CrossRef’s Publisher Rules:
CrossRef policy oriented toward publications not data Arrangement between ICPSR and GESIS is clear, but there are other co-distributor relationships How much of a problem is this and can we develop a community solution? Can we use the DataCite metadata kernel (relationType) to specify relationships? Would providing explanatory text and cross-referencing DOIs in archives’ metadata records be useful? More on Multiple DOIs
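To make the relationType question concrete, the Python sketch below builds a DataCite-style `relatedIdentifier` fragment that points the ICPSR copy of Eurobarometer 72.2 at the GESIS DOI. It is a fragment, not a complete or schema-validated DataCite record, and the relation value `IsIdenticalTo` is an assumption chosen for illustration: it is defined in later versions of the DataCite Metadata Schema, and which value best describes a co-distributor relationship is exactly the open question raised above.

```python
import xml.etree.ElementTree as ET


def related_identifier(doi: str, relation_type: str) -> str:
    """Build a relatedIdentifiers fragment linking one DOI to another."""
    container = ET.Element("relatedIdentifiers")
    item = ET.SubElement(
        container,
        "relatedIdentifier",
        relatedIdentifierType="DOI",
        relationType=relation_type,  # assumption: caller picks a valid value
    )
    item.text = doi
    return ET.tostring(container, encoding="unicode")


# Fragment that would sit in the metadata for doi:10.3886/ICPSR28186.
fragment = related_identifier("10.4232/1.10009", "IsIdenticalTo")
print(fragment)
```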
ICPSR has decided to add version numbers to its DOIs ICPSR may not have previous versions online User will have to contact ICPSR for access So far the number of users requesting older versions has been very small Versioning and DOIs
ICPSR’s current practice is to assign the DOI at the study level DOI resolves to the study homepage, which includes Version History detailing changes to all files in the collection Assigning dataset-level DOIs is a challenge because ICPSR has over 65,000 datasets ICPSR is undertaking a large project to revamp archival management and dataset-level DOIs will be integrated in the new infrastructure Level of Granularity for DOIs
DataCite offers several advantages because of its focus on data Metadata kernel more robust and intended to describe data Community of trusted data centers is a shared goal Moving to DataCite for DOIs
Address situations when data resources have multiple distributors and multiple DOIs Approach other vendors including Google Scholar after TR service deployed Contact other professional associations and journals Work with other data producers on providing visible citations and DOIs and encouraging their use Continue spreading the word about data citation and persistent identifiers! Future Directions
Questions? Thank you…