
Research Impact Measurement in OpenAIRE 2020: Text Mining or CRISs?

Learn about the research impact measurement services in OpenAIRE 2020, whether based on text mining or on CRISs, and how they can enhance the visibility of and access to research information.


Presentation Transcript


  1. Data Archiving and Networked Services • Measurement of research impact in OpenAIRE 2020: via text mining or the CRISs? • Elly Dijk, Policy Advisor DANS, Project leader OpenAIRE2020 at DANS • EuroCRIS membership meeting, AMUE, Paris, 12 May 2015

  2. Outline • What is DANS? • Portal NARCIS • EU initiative OpenAIRE2020 • Task 8.3: Research Impact Services • Text mining open access content • Preferred solution?

  3. Data Archiving and Networked Services • Institute of the Dutch Academy and Research Funding Organisation (KNAW & NWO) since 2005 • First predecessor dates back to 1964 (Steinmetz Foundation), Historical Data Archive 1989 • Mission: promote and provide permanent access to digital research information • DataverseNL, EASY, NARCIS

  4. Content NARCIS = National aggregator + making Dutch research visible • CRIS information: research projects, researchers, scholarly institutes (including 8,400 projects financed by the National Research Funder NWO) • (Open access) publications from the repositories of Dutch universities, Netherlands Academy, NWO, and a number of research institutes • Research data from data archives, including EASY

  5. OpenAIRE2020 • Open Access Infrastructure for Research in Europe: Promotes Open Science • Funded by Horizon2020 to develop and maintain the infrastructure to support OA policy of the EU • Network of over 500 repositories and open access journals • Access to 11 million open access publications and 7,000 data sets, 58,000 organizations, and 30,000 projects of two research funders

  6. OpenAIRE2020 ambition • From a Repository Network to a Europe-wide Research Information System • Enhance interoperability of all research-cycle related resources: link content together (following a subset of CERIF) through its main entities: publications, research data, projects, people, organizations • Support H2020 OA mandates • 100% OA on scientific publications • Research Data Pilot • Implement Gold OA pilot • Establish OpenAIRE legal entity

  7. Task 8.3 Research Impact Services • Athena Research and Innovation Center (ARC - task leader), CNR (Italy) and DANS • Realization of services for measurement of research impact w.r.t. a research initiative • Such services will identify relationships between publications/datasets and a research initiative by text mining • Goal: visualizing statistics and measuring research impact over time • Use-case: pilots with selected National funding agencies e.g. Dutch NWO

  8. Text mining • To find NWO funding information in the publications that are already in OpenAIRE • This is done with text mining algorithms • The publications that are in OpenAIRE: from ArXiv, PMC Europe open set and the OpenAIRE compliant institutional repositories • Example: FCT - Fundação para a Ciência e a Tecnologia is the major Portuguese Science Funder

  9. FCT - Fundação para a Ciência e a Tecnologia is the major Portuguese Science Funder • Open access

  10. Information about the projects • PROJECT IDENTIFIER (MANDATORY) • PROJECT TITLE or ACRONYM (MANDATORY) • FUNDER NAME (MANDATORY) - NWO • START DATE (MANDATORY) • END DATE (MANDATORY) • FUNDING STREAM(S) (OPTIONAL) – funding categories for more detailed statistics • ORGANIZATION(S) INVOLVED (OPTIONAL)
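
The field list on this slide can be read as a simple record layout. The following is a minimal sketch, assuming Python dataclasses; the class name `ProjectRecord`, the attribute names, and the date-order check are illustrative assumptions, not part of the OpenAIRE or NARCIS specification.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class ProjectRecord:
    """Hypothetical container for the project fields listed on the slide."""
    project_id: str           # PROJECT IDENTIFIER (mandatory)
    title_or_acronym: str     # PROJECT TITLE or ACRONYM (mandatory)
    funder_name: str          # FUNDER NAME (mandatory), e.g. "NWO"
    start_date: date          # START DATE (mandatory)
    end_date: date            # END DATE (mandatory)
    funding_streams: List[str] = field(default_factory=list)  # optional
    organizations: List[str] = field(default_factory=list)    # optional

    def __post_init__(self) -> None:
        # Reject empty mandatory text fields.
        for name in ("project_id", "title_or_acronym", "funder_name"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} is mandatory")
        # Assumption: a project cannot end before it starts.
        if self.end_date < self.start_date:
            raise ValueError("end_date precedes start_date")


# Invented example values, for illustration only.
example = ProjectRecord("demo-001", "DEMO", "NWO",
                        date(2010, 1, 1), date(2014, 12, 31))
```

The optional fields default to empty lists, so a record with only the five mandatory fields is still valid.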

  11. First text mining • DANS sent 8,451 research projects financed by NWO from NARCIS to ARC in Athens • ARC did the text mining in ArXiv and PMC Europe publications • Later ARC will start text mining the Dutch repositories, starting with the University of Amsterdam
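
The slides do not describe ARC's text mining algorithms. As a rough illustration of the general idea only, the sketch below searches a publication's full text for known project identifiers; the function name `find_project_mentions` and the identifier values are invented for this example.

```python
import re
from typing import Iterable, Set


def find_project_mentions(text: str, identifiers: Iterable[str]) -> Set[str]:
    """Return the identifiers that occur in `text` as stand-alone tokens."""
    found = set()
    for ident in identifiers:
        # The lookarounds stop an identifier from matching inside a longer
        # number or code, e.g. "000.000.001" inside "1000.000.0012".
        pattern = r"(?<![\w.-])" + re.escape(ident) + r"(?![\w-])"
        if re.search(pattern, text):
            found.add(ident)
    return found


# Invented identifiers and acknowledgement text, for illustration only.
known_ids = ["000.000.001", "000.000.002"]
acknowledgement = "This work was supported by NWO grant 000.000.001."
mentions = find_project_mentions(acknowledgement, known_ids)
```

A plain substring search would be simpler but would produce false positives whenever an identifier happens to be embedded in a longer string, which is why the sketch anchors on token boundaries.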

  12. Results • 353 matches in Europe PMC (in 327 unique publications, i.e. some publications matched more than one NWO project) • 323 matches in ArXiv.org (in 286 unique publications) • Each match: project identifier plus URL of the publication in Europe PMC or ArXiv • But: in Europe PMC there are 900+ extra links to NWO that were not in the NARCIS list and appear to be valid matches • What to do next?
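
The difference between "matches" and "unique publications" in these figures comes from publications that cite more than one project. A minimal sketch of that bookkeeping follows, assuming the match output is a list of (project identifier, publication URL) pairs; the function name and the sample URLs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def summarize_matches(matches: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count matches, unique publications, and multi-project publications."""
    pairs = set(matches)  # drop exact duplicate pairs
    projects_by_publication = defaultdict(set)
    for project_id, publication_url in pairs:
        projects_by_publication[publication_url].add(project_id)
    return {
        "matches": len(pairs),
        "unique_publications": len(projects_by_publication),
        "multi_project_publications": sum(
            1 for ids in projects_by_publication.values() if len(ids) > 1
        ),
    }


# Invented sample: one publication acknowledges two projects.
sample = [
    ("demo-001", "https://example.org/pub/1"),
    ("demo-002", "https://example.org/pub/1"),
    ("demo-001", "https://example.org/pub/2"),
]
summary = summarize_matches(sample)
```

In the sample, three matches fall in two unique publications, the same kind of gap as the 353 matches in 327 publications reported above.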

  13. Second text mining • Problem: the first list partly contained NWO database identifiers instead of the “dossier numbers” known to the researchers • From NWO: a list of 5,000 research projects since 2006 with database identifiers and NWO dossier numbers • We matched this list with our list of 8,451 projects and sent it again for text mining to ARC • Outcome so far: only about 100 extra publications found in PMC and 14 extra publications in ArXiv!
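
Matching the two lists amounts to a join on the NWO database identifier. The sketch below assumes each list is a sequence of dictionaries; the key names `identifier`, `database_id`, and `dossier_number` are assumptions for illustration and do not reflect the actual NARCIS or NWO data formats.

```python
from typing import Dict, List, Optional


def add_dossier_numbers(
    narcis_projects: List[Dict[str, str]],
    nwo_rows: List[Dict[str, str]],
) -> List[Dict[str, Optional[str]]]:
    """Attach the NWO dossier number to each project where one is known."""
    dossier_by_db_id = {
        row["database_id"]: row["dossier_number"] for row in nwo_rows
    }
    enriched = []
    for project in narcis_projects:
        # Projects missing from the NWO list keep a dossier number of None.
        dossier = dossier_by_db_id.get(project["identifier"])
        enriched.append({**project, "dossier_number": dossier})
    return enriched


# Invented rows, for illustration only.
narcis = [{"identifier": "db-1"}, {"identifier": "db-2"}]
nwo = [{"database_id": "db-1", "dossier_number": "000.000.001"}]
enriched_projects = add_dossier_numbers(narcis, nwo)
```

Keeping unmatched projects in the output, rather than dropping them, makes it easy to see how much of the larger list the NWO list actually covers.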

  14. What to do next? • Try to find out what kind of identifiers there are in the 900+ extra links in PMC • Repeat the text mining with other identifiers? • Text mining the Dutch repositories • And for the future: Use the CRISs!

  15. Advantages of this pilot text mining • Connection between research projects of funders and open access publications becomes clear • Possible: measuring research impact and producing statistics and graphics • Improve NARCIS by adding NWO dossier numbers and the URLs of the publications to the project descriptions • NWO might improve its guidance to researchers on using the “dossier number”

  16. Disadvantages of text mining • You have to do it again and again and again • What are the right identifiers? • Where can you find the publications? • DANS and NWO want: • To contribute to a strengthening of the national research infrastructure • To make it easier for researchers to fill in their research information only once

  17. Future solution: Use the CRISs • NWO requires that the “dossier number” be stored in the CRIS for both projects and publications • Describe exchange format, using CERIF • Deliver all CRIS information to NARCIS • Make a technical connection in NARCIS • Make it easier for the researchers to give an overview of their ‘NWO’ publications • Deliver the information to OpenAIRE2020, or… • When it's done: the information can be used for measuring research impact in OpenAIRE

  18. Easy to realize? • No! • Necessary: Communication with NWO and the universities • Agreements on national exchange formats • Problem or right moment: most universities are busy migrating from Metis to Converis or Pure • But in the end everyone will benefit: • DANS/NARCIS + NWO + universities/researchers + international community

  19. Thank you for your attention • For more information please contact elly.dijk@dans.knaw.nl
