A New Approach for Automated Author Discipline Categorization and Evaluation of Cross-Disciplinary Collaborations for Grant programs. Ilya Ponomarev 1 , Pawel Sulima 1 , Jodi Basner 1 , Unni Jensen 1 , Joshua Schnell 1 , Karen Jo 2 , and Nicole Moore 2. ilya.ponomarev@thomsonreuters.com.
A New Approach for Automated Author Discipline Categorization and Evaluation of Cross-Disciplinary Collaborations for Grant Programs
Ilya Ponomarev¹, Pawel Sulima¹, Jodi Basner¹, Unni Jensen¹, Joshua Schnell¹, Karen Jo², and Nicole Moore²
ilya.ponomarev@thomsonreuters.com
¹ Custom Analytics, Rockville, MD
² National Cancer Institute, Bethesda, MD
10/16/2013 5:30 PM

Why Cross-disciplinary Research?
“Interdisciplinary research can be one of the most productive and inspiring of human pursuits”
Facilitating Interdisciplinary Research, National Academy of Sciences, 2005
• Innovation increasingly occurs at the boundaries of disciplines
• Complex “puzzles” require diverse backgrounds
• The data avalanche from multiple sources requires fusion of information
• Convergent technologies require integration across disciplines

US Government Funding of Cross-disciplinary R&D
[Figure: funding agencies shown: DOD, DOE, NSF, NIH, NASA]

How to Measure Success of a Cross-disciplinary Program?
See also J. Basner, “Evaluating Collaboration and Outcomes of Health Research”, Friday, 10/18/2013, 11:00am at Gunston East Rm
THIS TALK:
• To measure cross-disciplinarity, define disciplines as accurately as possible
• A general approach for automatically assigning grant-specific categories to papers and people
• Application to NCI PS-OC grant program classification

NCI Physical Sciences-Oncology Centers
09/2009-Current: 12 centers, 250 researchers
[Diagram labels: Institute, Facilitate, Generate]

Evaluation: Bird's-Eye View
[Diagram labels: productivity, Fields convergence, collaboration, impact]
See J. Basner, Friday, 10/18/2013
• Use publications as a proxy of outcome
• Compare baseline data set (2006-2008) with ongoing research data set (2009-2012)
• Sources: Web of Science + Medline
• 2009-2012: 601 reported pubs
• 2006-2008: 3,367 pubs
• 166 active PS-OC investigators
• 202,000 references
• 4,199 journal titles

Evaluation: Bird's-Eye View
PS-OC 2/3 broad categories: Oncology, Life Sciences, Physical Sciences
Approach:

266 Web of Science Journal Subject Categories (SCs)
PS-OC 3 broad categories: Oncology, Life Sciences, Physical Sciences
• Has an Oncology SC
• Multiple SCs per journal (up to 7)
• Has a Multidisciplinary SC (meaningless for assignment, but it covers “Science” and “Nature”)
• Some SCs are already interdisciplinary
• LS dominates after aggregation

22 ESI Subject Categories
• One SC per journal
• Does not have Oncology
• A Multidisciplinary SC exists here too
• Clinical medicine?
• LS dominates after aggregation

Mapping. Challenges
Web of Science 22 Broad ESI categories:
• One SC per journal
• Does not have Oncology
• A Multidisciplinary SC exists here too
• Clinical medicine?
• LS dominates after aggregation
Web of Science 266 Journal SCs:
• Has an Oncology SC
• Multiple SCs per journal
• Multidisciplinary
• Some SCs are interdisciplinary
• LS dominates after aggregation
PS-OC 3 broad categories: Oncology, Life Sciences, Physical Sciences
Approach:
• Intermediate mapping onto an extended set of 6 broad categories
• Paper-level SC assignment based on references

Step 1. Introduce 6 Intermediate PS-OC Categories for Better Selection:
• PS – Physical Sciences
• LS – Life Sciences
• OC – Oncology
• MED – Medicine (very often MED journals are closer to OC than LS)
• OTH – Others
• MULT – Multidisciplinary
The intermediate categories beyond PS, LS, and OC will be dropped at the final stage.

Step 2. Map 265 WoS JSC to 6 PS-OC Categories:
Examples:
a) Obvious: Acoustics → PS; Chemistry, Analytical → PS; Oncology → OC; Management → OTH
b) Dominant: Biophysics → PS
c) Dominant: Physics, Multidisciplinary → PS
d) Meaningless: Multidisciplinary → MULT (usually published in “Nature”, “Science” or “PNAS”)
The MULT category is meaningless for assigning a PS-OC category: an article published in a MULT journal can be about PS, LS, or OC, and usually it is not an interdisciplinary article. The article's research field must therefore be reclassified based on its references.

Step 3. Assign PS-OC Category Weights to Each Journal
Each journal should be counted equally. (Journals in WoS can have 1, 2, 3, … even 7 SCs.)
Procedure: map SCs → select distinct PS-OC categories → count total (denominator) → weights
Example: journal “Radiation Research” has 3 SCs:
• Biology → LS
• Biophysics → PS
• Radiology, NM → PS
Distinct PS-OC categories: LS, PS (count = 2)
Weights: LS=1/2, PS=1/2, OC=0, MED=0, MUL=0, OTH=0

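The Step 3 procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SC_TO_PSOC` dictionary is a hypothetical excerpt containing just the mappings quoted on the slides, and the function name `journal_weights` is ours, not the authors'.

```python
from fractions import Fraction

CATEGORIES = ("LS", "PS", "OC", "MED", "MUL", "OTH")

# Hypothetical excerpt of the SC -> PS-OC map; only entries quoted on the slides.
SC_TO_PSOC = {
    "Biology": "LS",
    "Biophysics": "PS",
    "Radiology, NM": "PS",
    "Oncology": "OC",
    "Multidisciplinary": "MUL",
}

def journal_weights(subject_categories):
    """Split one unit of weight equally among the distinct PS-OC categories."""
    distinct = {SC_TO_PSOC[sc] for sc in subject_categories}
    if not distinct:
        raise ValueError("journal has no subject categories")
    share = Fraction(1, len(distinct))  # denominator = number of distinct categories
    return {cat: share if cat in distinct else Fraction(0) for cat in CATEGORIES}

radiation_research = journal_weights(["Biology", "Biophysics", "Radiology, NM"])
print(radiation_research["LS"], radiation_research["PS"])  # prints: 1/2 1/2
```

Because the two PS subject categories collapse into one distinct category, the journal contributes LS=1/2 and PS=1/2 rather than 1/3 and 2/3, and every journal's weights sum to 1.
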
Step 4. Calculate combined J-R weights for publications
A paper's field is assigned more accurately by using information about what the paper cites.
Example: Coffey D., Getzenberg R., JAMA, 2006. 1 journal category (MED=1), 26 refs.

Category   Journal weights   Aver. refs weights   ½ (Journal + Refs)
LS         0                 0.23                 0.12
PS         0                 0.04                 0.019
MED        1                 0.17                 0.58
OC         0                 0.36                 0.18
MUL        0                 0.19                 0.1
OTH        0                 0                    0

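A minimal sketch of the Step 4 combination follows. It assumes each reference is represented by the PS-OC weights of its own journal, which the slide does not state explicitly, and the function names are hypothetical. The inputs are the rounded values printed on the slide, so the result matches the slide's last column only up to rounding.

```python
CATEGORIES = ("LS", "PS", "OC", "MED", "MUL", "OTH")

def average_weights(weight_dicts):
    """Average a list of category-weight dicts (e.g. one per cited reference)."""
    n = len(weight_dicts)
    return {c: sum(w.get(c, 0.0) for w in weight_dicts) / n for c in CATEGORIES}

def combined_jr_weights(journal, ref_average):
    """Combined J-R weight: 1/2 * (journal weight + average reference weight)."""
    return {c: 0.5 * (journal.get(c, 0.0) + ref_average.get(c, 0.0))
            for c in CATEGORIES}

# Values from the slide's JAMA example (journal category MED=1).
journal = {"MED": 1.0}
ref_average = {"LS": 0.23, "PS": 0.04, "MED": 0.17,
               "OC": 0.36, "MUL": 0.19, "OTH": 0.0}

combined = combined_jr_weights(journal, ref_average)
print({c: round(v, 3) for c, v in combined.items()})
```

For this paper the references pull half of the weight away from MED, so a paper in a general medical journal still registers a visible OC and LS component.
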
Step 5. Collect all publications for each investigator, calculate average weights, and rank PS-OC categories
Person Inter-disciplinarity
Example: DavidA, 8 pubs, average J-R weights

Category   Averaged J-R weights   Rank
LS         0.32                   2
PS         0.04                   4
MED        0.23                   3
OC         0.41                   1
OTH        0.01                   5

[Unlabeled slide callout: 3]

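Step 5 amounts to averaging the per-publication J-R weights and sorting the categories. The sketch below uses hypothetical function names, and its tie-breaking (ties keep input order) is an assumption, since the slide does not say how ties are handled.

```python
def average_pub_weights(pub_weights):
    """Average J-R weights over all of an investigator's publications."""
    cats = sorted({c for w in pub_weights for c in w})
    n = len(pub_weights)
    return {c: sum(w.get(c, 0.0) for w in pub_weights) / n for c in cats}

def rank_categories(avg):
    """Rank 1 = largest average weight; ties keep their input order."""
    ordered = sorted(avg, key=avg.get, reverse=True)
    return {c: i + 1 for i, c in enumerate(ordered)}

# Averaged J-R weights from the slide's example investigator (8 pubs).
avg = {"LS": 0.32, "PS": 0.04, "MED": 0.23, "OC": 0.41, "OTH": 0.01}
ranks = rank_categories(avg)
print(ranks)
```

With the slide's averages this yields OC=1, LS=2, MED=3, PS=4, OTH=5, matching the ranks shown in the example.
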
Step 6. Redistribute MED and OTH weights between OC, LS, and PS

Category   Before   After
LS         0.32     0.4
PS         0.04     0.05
MED        0.23     (redistributed)
OC         0.41     0.55
OTH        0.01     (redistributed)

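The slide does not say how the MED and OTH weights are redistributed. One simple possibility, shown here purely as an assumption, is proportional renormalization over the three final categories. It gives roughly LS=0.42, PS=0.05, OC=0.53, which is close to but not identical to the slide's 0.4, 0.05, 0.55, so the authors' actual rule evidently differs.

```python
FINAL_CATEGORIES = ("LS", "PS", "OC")

def redistribute_proportionally(weights):
    """Drop non-final categories and rescale the rest to sum to 1.

    Assumption: proportional renormalization. This is NOT confirmed by the
    slides and does not reproduce their example exactly.
    """
    kept = {c: weights.get(c, 0.0) for c in FINAL_CATEGORIES}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no weight in LS, PS, or OC to redistribute onto")
    return {c: v / total for c, v in kept.items()}

before = {"LS": 0.32, "PS": 0.04, "MED": 0.23, "OC": 0.41, "OTH": 0.01}
final = redistribute_proportionally(before)
print({c: round(v, 2) for c, v in final.items()})
```

Whatever rule is used, the ranking among LS, PS, and OC is preserved in both this sketch and the slide's own numbers.
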
Validation
At the beginning of the program, investigators nominated themselves as oncologists or physicists.

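The conclusions describe a precision-recall validation against these self-categorizations. The slides give no data for it, so the sketch below uses made-up labels purely for illustration and assumes, hypothetically, that an investigator's predicted discipline is their top-ranked category.

```python
def precision_recall(predicted, actual, label):
    """Precision and recall of `label`, treating `actual` as ground truth."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == label and a == label)
    fp = sum(1 for p, a in pairs if p == label and a != label)
    fn = sum(1 for p, a in pairs if p != label and a == label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical data: predicted top category vs. self-nominated discipline.
predicted = ["OC", "PS", "OC", "PS", "OC"]
self_nominated = ["OC", "PS", "PS", "PS", "OC"]

oc_precision, oc_recall = precision_recall(predicted, self_nominated, "OC")
ps_precision, ps_recall = precision_recall(predicted, self_nominated, "PS")
print(oc_precision, oc_recall, ps_precision, ps_recall)
```

In this toy data one self-nominated physicist is predicted as OC, which lowers OC precision and PS recall to 2/3 while leaving OC recall and PS precision at 1.
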
Future Development
[Figure: collaboration network. Legend: Physical Scientist, Oncologist, Life Scientist; PS-OC Network Investigators, Outside Network Co-authors]

Conclusions
• Automated approach for decomposing scientific publications into grant-specific discipline categories
• Multi-step method with intermediate mapping
• Weighted SC assignment based on the SCs of the article and of its references
• Precision-recall validation based on investigators' self-categorizations
• Oncologists within the NCI's PS-OC program are publishing more physical sciences research, and physical scientists are publishing more oncology or life sciences research, during their years of program participation

ilya.ponomarev@thomsonreuters.com
Thomson Reuters
Custom Analytics
Rockville, MD