
Strategic Health IT Advanced Research Projects (SHARP) Area 4: Secondary Use

Strategic Health IT Advanced Research Projects (SHARP) Area 4: Secondary Use. Dr. Friedman on-site visit, Mayo Clinic 3 September 2010. SHARP: Area 4: Secondary Use of EHR Data. 14 academic and industry partners


Presentation Transcript


  1. Strategic Health IT Advanced Research Projects (SHARP) Area 4: Secondary Use. Dr. Friedman on-site visit, Mayo Clinic, 3 September 2010

  2. SHARP: Area 4: Secondary Use of EHR Data • 14 academic and industry partners • Develop tools and resources that influence and extend secondary uses of clinical data • Cross-integrated suite of projects and products • Clinical Data Normalization • Natural Language Processing (NLP) • Phenotyping (cohorts and eligibility) • Common pipeline tooling (UIMA) and scaling • Data Quality (metrics, missing value management) • Evaluation Framework (population networks) © 2009 Mayo Clinic

  3. Collaborations • Agilex Technologies • CDISC (Clinical Data Interchange Standards Consortium) • Centerphase Solutions • Deloitte • Group Health, Seattle • IBM Watson Research Labs • University of Utah • Harvard Univ. & i2b2 • Intermountain Healthcare • Mayo Clinic • Minnesota HIE (MNHIE) • MIT and i2b2 • SUNY and i2b2 • University of Pittsburgh • University of Colorado

  4. Themes & Projects

  5. Major Achievements • Foster social connections across projects • Recognition by team members that not all problems must be solved within their team • NLP and phenotypes • Phenotypes and CEM normalization • Shared responsibility for overlapping dependencies

  6. The bookends - Projects 1 & 6: Data Normalization & Evaluation. Christopher G. Chute, Stan Huff (Peter Haug)

  7. Overview • Build a generalizable data normalization pipeline • Establish a globally available resource for health terminologies and value sets • Establish and expand a modular library of normalization algorithms • Iteratively test normalization pipelines, including NLP where appropriate, against normalized forms, and tabulate discordance • Use cohort identification algorithms on both EMR data and EDW data (normalized against CEMs)

  8. Progress • Designation of Clinical Element Models (CEMs) as canonical form • Utilizing use case scenarios (PAD, CPNA, etc.) for CEM normalization • Exploration into generalizable CEM models: diagnosis, medications, labs • Development of processes/tools to identify relevant existing CEM models within CEM libraries • Development of processes to identify missing CEMs for data (and classes of data) in use cases • Preliminary population of phenotype use cases

  9. Planned • Adopt eMERGE EleMap tooling for CEMs to populate the canonical model • Formalize Meaningful Use vocabularies into LexGrid server • Design other components of the Data Normalization framework (Terminology Services, NHIN connections) • Model the end-to-end flow needed to produce normalized data from structured data and unstructured (natural language) data: high-level description of the process for taking “wild-type” data instances to canonical CEM instances; applicability to use-case data as well as to general classes of data • Adopt UIMA data flows for normalization services • Examine Regenstrief and SHARP 3 modules
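
As a rough illustration of the “wild-type” to canonical flow named in slide 9, the sketch below maps local lab records onto a simplified canonical structure and counts the records that cannot be mapped. Everything in it is an assumption made for illustration: the field names, the `LOCAL_TO_STANDARD` and `UNIT_ALIASES` tables, the placeholder `STD:` codes, and the `CanonicalLab` class are hypothetical and are not the actual CEM definitions or SHARP tooling.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical local-code -> standard-code map; a real pipeline would
# query a terminology service instead of a hard-coded table.
LOCAL_TO_STANDARD = {"GLU": "STD:GLUCOSE", "HGB": "STD:HEMOGLOBIN"}

# Hypothetical unit spellings mapped to one canonical spelling.
UNIT_ALIASES = {"mg/dl": "mg/dL", "MG/DL": "mg/dL", "g/dl": "g/dL"}


@dataclass
class CanonicalLab:
    """Simplified stand-in for a canonical lab instance (not a real CEM)."""
    code: str
    value: float
    unit: str


def normalize_lab(raw: dict) -> Optional[CanonicalLab]:
    """Map one local record to canonical form, or None if unmappable."""
    code = LOCAL_TO_STANDARD.get(raw.get("test", "").strip().upper())
    if code is None:
        return None  # the caller can tally these as discordance
    try:
        value = float(raw["result"])
    except (KeyError, TypeError, ValueError):
        return None
    unit = raw.get("units", "").strip()
    return CanonicalLab(code, value, UNIT_ALIASES.get(unit, unit))


records = [
    {"test": "glu", "result": "98", "units": "mg/dl"},
    {"test": "xyz", "result": "1.0", "units": ""},
]
normalized = [normalize_lab(r) for r in records]
discordant = sum(1 for n in normalized if n is None)
```

Returning `None` for unmappable records, rather than raising, keeps the discordance tally separate from the mapping step.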

  10. Project 2: Clinical Natural Language Processing (cNLP). Dr. Guergana Savova

  11. Overview • Overarching goal: high-throughput phenotype extraction from clinical free text based on standards and the principle of interoperability • Focus: information extraction (IE), the transformation of unstructured text into structured representations (CEMs) • Merging clinical data extracted from free text with structured data

  12. Progress • Detailed 4-year project plan • Tasks in execution • Investigative tasks: (1) defining CEMs and attributes as normalization targets for NLP, (2) defining the set of clinical named entities (cNEs) and their attributes, (3) methods for cNE • Engineering tasks: (1) defining users, (2) incorporating site NLP tools into cTAKES and UIMA, (3) common conventions and requirements, (4) de-identification flow and data sharing • Forging cross-SHARP collaborations (SHARP 3, PIs Kohane and Mandl)

  13. Planned Y1 • Gold standard for cNEs, relations and CEMs • Focus on methods for cNE discovery and populating relevant CEMs (many subtasks) • Projected module releases: Medication extraction (Nov’10), CEM OrderMedAmb population (Mar’11), Deep parser for cTAKES (Nov’10), Dependency parser for cTAKES (Jan’11) • Collaboration with SHARP 3 by providing medication extraction capabilities for the medication SMaRT app

  14. Project 3: High-throughput Phenotyping (HTP). Dr. Jyoti Pathak

  15. Overview • Overarching goal: to develop techniques and algorithms that operate on normalized EMR data to identify cohorts of potentially eligible subjects on the basis of disease, symptoms, or related findings • Focus • Portability of phenotyping algorithms • Representation of phenotyping logic • Measure goodness of EMR data 06/21/10 © 2010 Mayo Clinic

  16. Progress • Explored use case phenotypes from the eMERGE network for HTP process validation • Representation of phenotype descriptions and data elements using Clinical Element Models • Preliminary execution of phenotyping algorithms (Peripheral Arterial Disease) to compare aggregate data

  17. Planned • Interaction and collaboration with Data Normalization and NLP teams to develop “data collection widgets” • Representation of phenotyping execution logic in a machine-processable format/language • Development of machine learning methods for semi-automatic cohort identification
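
To make the idea of phenotyping logic in a machine-processable format (slide 17) more concrete, here is a minimal sketch in which the eligibility criteria are plain data and a small evaluator applies them to already-normalized patient records. The rule format, the field names (`age`, `abi`), the thresholds, and the helper functions are all hypothetical; this is not the representation the project adopted, and the criteria are not a validated phenotype algorithm.

```python
from typing import Any, Callable, Dict, List

# Comparison operators the rule format may reference.
OPS: Dict[str, Callable[[Any, Any], bool]] = {
    "eq": lambda a, b: a == b,
    "lt": lambda a, b: a < b,
    "ge": lambda a, b: a >= b,
}

# Hypothetical phenotype definition: all criteria must hold.
# Field names and thresholds are illustrative only.
PHENOTYPE = {
    "name": "example_cohort",
    "all_of": [
        {"field": "age", "op": "ge", "value": 40},
        {"field": "abi", "op": "lt", "value": 0.9},
    ],
}


def matches(patient: Dict[str, Any], phenotype: Dict[str, Any]) -> bool:
    """Return True when the patient satisfies every criterion."""
    for crit in phenotype["all_of"]:
        observed = patient.get(crit["field"])
        if observed is None:  # missing data never satisfies a criterion
            return False
        if not OPS[crit["op"]](observed, crit["value"]):
            return False
    return True


def select_cohort(patients: List[Dict[str, Any]],
                  phenotype: Dict[str, Any]) -> List[str]:
    """Return the ids of patients that match the phenotype."""
    return [p["id"] for p in patients if matches(p, phenotype)]


patients = [
    {"id": "p1", "age": 67, "abi": 0.72},
    {"id": "p2", "age": 55, "abi": 1.05},
    {"id": "p3", "age": 71},  # no abi value recorded
]
cohort = select_cohort(patients, PHENOTYPE)
```

Because the criteria live in data rather than code, the same definition could in principle be shipped to another site and run against its normalized records, which is the portability concern raised in slide 15.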

  18. Project 4: Infrastructure & Scalability. Jeff Ferraro, Marshal Schor, Calvin Beebe

  19. UIMA exploitation • Some initial discussions on UIMA were held in a meeting at MIT attended by Peter Szolovits (MIT), Guergana Savova (Harvard), and some of their team members. • A plan is underway for a UIMA "deep dive" for other members from Intermountain Healthcare and Mayo. • A discussion is pending to understand how UIMA might fit with RPE (in particular, BPEL). RPE = Retrieve Process for Execution, an IHE (Integrating the Healthcare Enterprise) profile to automate collaborative workflow between healthcare and secondary use domains.

  20. Infrastructure Progress • Code repository: reviewed requirements (e.g. SVN); pre-release work areas are needed for project teams; the bulk of materials will be in a public repository. • Licensing compatibility: initial discussions on Open Source licensing that is consistent with UIMA and other project teams' tooling. Will need to survey teams. • Initial platform discussions: still working on the Sandbox (“Shared”) environment; need to consider Cloud in later phases of the project.

  21. Planned • Review repository options with ONC, SourceForge, Open Health Tools • Establish a straw man proposal for Sandbox configuration • Conduct cross-project discussions • Inventory tools that can be shared • Inventory data that can be shared • Identify shared environment site location • Initiate high-level requirements gathering

  22. Project 5: Data Quality. Dr. Kent Bailey (Kim Lemmerman)

  23. Overview • Support data quality and ascertain data quality issues across projects • Deploy and enhance methods for missing or conflicting data resolution • Integrate methods into UIMA pipelines

  24. Progress & Planned • Integrate across projects and gather requirements and standards to establish a data quality plan and metrics • Compare expected quality of data to actual data quality • Provide recommendations and methods to improve data quality and/or possible outcomes
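
One simple way to compare expected and actual data quality, as slide 24 proposes, is a per-field completeness metric checked against a target level. The sketch below assumes records are plain dictionaries and that "missing" means `None` or an empty string; the field names, the expected thresholds, and both helper functions are hypothetical and not taken from the project's data quality plan.

```python
from typing import Any, Dict, List


def completeness(records: List[Dict[str, Any]],
                 fields: List[str]) -> Dict[str, float]:
    """Fraction of records with a non-missing value for each field."""
    total = len(records)
    result = {}
    for f in fields:
        present = sum(1 for r in records if r.get(f) not in (None, ""))
        result[f] = present / total if total else 0.0
    return result


def shortfalls(actual: Dict[str, float],
               expected: Dict[str, float]) -> Dict[str, float]:
    """Fields whose actual completeness is below the expected level,
    mapped to the size of the gap (rounded to 3 decimals)."""
    return {
        f: round(expected[f] - actual.get(f, 0.0), 3)
        for f in expected
        if actual.get(f, 0.0) < expected[f]
    }


records = [
    {"id": 1, "sex": "F", "smoking": "never"},
    {"id": 2, "sex": "M", "smoking": None},
    {"id": 3, "sex": "", "smoking": None},
    {"id": 4, "sex": "F", "smoking": "current"},
]
actual = completeness(records, ["sex", "smoking"])
gaps = shortfalls(actual, {"sex": 0.95, "smoking": 0.5})
```

Here `sex` is present in 3 of 4 records and falls short of its 0.95 target, while `smoking` exactly meets its 0.5 target and is therefore not reported.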

  25. Cross-Area 4 Program Efforts. Lacey Hart

  26. Progress • Started early with face-to-face collaboration; cross-knowledge pollination • Individual project efforts synergized with timelines in sync; use cases vetted and determined for the first six months of focus • IRB and data sharing issues have been raised, with best practices shared and an inventory of existing agreements between institutions reviewed

  27. Planned • Best practices for IRB submissions and template protocol material will be made available with applicable state implications • Data use agreements will be completed across sites where needed in the short term; effort for a ‘consortium’ agreement will commence for long-term data sharing needs

  28. Cross-ONC Efforts. Dr. Christopher Chute

  29. SHARP Area Synergies • Security: ensure the integrity of pipelined data cannot be compromised • Cognitive: explore how normalized data and phenotypes can contribute to decisions • Applications: potential for shared architectural strategies

  30. Beacon Synergies • High-throughput data normalization and phenotyping (SHARP) • Applied to population laboratory (Beacon) • Validate on consented sub-samples • Potential to include ALL patients in population area, regardless of provider

  31. SHARP Area 4: More information… http://sharpn.org

  32. SE MN Beacon: More information… http://informatics.mayo.edu/beacon
