
Question Answering in Biomedicine



Presentation Transcript


  1. Question Answering in Biomedicine Student: Andreea Tutos Id: 41064739 Supervisor: Diego Molla

  2. Project outline • Why medical question answering? • Current research • Project methodology • Project outcomes

  3. Why Question Answering? • Thousands of new biological and medical research articles are published daily worldwide • 66% of physicians report the volume of medical information as unmanageable (Craig et al., 2001) • The main impediment to maximizing the utility of research data: insufficient time

  4. Project outline • Why question answering? • Current research • Project methodology • Project outcomes

  5. What is Question Answering? • The task of automatically finding an answer to a question • Relies on analyzing large collections of documents • Aims to provide short and concise answers rather than a list of relevant documents

  6. Key steps to follow • Select the domain knowledge source • Construct the corpus of questions • Analyze the input question • Classify the question • Construct the search query • Extract the answer
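The steps above can be wired together end to end. The sketch below is a toy stand-in, not the project's implementation: keyword selection is reduced to stopword removal, and search to keyword counting, purely to illustrate how the stages connect.

```python
STOPWORDS = {"what", "is", "the", "of", "in", "a", "an", "to", "for", "with", "and", "or"}

def extract_keywords(question):
    """Keyword selection: strip punctuation and stopwords (toy stand-in for a parser)."""
    words = question.lower().replace("?", "").split()
    return [w for w in words if w not in STOPWORDS]

def search(corpus, keywords):
    """Rank documents by how many query keywords they contain."""
    scored = [(sum(k in doc.lower() for k in keywords), doc) for doc in corpus]
    return [doc for score, doc in sorted(scored, reverse=True) if score > 0]

def answer_question(question, corpus):
    """End to end: keywords -> search -> return the top-ranked passage (or None)."""
    hits = search(corpus, extract_keywords(question))
    return hits[0] if hits else None
```

A real system would replace each stage (parsing, classification, query construction, extraction) with the techniques named on the following slides.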

  7. Domain knowledge sources • Reliability of medical information is critical (NetScoring) • MEDLINE - medical repository maintained by the US National Library of Medicine (controlled vocabulary thesaurus MeSH)
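MEDLINE can be queried programmatically through NCBI's Entrez E-utilities, with MeSH vocabulary targeted via the `[MeSH Terms]` field tag. The helper below only builds the ESearch URL (no request is sent); the example search term is illustrative.

```python
from urllib.parse import urlencode

def pubmed_search_url(term, retmax=10):
    """Build an NCBI E-utilities ESearch URL against the PubMed database.
    MeSH headings can be targeted with the [MeSH Terms] field tag."""
    base = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
    return base + "?" + urlencode({"db": "pubmed", "term": term, "retmax": retmax})

url = pubmed_search_url("fever[MeSH Terms] AND ibuprofen")
```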

  8. Key steps to follow • Select the domain knowledge source • Construct the corpus of questions • Analyze the input question • Classify the question • Construct the search query • Extract the answer

  9. Question corpus sources • Question sources we have reviewed in our research: • The Parkhurst Exchange website • The Clinical Questions Collection website • The Journal of Family Practice website

  10. Question formats • Natural language question: “In children with an acute febrile illness, what is the efficacy of single-medication therapy with acetaminophen or ibuprofen in reducing fever?” • PICO format question: “Problem/Population: children with an acute febrile illness; Intervention: acetaminophen; Comparison: ibuprofen; Outcome: reducing fever” (Demner-Fushman and Lin, 2007)
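The PICO decomposition maps naturally onto a small record type. A minimal sketch (the class and method names are hypothetical), populated with the slide's example question:

```python
from dataclasses import dataclass

@dataclass
class PICOQuestion:
    """PICO framing of a clinical question:
    Problem/Population, Intervention, Comparison, Outcome."""
    population: str
    intervention: str
    comparison: str
    outcome: str

    def to_query_terms(self):
        """Flatten the four facets into terms for query construction."""
        return [self.population, self.intervention, self.comparison, self.outcome]

q = PICOQuestion(
    population="children with an acute febrile illness",
    intervention="acetaminophen",
    comparison="ibuprofen",
    outcome="reducing fever",
)
```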

  11. Key steps to follow • Select the domain knowledge source • Construct the corpus of questions • Analyze the input question • Classify the question • Construct the search query • Extract the answer

  12. Question classification • The Evidence taxonomy (Ely et al., 2002)

  13. Query analysis • Processes included: • Keyword selection: extract keywords using parsers such as LTCHUNK; identify named entities with the support of UMLS • Answer pattern generation: different combinations of query terms (Molla and Vicedo, 2009)
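"Different combinations of query terms" can be enumerated directly. A minimal sketch of that idea (this is an assumption about what pattern generation involves, not the cited authors' algorithm): emit every non-empty subset of the keywords as an AND query, most specific first.

```python
from itertools import combinations

def query_variants(keywords):
    """Answer-pattern generation: all non-empty subsets of the query
    terms as AND queries, longest (most specific) first."""
    variants = []
    for r in range(len(keywords), 0, -1):
        variants.extend(" AND ".join(c) for c in combinations(keywords, r))
    return variants
```

A retrieval loop could try each variant in turn, falling back to broader queries when a specific one returns no hits.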

  14. Key steps to follow • Select the domain knowledge source • Construct the corpus of questions • Analyze the input question • Classify the question • Construct the search query • Extract the answer

  15. Answer extraction • Identify relevant sentences that answer the question • Rank the answer candidates (popularity, similarity with the question, answer patterns, answer validation) (Molla and Vicedo, 2009) • Could use the IMRAD (Introduction, Methods, Results and Discussion) structure of biomedical articles (MedQA)
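Of the ranking signals listed, similarity with the question is the easiest to illustrate. A toy stand-in (word overlap only; real systems would combine this with popularity, answer patterns, and validation):

```python
def rank_candidates(question, sentences):
    """Rank candidate answer sentences by word overlap with the question,
    a toy stand-in for the similarity feature on the slide."""
    q_words = set(question.lower().replace("?", "").split())
    scored = [(len(q_words & set(s.lower().rstrip(".").split())), s) for s in sentences]
    return [s for score, s in sorted(scored, key=lambda pair: -pair[0])]
```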

  16. Search engines and question answering systems • Generic: Google, Answers.com, OneLook • Medical: PubMed, MedQA, Google restricted to PubMed

  17. Project outline • Why question answering? • Current research • Project methodology • Project outcomes

  18. Project methodology – Question corpus • We sourced 50 clinical questions and their answers from the Parkhurst Exchange website

  19. Project methodology - Question processing • We have defined five levels of processing to be applied to improve search outcomes.

  20. Project methodology – Scoring system • We used a scoring system from the Text REtrieval Conference (TREC): Mean Reciprocal Rank (MRR) (Voorhees, 2001) • A relevant link returned in nth position (n ≤ 10) receives a score of 1/n
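The scoring rule above translates directly into code. A minimal sketch, assuming each question contributes the reciprocal rank of its first relevant result (or 0 when nothing relevant appears in the top 10):

```python
def mean_reciprocal_rank(ranks, cutoff=10):
    """MRR as described on the slide: a relevant link at 1-based position n
    (n <= cutoff) scores 1/n; a question with no relevant result scores 0.
    `ranks` holds the rank of the first relevant result per question,
    or None when nothing relevant was returned."""
    scores = [1.0 / r if r is not None and r <= cutoff else 0.0 for r in ranks]
    return sum(scores) / len(scores)

mean_reciprocal_rank([1, 2, None, 4])  # (1 + 0.5 + 0 + 0.25) / 4 = 0.4375
```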

  21. Results – No Intervention questions • No Intervention questions average scores

  22. Results – Intervention questions • Intervention questions average scores

  23. Results – Answer location • Answer location in scientific articles

  24. Project outline • Why question answering? • Current research • Project methodology • Project outcomes

  25. Medical search engines and QA systems: conclusions • PubMed obtained similar scores for both categories (0.27 for No Intervention and 0.24 for Intervention questions) • Medical search engines perform roughly equally on Intervention and No Intervention questions

  26. Generic search engines and QA systems: conclusions • Google recorded the best performance in both categories of questions • Both Google and Answers.com scored better on No Intervention questions than on Intervention questions • Non-medical search engines have more difficulty producing answers for scenario-based, complex medical questions

  27. Conclusions • All selected questions are answerable with current technology • 50% of answers are located in the Abstract section of scientific articles and 25% in the Conclusions section • No Intervention questions are easier to answer than Intervention questions for generic search technology

  28. Thank you
