Class Projects

vahe

Presentation Transcript


  1. Class Projects

  2. Future Work and Possible Project Topics in Gene Regulatory Networks • Learning from multiple data sources; • Learning causality in motifs; • Learning GRNs with feedback loops;

  3. Learning from multiple data sources • We have gene expression data and topological ordering information; • Incorporating some other data sources as prior knowledge for the learning; • Transcription factor binding location data; • … Example: Partial regulatory network recovered using expression data and location data.
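
One way to picture how binding location data could act as prior knowledge is the minimal sketch below. It is an illustration under stated assumptions, not the method used in the slides: the function names, the prior weights (0.9 and 0.1), and the scoring rule (absolute Pearson correlation multiplied by a prior weight) are all hypothetical. The topological ordering is used only to restrict candidate edges to those that run from an earlier gene to a later one.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def score_edges(expression, order, bound_pairs, w_bound=0.9, w_unbound=0.1):
    """Score candidate edges parent -> child.

    expression:  dict mapping gene name -> list of expression values
    order:       gene names in topological order (parents come first)
    bound_pairs: set of (regulator, target) pairs with binding evidence
    w_bound, w_unbound: hypothetical prior weights, chosen for illustration
    """
    scores = {}
    # combinations() preserves list order, so parent always precedes child.
    for parent, child in combinations(order, 2):
        r = abs(pearson(expression[parent], expression[child]))
        prior = w_bound if (parent, child) in bound_pairs else w_unbound
        scores[(parent, child)] = r * prior
    return scores
```

With this rule, two equally well correlated edges are ranked differently when only one of them is supported by location data, which is the intuition behind using a second data source as a prior.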

  4. Learning Causality in Motifs • Network motifs are the simplest units of network architecture. • They can be used to assemble a transcriptional regulatory network.

  5. Learning GRN with feedback loops

  6. Learning GRN with feedback loops (Cont’d) Protein-Protein Interactions

  7. Future work and Possible Project Topics in protein interaction • Learning from multiple data sources; • Disease related protein-protein interactions; • Learning from different species;

  8. Learning from Multiple data sources • Gene Neighbor: identifies protein pairs encoded in close proximity across multiple genomes. • Rosetta Stone • Phylogenetic Profile • Gene Clustering: considers closely spaced genes and assigns a probability P of observing a particular gap distance
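
As an illustration of the phylogenetic profile idea, the sketch below represents each protein as a presence/absence vector across a set of genomes and reports protein pairs whose profiles agree. The function names, the agreement measure (fraction of matching positions), and the threshold are assumptions made for this example; the slides do not specify a similarity measure.

```python
from itertools import combinations


def profile_similarity(p, q):
    """Fraction of genomes in which two presence/absence profiles agree."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same genomes")
    return sum(a == b for a, b in zip(p, q)) / len(p)


def linked_pairs(profiles, threshold=0.9):
    """Return protein pairs whose profiles agree in at least `threshold`
    of the genomes. `profiles` maps protein name -> list of 0/1 values.
    The threshold is a hypothetical cutoff, not a recommended value."""
    names = sorted(profiles)
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if profile_similarity(profiles[a], profiles[b]) >= threshold
    ]
```

Pairs returned by `linked_pairs` are candidates for a functional link; they would still need to be combined with the other evidence sources listed on the slide.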

  9. Disease-related protein-protein interactions • Is an interaction disease related? Query the NCBI OMIM database

  10. Learning from different species

  11. BioQA related projects

  12. Projects for BioQA • Learning • Given a set of relevant abstracts, what kind of features can we obtain to enhance our queries? • Given a set of questions from users, how can we identify keywords from the questions to form queries? • Answer Presentation • Given a relevant abstract/article, • how can we retrieve the relevant passage with respect to the user’s question? • how to extract answers?
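
A very simple baseline for turning a user question into query keywords is to drop common function words and keep the remaining tokens. The sketch below assumes a small hand-written stopword list and a simple token pattern, both hypothetical; a real project would likely need a larger list and biomedical entity recognition.

```python
import re

# Hypothetical, deliberately small stopword list for illustration only.
STOPWORDS = {
    "a", "an", "the", "is", "are", "was", "were", "what", "which", "who",
    "how", "does", "do", "did", "of", "in", "on", "for", "to", "with",
    "and", "or", "by", "can", "be",
}


def question_keywords(question):
    """Return the non-stopword tokens of a question, in order, without
    duplicates. Original casing is kept so gene symbols stay intact."""
    tokens = re.findall(r"[A-Za-z0-9][A-Za-z0-9\-]*", question)
    seen, keywords = set(), []
    for tok in tokens:
        key = tok.lower()
        if key in STOPWORDS or key in seen:
            continue
        seen.add(key)
        keywords.append(tok)
    return keywords
```

For the question "What is the role of PRNP in mad cow disease?" this keeps `role`, `PRNP`, `mad`, `cow`, and `disease`, which could then be joined into a search query.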

  13. Projects for BioQA • Automatic Extraction • Extract relations of gene-disease, gene-biological process (also their corresponding organisms) • Uniquely identify the genes • A gene symbol can be associated with multiple gene identifiers. Which gene identifier is the right one? • Can these extraction processes be generalized? • Sortal Resolution • Given an abstract and query, perform sortal resolution (but not on pronouns) • Example: • Given the following abstract: • “In this report, we show that virus infection of cells results in a dramatic hyperacetylation of histones H3 and H4 that is localized to the IFN-beta promoter. … Thus, coactivator-mediated localized hyperacetylation of histones may play a crucial role in inducible gene expression.” [PMID: 10024886] • and a query about histones, perform resolution on “histones” • Result: “histones” refers to H3 and H4.

  14. Projects for BioQA • Semantics of Words • Dealing with the semantics of words to improve the retrieval of answers • Example: semantic relation between “role” and “play” • Gene symbol variants, disambiguate gene symbols, entity recognition • Generate gene symbol synonyms and variants given a gene symbol in a query • Example: variants of “CDC28” can be written as “Cdc28”, “Cdc28p”, “cdc-28” • “GSS” is a synonym of “PRNP”, but “GSS” itself is also a gene which is unrelated to “PRNP”. • Improve on recognition of diseases, biological processes • Extension of Ontology • To capture biological processes and their possible relations to diseases • Examples: • learning and/or memory can influence Alzheimer’s disease • Degradation of ubiquitin cycle can cause extra long/short half-life of genes • Extra long/short half-life of genes can cause cancer
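
The variant-generation step can be sketched as follows. The rules used here (case changes, a hyphen before the numeric part, and a trailing "p") are assumptions generalized from the CDC28 example on the slide, and the function name is hypothetical. The sketch only produces candidate spellings; it does nothing to resolve ambiguous synonyms such as "GSS".

```python
import re


def symbol_variants(symbol):
    """Generate candidate spellings for a gene symbol made of letters
    followed by optional digits. Other symbols are returned unchanged."""
    m = re.fullmatch(r"([A-Za-z]+)(\d*)", symbol)
    if not m:
        return {symbol}
    letters, digits = m.groups()
    variants = set()
    for stem in (letters.upper(), letters.lower(), letters.capitalize()):
        variants.add(stem + digits)            # e.g. Cdc28
        if digits:
            variants.add(f"{stem}-{digits}")   # e.g. cdc-28
        variants.add(stem + digits + "p")      # e.g. Cdc28p
    return variants
```

Calling `symbol_variants("CDC28")` yields a set that includes "Cdc28", "Cdc28p", and "cdc-28", so a query can be expanded with all of them.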

  15. CBioC Class Projects

  16. Other projects

  17. Build an Ontology • Build an ontology for a domain for which we do not have an ontology yet. • Verify its consistency.

  18. Various kinds of text extraction systems • TREC suggested ones • Which method/protocol is used in which experiment/procedure • Gene – disease – role • Gene – biological process – role • Gene – mutation type – biological impact • Gene – interaction – gene – function – organ • Gene – interaction – gene – disease – organ • Protein Lounge inspired • Kinase-phosphatase • transcription factor • peptide antigen

  19. Drug classification in Pharmacogenetics Experimental Data available • Drug response on cell lines; gene expression data; gene copy data; mutation analysis data; RNAi data Data from literature • Mutation data (Sanger lab); NCI-60 drug response data; Mutation analysis data; Pathway data (e.g. BIND); Gene Ontology • Proprietary data • Where does the drug physically interact? (600 Kinase – IC 50) • Gene expression data of patients after treatments Goal: • Given a patient, what kinds of data do we need in order to determine if a drug should be applicable to that patient or not? How do we develop a classifier using these kinds of data? • Find gene and protein interaction network (or components) using these data.
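
As a starting point for the classifier question, one possible baseline is a nearest-centroid rule over per-patient feature vectors (for example, gene expression values). This is only a sketch of one simple option, not an approach proposed in the slides; the function names and the label strings used in the note below are hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def fit_centroids(samples, labels):
    """Average the feature vectors of each class.

    samples: list of equal-length numeric feature vectors
    labels:  class label for each sample (e.g. drug response category)
    """
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda y: dist(centroids[y], x))
```

Given training vectors labeled, say, "responder" and "non-responder", `fit_centroids` builds one average profile per label and `predict` assigns a new patient to the nearer profile. Comparing such a baseline across different data types is one way to probe which data are actually needed.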
