Towards Seamless Integration and Querying of Biological Data
Estella T. Pham – Master’s Student in CS, UML
Dr. Kajal Claypool – Professor, UML
Topics of Discussion
• The BIG problem
• Quick background information
• Our long-term goal
• My current work
The BIG problem
• Distributed, heterogeneous data sources
  • Database systems (DBMSs, semantic heterogeneity)
  • Operating systems (files)
  • Hardware
• How can we effectively obtain most of the relevant information on one particular subject when the pieces are spread across different databases? For example: find the structure of protein A, its folding properties and propensities, amino acid sequence, DNA sequence, organization, and expression.
• Why are the current data integration tools inadequate?
Quick Background Information
• 3 modes of information integration
  • Federated databases (n databases)
  • Warehousing (n databases, a warehouse)
  • Mediation (n databases, n wrappers, a mediator)
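To make the mediation mode concrete, here is a minimal Java sketch of n wrappers sitting behind one mediator. It is an illustration only, not the system described in this talk: the names `MediatorSketch`, `Wrapper`, `inMemorySource`, `sequenceDb`, and `structureDb`, and the sample records, are all hypothetical, and in-memory maps stand in for real databases.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MediatorSketch {

    /** A wrapper translates one source's native format into a shared record shape. */
    public interface Wrapper {
        String sourceName();
        Map<String, String> lookup(String proteinId);
    }

    /** Hypothetical in-memory source standing in for a real database. */
    public static Wrapper inMemorySource(String name, Map<String, Map<String, String>> rows) {
        return new Wrapper() {
            @Override public String sourceName() { return name; }
            @Override public Map<String, String> lookup(String proteinId) {
                // Unknown ids yield an empty record rather than null.
                return rows.getOrDefault(proteinId, Map.of());
            }
        };
    }

    /** The mediator fans one query out to every wrapper and merges the answers. */
    public static Map<String, String> query(List<Wrapper> wrappers, String proteinId) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Wrapper w : wrappers) {
            for (Map.Entry<String, String> e : w.lookup(proteinId).entrySet()) {
                // Prefix each field with its source so same-named fields stay distinguishable.
                merged.put(w.sourceName() + "." + e.getKey(), e.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        Wrapper sequences = inMemorySource("sequenceDb",
                Map.of("proteinA", Map.of("aminoAcids", "MKT", "dna", "ATGAAAACC")));
        Wrapper structures = inMemorySource("structureDb",
                Map.of("proteinA", Map.of("fold", "alpha-helix")));
        System.out.println(query(List.of(sequences, structures), "proteinA"));
    }
}
```

The user asks the mediator a single question and never talks to a source directly. Adding a new database means writing one more wrapper, in contrast to warehousing, where the data would first be copied into a central store.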
My Current Work
• “U00096.gbk” and “ecoli.txt” (GenBank and Swiss-Prot)
• XML
• Schema
• Java objects
• Schema matching
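The slide does not say which schema matching technique is used, so the following is only a sketch of one simple possibility: name-based matching, where field names from two schemas are paired when their normalized edit-distance similarity clears a threshold. The class `SchemaMatchSketch`, the field names, and the 0.6 threshold are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class SchemaMatchSketch {

    /** Classic dynamic-programming (Levenshtein) edit distance between two strings. */
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            int[] cur = new int[b.length() + 1];
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            prev = cur;
        }
        return prev[b.length()];
    }

    /** Similarity in [0, 1] after lowercasing and dropping non-alphanumeric characters. */
    public static double similarity(String a, String b) {
        String x = a.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]", "");
        String y = b.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]", "");
        int longest = Math.max(x.length(), y.length());
        return longest == 0 ? 1.0 : 1.0 - (double) editDistance(x, y) / longest;
    }

    /** For each left field, keep the best-scoring right field at or above the threshold. */
    public static Map<String, String> match(List<String> left, List<String> right, double threshold) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String l : left) {
            String best = null;
            double bestScore = 0.0;
            for (String r : right) {
                double s = similarity(l, r);
                // On ties, the earlier candidate in the right-hand list wins.
                if (s >= threshold && (best == null || s > bestScore)) {
                    best = r;
                    bestScore = s;
                }
            }
            if (best != null) result.put(l, best);
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical field names for two generated Java classes.
        List<String> schemaA = List.of("Accession", "Organism", "Sequence");
        List<String> schemaB = List.of("accession", "organism_name", "seq_length");
        System.out.println(match(schemaA, schemaB, 0.6));
    }
}
```

Name similarity alone cannot resolve the semantic heterogeneity raised earlier, such as two fields with the same name but different meanings, so a matcher like this is at best a first pass that proposes candidate pairs for review.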