MAGE-OM and ArrayExpress database model Ugis Sarkans, EBI
Outline • what is MAGE-OM • what is ArrayExpress • what language is used for modeling • MAGE-OM structure • ArrayExpress status and future • MAGE future developments
MAGE-OM • MicroArray Gene Expression Object Model • also: MAGE-ML (MAGE Markup Language), MAGE-STK (MAGE Software ToolKit) • Merging of MAML (MicroArray Markup Language) and GEML (Gene Expression Markup Language)
MAGE: brief history • December 2000 - initial submissions of proposals to OMG (Object Management Group): • EBI (on behalf of MGED) - MAML • Rosetta (on behalf of GEML community) - GEML + some IDLs • NetGenics - IDLs • Decision to proceed with a joint submission • Decision to comply with Model Driven Architecture (MDA) principles • October 2001 - joint submission to OMG (Rosetta and MGED)
Model Driven Architecture • Platform Independent Model (UML): most of the time spent on this • Platform Specific Models • XML: UML (refined from PIM); DTD (generated plus hand modifications) • CORBA (not for MAGE): UML (refined from PIM); IDL (hopefully generated) • ...
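The step from a platform independent model to a generated, XML-specific artifact can be sketched roughly as follows. This is a minimal illustration, not the generator used for MAGE: the `ModelClass` structure, the `to_dtd` function, and the attribute names are hypothetical, and the two classes are illustrative rather than taken from the MAGE-OM specification.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ModelClass:
    """One class of a hypothetical platform-independent model."""
    name: str
    attributes: list[str] = field(default_factory=list)
    # (child class name, DTD occurrence indicator: "", "?", "*" or "+")
    children: list[tuple[str, str]] = field(default_factory=list)


def to_dtd(model: list[ModelClass]) -> str:
    """Emit one ELEMENT (and, if needed, one ATTLIST) declaration per class."""
    lines = []
    for cls in model:
        if cls.children:
            content = "(" + ", ".join(n + occ for n, occ in cls.children) + ")"
        else:
            content = "EMPTY"
        lines.append(f"<!ELEMENT {cls.name} {content}>")
        if cls.attributes:
            attrs = " ".join(f"{a} CDATA #IMPLIED" for a in cls.attributes)
            lines.append(f"<!ATTLIST {cls.name} {attrs}>")
    return "\n".join(lines)


# Illustrative classes only (assumed, not from the MAGE-OM specification).
model = [
    ModelClass("Experiment", ["identifier", "name"], [("BioAssay", "*")]),
    ModelClass("BioAssay", ["identifier"]),
]
dtd = to_dtd(model)
print(dtd)
```

A generated DTD of this kind would then be the starting point for the hand modifications mentioned on the slide.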
ArrayExpress • first version (object model) - 1999, in collaboration with German Cancer Research Centre (DKFZ) • second version (object model) - end of 2000, prototype development funded by Incyte
ArrayExpress (2) • implementation - first half of 2001 - Oracle schema, data loader (from MAML), prototype Web interface, a few datasets loaded • decision to use MAGE-OM as basis for further development • EU funding - 2002-2004, 8 new positions
ArrayExpress - features • MIAME-compliant • able to import MAML (MAGE-ML) formatted data • can deal with both raw and processed data • independence of: • experimental platforms • image analysis methods • data normalization methods • object model-based query mechanism • supports upcoming OMG standard for expression data
Unified Modeling Language • graphical language for describing software systems (and more ..) • notation - yes • methodology - no
UML diagram types • class • state • collaboration • sequence • ……..
Class diagrams - notation • classes • attributes • types • operations • relationships • subclass relationship • aggregate relationship • association • role names • cardinalities • navigation
[Figure: annotated UML class diagram; labels: class, attribute, aggregation, inheritance, navigation, role name, class from another package, cardinality, association name]
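The notation elements listed above map fairly directly onto code. The sketch below shows one possible reading in Python; all class and attribute names (`Contact`, `Describable`, `Feature`, `Array`) are hypothetical and are not MAGE-OM classes.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Contact:
    """Class with two typed attributes."""
    name: str
    email: str


@dataclass
class Describable:
    """Superclass: each subclass inherits the 'description' attribute."""
    description: str = ""


@dataclass
class Feature:
    """Part of an Array; held by the aggregate below."""
    row: int
    column: int


@dataclass
class Array(Describable):  # subclass relationship (inheritance)
    # Aggregation, role name 'features', cardinality 0..N.
    features: list[Feature] = field(default_factory=list)
    # Association, role name 'provider', cardinality 0..1. Navigation is
    # one-way: Array knows its Contact, Contact holds no reference back.
    provider: Optional[Contact] = None

    def feature_count(self) -> int:  # operation
        return len(self.features)


array = Array(description="demo array")
array.features.append(Feature(row=1, column=1))
array.features.append(Feature(row=1, column=2))
array.provider = Contact(name="A. Curator", email="curator@example.org")
print(array.feature_count(), array.provider.name, array.description)
```

Role names become field names, cardinalities decide between a single reference and a collection, and navigation decides which side of an association holds the reference.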
Implementation issues • Java, C++ - “easy” • relational databases: classes - tables; 1:1, 1:N - foreign key; N:M - table; subclass relations - either all subclasses in the same table, or separate table for superclass and subclasses • XML
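The relational mapping rules above can be tried out with an in-memory SQLite database. This is a sketch under stated assumptions: the table and column names are invented for illustration and are not the ArrayExpress schema, and SQLite stands in for Oracle. It uses the "all subclasses in the same table" option, with a discriminator column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- class -> table
CREATE TABLE experiment (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- Subclass relation, single-table option: superclass and subclasses share
-- one table; the 'kind' column names the concrete class of each row.
CREATE TABLE bio_assay (
    id            INTEGER PRIMARY KEY,
    kind          TEXT NOT NULL CHECK (kind IN ('physical', 'derived')),
    -- 1:N association -> foreign key on the "N" side
    experiment_id INTEGER NOT NULL REFERENCES experiment(id)
);
CREATE TABLE protocol (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- N:M association -> separate link table
CREATE TABLE bio_assay_protocol (
    bio_assay_id INTEGER NOT NULL REFERENCES bio_assay(id),
    protocol_id  INTEGER NOT NULL REFERENCES protocol(id),
    PRIMARY KEY (bio_assay_id, protocol_id)
);
""")

conn.execute("INSERT INTO experiment VALUES (1, 'demo experiment')")
conn.executemany("INSERT INTO bio_assay VALUES (?, ?, ?)",
                 [(1, "physical", 1), (2, "derived", 1)])
conn.executemany("INSERT INTO protocol VALUES (?, ?)",
                 [(1, "hybridization"), (2, "scanning")])
conn.executemany("INSERT INTO bio_assay_protocol VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 2)])

# Count the protocols linked to each assay through the N:M link table.
rows = conn.execute("""
    SELECT b.id, b.kind, COUNT(bp.protocol_id)
    FROM bio_assay AS b
    LEFT JOIN bio_assay_protocol AS bp ON bp.bio_assay_id = b.id
    GROUP BY b.id, b.kind
    ORDER BY b.id
""").fetchall()
print(rows)
```

With the other option, separate tables for superclass and subclasses, each subclass table would carry only its own columns plus a foreign key to the superclass row, at the cost of a join when reading a complete object.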
Tools • Rational Rose: bad graphical capabilities; forward/reverse engineering; API (VB-based) • open source: ArgoUML
UML Packages • [Diagram; labels: BioEvent, Protocol, Treatment, HigherLevelAnalysis, Transformation, Experiment, BioMaterial, BioAssayData, BioAssay, Audit, QuantitationType, ArrayManufacture, Measurement, DesignElement, ArrayDesign, Description, BSANE, BQS, BioSequence]
ArrayExpress: current status • Object model (MAGE-OM) - stable • Database schema - generated (standard SQL, we run under Oracle) • Data loader from MAGE-ML - generated • Web interface (queries, browsing) - under development
Near future developments • Dedicated hardware for ArrayExpress • Good quality data coming from collaborators (annotation tools needed) • Data uploading and Web interface made public
Future developments • Integration with existing tools (Expression Profiler) • New analytical tools • Links with other databases • Data curation, liaison with data providers
ArrayExpress architecture • [Diagram; labels: API, Web server, application server (Java servlets), curation tool, database, ArrayExpress data warehouse, central database (experiment-centred), curation, MAGE-ML, image server]
MAGE schedule • OMG meeting, Dublin, November 12-16 - specification hopefully adopted • Mechanism for incorporating changes and user feedback • MAGE programming jamboree, EBI, December 6-11: API development, parser generation, annotation tools (MAGE STK)
Resources • Web site • links to documents • presentations • UML models • also HTML version and PNG image files of diagrams • http://www.geml.org/omg.htm • Mailing list • lsr-ge@ebi.ac.uk • to subscribe, send the following to majordomo@ebi.ac.uk: subscribe lsr-ge <yourEmailAddress>
Acknowledgements • Michael Miller (Rosetta) • Dave Nellesen (Incyte) • Alan Robinson (EBI) • Martin Senger (EBI) • Paul Spellman (Lawrence Berkeley Lab) • Jason Stewart (NCGR) • Charles Troup (Agilent) • Doug Bassett (Rosetta) • Alvis Brazma (EBI) • Steve Chervitz (Affymetrix) • Francisco De La Vega (Applied Biosystems) • Michael Dickson (NetGenics) • David Frankel (IONA) • Scott Markel (NetGenics)