Achieving Information Sharing in Federal Agencies via Data Services, SOA, and Controlled Vocabularies A Presentation for the Federal Data Architecture Subcommittee Chuck Mosher cmosher @ metamatrix.com October 12, 2006
Agenda • Company Overview & Value Proposition • Data Services Rationale & Best Practices • MetaMatrix Products & Capabilities • Achieving Information Sharing • Service Enabling Data Assets • Vocabularies & Semantic Interoperability • Bridging Structured/Unstructured Information • Customer Use Cases • Summary, Q & A
MetaMatrix Company Overview: Uniform access to integrated information • Vision: Universal bridge between information-consuming applications and enterprise information resources. • Products: Lightweight design/deploy environment for project use; enterprise-caliber information access system for enterprise deployments. • Market: Global 5000 Organizations • Government Intelligence Agencies • Homeland Security • Financial Services • Pharmaceutical, Life Sciences • Manufacturing, Telecommunications • Independent Software Companies (ISVs)
Data Interoperability Is At The Very Core of The Transformation Sought by the Federal Government One of the three enablers which drives domain-wide visibility: “… is a standard enterprise data architecture — the foundation for effective and rapid data transfer and the fundamental building block to enable a common logistical picture.” Army Lt. Gen. Claude Christianson “If you look at all the trends in the IT arena over the past 30 to 40 years, we’ve moved into an environment where we’ve got faster networks, more powerful processors, but it really comes down to the data” Michael Todd, DOD CIO office
Dr. Linton Wells, as quoted in September’s NDIA Magazine, “…data compatibility may be an issue. Enabling digital interaction with nontraditional partners may require middleware or other programs that convert data from totally different formats …”
NCES & Data Net-Centricity • “As-is” = Application Silos (each application on its own server and DBMS) • “To-be” = SOA Stack with XML-centric Information Abstraction (= Data Services) How do you achieve? • Loose coupling • Map existing data to XML • Multi-source requests • Metadata visibility • Information security • Service access • Service discovery
The Data Challenges: Getting the right information to the right person at the right time requires: • Resolving data semantic and structural mismatches • Web service enabling legacy data systems (i.e., Net Centricity) • Mapping data sources to vocabularies like C2IEDM, NIEM, GJXDM, TWPDES, etc. • Handling multi-source requests (data aggregation, mediation, fusion, federation) • Minimizing development and maintenance cost of custom code
MetaMatrix – Quick Facts • Middle-ware, model-driven, data management • DoD proven (DISA, NSA, TRANSCOM, etc.) • Version 5 – Mature product which is still unique and ahead of the competition • NIAP certified and NSA-credentialed • Can handle the enterprise (or COI) perspective as well as the bottom-up perspective (data service enablement of legacy systems) • Can rapidly implement data integration strategies
Some Key Value Propositions • Lower cost of new application development by 35 to 70% • Data interoperability is accomplished using COTS vs code • Reduce application maintenance costs • Enable detection of changes in data structures • No re-coding needed when data structures change • Fewer systems to maintain • Avoid the need for replication (OHIO) • Data owners keep control, managed access • Data abstractions are reusable components that generate tremendous value over time. • More adaptive computing
Information Challenges (diagram: Communities of Interest separated from Information Resources by a “?”) Agency Challenges • 100’s/1000’s of data sources • 100’s/1000’s of applications • Multiple access points/modes for apps • Understanding relationships/semantics • Data consistency • Data reuse: bridging data silos • Support for Web Services & SQL • Control & manageability, compliance • Security & auditing Program Challenges • Multiple sources • Different interfaces/drivers • Different physical structures • Different semantics • Single interface to data desired • Real-time access to data • Performance • Maintainability as data changes • Maintainability as apps change Mission Challenges • Time-to-deploy • Agility: responsiveness to change • Automation: reduce cost of new development and operations • ROI of enterprise information
Information Virtualization (diagram, top to bottom): Communities of Interest • Information Virtualization Layer • Information Resources
Information Virtualization: the Information Virtualization Layer (top to bottom) • Unified Semantic Layer: unification of different concepts across systems • Data Federation Layer: single-query access to heterogeneous systems • Data Access/Connectivity Layer: uniform, standardized access to any system • Enterprise Data Sources
What is a Data Service? • Decouple data sources from application • Data implementation shielded from application • Semantic/Format Mediation • Standard vocabulary • Single access point • Web Service/XML • SQL • Federation • Single source or multi-source • Scalability • Security, performance (Diagram, "Bridge the Gap": a Data Service sits between consumers using XML/SOAP or SQL and sources reached by SQL or API call: Master Data, Agency Application, Operational Data Store.)
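The decoupling described on this slide can be illustrated with a minimal sketch. It is not MetaMatrix code: the in-memory SQLite table stands in for an operational data store, `agency_app_lookup` stands in for an application reachable only by API call, and all table, column, and field names are hypothetical.

```python
import sqlite3

# Hypothetical source 1: an operational data store reachable via SQL.
ods = sqlite3.connect(":memory:")
ods.execute("CREATE TABLE emp (emp_no INTEGER, lname TEXT)")
ods.execute("INSERT INTO emp VALUES (1, 'Rivera')")


# Hypothetical source 2: an agency application reachable only via an API call.
def agency_app_lookup(emp_no):
    return {1: {"OFFICE_CD": "HQ-4"}}.get(emp_no, {})


def person_service(person_id):
    """Single access point: callers see one vocabulary, not the sources."""
    row = ods.execute(
        "SELECT emp_no, lname FROM emp WHERE emp_no = ?", (person_id,)
    ).fetchone()
    if row is None:
        return None
    extra = agency_app_lookup(person_id)
    # Mediate source-specific names into the standard vocabulary.
    return {
        "personId": row[0],
        "lastName": row[1],
        "office": extra.get("OFFICE_CD"),
    }


result = person_service(1)
print(result)
```

The caller never sees `emp_no`, `lname`, or `OFFICE_CD`, so either source could change its physical structure without the consuming application being recoded.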
FEA DRM View on Data Services DRM Version 2 Data Access Services • Context Awareness Services • Structural Awareness Services • Transactional Services • Data Query Services • Content Search and Discovery Services • Retrieval Services • Subscription Services • Notification Services Service Types include: • Metadata / Data • Structured / Unstructured • Read / Write • Push / Pull
Data Service Layer in SOA (diagram layers, top to bottom): Client Process & Applications • Business Process Services • Business Services • Message Services (ESB) • Data Services Layer (multiple Data Services) • Data Sources
Data Services: Architecture for the Ages • Data Services Best Practices • Provide transparency across all sources • Define known relationships today and accommodate future relationships • Support independence of mission systems • Support ownership of operational data sources at the source • Provide accelerated mechanisms for integrating new sources • Support existing security policy and add degrees of security • The value of a managed metadata abstraction layer • "Future Proofing" (future standards, exchange models, platforms) • Limited skill set requirements • Fixed long term costs for integration middleware • Building consensus • Assure data owners they will continue to have control, and … • Vocabulary of existing production systems will not be impacted • Offer an option where legacy data migration is not 'required' first
Data Services Approaches Data Services for Multiple Purposes: • Simplified access to value-added (tagged) data in real-time • Value-added (tagged) data materialized & staged • Phased-in migration from legacy to new • Managed archiving via classification, retention tags • Enhanced search via consistent content tags (Diagram, "Agile Information Services": Data, Content Sources feed a Model-Driven Integration Layer with Logical Data Models, e.g. Org, Person, Image, Location and Organization, Customer, Imagery, Location; Materialized Logical Models populate an Enriched Data/Content Store and are exposed as tagged XML.)
Information Exchange Topology (diagram labels) • Master Data: Person / Facility / Vehicle • Search Engine Index / Metadata Catalog • Ontology Mgmt / Reasoning • Mediation: XSLT, Multi-source • Enterprise Service Bus / Intranet / Extranet • Orchestration • Encryption • High Availability • Security/Authentication • Operations Management • Error / Exception Management • Data Access Services: SQL, Web Service/XML; Staged Data (optional) • Distributed Data Services • Enterprise Data Services • Stage • SOA App’s • State/Local • Land/Sea • Federal Agencies
MetaMatrix I.P. MetaMatrix has 2 distinct innovations that work in concert to yield significant business benefits: • Information Modeling: model-based; extensible; sharable, reusable; standards-based • Federated Querying: cost-based optimizer; read/write/transactions; uniform API, any source; battle-tested/hardened
MetaMatrix Enterprise Data Services • Project-level or Enterprise-wide data services layer • Integrated views of data from multiple sources • Metadata-driven • Optimized performance • Interoperable security • Complements BI, ETL, ESB/EAI, DQ, CDI, Search
Modeling Instead of Coding (diagram) • Enterprise Information Sources (EIS): services, warehouses, databases, Packaged Apps, spreadsheets, xml, geo-spatial, rich media, … • Designing data services: Reusable, Integrated Data Objects • Exposed Data Services: SOAP (XML documents, each service with a WSDL contract), ODBC, JDBC • Information Consumers: Web Services, Business Processes; EAI, Data warehouses; Logistics; Custom Apps; Reporting, Analytics; Intelligence
MetaMatrix Designer • Physical Models: represent actual data sources • Virtual Models: show structural transformations from one or more other classifiers; define transformations with Selects, Joins, Criteria, Functions, Unions, User Defined • Data Service Abstraction Layers: broker, translate, aggregate, fuse or integrate data.
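A rough analogy for the physical/virtual model split, assuming nothing about MetaMatrix Designer's own model format: two SQLite tables play the role of physical models, and a SQL view plays the role of a virtual model whose transformation uses a select, a function, a criterion, and a union. All table and column names are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Physical models: stand-ins for two actual data sources.
    CREATE TABLE depot_a (site_num INTEGER, site_name TEXT);
    CREATE TABLE depot_b (facility_id INTEGER, label TEXT, active INTEGER);
    INSERT INTO depot_a VALUES (10, 'north yard');
    INSERT INTO depot_b VALUES (20, 'south yard', 1), (21, 'old annex', 0);

    -- Virtual model: a transformation defined over the physical models.
    CREATE VIEW facility AS
        SELECT site_num AS facility_id, UPPER(site_name) AS name
        FROM depot_a
        UNION ALL
        SELECT facility_id, UPPER(label) AS name
        FROM depot_b
        WHERE active = 1;
""")

# Consumers query the virtual model only.
rows = db.execute(
    "SELECT facility_id, name FROM facility ORDER BY facility_id"
).fetchall()
print(rows)
```

A real federated engine would resolve such a view across separate systems; a single in-memory database is used here only to keep the sketch runnable.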
MetaMatrix Products (diagram) • Information Consumers connect via JDBC, ODBC, SOAP, JMS • MetaMatrix Server (Integration Server): Virtual Data Bases (VDBs) holding Access Models, Views, XML Docs, Services (in, proc, out); Query Processor; Optimizer; Integrated Security (Users, Roles, Entitlements) • MetaMatrix Designer: design and deploy data services • MetaMatrix Catalog • MetaMatrix Connector Framework with Packaged Connectors: RDBMS, XML, Web Svc, CICS, VSAM, Siebel, SAP, Oracle Apps
Secure Access: Accredited • Username/Password Logon: the Client App passes a username and password to MetaMatrix; the Membership Provider authenticates; the Connector connects to the Data Source with the same ID for all queries; optional: integrated with existing authentication system • Trusted Payload Logon: the Client App passes logon info and a trusted payload to MetaMatrix; the Connector uses different credentials per connection, per query, and optionally accesses source-specific information; optional: integrated with existing authentication system (diagram captions for the Authentication Service and Membership Provider: "authenticates, generates payload" and "authenticates, optionally modifies payload")
Managing Data Service Metadata (diagram) • MetaMatrix Designer and MetaMatrix Catalog manage: Domain [UML/ER] • Models & Files [versioned] • Web Services [WSDL] • Processes [BPM/BPEL] • Ontologies [OWL/RDF] • Taxonomies • Classification Schemes • KeyWords • Relational and XML models • Transformations • Datatypes • Generic Typed Relationships • Application/Configuration • Catalog outputs: Search Index, Web Reporting, WSDL • Example artifacts shown: Process X, Process Y, Service A, Service B, Taxonomy A
MetaMatrix Product Lines • MetaMatrix Enterprise (Enterprise): Web services & SQL; modeling enterprise data; scalable deployment server; metadata management; application/legacy connectors • MetaMatrix Dimension (Project, Node): Web service-enablement of data sources; expose business views as XML; lightweight modeling, rapid integration; standard WAR-based deployment • MetaMatrix Query (ISV / Project): embeddable Java component; federated query engine; query optimization; standard JDBC to all sources; standard SQL to all sources
Mediation: XML From Non-XML Sources • Source: Data Sources containing Information to integrate («Relational», «Application», «Text File») • Target: Fixed (potentially complex) XML Schema • Need: Data complying to Schema • MetaMatrix: Mapping from Data to XML • Example «XML» target: <person> <addresses> … </addresses> <accounts> <accountID=…> … </accountID> </accounts> </person>
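As a hand-coded illustration of what mapping from data to XML involves, the sketch below builds a document shaped like the slide's `<person>` target from rows that might come from relational or text-file sources. The row contents, the `address`, `account`, `city`, `state`, and `balance` elements, and the choice to carry the account ID as an attribute are all assumptions; the slide elides those details, and a model-driven tool would generate this mapping rather than code it.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows from non-XML sources.
person_row = {"id": "p1", "name": "Ann Lee"}
address_rows = [{"city": "Reston", "state": "VA"}]
account_rows = [{"account_id": "A-100", "balance": "250.00"}]


def build_person_xml(person, addresses, accounts):
    """Map flat rows onto a fixed, nested target structure."""
    root = ET.Element("person", id=person["id"])
    ET.SubElement(root, "name").text = person["name"]

    addrs = ET.SubElement(root, "addresses")
    for a in addresses:
        addr = ET.SubElement(addrs, "address")
        ET.SubElement(addr, "city").text = a["city"]
        ET.SubElement(addr, "state").text = a["state"]

    accts = ET.SubElement(root, "accounts")
    for acc in accounts:
        el = ET.SubElement(accts, "account", accountID=acc["account_id"])
        ET.SubElement(el, "balance").text = acc["balance"]
    return root


doc = build_person_xml(person_row, address_rows, account_rows)
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```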
Map Data Sources to XML & Deploy: MetaMatrix Designer for XML-centric Data Services • Model XML Docs, Schemas • Build XML Doc. models from XML Schemas • Map XML Doc. models to other data models • Enable data access via XML
Dimension: Choose your approach • Rapid design & deployment of Web Services • Expose integrated data as XML-based business views • Deployment of Web Services as standard Web apps • Runtime execution optimized through use of MetaMatrix Query Engine (Diagram, Dimension Models: Data Sources, Source Models, Business Views, Web Service Operations, Web Server, connected by the steps Import, Map to XSD, Model to WSDL, Deploy as WAR; "Start Here?" is marked at two points in the flow.)
COI Data Dictionary (diagram) • Application views of information (Relational, XML): Business Intelligence Applications, Search Applications, Web Services, via ODBC/JDBC, JDBC, SOAP • Logical Data Model (C2, Logistics, Intelligence, …): Agency or COI-specific; rationalize, harmonize, mediate • Authoritative Sources: mapped to logical; Multiple Internal/External Information Sources • Dictionary terms shown: Location_ID, Location_Type, bldg_type, bldg_id, Depot_Number, SITENUM, Facility_ID
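The idea of mapping authoritative sources to a logical model can be sketched as a simple rename table. The field names come from the slide, but which source field maps to which logical term is an assumption made here for illustration, as are the sample records.

```python
# Assumed mapping from source-specific field names to logical-model terms.
# The slide lists these names but does not state the pairings.
DICTIONARY = {
    "bldg_id": "Location_ID",
    "Depot_Number": "Location_ID",
    "SITENUM": "Location_ID",
    "Facility_ID": "Location_ID",
    "bldg_type": "Location_Type",
}


def to_logical(record):
    """Rename known source fields to logical names; keep other fields."""
    return {DICTIONARY.get(key, key): value for key, value in record.items()}


# Hypothetical records from two different sources.
logistics = to_logical({"Depot_Number": 42, "bldg_type": "warehouse"})
facilities = to_logical({"SITENUM": 42, "owner": "GSA"})
print(logistics)
print(facilities)
```

Once both records use `Location_ID`, an application view can join or compare them without knowing either source's native naming.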
Semantic Matching - example • Ontology: “Sex” semantically related to “Gender” • Semantic Data Services • key component of information sharing and interoperability programs • automated semantic mapping to aid domain experts in quickly reconciling disparate schemas and vocabularies • more rapid deployment of a mediation solution • MatchIT • an extensible ontology-driven tool • variety of algorithms for determining semantic equivalence • discovers similarities between elements of heterogeneous data, automatically exposing potential semantic matches • matches elements of data sources to target schemas of Data Services, such as TWPDES, GJXDM, NIEM, C2IEDM, HL7 (Diagram: “Gender ID” and “Person Sex Code” matched with a confidence of 90% by Semantic Data Services over the Data Sources FBI, CBP, NYC, NY, NJ.)
Automated Term Discovery (Interpret) All the available definitions found in the MatchIT knowledge-base Results of the automated tokenization All the usage instances where each term was used in any of the sources A comprehensive list of terms automatically discovered across all sources
Contextualize (Interpret) • Automated term tokenization: ArticleAmount is split into Article and Amount • Automated semantic linking using the default knowledge-base contained within MatchIT (relationships shown: Synonym, Type-of; linked terms shown: Creation, Sum, Assets)
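The tokenize-then-link step can be sketched as follows. This is not MatchIT's algorithm: the regular expression is one common way to split compound identifiers, and the small `KNOWLEDGE_BASE` dictionary, including which term each Synonym or Type-of link attaches to, is an assumption based only on the labels visible on the slide.

```python
import re


def tokenize(term):
    """Split a compound identifier such as ArticleAmount into lowercase tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", term)
    return [p.lower() for p in parts]


# Hypothetical knowledge base; the pairings are assumed, not from the source.
KNOWLEDGE_BASE = {
    "article": {"synonym": ["creation"]},
    "amount": {"synonym": ["sum"], "type-of": ["assets"]},
}


def contextualize(term):
    """Attach known semantic links to each token of a term."""
    return {tok: KNOWLEDGE_BASE.get(tok, {}) for tok in tokenize(term)}


linked = contextualize("ArticleAmount")
print(tokenize("ArticleAmount"))
print(linked)
```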
Semantic Matching (Mediate) • With relationships pre-established within the knowledge-base… • Identify the Target and the Source(s) and run the match. ArticleAmount Automatically linked by a specific % distance ProductShares
Facilitate Decision Making (Mediate) Target element for matching Automatically calculated semantic distance between terms Helps facilitate rapid decision making Source candidate for matching
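To make "semantic distance between terms" concrete, here is a deliberately simple stand-in: names are tokenized, tokens are normalized through a synonym table, and candidates are ranked by Jaccard overlap with the target. The slides do not describe MatchIT's actual algorithms, so the scoring method, the `SYNONYMS` table, and the candidate names are all assumptions.

```python
import re

# Hypothetical synonym table mapping each token to a canonical concept.
SYNONYMS = {"gender": "sex", "id": "code"}


def concepts(name):
    """Tokenize a field name and normalize tokens to canonical concepts."""
    tokens = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return {SYNONYMS.get(t.lower(), t.lower()) for t in tokens}


def match_score(target, source):
    """Jaccard overlap of concept sets: 1.0 is identical, 0.0 is unrelated."""
    a, b = concepts(target), concepts(source)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_candidates(target, sources):
    """Return (score, source) pairs, best match first."""
    return sorted(((match_score(target, s), s) for s in sources), reverse=True)


ranked = rank_candidates(
    "Person Sex Code", ["Birth_Date", "Gender ID", "PersonName"]
)
for score, source in ranked:
    print(f"{source}: {score:.2f}")
```

A domain expert would still review the ranked list; the score only orders the candidates so that likely matches are examined first.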
Support Multiple Enterprise Semantic Models • Models shown: J-1 Manpower / Personnel • J-2 Intelligence • J-3 Operations • J-4 Logistics (GCSS) • J-5 Plans & Policy • J-6 C4CS • J-7 Operational Plans • J-8 Force Structure • Consumers: Business Intelligence Applications, Portal Applications, Web Services, via ODBC/JDBC, JDBC, SOAP • Enterprise-wide or COI-driven Data Models: Rationalization • Harmonization • Data Catalogs (DDMS) • Data Sources: Authoritative • Redundant • Overlapping • Multiple Internal/External Information Sources
Why Vocabulary Management? You can’t act on data alone! • Knowledge lies everywhere - you must involve data from disparate sources • The volume and disparity of data are too significant - you must enable machine involvement • Using semantics is not enough - you must be able to leverage domain concepts and terminologies • You must have the ability to infer relationships across the data
Benefits of Vocabulary Management • Develop reusable information models and schemas • Implicitly improve data integrity • Capture business and technology requirements in a single vocabulary • Capture institutional knowledge • Enable semantic mining techniques for deeper data discovery and information sharing • Accelerate interoperability, web services and SOA development and deployment • Establish and maintain a common relationship across data sources • Establish and maintain compliance with industry exchange models • Reduce IT expenses by leveraging data in its native source • Reduce IT expenses associated with building and maintaining partner integration • Enhance decision making directly through improved information sharing
Knoodl.com - from Revelytix • A publicly available wiki for collaborative vocabulary/ontology development • Extends the wiki metaphor with a formal model for semantic markup • Ideal for • Community of Interest (COI) based OWL development • Domain vocabulary creation and management • OWL registry/repository • Scheduled to go live 30 Oct 06
Integration Driven By Semantics (diagram layers, top to bottom) • Ontology Models (e.g. OWL, RDF): relate information in different domains/models; search within and across domains for related information • Enterprise Model (UML): model & relate information within any domain • Data Models (Relational, XML) • Physical Sources
Ontology-Driven Integration Example (diagram) • Layers: Ontology, Logical Views, Physical Sources, linked by equivalence relationships • Ontology concepts shown: Transportation, Land, 4 Wheel, 2 Wheel, Bus, Truck, Car, Cargo Truck, Fuel Truck
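One way an ontology can drive integration is by expanding a request for a broad concept into every source equivalent to one of its narrower concepts. The sketch below assumes a subclass hierarchy inferred from the concept names on the slide (the diagram's exact layout is not recoverable from the text), and the source names in `EQUIVALENT_SOURCE` are purely hypothetical.

```python
# Assumed subclass hierarchy, inferred from the slide's concept names.
SUBCLASSES = {
    "Transportation": ["Land"],
    "Land": ["4 Wheel", "2 Wheel"],
    "4 Wheel": ["Bus", "Truck", "Car"],
    "Truck": ["Cargo Truck", "Fuel Truck"],
}

# Hypothetical equivalence links from ontology concepts to logical views.
EQUIVALENT_SOURCE = {
    "Bus": "transit.buses",
    "Cargo Truck": "logistics.cargo_trucks",
    "Fuel Truck": "fleet.fuel_trucks",
}


def descendants(concept):
    """Return the concept followed by all narrower concepts, depth-first."""
    found = [concept]
    for child in SUBCLASSES.get(concept, []):
        found.extend(descendants(child))
    return found


def sources_for(concept):
    """List every source equivalent to the concept or one of its subclasses."""
    return [
        EQUIVALENT_SOURCE[c] for c in descendants(concept) if c in EQUIVALENT_SOURCE
    ]


truck_sources = sources_for("Truck")
four_wheel_sources = sources_for("4 Wheel")
print(truck_sources)
print(four_wheel_sources)
```

A search for "Truck" reaches both truck sources even though neither is labeled "Truck", which is the kind of cross-source inference the ontology layer is meant to supply.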
Person Search - Conceptual Use Case • Enterprise Information: Addresses, Organizations, Affiliations, Accounts, Transactions, Call History, Agreements, Policies • Relationships inherent in the search results link to enterprise apps, databases, and other repositories
Incorporating Enterprise Data into Search • The usefulness of an organization's data is dependent upon understanding and applying context • In a typical text search application, context is supplied by document content, or metadata tags (filename, author, date, etc.) • An organization's structured data sources do not usually lend themselves to document-centric approaches • The context of structured data relies on: • metadata (typically implicit) for table names, column names, datatypes, and business descriptions for each • implied DB relationships such as foreign keys between tables • relationships (mappings) to a business data dictionary • The volume of structured data requires a combination of indexed and non-indexed approaches
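The implicit metadata that gives structured data its context (table names, column names, datatypes, foreign keys) can usually be read from the database catalog. The sketch below uses SQLite's `PRAGMA table_info` and `PRAGMA foreign_key_list` on a hypothetical two-table schema; other databases expose the same information through their own catalogs, and nothing here reflects how MetaMatrix or a search appliance actually gathers it.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE account (
        account_id INTEGER PRIMARY KEY,
        person_id INTEGER REFERENCES person(person_id),
        balance REAL
    );
""")


def describe(table):
    """Collect column names, datatypes, and foreign keys for one table."""
    # table_info rows: (cid, name, type, notnull, dflt_value, pk)
    columns = [(r[1], r[2]) for r in db.execute(f"PRAGMA table_info({table})")]
    # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
    fkeys = [
        (r[3], r[2], r[4]) for r in db.execute(f"PRAGMA foreign_key_list({table})")
    ]
    return {"table": table, "columns": columns, "foreign_keys": fkeys}


account_meta = describe("account")
print(account_meta)
```

Each foreign key is reported as (local column, referenced table, referenced column), which is enough to offer "related data" links from a search result to records in another table.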
MetaMatrix and Google (diagram: a Google Search Appliance indexes Content Repositories and, through an HTML I/F over JDBC, the MetaMatrix Server, whose Connector Framework reaches RDBMS, ERP, CRM…, and Legacy Systems) • 1: Structured Data crawling & index build • 2: Text Search w/ filtering criteria (optional) • 3: Select & drill down to discover record details, related data links, & metadata, with field name look-up in Business Data Dictionary • 4: Navigate to related data from Search UI or Custom Application