
Scalable Ontological Sense Matching



  1. Scalable Ontological Sense Matching Dr. Geoffrey P Malafsky TECHi2 LLC, Fairfax, VA

  2. Need for Smarter Systems • Enormous and ever-increasing amounts of data and information are available • Potential exists for significant increases in efficiency, effectiveness, and success in all fields IF the data and information can be harnessed • Most common use case is information overload: too much to sift through in too little time, with too few resources and too little support or authority Dr. Geoffrey P Malafsky, TECHi2, CapSci 2006, ASTI

  3. Major Challenges • Value is subjective and context-based, i.e. not deterministic • Metrics and decision criteria are heavily dependent on conditions, information uncertainty, decision/activity timelines, vulnerability to error, risk capacity • Rules and interaction mechanisms are usually nebulous, poorly defined, non-existent, or incorrect • Information technology approaches to this situation are immature, with poor scalability (e.g. too computationally intensive, storage requirements, security) and/or untrustworthy results

  4. Advanced Techniques • Improve machine understanding of information by annotating meaning (e.g. ontologies) • Compute the best scenario using domain models in which Subject Matter Experts (SMEs) define core knowledge, coupled to probabilistic fit calculations • Extract patterns from very large scale data sets • Hire lots of people

  5. Example: Knowledge Discovery & Dissemination (KDD) • Seeks to find knowledge for practical purposes within large scale data/information stores • Crosses organizational and functional boundaries • High relevance to searcher • Uncertainty is assessed and used • Secure • Rules and domain model based • Automated

  6. Knowledge is Not Just Information or Data • Knowledge has: • Context: what is it about? • Confidence: is it right? • Relationships: what does it have to do with that? • Priorities: what is most important? • Types • Explicit knowledge is codified and can be manipulated • Tacit knowledge is unspoken “know-how” • Looks just like data when in an electronic system • It is data • Annotations on “about” and “how” tied to intelligent application logic make it knowledge for user

  7. KDD Knowledge Map • Analysis of scientific and technological areas of emphasis in Knowledge Discovery and Dissemination (KDD) • Gaps reveal technical vulnerabilities • Most research and development is concentrated in the discovery portion of KDD

  8. Information & Data Mining Dominates KDD Focus

  9. What is Missing to Make KDD Work • Knowledge-based metadata architecture • Predictive personalization algorithms • Models of knowledge lifecycle • Computational knowledge techniques • Real-time analysis with large data and high uncertainty • Conduit for feedback from end users to capture and evolve domain knowledge

  10. Bringing Structure to Unbounded Knowledge • Knowledge Mgmt systems have failed to meet operational requirements • Knowledge is inherently expansive and evolving • IT tends to collect and organize assets without relevance, context, ... • Level of Effort to manually collect, map, and cleanse source data/information is too high • AI is not around the corner

  11. Structured Knowledge • Applying a structured framework creates repeatable, interoperable, consistent solutions • Knowledge fidelity is maintained with a combination of human- and machine-processable representations • Unknowns are discovered using known analogies via triangulation • Reduces the universe of possible combinations of knowledge and user needs to engineering-scale solutions

  12. Applying Quantum Mechanics to Knowledge Processing • Even the small area of this room has an infinite number of possible combinations of things (macro, micro, atomic, subatomic) • Representing this “knowledge” and these “states” with a domain model reduces the infinite possibilities to a tractable few: Schrödinger equation ($H\psi = E\psi$) • Wavefunctions describe specific states (four quantum numbers) • Connection between objects is defined by the overlap integral of their wavefunctions • $\int \psi_0^{*}\, \psi_1 \, d\tau$ = Degree of match
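
The overlap idea on this slide can be illustrated with a discrete analogue. The sketch below is only an assumption about how such a calculation might look, not the KORS implementation: each concept is a hypothetical "state" given as a dictionary of feature weights, and the integral is replaced by a normalized dot product, so identical states score 1 and states with nothing in common score 0. The name `overlap` and the feature dictionaries are illustrative.

```python
import math


def overlap(state_a, state_b):
    """Discrete stand-in for an overlap integral: a normalized dot product.

    Each state maps a feature name to a real-valued weight (an assumed
    representation; the slides do not specify one).
    """
    features = set(state_a) | set(state_b)
    dot = sum(state_a.get(f, 0.0) * state_b.get(f, 0.0) for f in features)
    norm_a = math.sqrt(sum(w * w for w in state_a.values()))
    norm_b = math.sqrt(sum(w * w for w in state_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty state overlaps with nothing
    return dot / (norm_a * norm_b)
```

Normalizing both states keeps the score comparable across concepts that carry different amounts of metadata.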

  13. The KORS™ Framework • Knowledge: collected using templates from SMEs • Ontologies: Conceptual models of domain knowledge • Rules: Business and technical rules are extracted and defined from domain knowledge and ontologies • Semantic metadata: knowledge, ontology relationships, and rules are connected to and represented with data and information

  14. KORS Structured Knowledge • Knowledge is inherently expansive and evolving • KM systems have failed to meet operational requirements • IT collects & organizes assets without relevance, context, ... • Level of Effort is too high to manually collect, map, and cleanse source data/information • KORS™-pending framework creates repeatable, interoperable, consistent solutions • Knowledge fidelity is maintained with a combination of human- and machine-processable representations • Ontologies (concepts) are expressed with domain-specific and standard terms • Reduces the universe of possible combinations of knowledge and concepts to engineering-scale solutions

  15. KORS Ontologies • Conventional wisdom • Multi-tiered, fully explicit, broad coverage = Too large; Too difficult to maintain; Too hard to implement • KORS™ Ontologies • Cross-domain framework, domain specific instances of classes, leverage existing ontologies, concepts defined with domain-based uncontrolled vocabulary AND common controlled vocabulary = Smaller; easier to maintain; supports engineering processes • Answers the broad question: “Do the concepts in this other ontology have semantic similarity to those in my ontology?” • Semantic metadata used to characterize domain ontologies

  16. Conventional Wisdom • Upper, middle and lower ontologies. • Every concept made fully explicit

  17. The KORS Engineering Framework • Structured knowledge capture • Incorporate within the engineered solution • Identify rules and metadata structures

  18. Cross-Concept Overlap Calculation • Overlap integrals calculated from semantic metadata, updated when ontologies change • Overlap (S) is computed at: • Ontology-ontology level using primary task-description pairs • Term level using allowed and disallowed senses (not just synonyms) • Real-time determination using coarse-medium-fine concept match • Coarse = ontology-ontology • Medium = ontology-term • Fine = term-term • With metadata architecture implementation, calculation uses what is available • Inherently scalable, distributed with evolving improvements
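
The coarse-medium-fine match described on this slide could be staged roughly as follows. This is a minimal sketch under stated assumptions, not the KORS algorithm: ontologies are hypothetical dictionaries with an optional `tasks` set and an optional `terms` mapping (term to allowed senses), Jaccard similarity stands in for the overlap S, and the `threshold` cutoff is invented for illustration. The function stops at the finest level the available metadata supports.

```python
def jaccard(a, b):
    """Set similarity used here as a stand-in for the overlap S."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def staged_match(mine, other, threshold=0.2):
    """Return (level, score) for the finest comparison the metadata allows."""
    # Coarse: ontology-ontology, from task-description keywords.
    coarse = jaccard(mine.get("tasks", ()), other.get("tasks", ()))
    if coarse < threshold:
        return ("coarse", coarse)  # domains too far apart to refine

    my_terms = mine.get("terms")
    other_terms = other.get("terms")
    if not my_terms or not other_terms:
        return ("coarse", coarse)  # term metadata absent: use what exists

    # Medium: ontology-term, comparing the two term vocabularies.
    shared = set(my_terms) & set(other_terms)
    if not shared:
        return ("medium", jaccard(my_terms, other_terms))

    # Fine: term-term, averaging sense overlap over the shared terms.
    fine = sum(jaccard(my_terms[t], other_terms[t]) for t in shared) / len(shared)
    return ("fine", fine)
```

Because each stage only reads metadata that is present, an ontology with sparse annotation still yields a coarse score instead of failing.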

  19. Semantic Variance Across Domains • For example, the term “Insurgent” means: • Mission Planner: person who takes part in an armed rebellion against the constituted authority • Geospatial Analyst: someone who participates in a peaceful public display of group feeling • Diplomatic Corps: someone who participates in a public display against an established government

  20. Semantic Challenges • Semantic Consistency • Semantic Variance Across Domains • Search and Discovery Requirements • Controlled • Uncontrolled Vocabularies • Have “local” variants: Navy fliers and Air Force pilots • Change to match changing reality: yesterday’s friend could be tomorrow’s foe • Change to match changes in policy (spin): today, “freedom fighter”; tomorrow, “insurgent”

  21. KORS is Extensible, Scalable, Adaptive • Ontology level metadata describes the basic functional concepts and processes of the domain • Can be linked to Enterprise Architecture products • Direct conceptual match between two functional domains • Ontology term descriptions use: • Domain-specific (uncontrolled) expressions • Allowed senses from controlled vocabularies • Disallowed senses from controlled vocabularies • True meaning is found from the combination of domain, allowed, and disallowed senses, as is done in real language • Metadata architecture: values used if present but not required → scalability and extensibility
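
One way to picture a term description built from a domain expression plus allowed and disallowed senses is sketched below. Everything here is an assumption made for illustration: the `TermSenses` class, the sense labels, and the scoring rule (shared allowed senses count for a match, while a sense that one side allows and the other explicitly disallows counts against it) are not taken from KORS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermSenses:
    """Hypothetical term description: local expression plus sense lists."""
    expression: str                      # domain-specific (uncontrolled) wording
    allowed: frozenset = frozenset()     # senses from a controlled vocabulary
    disallowed: frozenset = frozenset()  # senses explicitly ruled out


def sense_match(a, b):
    """Score in [0, 1]: agreement on allowed senses, penalized by conflicts."""
    # A conflict is a sense one term allows and the other disallows.
    conflicts = (a.allowed & b.disallowed) | (b.allowed & a.disallowed)
    agreeing = a.allowed & b.allowed
    candidates = a.allowed | b.allowed
    if not candidates:
        return 0.0  # no sense metadata on either side
    return max(0, len(agreeing) - len(conflicts)) / len(candidates)
```

With this rule, two domains that use the same word can still score zero when one of them disallows the sense the other relies on, which is the distinction a plain synonym list cannot express.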

  22. Example Domain: GEOINT • Exploitation and analysis of imagery and geospatial information to describe, assess, and visually depict physical features and geographically referenced activities on the Earth. • GEOINT encompasses all the activities involved in the collection, analysis, and exploitation of spatial information in order to gain knowledge about the national security environment, and the visual depiction of that knowledge.

  23. MyGEOINT Ontological Architecture

  24. Functional Application • Semantic Expansion • Cross domain commonality • Qualified synonym identification >> discovery of potentially relevant knowledge • Semantic Resolution • Allowed and disallowed alternate semantics • Binding of dynamic domain-specific semantics to controlled vocabularies. >> computable semantic comparisons and knowledge relevance ranking

  25. Cross domain commonality • Answers the broad question: “Do the concepts in this other ontology have semantic similarity to those in my ontology?” • Semantic metadata used to characterize domain ontologies • Overlap integrals calculated from semantic metadata, updated when ontologies change • Run-time ontology-to-ontology comparison is greatly simplified.

  26. Qualified synonym identification • Knowledge discovery via synonym lists is well supported by standardized lexical tools (WordNet, etc.) • Broad-domain perspective limits ability to isolate domain-specific usage • KORS domain-specific ontologies and semantic metadata allow the use of broad-spectrum vocabularies to make fine-grained distinctions.

  27. Normalizing Uncontrolled Vocabularies • Semantic Homing • The binding of dynamic domain-specific semantics to controlled vocabularies • Allows domain-specific ontologies to evolve • Provides stable semantic anchors for knowledge computability.
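
Semantic homing, as described on this slide, might be pictured as a registry that binds each domain's local expression to a stable anchor in a controlled vocabulary. The sketch below is an assumption for illustration only: the `SemanticHome` class, its method names, and the example anchor are hypothetical. Bindings can be added or changed as a domain's language evolves, while the anchors themselves stay fixed.

```python
class SemanticHome:
    """Hypothetical registry binding (domain, expression) to a fixed anchor."""

    def __init__(self, controlled_vocabulary):
        self._anchors = frozenset(controlled_vocabulary)  # stable anchors
        self._bindings = {}                               # evolving bindings

    def bind(self, domain, expression, anchor):
        """Bind or rebind a local expression to a controlled-vocabulary anchor."""
        if anchor not in self._anchors:
            raise ValueError(f"unknown anchor: {anchor!r}")
        self._bindings[(domain, expression.lower())] = anchor

    def resolve(self, domain, expression):
        """Return the anchor for a local expression, or None if unbound."""
        return self._bindings.get((domain, expression.lower()))

    def same_concept(self, term_a, term_b):
        """True when two (domain, expression) pairs share an anchor."""
        home_a = self.resolve(*term_a)
        home_b = self.resolve(*term_b)
        return home_a is not None and home_a == home_b
```

For instance, binding `("navy", "flier")` and `("air_force", "pilot")` to one assumed anchor such as `"aviator"` lets the two local variants be compared as the same concept.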

  28. MyGEOINT: Ontology Knowledge Discovery • Ontology applies concept matching to make discoveries more relevant

  29. Functional Impact • Discovery of knowledge sources of potential relevance. • Simpler solutions, • lower real-time computational requirements, • practical multi-domain solutions. • Computable semantic comparisons and knowledge relevance ranking • Directly computed rather than inferred • Higher domain-level precision without the overhead of extensive upper and mid-level ontologies
