How Description Logic Ontologies Benefit from Formal Concept Analysis • Barış Sertkaya • SAP Research Center Dresden, Germany
FCA and DLs, what are they? • Formal Concept Analysis (FCA) • field of mathematics based on lattice theory • analyze data and derive a conceptual structuring • medicine, psychology, ontologies, linguistic databases, software engineering, musicology, … • Description Logics (DLs) • logical languages that are fragments of First Order Logic • represent conceptual knowledge of an application domain • semantic web, ontologies, life sciences, bio-medical computer science, software engineering, … • Concept: collection of objects sharing certain properties
FCA vs. DLs • Formal Concept Analysis (FCA) • data • algorithms • formal concepts: • concept lattice • Description Logics (DLs) • atomic concepts, roles: logical constructors: • concept descriptions: • classification algorithm • subsumption hierarchy
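The FCA side of this comparison (data, formal concepts, concept lattice) can be illustrated with a small sketch. The context below, including the attribute name `HasCoast`, is a hypothetical toy example and not taken from the slides; the brute-force enumeration over attribute subsets is chosen for readability, not efficiency.

```python
from itertools import combinations

# Hypothetical toy context: each object is mapped to the attributes it has.
CONTEXT = {
    "Spain": {"Country", "HasCoast"},
    "Portugal": {"Country", "HasCoast"},
    "Austria": {"Country"},
    "Atlantic": {"BodyOfWater"},
}
ATTRIBUTES = set().union(*CONTEXT.values())


def common_attributes(objects):
    """Attributes shared by all given objects (all attributes for the empty set)."""
    result = set(ATTRIBUTES)
    for obj in objects:
        result &= CONTEXT[obj]
    return result


def objects_having(attributes):
    """Objects that have every one of the given attributes."""
    return {obj for obj, attrs in CONTEXT.items() if attributes <= attrs}


def formal_concepts():
    """Enumerate all (extent, intent) pairs by closing every attribute subset."""
    concepts = set()
    attrs = sorted(ATTRIBUTES)
    for size in range(len(attrs) + 1):
        for subset in combinations(attrs, size):
            extent = objects_having(set(subset))
            intent = common_attributes(extent)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(), key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
```

Ordering these pairs by inclusion of their extents gives the concept lattice of the toy context.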
FCA vs. DLs FCA • intensional knowledge derived from the extensional knowledge • concept definitions are conjunctions of atomic concepts (attributes) • objects fully described (closed world semantics) DLs • intensional definition of a concept given independent of a specific domain • rich language for describing concepts (negation, exists, forall, number restrictions…) • individuals partially described (open world semantics)
Knowledge Representation (KR) Develop formalisms • for representing conceptual knowledge of an application domain, • that have a well-defined syntax, • formal, unambiguous semantics, • and practical methods for reasoning / efficient implementations. Conceptual Knowledge • Classes: country, ocean-country, … • Relations: has border to, has neighbor, … • Individuals: Spain, Mediterranean, Atlantic, …
Description Logics (DLs) • family of logic-based knowledge representation formalisms • describe an application domain in terms of • concepts (classes): like Country, Ocean, … • roles (relations): like hasBorderTo, hasNeighbour, … • individuals like Spain, Atlantic, … • logical constructors: • well-defined formal semantics, decidable fragments of First Order Logic
The DL $\mathcal{ALC}$: the smallest propositionally closed description logic • atomic concepts: $A$, $B$, … (unary predicates) • atomic roles: $r$, $s$, … (binary predicates) • constructors: • $\neg C$ (negation) • $C \sqcap D$ (conjunction) • $C \sqcup D$ (disjunction) • $\exists r.C$ (existential restriction) • $\forall r.C$ (value restriction) • Examples:
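Semantics of $\mathcal{ALC}$ • Based on an interpretation $\mathcal{I} = (\Delta^{\mathcal{I}}, \cdot^{\mathcal{I}})$ consisting of: • a domain $\Delta^{\mathcal{I}}$ (a non-empty set), and an interpretation function $\cdot^{\mathcal{I}}$ • Concept and role names: • $A^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}}$ (concept names interpreted as subsets of the domain) • $r^{\mathcal{I}} \subseteq \Delta^{\mathcal{I}} \times \Delta^{\mathcal{I}}$ (role names interpreted as binary relations) • Complex concept descriptions: • $(\neg C)^{\mathcal{I}} = \Delta^{\mathcal{I}} \setminus C^{\mathcal{I}}$ • $(C \sqcap D)^{\mathcal{I}} = C^{\mathcal{I}} \cap D^{\mathcal{I}}$ • $(C \sqcup D)^{\mathcal{I}} = C^{\mathcal{I}} \cup D^{\mathcal{I}}$ • $(\exists r.C)^{\mathcal{I}} = \{ d \mid \text{there is an } e \text{ with } (d,e) \in r^{\mathcal{I}} \text{ and } e \in C^{\mathcal{I}} \}$ • $(\forall r.C)^{\mathcal{I}} = \{ d \mid \text{for all } e,\ (d,e) \in r^{\mathcal{I}} \text{ implies } e \in C^{\mathcal{I}} \}$ • is a model of if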
Example of an interpretation • [Figure: the interpretation function maps names to the interpretation domain. Concept names: Sea, Ocean, BodyOfWater, OceanCountry, LandlockedCountry, Country. Individual names: Mediterranean, Pacific, Atlantic, Portugal, Spain, Austria. Roles: hasBorderTo, hasNeighbour.]
Example of an interpretation • [Figure: interpretation domain with the concepts Ocean (Atlantic) and Country (Portugal, Spain), connected by the roles hasBorderTo and hasNeighbour.]
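The example interpretation can be made concrete with a small evaluator that computes the extension of a concept description over a finite domain, following the standard set-theoretic semantics of the five constructors. This is only a sketch: the tuple encoding of concepts is an assumption made for illustration, and the exact role pairs are assumed as well, since the original figure is not recoverable.

```python
# Finite interpretation modelled on the slide; the role pairs are assumptions.
DOMAIN = {"Atlantic", "Portugal", "Spain"}
CONCEPTS = {"Ocean": {"Atlantic"}, "Country": {"Portugal", "Spain"}}
ROLES = {
    "hasBorderTo": {("Portugal", "Atlantic"), ("Spain", "Atlantic")},
    "hasNeighbour": {("Portugal", "Spain"), ("Spain", "Portugal")},
}


def successors(role, element):
    """All elements reachable from `element` via `role`."""
    return {y for (x, y) in ROLES[role] if x == element}


def extension(concept):
    """Extension of a concept given as a nested tuple, e.g. ("and", C, D)."""
    op = concept[0]
    if op == "atom":
        return set(CONCEPTS[concept[1]])
    if op == "not":
        return DOMAIN - extension(concept[1])
    if op == "and":
        return extension(concept[1]) & extension(concept[2])
    if op == "or":
        return extension(concept[1]) | extension(concept[2])
    if op == "exists":  # some role successor lies in the filler
        filler = extension(concept[2])
        return {x for x in DOMAIN if successors(concept[1], x) & filler}
    if op == "forall":  # every role successor lies in the filler
        filler = extension(concept[2])
        return {x for x in DOMAIN if successors(concept[1], x) <= filler}
    raise ValueError(f"unknown constructor: {op!r}")


if __name__ == "__main__":
    ocean_country = ("and", ("atom", "Country"),
                     ("exists", "hasBorderTo", ("atom", "Ocean")))
    print(sorted(extension(ocean_country)))
```

Note that a value restriction holds vacuously for elements without any successors, so in this interpretation the extension of "all neighbours are countries" also contains Atlantic.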
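Reasoning • Main reasoning task: • Concept subsumption: Is $C$ subsumed by $D$? (written $C \sqsubseteq D$) (Does $C^{\mathcal{I}} \subseteq D^{\mathcal{I}}$ hold for all interpretations $\mathcal{I}$?) • Concept subsumption is used for computing the subsumption hierarchy (classification) • [Figure: a reasoner arranges the concepts LandMass, BodyOfWater, Country, Sea, Ocean, OceanCountry, LandLockedCountry into a subsumption hierarchy.]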
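Classification can be sketched for the simplest possible case, where the terminology consists only of inclusions between concept names. The inclusions below are assumptions based on the concept names on the slide, since the figure itself is lost, and they are assumed to be acyclic. The code only follows told inclusions transitively; it is a stand-in for illustration, not a DL reasoner for complex concept descriptions.

```python
# Assumed told inclusions between concept names: child -> direct told parents.
TOLD = {
    "LandMass": set(),
    "BodyOfWater": set(),
    "Country": {"LandMass"},
    "OceanCountry": {"Country"},
    "LandLockedCountry": {"Country"},
    "Sea": {"BodyOfWater"},
    "Ocean": {"BodyOfWater"},
}


def subsumers(name):
    """All names subsuming `name`, including itself (reflexive-transitive closure)."""
    seen = {name}
    stack = [name]
    while stack:
        current = stack.pop()
        for parent in TOLD[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_subsumed_by(sub, sup):
    """Subsumption test restricted to the told, acyclic name hierarchy."""
    return sup in subsumers(sub)


def classify():
    """Map each name to its most specific strict subsumers."""
    hierarchy = {}
    for name in TOLD:
        strict = subsumers(name) - {name}
        # Keep only subsumers that are not themselves above another strict subsumer.
        hierarchy[name] = {
            c for c in strict
            if not any(c in subsumers(d) - {d} for d in strict)
        }
    return hierarchy


if __name__ == "__main__":
    for name, parents in sorted(classify().items()):
        print(name, "->", sorted(parents))
```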
DL Knowledge Bases (Ontologies) • DL Knowledge Base (Ontology) = TBox + ABox • TBox defines the terminology of the application domain • ABox states facts about a specific world • TBox: a set of concept definitions • ABox: concept and role assertions • General TBox: general concept inclusion axioms
Bridging the gap between FCA and DLs • Existing work falls mainly into two categories: • enriching FCA by borrowing constructors from DLs • theory-driven logical scaling [Prediger, Stumme’99] • terminological attribute logic [Prediger’00] • relational concept analysis [Rouane, Huchard, Napoli, Valtchev’07] • logical concept analysis [Ferré, Ridoux’01] • employing FCA methods in DL knowledge bases • Computation of an extended subsumption hierarchy [Baader’95] • Subsumption hierarchy of conjunctions and disjunctions of DL concepts [Stumme’96] • Subsumption hierarchy of least common subsumers [Baader, Molitor’00] • Relational exploration [Rudolph’04,06] • Supporting bottom-up construction of DL knowledge bases [Baader, Turhan, Sertkaya’07] • Knowledge Base Completion [Baader, Ganter, Sattler, Sertkaya’07] • Role assertion analysis [Coulet, Smail-Tabbone, Napoli, Devignes’08] • Exploring finite models [Baader, Distel’08,09]
Extended Subsumption Hierarchy of DL Concepts • traditional TBox classification: subsumption hierarchy of concepts • not sufficient in some settings: interaction between defined concepts not visible • consider the concepts , , and • no subsumption relation between these three concepts • but, subsumed by • not visible from the subsumption hierarchy! • hierarchy of conjunctions of defined concepts enables faster inferences. • precompute and store it. • how? Using attribute exploration • define a formal context whose concept lattice represents this hierarchy
Extended Subsumption Hierarchy of DL Concepts • Formal context s.t. the concept lattice is isomorphic to the hierarchy of conjunctions of DL concepts [Baader’95]: … … … • and , but , which is not visible in the usual hierarchy • implication questions are subsumption tests • a DL reasoner can act as an expert • a modified DL reasoner is needed for providing counterexamples
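The point that a conjunction can be subsumed although none of its conjuncts is can be illustrated on a small finite context. The concept names `C1`, `C2`, `C3` and the individuals `x1` to `x4` are hypothetical placeholders, and checking implications against a fixed finite context is only a stand-in: in the approach on the slide, a DL reasoner answers the implication questions.

```python
# Hypothetical context: rows are individuals, columns are defined concept names.
CONTEXT = {
    "x1": {"C1"},
    "x2": {"C2"},
    "x3": {"C1", "C2", "C3"},
    "x4": {"C3"},
}


def holds(premise, conclusion):
    """True if every object having all of `premise` also has all of `conclusion`."""
    return all(conclusion <= attrs
               for attrs in CONTEXT.values() if premise <= attrs)


def closure(attributes):
    """Attributes shared by all objects that have `attributes`."""
    shared = set().union(*CONTEXT.values())
    for attrs in CONTEXT.values():
        if attributes <= attrs:
            shared &= attrs
    return shared


if __name__ == "__main__":
    print(holds({"C1"}, {"C3"}))        # single concept name as premise
    print(holds({"C2"}, {"C3"}))        # single concept name as premise
    print(holds({"C1", "C2"}, {"C3"}))  # conjunction as premise
    print(sorted(closure({"C1", "C2"})))
```

In this toy context only the implication with the conjunction as premise holds, which is exactly the kind of relation that a hierarchy over single concept names cannot show.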
Contributions to bridging the gap: 1) supporting bottom-up construction of KBs • traditional way of creating ontologies (top-down manner): • define concepts • specify properties of individuals using them • not always adequate • which concepts are relevant? • how to define them correctly? • alternative: bottom-up construction of ontologies • User selects similar ABox individuals • Individuals are automatically generalized into concept descriptions (MSC computation) • Commonalities are automatically extracted (LCS computation) • The LCS is inspected/modified by the ontology engineer and added to the ontology
Supporting bottom-up construction of KBs • subsumption hierarchy of conjunctions of concept names and their negations needed for computing LCS • requires subsumption tests for a TBox containing concept names • each subsumption test computationally expensive • computing the hierarchy smartly without checking all pairs? • using attribute exploration • Again define an appropriate formal context • DL reasoner can answer implication questions • Use background knowledge • implies • implies on the FCA side
Bridging the gap: 2) Ontology completion • Existing ontology tools support: • detecting inconsistencies • inferring consequences • finding reasons for them • Quality dimension of soundness • What about completeness? • are there • missing relations between classes? • missing individuals? • if so, how to extend the ontology appropriately?
Ontology Completion • Are all European countries EU members? • Do all EU members that have a border to the Mediterranean have territories in Europe?
The Phosphatase Ontology • OWL ontology for the human protein phosphatase family [Wolstencroft, Brass, Horrocks, Lord, Sattler, Turi, & Stevens (2005)] • developed based on peer-reviewed publications • detailed knowledge about different classes of such proteins • TBox: classes of proteins, relations among these classes • ABox: large set of human phosphatases identified and documented by expert biologists • Given this ontology, the biologist wants to know: • Are there relations that hold in the real world, but that do not follow from the TBox? • Are there phosphatases that are not represented in the ABox, or even that have not yet been identified?
When is an ontology (formally) complete? • is complete w.r.t. the intended application domain if these are equivalent: • ( and are sets of concept names) • is satisfied by • follows from • does not contain a counterexample to • Cannot be achieved by an automated tool alone, a domain expert needed! • questions ( the number of concept names) • Many of them redundant • Do not bother the expert unnecessarily • A smart way to get answers to these questions: attribute exploration!
Attribute Exploration for DL Ontologies • Extension for open-world semantics of DL ABoxes • Attribute exploration for partial/incomplete formal contexts • Already existing approaches [Burmeister & Holzer 2005] • the resulting knowledge is incomplete (certain implications, uncertain implications) • In contrast, we want to have complete knowledge at the end • Our expert has, or can access, complete knowledge • But he should be able to give partial descriptions of objects during exploration • Proved termination, correctness, minimum number of questions • An ABox is a partial context • Integrated a DL reasoner for avoiding questions • Improved usability. The expert can: • Skip questions • Stop exploration, see previous answers, undo previous actions • See why an implication was accepted automatically
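The idea that an ABox is a partial context can be sketched as follows: each individual carries the concept names it is known to belong to and the concept names it is known not to belong to, and everything else is left open. Under this reading, an individual refutes an implication only if it certainly satisfies the whole premise and certainly violates part of the conclusion. The class name `PartialObject`, the helper names, and the sample individuals are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialObject:
    """A partially described individual (hypothetical helper type)."""
    name: str
    positive: frozenset  # concept names the individual is known to belong to
    negative: frozenset  # concept names the individual is known not to belong to


def refutes(obj, premise, conclusion):
    """True if `obj` certainly has the premise and certainly lacks part of the conclusion."""
    return premise <= obj.positive and bool(conclusion & obj.negative)


def refuted_by_any(objects, premise, conclusion):
    """True if some individual of the partial context is a counterexample."""
    return any(refutes(obj, premise, conclusion) for obj in objects)


if __name__ == "__main__":
    abox = [
        # Illustrative descriptions, not taken from the slides.
        PartialObject("Switzerland", frozenset({"EuropeanCountry"}),
                      frozenset({"EUMember"})),
        # Nothing is known about EU membership here (open world).
        PartialObject("someCountry", frozenset({"EuropeanCountry"}), frozenset()),
    ]
    print(refuted_by_any(abox, {"EuropeanCountry"}, {"EUMember"}))
    print(refuted_by_any(abox[1:], {"EuropeanCountry"}, {"EUMember"}))
```

An individual with missing information never counts as a counterexample, which is where the open-world reading differs from a classical, fully specified formal context.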
Ontology Completion • When a question is asked: • first check if it follows from the ontology • if not ask the expert • if the expert confirms, add a new axiom to the TBox • if the expert rejects, get a new ABox assertion as counterexample
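The four steps above can be sketched as a single function that handles one question. Everything here is a simplified assumption: the TBox is reduced to a list of implications between sets of concept names, forward chaining over those implications stands in for the DL reasoner, and `ask_expert` is a hypothetical callback that returns `None` to confirm or a counterexample description to reject. How the questions themselves are generated is not shown.

```python
def closure_under(implications, attributes):
    """Forward-chain the accepted implications from a set of concept names."""
    closed = set(attributes)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return closed


def handle_question(premise, conclusion, tbox, abox, ask_expert):
    """Process one implication question and report what happened."""
    # 1. First check whether the question already follows from the ontology.
    if conclusion <= closure_under(tbox, premise):
        return "follows"
    # 2. If not, ask the expert (None means the expert confirms).
    counterexample = ask_expert(premise, conclusion)
    if counterexample is None:
        # 3. Confirmed: add a new axiom to the TBox.
        tbox.append((frozenset(premise), frozenset(conclusion)))
        return "accepted"
    # 4. Rejected: record the counterexample as a new ABox assertion.
    abox.append(counterexample)
    return "rejected"


def scripted_expert(premise, conclusion):
    """Hypothetical expert: rejects one question with an illustrative individual."""
    if premise == {"EuropeanCountry"}:
        return ("Switzerland", {"EuropeanCountry"}, {"EUMember"})
    return None


if __name__ == "__main__":
    tbox = [(frozenset({"OceanCountry"}), frozenset({"Country"}))]
    abox = []
    questions = [
        ({"OceanCountry"}, {"Country"}),
        ({"EuropeanCountry"}, {"EUMember"}),
        ({"EUMember"}, {"Country"}),
    ]
    for premise, conclusion in questions:
        print(sorted(premise), "->", sorted(conclusion), ":",
              handle_question(premise, conclusion, tbox, abox, scripted_expert))
```

Checking the ontology before asking is what keeps redundant questions away from the expert; a real implementation would call a DL reasoner at that step.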
Summary: How do DLs benefit from FCA? Mainly in two categories: • using the concept lattice to detect implicit relations between classes • Extended subsumption hierarchy (of conjunctions of concepts) • Subsumption hierarchy of least common subsumers • Supporting bottom-up construction • using attribute exploration to complete knowledge • Knowledge base completion
FCA at SAP Research • The Aletheia Project • Obtaining product information through the use of semantic technologies • FCA used for requirement analysis • sponsored by the Federal Ministry of Education and Research (BMBF) • Partners: SAP AG, ABB, BMW Group, Deutsche Post, OntoPrise, Otto, TU Dresden, FU Berlin, HU Berlin, Fraunhofer IIS, TecO, Giesecke & Devrient, Eurolog • http://www.aletheia-projekt.de • New project CUBIST (Combining and Uniting Business Intelligence with Semantic Technologies) • FCA used for visual analytics on top of business intelligence • Partners: SAP AG, Sheffield Hallam University, Heriot-Watt University, Innovantage, Ontotext Lab, Centrale Recherche S.A. (CRSA) – Laboratoire MAS, Space Applications Services NV • Academic articles at ICCS, ICFCA on Role Based Access Control for Ontologies, …
Early Days of KR • Semantic Networks [Quillian 1967] • nodes represent classes • links represent relations • hasBorderTo: does it mean there exists a border, or for all borders? • ambiguous semantics! • KL-ONE [Brachman & Levesque 1985] • logic-based semantics • [Figure: semantic network with the nodes PieceOfLand, Country, OceanCountry, IslandCountry, Ocean, and BodyOfWater, connected by "is a" and hasBorderTo links.]