Ontology Generation and Applications

Ontology Generation and Applications. Dr. A.C.M. Fong, CEng Professor of Computer Engineering School of Computing and Mathematical Sciences Faculty of Design and Creative Technologies Auckland University of Technology afong@aut.ac.nz. Contents. Introduction – Semantic Web and Ontology

Presentation Transcript


  1. Ontology Generation and Applications. Dr. A.C.M. Fong, CEng, Professor of Computer Engineering, School of Computing and Mathematical Sciences, Faculty of Design and Creative Technologies, Auckland University of Technology. afong@aut.ac.nz

  2. Contents • Introduction – Semantic Web and Ontology • Related Work – Ontology Generation • Toward Automated Ontology Generation • Fuzzy Ontology Generation Framework • Application 1 – Scholarly Info • Application 2 – Service Helpdesk

  3. Introduction: Semantic Web • The Semantic Web rests on its ability to represent real-life domains accurately, so that programs can understand the environment in which they operate. In summary, the Semantic Web provides the following benefits: • It offers an expressive metadata model to represent data, so that data can be managed effectively. • Programs can understand the semantic concepts described in the metadata used on the Semantic Web. Hence, knowledge carried on the Semantic Web can be shared and reused among different programs. • Users can interact with programs using a semantic query language to specify their requests, thereby improving retrieval performance. • The deductive mechanisms used to derive new information from existing information can be described clearly, so that knowledge can be reasoned with efficiently.

  4. Introduction: Semantic Web Architecture

  5. Introduction: Semantic Web Architecture - Layers • Foundation Layer. The Semantic Web uses Uniform Resource Identifiers (URIs) to identify resources and Unicode to encode documents. • Schema Layer. This layer comprises the XML + NS (Namespace) + XML Schema layer and the RDF + RDF Schema layer. It defines objects and classes, their relations and constraints. XML Schema (XMLS) and RDF Schema (RDFS), which are based on XML and RDF respectively, are used for these layers. RDFS has been widely used to describe classes at the Schema Layer.

  6. Introduction: Semantic Web Architecture - Layers • Ontology Layer. This layer provides constructs that use meta-information to represent domain knowledge. In this layer, information is represented as an ontology, which the Semantic Web adopts to define knowledge. • Logic Layer. This layer infers further knowledge from existing knowledge. It can be integrated with the Ontology Layer. In this layer, concepts and relationships defined in lower layers are converted into Turing-complete logic languages in order to generate new knowledge.

  7. Introduction: Semantic Web Architecture - Layers • Proof Layer. This layer provides a mechanism to check whether a statement is true or not. • Trust Layer. This layer provides a mechanism that resolves conflicts between knowledge carried by the Semantic Web to form the "Web of Trust". • Digital Signature Layer. This layer uses public key cryptography to secure documents.

  8. Introduction: Ontology – Definition • Ontology has different definitions. A commonly cited one defines an ontology as a formal, explicit specification of a shared conceptualization. • Conceptualization: an abstract model of phenomena in the world, obtained by identifying the relevant concepts of those phenomena. • Explicit: the types of concepts used, and the constraints on their use, are explicitly defined. • Formal: the ontology should be machine readable. • Shared: the ontology should capture consensual knowledge accepted by the communities.

  9. Introduction: Ontology Research • Ontology is regarded as a standard conceptual model for knowledge representation, especially on the Semantic Web. • The term ontology engineering has been proposed to cover ontology-related research in computer science. • Current issues of interest in ontology engineering include ontology generation, ontology mapping, ontology integration and ontology versioning. • This presentation focuses on ontology generation.

  10. Introduction: Ontology Description Languages • An ontology is described using an ontology description language. Ontology description languages are based on Web metadata description languages, which can be classified into the following three groups: • HTML-based • XML-based • RDF-based

  11. Introduction: HTML-based Ontology Description Languages • The tags supported by the traditional Web are sufficient to represent some semantic knowledge. • Simple HTML Ontology Extensions (SHOE) and Ontobroker embed additional tags into HTML to represent knowledge. • However, HTML does not support self-defined tags. Therefore, it is difficult to define ontology classes with an HTML-based approach. • Hence, XML-based ontology description languages have been proposed to overcome this limitation.

  12. Introduction: XML-based Ontology Description Languages • These languages are usually based on XML Schema (XMLS) or Document Type Definition (DTD). • DTD allows users to define new markup types to describe information. Therefore, users can define ontology classes using DTD. • Moreover, XMLS supports the definition of relations between classes. • Thus, XMLS and DTD can be used directly to embed semantic information. • However, since XML provides only syntactic support for knowledge representation, XML-based ontology description languages face the following problems when representing knowledge:

  13. Introduction: XML-based Ontology Description Languages • XML lacks a mechanism to define relationships that are usually central in ontologies, such as is-a or element-of relationships. • XML does not support any notion of inheritance, which is an important attribute in ontologies. • In XML, concepts are defined through tags, which can contain either a string or a combination of other nested tags. Such a mechanism may not be sufficient for defining concepts in an ontology, which may require richer data structures. • In XML, the order in which tags appear in a document must be defined in advance. In contrast, the ordering of attribute descriptions does not matter in an ontology.

  14. Introduction: RDF-based Ontology Description Languages • RDF extends XML to become a standard for knowledge representation. • In addition, RDF Schema (RDFS) can be used to define classes and class hierarchies in a domain. The standardization supported by RDF provides two important contributions: • A standard set of modeling primitives (e.g. class, instance, etc.) and their relationships (e.g. subclass) are provided. • A standardized syntax for writing ontologies is supported. • Popular RDF-based ontology description languages include DARPA Agent Markup Language (DAML), Ontology Inference Language (OIL), DAML+OIL and Web Ontology Language (OWL).

  15. Introduction: DARPA Agent Markup Language • DAML, or DAML-ONT, extends RDFS to represent ontology using the object-oriented approach. • It embeds some object-oriented concepts to represent classes. Thus, the class representation of DAML-ONT is better than that of RDF. • Example of DAML-ONT representing the class "Journal", which is a subclass of the class "Publication Medium" but is disjoint with the classes "Conference" and "Workshop" (i.e. an object that belongs to class "Journal" cannot belong to classes "Conference" or "Workshop"):
      <Class ID="Journal">
        <subClassOf resource="#Publication Medium"/>
        <disjointFrom resource="#Conference"/>
        <disjointFrom resource="#Workshop"/>
      </Class>

  16. Introduction: Ontology Inference Language • OIL extends RDFS to represent ontology. It is designed around three criteria: • Frame-based. It supports frames to define classes and properties of classes. Thus, class contents can be described more informatively (e.g. constraints can be placed on class properties). • Description Logic. It describes knowledge using logic rules. Thus, knowledge is represented mathematically and can be processed by programs. • Web standards. It is based on XML and RDFS.

  17. Introduction: Ontology Inference Language
      <rdfs:Class rdf:ID="animal"/>
      <rdfs:Class rdf:ID="plant">
        <rdfs:subClassOf>
          <oil:NOT>
            <oil:hasOperand rdf:resource="#animal"/>
          </oil:NOT>
        </rdfs:subClassOf>
      </rdfs:Class>
      <rdfs:Class rdf:ID="tree">
        <rdfs:subClassOf rdf:resource="#plant"/>
      </rdfs:Class>
  • Class "animal" is defined, followed by class "plant", which is defined with the operator "NOT" to state that it is disjoint from class "animal" (i.e. objects that belong to class "animal" cannot belong to class "plant" and vice versa). • Finally, class "tree" is defined as a subclass of "plant".

  18. Introduction: DAML vs. OIL • Compared with DAML, OIL can represent class properties better, but DAML can represent class relationships more clearly. • Hence, they can be combined to form a better ontology description language, DAML+OIL. • It defines class relationships based on DAML. • Class properties are defined in a similar way to OIL. • Hence, DAML+OIL combines the advantages of both DAML and OIL.

  19. Introduction: Web Ontology Language • OWL extends DAML+OIL to allow users to define various types of relationships between classes. • Properties can also be defined using additional constructs in OWL. OWL has three sublanguages: • OWL Lite • OWL DL • OWL Full

  20. Introduction: Web Ontology Language • Although these sublanguages share the same OWL syntax, they differ slightly in design and are aimed at different communities of implementers and users: • OWL Lite primarily supports a classification hierarchy and simple constraints when designing classes. • OWL DL includes all OWL language constructs, but they can be used only under certain restrictions (e.g. a class cannot be an instance of another class). • OWL Full allows all OWL language constructs to be used without any restriction.

  21. Introduction: Web Ontology Language • Example OWL document. It contains header information (namespace declarations), the ontology name and version, and three classes: Concept1 (labelled "Data Mining"), Concept2 (labelled "Fuzzy Logic") and Concept3. Concept3 is a subclass of both Concept1 and Concept2.
      <rdf:RDF
          xmlns:owl="http://www.w3.org/2002/07/owl#"
          xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
          xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
          xmlns:xsd="http://www.w3.org/2000/10/XMLSchema#"
          xmlns:daml="http://www.w3.org/2001/10/daml+oil#">
        <owl:Ontology rdf:about="Scholarly Information">
          <owl:versionInfo>v 1.0 2009-12-07 19:06:40</owl:versionInfo>
        </owl:Ontology>
        <owl:Class rdf:ID="Concept1">
          <rdfs:label>Data Mining</rdfs:label>
        </owl:Class>
        <owl:Class rdf:ID="Concept2">
          <rdfs:label>Fuzzy Logic</rdfs:label>
        </owl:Class>
        <owl:Class rdf:ID="Concept3">
          <rdfs:label>Data Mining, Fuzzy Logic</rdfs:label>
          <rdfs:subClassOf rdf:resource="#Concept1"/>
          <rdfs:subClassOf rdf:resource="#Concept2"/>
        </owl:Class>
      </rdf:RDF>
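A minimal sketch of how a program might read the class hierarchy out of such an OWL document, using only Python's standard XML parser. It assumes the three classes are written with the standard `rdfs:label` and `rdfs:subClassOf` elements; the embedded document and the helper name `read_classes` are illustrative, not part of the presentation.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Illustrative document: three classes, the third a subclass of the other two.
OWL_DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="Concept1"><rdfs:label>Data Mining</rdfs:label></owl:Class>
  <owl:Class rdf:ID="Concept2"><rdfs:label>Fuzzy Logic</rdfs:label></owl:Class>
  <owl:Class rdf:ID="Concept3">
    <rdfs:label>Data Mining, Fuzzy Logic</rdfs:label>
    <rdfs:subClassOf rdf:resource="#Concept1"/>
    <rdfs:subClassOf rdf:resource="#Concept2"/>
  </owl:Class>
</rdf:RDF>"""


def read_classes(xml_text):
    """Map each class ID to its label and the IDs of its direct superclasses."""
    root = ET.fromstring(xml_text)
    classes = {}
    for cls in root.findall(f"{{{OWL}}}Class"):
        class_id = cls.get(f"{{{RDF}}}ID")
        label = cls.findtext(f"{{{RDFS}}}label")
        parents = [
            parent.get(f"{{{RDF}}}resource").lstrip("#")
            for parent in cls.findall(f"{{{RDFS}}}subClassOf")
        ]
        classes[class_id] = {"label": label, "parents": parents}
    return classes


classes = read_classes(OWL_DOC)
for class_id, info in classes.items():
    print(class_id, info["label"], info["parents"])
```

A full OWL toolkit would also handle restrictions, imports and reasoning; plain XML parsing is used here only to keep the sketch dependency-free.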

  22. 2. Related Work: Ontology Generation • Ontology uses classes, which contain attributes, to represent concepts. • Ontology also supports taxonomy and non-taxonomy relations between classes. • Although editing tools such as Protege [1] and OilEd [2] have been developed to help users create and edit ontologies, manually deriving an ontology from data remains a tedious task.

  23. 2. Related Work: Ontology Generation – Approaches • Ontology can be generated from various types of data, mostly textual. • Large corpora [3,4] are considered good sources for mining knowledge to construct ontology, since the information in a corpus is usually well annotated and can therefore be processed easily by other programs. • Ontology can also be generated from a knowledge base of rules [5], which is represented as a tree with rules residing at the tree nodes. Statistical approaches have been used to estimate the existence of relationships between the entities involved in rules [6].

  24. 2. Related Work: Ontology Generation – Approaches • When knowledge is represented in semi-structured schemata such as XML and RDF, its contents can easily be parsed by programs. Techniques have been proposed to generate ontology from semi-structured schemata based on graph theory [7] and statistical approaches [8]. • Learning Source Description (LSD) [9] has been proposed to generate ontology from arbitrary formalisms of semi-structured schemata. • The Entity-Relationship model used in database schemas has also been adopted as an information source for generating ontology [10,11].

  25. 2. Related Work: Ontology Generation – Textual Data • For textual data, ontology concepts can be extracted efficiently using Natural Language Processing (NLP) techniques [12,13]. • NLP is used to preprocess the textual data in order to extract significant keywords. • WordNet [14] can be used to improve the accuracy of ontology generated by NLP-based techniques. • However, NLP techniques have difficulty finding semantic relationships among the keywords. • Data mining techniques can be combined with NLP to improve the efficiency of ontology generation. In Text-to-Onto [15], association rules are used to find associative relations between keywords, which are then used to construct non-taxonomy relations for the ontology.
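To make the idea of associative relations between keywords concrete, here is a generic support/confidence sketch. It is not the Text-to-Onto algorithm itself; the function name `association_rules`, the thresholds and the four keyword sets are all hypothetical.

```python
from itertools import permutations


def association_rules(docs, min_support, min_confidence):
    """Return (a, b, support, confidence) for keyword rules a -> b.

    support    = fraction of documents containing both a and b
    confidence = fraction of documents containing a that also contain b
    """
    total = len(docs)
    keywords = sorted(set().union(*docs))
    rules = []
    for a, b in permutations(keywords, 2):
        with_a = sum(1 for doc in docs if a in doc)
        with_both = sum(1 for doc in docs if a in doc and b in doc)
        support = with_both / total
        confidence = with_both / with_a if with_a else 0.0
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


# Hypothetical keyword sets, one per document.
docs = [
    {"ontology", "clustering"},
    {"ontology", "clustering", "fuzzy"},
    {"ontology", "fuzzy"},
    {"clustering"},
]

rules = association_rules(docs, min_support=0.5, min_confidence=0.6)
for a, b, support, confidence in rules:
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

Rules that pass both thresholds are candidates for non-taxonomy relations; deciding what kind of relation each one expresses still needs further analysis.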

  26. 2. Related Work: Ontology Generation – Textual Data • Keyword frequencies are often used in statistical approaches [16,17] to identify significant keywords that can represent a certain concept. • Clustering techniques have also been applied to generate ontology from textual data [18]. • Using significant keywords extracted from textual data, clustering techniques can cluster documents and interpret topics from the generated clusters.

  27. 2. Related Work: Ontology Generation – Clustering • Clustering can be used to mine hidden knowledge from data to construct an ontology. It can also be used to enrich an existing ontology. • Traditional clustering techniques are useful for generating non-taxonomy relations for ontology. • In particular, conceptual clustering techniques can conceptualize clusters and construct a concept hierarchy of clusters, which is useful for generating taxonomy relations for ontology. • For example, an approach based on COBWEB [18] can generate taxonomy relations among the concepts of a domain for ontology generation. • Mo'K [19] is a system that can obtain taxonomy relations from tagged text using conceptual clustering.

  28. 2. Related Work: Ontology Applications – Scholarly Info • In the E-Scholar Knowledge Inference MOdel (ESKIMO) [20], knowledge on scholarly publications is represented as a simple ontology, known as OntoPortal, which is manually developed and maintained. • OntoPortal describes and provides links to other external research pages on the Web. Hypertext links between the web pages are also described in the OntoPortal ontology. • ESKIMO allows users to retrieve scholarly information from the constructed ontology by using queries represented as Prolog-like rules.

  29. 2. Related Work: Ontology Applications – Scholarly Info • In the Scholarly Ontology Project [21], a digital library Web server is constructed using Semantic Web technologies in order to support scholarly retrieval. • It is developed using a collaborative approach in which researchers submit their documents in a specifically structured format. • As such, the contents of the submitted documents can be further processed by the system and converted into scholarly ontology.

  30. 2. Related Work: Ontology Applications – Scholarly Info • In the Research in Semantic Scholarly Publishing (RSSP) project, scientific publications are collected from online archives such as the Open Archive Initiative (OAI) [22]. • Information about the documents (e.g. their authors, titles, citations, publishers, etc.) is extracted, indexed and converted into ontology formalism. • DAML+OIL is used to annotate the ontology as Semantic Web pages to support scholarly retrieval.

  31. 2. Related Work: Summary • Many techniques exist to construct ontology from various data types and sources, mainly textual data. • Traditionally, NLP techniques are used to analyze textual data. • Recently, data mining techniques have been combined with NLP to further discover hidden knowledge from textual data. • Conceptual clustering is an advanced data mining technique that can organize data in a hierarchical conceptual structure. • Thus, conceptual clustering is a useful technique for discovering knowledge to generate ontology from textual data.

  32. 3. Toward Automated Ontology Generation: Basics • The initial focus is on scholarly information. • Scholarly ontology is usually generated directly from explicit information on scientific publications (e.g. their titles, authors, citations, etc.). • Other advanced scholarly knowledge, such as research experts and research areas, is usually inferred manually by human experts.

  33. 3. Toward Automated Ontology Generation: Basics • To construct scholarly ontology from a citation database, we use data mining techniques to discover hidden knowledge in the database. • The data mining techniques include Context-based Cluster Analysis (CCA) and Fuzzy Concept Hierarchy Generation (FCHG). • The discovered knowledge is then converted and integrated into the ontology formalism. • As such, apart from the explicit information available on scientific publications, the Scholarly Ontology can also support other useful scholarly retrieval functions such as finding research experts and detecting trends.

  34. 3. Toward Automated Ontology Generation: Context-based Cluster Analysis • CCA is based on the Formal Concept Analysis (FCA) [23] technique. • FCA provides a formal model, known as a formal context, to represent relations between objects and attributes in a data set. • We use formal contexts to represent the results of multiple clusterings. • Relations between the formal contexts are then analyzed to find the relations between the corresponding clustering results.
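As background for the FCA model mentioned above, the sketch below enumerates the formal concepts of a small crisp formal context by brute force. It illustrates plain FCA only, not CCA; the three-document context and all function names are hypothetical, and the exhaustive subset enumeration is suitable only for tiny examples.

```python
from itertools import combinations


def common_attributes(context, objects, all_attributes):
    """Attributes shared by every object in `objects`."""
    return frozenset(
        m for m in all_attributes if all(m in context[g] for g in objects)
    )


def common_objects(context, attributes):
    """Objects that have every attribute in `attributes`."""
    return frozenset(g for g, attrs in context.items() if attributes <= attrs)


def formal_concepts(context):
    """Return all (extent, intent) pairs of a crisp formal context."""
    all_attributes = frozenset().union(*context.values())
    objects = sorted(context)
    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(context, subset, all_attributes)
            extent = common_objects(context, intent)  # closure of `subset`
            concepts.add((extent, intent))
    return concepts


# Hypothetical context: documents (objects) and research topics (attributes).
context = {
    "D1": frozenset({"Data Mining", "Clustering"}),
    "D2": frozenset({"Data Mining", "Fuzzy Logic"}),
    "D3": frozenset({"Data Mining"}),
}

concepts = formal_concepts(context)
for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

Ordering the concepts by extent inclusion yields the concept lattice; practical implementations use dedicated algorithms rather than enumerating every subset of objects.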

  35. 3. Toward Automated Ontology Generation: Fuzzy Concept Hierarchy Generation • A concept hierarchy is a data structure useful for knowledge representation. • It is widely used in data mining applications. • A concept hierarchy may need to be large to reflect the knowledge in a domain precisely. • Manual construction may therefore be difficult and tedious. • Conceptual clustering is needed.

  36. 3. Toward Automated Ontology Generation: Fuzzy Concept Hierarchy Generation • Many conceptual clustering techniques organize knowledge as a concept hierarchy, which may not be sufficient for representing information in a real domain. • FCA, a data exploration technique, supports the concept lattice, which provides a more informative conceptual model for representing knowledge. • FCA-based conceptual clustering techniques are potentially useful for constructing the taxonomy knowledge of an ontology. • However, typical FCA-based conceptual clustering techniques do not support uncertain information.

  37. 3. Toward Automated Ontology Generation: Fuzzy Concept Hierarchy Generation • Traditional FCA-based conceptual clustering approaches cannot represent vague information, so fuzziness is needed. • The L-fuzzy context uses linguistic variables to represent uncertainty in the context. • However, it needs human interpretation to define the linguistic variables. • The fuzzy concept lattice generated from an L-fuzzy context usually causes a combinatorial explosion of concepts compared to the traditional concept lattice.

  38. 3. Toward Automated Ontology Generation: Fuzzy Concept Hierarchy Generation • We combine fuzzy logic and FCA as Fuzzy Formal Concept Analysis (FFCA). • In FFCA, uncertain information is represented directly by a real-valued membership value in the range [0, 1]. • Linguistic variables are no longer needed. • Compared to the fuzzy concept lattice generated from an L-fuzzy context, the fuzzy concept lattice generated using FFCA is simpler in terms of the number of formal concepts. • It also supports a formal mechanism for calculating concept similarities. • Based on FFCA, we propose the Fuzzy Conceptual Clustering technique in FCHG to generate a fuzzy concept hierarchy.

  39. 4. Fuzzy Ontology Generation Framework: Fuzzy Ontology • Applying fuzzy logic offers a possible solution for dealing with uncertain information. • Fuzzy ontology is generated and used in text retrieval and search engines, where membership values are used to evaluate the similarities between the concepts in a concept hierarchy. • Manual generation of fuzzy ontology from a predefined concept hierarchy is a difficult and tedious task that often requires expert interpretation.

  40. 4. Fuzzy Ontology Generation Framework: Introduction • An efficient method for generating a concept hierarchy and fuzzy ontology is highly desirable. • We propose a Fuzzy Ontology Generation Framework (FOGF) that can automate fuzzy ontology generation from uncertain data based on Formal Concept Analysis (FCA) theory. • The generated fuzzy ontology is mapped to a semantic representation in OWL.

  41. 4. Fuzzy Ontology Generation Framework: Overview • Fuzzy Formal Concept Analysis incorporates fuzzy logic into Formal Concept Analysis to represent vague information. • Concept Hierarchy Generation clusters the fuzzy concept lattice generated by FFCA to construct a concept hierarchy in two steps: Fuzzy Conceptual Clustering and Hierarchical Relation Generation. • Fuzzy Ontology Generation constructs fuzzy ontology from a fuzzy context using the concept hierarchy created by fuzzy conceptual clustering. • Semantic Representation Conversion makes the knowledge accessible and sharable in the Web environment, using OWL.

  42. 4. Fuzzy Ontology Generation Framework: Step 1 Fuzzy Formal Concept Analysis • Definition (Fuzzy Formal Context): A fuzzy formal context is a triple $K = (G, M, I = \varphi(G \times M))$, where $G$ is a set of objects, $M$ is a set of attributes, and $I$ is a fuzzy set on the domain $G \times M$. Each relation $(g, m) \in I$ has a membership value $\mu(g, m)$ in $[0, 1]$.

  43. 4. Fuzzy Ontology Generation Framework: Step 1 Fuzzy Formal Concept Analysis • A fuzzy formal context can be represented as a cross-table (Table 1). • An α-cut can be set to eliminate relations with low membership values, e.g. α = 0.5 (Table 2). • The context has 3 objects representing 3 documents, D1, D2 and D3. It also has 3 attributes, "Data Mining", "Clustering" and "Fuzzy Logic", representing 3 research topics. The relationship between an object and an attribute is represented by a membership value in [0, 1].
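A minimal sketch of a fuzzy formal context stored as a cross-table, with an α-cut applied. The tables are not reproduced in this transcript, so the membership values below are illustrative assumptions, as is the choice to keep relations whose membership is greater than or equal to α.

```python
def alpha_cut(fuzzy_context, alpha):
    """Drop every (object, attribute) relation whose membership is below alpha."""
    return {
        obj: {attr: mu for attr, mu in row.items() if mu >= alpha}
        for obj, row in fuzzy_context.items()
    }


# Illustrative cross-table: rows are documents, columns are research topics.
fuzzy_context = {
    "D1": {"Data Mining": 0.8, "Clustering": 0.12, "Fuzzy Logic": 0.61},
    "D2": {"Data Mining": 0.9, "Clustering": 0.85, "Fuzzy Logic": 0.13},
    "D3": {"Data Mining": 0.1, "Clustering": 0.14, "Fuzzy Logic": 0.87},
}

cut = alpha_cut(fuzzy_context, alpha=0.5)
for doc, row in cut.items():
    print(doc, row)
```

After the cut, each document keeps only the topics it is strongly associated with, while the surviving membership values are preserved for later similarity calculations.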

  44. 4. Fuzzy Ontology Generation Framework: Step 1 Fuzzy Formal Concept Analysis • Definition (Fuzzy Representation of Object): Each object $O$ in a fuzzy formal context $K$ can be represented by a fuzzy set $\Phi(O)$ as $\Phi(O) = \{A_1(\mu_1), A_2(\mu_2), \ldots, A_m(\mu_m)\}$, where $\{A_1, A_2, \ldots, A_m\}$ is the set of attributes in $K$ and $\mu_i$ is the membership of $O$ with attribute $A_i$ in $K$. $\Phi(O)$ is called the fuzzy representation of $O$.

  45. 4. Fuzzy Ontology Generation Framework: Step 1 Fuzzy Formal Concept Analysis • Generally, we can consider the attributes of a formal concept as the description of the concept. • Thus, the relationship between an object and the concept should be the intersection of the relationships between the object and the attributes of the concept. • Since each relationship between the object and an attribute is represented as a membership value in the fuzzy formal context, the intersection of these membership values should be their minimum, hence…

  46. 4. Fuzzy Ontology Generation Framework: Step 1 Fuzzy Formal Concept Analysis • Definition (Fuzzy Formal Concept): Given a fuzzy formal context $K = (G, M, I)$ and a confidence threshold $T$, we define $A^* = \{m \in M \mid \forall g \in A: \mu(g, m) \ge T\}$ for $A \subseteq G$ and $B^* = \{g \in G \mid \forall m \in B: \mu(g, m) \ge T\}$ for $B \subseteq M$. A fuzzy formal concept (or fuzzy concept) of a fuzzy formal context $(G, M, I)$ with a confidence threshold $T$ is a pair $(A_f = \varphi(A), B)$ where $A \subseteq G$, $B \subseteq M$, $A^* = B$ and $B^* = A$. Each object $g \in \varphi(A)$ has a membership $\mu_g$ defined as $\mu_g = \min_{m \in B} \mu(g, m)$, where $\mu(g, m)$ is the membership value between object $g$ and attribute $m$ defined in $I$. If $B = \{\}$ then $\mu_g = 1$ for every $g$. $A$ and $B$ are the extent and intent of the formal concept $(\varphi(A), B)$ respectively.
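The definition can be sketched as a brute-force enumeration: for each subset of objects, compute its shared attributes at threshold T, close it to an extent, and give each object in the extent the minimum of its memberships over the intent. The membership values and function names below are illustrative assumptions, and enumerating all subsets is practical only for very small contexts.

```python
from itertools import combinations


def shared_attributes(context, objects, attributes, threshold):
    """A*: attributes whose membership is >= threshold for every object."""
    return frozenset(
        m for m in attributes
        if all(context[g].get(m, 0.0) >= threshold for g in objects)
    )


def shared_objects(context, attributes, threshold):
    """B*: objects whose membership is >= threshold for every attribute."""
    return frozenset(
        g for g in context
        if all(context[g].get(m, 0.0) >= threshold for m in attributes)
    )


def fuzzy_concepts(context, threshold):
    """Map (extent, intent) to {object: membership} for each fuzzy concept."""
    attributes = sorted({m for row in context.values() for m in row})
    objects = sorted(context)
    found = {}
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = shared_attributes(context, subset, attributes, threshold)
            extent = shared_objects(context, intent, threshold)
            # Membership of g is the minimum over the intent; 1.0 if intent is empty.
            found[(extent, intent)] = {
                g: min((context[g][m] for m in intent), default=1.0)
                for g in extent
            }
    return found


# Illustrative fuzzy context (values are assumptions).
context = {
    "D1": {"Data Mining": 0.8, "Clustering": 0.12, "Fuzzy Logic": 0.61},
    "D2": {"Data Mining": 0.9, "Clustering": 0.85, "Fuzzy Logic": 0.13},
    "D3": {"Data Mining": 0.1, "Clustering": 0.14, "Fuzzy Logic": 0.87},
}

concepts = fuzzy_concepts(context, threshold=0.5)
for (extent, intent), memberships in concepts.items():
    print(sorted(intent), memberships)
```

Unlike the crisp case, every object in an extent carries a membership value, which is what later allows similarities between concepts to be computed.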

  47. 4. Fuzzy Ontology Generation Framework: Step 1 Fuzzy Formal Concept Analysis • The version of FFCA presented in these definitions preserves the continuous membership values of objects, which is crucial for calculating concept similarities. • In a formal context, a concept can have many superconcepts and subconcepts. However, the similarities of a concept to its superconcepts and subconcepts differ. • With a fuzzy concept lattice, we can use fuzzy set theory to calculate the similarities between a concept and its subconcepts.

  48. 4. Fuzzy Ontology Generation Framework: Step 1 Fuzzy Formal Concept Analysis • Definition (Fuzzy Formal Concept Cardinality): Since the fuzziness of a fuzzy formal concept is represented by the membership values of the objects of the concept, the cardinality of a fuzzy formal concept $K_f = (\varphi(A), B)$ is defined as $|K_f| = |\varphi(A)|$.

  49. 4. Fuzzy Ontology Generation Framework: Step 1 Fuzzy Formal Concept Analysis • Definition (Fuzzy Formal Concept Similarity): The similarity of a fuzzy formal concept $K_{f1} = (\varphi(A_1), B_1)$ and its subconcept $K_{f2} = (\varphi(A_2), B_2)$ is defined as $E(K_{f1}, K_{f2}) = E(\varphi(A_1), \varphi(A_2))$.
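A small sketch of how cardinality and similarity might be computed for the fuzzy extents of two concepts. The slides do not spell out the fuzzy-set formulas, so the code assumes the common choices: cardinality as the sum of memberships, intersection as minimum, union as maximum, and similarity E(A, B) = |A ∩ B| / |A ∪ B|. The two extents are illustrative values.

```python
def cardinality(fuzzy_set):
    """Assumed sigma-count: the sum of all membership values."""
    return sum(fuzzy_set.values())


def intersection(a, b):
    """Fuzzy intersection: minimum membership over shared elements."""
    return {k: min(a[k], b[k]) for k in a.keys() & b.keys()}


def union(a, b):
    """Fuzzy union: maximum membership, treating missing elements as 0."""
    return {k: max(a.get(k, 0.0), b.get(k, 0.0)) for k in a.keys() | b.keys()}


def similarity(a, b):
    """Assumed similarity E(A, B) = |A intersect B| / |A union B|."""
    denominator = cardinality(union(a, b))
    if denominator == 0:
        return 1.0  # two empty sets are treated as identical
    return cardinality(intersection(a, b)) / denominator


# Illustrative fuzzy extents of a concept and one of its subconcepts.
concept_extent = {"D1": 0.61, "D3": 0.87}   # intent: {"Fuzzy Logic"}
subconcept_extent = {"D1": 0.61}            # intent: {"Data Mining", "Fuzzy Logic"}

score = similarity(concept_extent, subconcept_extent)
print(f"|concept| = {cardinality(concept_extent):.2f}, similarity = {score:.2f}")
```

A similarity close to 1 indicates that the subconcept covers almost the same objects, with similar memberships, as its superconcept.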

  50. 4. Fuzzy Ontology Generation Framework: Step 1 Fuzzy Formal Concept Analysis • Fig. 2: Traditional concept lattice generated from Table 1 without membership values. • Fig. 3: Fuzzy concept lattice generated from the fuzzy formal context in Table 2 (similarities between concepts shown).
