What’s New in Oracle Database 12c Graph Database

Presentation Transcript


  1. What’s New in Oracle Database 12c Graph Database • Xavier Lopez, Ph.D., Senior Director • Zhe Wu, Ph.D., Architect

  2. Agenda • Graph Database Strategy • Customer Use Cases • Oracle Spatial and Graph RDF Graph Features • Future Plans

  3. Graph Database Strategy: Support graph data types on all enterprise platforms • Oracle Database • Oracle NoSQL Database • Oracle Big Data Appliance • Oracle Cloud

  4. What Sets Us Apart? • Scalability: Trillions of triples • Transactional: Concurrent loading and updates with ACID properties • Security: Oracle Label Security (OLS) labels at the “triple” level • Standards based: W3C • Manageable: Use existing DB tools, utilities and expertise • Multi-type support: graph, relational, search, geospatial … • Multi-platform: relational database, NoSQL, Hadoop

  5. RDF Graph v. Property Graph • RDF Semantic Graphs: Use case: linked data, semantic metadata layer; Analytics: pattern matching, inferencing; Analytics execution: in-database • Property Graphs: Use case: social network analysis; Analytics: clustering, centrality, page rank, path finding; Analytics execution: in-memory, in-database

  6. RDF Semantic Graph feature of Oracle Spatial and Graph for Oracle Database 12c

  7. Two Application Use Cases: Linked Data and Entity Analytics • Find related content & relations by navigating connected entities • “Reason” across entities • SPARQL pattern matching • Detecting related entities across large, sparse, disparate collections of data • Inferencing: Applying rules on asserted data • Unified metadata model for distributed data sources • Flexible model for sparse and evolving data • Validate semantic and structural consistency

  8. Linked Data in Support of Distributed Data • Graph-based metadata layer • W3C standard, flexible model for sparse and evolving data • Common vocabulary enables data integration & app development • Relational data stays in place, apps don’t need to change • [Diagram labels: Mid-Tier Server with Application 1, Application 2, Application 3; SPARQL; SQL; Shared Ontologies; RDF Graph; Sales Graph; Inventory Graph; Database Server with HR, Inventory, and Sales schemas and databases]

  9. Linked Data in Enterprise • [Diagram labels: Access & Presentation Layer; Index; Semantic Graph model; Data Servers: Event Server, Data Warehouse, Hadoop Appliance, BI Server, Content Mgmt; Data Sources / Types: Human Sourced Information, Subscription Services, Social Media, Transaction Systems, Machine Generated Data]

  10. Hutchison 3G Austria: Linked Data / Enterprise Metadata • Industries: Life Sciences • Finance • Media • Networks & Communications • Defense & Intelligence • Police

  11. Novartis Institutes for BioMedical Research (NIBR) • Business Challenge: Link database information on genes, proteins, metabolic pathways, compounds, ligands, etc. to original sources • Increase productivity for accessing, sharing, searching, navigating, cross-linking, and analyzing internal/external data • Solution: Semantic integration layer on RDF graph • Rich domain-specific terminology (biology, chemistry, and medicine): 1.6 M terms • Terminology Hub: 8 GB of referential data that cross-references between data repositories

  12. RDF Semantic Graph-based Applications: Linked Data and Entity Analytics • Find related content & relations by navigating connected entities • “Reason” across entities • SPARQL pattern matching • Detecting related entities across large, sparse, disparate collections of data • Inferencing: Applying rules on asserted data • Unified metadata model for distributed data sources • Flexible model for sparse and evolving data • Validate semantic and structural consistency

  13. Knowledge Management in Intelligence Domain • [Diagram, national intelligence scenario. Data sources: content repositories, databases, web resources, blogs, mail, news, RSS feeds, financial data, telephone records, internet traffic, enterprise data (images, spatial, documents). Processing: information extraction, feature extraction, term extraction, producing extracted entities & relationships held as RDF with intelligence ontologies and accessed via SQL/SPARQL for search, presentation, report, visualization, and query. Example graph nodes: Person: Abduwali Abdukhadir Muse; Person: Chehab Abdouljamid Bouyaly; Person: ?; Group: Al Shabab; Group: al Qaeda; Group: ?; Country: UK; Country: Morocco; Country: Pakistan; Nationality: Somalian; Nationality: Pakistani; Ideology: Islamist. Edge labels: Has, Currently resides, Member of, Supports, Link ?]

  14. Oracle Spatial and Graph RDF Semantic Graph Features

  15. Oracle Database 12c RDF Semantic Graph Database • Platform: Exadata ready; compression & partitioning; parallel load, inference, query; high availability; label security: triple-level; W3C standards compliance; semantic indexing of text; Enterprise Manager • Load / Storage: native RDF graph data store; manages billions of triples; optimized storage architecture • Query: SPARQL via Jena/Joseki, Sesame; SQL/graph query, B-tree indexing; ontology-assisted SQL query • Reasoning: RDFS, OWL 2 RL, EL, SKOS; user-defined rules; incremental, parallel reasoning; user-defined inferencing; plug-in architecture • Analytics: semantic indexing framework; integration with OBIEE, Oracle R Enterprise, Oracle Data Mining

  16. Support for Apache Jena and OpenRDF Sesame: Leverage existing investments in open source frameworks • Provides application developers with: • Easy-to-use Java APIs to access Oracle databases and RDF files • A standards-compliant SPARQL web service endpoint (Joseki, Fuseki) • Data loading (RDF/XML, N-Triples, N-Quads, TriG, Turtle) • JSON output • Oracle-specific extensions for query execution control and management

  17. Relational to RDF Mapping • RDB to RDF Mapping • RDF views on relational tables • Enables SPARQL query on distributed resources • Views: Automatic and custom • Aligns with W3C RDB2RDF standard • No duplication of data and storage
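
The idea behind RDF views on relational tables can be illustrated with a small Python sketch. It is not Oracle's RDF view API and not a complete implementation of the W3C mapping; `BASE`, `rows_to_triples`, and the sample `EMP` rows are hypothetical names chosen for illustration. Each row becomes a subject, each non-NULL column becomes a predicate, and the triples are produced on demand instead of being stored a second time.

```python
# Hypothetical names throughout: BASE, rows_to_triples, and the EMP sample
# rows are illustrative only and are not part of any Oracle API.
BASE = "http://example.com/"


def rows_to_triples(table, key, rows):
    """Yield (subject, predicate, object) triples for each relational row.

    One subject IRI per row, one predicate per column. Because this is a
    generator, triples are computed when requested and never copied into
    a separate store.
    """
    for row in rows:
        subject = f"<{BASE}{table}/{key}={row[key]}>"
        yield (subject, "rdf:type", f"<{BASE}{table}>")
        for column, value in row.items():
            if value is None:
                continue  # a NULL column produces no triple
            yield (subject, f"<{BASE}{table}#{column}>", f'"{value}"')


employees = [
    {"EMPNO": 7369, "ENAME": "SMITH", "DEPTNO": 20},
    {"EMPNO": 7499, "ENAME": "ALLEN", "DEPTNO": None},
]

triples = list(rows_to_triples("EMP", "EMPNO", employees))
for triple in triples:
    print(*triple)
```

The relational rows remain the single source of truth in this sketch, which is the property the slide summarizes as "no duplication of data and storage".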

  18. Oracle Label Security Data Classification • Fine-grained security through integration with Oracle Label Security • Model-level security through GRANT/REVOKE privileges • Oracle Label Security: mandatory access control • Labels assigned to both users and data • Data labels determine the sensitivity of the rows or the rights a person must possess in order to read or write the data • User labels indicate their access rights to the data records
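
A deliberately simplified Python sketch of how data labels and user labels interact at the triple level. It assumes a label is a single ordered sensitivity level; real Oracle Label Security labels carry more structure than that, and `LEVELS`, `visible_triples`, and the sample data are hypothetical.

```python
# Simplified illustration only: a label here is one of three ordered levels.
# LEVELS, visible_triples, and the sample store are hypothetical names.
LEVELS = {"PUBLIC": 0, "CONFIDENTIAL": 1, "SECRET": 2}


def visible_triples(labeled_triples, user_label):
    """Return the triples whose data label does not exceed the user's label."""
    clearance = LEVELS[user_label]
    return [triple for triple, label in labeled_triples
            if LEVELS[label] <= clearance]


store = [
    ((":alice", ":worksFor", ":acme"), "PUBLIC"),
    ((":alice", ":salary", "90000"), "CONFIDENTIAL"),
    ((":acme", ":acquisitionTarget", ":initech"), "SECRET"),
]

analyst_view = visible_triples(store, "CONFIDENTIAL")
public_view = visible_triples(store, "PUBLIC")
```

Two users issuing the same query over the same graph see different result sets, because the filter is applied per triple and not per model.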

  19. Core Inferencing Features • Forward-chaining based inference engine in the database • Native rulebases: RDFS, OWL 2 RL, OWL 2 EL, SKOS • Validation of inferred data • Proof generation • User-defined inferencing: temporal reasoning, spatial reasoning • Ladder Based Inference: fine-grained security for inference graph • Integration with external OWL 2 reasoners (TrOWL, Pellet)
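
Forward chaining means applying rules to the asserted triples repeatedly until nothing new can be derived. The Python sketch below shows that loop for just two RDFS rules (subclass transitivity and type inheritance). It is a toy illustration of the technique, not the in-database engine; `forward_chain` and the sample triples are hypothetical.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def forward_chain(asserted):
    """Apply two RDFS rules until a fixpoint; return only the derived triples."""
    known = set(asserted)
    while True:
        new = set()
        for (s, p, o) in known:
            for (s2, p2, o2) in known:
                # Only pairs of the form (s, p, o) and (o, subClassOf, o2) fire.
                if p2 != SUBCLASS or o != s2:
                    continue
                if p == SUBCLASS:    # rdfs11: subClassOf is transitive
                    new.add((s, SUBCLASS, o2))
                elif p == TYPE:      # rdfs9: instances inherit superclasses
                    new.add((s, TYPE, o2))
        if new <= known:             # fixpoint: nothing new was derived
            return known - set(asserted)
        known |= new


asserted = {
    (":Manager", SUBCLASS, ":Employee"),
    (":Employee", SUBCLASS, ":Person"),
    (":alice", TYPE, ":Manager"),
}

entailed = forward_chain(asserted)
```

Returning the derived triples separately from the asserted ones keeps the two sets distinguishable, so the inferred data can be validated, secured, or rebuilt without touching the original assertions.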

  20. RDF Semantic Graph: Graph Visualization & Modeling Support • Semantic Modeling: Protégé • Graph Visualization: Cytoscape

  21. Analyzing RDF with Oracle BI and Oracle Advanced Analytics

  22. Oracle Partner Tools: IO Informatics

  23. Oracle Partner Tools: Tom Sawyer Social Network Analysis

  24. Manageability of RDF Semantic Graph: Built-in support from Oracle Database utilities and tools • Manage: control query execution in database & Jena client; create & monitor graph w/ SQL Developer (semantic network; models, virtual models; B-tree indexes; rule bases; entailments; security data labels; semantic index policies) • Ingest / Replicate / Recover: bulk load (Apache Jena bulk loader; Oracle external tables & SQL*Loader (Direct Path) w/ PL/SQL Bulk Load API); replicate & recover (Data Guard: physical standby; Data Pump: staging tables; Recovery Manager: RMAN) • Tune / Analyze: tune load/query/inference (parallelism; B-tree indexing triple/quad; typed literals indexing; SPARQL query hints; statistics gathering; dynamic sampling); analyze performance (Enterprise Manager: view optimizer plans, monitor execution / resource usage)

  25. Open Geospatial Consortium: GeoSPARQL Support • Defines a vocabulary for spatial query patterns • Classes: Spatial Object, Feature, Geometry • Properties: topological relations; links between features and geometries • Datatypes for geometry literals: ogc:wktLiteral, ogc:gmlLiteral • Query functions: topological relations, distance, buffer, intersection, …

  26. Graph Support on Oracle NoSQL DB: Brings horizontal scalability to RDF graph applications • RDF Graph support in Oracle NoSQL Database Enterprise Edition • High-performance key-value store • SPARQL 1.1 access to graph data • Jena & Joseki SPARQL web services • Massive horizontal scalability • Support for World Wide Web Consortium (W3C) Semantic Web standards

  27. When to Consider a NoSQL Database for Graphs: Horizontal scalability, low query latency/cost, ease of install & management • High volume, simple queries (low latency) • Queries aggregating over most of the graph (e.g. what are the hobbies of the 100 most popular people in the network) • Frequent, large-scale updates • Large data centers • RDF Graph for Oracle NoSQL

  28. Quick Steps to Get Started

  29. Quick Steps to Get Started • Install Oracle Database 12c or use a prebuilt VM from OTN • Initialize: create a tablespace 'ts' • Run as SYS in SQL*Plus: exec sem_apis.create_sem_network('ts') • Run as SYS (for 12.1.0.2 only) in SQL*Plus: exec mdsys.enableGeoRaster; • Using SQL/PLSQL APIs: exec create_sem_model, insert/delete triples, bulk load, run SEM_MATCH, create_entailment, … • Using Java APIs: load/query/inference through GraphOracleSem, DatasetGraphOracleSem, OracleBulkUpdateHandler, … • Configure Joseki/Fuseki web service endpoint: SPARQL Query, SPARQL Update, REST APIs

  32. Performance

  33. Oracle Spatial and Graph - LUBM 200K on 3-Node RAC X2-4: Load, Inference and Query Performance • The LUBM 200K graph has 48+ billion triples (edges) • Original graph has 26.6 billion unique triples (quads) • Inference produced another 21.4 billion triples • Data loading performance: Triples Loaded and Indexed Per Second (TLIPS): 273K • Inference performance: Triples Inferred and Indexed Per Second (TIIPS): 327K • SPARQL query performance: Query Results Per Second (QRPS): 459K • Setup, hardware: Sun Server X2-4, 3-node RAC, each node configured with 1 TB RAM and 4 CPUs (2.4 GHz 10-core Intel E7-4870) • Setup, storage: dual-node Sun ZFS Storage 7420, both heads configured with 4 CPUs (2.00 GHz 8-core Intel E7-4820), 256 GB memory, 4x SSD SATA2 512 GB (READZ), 2x SATA 500 GB 10K; four disk trays with 20 x 900 GB disks @ 10K rpm, 4x SSD 73 GB (WRITEZ) • Setup, software: Oracle Database 11.2.0.3.0, SGA_TARGET=750G and PGA_AGGREGATE_TARGET=200G • Note: Only one node in this RAC was used for the performance test. Test performed in April 2013.
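
To put the LUBM 200K rates in perspective, the following Python sketch converts them into approximate elapsed times. It assumes, and the slide does not state this, that TLIPS and TIIPS are sustained averages over the whole load and inference runs and that they apply to the 26.6 billion and 21.4 billion triple counts respectively. The function name `hours` is illustrative.

```python
def hours(triples, rate_per_second):
    """Elapsed hours needed to process `triples` at a constant rate."""
    return triples / rate_per_second / 3600


original_triples = 26_600_000_000   # asserted triples (from the slide)
inferred_triples = 21_400_000_000   # triples produced by inference (from the slide)

# Assumption: the published rates are averages over the full run.
load_hours = hours(original_triples, 273_000)        # TLIPS = 273K
inference_hours = hours(inferred_triples, 327_000)   # TIIPS = 327K
total_triples = original_triples + inferred_triples

print(f"load: about {load_hours:.1f} h")
print(f"inference: about {inference_hours:.1f} h")
print(f"total triples: {total_triples:,}")
```

Under those assumptions the load takes roughly 27 hours and inference roughly 18 hours, and the two triple counts sum to the 48 billion edges quoted for the graph.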

  34. Oracle Spatial and Graph - LUBM 4400K on Exadata X4-2: Load, Inference and Query Performance • 1.08 trillion edges graph • Data loading performance: Triples Loaded and Indexed Per Second (TLIPS): 1.420M • Inference performance: Triples Inferred and Indexed Per Second (TIIPS): 1.527M • SPARQL query performance: Query Results Per Second (QRPS): 1.130M • Setup: • Exadata X4-2 high capacity full rack • ZS3-2 with 2 controllers, 8 trays of disk • Eight compute nodes of Exadata • Oracle 12.1.0.1 DB standard install of Exadata • Open cursors = 1000 • Processes = 1000 • SGA = 132 GB, PGA = 100 GB • 32K block size was given to all graph tablespaces • TEMP group was created with 3 bigfile tablespaces • A mix of degrees of parallelism (DOP) was used: 296, 256, 192 • Test performed in Aug/Sept 2014.

  35. Best Practices in Solving Performance Issues • When there is an underperforming SQL statement in RDF data loading, inference, or query operations, check: • Have you gathered statistics? • APIs: export_model_stats, export_entailment_stats, export_network_stats, import_model_stats, import_entailment_stats, import_network_stats • Have you tried parallel execution? • Balanced hardware is key. • Have you tried dynamic sampling? (Level 6, 8, 11) • Is there a lack of indexes (including text index)? • DO NOT just add indexes without careful & thorough testing

  36. Best Practices in Solving Performance Issues (2) • When there is an underperforming SQL statement in RDF data loading, inference, or query operations, check: • Have you looked at the plan? • Is it possible to write the same query in a different way? • Is it possible to simplify? • Simpler queries give the optimizer a better chance of finding more efficient ways to execute • Tweak the plan through hints • Send a small, reproducible test case with the execution plan to Oracle Support or post it on the Forum

  37. Best Practices in Solving Performance Issues (3) • Find the top thread(s) in the Java VM • Are there excessive GC activities? • Try -XX:+UseParallelGC, -XX:+UseConcMarkSweepGC, … • Has the heap size been set properly? • Try a larger heap size; analyze the heap by performing a heap dump • Send a small, reproducible test case with the thread dump to Oracle Support or post it on the Forum

  38. Cool Ongoing Activities: • Enable Oracle Cloud Services: Oracle Social Network • Integration with Oracle business applications and middleware • Ongoing support for RDF Graph on all major platforms • Relational Database • NoSQL Database • Big Data (Hadoop) • Cloud

  39. Appendix

  40. W3C Semantic Technology Stack • Core technologies • URI: Uniform Resource Identifier • RDF: Resource Description Framework • RDFS: RDF Schema • OWL: Web Ontology Language • Source: http://www.w3.org/2007/03/layerCake.svg

  41. What is RDF • A graph data model for web resources and their relationships • The graph can be serialized into RDF/XML, N3, N-TRIPLE, … • Construction unit: triple (or assertion, or fact), made of Subject, Predicate, Object: <http://foobar> <:produces> <:mp3> • Quads (named graphs) add context, provenance, identification, etc. to assertions: <http://foobar> <:produces> <:mp3> <:ProductGraph> • [Diagram: example graph with nodes http://www.foobar.com, “CA”, http://www.foobar.com/products/mp3, http://www.oracle.com, http://www.oracle.com/products/RDF and edges http://…/locatedIn, http://…/produce, http://…/customerOf, http://…/uses]
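
The triple and quad structure can be made concrete with a few lines of Python. This is a minimal in-memory sketch, not how any RDF store is implemented; `Quad`, `match`, and the abbreviated sample terms are hypothetical, with only the `<http://foobar> <:produces> <:mp3> <:ProductGraph>` example taken from the slide.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: Optional[str] = None   # None means the statement has no named graph


def match(quads, subject=None, predicate=None, obj=None, graph=None):
    """Return quads that agree with every non-None argument (None = wildcard)."""
    pattern = (subject, predicate, obj, graph)
    return [q for q in quads
            if all(want is None or want == got
                   for want, got in zip(pattern, q))]


data = [
    Quad("<http://foobar>", ":produces", ":mp3", ":ProductGraph"),
    Quad("<http://foobar>", ":locatedIn", '"CA"'),
    Quad("<http://www.oracle.com>", ":produces", ":RDF"),
]

producers = match(data, predicate=":produces")
in_product_graph = match(data, graph=":ProductGraph")
```

A triple is simply a quad without a graph name. The fourth position is what lets a query ask where an assertion came from, in addition to what it says.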
