.NET Database Technologies: Using NoSQL databases

karik

Presentation Transcript


  1. .NET Database Technologies: Using NoSQL databases

  2. NoSQL – “Not only SQL” • Alternatives to the ubiquitous relational database which may be superior in specific application scenarios • Object-oriented databases (ODBMS) • They came, they saw, they.... • ...didn’t conquer, but they are still around • NoSQL databases • The new kids on the block • General term applied to a range of different non-relational database systems • Largely emerging to meet the needs of large-scale Web 2.0 applications

  3. Object-oriented databases • ODBMSs use the same data model as object-oriented programming languages • no object-relational impedance mismatch due to a uniform model • An object database combines the features of an object-oriented language and a DBMS (language binding) • treat data as objects • object identity • attributes and methods • relationships between objects • extensible type hierarchy • inheritance, overloading and overriding as well as customised types

  4. ODBMS history • Object Database Manifesto • Paper published in 1989 (Atkinson et al.) • Some ODBMS products • Early 1990s: Gemstone, Objectivity • Late 1990s: Versant, ObjectStore, Poet, Matisse • 2000s: db4o, Caché • ODMG (Object Data Management Group) • 1993: ODMG 1.0 standard • 1997: ODMG 2.0 • 1999: ODMG 3.0, then ODMG disbanded • 2005: ODMG reformed, working towards new standard

  5. ODMG • Object Database Management Group (ODMG) founded in 1991 • standardisation body including all major ODBMS vendors • Define a standard to increase the portability across different ODBMS products • Mirroring the SQL standard for RDBMS • Object Model • Object Definition Language (ODL) • Object Query Language (OQL) • language bindings • C++, Smalltalk and Java bindings

  6. Characteristics of ODBMS • Support complex data models with no mapping issues • Tight integration with an object-oriented programming language (persistent programming language) • High performance in suitable application scenarios • Different products scale from small-footprint embedded databases (db4o) to large-scale, highly concurrent systems (e.g. Versant V/OD)

  7. Persistence patterns and ODBMS • Some of Fowler’s patterns are specific to the use of a relational database, e.g. • Data Mapper • Foreign Key Mapping • Metadata Mapping • Single-table Inheritance, etc. • Some are not specific to the data storage model and are relevant when using an ODBMS, e.g. • Identity Map • Unit of Work • Repository • Lazy-Loading

  8. db4o • Open-source object-database engine • Now owned by Versant • Complements their own V/OD product • Can be used in embedded or client-server modes • Embed in application simply by including DLLs • Native object database • Stores .NET (or Java) objects directly with no special requirements on classes • Other ODBMSs (e.g. V/OD) require classes to be marked as persistent through bytecode manipulation and also store class definitions • Tight integration with application, but trade-off in limited ad-hoc querying and reporting • Can replicate data to relational database if required

  9. IObjectContainer • IObjectContainer interface is implemented by objects which provide access to the database • IObjectContainer is roughly equivalent to EF ObjectContext • Unit of Work pattern if transparent persistence is enabled (see later) • Can access DB in embedded mode (direct file access) or client-server mode (local or remote) • IObjectServer instance required in client-server mode • IObjectContainer instances created by factory classes, e.g. Db4oEmbedded • Queries on IObjectContainer return IObjectSet (except LINQ queries)

  10. Viewing data and ad-hoc querying • ObjectManager Enterprise • Visual Studio plug-in • Browsing and drag-and-drop queries • LINQPad • Need to include db4o DLLs and namespaces for stored classes • Executes LINQ queries and visualises results

  11. db4o query APIs • Query-by-example (QBE) • Very limited - no comparisons, ranges, etc. • Simple Object Data Access (SODA) • Build query by navigating graph and adding constraints to nodes • Native Queries • Expressed completely in programming language • Type-safe • Optimised to SODA query at runtime if possible • LINQ • .NET version, not in Java (obviously)
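The limits of query-by-example noted above can be seen in a minimal, language-neutral sketch. This is Python rather than the db4o .NET API; the `Pilot` class, the `query_by_example` helper and the use of `None` as the "field not set" marker are all assumptions made for illustration. A template object is matched field by field, so only equality tests are expressible.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Pilot:
    # Hypothetical domain class; None means "no constraint" in a template.
    name: Optional[str] = None
    points: Optional[int] = None


def query_by_example(store, template):
    """Return stored objects whose fields equal every set field of the template."""
    constraints = {
        f.name: getattr(template, f.name)
        for f in fields(template)
        if getattr(template, f.name) is not None
    }
    return [
        obj
        for obj in store
        if isinstance(obj, type(template))
        and all(getattr(obj, key) == value for key, value in constraints.items())
    ]


store = [Pilot("Michael Schumacher", 100), Pilot("Rubens Barrichello", 99)]

# Equality on one field works; "points > 99" cannot be expressed this way.
matches = query_by_example(store, Pilot(points=100))
everything = query_by_example(store, Pilot())
```

An empty template matches every object of its type, which is why range or comparison queries need SODA, native queries or LINQ instead.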

  12. Activation • Objects are stored in DB as an object graph • If db4o configured to cascade-on-activate (eager loading) then retrieving one object could potentially load a large number of related objects • Fixed activation depth limits depth of traversal of graph when retrieving objects • Default value is 5 • Can then explicitly activate related objects when needed • Lazy loading can be configured with transparent activation • Classes need to be “instrumented” at build time by running Db4oTool.exe • Code injected into assembly so that classes implement IActivatable interface
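The effect of a fixed activation depth can be sketched as follows. This is a simplified Python model, not db4o code: the `Node` class, `activate` and the helper functions are hypothetical, and an inactive object is modelled as a shell whose fields are `None`.

```python
class Node:
    """Hypothetical persistent class holding a reference to one related object."""

    def __init__(self, name, child=None):
        self.name = name
        self.child = child


def activate(stored, depth):
    """Load `stored`, following references for at most `depth` levels.

    Objects beyond the depth come back as inactive shells (all fields None).
    """
    if stored is None:
        return None
    if depth <= 0:
        return Node(None)  # inactive placeholder: fields not loaded
    return Node(stored.name, activate(stored.child, depth - 1))


def build_chain(length):
    """Build n0 -> n1 -> ... as a stand-in for a stored object graph."""
    head = None
    for i in reversed(range(length)):
        head = Node(f"n{i}", head)
    return head


def active_names(node):
    """Collect names along the chain until an inactive shell is reached."""
    names = []
    while node is not None and node.name is not None:
        names.append(node.name)
        node = node.child
    return names


# Retrieve a chain of 7 objects with the default activation depth of 5.
root = activate(build_chain(7), depth=5)
loaded = active_names(root)
```

With a depth of 5 only the first five objects in the chain are populated; the sixth is present but inactive and would need explicit or transparent activation before its fields can be read.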

  13. Update depth • Similar considerations apply to updates • Storing an updated object could cause unnecessary updates to related objects • Fixed update depth limits depth of traversal of graph when storing objects • Default value is 1 • Can configure transparent persistence which allows changes to be tracked • Only changed objects are updated in database • Behaves like change tracking in, for example, Entity Framework • Unit of Work

  14. PI (persistence ignorance)? • Stores POCOs without any need for mapping, so yes • Transparent Activation requires that classes implement a specific interface • But this is done at build time so domain classes don’t need any specific code • Has parallels with dynamic proxies in ORMs: • Objects are instances of domain classes, which have been modified ‘under the hood’ at build time • Compare with dynamic proxy classes, which derive from domain classes and are created ‘under the hood’ at run time

  15. Further reading • www.odbms.org • Resource portal • Db4o Tutorial • included in product download • The Definitive Guide to db4o (Apress)

  16. NoSQL databases • New breed of databases that are appearing largely in response to the limitations of existing relational databases • Typically: • Support massive data storage (petabyte+) • Distribute storage and processing across multiple servers • Contrast in architecture and priorities compared to relational databases • Hence term NoSQL • “Not only SQL” – absence of SQL is not a requirement

  17. NoSQL features • Wide variety of implementations, but some features are common to many of them: • Schema-less • Shared-nothing architecture • Elasticity • Sharding and asynchronous replication • BASE, not ACID • Basically Available • Soft state • Eventually consistent
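Sharding in a shared-nothing cluster can be illustrated with a short sketch. This is one simple approach (hash the key, take it modulo the number of shards) and is an assumption for illustration; real systems often use range partitioning or consistent hashing instead. The names `shard_for` and `distribute` are hypothetical.

```python
import hashlib


def shard_for(key, shard_count):
    """Pick the shard that owns `key` by hashing it; the same key always maps to the same shard."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def distribute(keys, shard_count):
    """Assign every key to exactly one shard; shards share no data."""
    shards = [[] for _ in range(shard_count)]
    for key in keys:
        shards[shard_for(key, shard_count)].append(key)
    return shards


keys = [f"user:{i}" for i in range(1000)]
shards = distribute(keys, 4)
sizes = [len(shard) for shard in shards]
```

Each key lives on exactly one node, so reads and writes for that key touch only that node. The drawback of plain modulo hashing is that changing the shard count relocates most keys, which is why elastic systems tend to use more refined schemes.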

  18. MapReduce • Algorithm for dividing a work load into units suitable for parallel processing • Useful for queries against large sets of data: the query can be distributed to 100’s or 1000’s of nodes, each of which works on a subset of the target data • The results are then merged together, ultimately yielding a single “answer” to the original query • Example: get total word count of a large number of documents • Map: calculate word count of each document • Each node works on a subset of the overall data set • Results emitted to intermediate storage • Reduce: calculate total of intermediate results
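The word-count example above can be sketched in a few lines. This is a single-process Python illustration, assuming each "node" is simply a slice of the document list; the function names and sample documents are invented for the sketch.

```python
def map_phase(documents):
    """Map: a node computes the word count of each document in its subset."""
    return [len(doc.split()) for doc in documents]


def reduce_phase(intermediate):
    """Reduce: combine the intermediate counts into a single total."""
    return sum(intermediate)


documents = [
    "the quick brown fox",
    "jumps over",
    "the lazy dog",
    "again and again and again",
]

# Each node works on its own subset of the overall data set.
node_subsets = [documents[0:2], documents[2:4]]

# Results are emitted to intermediate storage (here, a plain list).
intermediate = []
for subset in node_subsets:
    intermediate.extend(map_phase(subset))

total_words = reduce_phase(intermediate)
```

Because each map call depends only on its own subset, the loop over `node_subsets` could run on separate machines without coordination; only the reduce step needs to see all intermediate results.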

  19. Brewer’s CAP theorem • Can optimize for only two of three priorities in a distributed database: • Consistency • All clients have same view of the data • Requires atomicity, transaction isolation • Availability • Every request received by a non-failing node must result in a response • Partition Tolerance • Partitions happen if certain nodes can’t communicate • No set of failures less than total network failure is allowed to cause the system to respond incorrectly

  20. Implications of CAP theorem • Any two properties can be achieved • CP • If messages between nodes are lost then system waits • Possible that no response returned at all • No inconsistent data returned to client • CA • No partitions, system will always respond and data is consistent • AP • Response always returned even if some messages between nodes are lost • Different nodes may have different views of the data
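The difference between the CP and AP choices can be made concrete with a toy model. This is a deliberately simplified sketch, assuming a two-replica store with a manual `partitioned` switch; the class and function names are hypothetical and no real database behaves this simply.

```python
class Unavailable(Exception):
    """Raised when a CP system refuses to answer during a partition."""


class TwoNodeStore:
    def __init__(self, mode):
        self.mode = mode            # "CP" or "AP"
        self.replicas = [{}, {}]    # one dict per node
        self.partitioned = False

    def write(self, node, key, value):
        if not self.partitioned:
            # Healthy network: every replica receives the write.
            for replica in self.replicas:
                replica[key] = value
        elif self.mode == "CP":
            raise Unavailable("cannot replicate during partition")
        else:
            # AP: accept the write locally; replicas now diverge.
            self.replicas[node][key] = value

    def read(self, node, key):
        if self.partitioned and self.mode == "CP":
            raise Unavailable("cannot confirm latest value during partition")
        return self.replicas[node].get(key)


def demo(mode):
    store = TwoNodeStore(mode)
    store.write(0, "stock", 10)
    store.partitioned = True        # nodes can no longer communicate
    try:
        store.write(0, "stock", 9)
        return store.read(0, "stock"), store.read(1, "stock")
    except Unavailable:
        return "unavailable"


ap_result = demo("AP")
cp_result = demo("CP")
```

In AP mode the two nodes answer with different values for the same key, while in CP mode the client gets no answer at all rather than a possibly stale one.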

  21. Implications of CAP theorem • Choose a database whose priorities match the application http://blog.nahurst.com/visual-guide-to-nosql-systems

  22. Using a NoSQL database in a .NET application • Application typically makes connection to remote cluster • Some (but not many) NoSQL databases are supported by native .NET clients • Handle “mapping” from .NET objects to data model • Many NoSQL databases are accessed through a REST interface • Application must construct request and handle response format, e.g. JSON • Application can be written in any suitable language • Azure Table Storage is Microsoft’s NoSQL storage for cloud-based applications • However the data is accessed, you need to understand the data model, which will be significantly different from a typical relational database or object model

  23. NoSQL database types and examples • Key/value Databases • These manage a simple value or row, indexed by a key • e.g. Voldemort, Vertica • Big table Databases • “a sparse, distributed, persistent multidimensional sorted map” • e.g. Google BigTable, Azure Table Storage, Amazon SimpleDB • Document Databases • Multi-field documents (or objects) with JSON access • e.g. MongoDB, RavenDB (.NET specific), CouchDB • Graph Databases • Manage nodes, edges, and properties • e.g. Neo4j, sones

  24. MongoDB • Scalable, high-performance, open source, document-oriented database • Stores JSON-style (actually BSON) documents with dynamic schema • Replication, high-availability and auto-sharding • Supports document-based queries and map/reduce • Command line tools: • mongod – starts server as a service or daemon • mongo – client shell • Store documents defined as JSON • Retrieved documents from query displayed as JSON

  25. MongoDB and HTTP • Admin console at http://<server name>:28017 • REST interface on the same port, http://<server name>:28017 • Enabled by starting server with mongod --rest • Server responds to RESTful HTTP requests, e.g. • http://127.0.0.1:28017/company/Employee/?filter_Name=Fernando • Response is in JSON format • Could be consumed by client-side code in Ajax application
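A client has to build the request URL and parse the JSON response itself. The sketch below assumes the URL pattern of the example request on this slide (`/<database>/<collection>/?filter_<field>=<value>`); the helper name `build_query_url` and the shape of `sample_response` are hypothetical, and no network call is made.

```python
import json
from urllib.parse import urlencode


def build_query_url(host, database, collection, port=28017, **filters):
    """Build a filter URL in the style of the slide's example request."""
    query = urlencode({f"filter_{field}": value for field, value in filters.items()})
    return f"http://{host}:{port}/{database}/{collection}/?{query}"


url = build_query_url("127.0.0.1", "company", "Employee", Name="Fernando")

# Hypothetical response body, used only to show the parsing step.
sample_response = '{"rows": [{"Name": "Fernando", "Dept": "Sales"}], "total_rows": 1}'
payload = json.loads(sample_response)
names = [row["Name"] for row in payload["rows"]]
```

`urlencode` also takes care of escaping, so a filter value containing spaces or other reserved characters still yields a valid URL.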

  26. MongoDB .NET driver • Can access documents as instances of Document class • Represents document as key-value pairs • Or, can serialize POCOs to database format (JSON) • Deserialize database documents to POCOs • Supports LINQ queries • MapReduce queries can be expressed as LINQ queries

  27. MongoDB schema design • Collections are essentially named groupings of documents • Roughly equivalent to relational database tables • Less "normalization" than a relational schema because there are no server-side joins • Generally, you will want one database collection for each of your top level objects • Don’t want a collection for every "class" - instead, embed objects in their parent document
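The embedding guideline can be illustrated with plain data structures. This is a sketch using Python dicts as stand-ins for documents; the `posts` collection, its field names and the `posts_commented_by` helper are invented for illustration and do not use any MongoDB driver.

```python
import json

# One collection for the top-level object (a blog post). Comments are not
# given a collection of their own; they are embedded in the post document.
posts = [
    {
        "_id": 1,
        "title": "Choosing a NoSQL database",
        "tags": ["nosql", "mongodb"],
        "comments": [
            {"author": "ann", "text": "Very useful."},
            {"author": "bob", "text": "What about graph databases?"},
        ],
    },
    {
        "_id": 2,
        "title": "Object databases revisited",
        "tags": ["odbms"],
        "comments": [{"author": "bob", "text": "db4o is worth a look."}],
    },
]


def posts_commented_by(collection, author):
    """Find posts by a field of an embedded object; no join is needed."""
    return [
        post["title"]
        for post in collection
        if any(comment["author"] == author for comment in post["comments"])
    ]


ann_posts = posts_commented_by(posts, "ann")
bob_posts = posts_commented_by(posts, "bob")

# A whole post, including its comments, serialises as one JSON document.
as_json = json.dumps(posts[0])
```

Reading a post returns its comments in the same document, which is the trade-off behind the guideline: fewer collections and no joins, at the cost of some duplication when the same embedded data appears in several parents.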

  28. Document example • Save: • Query: http://www.10gen.com/video/mongosv2010/schemadesign

  29. MongoDB in C# applications - PI (persistence ignorance)? • Up to a point • Collection class needs Id property of a specific type (MongoDB.Oid) • Object model needs to be designed with document schema in mind

  30. Further reading • http://nosql-database.org/ • http://www.nosqlpedia.com/ • http://www.mongodb.org/ • http://www.codeproject.com/KB/database/MongoDBCS.aspx • Nice code example for C# and MongoDB
