lvca edited this page Dec 22, 2012 · 3 revisions

Graph Database


Introduction

OrientDB is a Document-Graph DBMS because it has the features of both a Document DBMS and a Graph DBMS. This section explores the Graph capabilities of OrientDB. To see how OrientDB compares to other GraphDBs in features and performance, look at this comparison. To know more about Graph Databases, look at the following presentations:

NOTE: If you plan to use OrientDB as a pure Graph Database, look at the Graph Ed Tutorial.

The principal feature of a GraphDB is the ability to handle relationships. A GraphDB can traverse thousands of edges at a fraction of the cost of relational JOINs because relationships are direct links between documents/nodes.

The GraphDB has few but strong concepts:

  • Vertex or Node, the linked entity. Vertexes can have properties
  • Edge or Arc, the link between Vertexes. Edges can have properties and can be unidirectional or bidirectional
  • Property, a value assigned to Vertexes and Edges. A property has a name and a value
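
The three concepts above can be sketched in a few lines of plain Python. This is only an in-memory illustration, not OrientDB's API: the class names `Vertex` and `Edge`, the field names, and the sample value for `kgPerDay` are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Vertex:
    """A linked entity carrying named properties."""
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    """A link from one vertex to another, with its own properties."""
    source: Vertex
    target: Vertex
    properties: Dict[str, Any] = field(default_factory=dict)


# Hypothetical sample data, loosely based on the PetShop domain.
gaudi = Vertex({"name": "Gaudì"})
meat = Vertex({"name": "Meat"})
eats = Edge(gaudi, meat, {"kgPerDay": 0.5})  # 0.5 is an assumed value

# Following the edge is a direct link lookup; no join is needed.
edges: List[Edge] = [eats]
foods = [e.target.properties["name"] for e in edges if e.source is gaudi]
```

Here `foods` collects the names of the vertices reachable from `gaudi` through its outgoing edges.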

The diagram on the left represents the simplified domain of a PetShop application, using UML class diagram notation and following the Object Oriented paradigm. How can we model it using OrientDB's graph model?

OrientDB supports a superset of the TinkerPop Blueprints model, the Property Graph Model. The difference is that with OrientDB you can create custom types of vertices and edges. In this example the orange classes are Vertex types, while the green one is the Edge type.

To cross the graph you can use the powerful Gremlin language, OrientDB's SQL (an extended version of SQL with new operators to work with Trees and Graphs), or a mix of both.

Extract, for each food, the list of animals that eat that food:

    > SELECT name, in.out.in.name FROM Food
    
    Meat, [Gaudì,Kelly]

Extract the name of the animals that eat less than 1Kg of meat per day:

    > SELECT name FROM Animal WHERE out.kgPerDay < 1 AND out[@class='Eat'].in.name = 'Meat'
    
    Gaudì

Extract the name of the animals that eat at 10 AM:

    > SELECT FROM Animal WHERE out[@class='Eat'].whenAsHours CONTAINS 10
    
    Kelly

Use SQL + Gremlin to extract all the outgoing vertices connected to the Animal 'Gaudì':

    > SELECT GREMLIN("current.out") FROM Animal where name = 'Gaudì'
    
    Kelly

Usage

You can work with graphs using three approaches:

How is it implemented?

    +--------------+                                       +--------------+
    |              | out       * +------------+ in         |              |
    |              |------------>|            |----------->|              |
    |   V(ertex)   |             |   E(dge)   |            |   V(ertex)   | 
    |              |<------------|            |<-----------|              |
    |              |         out +------------+         in |              |
    +--------------+                                       +--------------+

Since the internal Document-Graph architecture of OrientDB is flexible and fast with links, the Object Database, Graph Database and Key/Value Database are all built on top of the Document Database interface (ODatabaseDocumentTx class). In reality there are other layers behind the Document Database, but those APIs are not yet documented and are probably too raw for a real use case (unless you're writing a distributed file system).

The GraphDB stores vertices in the class OGraphVertex, simply called V, and edges in the class OGraphEdge, simply called E. Both classes extend OGraphElement, which extends ODocumentWrapper.

So you can always query your graph using the OrientDB SQL language and act at a lower level, provided you maintain the constraints of the GraphDB itself:

  • Edges must always be bidirectional
  • OGraphVertex (also called V) instances store outgoing edges in the property "out" and incoming edges in "in". Both properties are declared as OType.LINKSET
  • OGraphEdge (also called E) instances store the outgoing vertex in the property "out" and the incoming vertex in "in". Both properties are declared as OType.LINK
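
The constraints above can be illustrated with a small in-memory sketch in plain Python. It does not use OrientDB classes: `Vertex`, `Edge` and `is_consistent` are hypothetical names, a Python `set` stands in for a LINKSET, a plain object reference stands in for a LINK, and the attribute `in_` replaces "in" because `in` is a reserved word in Python.

```python
class Vertex:
    def __init__(self, **props):
        self.props = props
        self.out = set()   # outgoing edges (stands in for a LINKSET)
        self.in_ = set()   # incoming edges (stands in for a LINKSET)


class Edge:
    def __init__(self, source, target, **props):
        self.props = props
        self.out = source  # vertex the edge leaves (stands in for a LINK)
        self.in_ = target  # vertex the edge enters (stands in for a LINK)
        # Register the edge on both endpoints so it can be crossed both ways.
        source.out.add(self)
        target.in_.add(self)


def is_consistent(edge):
    """Check that both endpoint vertices link back to the edge."""
    return edge in edge.out.out and edge in edge.in_.in_


# Hypothetical sample data.
tom = Vertex(name="Tom")
car = Vertex(label="Ferrari")
drives = Edge(tom, car, label="drives")
```

If code working at the document level removes the edge from only one side, for example from `tom.out` alone, `is_consistent(drives)` returns `False`. That is the kind of broken link the constraints are meant to prevent.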

If you're using the Blueprints API then:

Manipulate graphs as documents

Since the graph model has been built on top of OrientDB, you can use all the OrientDB features to manipulate graphs even without the graph API.

SQL

Simple query against vertexes:

    SELECT FROM V WHERE out CONTAINS ( label = 'knows' )

Retrieve all the vertices connected to the outgoing edges:

    SELECT out.in AS labels FROM V WHERE out CONTAINS ( label IS NOT NULL )

Traverse all the retrieved nodes with name "Tom". The traversal crosses the out edges, but only where the linked (in) Vertex has label "Ferrari" and then forward to the:

    SELECT out[in.label = 'Ferrari'] FROM v WHERE name = 'Tom'
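
As a rough illustration of what the filtered projection above selects, here is a plain Python sketch over dictionaries. It is not how OrientDB evaluates the query: the helper `out_edges_where_target` and the sample data (including the 'Fiat' vertex and the 'drives' label) are assumptions for the example.

```python
def out_edges_where_target(vertex, label):
    """Return the vertex's outgoing edges whose target vertex has the label."""
    return [e for e in vertex["out"] if e["in"].get("label") == label]


# Hypothetical sample data: Tom has two outgoing edges.
ferrari = {"label": "Ferrari"}
fiat = {"label": "Fiat"}
tom = {"name": "Tom", "out": []}
tom["out"].append({"label": "drives", "out": tom, "in": ferrari})
tom["out"].append({"label": "drives", "out": tom, "in": fiat})

vertices = [tom]

# WHERE name = 'Tom' picks the vertices; the bracket filter picks the edges.
matches = [
    out_edges_where_target(v, "Ferrari")
    for v in vertices
    if v.get("name") == "Tom"
]
```

Only the edge whose "in" vertex is labelled 'Ferrari' survives the filter; the edge pointing to 'Fiat' is dropped.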

To know more about crossing graphs in SQL projections, look at SQL Projections.

Or if you've installed the Graph(Ed) you can mix Gremlin and SQL together:

Simple query against vertexes:

    SELECT GREMLIN( 'current.outE.inV' ) AS labels FROM V WHERE out CONTAINS ( label IS NOT NULL )

Get all the friends you know by traversing the friends of your social network profile up to the 7th level:

    TRAVERSE friends FROM ( SELECT FROM Profile WHERE name = 'Jay Miner' ) WHERE $depth <= 7
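
The idea of a depth-limited traversal can be sketched in plain Python as follows. This is a breadth-first walk over an in-memory dictionary, assuming the starting record has depth 0; it is not OrientDB's implementation, whose visiting order may differ. The function `traverse` and all names in the sample data except 'Jay Miner' are hypothetical.

```python
from collections import deque


def traverse(start, neighbours, max_depth):
    """Breadth-first walk returning (node, depth) pairs with depth <= max_depth."""
    seen = {start}
    queue = deque([(start, 0)])
    result = []
    while queue:
        node, depth = queue.popleft()
        result.append((node, depth))
        if depth == max_depth:
            continue  # do not expand beyond the depth limit
        for nxt in neighbours.get(node, []):
            if nxt not in seen:  # visit each node once, even with cycles
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return result


# Hypothetical friendship data; only 'Jay Miner' comes from the query above.
friends = {
    "Jay Miner": ["Ann", "Bob"],
    "Ann": ["Cid"],
    "Cid": ["Dee"],
    "Bob": ["Jay Miner"],
}
reached = traverse("Jay Miner", friends, max_depth=2)
```

With `max_depth=2`, 'Dee' is not reached because she sits at depth 3, and the cycle from 'Bob' back to 'Jay Miner' does not cause a second visit.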

Create a couple of vertices and connect them with an edge:

    CREATE VERTEX driver SET name = 'Magnum PI'
    CREATE VERTEX car SET brand = 'Ferrari'
    CREATE EDGE drive FROM #10:2 TO #13:6 SET since = 1980

For more information about the TRAVERSE command, look at the Traverse operator.

Not sure whether the Graph Model is the best fit for your needs? Look at [http://code.google.com/p/orient/wiki/UseCases#Document_or_Graph_model?]
