
Graph Database Tinkerpop

lvca edited this page Dec 22, 2012 · 3 revisions

Graph Database and Tinkerpop


Requirements

To use the TinkerPop Graph Database interface you must include these jars in your classpath:

    orient-commons-*.jar
    orientdb-core-*.jar
    blueprints-core-*.jar
    blueprints-orient-graph-*.jar
    pipes-*.jar

If you connect the TinkerPop Graph Database interface to a remote server (rather than local/embedded mode), also include:

    orientdb-client-*.jar
    orientdb-enterprise-*.jar

To use the TinkerPop Gremlin language, also include:

    gremlin-java-*.jar
    gremlin-groovy-*.jar
    groovy-*.jar
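
A missing jar usually surfaces only at runtime as a ClassNotFoundException. The sketch below is a hypothetical helper (ClasspathCheck is not part of OrientDB or TinkerPop) that reports which class names cannot be loaded, so the classpath can be verified at startup. Which class names to probe is up to you; any names you pass are your own assumption about what each jar contains.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reports which of the given classes cannot be loaded. */
final class ClasspathCheck {

    private ClasspathCheck() {}

    /** Returns true if the named class is visible to this class's class loader. */
    static boolean isPresent(String className) {
        try {
            // 'false' skips static initializers: only visibility is tested.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the class names that could not be found, in input order. */
    static List<String> missing(List<String> classNames) {
        List<String> notFound = new ArrayList<String>();
        for (String name : classNames) {
            if (!isPresent(name)) {
                notFound.add(name);
            }
        }
        return notFound;
    }
}
```

For example, calling ClasspathCheck.missing(...) with one well-known class per jar and failing fast when the returned list is not empty gives a clearer error than a late ClassNotFoundException.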

Introduction

Although OrientDB already provides its own APIs to handle graphs easily, starting from release 0.9.22 it also provides an implementation of the TinkerPop Blueprints APIs. TinkerPop is a complete stack of projects for handling graphs:

  • Blueprints provides a collection of interfaces and implementations to common, complex data structures. In short, Blueprints provides a one stop shop for implemented interfaces to help developers create software without being tied to particular underlying data management systems.
  • Gremlin is a Turing-complete, graph-based programming language designed for key/value-pair multi-relational graphs. Gremlin makes use of an XPath-like syntax to support complex graph traversals. This language has application in the areas of graph query, analysis, and manipulation.
  • Rexster is a RESTful graph shell that exposes any Blueprints graph as a standalone server. Extensions support standard traversal goals such as search, score, rank, and, in concert, recommendation. Rexster makes extensive use of Blueprints, Pipes, and Gremlin. In this way it's possible to run Rexster over various graph systems. To configure Rexster to work with OrientDB follow this guide: configuration.
  • Pipes is a graph-based data flow framework for Java 1.6+. A process graph is composed of a set of process vertices connected to one another by a set of communication edges. Pipes supports the splitting, merging, and transformation of data from input to output.

Get started

Download the Graph Ed Tutorial.

The kind of database used depends on the database URL:

  • Persistent embedded GraphDB. OrientDB is linked to the application as JAR (No network transfer). Use local as prefix. Example "local:/tmp/graph/db"
  • Persistent remote GraphDB. Uses a binary protocol to send and receive data from a remote OrientDB server. Use remote as prefix. Example "remote:localhost/db". It requires an OrientDB Server instance to be up and running at the specified address (localhost in this case). A remote database can be persistent or in-memory as well.
  • In-Memory embedded GraphDB. Keeps all the data only in memory. Use memory as prefix. Example "memory:test"
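
Since the prefix alone selects the database kind, a small check on the URL can catch typos before a graph is opened. The following is a minimal sketch; GraphUrls and its Mode enum are hypothetical names introduced for illustration and are not part of the OrientDB API.

```java
/** Hypothetical helper that classifies an OrientDB URL by its prefix. */
final class GraphUrls {

    /** The three database kinds listed above. */
    enum Mode { LOCAL, REMOTE, MEMORY }

    private GraphUrls() {}

    /** Returns the mode selected by the URL prefix, or throws if it is unknown. */
    static Mode modeOf(String url) {
        // Only the text before the FIRST colon is the prefix, so a Windows
        // path such as "local:C:/temp/graph/db" is still classified as local.
        int colon = url.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("Missing prefix in URL: " + url);
        }
        String prefix = url.substring(0, colon);
        if (prefix.equals("local")) {
            return Mode.LOCAL;
        }
        if (prefix.equals("remote")) {
            return Mode.REMOTE;
        }
        if (prefix.equals("memory")) {
            return Mode.MEMORY;
        }
        throw new IllegalArgumentException("Unknown prefix '" + prefix + "' in URL: " + url);
    }
}
```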

OrientDB provides 2 implementations of Graph interface:

  • Transactional GraphDB, the default. The class is OrientGraph. Every operation is executed inside a transaction; if none is running, a new one is begun automatically. To explicitly commit or roll back a transaction use stopTransaction(). During a transaction all changes are kept in memory until the transaction ends (see below for usage). When stopTransaction(Conclusion.SUCCESS) returns, all the transaction's changes have been stored to disk.
  • Non-transactional GraphDB. The class is OrientBatchGraph. In this case each operation is atomic and data is written to disk at each operation: when the method returns, the underlying storage has been updated. Use this for bulk inserts and massive operations.

Both classes extend OrientBaseGraph.

Work with GraphDB

Before working with a graph you need an instance of the OrientGraph class. The constructor takes a URL that is the location of the database. If the database already exists it is opened; otherwise it is created.

Always remember to close the graph with the .shutdown() method once you are done.

Example:

    OrientGraph graph = null;
    try {
      graph = new OrientGraph("local:C:/temp/graph/db");
      ...
    
    }finally{
      if( graph != null )
        graph.shutdown();
    }

Gremlin usage

If you use the Gremlin language with OrientDB, remember to initialize it with:

    OGremlinHelper.global().create()

Security

If you want to use OrientDB security, use the constructor that takes the URL, user and password. To know more about OrientDB security visit Security.

Transactions

Starting from Blueprints 2.0 a transaction can no longer be started manually: it begins implicitly with the first operation. You can still commit or roll back a running transaction by calling stopTransaction(Conclusion) with one of the following conclusions:

  • SUCCESS, the transaction is committed. Example: graph.stopTransaction(Conclusion.SUCCESS)
  • FAILURE, the transaction is rolled back. Example: graph.stopTransaction(Conclusion.FAILURE)

Changes made inside a transaction are volatile until the commit, stopTransaction(Conclusion.SUCCESS), or until the graph instance is closed.

Full example:

    try{
      Vertex luca = graph.addVertex(null); // 1st OPERATION: IMPLICITLY BEGIN A TRANSACTION
      luca.setProperty( "name", "Luca" );
    
      Vertex marko = graph.addVertex(null);
      marko.setProperty( "name", "Marko" );
    
      Edge lucaKnowsMarko = graph.addEdge(null, luca, marko, "knows");
    
      graph.stopTransaction(Conclusion.SUCCESS);
    } catch( Exception e ) {
    
      graph.stopTransaction(Conclusion.FAILURE);
    }

Wrapping the transaction in a try/catch block ensures that any error rolls the transaction back, restoring all the involved elements to their previous state.
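
The commit-or-rollback pattern can be factored into a reusable wrapper that also rethrows the error instead of swallowing it. The sketch below is an illustration only: Transactions and TxEnd are hypothetical names, TxEnd stands in for the single stopTransaction call so the code compiles without the Blueprints jars, and it assumes Java 8 for java.util.function.Supplier.

```java
import java.util.function.Supplier;

/** Hypothetical wrapper around the commit-or-rollback pattern. */
final class Transactions {

    /** Stand-in for "end the transaction"; NOT a Blueprints or OrientDB type. */
    interface TxEnd {
        void stop(boolean success);
    }

    private Transactions() {}

    /**
     * Runs the work, then commits. If the work throws, rolls back and
     * rethrows, so the caller still sees the original error.
     */
    static <T> T run(TxEnd tx, Supplier<T> work) {
        T result;
        try {
            result = work.get();
        } catch (RuntimeException e) {
            tx.stop(false); // roll back: discard the pending changes
            throw e;
        }
        tx.stop(true);      // commit only after the work succeeded
        return result;
    }

    // Assumed wiring for an OrientGraph named 'graph' (not compiled here):
    //   TxEnd tx = ok -> graph.stopTransaction(
    //       ok ? Conclusion.SUCCESS : Conclusion.FAILURE);
}
```

Rethrowing keeps the failure visible to the caller, whereas an empty catch block that only rolls back hides the reason the transaction failed.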

Work with vertexes and edges

Create a vertex

To create a new Vertex in the current Graph call the Vertex OrientGraph.addVertex(Object id) method. Note that the id parameter is ignored, since the OrientDB implementation assigns a unique ID once the vertex is created. To retrieve it use Vertex.getId(). Example:

    Vertex v = graph.addVertex(null);
    System.out.println( "Created vertex: " + v.getId() );

Create an edge

An edge links two previously created vertexes. To create a new Edge in the current Graph call the Edge OrientGraph.addEdge(Object id, Vertex outVertex, Vertex inVertex, String label) method. Note that the id parameter is ignored, since the OrientDB implementation assigns a unique ID once the edge is created. To retrieve it use Edge.getId(). outVertex is the vertex instance where the edge starts and inVertex is the vertex instance where the edge ends. label is the edge's label; pass null to leave it unassigned. Example:

    Vertex luca = graph.addVertex(null);
    luca.setProperty( "name", "Luca" );
    
    Vertex marko = graph.addVertex(null);
    marko.setProperty( "name", "Marko" );
    
    Edge lucaKnowsMarko = graph.addEdge(null, luca, marko, "knows");
    System.out.println( "Created edge: " + lucaKnowsMarko.getId() );

Remove a vertex

To remove a vertex from the current Graph call the void OrientGraph.removeVertex(Vertex vertex) method. The vertex will be disconnected from the graph and then removed. Disconnection means that all the vertex's edges will be deleted as well. Example:

    graph.removeVertex(luca);

Remove an edge

To remove an edge from the current Graph call the void OrientGraph.removeEdge(Edge edge) method. The edge will be removed and the two vertexes will no longer be connected. Example:

    graph.removeEdge(lucaKnowsMarko);

Set and get properties

Vertexes and Edges can have multiple properties, where the key is a String and the value can be of any type supported by OrientDB.

  • To set a property use the method void setProperty(String key, Object value).
  • To get a property use the method Object getProperty(String key).
  • To get all the property keys use the method Set<String> getPropertyKeys().
  • To remove a property use the method void removeProperty(String key).

Example:

    vertex2.setProperty( "x", 30.0f );
    vertex2.setProperty( "y", ((Float) vertex1.getProperty( "y" )) / 2 ); // getProperty() RETURNS AN OBJECT: CAST TO THE WRAPPER TYPE
    
    for( String property : vertex2.getPropertyKeys() ){
      System.out.println("Property: " + property + "=" + vertex2.getProperty( property ) );
    }
    
    vertex1.removeProperty( "y" );

Access to the underlying Graph

Since the TinkerPop Blueprints API is quite raw and doesn't provide ad-hoc methods for very common use cases, you may need to access the underlying ODatabaseGraphTx object to make better use of the graph engine under the hood. Common operations are:

  • Count incoming and outgoing edges without browsing them all
  • Get incoming and outgoing vertexes without browsing the edges
  • Execute a query using SQL-like language integrated in the engine

The OrientGraph class provides the method .getRawGraph() to return the underlying native root Graph class: ODatabaseGraphTx. Follow the Graph Database Native APIs to know its usage.

Example:

      final OrientGraph graph = new OrientGraph("local:C:/temp/graph/db");
      try{
        List<OGraphVertex> result = graph.getRawGraph().query( new OSQLSynchQuery("select from ographvertex where outEdges contains ( in.label = 'knows' )"));
      } finally {
        graph.shutdown();
      }

Use the Gremlin language

Custom types

OrientDB supports custom types for vertices and edges in an object-oriented manner. Even though Blueprints doesn't support this directly, there are some tricks to use them.

Before you start using custom types, you have to declare this intent via the API just after you create an OrientGraph instance:

    OrientGraph graph = new OrientGraph("local:/temp/db/graph");
    graph.getRawGraph().setUseCustomTypes(true);

Since all custom vertex types extend the OGraphVertex class and all custom edge types extend the OGraphEdge class, the entire Blueprints stack works with custom types unchanged. When you access a vertex, it doesn't matter whether it's an OGraphVertex or any subclass of it. The same applies to edges.

Work with custom types

To create custom types you need to access the raw graph database instance:

    // CREATE THE NEW CUSTOM TYPES: 'Product' AND 'Customer' AS VERTEX, 'Sell' AS EDGE
    graph.getRawGraph().createVertexType("Product");
    graph.getRawGraph().createVertexType("Customer");
    graph.getRawGraph().createEdgeType("Sell");

Now to create a vertex and an edge of a custom class you have to pass a special string as ID with this format: "class:<class-name>". Pass this as first argument of addVertex() and addEdge() methods. Example:

    Vertex product = graph.addVertex("class:Product");
    product.setProperty( "brand", "Commodore" );
    product.setProperty( "model", "Amiga" );
    
    Vertex customer = graph.addVertex("class:Customer");
    customer.setProperty( "name", "Luca" );
    
    Edge e = graph.addEdge("class:Sell", customer, product, "sell" ); // addEdge() TAKES THE LABEL AS 4TH ARGUMENT
    e.setProperty("price", 400);
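
Because the class is selected through a plain string, a typo in the "class:" prefix is not caught by the compiler. As a minimal sketch, the ID can be built in one place; CustomTypeIds is a hypothetical helper, not part of OrientDB or Blueprints, and it only formats the string described above.

```java
/** Hypothetical helper that builds the "class:<class-name>" ID string. */
final class CustomTypeIds {

    /** Prefix of the special ID format described above. */
    static final String PREFIX = "class:";

    private CustomTypeIds() {}

    /** Returns the ID to pass as first argument of addVertex() or addEdge(). */
    static String forClass(String className) {
        if (className == null || className.trim().isEmpty()) {
            throw new IllegalArgumentException("Class name must not be empty");
        }
        return PREFIX + className.trim();
    }
}
```

With this in place a call would read graph.addVertex(CustomTypeIds.forClass("Product")), which keeps the prefix spelling in a single constant.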

Tuning

As reported in Access to the underlying Graph, you can use the native API to speed up some operations.

Furthermore, since the TinkerPop Blueprints API doesn't provide a connection pool mechanism, you can avoid closing and reopening the underlying database by setting this property:

    OGlobalConfiguration.STORAGE_KEEP_OPEN.setValue(Boolean.TRUE);

or by launching your application with this parameter:

    java -Dorientdb.storage.keepOpen=true ...

This avoids closing and reopening the storage every time. It will be closed automatically when the JVM exits.

Benchmarks

If you have information about new benchmarks, or have run your own, please share it on the OrientDB Group.
