lvca edited this page Dec 14, 2012 · 2 revisions

Using Gremlin with OrientDB


Introduction

Gremlin is a language specialized for working with property graphs. It is part of the TinkerPop family of open source products.

To learn more about Gremlin and TinkerPop's products, subscribe to the Gremlin Group.

Get Started

Launch the gremlin.sh console script (gremlin.bat on Windows), located in the bin directory:

    > gremlin.bat
    
             \,,,/
             (o o)
    -----oOOo-(_)-oOOo-----

Open the graph database

Before playing with Gremlin you need a valid OrientGraph instance that points to an OrientDB database. For the full list of database types, see Storage types.

When you work with a local or in-memory database, the database is created automatically if it does not exist. With a remote connection you must create the database on the target server before using it. This is due to security restrictions.

Once you have created the OrientGraph instance with a proper URL, assign it to a variable. Gremlin is written in Groovy, so it supports all the Groovy syntax, and the two can be mixed to create very powerful scripts.

Example with a local database (see below for more information about it):

    gremlin> g = new OrientGraph("local:/home/gremlin/db/demo");
    ==>orientgraph[local:/home/gremlin/db/demo]


Working with local database

This is the most commonly used mode. The console opens the database and locks it for exclusive use. It does not require a running OrientDB Server.

    gremlin> g = new OrientGraph("local:/home/gremlin/db/demo");
    ==>orientgraph[local:/home/gremlin/db/demo]

Working with remote database

Open a database on a remote server. Make sure the server is up and running. To start the server, launch the server.sh script (server.bat on Windows). For more information, see OrientDB Server.

    gremlin> g = new OrientGraph("remote:localhost/demo");
    ==>orientgraph[remote:localhost/demo]

Working with in-memory database

In this mode the database is volatile and no changes are persisted. Use it in a cluster configuration (where the cluster itself keeps the database alive) or just for testing.

    gremlin> g = new OrientGraph("memory:demo");
    ==>orientgraph[memory:demo]
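
The three URL prefixes shown above differ in whether the database is created on first use. As a rough summary in plain Java, the sketch below encodes that rule. The class `StorageUrlSketch` and its members are hypothetical names made up for illustration and are not part of the OrientDB API.

```java
import java.util.Locale;

/** Hypothetical summary of the URL prefixes described on this page; not an OrientDB class. */
class StorageUrlSketch {
    enum Mode {
        LOCAL(true),    // created automatically if missing
        REMOTE(false),  // must already exist on the target server
        MEMORY(true);   // created automatically, never persisted

        final boolean createdAutomatically;

        Mode(boolean createdAutomatically) {
            this.createdAutomatically = createdAutomatically;
        }
    }

    /** Extracts the prefix before the first ':' and maps it to a Mode. */
    static Mode modeOf(String url) {
        int colon = url.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Missing storage prefix: " + url);
        }
        // valueOf throws IllegalArgumentException for an unknown prefix
        return Mode.valueOf(url.substring(0, colon).toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        for (String url : new String[] {
                "local:/home/gremlin/db/demo", "remote:localhost/demo", "memory:demo" }) {
            Mode mode = modeOf(url);
            System.out.println(url + " -> " + mode + ", auto-created: " + mode.createdAutomatically);
        }
    }
}
```

A check like this could be used to decide whether a script has to create the database on the server before opening a `remote:` URL.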

Use security

OrientDB supports security through multiple users and roles with associated privileges. To learn more, see Security. To open the graph database as a user other than the default one, pass the user name and password as additional parameters:

    gremlin> g = new OrientGraph("memory:demo", "reader", "reader");
    ==>orientgraph[memory:demo]

Create a new Vertex

To create a new vertex use the addVertex() method. The vertex is created and its unique ID is displayed as the return value.

    gremlin> g.addVertex();
    ==>v[#5:0]

Create an edge

To create a new edge between two vertices use the addEdge(v1, v2, label) method. The edge will be created with the label specified.

In the example below two vertices are created and assigned to variables (Gremlin is based on Groovy), then an edge is created between them.

    gremlin> v1 = g.addVertex();
    ==>v[#5:0]
    
    gremlin> v2 = g.addVertex();
    ==>v[#5:1]
    
    gremlin> e = g.addEdge(v1, v2, 'friend');
    ==>e[#6:0][#5:0-friend->#5:1]

Retrieve a vertex

To retrieve a vertex by its ID, use the v(id) method, passing the RecordId as argument (with or without the '#' prefix). This example retrieves the first vertex created in the example above.

    gremlin> g.v('5:0')
    ==>v[#5:0]
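
The '#' prefix is optional because a RecordId is simply a cluster number and a position inside that cluster, separated by a colon. As a minimal sketch in plain Java, both spellings can be reduced to the same canonical form. The class `RecordIdSketch` and its methods are hypothetical helpers written for this page, not OrientDB's own parsing code.

```java
/** Hypothetical helper, not part of OrientDB: normalizes a record ID string. */
class RecordIdSketch {
    /** Returns {clusterId, clusterPosition} for input such as "5:0" or "#5:0". */
    static long[] parse(String rid) {
        String s = rid.trim();
        if (s.startsWith("#")) {
            s = s.substring(1); // drop the optional prefix
        }
        String[] parts = s.split(":", -1);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected <cluster>:<position>, got: " + rid);
        }
        return new long[] { Long.parseLong(parts[0]), Long.parseLong(parts[1]) };
    }

    /** Canonical form with the '#' prefix, matching what the console prints. */
    static String normalize(String rid) {
        long[] p = parse(rid);
        return "#" + p[0] + ":" + p[1];
    }

    public static void main(String[] args) {
        System.out.println(normalize("5:0"));  // prints #5:0
        System.out.println(normalize("#5:0")); // prints #5:0
    }
}
```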

Get all the vertices

To retrieve all the vertices in the opened graph use .V (V in upper-case):

    gremlin> g.V
    ==>v[#5:0]
    ==>v[#5:1]

Retrieve an edge

Retrieving an edge is very similar: use the e(id) method, passing the RecordId as argument (with or without the '#' prefix). This example retrieves the edge created in the example above.

    gremlin> g.e('6:0')
    ==>e[#6:0][#5:0-friend->#5:1]

Get all the edges

To retrieve all the edges in the opened graph use .E (E in upper-case):

    gremlin> g.E
    ==>e[#6:0][#5:0-friend->#5:1]

Traversal

The power of Gremlin lies in traversal. Once you have a graph loaded in your database you can traverse it in many ways.

Basic Traversal

To display all the outgoing edges of the first vertex created above, append .outE to the vertex. Example:

    gremlin> v1.outE
    ==>e[#6:0][#5:0-friend->#5:1]

To display all the incoming edges of the second vertex created in the previous examples, append .inE to the vertex. Example:

    gremlin> v2.inE
    ==>e[#6:0][#5:0-friend->#5:1]

In this case the edge is the same: it is the outgoing edge of #5:0 and the incoming edge of #5:1.

For more information, see Basic Traversal with Gremlin.
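
The reason one edge shows up in both results can be illustrated with a toy model in plain Java: each edge is stored once and records both the vertex it leaves and the vertex it enters. Everything in `EdgeDirectionSketch` is a made-up illustration and does not reflect how OrientDB or Blueprints store edges internally.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model, not the OrientDB or Blueprints API: each edge is stored once. */
class EdgeDirectionSketch {
    static final class Edge {
        final String out;   // ID of the vertex the edge leaves
        final String in;    // ID of the vertex the edge enters
        final String label;

        Edge(String out, String in, String label) {
            this.out = out;
            this.in = in;
            this.label = label;
        }

        @Override
        public String toString() {
            return "e[" + out + "-" + label + "->" + in + "]";
        }
    }

    private final List<Edge> edges = new ArrayList<>();

    Edge addEdge(String out, String in, String label) {
        Edge e = new Edge(out, in, label);
        edges.add(e);
        return e;
    }

    /** Edges leaving the vertex, comparable to v.outE in the console. */
    List<Edge> outE(String vertex) {
        List<Edge> result = new ArrayList<>();
        for (Edge e : edges) {
            if (e.out.equals(vertex)) {
                result.add(e);
            }
        }
        return result;
    }

    /** Edges entering the vertex, comparable to v.inE in the console. */
    List<Edge> inE(String vertex) {
        List<Edge> result = new ArrayList<>();
        for (Edge e : edges) {
            if (e.in.equals(vertex)) {
                result.add(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        EdgeDirectionSketch g = new EdgeDirectionSketch();
        g.addEdge("#5:0", "#5:1", "friend");
        System.out.println(g.outE("#5:0")); // prints [e[#5:0-friend->#5:1]]
        System.out.println(g.inE("#5:1"));  // prints [e[#5:0-friend->#5:1]]
    }
}
```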

Filter results

This example returns all the outgoing edges labeled 'friend' from every vertex.

    gremlin> g.V.outE('friend')
    ==>e[#6:0][#5:0-friend->#5:1]

Close the database

To close a graph use the shutdown() method:

    gremlin> g.shutdown()
    ==>null

This is not strictly necessary because OrientDB always closes the database when the Gremlin console quits.

Create complex paths

Gremlin lets you concatenate expressions to create more complex traversals in a single line:

    v1.outE.inV

Of course this could be much more complex. Below is an example using the graph taken from the official documentation:

    g = new OrientGraph('memory:test')
    
    // calculate basic collaborative filtering for vertex 1
    m = [:]
    g.v(1).out('likes').in('likes').out('likes').groupCount(m)
    m.sort{a,b -> a.value <=> b.value}
    
    // calculate the primary eigenvector (eigenvector centrality) of a graph
    m = [:]; c = 0;
    g.V.out.groupCount(m).loop(2){c++ < 1000}
    m.sort{a,b -> a.value <=> b.value}
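
To make the first pipeline less opaque, here is a rough plain-Java sketch of what out('likes').in('likes').out('likes').groupCount(m) counts: one hit per path from the starting user to an item, through any user who likes the same item. The class name, the map-based graph and the sample data (alice, bob, carol and items a to d) are all invented for illustration, and the sketch assumes a user likes each item at most once. As in the Gremlin expression, the starting user is not filtered out, so their own items are counted too.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Plain-Java illustration of the counting; it does not use Gremlin or OrientDB. */
class CollaborativeFilteringSketch {
    /** Counts paths user -> item -> co-liker -> item over a user-to-liked-items map. */
    static Map<String, Integer> recommend(Map<String, List<String>> likes, String user) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String item : likes.getOrDefault(user, Collections.<String>emptyList())) { // out('likes')
            for (List<String> coLikerItems : likes.values()) {
                if (!coLikerItems.contains(item)) {
                    continue;                                   // in('likes'): keep co-likers only
                }
                for (String candidate : coLikerItems) {         // out('likes')
                    counts.merge(candidate, 1, Integer::sum);   // groupCount(m)
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, List<String>> likes = new LinkedHashMap<>();
        likes.put("alice", Arrays.asList("a", "b"));
        likes.put("bob", Arrays.asList("a", "c"));
        likes.put("carol", Arrays.asList("b", "c", "d"));
        System.out.println(recommend(likes, "alice")); // prints {a=3, b=3, c=2, d=1}
    }
}
```

Items c and d are the actual recommendations here, since alice does not like them yet; a real pipeline would typically exclude the items the starting vertex already likes.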

Passing input parameters

Some Gremlin expressions need input parameters in order to run. This is the case, for example, for bound variables, as described in JSR223 Gremlin Script Engine. OrientDB provides a mechanism to pass variables to a Gremlin pipeline declared in a command, as shown below:

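    import java.util.HashMap;
    import java.util.Map;
    import com.orientechnologies.orient.core.sql.OCommandSQL;

    // Assumes `db` is an already opened OrientDB database instance.
    Map<String, Object> params = new HashMap<String, Object>();
    params.put("map1", new HashMap());
    params.put("map2", new HashMap());
    db.command(new OCommandSQL("select gremlin('"
        + "current.as('id').outE.label.groupCount(map1).optional('id').sideEffect{map2=it.map();map2+=map1;}"
        + "')")).execute(params);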

Declaring output

In the simplest case, the output of the last step (https://github.com/tinkerpop/gremlin/wiki/Gremlin-Steps) in the Gremlin pipeline corresponds to the output of the overall Gremlin expression. However, it is possible to instruct the Gremlin engine to consider any of the input variables as output. This can be declared as:

    // Assumes `db` is an already opened OrientDB database instance.
    Map<String, Object> params = new HashMap<String, Object>();
    params.put("map1", new HashMap());
    params.put("map2", new HashMap());
    // "output" names the input variable to return instead of the last step's result.
    params.put("output", "map2");
    db.command(new OCommandSQL("select gremlin('"
        + "current.as('id').outE.label.groupCount(map1).optional('id').sideEffect{map2=it.map();map2+=map1;}"
        + "')")).execute(params);

There are more ways to define the output of a Gremlin pipeline, so this mechanism is expected to be extended in the future. Please contact the OrientDB mailing list to discuss customized outputs.

Conclusions

Now that you have learned how to use Gremlin on top of OrientDB, the best place to go deeper into this powerful language is the Gremlin Wiki.
