Pattern Match Pattern
Many graph applications involve identifying patterns in a graph, that is, finding sub-graphs that match a particular topology and/or feature set. In single-relational graphs, usually only the topology is considered. For graphs that are labeled and have key/value pairs on their elements (i.e. property graphs), the features on that topology must be considered as well for pattern matching to be generally useful.
The table step is useful for cataloging parts of the graph that match a particular pattern. The best way to explain its use is by comparison with the graph pattern match language SPARQL. An example SPARQL query is provided below. The query returns those vertices that Marko knows along with their respective creations:
SELECT ?x ?y WHERE {
marko knows ?x
?x created ?y
}
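To make the semantics of the query concrete, here is a minimal sketch in plain Python (neither SPARQL nor Gremlin) that evaluates the two patterns as a join over an edge list. The edge list is assumed to mirror the TinkerGraph toy graph used later on this page, and the function name match_knows_created is hypothetical.

```python
# Edges of the toy graph as (subject, label, object) triples.
# Assumption: this mirrors the TinkerGraph toy graph used on this page.
EDGES = [
    ("marko", "knows", "vadas"),
    ("marko", "knows", "josh"),
    ("marko", "created", "lop"),
    ("josh", "created", "ripple"),
    ("josh", "created", "lop"),
    ("peter", "created", "lop"),
]

def match_knows_created(edges, start):
    """Return (?x, ?y) bindings for: start knows ?x . ?x created ?y"""
    bindings = []
    for s1, p1, x in edges:
        if s1 == start and p1 == "knows":
            # Second pattern: join on ?x as the subject of a 'created' edge.
            for s2, p2, y in edges:
                if s2 == x and p2 == "created":
                    bindings.append((x, y))
    return bindings

print(match_knows_created(EDGES, "marko"))
```

Vadas contributes no binding because he has no outgoing created edge, so only rows with ?x bound to josh remain.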
SPARQL is an excellent, intuitive language for defining patterns in an RDF graph (i.e. multi-relational graph). Unfortunately, RDF (and thus, SPARQL) does not support key/value pairs on the elements of the graph—it does not support property graphs (save through complex and verbose reification constructs). As such, SPARQL does not easily map over to the property graph domain.
In Gremlin, the above query is accomplished using the table step and its respective TablePipe and Table object. First, let's load the toy graph diagrammed in Defining a Property Graph.
gremlin> g = TinkerGraphFactory.createTinkerGraph()
==>tinkergraph[vertices:6 edges:6]
In the example below, a traversal is executed starting from marko (vertex 1). The table step selects the named steps (here x and y) of each path that reaches it and inserts them into the Table t as rows.
gremlin> t = new Table()
gremlin> g.v(1).out('knows').as('x').out('created').as('y').table(t)
==>v[5]
==>v[3]
gremlin> t
==>[x:v[4], y:v[5]]
==>[x:v[4], y:v[3]]
As such, components of the traversal's history (path) are saved into the table and serve as the desired return bindings. Given the SPARQL query above, the columns of t are equivalent to ?x and ?y.
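The idea of turning named path positions into table columns can be sketched in plain Python. This is a model of the behaviour shown above, not the TablePipe implementation; the OUT adjacency map and the helper names walk and table_rows are assumptions made for illustration.

```python
# Outgoing edges of the toy graph, keyed by (vertex id, edge label).
# Assumption: this mirrors the TinkerGraph toy graph.
OUT = {
    (1, "knows"): [2, 4],
    (1, "created"): [3],
    (4, "created"): [5, 3],
    (6, "created"): [3],
}

def walk(start, labels):
    """Return every full path from start that follows the edge labels in order."""
    paths = [[start]]
    for label in labels:
        paths = [p + [v] for p in paths for v in OUT.get((p[-1], label), [])]
    return paths

def table_rows(paths, names):
    """Keep only the named path positions; names maps column name -> path index."""
    return [{col: p[i] for col, i in names.items()} for p in paths]

paths = walk(1, ["knows", "created"])       # [[1, 4, 5], [1, 4, 3]]
rows = table_rows(paths, {"x": 1, "y": 2})  # x = vertex after 'knows', y = after 'created'
print(rows)
```

The start vertex (position 0 of each path) is not named, so it does not appear as a column, just as v[1] is absent from t above.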
Finally, to give the developer flexibility, it is possible to evaluate closures over those path components prior to inserting them into the table.
gremlin> t = new Table()
gremlin> g.v(1).out('knows').as('x').out('created').as('y').table(t){it.name}{it.name}
==>v[5]
==>v[3]
gremlin> t
==>[x:josh, y:ripple]
==>[x:josh, y:lop]
In the above Gremlin snippets, we found all the people that Marko knows who have created a product. Specifically, Marko knows Josh and Josh has created both Ripple and LoP.
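The closure variant can be modelled the same way. The sketch below is plain Python and assumes that closures pair up positionally with the named columns, which is what the {it.name}{it.name} example suggests; the NAME lookup and the insert_row helper are hypothetical.

```python
# Vertex id -> name property (assumed to match the toy graph).
NAME = {1: "marko", 2: "vadas", 3: "lop", 4: "josh", 5: "ripple", 6: "peter"}

def insert_row(table, columns, values, closures):
    """Append one row, passing the i-th value through the i-th closure first."""
    table.append({c: f(v) for c, v, f in zip(columns, values, closures)})

def name_of(vertex_id):
    return NAME[vertex_id]

t = []
# The two (x, y) pairs are the rows found by the traversal above.
for x, y in [(4, 5), (4, 3)]:
    insert_row(t, ["x", "y"], [x, y], closures=(name_of, name_of))
print(t)
```

Because the closures run before insertion, the table in this model holds plain strings rather than vertices.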
For those familiar with Pipes, TablePipe is a SideEffectPipe and, as such, can be “capped.”
gremlin> g.v(1).out('knows').as('x').out('created').as('y').table(new Table()).cap() >> 1
==>[x:v[4], y:v[5]]
==>[x:v[4], y:v[3]]
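As a rough illustration of what capping a side-effect step means, the following plain Python sketch uses a generator that records rows while passing elements through, and a cap function that drains it and returns the side effect. The names table_step and cap are stand-ins chosen for this sketch and do not reproduce the Pipes API.

```python
def table_step(pairs, table):
    """Side-effect step: record a row per (x, y) pair, then pass y through."""
    for x, y in pairs:
        table.append({"x": x, "y": y})
        yield y

def cap(step, side_effect):
    """Exhaust the step, discard what it emits, and return the side effect."""
    for _ in step:
        pass
    return side_effect

t = []
step = table_step([(4, 5), (4, 3)], t)
rows_before = len(t)   # the generator is lazy, so nothing is recorded yet
result = cap(step, t)
print(rows_before, result)
```

Without cap, the caller sees the elements flowing through the step (v[5] and v[3] in the earlier examples); with cap, the caller sees the table instead.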
It is possible to use table to match any pattern within a pipeline expression. Thus, it is possible to use table and loop within the same expression and get meaningful results.
gremlin> t = new Table()
gremlin> g.v(1).as('x').out.as('y').table(t).loop('x'){it.loops < 3}
==>v[5]
==>v[3]
gremlin> t
==>[x:v[1], y:v[2]]
==>[x:v[1], y:v[3]]
==>[x:v[1], y:v[4]]
==>[x:v[1], y:v[5]]
==>[x:v[1], y:v[3]]
gremlin> t.apply{it.age}{it.name}
==>[29, vadas]
==>[29, lop]
==>[29, josh]
==>[29, ripple]
==>[29, lop]
Table offers numerous methods. Here are some examples of their use:
gremlin> t
==>[x:josh, y:ripple]
==>[x:josh, y:lop]
gremlin> t.get(0,0)
==>josh
gremlin> t.get(0,'x')
==>josh
gremlin> t.getRow(0)
==>josh
==>ripple
gremlin> t.getColumn(1)
==>ripple
==>lop
gremlin> t.getRow(1).getColumn('y')
==>lop
gremlin> t.getRow(1).getColumn(1)
==>lop
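The accessors above address a cell either by position or by column name. A minimal Python model of that idea is sketched below; MiniTable and its method names are hypothetical and only mirror the calls shown in the console session, not the actual Table class. Row-level getColumn, as used in the last two calls, is not modelled.

```python
class MiniTable:
    """Hypothetical stand-in for Table: rows of values plus ordered column names."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.rows = []

    def add_row(self, *values):
        self.rows.append(list(values))

    def _index(self, column):
        # Accept either a position (int) or a column name (str).
        return column if isinstance(column, int) else self.columns.index(column)

    def get(self, row, column):
        return self.rows[row][self._index(column)]

    def get_row(self, row):
        return list(self.rows[row])

    def get_column(self, column):
        i = self._index(column)
        return [r[i] for r in self.rows]

t = MiniTable(["x", "y"])
t.add_row("josh", "ripple")
t.add_row("josh", "lop")
print(t.get(0, 0), t.get(0, "x"), t.get_row(0), t.get_column(1))
```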