Skip to content
lvca edited this page Dec 22, 2012 · 3 revisions

Working with indexes

Introduction

OrientDB supports indexes by using an innovative algorithm: the MVRB-Tree. It has the best of:

  • B+Tree (the most used by RDBMS and NoSQL products) and
  • Red-Black Tree

This means fast insertion and update and super fast lookup.

Indexes can be handled like classes (or tables for RDBMS users) using the SQL language and prefixing with "index:" the index name. The index is like a class (or table) with 2 properties:

  • key, as the index's key
  • rid, as the RecordId that points to the record associated with the key

Index target

Indexes can be manual or automatic. Automatic indexes are automatically updated by OrientDB engine and are bound to schema properties, example "User.id". If you have a schema-less database and you want to create automatic index you need to create the class and the property before to use indexing. Manual indexes are handled by using SQL commands (see below) and can be used for arbitrary purposes.

Index types

Indexes can be of these types. The type can't be changed once created:

  • UNIQUE, doesn't allow duplicates. For composite index means uniqueness of composite keys.
  • NOTUNIQUE, allows duplicates
  • FULLTEXT, by indexing any single word of the text. It's used in query with the operator CONTAINSTEXT
  • DICTIONARY, like UNIQUE but in case the key already exists replace the record with the new one

Dictionary

Every single database has a default manual index of type "DICTIONARY" called "dictionary" with Strings as keys. This is very useful to:

  • handle root records of trees and graphs
  • handle singleton records used for configuration

Create an index

Creates a new index. To create an automatic index bound to a schema property use section "ON" of create index command or use as name the <class.property> notation. But assure to have created the schema for it before the index. See the example below.

Syntax:

    CREATE INDEX <name> [ON <class-name> (prop-names)] <type> [<key-type>]

Where:

  • name logical name of index. Can be <class>.<property> to create an automatic index bound to a schema property. In this case class is the class of the schema and property, is the property created into the class. Notice that in another case index name can't contain '.' symbol
  • class-name name of class that automatic index created for. Class with such name must already exist in database
  • prop-names comma-separated list of properties for which automatic index is created for. Property with such name must already exist in schema. If property belongs to one of the Map types (LINKMAP, EMBEDDEDMAP) you can specify will be keys or values used for index generation. Use "by key" or "by value" expressions for that, if nothing will be specified keys will be used during index creation.
  • type, between 'unique', 'notunique' and 'fulltext'
  • key-type, is the type of key (Optional). On automatic indexes is auto-determined by reading the target schema property where the index is created. If not specified for manual indexes, at run-time during the first insertion the type will be auto determined by reading the type of the class.

Examples of custom index:

    CREATE INDEX mostRecentRecords unique date

Examples of automatic index bound to the property "id" of class "User":

    CREATE PROPERTY User.id BINARY
    CREATE INDEX User.id UNIQUE

Another index for "id" property of class "User":

    CREATE INDEX indexForId ON User (id) unique

Examples of index for "thumbs" property of class "Movie".

    CREATE INDEX thumbsAuthor ON Movie (thumbs) unique;
    CREATE INDEX thumbsAuthor ON Movie (thumbs by key) unique; 
    CREATE INDEX thumbsValue ON Movie (thumbs by value) unique; 

Example of composite index

    CREATE PROPERTY Book.author STRING
    CREATE PROPERTY Book.title STRING
    CREATE PROPERTY Book.publicationYears EMBEDDEDLIST INTEGER
    CREATE INDEX books ON Book (author, title, publicationYears) unique

For more information look at Create index command.

Drop an index

Drop definitely an index. Linked records will be not removed. Syntax:

    DROP INDEX <name>

Where:

  • name of the index to drop

For more information look at Drop index command.

Lookup

Returns all the records with the requested key.

    select from index:<index-name> where key = <key>

Example:

    select from index:dictionary where key = 'Luke'

Put an entry

Inserts a new entry in the index with key and rid.

    insert into index:<index-name> (key,rid) values (<key>,<rid>)

Example:

    insert into index:dictionary (key,rid) values ('Luke',#10:4)

Query range

Retrieves the key ranges between min and max.

    select from index:<index-name> where key between <min> and <max>

Example:

    select from index:coordinates where key between 10.3 and 10.7

Remove entries by key

Deletes all the entries with the requested key.

    delete from index:<index-name> where key = <key>

Example:

    delete from index:addressbook where key = 'Luke'

Remove an entry

Deletes an entry by passing key and rid. Returns true if removed, otherwise false if the entry wasn't found.

    delete from index:<index-name> where key = <key> and rid = <rid>

Example:

    delete from index:dictionary where key = 'Luke' and rid = #10:4

Remove all references to a record

Removes all the entries with the rid passed.

    delete from index:<index-name> where rid = <rid>

Example:

    delete from index:dictionary where rid = #10:4

Count all the entries

Returns the size of the index as entries.

    select count(*) as size from index:<index-name>

Example:

    select count(*) as size from index:dictionary

Retrieve all the keys

Retrieves all the keys of the index.

    select key from index:<index-name>

Example:

    select key from index:dictionary

Retrieve all the entries

Retrieves all the entries of the index as pairs key and rid.

    select key, value from index:<index-name>

Example:

    select key, value from index:dictionary

Clear the index

Removes all the entries. The index will be empty after this call. This removes all the entries of an index.

    delete from index:<index-name>

Example:

    delete from index:dictionary

Composite keys

You can do the same operations with composite indexes.

Composite key is a collection of values by its nature, so syntactically it is defined as collection. For example, if we have class book, and index by its three fields: author, title and publication year. So, we can use following query for lookup book:

    select from index:books where key = ["Donald Knuth", "The Art of Computer Programming", 1968]

Or for lookup book with range of publication year:

    select from index:books where key between ["Donald Knuth", "The Art of Computer Programming", 1960] and ["Donald Knuth", "The Art of Computer Programming", 2000]

Partial match search

This is mechanism that allows searching index record by several first fields of its composite key. In this case rest fields with undefined value can have any value in result.

For example, if we don't when book has been published, we can throw away publication year field from query. So, result of following query will be all books with this author and title and any publication year

    select from index:books where key = ["Author", "The Art of Computer Programming"]

If we also don't know title, we can keep only author field in query. Result of following query will be all books of this author.

    select from index:books where key = ["Donald Knuth"]

Or equivalent

    select from index:books where key = "Donald Knuth"

Direct insertion for composite indexes

Unsupported yet.

Custom keys

OrientDB since release 1.0 supports custom keys for indexes. This could give a huge improvement if you want to minimize memory used using your own serializer.

Below an example to handle SHA-256 data as binary keys without using a STRING to represent it saving disk space, cpu and memory.

Create your own type

    public static class ComparableBinary implements Comparable<ComparableBinary> {
      private byte[]	value;
    
      public ComparableBinary(byte[] buffer) {
        value = buffer;
      }
    
      public int compareTo(ComparableBinary o) {
        final int size = value.length;
        for (int i = 0; i < size; ++i) {
          if (value[i] > o.value[i])
    	return 1;
          else if (value[i] < o.value[i])
            return -1;
        }
        return 0;
      }
    
      public byte[] toByteArray() {
        return value;
      }
    }

Create your own binary serializer

    public static class OHash256Serializer implements OBinarySerializer<ComparableBinary> {
    
      public static final OBinaryTypeSerializer INSTANCE = new OBinaryTypeSerializer();
      public static final byte ID = 100;
      public static final int LENGTH = 32;
    
      public int getObjectSize(final int length) {
        return length;
      }
    
      public int getObjectSize(final ComparableBinary object) {
        return object.toByteArray().length;
      }
    
      public void serialize(final ComparableBinary object, final byte[] stream, final int startPosition) {
        final byte[] buffer = object.toByteArray();
        System.arraycopy(buffer, 0, stream, startPosition, buffer.length);
      }
    
      public ComparableBinary deserialize(final byte[] stream, final int startPosition) {
        final byte[] buffer = Arrays.copyOfRange(stream, startPosition, startPosition + LENGTH);
        return new ComparableBinary(buffer);
      }
    
      public int getObjectSize(byte[] stream, int startPosition) {
        return LENGTH;
      }
    
      public byte getId() {
        return ID;
      }
    }

Register your serializer

    OBinarySerializerFactory.INSTANCE.registerSerializer(new OHash256Serializer(), null);
    index = database.getMetadata().getIndexManager().createIndex("custom-hash", "UNIQUE", new ORuntimeKeyIndexDefinition(OHash256Serializer.ID), null, null);

Usage

    ComparableBinary key1 = new ComparableBinary(new byte[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1 });
    ODocument doc1 = new ODocument().field("k", "key1");
    index.put(key1, doc1);

Tips and Tricks

Retrieve the list of indexes

Since OrientDB 1.0:

    select flatten(indexes) from cluster:0

Before OrientDB 1.0:

    select flatten(indexes) from #0:2 
Clone this wiki locally