Tutorial: Introduction to the NoSQL world

lvca edited this page Dec 29, 2012 · 4 revisions

In recent years we have witnessed an explosion of NoSQL products. Everything has been said and written about the meaning of this word: it is not a campaign against the SQL language, but rather a way to open the minds of developers (and others) to the possibilities beyond the relational database.

Alternatives to the relational DBMS have always existed, but they were used only in niche markets, primarily telecommunications, medicine, CAD, and so on. Today, fortunately, interest in these alternatives is growing. Not surprisingly, many of the largest web companies use a NoSQL product: Google, Amazon, Facebook, Foursquare, Twitter, Disney, etc.

So what drives these companies to leave the old, safe road (the relational DBMS) for a new one? We can summarize the reasons as performance, scalability (often extreme), a light footprint, productivity, and flexibility in managing the schema. On closer analysis, these motivations are, not by chance, the requirements of any modern web application. Only a few years ago a developer designing an application had to plan for hundreds of concurrent users. Today it is not uncommon to target thousands or even millions of users.

Given these premises, everything must be called into question. On the application side, we have already seen a gradual relaxation of frameworks, standards, and some best practices in favor of productivity and lightness. On the database side, however, the situation has remained more or less unchanged for over 30 years. The relational DBMS has played the leading role from the 70s up to today, or rather until yesterday. Languages have evolved, as have methodologies, but the DBMS has always stayed the same. Some vendors have introduced small facelifts, but the substance remains: tables, records, and joins.

Now we have realized that there is something more powerful, productive, lean, flexible, and scalable than the relational database: the NoSQL database, in its many variations. To bring some clarity, four broad categories have been identified:

  • Key / Value database: the model is reduced to a key / value hash table, often distributed across multiple servers (DHT). Key and value types are simple. In some cases keys are grouped into buckets. The most famous products are Dynamo, Cassandra, Berkeley DB, and Redis;
  • Column-oriented database: an extension of the Key / Value type, where the value can be a complex type. Google's BigTable and Amazon's SimpleDB belong to this type;
  • Document database: a more complex model than the ones above. Each record can have multiple fields, without necessarily defining a schema. The best known and most used document databases are MongoDB and CouchDB;
  • Graph database (GraphDB): represents any kind of domain model, even a complex one, as a graph, where each entity is a Vertex and the relationships between entities are Edges. Vertices and Edges can have arbitrary properties and are generally indexed to speed up queries. Examples of GraphDBs are Neo4j, Sones, and InfiniteGraph.
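
To make the four categories above more concrete, here is a minimal sketch that mimics each data model with plain Python structures. It is only an illustration of the shape of the data: it does not use any of the products mentioned, and every name and sample value in it (`key_value`, `column_store`, `documents`, `vertices`, `edges`, `neighbors`) is hypothetical.

```python
# Illustrative only: plain Python structures standing in for each data model.
# All names and sample values are hypothetical; no real database is involved.

# 1. Key / Value: a simple value looked up by a simple key.
key_value = {"user:42": "Alice"}

# 2. Column-oriented: the value behind a key is itself a set of named columns.
column_store = {"user:42": {"name": "Alice", "city": "Rome"}}

# 3. Document: records with multiple (possibly nested) fields and no fixed
#    schema, so two records may carry different fields.
documents = [
    {"_id": 1, "name": "Alice", "address": {"city": "Rome"}},
    {"_id": 2, "name": "Bob", "tags": ["admin"]},
]

# 4. Graph: entities are vertices, relationships are edges; both can carry
#    arbitrary properties.
vertices = {1: {"name": "Alice"}, 2: {"name": "Bob"}}
edges = [{"out": 1, "in": 2, "label": "knows", "since": 2012}]


def neighbors(vertex_id, label):
    """Return the names of vertices reached from vertex_id via edges with the given label."""
    return [
        vertices[edge["in"]]["name"]
        for edge in edges
        if edge["out"] == vertex_id and edge["label"] == label
    ]
```

For example, `neighbors(1, "knows")` walks the single edge from Alice to Bob. A real graph database would typically answer such a traversal through its own indexes and query language rather than a linear scan like this one.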

Each of these categories has its own peculiarities and limitations. None is better than all the others: it depends on the use case at hand. After all, NoSQL means just that: choose the best tool for your specific case. In this series of articles we will look at OrientDB, a new-generation, open source NoSQL product.

Next: Tutorial: Installation