Skip to content

zhou/ringo

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Ringo - Distributed key/value storage for immutable data
--------------------------------------------------------

Ringo is an experimental, distributed, replicating key-value store based
on consistent hashing and immutable data. Unlike many general-purpose
databases, Ringo is designed for a specific use case: For archiving
small (less than 4KB) or medium-size data items (<100MB) in real-time
so that the data can survive K - 1 disk breaks, where K is the desired
number of replicas, without any downtime, in a manner that scales to
terabytes of data. In addition to storing, Ringo should be able to
retrieve individual or small sets of data items with low latencies
(<10ms) and provide a convenient on-disk format for bulk data access.

WARNING: Ringo should not be used yet as a primary storage for critical
data. Due to the fact that Ringo treats all data immutable, data
corruption or loss should be improbable. However, Ringo needs more
testing in real-world settings before we can be reasonably sure that it
works as advertised. To make this happen, feel free to try it out, adapt 
it to your needs, and report your experiences!

If you find Ringo interesting, you might also want to check out a paper
about Amazon's Dynamo

http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf

and another Dynamo-like system, Scalaris, which is also implemented in
Erlang: http://code.google.com/p/scalaris/


Requirements
''''''''''''

Backend system (required):

- Erlang R12B or newer
- C toolchain and autotools (for building bfile)
- Lighttpd or other web server that supports SCGI

Test harness / Python interface (optional):

- pkill command (Debian package procps)
- Python
- Python module pycurl (Debian package python-pycurl)
- Python module cjson (Debian package python-cjson)


Directories
'''''''''''

doc/         Documentation

ring/        Ringo backend
ring/bfile   High-performance replacement for the Erlang's standard file 
             module
ring/src     Backend sources
ring/test    Test harness

ringogw/     Ringo web frontend
ringogw/src  Frontend sources
ringogw/web  Web interface
ringogw/py   Python interfaces for the Ringo frontend and for the Disco 
             map/reduce framework


Compiling
'''''''''

cd ringo
./compile.sh


Starting Ringo
''''''''''''''

First you need to initialize a number of virtual nodes - at least one is
required. A virtual node is defined by an empty directory whose name is
a random 128-bit integer in hexadecimal notation.

A convenience script, create_node.sh, is provided that can be used to
setup a virtual node. For instance,

ringo/create_node.sh trurl /data/ringo

sets up a new virtual node on the host trurl to the directory
/data/ringo. The script uses ssh to log in to the host. It asks for a
password unless key-based ssh authentication is properly set up (which
is recommended). You may run this script on different nodes as many
times as you like.

You need to list all hostnames that may possibly host a Ringo node
in a file called ~/.hosts.erlang. For further information about this
file, see an Erlang manual page at "man 3erl net_adm". For instance the
following command creates the required file for the virtual node that
was initialized above:

echo "'trurl'." > ~/.hosts.erlang

After a number of virtual nodes have been initialized, Ringo may be
started. Again, a simple script is provided that starts up all virtual
nodes on a specified host. For instance,

ringo/start_nodes.sh trurl /data/ringo

starts up all nodes on the host trurl. After a while, the ring should
be up and running.

The web frontend provides a convenient way to monitor status of the 
system. An example configuration file for the Lighttpd web server is
provided at ringo/ringogw/lighttpd.conf that communicates with the
frontend process over SCGI. The following script starts up the 
web server and the frontend process:

ringo/ringogw/start_ringogw.sh

Now you should see the status page at http://localhost:15000. On the
status page, you can click nodes on the leftmost panel to see domains
that they contain. You can click a domain on the middle panel to see its
replicas. By clicking a replica, you can see its status. You might need
to wait for 10 seconds or so, and reload the page, to see new nodes and
domains appear.


Usage
'''''

You can create domain, put keys and get keys from Ringo using simple
HTTP requests. Assuming that you have started a ring as instructed
above, you can create a new domain called "foobar" with the following
POST request. Here curl is used to make a request but any other HTTP
client would work as well:

curl -d "" http://localhost:15000/mon/data/foobar?create

You can put a new key-value pair to the domain with following POST
request:

curl -d "testvalue" http://localhost:15000/mon/data/foobar/testkey

and retrieve the value given the key with a GET request as follows:

curl http://localhost:15000/mon/data/foobar/testkey

this returns all values assigned with the key "testkey". If only one
value is required, the parameter ?single can be used:

curl http://localhost:15000/mon/data/foobar/testkey?single

A Python class Ringo is provided at ringo/ringogw/py/ringogw.py that
encapsulates the above HTTP requests in Python function calls.

An experimental interface for Disco, an an open-source implementation
of the Map/Reduce framework (http://discoproject.org), can be found at
ringo/ringogw/py/ringodisco.py. This function, which implements the
Disco's map reader interface, makes it possible to use data stored in
Ringo as input to a Disco job. The function provides a particularly
efficient way of accessing data directly from Ringo's live data files.


Running the Test Harness
''''''''''''''''''''''''

Ringo comes with a set of tests that cover all main features of the
system. You need to start the ringogw web frontend process to run
the tests, as shown above. After the frontend has started, you can
run the tests with the following command:

cd ringo/ring/; ./test-system.sh


Contact information
'''''''''''''''''''

Bug reports, patches, comments etc. are welcome! Contact person is 
Ville Tuulos who can be reached at

ville.h.tuulos -a- nokia.com

or on the IRC channel #discoproject at Freenode.

About

Distributed key-value storage a'la Amazon Dynamo

Resources

License

Stars

Watchers

Forks

Packages

No packages published