In this article by Valentin Bojinov, the author of the book RESTful Web API Design with Node.JS, Second Edition, we will look for a better storage solution, one that scales easily together with our REST-enabled application. These days, the so-called NoSQL databases are used heavily in cloud environments. They have the following advantages over traditional transactional SQL databases:

  • They are schemaless; that is, they work with object representations rather than store the object state in one or several tables, depending on their complexity.
  • They are extendable, because they store an actual object. Data evolution is supported implicitly, so all you need to do is just call the operation that stores the object.
  • They are designed to be highly distributed and scalable.

Nearly all modern NoSQL solutions out there support clustering and can scale further, along with the load of your application. Additionally, most of them have REST-enabled interfaces over HTTP, which eases their usage over a load balancer in high-availability scenarios. Classical database drivers are usually not available for traditional client-side languages, such as JavaScript, because they require native libraries or drivers. However, the idea of NoSQL originated from using document data stores. Thus, most of them support the JSON format, which is native to JavaScript. Last but not least, most NoSQL solutions are open source and are available for free, with all the benefits that open source projects offer: community, examples, and freedom!

In this article, we will take a look at two NoSQL solutions: LevelDB and MongoDB. We will see how to design and test our database models, and finally, we will take a brief look at the content delivery network (CDN) infrastructures.

Key/value store – LevelDB

The first data store we will look at is LevelDB. It is an open source implementation developed by Google and written in C++. It is supported by a wide range of platforms, including Node.js. LevelDB is a key/value store; both the key and value are represented as binary data, so their content can vary from simple strings to binary representations of serialized objects in any format, such as JSON or XML. As it is a key/value data store, working with it is similar to working with an associative array: a key identifies an object uniquely within the store. Furthermore, the keys are kept in sorted order for better performance. But what makes LevelDB perform better than an arbitrary file storage implementation?

Well, it uses a “log-structured merge” topology, which stores all write operations in an in-memory log that is regularly transferred (flushed) to permanent storage in files called Sorted String Tables (SST). Read operations first attempt to retrieve entries from a cache containing the most commonly returned results. The size of the reading cache and the flush interval of the writing log are configurable parameters, which can be tuned to suit the application load. The following image shows this topology:

[Figure: LevelDB's log-structured merge topology]

The storage is a collection of string-sorted files with a maximum size of about 2 MB. Each file consists of 4 KB segments that are readable by a single read operation. The table files are not sorted in a straightforward manner, but are organized into levels. The log level is on top, before all other levels. It is always flushed to level 0, which consists of at most four SST files. When level 0 is filled, one SST file is compacted to a lower level, that is, level 1. The maximum size of level 1 is 10 MB.

When it gets filled, a file goes from level 1 to level 2. LevelDB assumes that the size of each lower level is ten times larger than the size of the previous level. So we have the following level structure:

  • Log with a configurable size
  • Level 0, consisting of four SST files
  • Level 1, with a maximum size of 10 MB
  • Level 2, with a maximum size of 100 MB
  • Level 3, with a maximum size of 1000 MB
  • Level n, with a maximum size of the previous level multiplied by 10, that is, 10^n MB
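The size progression in this list can be checked with a few lines of JavaScript. This is only a sketch: it assumes the tenfold growth between levels described above, with level 1 capped at 10 MB, and maxLevelSizeMB is a hypothetical helper name, not part of LevelDB or any of its bindings.

```javascript
// Sketch: maximum size of level n, assuming tenfold growth per level and
// a 10 MB cap on level 1. maxLevelSizeMB is a hypothetical helper.
function maxLevelSizeMB(level) {
  if (!Number.isInteger(level) || level < 1) {
    throw new RangeError('level must be an integer >= 1');
  }
  return Math.pow(10, level);
}

[1, 2, 3, 4].forEach(function (level) {
  console.log('Level ' + level + ': ' + maxLevelSizeMB(level) + ' MB');
});
```

Level 0 is left out on purpose, because it is bounded by a number of files rather than by a size.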

The hierarchical structure of this topology assures that newer data stays in the top levels, while older data is somewhere in the lower levels. A read operation always starts searching for a given key in the cache, and if it is not found there, the operation traverses through each level until the entry is found. An entry is considered non-existing if its key is not found anywhere within all levels.
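To make the write and read paths more tangible, here is a toy model of them in plain JavaScript. It is a deliberately simplified sketch and not how LevelDB is implemented: the read cache and compaction are left out, tables live in memory instead of SST files, and every name in it (ToyLsmStore, flushThreshold) is hypothetical.

```javascript
// Toy model of a log-structured store. Real LevelDB persists SST files on
// disk and compacts them between levels; none of that is modelled here.
function ToyLsmStore(flushThreshold) {
  this.flushThreshold = flushThreshold; // entries held in the log before a flush
  this.log = new Map();                 // in-memory write log (newest data)
  this.tables = [];                     // flushed sorted tables, newest first
}

ToyLsmStore.prototype.put = function (key, value) {
  this.log.set(key, value);
  if (this.log.size >= this.flushThreshold) {
    this.flush();
  }
};

// Move the log into an immutable table sorted by key.
ToyLsmStore.prototype.flush = function () {
  var entries = Array.from(this.log.entries()).sort(function (a, b) {
    return a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0;
  });
  this.tables.unshift(entries);
  this.log = new Map();
};

// Look in the log first, then in each table from newest to oldest.
ToyLsmStore.prototype.get = function (key) {
  if (this.log.has(key)) {
    return this.log.get(key);
  }
  for (var i = 0; i < this.tables.length; i++) {
    var hit = this.tables[i].find(function (entry) {
      return entry[0] === key;
    });
    if (hit) {
      return hit[1];
    }
  }
  return undefined; // the key is not present anywhere
};

var store = new ToyLsmStore(2);
store.put('b', 'old');
store.put('a', 1);      // the second entry triggers a flush
store.put('b', 'new');  // the newer value stays in the log
console.log(store.get('b'), store.get('a'), store.get('missing'));
```

Because the lookup walks from the newest data to the oldest, the updated value of b hides the older one that was already flushed.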

LevelDB provides get, put, and delete operations to manipulate data records as well as a batch operation that can be used to perform multiple data manipulations atomically; that is, either all or none of the operations in the batch are executed successfully. LevelDB can optionally use a compression library in order to reduce the size of the stored values. This compression is provided by Google’s Snappy compression library. It is optimized for speed with low performance impact rather than for maximum compression, so do not expect a high compression ratio.
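The all-or-nothing behavior of a batch can be illustrated without a database. The sketch below works on an in-memory Map; applyBatch is a hypothetical helper, and only the shape of the operations (objects with type, key, and value, where type is 'put' or 'del') is borrowed from what LevelUP's db.batch() accepts.

```javascript
// Hypothetical in-memory illustration of all-or-nothing batch semantics.
function applyBatch(store, operations) {
  // Validate everything first so that a bad operation leaves the store untouched.
  operations.forEach(function (op) {
    if (op.type !== 'put' && op.type !== 'del') {
      throw new Error('Unsupported operation type: ' + op.type);
    }
    if (op.key === undefined || op.key === null) {
      throw new Error('Every operation needs a key');
    }
  });
  // Work on a copy and hand it back only when every operation was applied.
  var next = new Map(store);
  operations.forEach(function (op) {
    if (op.type === 'put') {
      next.set(op.key, op.value);
    } else {
      next.delete(op.key);
    }
  });
  return next;
}

var contacts = new Map([['+359777123456', { firstname: 'Joe' }]]);
var updated = applyBatch(contacts, [
  { type: 'put', key: '+359777456789', value: { firstname: 'Ann' } },
  { type: 'del', key: '+359777123456' }
]);
console.log(Array.from(updated.keys()));

// A batch containing an invalid operation changes nothing at all.
var rejected = false;
try {
  applyBatch(contacts, [
    { type: 'put', key: 'x', value: 1 },
    { type: 'rename', key: 'y' }
  ]);
} catch (error) {
  rejected = true;
}
console.log('Rejected:', rejected, 'original size:', contacts.size);
```

LevelDB achieves the same guarantee on disk; the copy-then-swap approach here is just the simplest way to show the idea in memory.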

There are two popular libraries that enable LevelDB usage in Node: LevelDOWN and LevelUP.

Initially, LevelDOWN acted as the foundation binding and was implicitly provided with LevelUP, but after version 0.9 it was extracted and became available as a standalone binding for LevelDB. Currently, LevelUP declares no explicit dependency on LevelDOWN. LevelDOWN needs to be installed separately, as LevelUP expects it to be available on Node’s require() path.

LevelDOWN is a pure C++ interface used to bind Node and LevelDB. Though it is slightly faster than LevelUP, it has some state safety and API considerations, which make it less preferable than LevelUP. To be concrete, LevelDOWN does not keep track of the state of the underlying instance. Thus, it is up to the developers themselves not to open a connection more than once or use a data manipulating operation against a closed database connection, as this will cause errors. LevelUP provides state-safe operations out of the box. Thus, it prevents out-of-state operations from being sent to its foundation, LevelDOWN. Let’s move on to installing LevelUP by executing the following npm command:

npm install levelup leveldown

Even though the LevelUP module can be installed without LevelDOWN, it will not work at runtime, complaining that it can’t find an underlying dependency.
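To give an idea of what state-safe operations mean in practice, the following sketch wraps a backend and tracks whether it is open. It is not LevelUP's source code: StateSafeStore, markOpen, and the backend object are hypothetical names, and the behavior shown (queue writes until the store is open, fail them once it is closed) is a simplified assumption about how such a wrapper can work.

```javascript
// Hypothetical sketch of a state-tracking wrapper around a storage backend.
function StateSafeStore(backend) {
  this.backend = backend;   // anything with put(key, value, callback)
  this.status = 'opening';
  this.pending = [];        // operations issued before the store is open
}

// Called once the backend is ready; replays everything that was queued.
StateSafeStore.prototype.markOpen = function () {
  var self = this;
  this.status = 'open';
  this.pending.splice(0).forEach(function (op) {
    self.backend.put(op.key, op.value, op.callback);
  });
};

StateSafeStore.prototype.close = function () {
  this.status = 'closed';
};

StateSafeStore.prototype.put = function (key, value, callback) {
  if (this.status === 'closed') {
    callback(new Error('Database is not open'));
  } else if (this.status === 'opening') {
    this.pending.push({ key: key, value: value, callback: callback });
  } else {
    this.backend.put(key, value, callback);
  }
};

// Usage with a trivial in-memory backend.
var written = {};
var backend = {
  put: function (key, value, callback) {
    written[key] = value;
    callback(null);
  }
};
var safe = new StateSafeStore(backend);
var results = [];
safe.put('a', 1, function (error) { results.push(error ? 'failed' : 'stored'); });
safe.markOpen();   // the queued write is replayed now
safe.close();
safe.put('b', 2, function (error) { results.push(error ? 'failed' : 'stored'); });
console.log(results, written);
```

The backend never sees the write issued after close(), which is exactly the kind of out-of-state call a raw binding would have to reject on its own.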

Enough theory! Let’s see what the LevelUP API looks like. The following code snippet instantiates LevelDB and inserts a dummy contact record into it. It also exposes a /contacts/:number route so that this very record can be returned as a JSON output if queried appropriately. Let’s use it in a new project in the Enide studio, in a file named levelup.js:

var express = require('express')
  , http = require('http')
  , path = require('path')
  , bodyParser = require('body-parser')
  , logger = require('morgan')
  , methodOverride = require('method-override')
  , errorHandler = require('errorhandler')
  , levelup = require('levelup');
var app = express();
var url = require('url');
// all environments
app.set('port', process.env.PORT || 3000);
app.set('views', __dirname + '/views');
app.set('view engine', 'jade');
app.use(methodOverride());
app.use(bodyParser.json());
// development only
if ('development' == app.get('env')) {
  app.use(errorHandler());
}

var db = levelup('./contact',  {valueEncoding: 'json'});
db.put('+359777123456', {
  "firstname": "Joe",
  "lastname": "Smith",
  "title": "Mr.",
  "company": "Dev Inc.",
  "jobtitle": "Developer",
  "primarycontactnumber": "+359777123456",
  "othercontactnumbers": [
    "+359777456789",
    "+359777112233"],
  "primaryemailaddress": "[email protected]",
  "emailaddresses": [
    "[email protected]"],
  "groups": ["Dev","Family"]
});

app.get('/contacts/:number', function(request, response) {
  console.log(request.url + ' : querying for ' +
    request.params.number);
  db.get(request.params.number, function(error, data) {
    if (error) {
      response.writeHead(404, {
        'Content-Type' : 'text/plain'});
      response.end('Not Found');
      return;
    }
    response.setHeader('content-type', 'application/json');
    response.send(data);
  });
});
console.log('Running at port ' + app.get('port'));
http.createServer(app).listen(app.get('port'));

As the contact is inserted into LevelDB before the HTTP server is created, the record identified with the +359777123456 key will be available in the database when we execute our first GET request. But before requesting any data, let’s take a closer look at the usage of LevelUP. The get() function of LevelUP takes two arguments:

  • The first argument is the key to be used in the query.
  • The second argument is a handler function used to process the results. It, in turn, takes two arguments:
    • An error object, which is set only if an error has occurred during the query (for example, when the key does not exist)
    • The actual result entity from the database.
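The handler arguments described above can be turned into an HTTP status with a small helper. In this sketch statusForGet is a hypothetical name, and the callback invocations are simulated so that no database is needed. It relies on LevelUP marking the error for a missing key with a notFound property, which lets a handler tell a missing contact apart from a genuine failure.

```javascript
// Hypothetical helper: picks an HTTP status from the arguments that a
// LevelUP-style get() handler receives.
function statusForGet(error, data) {
  if (!error) {
    return 200;   // data holds the stored value
  }
  if (error.notFound) {
    return 404;   // the key does not exist
  }
  return 500;     // any other failure, for example an I/O problem
}

// Simulated handler invocations.
var missing = new Error('Key not found in database');
missing.notFound = true;

console.log(statusForGet(null, { firstname: 'Joe' }));
console.log(statusForGet(missing));
console.log(statusForGet(new Error('IO error')));
```

The route in the sample answers every error with 404, which is good enough for a demo; a helper like this one is a possible refinement.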

Let’s start it with node levelup.js and execute some test requests with the REST Client tool against http://localhost:3000/contacts/%2B359777123456. This can be seen in the following screenshot:

[Screenshot: REST Client showing the response to GET /contacts/%2B359777123456]

As expected, the response is a JSON representation of the contact inserted statically into LevelDB during the initialization of the application. Requesting any other key will result in an “HTTP 404 Not Found” response.
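The %2B in the request URL is the percent-encoded form of the leading + in the key. Some clients and intermediaries treat a literal + as a space, so encoding it is the safer choice. A quick way to build such a URL is the standard encodeURIComponent function; the variable names below are only illustrative.

```javascript
// Percent-encode the key before putting it into the request path.
var key = '+359777123456';
var encoded = encodeURIComponent(key);

console.log('http://localhost:3000/contacts/' + encoded);

// Decoding gives the original key back, which is what the route handler
// ends up using for the lookup.
console.log(decodeURIComponent(encoded) === key);
```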

This example demonstrates how to bind a LevelUP operation to an HTTP operation and process its results, but currently, it lacks support for inserting, editing, and deleting data. We will improve that with the next sample. It binds the HTTP GET, POST, and DELETE operations exposed via an Express route, /contacts/:number, to LevelUP’s get, put, and del operations:

var express = require('express')
  , http = require('http')
  , path = require('path')
  , bodyParser = require('body-parser')
  , logger = require('morgan')
  , methodOverride = require('method-override')
  , errorHandler = require('errorhandler')
  , levelup = require('levelup');
var app = express();
var url = require('url');

// all environments
app.set('port', process.env.PORT || 3000);
app.set('views', __dirname + '/views');
app.set('view engine', 'jade');

app.use(methodOverride());
app.use(bodyParser.json());

// development only
if ('development' == app.get('env')) {
  app.use(errorHandler());
}

var db = levelup('./contact',  {valueEncoding: 'json'});

app.get('/contacts/:number', function(request, response) {

  console.log(request.url + ' : querying for ' + 
  request.params.number);

  db.get(request.params.number, function(error, data) {
    if (error) {
      response.writeHead(404, {
        'Content-Type' : 'text/plain'});
      response.end('Not Found');
      return;
    }
    response.setHeader('content-type', 'application/json');
    response.send(data);
  });
});

app.post('/contacts/:number', function(request, response) {
  console.log('Adding new contact with primary number ' +
    request.params.number);
  db.put(request.params.number, request.body, function(error) {
    if (error) {
      response.writeHead(500, {
        'Content-Type' : 'text/plain'});
      response.end('Internal server error');
      return;
    }
    response.send(request.params.number + ' successfully inserted');
  });
});

// app.delete replaces app.del, which is deprecated in Express 4
app.delete('/contacts/:number', function(request, response) {

  console.log('Deleting contact with primary number ' +
    request.params.number);
  db.del(request.params.number, function(error) {
    if (error) {
      response.writeHead(500, {
        'Content-Type' : 'text/plain'});
      response.end('Internal server error');
      return;
    }
    response.send(request.params.number + ' successfully deleted');
  });
});

app.get('/contacts', function(request, response) {
  console.log('Listing all contacts');
  var is_first = true;

  response.setHeader('content-type', 'application/json');
  db.createReadStream()
    .on('data', function (data) {
      console.log(data.value);
      if (is_first == true) {
        response.write('[');	
      }
      else {
        response.write(',');
      }
      response.write(JSON.stringify(data.value));
      is_first = false;
    })
    .on('error', function (error) { 
      console.log('Error while reading', error)
    })
    .on('close', function () { console.log('Closing db stream');})
    .on('end', function () {
      console.log('Db stream closed');
      // An empty database never wrote '[', so emit a complete empty array
      response.end(is_first ? '[]' : ']');
    })
});

console.log('Running at port ' + app.get('port'));
http.createServer(app).listen(app.get('port'));

Perhaps the most interesting part of the preceding sample is the handler of the /contacts route. It writes a JSON array of all the contacts available in the database to the output stream of the HTTP response.
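The streaming logic of that handler can be exercised without a database or an HTTP server. The sketch below is a slight variation of it: pipeAsJsonArray is a hypothetical helper that opens the array before the first entry arrives, and the EventEmitter and the fakeResponse object are stand-ins for db.createReadStream() and the HTTP response.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper: forwards {key, value} entries from a LevelUP-style
// read stream to `output` as a single JSON array.
function pipeAsJsonArray(entries, output) {
  var isFirst = true;
  output.write('[');
  entries
    .on('data', function (entry) {
      if (!isFirst) {
        output.write(',');
      }
      output.write(JSON.stringify(entry.value));
      isFirst = false;
    })
    .on('end', function () {
      output.end(']');
    });
}

// Stand-ins for db.createReadStream() and the HTTP response.
var fakeStream = new EventEmitter();
var chunks = [];
var fakeResponse = {
  write: function (chunk) { chunks.push(chunk); },
  end: function (chunk) { chunks.push(chunk); }
};

pipeAsJsonArray(fakeStream, fakeResponse);
fakeStream.emit('data', { key: '1', value: { firstname: 'Joe' } });
fakeStream.emit('data', { key: '2', value: { firstname: 'Ann' } });
fakeStream.emit('end');

console.log(chunks.join(''));
```

Writing the opening bracket up front keeps the output valid JSON even when the stream emits no entries.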

LevelUP’s createReadStream method exposes a data handler for every key/value pair available. LevelDB itself is not aware of the format of its values, but because we opened the database with valueEncoding set to json, each value reaches the handler as a JavaScript object. We use the native JSON.stringify method to serialize it into the response, and having a real object at hand lets us implement any kind of logic. Let’s assume we want a function that flushes to the HTTP response only those contacts whose last name is Smith. Then we will need to add filtering logic to the data handler:

db.createReadStream()
  .on('data', function (data) {
    if (data.value.lastname.toString() == 'Smith') {
      // Write a bracket or separator only for entries that pass the filter
      if (is_first == true) {
        response.write('[');
      } else {
        response.write(',');
      }
      var jsonString = JSON.stringify(data.value);
      console.log('Adding Mr. ' + data.value.lastname +
        ' to the response');
      response.write(jsonString);
      is_first = false;
    } else {
      console.log('Skipping Mr. ' + data.value.lastname);
    }
  })
  .on('error', function (error) {
    console.log('Error while reading', error);
  })
  .on('close', function () {
    console.log('Closing db stream');
  })
  .on('end', function () {
    console.log('Db stream closed');
    // If nothing matched, '[' was never written, so send an empty array
    response.end(is_first ? '[]' : ']');
  });

This looks a bit artificial, doesn’t it? Well, this is all that LevelDB can possibly offer us, since LevelDB can search only by a single key. This makes it an inappropriate option for data that has to be indexed by several different attributes. This is where document stores come into play.

Summary

In this article, we looked at one type of NoSQL database: LevelDB, a key/value data store. We used it, through LevelUP, as the storage layer behind a small REST-enabled application.
