
Neo4j is a graph database, which means that it does not use tables and rows to represent data logically; instead, it uses nodes and relationships. Both nodes and relationships can have a number of properties. While relationships must have one direction and one type, nodes can have a number of labels. For example, the following diagram shows three nodes and their relationships, where every node has a label (language or graph database), while relationships have a type (QUERY_LANGUAGE_OF and WRITTEN_IN).

The properties used in the graph shown in the following diagram are name, type, and from. Note that every relationship must have exactly one type and one direction, whereas labels are optional for nodes, and a node can have more than one.

[Figure: three nodes labeled language or graph database, connected by QUERY_LANGUAGE_OF and WRITTEN_IN relationships]
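The rules above (optional, multiple labels on nodes; exactly one type and one direction on relationships; properties on both) can be sketched in plain Java. This is only a toy in-memory model, not the Neo4j API: the class names PropertyGraphSketch, Node, and Relationship are hypothetical, and the node names, the property values, and the choice of which node carries which property are assumptions made for illustration, since the text does not spell out the contents of the diagram.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy in-memory model of the property graph rules; NOT the Neo4j API.
final class PropertyGraphSketch {

    // A node: zero or more labels, plus arbitrary properties.
    static final class Node {
        final Set<String> labels = new LinkedHashSet<>();
        final Map<String, Object> properties = new LinkedHashMap<>();

        Node(String... labels) {
            for (String label : labels) {
                this.labels.add(label);
            }
        }
    }

    // A relationship: exactly one type and one direction (start -> end).
    static final class Relationship {
        final Node start;
        final Node end;
        final String type;
        final Map<String, Object> properties = new LinkedHashMap<>();

        Relationship(Node start, String type, Node end) {
            if (start == null || end == null || type == null || type.isEmpty()) {
                throw new IllegalArgumentException("start, end and type are required");
            }
            this.start = start;
            this.type = type;
            this.end = end;
        }
    }

    // Builds a three-node example; names and property values are assumed.
    static List<Relationship> buildExample() {
        Node cypher = new Node("language");
        cypher.properties.put("name", "Cypher");
        cypher.properties.put("type", "query language");

        Node neo4j = new Node("graph database");
        neo4j.properties.put("name", "Neo4j");

        Node java = new Node("language");
        java.properties.put("name", "Java");
        java.properties.put("type", "programming language");

        List<Relationship> relationships = new ArrayList<>();
        relationships.add(new Relationship(cypher, "QUERY_LANGUAGE_OF", neo4j));
        relationships.add(new Relationship(neo4j, "WRITTEN_IN", java));
        return relationships;
    }

    public static void main(String[] args) {
        // Print each relationship as: start -[TYPE]-> end
        for (Relationship r : buildExample()) {
            System.out.println(r.start.properties.get("name") + " -[" + r.type + "]-> "
                    + r.end.properties.get("name"));
        }
    }
}
```

The constructor of Relationship rejects a missing type or endpoint, which mirrors the rule that a relationship cannot exist without one type and one direction, while Node accepts any number of labels, including none.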

Neo4j running modes

Neo4j can be used in two modes:

An embedded database in a Java application

A standalone server accessed via REST

In either case, the choice does not affect the way you query and work with the database. It is purely an architectural choice driven by the nature of the application (standalone or client-server), performance, monitoring, and data safety.

An embedded database

An embedded Neo4j database is the best choice for performance. It runs in the same process as the client application that hosts it and stores data in the given path. Thus, an embedded database must be created programmatically. We choose an embedded database for the following reasons:

When we use Java as the programming language for our project

When our application is standalone

Preparing the development environment

The fastest way to prepare the IDE for Neo4j is using Maven. Maven is a dependency management and automated building tool. In the following procedure, we will use NetBeans 7.4, but it works in a very similar way with the other IDEs (for Eclipse, you would need the m2eclipse plugin). The procedure is described as follows:

Create a new Maven project as shown in the following screenshot:

In the next page of the wizard, name the project, set a valid project location, and then click on Finish.

After NetBeans has created the project, expand Project Files in the project tree and open the pom.xml file. Inside the <project> element, insert the following XML code:

<dependencies>
  <dependency>
    <groupId>org.neo4j</groupId>
    <artifactId>neo4j</artifactId>
    <version>2.0.1</version>
  </dependency>
</dependencies>
<repositories>
  <repository>
    <id>neo4j</id>
    <url>http://m2.neo4j.org/content/repositories/releases/</url>
    <releases>
      <enabled>true</enabled>
    </releases>
  </repository>
</repositories>

This code tells Maven which dependency our project uses, that is, Neo4j. The version we have used here is 2.0.1. Of course, you can specify the latest available version.

Once the file is saved, Maven resolves the dependency, downloads the required JAR files, and updates the Java build path. Now, the project is ready to use Neo4j and Cypher.

Creating an embedded database

Creating an embedded database is straightforward. First of all, we need an instance of the GraphDatabaseFactory class, which can be created with the following code:

GraphDatabaseFactory graphDbFactory = new GraphDatabaseFactory();

Then, we can invoke the newEmbeddedDatabase method with the following code:

GraphDatabaseService graphDb = graphDbFactory.newEmbeddedDatabase("data/dbName");

Now, with the GraphDatabaseService instance, we can fully interact with the database: create nodes, create relationships, and set properties and indexes.

Invoking Cypher from Java

To execute Cypher queries on a Neo4j database, you need an instance of ExecutionEngine. This class is responsible for parsing and running Cypher queries, and it returns results in an ExecutionResult instance:

import org.neo4j.cypher.javacompat.ExecutionEngine;
import org.neo4j.cypher.javacompat.ExecutionResult;
// ...
ExecutionEngine engine = new ExecutionEngine(graphDb);
ExecutionResult result = engine.execute("MATCH (e:Employee) RETURN e");

Note that we use the org.neo4j.cypher.javacompat package and not the org.neo4j.cypher package even though they are almost the same. The reason is that Cypher is written in Scala, and Cypher authors provide us with the former package for better Java compatibility.

With the result, we can do one of the following:

Dumping to a string value

Converting to a single column iterator

Iterating over the full row

Dumping to a string is useful for testing purposes:

String dumped = result.dumpToString();

If we print the dumped string to the standard output stream, we will get the following result:

[Figure: console output of the dumped result]

Here, we have a single column (e) that contains the nodes. Each node is dumped with all its properties. The numbers between the square brackets are the node IDs, which are unique long values assigned by Neo4j when the node is created.

When the result has a single column, or we need only one column of the result, we can get an iterator over that column with the following code:

import org.neo4j.graphdb.ResourceIterator;
// ...
ResourceIterator<Node> nodes = result.columnAs("e");

Then, we can iterate that column in the usual way, as shown in the following code:

while (nodes.hasNext()) {
    Node node = nodes.next();
    // do something with node
}

However, Neo4j provides a syntactic-sugar utility that shortens the iteration code:

import org.neo4j.helpers.collection.IteratorUtil;
// ...
for (Node node : IteratorUtil.asIterable(nodes)) {
    // do something with node
}

If we need to iterate over a multiple-column result, we can write the code in the following way:

ResourceIterator<Map<String, Object>> rows = result.iterator();
for (Map<String, Object> row : IteratorUtil.asIterable(rows)) {
    Node n = (Node) row.get("e");
    try (Transaction t = n.getGraphDatabase().beginTx()) {
        // do something with node
    }
}
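Because each row is a map keyed by column name, reading a single column amounts to picking the same key out of every row. The following plain-Java sketch illustrates that idea with hand-made rows standing in for a query result. It is not Neo4j's implementation of columnAs; the class ColumnProjection, the method column, and the sample rows with their name and age columns are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of rows as maps; NOT Neo4j's implementation of columnAs.
final class ColumnProjection {

    // Collects the values of one named column from an iterator of rows.
    static <T> List<T> column(Iterator<Map<String, Object>> rows, String name, Class<T> type) {
        List<T> values = new ArrayList<>();
        while (rows.hasNext()) {
            Map<String, Object> row = rows.next();
            if (!row.containsKey(name)) {
                throw new IllegalArgumentException("No such column: " + name);
            }
            // Class.cast fails fast if the column holds an unexpected type.
            values.add(type.cast(row.get(name)));
        }
        return values;
    }

    public static void main(String[] args) {
        // Two made-up rows with the columns "name" and "age".
        Map<String, Object> first = new LinkedHashMap<>();
        first.put("name", "Alice");
        first.put("age", 34);
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("name", "Bob");
        second.put("age", 41);

        List<Map<String, Object>> rows = Arrays.asList(first, second);
        List<String> names = column(rows.iterator(), "name", String.class);
        System.out.println(names);
    }
}
```

Passing the expected type explicitly keeps the cast checked at run time, at the cost of one extra argument compared with an unchecked generic cast.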

The iterator method returns an iterator of maps, where the keys are the names of the columns. Note that when we work with nodes, even those returned by a Cypher query, we have to work in a transaction. In fact, Neo4j requires that every operation on the database, whether reading or writing, runs in a transaction. The only exception is launching a Cypher query. If we launch the query within an existing transaction, Cypher behaves like any other operation: no change is persisted to the database until we commit the transaction. If we run the query outside any transaction, however, Cypher opens a transaction for us and commits the changes at the end of the query.
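The rule that nothing is persisted until the transaction commits can be illustrated with a toy model in plain Java. This is not Neo4j code: ToyStore, Tx, and write are hypothetical names, and the store is just an in-memory list. The sketch only loosely mirrors the pattern of Neo4j 2.x transactions, in which the transaction is marked with success() and the outcome is applied when the try-with-resources block closes it.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of "nothing is persisted until the transaction commits"; NOT Neo4j code.
final class ToyStore {
    private final List<String> committed = new ArrayList<>();

    Tx beginTx() {
        return new Tx();
    }

    // Returns a copy of the values that have actually been committed.
    List<String> committed() {
        return new ArrayList<>(committed);
    }

    // Buffers writes and applies them on close() only if success() was called.
    final class Tx implements AutoCloseable {
        private final List<String> pending = new ArrayList<>();
        private boolean success;

        void write(String value) {
            pending.add(value);
        }

        void success() {
            success = true;
        }

        @Override
        public void close() {
            if (success) {
                committed.addAll(pending);
            }
            pending.clear();
        }
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();

        // Not marked as successful: the write is discarded on close.
        try (Tx tx = store.beginTx()) {
            tx.write("discarded");
        }

        // Marked as successful: the write is applied on close.
        try (Tx tx = store.beginTx()) {
            tx.write("kept");
            tx.success();
        }

        System.out.println(store.committed());
    }
}
```

Inside the second block the store still looks empty until the block ends, which is the same visibility rule the paragraph describes for changes made within an open transaction.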