Neo4j Datastores

DataNucleus supports persisting/retrieving objects to/from Neo4j graph datastores using the datanucleus-neo4j plugin, which utilises the Neo4j Java driver. Simply specify your "connectionURL" as follows:

datanucleus.ConnectionURL=neo4j:{db_location}

For example

datanucleus.ConnectionURL=neo4j:myNeo4jDB

You then create your PMF/EMF as normal and use JDO/JPA as normal.
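
As a minimal sketch of what "as normal" can look like for JDO, the properties below could be handed to the standard JDOHelper.getPersistenceManagerFactory(Map) call. The helper class name Neo4jProps is hypothetical, and the PMF class name assumes the usual DataNucleus JDO implementation class; only the connection URL format comes from the text above.

```java
import java.util.Properties;

// Hypothetical helper (not part of DataNucleus): builds the persistence
// properties for a Neo4j datastore at the given location.
class Neo4jProps {
    static Properties build(String dbLocation) {
        Properties props = new Properties();
        // Assumption: the standard DataNucleus JDO PMF implementation class.
        props.setProperty("javax.jdo.PersistenceManagerFactoryClass",
                "org.datanucleus.api.jdo.JDOPersistenceManagerFactory");
        // URL format described above: neo4j:{db_location}
        props.setProperty("datanucleus.ConnectionURL", "neo4j:" + dbLocation);
        return props;
    }
}
```

With the JDO API and the datanucleus-neo4j plugin on the classpath, the result of Neo4jProps.build("myNeo4jDB") would then be passed to JDOHelper.getPersistenceManagerFactory(...) to obtain the PMF.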

There are tutorials available for the use of DataNucleus with Neo4j, for JDO and for JPA.

Things to bear in mind with Neo4j usage:

  • Querying can be performed using JDOQL or JPQL. Some components of a filter are handled in the datastore, and the remainder in-memory. Currently any expression involving a field (in the same 'table') or a literal is handled in-datastore, as are the operators &&, ||, >, >=, <, <=, ==, and !=. The majority of ordering and result clauses can also be evaluated in the datastore, as can query result range restrictions.
  • When an object is persisted it becomes a Node in Neo4j. You define the names of the properties of that node by specifying the "column" name using JDO/JPA metadata.
  • Any 1-1, 1-N, M-N, or N-1 relation is persisted as a Relationship object in Neo4j, and any positioning of elements in a List or array is preserved via a property on the Relationship.
  • This plugin is at the prototype stage, so we would welcome feedback and, better still, contributions to fully exploit the power of Neo4j. Register your interest on the DataNucleus Forum.
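
To make the querying point above more concrete, here is a rough conceptual sketch of the split between in-datastore and in-memory evaluation. It is not the plugin's code: the class and method names are hypothetical, and it assumes that any filter component not in the operator list quoted above (for example a method call such as startsWith) is left for in-memory evaluation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the rule described in the text, not DataNucleus code.
class FilterSplitSketch {
    // Operators the text lists as being handled in the datastore.
    static final Set<String> IN_DATASTORE_OPERATORS = new HashSet<>(
            Arrays.asList("&&", "||", ">", ">=", "<", "<=", "==", "!="));

    // True if every component used by a filter is in the listed set,
    // i.e. nothing would need to be evaluated in-memory under this model.
    static boolean fullyInDatastore(List<String> componentsUsed) {
        return IN_DATASTORE_OPERATORS.containsAll(componentsUsed);
    }
}
```

Under this model, a filter such as value > 10 && value <= 20 uses only listed operators, whereas value > 10 && name.startsWith("A") would leave the startsWith part to be evaluated in-memory.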



Persistence Implementation

Let's take some example classes, and then describe how these are persisted in Neo4j.

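import java.util.Set;
import javax.jdo.annotations.IdGeneratorStrategy;
import javax.jdo.annotations.PersistenceCapable;
import javax.jdo.annotations.Persistent;

@PersistenceCapable
public class Store
{
    // valueStrategy takes the IdGeneratorStrategy enum rather than a String
    @Persistent(primaryKey="true", valueStrategy=IdGeneratorStrategy.IDENTITY)
    long id;

    // 1-1 relation, persisted as a Relationship to the Inventory Node
    Inventory inventory;

    ...
}

@PersistenceCapable
public class Inventory
{
    @Persistent(primaryKey="true", valueStrategy=IdGeneratorStrategy.IDENTITY)
    long id;

    // 1-N relation, persisted as one Relationship per Product Node
    Set<Product> products;

    ...
}

@PersistenceCapable
public class Product
{
    @Persistent(primaryKey="true", valueStrategy=IdGeneratorStrategy.IDENTITY)
    long id;

    String name;

    double value;

    ...
}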

When we persist a Store object, which has an Inventory, which has three Product objects, we get the following:

  • Node for the Store, with the "id" represented as the node id
  • Node for the Inventory, with the "id" represented as the node id
  • Relationship between the Store Node and the Inventory Node, with the relationship type as "SINGLE_VALUED", and with the property DN_FIELD_NAME as "inventory"
  • Node for Product #1, with properties for "name" and "value" as well as the "id" represented as the node id
  • Node for Product #2, with properties for "name" and "value" as well as the "id" represented as the node id
  • Node for Product #3, with properties for "name" and "value" as well as the "id" represented as the node id
  • Relationship between the Inventory Node and the Product #1 Node, with the relationship type "MULTI_VALUED" and the property DN_FIELD_NAME as "products"
  • Relationship between the Inventory Node and the Product #2 Node, with the relationship type "MULTI_VALUED" and the property DN_FIELD_NAME as "products"
  • Relationship between the Inventory Node and the Product #3 Node, with the relationship type "MULTI_VALUED" and the property DN_FIELD_NAME as "products"
  • Index in "DN_TYPES" for the Store Node with "class" as "mydomain.Store"
  • Index in "DN_TYPES" for the Inventory Node with "class" as "mydomain.Inventory"
  • Index in "DN_TYPES" for the Product Node with "class" as "mydomain.Product"
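
The relationships in the list above can be summarised with a small plain-Java model. This is only an illustration of the layout described here, not the Neo4j API or the plugin's implementation; the Rel class and the node labels are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the persisted layout, not Neo4j or DataNucleus code.
class GraphLayoutSketch {
    // Stand-in for a Neo4j Relationship: its type plus the DN_FIELD_NAME property.
    static class Rel {
        final String from, to, type, fieldName;
        Rel(String from, String to, String type, String fieldName) {
            this.from = from; this.to = to; this.type = type; this.fieldName = fieldName;
        }
    }

    // Relationships for one Store with one Inventory holding productCount Products.
    static List<Rel> relationshipsFor(int productCount) {
        List<Rel> rels = new ArrayList<>();
        rels.add(new Rel("Store", "Inventory", "SINGLE_VALUED", "inventory"));
        for (int i = 1; i <= productCount; i++) {
            rels.add(new Rel("Inventory", "Product#" + i, "MULTI_VALUED", "products"));
        }
        return rels;
    }
}
```

For the three-Product example this gives one SINGLE_VALUED relationship and three MULTI_VALUED relationships, alongside the five Nodes.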

Note that, to handle polymorphism more easily, if we also have a class Book that extends Product, then when we persist an object of this type we will have two entries in "DN_TYPES" for this Node: one with "class" as "mydomain.Book" and one with "class" as "mydomain.Product". We can then interrogate the Index to find the real inheritance level of this Node.
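
The idea of one index entry per level of the inheritance tree can be sketched as follows. This is an illustrative model only, with hypothetical names; it uses simple class names because the example classes are nested, whereas the entries described above use fully-qualified names such as "mydomain.Book".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the "DN_TYPES" entries for a Node, not DataNucleus code.
class TypeIndexSketch {
    static class Product {}
    static class Book extends Product {}

    // One entry per class in the hierarchy, most-derived class first,
    // stopping before java.lang.Object.
    static List<String> typeEntries(Class<?> cls) {
        List<String> entries = new ArrayList<>();
        for (Class<?> c = cls; c != null && c != Object.class; c = c.getSuperclass()) {
            entries.add(c.getSimpleName());
        }
        return entries;
    }
}
```

In this model a Book yields the entries "Book" and "Product", while a plain Product yields only "Product", which is how the real inheritance level of a Node could be distinguished.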



Query Implementation

In terms of querying, a JDOQL/JPQL query is converted into a generic query compilation, and an attempt is then made to convert this into a Neo4j "Cypher" query. Not all syntaxes are convertible currently, and the query falls back to in-memory evaluation in that case.