Cassandra Datastores

DataNucleus supports a limited for of persisting/retrieving objects to/from Cassandra datastores (using the datanucleus-cassandra plugin, which utilises the DataStax Java driver). Simply specify your "connectionURL" as follows

datanucleus.ConnectionURL=cassandra:[{host1}[:{port}] [,{host2} [,{host3}]]]

where it will create a Cassandra cluster with contact points of host1 (host2, host3 etc), and if the port is specified on the first host then will use that as the port (no port specified on alternate hosts).

For example, to connect to a local server

datanucleus.ConnectionURL=cassandra:

The jars required to use DataNucleus Cassandra persistence are datanucleus-core, datanucleus-api-jdo/datanucleus-api-jpa, datanucleus-cassandra and cassandra-driver-core

There are tutorials available for use of DataNucleus with Cassandra for JDO and for JPA

Things to bear in mind with Cassandra usage :-

  • Creation of a PMF/EMF will create a Cluster. This will be closed then the PMF/EMF is closed.
  • Any PM/EM will use a single Session, by default, shared amongst all PM/EMs.
  • If you specify the persistence property datanucleus.cassandra.sessionPerManager to true then each PM/EM will have its own Session object.
  • You can set the number of connections per host with the persistence property datanucleus.mongodb.connectionsPerHost
  • Cassandra doesn't use transactions, so any JDO/JPA transaction operation is a no-op (i.e will be ignored).
  • This uses Cassandra 2.x (and CQL v3.x), not Thrift (like the previous unofficial attempts at a datanucleus-cassandra plugin used)
  • You need to specify the "schema" (datanucleus.mapping.Schema)
  • Queries are evaluated in-datastore when they only have (indexed) members and literals and using the operators ==, !=, >, >=, <, <=, &&, ||.
  • You can query the datastore using JDOQL, JPQL, or CQL

Queries : Cassandra CQL Queries

Note that if you choose to use Cassandra CQL Queries then these are not portable to any other datastore. Use JDOQL/JPQL for portability

Cassandra provides the CQL query language. To take a simple example using the JDO API

// Find all employees
PersistenceManager persistenceManager = pmf.getPersistenceManager();
Query q = pm.newQuery("CQL", "SELECT * FROM schema1.Employee");
// Fetch 10 Employee rows at a time
query.getFetchPlan().setFetchSize(10);
query.setResultClass(Employee.class);
List<Employee> results = (List)q.execute();

You can also query results as List<Object[]> without specifying a specific result type as shown below.

// Find all employees
PersistenceManager persistenceManager = pmf.getPersistenceManager();
Query q = pm.newQuery("CQL", "SELECT * FROM schema1.Employee");
// Fetch all Employee rows as Object[] at a time.
query.getFetchPlan().setFetchSize(-1);
List<Object[]> results = (List)q.execute();

So we are utilising the JDO API to generate a query and passing in the Cassandra "CQL".

If you wanted to use CQL with the JPA API, you would do

// Find all employees
Query q = em.createNativeQuery("SELECT * FROM schema1.Employee", Employee.class);
List<Employee> results = q.getResultList();

Note that the last argument to createNativeQuery is optional and you would get List<Object[]> returned otherwise.