JDO : Datastore Connections

DataNucleus utilises datastore connections as follows

  • PMF : single connection at any one time for datastore-based value generation. Obtained just for the operation, then released
  • PMF : single connection at any one time for schema-generation. Obtained just for the operation, then released
  • PM : single connection at any one time. When in a transaction the connection is held from the point of retrieval until the transaction commits or rolls back; the exact point at which the connection is obtained is defined more fully below. When used for non-transactional operations the connection is obtained just for the specific operation (unless configured to retain it).

If you have multiple threads using the same PersistenceManager then you can get "ConnectionInUse" problems where another operation on another thread comes in and tries to perform something while that first operation is still in use. This happens because the JDO spec requires an implementation to use a single datastore connection at any one time. When this situation crops up the user ought to use multiple PersistenceManagers.

Another important aspect is use of queries for Optimistic transactions, or for non-transactional contexts. In these situations it isn't possible to keep the datastore connection open indefinitely and so when the Query is executed the ResultSet is then read into core making the queried objects available thereafter.


Transactional Context

For pessimistic/datastore transactions a connection will be obtained from the datastore when the first persistence operation is initiated. This datastore connection will be held for the duration of the transaction until such time as either commit() or rollback() are called.

For optimistic transactions the connection is only obtained when flush()/commit() is called. When flush() is called, or the transaction committed a datastore connection is finally obtained and it is held open until commit/rollback completes. when a datastore operation is required. The connection is typically released after performing that operation. So datastore connections, in general, are held for much smaller periods of time. This is complicated slightly by use of the persistence property java.jdo.option.IgnoreCache. When this is set to false, the connection, once obtained, is not released until the call to commit()/rollback().

Note that for Neo4j/MongoDB a single connection is used for the duration of the PM for all transactional and nontransactional operations.


Nontransactional Context

When performing non-transactional operations, the default behaviour is to obtain a connection when needed, and release it after use. With RDBMS you have the option of retaining this connection ready for the next operation to save the time needed to obtain it; this is enabled by setting the persistence property datanucleus.connection.nontx.releaseAfterUse to false.

Note that for Neo4j/MongoDB a single connection is used for the duration of the PM for all transactional and nontransactional operations.


User Connection

JDO defines a mechanism for users to access the native connection to the datastore, so that they can perform other operations as necessary. You obtain a connection as follows (for RDBMS)

// Obtain the connection from the JDO implementation
JDOConnection conn = pm.getDataStoreConnection();
try
{
    Object native = conn.getNativeConnection();
    
    ... use the "sqlConn" connection to perform some operations.
}
finally
{
    // Hand the connection back to the JDO implementation
    conn.close();
}

For the datastores supported by DataNucleus, the "native" object is of the following types

  • RDBMS : java.sql.Connection
  • Excel : org.apache.poi.hssf.usermodel.HSSFWorkbook
  • OOXML : org.apache.poi.hssf.usermodel.XSSFWorkbook
  • ODF : org.odftoolkit.odfdom.doc.OdfDocument
  • LDAP : javax.naming.ldap.LdapContext
  • MongoDB : com.mongodb.DB
  • HBase : NOT SUPPORTED
  • JSON : NOT SUPPORTED
  • XML : org.w3c.dom.Document
  • NeoDatis : org.neodatis.odb.ODB
  • GAE Datastore : com.google.appengine.api.datastore.DatastoreService
  • Neo4j : org.neo4j.graphdb.GraphDatabaseService
  • Cassandra : com.datastax.driver.core.Session

The "JDOConnection" in the case of DataNucleus is a wrapper to the native connection for the type of datastore being used. You now have a connection allowing direct access to the datastore. Things to bear in mind with this connection

  • You must return the connection back to the PersistenceManager before performing any JDO PM operation. You do this by calling conn.close()
  • If you don't return the connection and try to perform a JDO PM operation which requires the connection then a JDOUserException is thrown.

Connection Pooling : when specifying the connection via URL

When you create a PersistenceManagerFactory using a connection URL, driver name, and the username/password, this does not necessarily pool the connections (so they would be efficiently opened/closed when needed to utilise datastore resources in an optimum way). For some of the supported datastores DataNucleus allows you to utilise a connection pool to efficiently manage the connections to the datastore when specifying the datastore via the URL. We currently provide support for the following


You need to specify the persistence property datanucleus.connectionPoolingType to be whichever of the external pooling libraries you wish to use (or "None" if you explicitly want no pooling). DataNucleus provides two sets of connections to the datastore - one for transactional usage, and one for non-transactional usage. If you want to define a different pooling for nontransactional usage then you can also specify the persistence property datanucleus.connectionPoolingType.nontx to whichever is required.


RDBMS : JDBC driver properties with connection pool

If using RDBMS and you have a JDBC driver that supports custom properties, you can still use DataNucleus connection pooling and you need to specify the properties in with your normal persistence properties, but add the prefix datanucleus.connectionPool.driver. to the property name that the driver requires. For example if an Oracle JDBC driver accepts defaultRowPrefetch then you would specify something like

datanucleus.connectionPool.driver.defaultRowPrefetch=50

and it will pass in defaultRowPrefetch as "50" into the driver used by the connection pool.


RDBMS : Apache DBCP

DataNucleus provides a builtin version of DBCP to provide pooling. This is automatically selected if using RDBMS, unless you specify otherwise. An alternative is to use an external DBCP ( DBCP). This is accessed by specifying the persistence property datanucleus.connectionPoolingType etc like this

// Specify our persistence properties used for creating our PMF
Properties props = new Properties();
properties.setProperty("datanucleus.ConnectionDriverName","com.mysql.jdbc.Driver");
properties.setProperty("datanucleus.ConnectionURL","jdbc:mysql://localhost/myDB");
properties.setProperty("datanucleus.ConnectionUserName","login");
properties.setProperty("datanucleus.ConnectionPassword","password");
properties.setProperty("datanucleus.connectionPoolingType", "DBCP");

So the PMF will use connection pooling using DBCP. To do this you will need commons-dbcp, commons-pool and commons-collections JARs to be in the CLASSPATH.

You can also specify persistence properties to control the actual pooling. The currently supported properties for DBCP are shown below

# Pooling of Connections
datanucleus.connectionPool.maxIdle=10
datanucleus.connectionPool.minIdle=3
datanucleus.connectionPool.maxActive=5
datanucleus.connectionPool.maxWait=60

# Pooling of PreparedStatements
datanucleus.connectionPool.maxStatements=0

datanucleus.connectionPool.testSQL=SELECT 1

datanucleus.connectionPool.timeBetweenEvictionRunsMillis=2400000
datanucleus.connectionPool.minEvictableIdleTimeMillis=18000000

RDBMS : C3P0

DataNucleus allows you to utilise a connection pool using C3P0 to efficiently manage the connections to the datastore. C3P0 is a third-party library providing connection pooling. This is accessed by specifying the persistence property datanucleus.connectionPoolingType. To utilise C3P0-based connection pooling we do this

// Specify our persistence properties used for creating our PMF
Properties props = new Properties();
properties.setProperty("datanucleus.ConnectionDriverName","com.mysql.jdbc.Driver");
properties.setProperty("datanucleus.ConnectionURL","jdbc:mysql://localhost/myDB");
properties.setProperty("datanucleus.ConnectionUserName","login");
properties.setProperty("datanucleus.ConnectionPassword","password");
properties.setProperty("datanucleus.connectionPoolingType", "C3P0");

So the PMF will use connection pooling using C3P0. To do this you will need the C3P0 JAR to be in the CLASSPATH. If you want to configure C3P0 further you can include a "c3p0.properties" in your CLASSPATH - see the C3P0 documentation for details.

You can also specify persistence properties to control the actual pooling. The currently supported properties for C3P0 are shown below

# Pooling of Connections
datanucleus.connectionPool.maxPoolSize=5
datanucleus.connectionPool.minPoolSize=3
datanucleus.connectionPool.initialPoolSize=3

# Pooling of PreparedStatements
datanucleus.connectionPool.maxStatements=0

RDBMS : Proxool

DataNucleus allows you to utilise a connection pool using Proxool to efficiently manage the connections to the datastore. Proxool is a third-party library providing connection pooling. This is accessed by specifying the persistence property datanucleus.connectionPoolingType. To utilise Proxool-based connection pooling we do this

// Specify our persistence properties used for creating our PMF
Properties props = new Properties();
properties.setProperty("datanucleus.ConnectionDriverName","com.mysql.jdbc.Driver");
properties.setProperty("datanucleus.ConnectionURL","jdbc:mysql://localhost/myDB");
properties.setProperty("datanucleus.ConnectionUserName","login");
properties.setProperty("datanucleus.ConnectionPassword","password");
properties.setProperty("datanucleus.connectionPoolingType", "Proxool");

So the PMF will use connection pooling using Proxool. To do this you will need the proxool and commons-logging JARs to be in the CLASSPATH.

You can also specify persistence properties to control the actual pooling. The currently supported properties for Proxool are shown below

datanucleus.connectionPool.maxConnections=10

datanucleus.connectionPool.testSQL=SELECT 1

RDBMS : BoneCP

DataNucleus allows you to utilise a connection pool using BoneCP to efficiently manage the connections to the datastore. BoneCP is a third-party library providing connection pooling. This is accessed by specifying the persistence property datanucleus.connectionPoolingType. To utilise BoneCP-based connection pooling we do this

// Specify our persistence properties used for creating our PMF
Properties props = new Properties();
properties.setProperty("datanucleus.ConnectionDriverName","com.mysql.jdbc.Driver");
properties.setProperty("datanucleus.ConnectionURL","jdbc:mysql://localhost/myDB");
properties.setProperty("datanucleus.ConnectionUserName","login");
properties.setProperty("datanucleus.ConnectionPassword","password");
properties.setProperty("datanucleus.connectionPoolingType", "BoneCP");

So the PMF will use connection pooling using BoneCP. To do this you will need the BoneCP JAR (and SLF4J, google-collections) to be in the CLASSPATH.

You can also specify persistence properties to control the actual pooling. The currently supported properties for BoneCP are shown below

# Pooling of Connections
datanucleus.connectionPool.maxPoolSize=5
datanucleus.connectionPool.minPoolSize=3

# Pooling of PreparedStatements
datanucleus.connectionPool.maxStatements=0

RDBMS : HikariCP

DataNucleus allows you to utilise a connection pool using HikariCP to efficiently manage the connections to the datastore. HikariCP is a third-party library providing connection pooling. This is accessed by specifying the persistence property datanucleus.connectionPoolingType. To utilise this connection pooling we do this

// Specify our persistence properties used for creating our PMF
Properties props = new Properties();
properties.setProperty("datanucleus.ConnectionDriverName","com.mysql.jdbc.Driver");
properties.setProperty("datanucleus.ConnectionURL","jdbc:mysql://localhost/myDB");
properties.setProperty("datanucleus.ConnectionUserName","login");
properties.setProperty("datanucleus.ConnectionPassword","password");
properties.setProperty("datanucleus.connectionPoolingType", "HikariCP");

So the PMF will use connection pooling using HikariCP. To do this you will need the HikariCP JAR (and SLF4J, javassist as required) to be in the CLASSPATH.

You can also specify persistence properties to control the actual pooling. The currently supported properties for HikariCP are shown below

# Pooling of Connections
datanucleus.connectionPool.maxPoolSize=5

RDBMS : Tomcat

DataNucleus allows you to utilise a connection pool using Tomcats JDBC Pool to efficiently manage the connections to the datastore. This is accessed by specifying the persistence property datanucleus.connectionPoolingType. To utilise Tomcat-based connection pooling we do this

// Specify our persistence properties used for creating our PMF
Properties props = new Properties();
properties.setProperty("datanucleus.ConnectionDriverName","com.mysql.jdbc.Driver");
properties.setProperty("datanucleus.ConnectionURL","jdbc:mysql://localhost/myDB");
properties.setProperty("datanucleus.ConnectionUserName","login");
properties.setProperty("datanucleus.ConnectionPassword","password");
properties.setProperty("datanucleus.connectionPoolingType", "Tomcat");

So the PMF will use a DataSource with connection pooling using Tomcat. To do this you will need the tomcat-jdbc JAR to be in the CLASSPATH.

You can also specify persistence properties to control the actual pooling, just like other pools.


RDBMS : Manually create a DataSource (e.g DBCP, C3P0, Proxool, etc)

We could have used the built-in DBCP support which internally creates a DataSource ConnectionFactory, alternatively the support for external DBCP, C3P0, Proxool, BoneCP etc, however we can also do this manually if we so wish. Let's demonstrate how to do this with one of the most used pools Apache Commons DBCP

With DBCP you need to generate a javax.sql.DataSource, which you will then pass to DataNucleus. You do this as follows

// Load the JDBC driver
Class.forName(dbDriver);

// Create the actual pool of connections 
ObjectPool connectionPool = new GenericObjectPool(null);

// Create the factory to be used by the pool to create the connections
ConnectionFactory connectionFactory = new DriverManagerConnectionFactory(dbURL, dbUser, dbPassword);

// Create a factory for caching the PreparedStatements
KeyedObjectPoolFactory kpf = new StackKeyedObjectPoolFactory(null, 20);

// Wrap the connections with pooled variants
PoolableConnectionFactory pcf = 
    new PoolableConnectionFactory(connectionFactory, connectionPool, kpf, null, false, true);

// Create the datasource
DataSource ds = new PoolingDataSource(connectionPool);

// Create our PMF
Map properties = new HashMap();
properties.put("javax.jdo.option.ConnectionFactory", ds);
PersistenceManagerFactory pmf = JDOHelper.getPersistenceManagerFactory(properties);

Note that we haven't passed the dbUser and dbPassword to the PMF since we no longer need to specify them - they are defined for the pool so we let it do the work. As you also see, we set the data source for the PMF. Thereafter we can sit back and enjoy the performance benefits. Please refer to the documentation for DBCP for details of its configurability (you will need commons-dbcp, commons-pool, and commons-collections in your CLASSPATH to use this above example).


RDBMS : Lookup a DataSource using JNDI

DataNucleus allows you to use connection pools (java.sql.DataSource) bound to a javax.naming.InitialContext with a JNDI name. You first need to create the DataSource in the container (application server/web server), and secondly you define the datanucleus.ConnectionFactoryName property with the DataSource JDNI name.

The following example uses a properties file that is loaded before creating the PersistenceManagerFactory. The PersistenceManagerFactory is created using the JDOHelper.

datanucleus.ConnectionFactoryName=YOUR_DATASOURCE_JNDI_NAME
Properties properties = new Properties();

// the properties file is in your classpath
PersistenceManagerFactory pmf = JDOHelper.getPersistenceManagerFactory("/yourpath/yourfile.properties");

Please read more about this in RDBMS DataSources.


LDAP : JNDI

If using an LDAP datastore you can use the following persistence properties to enable connection pooling

datanucleus.connectionPoolingType=JNDI

Once you have turned connection pooling on if you want more control over the pooling you can also set the following persistence properties

  • datanucleus.connectionPool.maxPoolSize : max size of pool
  • datanucleus.connectionPool.initialPoolSize : initial size of pool

RDBMS : Data Sources

DataNucleus allows use of a data source that represents the datastore in use. This is often just a URL defining the location of the datastore, but there are in fact several ways of specifying this data source depending on the environment in which you are running.

Java Client Environment : Non-managed Context

DataNucleus permits you to take advantage of using database connection pooling that is available on an application server. The application server could be a full JEE server (e.g WebLogic) or could equally be a servlet engine (e.g Tomcat, Jetty). Here we are in a non-managed context, and we use the following properties when creating our PersistenceManagerFactory, and refer to the JNDI data source of the server.

If the data source is avaiable in WebLogic, the simplest way of using a data source outside the application server is as follows.

Hashtable ht = new Hashtable();
ht.put(Context.INITIAL_CONTEXT_FACTORY,"weblogic.jndi.WLInitialContextFactory");
ht.put(Context.PROVIDER_URL,"t3://localhost:7001");
    
Context ctx = new InitialContext(ht);
DataSource ds = (DataSource) ctx.lookup("jdbc/datanucleus");
    
Map properties = new HashMap();
properties.setProperty("datanucleus.ConnectionFactory",ds);
PersistenceManagerFactory pmf = JDOHelper.getPersistenceManagerFactory(properties);

If the data source is avaiable in Websphere, the simplest way of using a data source outside the application server is as follows.

Hashtable ht = new Hashtable();
ht.put(Context.INITIAL_CONTEXT_FACTORY,"com.ibm.websphere.naming.WsnInitialContextFactory");
ht.put(Context.PROVIDER_URL,"iiop://server:orb port");

Context ctx = new InitialContext(ht);
DataSource ds = (DataSource) ctx.lookup("jdbc/datanucleus");

Map properties = new HashMap();
properties.setProperty("javax.jdo.PersistenceManagerFactoryClass",
                    "org.datanucleus.api.jdo.JDOPersistenceManagerFactory");
properties.setProperty("datanucleus.ConnectionFactory",ds);
PersistenceManagerFactory pmf = JDOHelper.getPersistenceManagerFactory(properties);

Servlet Environment : Managed Context

As an example of setting up such a JNDI data source for Tomcat 5.0, here we would add the following file to $TOMCAT/conf/Catalina/localhost/ as "datanucleus.xml"

<?xml version='1.0' encoding='utf-8'?>
<Context docBase="/home/datanucleus/" path="/datanucleus">
    <Resource name="jdbc/datanucleus" type="javax.sql.DataSource"/>
    <ResourceParams name="jdbc/datanucleus">
        <parameter>
            <name>maxWait</name>
            <value>5000</value>
        </parameter>
        <parameter>
            <name>maxActive</name>
            <value>20</value>
        </parameter>
        <parameter>
            <name>maxIdle</name>
            <value>2</value>
        </parameter>

        <parameter>
            <name>url</name>
            <value>jdbc:mysql://127.0.0.1:3306/datanucleus?autoReconnect=true</value>
        </parameter>
        <parameter>
            <name>driverClassName</name>
            <value>com.mysql.jdbc.Driver</value>
        </parameter>
        <parameter>
            <name>username</name>
            <value>mysql</value>
        </parameter>
        <parameter>
            <name>password</name>
            <value></value>
        </parameter>
    </ResourceParams>
</Context>

With this Tomcat JNDI data source we would then specify the PMF ConnectionFactoryName as java:comp/env/jdbc/datanucleus.

   
Properties properties = new Properties();
properties.setProperty("datanucleus.ConnectionFactoryName","java:comp/env/jdbc/datanucleus");
PersistenceManagerFactory pmf = JDOHelper.getPersistenceManagerFactory(properties);

JEE Environment : Managed Context

As in the above example, we can also run in a managed context, in a JEE/Servlet environment, and here we would make a minor change to the specification of the JNDI data source depending on the application server or the scope of the jndi: global or component.

Using JNDI deployed in global environment:

   
Properties properties = new Properties();
properties.setProperty("datanucleus.ConnectionFactoryName","jdbc/datanucleus");
PersistenceManagerFactory pmf = JDOHelper.getPersistenceManagerFactory(properties);

Using JNDI deployed in component environment:

   
Properties properties = new Properties();
properties.setProperty("datanucleus.ConnectionFactoryName","java:comp/env/jdbc/datanucleus");
PersistenceManagerFactory pmf = JDOHelper.getPersistenceManagerFactory(properties);

See also : JEE Tutorial for JDO