JDO : Query API

Once you have persisted objects you need to query them. For example if you have a web application representing an online store, the user asks to see all products of a particular type, ordered by the price. This requires you to query the datastore for these products. JDO allows support for several query languages using its API. DataNucleus provides querying using

Note that for some datastores additional query languages may be available specific to that datastore - please check the datastores documentation. The query language you choose is your choice, typically dependent on the skillset of the developers of your application.

We recommend using JDOQL for queries wherever possible since it is object-based and datastore agnostic, giving you extra flexibility in the future. If not possible using JDOQL, only then use a language appropriate to the datastore in question

Creating a query

The principal ways of creating a query are

  • Specifying the query language, and using a single-string form of the query
    Query q = pm.newQuery("javax.jdo.query.JDOQL", 
        "SELECT FROM mydomain.MyClass WHERE field2 < threshold PARAMETERS java.util.Date threshold");
    or alternatively
    Query q = pm.newQuery("SQL", "SELECT * FROM MYTABLE WHERE COL1 == 25);
  • A "named" query, (pre-)defined in metadata (refer to metadata docs).
    Query<MyClass> q = pm.newNamedQuery(MyClass.class, "MyQuery1");
  • JDOQL : Use the single-string form of the query
    Query q = pm.newQuery("SELECT FROM mydomain.MyClass WHERE field2 < threshold PARAMETERS java.util.Date threshold");
  • JDOQL : Use the declarative API to define the query
    Query<MyClass> q = pm.newQuery(MyClass.class);
    q.setFilter("field2 < threshold");
    q.declareParameters("java.util.Date threshold");
  • JDOQL : Use the Typed Query API to define the query
    JDOQLTypedQuery<MyClass> q = pm.newJDOQLTypedQuery(MyClass.class);
    QMyClass cand = QMyClass.candidate();
    List<Product> results = q.filter(cand.field2.lt(q.doubleParameter("threshold"))).executeList();

Please note that with the query API you can also specify execution time information for the query, such as whether it executes in memory, or whether to apply a datastore timeout etc.

Compiling a query

An intermediate step once you have your query defined, if you want to check its validity, is to compile it. You do this as follows


If the query is invalid, then a JDO exception will be thrown.

Setting query parameters

Queries can be made flexible and reusable by defining parameters as part of the query, so that we can execute the same query with different sets of parameters and minimise resources.

// JDOQL Using named parameters
Query<Product> q = pm.newQuery(Product.class);
q.setFilter("this.name == :name && this.serialNo == :serial");

Map params = new HashMap();
params.put("name", "Walkman");
params.put("serial", "123021");

// JDOQL Using numbered parameters
Query<Product> q = pm.newQuery(Product.class);
q.setFilter("this.name == ?1 && this.serialNo == ?2");

q.setParameters("Walkman", "123021");

Executing a query

So we have set up our query. We now execute it. We have various methods to do this, depending on what result we are expecting etc

Object result = q.execute();

Object result = q.execute(paramVal1);

Object result = q.execute(paramVal1, paramVal2);

Object result = q.executeWithArray(new Object[]{paramVal1, paramVal2});

Object result = q.executeWithMap(paramMap);

List results = q.executeList();

Object result = q.executeUnique();

List<ResultClass> results = q.executeResultList(ResultClass.class);

ResultClass result = q.executeResultUnique(ResultClass.class);

Flush changes before execution

When using optimistic transactions all updates to data are held until flush()/commit(). This means that executing a query may not take into account changes made during that transaction in some objects. DataNucleus allows a convenience of calling flush() just before execution of queries so that all updates are taken into account. The property name is datanucleus.query.flushBeforeExecution and defaults to "false".

To do this on a per query basis for JDO you would do


You can also specify this for all queries using a persistence property datanucleus.query.flushBeforeExecution which would then apply to ALL queries for that PMF.

Controlling the execution : Vendor extensions

JDO's query API allows implementations to support extensions and provides a simple interface for enabling the use of such extensions on queries. You can set extensions one by one, or via a map.

q.extension("extension1", "value1").extension("extension2", "value2");
Map exts = new HashMap();
exts.put("extension1", value1);
exts.put("extension2", value2);

Named Query

With the JDO API you can either define a query at runtime, or define it in the MetaData/annotations for a class and refer to it at runtime using a symbolic name. This second option means that the method of invoking the query at runtime is much simplified. To demonstrate the process, lets say we have a class called Product (something to sell in a store). We define the JDO Meta-Data for the class in the normal way, but we also have some query that we know we will require, so we define the following in the Meta-Data.

    <package name="mydomain">
        <class name="Product">
            <query name="SoldOut" language="javax.jdo.query.JDOQL"><![CDATA[
            SELECT FROM mydomain.Product WHERE status == "Sold Out"

So we have a JDOQL query called "SoldOut" defined for the class Product that returns all Products (and subclasses) that have a status of "Sold Out". Out of interest, what we would then do in our application to execute this query woule be

Query<Product> q = pm.newNamedQuery(mydomain.Product.class,"SoldOut");
List<Product> results = q.executeList();

The above example was for the JDOQL object-based query language. We can do a similar thing using SQL, so we define the following in our MetaData for our Product class

    <package name="mydomain">
        <class name="Product">
            <query name="PriceBelowValue" language="javax.jdo.query.SQL"><![CDATA[

So here we have an SQL query that will return the names of all Products that have a price less than a specified value. This leaves us the flexibility to specify the value at runtime. So here we run our named query, asking for the names of all Products with price below 20 euros.

Query<Product> q = pm.newNamedQuery(mydomain.Product.class, "PriceBelowValue");
List<Product> results = q.executeList();

All of the examples above have been specifed within the <class> element of the MetaData. You can, however, specify queries below <jdo> in which case the query is not scoped by a particular candidate class. In this case you must put your queries in any of the following MetaData files


Saving a Query as a Named Query

DataNucleus JDO also allows you to create a query, and then save it as a "named" query for later reuse. You do this as follows

Query q = pm.newQuery("SELECT FROM Product p WHERE ...");

and you can thereafter access the query via

Query q = pm.newNamedQuery(Product.class, "MyQuery");

Controlling the execution : FetchPlan

When a Query is executed it executes in the datastore, which returns a set of results. DataNucleus could clearly read all results from this ResultSet in one go and return them all to the user, or could allow control over this fetching process. JDO provides a fetch size on the Fetch Plan to allow this control. You would set this as follows

Query q = pm.newQuery(...);

fetch size has 3 possible values.

  • FETCH_SIZE_OPTIMAL - allows DataNucleus full control over the fetching. In this case DataNucleus will fetch each object when they are requested, and then when the owning transaction is committed will retrieve all remaining rows (so that the Query is still usable after the close of the transaction).
  • FETCH_SIZE_GREEDY - DataNucleus will read all objects in at query execution. This can be efficient for queries with few results, and very inefficient for queries returning large result sets.
  • A positive value - DataNucleus will read this number of objects at query execution. Thereafter it will read the objects when requested.

In addition to the number of objects fetched, you can also control which fields are fetched for each object of the candidate type. This is controlled via the FetchPlan. For RDBMS any single-valued member will be fetched in the original SQL query, but with multiple-valued members this is not supported. However what will happen is that any collection field will be retrieved in a single SQL query for all candidate objects; this avoids the "N+1" problem, resulting in 1 original SQL query plus 1 SQL query per collection member. Note that you can disable this by either not putting multi-valued fields in the FetchPlan, or by setting the query extension "datanucleus.rdbms.query.multivaluedFetch" to "none" (default is "exists" using the single SQL per field). For non-RDBMS datastores the collection/map is stored by way of a Collection of ids of the related objects in a single "column" of the object and so is retrievable in the same query. See also Fetch Groups.

DataNucleus also allows an extension to give further control. As mentioned above, when the transaction containing the Query is committed, all remaining results are read so that they can then be accessed later (meaning that the query is still usable). Where you have a large result set and you don't want this behaviour you can turn it off by specifying a Query extension

q.extension("datanucleus.query.loadResultsAtCommit", "false");

so when the transaction is committed, no more results will be available from the query.

In some situations you don't want all FetchPlan fields retrieving, and DataNucleus provides an extension to turn this off, like this

q.extension("datanucleus.query.useFetchPlan", "false");

Control over locking of fetched objects

JDO allows control over whether objects found by a query are locked during that transaction so that other transactions can't update them in the meantime. To do this you would do

Query q = pm.newQuery(...);

You can also specify this for all queries for all PMs using a PMF property datanucleus.SerializeRead. In addition you can perform this on a per-transaction basis by doing


If the datastore in use doesn't support locking of objects then this will do nothing

Controlling the execution : timeout on datastore reads


Sets the timeout for this query (in milliseconds). Will throw a JDOUnsupportedOperationException if the query implementation doesn't support timeouts.

Controlling the execution : timeout on datastore writes


Sets the timeout for this query (in milliseconds) when it is a delete/update. Will throw a JDOUnsupportedOperationException if the query implementation doesn't support timeouts.

Large Result Sets : Loading Results at Commit()

When a transaction is committed by default all remaining results for a query are loaded so that the query is usable thereafter. With a large result set you clearly don't want this to happen. So in this case you should set the extension datanucleus.query.loadResultsAtCommit to false.

To do this on a per query basis for JDO you would do

query.addExtension("datanucleus.query.loadResultsAtCommit", "false");

Result Set : Caching of Results

When you execute a query, the query results are typically loaded when the user accesses each row. Results that have been read can then be cached locally. You can control this caching to optimise it for your memory requirements. You can set the query hint datanucleus.query.resultCacheType and it has the following possible values

  • weak : use a weak reference map for caching (default)
  • soft : use a soft reference map for caching
  • hard : use a Map for caching (objects not garbage collected)
  • none : no caching (hence uses least memory)

To do this on a per query basis, you would do

query.addExtension("datanucleus.query.resultCacheType", "weak");

Large Result Sets : Size

If you have a large result set you clearly don't want to instantiate all objects since this would hit the memory footprint of your application. To get the number of results many JDBC drivers, for example, will load all rows of the result set. This is to be avoided so DataNucleus provides control over the mechanism for getting the size of results. The persistence property datanucleus.query.resultSizeMethod has a default of last (which means navigate to the last object, hence hitting the JDBC driver problem). On RDBMS, if you set this to count then it will use a simple "count()" query to get the size.

To do this on a per query basis you would do

query.addExtension("datanucleus.query.resultSizeMethod", "count");

RDBMS : Result Set Type

For RDBMS datastores, java.sql.ResultSet defines three possible result set types.

  • forward-only : the result set is navegable forwards only
  • scroll-sensitive : the result set is scrollable in both directions and is sensitive to changes in the datastore
  • scroll-insensitive : the result set is scrollable in both directions and is insensitive to changes in the datastore

DataNucleus allows specification of this type as a query extension datanucleus.rdbms.query.resultSetType.

To do this on a per query basis you would do

query.addExtension("datanucleus.rdbms.query.resultSetType", "scroll-insensitive");

The default is forward-only. The benefit of the other two is that the result set will be scrollable and hence objects will only be read in to memory when accessed. So if you have a large result set you should set this to one of the scrollable values.

RDBMS : Result Set Control

DataNucleus RDBMS provides a useful extension allowing control over the ResultSet's that are created by queries. You have at your convenience some properties that give you the power to control whether the result set is read only, whether it can be read forward only, the direction of fetching etc.

To do this on a per query basis you would do

query.addExtension("datanucleus.rdbms.query.fetchDirection", "forward");
query.addExtension("datanucleus.rdbms.query.resultSetConcurrency", "read-only");

Alternatively you can specify these as persistence properties so that they apply to all queries for that PMF/EMF. Again, the properties are

  • datanucleus.rdbms.query.fetchDirection - controls the direction that the ResultSet is navigated. By default this is forwards only. Use this property to change that.
  • datanucleus.rdbms.query.resultSetConcurrency - controls whether the ResultSet is read only or updateable.

Bear in mind that not all RDBMS support all of the possible values for these options. That said, they do add a degree of control that is often useful.