JDO : Query API

Once you have persisted objects you need to query them. For example, if you have a web application representing an online store and the user asks to see all products of a particular type, ordered by price, you need to query the datastore for these products. JDO supports several query languages through its API. DataNucleus provides querying using JDOQL (the JDO query language) and, where the datastore supports it, SQL.

Note that for some datastores additional query languages may be available specific to that datastore; please check the datastore's documentation. Which query language you use is up to you, and typically depends on the skillset of the developers of your application.

We recommend using JDOQL for queries wherever possible since it is object-based and datastore agnostic, giving you extra flexibility in the future. Only if a query cannot be expressed in JDOQL should you use a language appropriate to the datastore in question.

Creating a query

The principal ways of creating a query are

  • Specifying the query language, and using a single-string form of the query
    Query q = pm.newQuery("javax.jdo.query.JDOQL", 
        "SELECT FROM mydomain.MyClass WHERE field2 < threshold " +
        "PARAMETERS java.util.Date threshold");
    or alternatively
    Query q = pm.newQuery("javax.jdo.query.SQL", "SELECT * FROM MYTABLE WHERE COL1 = 25");
  • A "named" query, (pre-)defined in metadata (refer to metadata docs).
    Query q = pm.newNamedQuery(MyClass.class, "MyQuery1");
  • JDOQL : Use the single-string form of the query
    Query q = pm.newQuery("SELECT FROM mydomain.MyClass WHERE field2 < threshold " +
        "PARAMETERS java.util.Date threshold");
  • JDOQL : Use the declarative API to define the query
    Query q = pm.newQuery(MyClass.class);
    q.setFilter("field2 < threshold");
    q.declareParameters("java.util.Date threshold");
  • JDOQL : Use the typesafe API to define the query
    TypesafeQuery<MyClass> q = pm.newTypesafeQuery(MyClass.class);
    QMyClass cand = QMyClass.candidate();
    List<MyClass> results = 
        q.filter(cand.field2.lt(q.doubleParameter("threshold"))).executeList();
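
The single-string and declarative forms above carry the same information. As a rough illustration, the sketch below assembles the single-string form from the pieces that the declarative API takes separately. The class JdoqlSingleString and its build method are hypothetical helpers written for this page, not part of JDO or DataNucleus, and they only cover the WHERE and PARAMETERS clauses.

```java
// Hypothetical helper, not part of JDO: builds the single-string form of a
// JDOQL query from the candidate class, filter and parameter declarations.
final class JdoqlSingleString {

    static String build(String candidateClass, String filter, String parameters) {
        StringBuilder sb = new StringBuilder("SELECT FROM ").append(candidateClass);
        if (filter != null && !filter.isEmpty()) {
            // Same text you would pass to q.setFilter(...)
            sb.append(" WHERE ").append(filter);
        }
        if (parameters != null && !parameters.isEmpty()) {
            // Same text you would pass to q.declareParameters(...)
            sb.append(" PARAMETERS ").append(parameters);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build("mydomain.MyClass", "field2 < threshold",
            "java.util.Date threshold"));
    }
}
```

The resulting string could then be passed to pm.newQuery(...) in the same way as the hand-written single-string examples.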

Please note that with the query API you can also specify execution time information for the query, such as whether it executes in memory, or whether to apply a datastore timeout etc.


Compiling a query

Once you have defined your query, you can optionally compile it to check that it is valid before executing it. You do this as follows

q.compile();

If the query is invalid, then a JDO exception will be thrown.
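
One way to use this in practice is to wrap the compile step and report any failure before the query is executed. The sketch below is a minimal illustration under stated assumptions: CompileCheck and firstProblem are hypothetical names, and a plain Runnable stands in for the call to q.compile() so that the example runs without a datastore. It relies only on the fact that JDO exceptions are unchecked.

```java
import java.util.Optional;

// Hypothetical helper: runs a compile step and reports the failure message, if any.
final class CompileCheck {

    // compileStep stands in for a call such as q::compile on a real javax.jdo.Query.
    static Optional<String> firstProblem(Runnable compileStep) {
        try {
            compileStep.run();
            return Optional.empty();
        } catch (RuntimeException e) {
            // JDO exceptions are unchecked, so they are caught here as RuntimeException.
            return Optional.of(String.valueOf(e.getMessage()));
        }
    }

    public static void main(String[] args) {
        // Stand-ins for a valid and an invalid query; no datastore is involved.
        System.out.println(firstProblem(() -> { }).isPresent());
        System.out.println(firstProblem(() -> {
            throw new IllegalStateException("bad filter");
        }).get());
    }
}
```

With a real query you would pass q::compile as the argument.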


Executing a query

Having set up our query, we now execute it

Object result = q.execute();

If we have parameters to pass in, we can use any of the following

Object result = q.execute(paramVal1);


Object result = q.execute(paramVal1, paramVal2);


Object result = q.executeWithArray(new Object[]{paramVal1, paramVal2});


Map<String, Object> paramMap = new HashMap<>();
paramMap.put("param1", paramVal1);
paramMap.put("param2", paramVal2);
Object result = q.executeWithMap(paramMap);
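
The positional forms bind values to parameters in the order the parameters were declared, whereas executeWithMap binds them by name. The sketch below illustrates that relationship by turning a map of named values into an array in declared order, suitable for executeWithArray. ParameterOrderSketch and toPositional are hypothetical helpers, and the parsing assumes simple "type name" declarations separated by commas (no generic types containing commas).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: orders named parameter values by their declaration order.
final class ParameterOrderSketch {

    // declaredParameters is the text given to declareParameters,
    // e.g. "double minPrice, String status".
    static Object[] toPositional(String declaredParameters, Map<String, Object> valuesByName) {
        String[] declarations = declaredParameters.split(",");
        Object[] values = new Object[declarations.length];
        for (int i = 0; i < declarations.length; i++) {
            // The parameter name is the last token of each "type name" declaration.
            String[] typeAndName = declarations[i].trim().split("\\s+");
            String name = typeAndName[typeAndName.length - 1];
            if (!valuesByName.containsKey(name)) {
                throw new IllegalArgumentException("No value supplied for parameter " + name);
            }
            values[i] = valuesByName.get(name);
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, Object> byName = new LinkedHashMap<>();
        byName.put("status", "Sold Out");   // inserted in a different order on purpose
        byName.put("minPrice", 10.0);
        System.out.println(Arrays.toString(
            toPositional("double minPrice, String status", byName)));
    }
}
```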

By default, a query executes in the datastore against whatever is present in the datastore at that time. If there are outstanding changes waiting to be flushed, they will not feature in the results. To flush these changes before execution, set the following query "extension" before calling execute

q.addExtension("datanucleus.query.flushBeforeExecution","true");

Controlling the execution : Vendor extensions

JDO's query API allows implementations to support extensions and provides a simple interface for enabling them on a query, either one at a time with addExtension or all at once with setExtensions.

q.addExtension("extension_name", "value");
HashMap exts = new HashMap();
exts.put("extension1", value1);
exts.put("extension2", value2);
q.setExtensions(exts);

Named Query

With the JDO API you can either define a query at runtime, or define it in the MetaData/annotations for a class and refer to it at runtime using a symbolic name. This second option makes invoking the query at runtime much simpler. To demonstrate the process, let's say we have a class called Product (something to sell in a store). We define the JDO MetaData for the class in the normal way, but we also have a query that we know we will require, so we define the following in the MetaData.

<jdo>
    <package name="mydomain">
        <class name="Product">
            ...
            <query name="SoldOut" language="javax.jdo.query.JDOQL"><![CDATA[
            SELECT FROM mydomain.Product WHERE status == "Sold Out"
            ]]></query>
        </class>
    </package>
</jdo>

So we have a JDOQL query called "SoldOut" defined for the class Product that returns all Products (and subclasses) that have a status of "Sold Out". To execute this query in our application we would then do

Query q = pm.newNamedQuery(mydomain.Product.class,"SoldOut");
Collection results = (Collection)q.execute();

The above example was for the JDOQL object-based query language. We can do a similar thing using SQL, so we define the following in our MetaData for our Product class

<jdo>
    <package name="mydomain">
        <class name="Product">
            ...
            <query name="PriceBelowValue" language="javax.jdo.query.SQL"><![CDATA[
            SELECT NAME FROM PRODUCT WHERE PRICE < ?
            ]]></query>
        </class>
    </package>
</jdo>

So here we have an SQL query that will return the names of all Products that have a price less than a specified value. This leaves us the flexibility to specify the value at runtime. So here we run our named query, asking for the names of all Products with price below 20 euros.

Query q = pm.newNamedQuery(mydomain.Product.class,"PriceBelowValue");
Collection results = (Collection)q.execute(20.0);

All of the examples above have been specified within the <class> element of the MetaData. You can, however, specify queries directly below <jdo>, in which case the query is not scoped by a particular candidate class. In this case you must put your queries in any of the following MetaData files

/META-INF/package.jdo
/WEB-INF/package.jdo
/package.jdo
/META-INF/package-{mapping}.orm
/WEB-INF/package-{mapping}.orm
/package-{mapping}.orm
/META-INF/package.jdoquery
/WEB-INF/package.jdoquery
/package.jdoquery

Saving a Query as a Named Query

DataNucleus also allows you to create a query, and then save it as a "named" query for later reuse. You do this as follows

Query q = pm.newQuery("SELECT FROM Product p WHERE ...");
((org.datanucleus.api.jdo.JDOQuery)q).saveAsNamedQuery("MyQuery");

and you can thereafter access the query via

Query q = pm.newNamedQuery(Product.class, "MyQuery");

Controlling the execution : FetchPlan

When a Query is executed it executes in the datastore, which returns a set of results. DataNucleus could clearly read all results from this ResultSet in one go and return them all to the user, or could allow control over this fetching process. JDO provides a fetch size on the Fetch Plan to allow this control. You would set this as follows

Query q = pm.newQuery(...);
q.getFetchPlan().setFetchSize(FetchPlan.FETCH_SIZE_OPTIMAL);

Fetch size has three possible values.

  • FETCH_SIZE_OPTIMAL - allows DataNucleus full control over the fetching. In this case DataNucleus will fetch each object when it is requested, and when the owning transaction is committed it will retrieve all remaining rows (so that the Query is still usable after the close of the transaction).
  • FETCH_SIZE_GREEDY - DataNucleus will read all objects in at query execution. This can be efficient for queries with few results, and very inefficient for queries returning large result sets.
  • A positive value - DataNucleus will read this number of objects at query execution. Thereafter it will read the objects when requested.
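
To make the three settings concrete, the sketch below models how many objects are read at query execution for a given fetch size, following only the descriptions in the list above. It is a simplified model and not DataNucleus's implementation: FetchSizeModel and readAtExecution are hypothetical, the two constants are redeclared locally with the same values as those on javax.jdo.FetchPlan, and FETCH_SIZE_OPTIMAL is treated as "nothing read up front", which is an assumption.

```java
// Simplified model of the fetch size settings described above (not DataNucleus code).
final class FetchSizeModel {

    // Same values as the constants on javax.jdo.FetchPlan.
    static final int FETCH_SIZE_GREEDY = -1;
    static final int FETCH_SIZE_OPTIMAL = 0;

    // Number of objects read at query execution, before any result is requested.
    static int readAtExecution(int fetchSize, int totalResults) {
        if (fetchSize == FETCH_SIZE_GREEDY) {
            return totalResults;              // everything is read immediately
        }
        if (fetchSize == FETCH_SIZE_OPTIMAL) {
            return 0;                         // assumption: objects are read on request
        }
        if (fetchSize < 0) {
            throw new IllegalArgumentException("Unsupported fetch size " + fetchSize);
        }
        return Math.min(fetchSize, totalResults); // a positive value caps the initial read
    }

    public static void main(String[] args) {
        System.out.println(readAtExecution(FETCH_SIZE_GREEDY, 500));
        System.out.println(readAtExecution(FETCH_SIZE_OPTIMAL, 500));
        System.out.println(readAtExecution(50, 500));
    }
}
```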

In addition to the number of objects fetched, you can also control which fields are fetched for each object of the candidate type. This is controlled via the FetchPlan. For RDBMS, any single-valued member in the FetchPlan is fetched in the original SQL query; multi-valued members cannot be fetched that way. Instead, each collection field is retrieved in a single SQL query covering all candidate objects. This avoids the "N+1" problem, resulting in 1 original SQL query plus 1 SQL query per collection member. You can disable this either by not putting multi-valued fields in the FetchPlan, or by setting the query extension "datanucleus.rdbms.query.multivaluedFetch" to "none" (the default is "exists", which uses the single SQL query per field). For non-RDBMS datastores the collection/map is stored as a Collection of ids of the related objects in a single "column" of the object, and so is retrievable in the same query. See also Fetch Groups.
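
The difference between the two loading strategies in the paragraph above is easiest to see as arithmetic. The sketch below simply counts SQL statements under each strategy; StatementCount and its methods are hypothetical names used for illustration and do not query anything.

```java
// Counts SQL statements for the two strategies described above (illustration only).
final class StatementCount {

    // "N+1" style: one statement for the candidates, then one statement
    // per collection field for every candidate object.
    static long perObjectLoading(long candidates, int collectionFields) {
        return 1 + candidates * collectionFields;
    }

    // Bulk style: one statement for the candidates, then one statement
    // per collection field covering all candidate objects.
    static long bulkLoading(int collectionFields) {
        return 1 + collectionFields;
    }

    public static void main(String[] args) {
        // 1000 candidate objects, each with 2 collection fields in the FetchPlan.
        System.out.println(perObjectLoading(1000, 2));
        System.out.println(bulkLoading(2));
    }
}
```

For 1000 candidates with 2 collection fields this gives 2001 statements for per-object loading against 3 for bulk loading.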


DataNucleus also allows an extension to give further control. As mentioned above, when the transaction containing the Query is committed, all remaining results are read so that they can then be accessed later (meaning that the query is still usable). Where you have a large result set and you don't want this behaviour you can turn it off by specifying a Query extension

q.addExtension("datanucleus.query.loadResultsAtCommit", "false");

so when the transaction is committed, no more results will be available from the query.


In some situations you don't want all FetchPlan fields to be retrieved, and DataNucleus provides an extension to turn this off, like this

q.addExtension("datanucleus.query.useFetchPlan", "false");
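
When several of these extensions are used together it can be convenient to keep them in one map. The sketch below collects the two extensions from this section and applies them through a callback; QueryExtensionsSketch, fetchControlExtensions and applyAll are hypothetical names, and the callback stands in for the addExtension method of a real query so that the example runs without JDO on the classpath. Whether you want both settings at once depends on your application.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical helper: keeps query extensions in one place and applies them.
final class QueryExtensionsSketch {

    // The two extensions described in this section, with the values shown above.
    static Map<String, Object> fetchControlExtensions() {
        Map<String, Object> exts = new LinkedHashMap<>();
        exts.put("datanucleus.query.loadResultsAtCommit", "false");
        exts.put("datanucleus.query.useFetchPlan", "false");
        return exts;
    }

    // addExtension stands in for q::addExtension on a real javax.jdo.Query.
    static void applyAll(Map<String, Object> extensions,
                         BiConsumer<String, Object> addExtension) {
        extensions.forEach(addExtension);
    }

    public static void main(String[] args) {
        // Print each extension instead of applying it to a real query.
        applyAll(fetchControlExtensions(),
            (name, value) -> System.out.println(name + " = " + value));
    }
}
```

With a real query you could call applyAll(fetchControlExtensions(), q::addExtension), or pass the map directly to q.setExtensions(...).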

Control over locking of fetched objects

JDO allows control over whether objects found by a query are locked during that transaction so that other transactions can't update them in the meantime. You enable this on a query as follows

Query q = pm.newQuery(...);
q.setSerializeRead(true);

You can also specify this for all queries for all PMs using a PMF property datanucleus.SerializeRead. In addition you can perform this on a per-transaction basis by doing

tx.setSerializeRead(true);

If the datastore in use doesn't support locking of objects then this will do nothing.


Flush changes before execution

When using optimistic transactions all updates to data are held until flush()/commit(). This means that a query executed during that transaction may not take into account changes made to some objects. As a convenience, DataNucleus can call flush() just before executing a query so that all updates are taken into account. The property name is datanucleus.query.flushBeforeExecution and defaults to "false".

To enable this for a single query with JDO you would do

query.addExtension("datanucleus.query.flushBeforeExecution","true");

You can also specify this for all queries using a persistence property datanucleus.query.flushBeforeExecution which would then apply to ALL queries for that PMF.


Controlling the execution : timeout on datastore reads

q.setDatastoreReadTimeoutMillis(1000);

Sets the timeout for this query (in milliseconds). This will throw a JDOUnsupportedOperationException if the query implementation doesn't support timeouts.


Controlling the execution : timeout on datastore writes

q.setDatastoreWriteTimeoutMillis(1000);

Sets the timeout for this query (in milliseconds) when it is a delete/update. This will throw a JDOUnsupportedOperationException if the query implementation doesn't support timeouts.