JPA : JPQL SELECT Queries

The JPA specification defines JPQL (a pseudo-OO query language, with SQL-like syntax), for selecting objects from the datastore. To provide a simple example, this is what you would do

Query q = em.createQuery("SELECT p FROM Person p WHERE p.lastName = 'Jones'");
List results = (List)q.getResultList();

This finds all "Person" objects with surname of "Jones". You specify all details in the query.


SELECT Syntax

In JPQL queries you define the query in a single string, defining the result, the candidate class(es), the filter, any grouping, and the ordering. This string has to follow the following pattern

SELECT [<result>]
    [FROM <candidate-class(es)>]
    [WHERE <filter>]
    [GROUP BY <grouping>]
    [HAVING <having>]
    [ORDER BY <ordering>]

The "keywords" in the query are shown in UPPER CASE are case-insensitive.


Entity Name

In the example shown you note that we did not specify the full class name. We used Person p and thereafter could refer to p as the alias. The Person is called the entity name and in JPA MetaData this can be defined against each class in its definition. For example

<entity class="org.datanucleus.company.Person" name="Person">
    ...
</entity>

In strict JPA the entity name cannot be a MappedSuperclass entity name. That is, if you have an abstract superclass that is persistable, you cannot query for instances of that superclass and its subclasses. We consider this a significant shortcoming of the querying capability, and allow the entity name to also be of a MappedSuperclass. You are unlikely to find this supported in other JPA implementations, but then maybe that's why you chose DataNucleus?

From Clause

The FROM clause allows a user to add some explicit joins to related entities, and assign aliases to the joined entities. These are then usable in the filter/ordering/result etc. If you don't add any joins DataNucleus will add joins where they are implicit from the filter expression for example

Fetched Fields

By default a query will fetch fields according to their defined EAGER/LAZY setting, so fields like primitives, wrappers, Dates, and 1-1/N-1 relations will be fetched, whereas 1-N/M-N fields will not be fetched. JPQL allows you to include FETCH JOIN as a hint to include 1-N/M-N fields where possible. For RDBMS datastores any multi-valued field will be bulk-fetched if it is defined to be EAGER or is placed in the current EntityGraph. By bulk-fetched we mean that there will be a single SQL issued per collection field (hence avoiding the N+1 problem). Note that you can disable this by either not putting multi-valued fields in the FetchPlan, or by setting the query extension "datanucleus.multivaluedFetch" to "none" (default is "bulk-fetch" using the single SQL per field). All non-RDBMS datastores do respect this FETCH JOIN setting, since a collection/map is stored in a single "column" in the object and so is readily retrievable.

Note that you can also make use of Entity Graphs to have fuller control over what is retrieved from each query.

Filter

The most important thing to remember when defining the filter for JPQL is that think how you would write it in SQL, and its likely the same except for field names instead of column names. The filter has to be a boolean expression, and can include the candidate entity, fields/properties, literals, functions, parameters, operators and subqueries

Fields/Properties

In JPQL you refer to fields/properties in the query by referring to the field/bean name. For example, if you are querying a candidate entity called Product and it has a field "price", then you access it like this

price < 150.0

Note that if you want to refer to a field/property of an entity you can prefix the field by its alias

p.price < 150.0

You can also chain field references if you have an entity Product (alias = p) with a field of (persistable) type Inventory, which has a field name, so you could do

p.inventory.name = 'Backup'

Operators

The operators are listed below in order of decreasing precedence.

  • Navigation operator (.)
  • Arithmetic operators:
    • +, - unary
    • *, / multiplication and division
    • +, - addition and subtraction
  • Comparison operators : =, >, >=, <, <=, <> (not equal), [NOT] BETWEEN, [NOT] LIKE, [NOT] IN, IS [NOT] NULL, IS [NOT] EMPTY, [NOT] MEMBER [OF], [NOT] EXISTS
  • Logical operators:
    • NOT
    • AND
    • OR

Literals

JPQL supports literals of the following types : Number, boolean, character, String, NULL and temporal. When String literals are specified using single-string format they should be surrounded by single-quotes '. Please note that temporal literals are specified using JDBC escape syntax in String form, namely

{d 'yyyy-mm-dd'}                 - a Date
{t 'hh:mm:ss'}                   - a Time
{ts 'yyyy-mm-dd hh:mm:ss.f...'}  - a Timestamp
                

Input Parameters

In JPQL queries it is convenient to pass in parameters so we don't have to define the same query for different values. Let's take two examples

Named Parameters :
Query q = em.createQuery("SELECT p FROM Person p WHERE p.lastName = :surname AND o.firstName = :forename");
q.setParameter("surname", theSurname);
q.setParameter("forename", theForename");

Numbered Parameters :
Query q = em.createQuery("SELECT p FROM Person p WHERE p.lastName = ?1 AND p.firstName = ?2");
q.setParameter(1, theSurname);
q.setParameter(2, theForename);

So in the first case we have parameters that are prefixed by : (colon) to identify them as a parameter and we use that name when calling Query.setParameter(). In the second case we have parameters that are prefixed by ? (question mark) and are numbered starting at 1. We then use the numbered position when calling Query.setParameter().


CASE expressions

For particular use in the result clause, you can make use of a CASE expression where you want to return different things based on some condition(s). Like this

Query q = em.createQuery(
    "SELECT p.personNum, CASE WHEN p.age < 18 THEN 'Youth' WHEN p.age >= 18 AND p.age < 65 THEN 'Adult' ELSE 'Old' END FROM Person p");

So in this case the second result value will be a String, either "Youth", "Adult" or "Old" depending on the age of the person. The BNF structure of the JPQL CASE expression is

CASE WHEN conditional_expression THEN scalar_expression {WHEN conditional_expression THEN scalar_expression}* ELSE scalar_expression END
                

JPQL Functions

JPQL provides an SQL-like query language. Just as with SQL, JPQL also supports a range of functions to enhance the querying possibilities. The tables below also mark whether a particular method is supported for evaluation in-memory.

Please note that you can easily add support for other functions for evaluation "in-memory" using this DataNucleus plugin point

Please note that you can easily add support for other functions with RDBMS datastore using this DataNucleus plugin point

Aggregate Functions

There are a series of aggregate functions for aggregating the values of a field for all rows of the results.

Function Name Description Standard In-Memory
COUNT(field) Returns the aggregate count of the field (Long)
MIN(field) Returns the minimum value of the field (type of the field)
MAX(field) Returns the maximum value of the field (type of the field)
AVG(field) Returns the average value of the field (Double)
SUM(field) Returns the sum of the field value(s) (Long, Double, BigInteger, BigDecimal)

String Functions

There are a series of functions to be applied to String fields.

Function Name Description Standard In-Memory
CONCAT(str_field, str_field2 [, str_fieldX]) Returns the concatenation of the string fields
SUBSTRING(str_field, num1 [, num2]) Returns the substring of the string field starting at position num1, and optionally with the length of num2
TRIM([trim_spec] [trim_char] [FROM] str_field) Returns trimmed form of the string field
LOWER(str_field) Returns the lower case form of the string field
UPPER(str_field) Returns the upper case form of the string field
LENGTH(str_field) Returns the size of the string field (number of characters)
LOCATE(str_field1, str_field2 [, num]) Returns position of str_field2 in str_field1 optionally starting at num

Temporal Functions

There are a series of functions for use with temporal values

Function Name Description Standard In-Memory
CURRENT_DATE Returns the current date (day month year) of the datastore server
CURRENT_TIME Returns the current time (hour minute second) of the datastore server
CURRENT_TIMESTAMP Returns the current timestamp of the datastore server
YEAR(dateField) Returns the year of the specified date
MONTH(dateField) Returns the month of the specified date
DAY(dateField) Returns the day of the month of the specified date
HOUR(dateField) Returns the hour of the specified date
MINUTE(dateField) Returns the minute of the specified date
SECOND(dateField) Returns the second of the specified date

Collection Functions

There are a series of functions for use with collection values

Function Name Description Standard In-Memory
INDEX(field) Returns index number of the field element when that is the element of an indexed List field.
SIZE(collection_field) Returns the size of the collection field. Empty collection will return 0

Arithmetic Functions

There are a series of functions for arithmetic use

Function Name Description Standard In-Memory
ABS(numeric_field) Returns the absolute value of the numeric field
SQRT(numeric_field) Returns the square root of the numeric field
MOD(num_field1, num_field2) Returns the modulus of the two numeric fields (num_field1 % num_field2

Other Functions

You have a further function available

Function Name Description Standard In-Memory
FUNCTION(name, [arg1 [,arg2 ...]]) Executes the specified SQL function "name" with the defined arguments

Ordering of Results

By default your results will be returned in the order determined by the datastore, so don't rely on any particular order. You can, of course, specify the order yourself. You do this using field/property names and ASC/DESC keywords. For example

field1 ASC, field2 DESC

which will sort primarily by field1 in ascending order, then secondarily by field2 in descending order.

Although it is not (yet) standard JPQL, DataNucleus also supports specifying a directive for where NULL values of the ordered field/property go in the order, so the full syntax supported is

fieldName [ASC|DESC] [NULLS FIRST|NULLS LAST]

Note that this is only supported for a few RDBMS (H2, HSQLDB, PostgreSQL, DB2, Oracle, Derby, Firebird, SQLServer v11+).


Subqueries

With JPQL the user has a very flexible query syntax which allows for querying of the vast majority of data components in a single query. In some situations it is desirable for the query to utilise the results of a separate query in its calculations. JPQL also allows the use of subqueries. Here's an example

SELECT Object(e) FROM org.datanucleus.Employee e 
WHERE e.salary > (SELECT avg(f.salary) FROM org.datanucleus.Employee f)

So we want to find all Employees that have a salary greater than the average salary. The subquery must be in parentheses (brackets). Note that we have defined the subquery with an alias of "f", whereas in the outer query the alias is "e".

ALL/ANY Expressions

One use of subqueries with JPQL is where you want to compare with some or all of a particular expression. To give an example

SELECT emp FROM Employee emp 
WHERE emp.salary > ALL (SELECT m.salary FROM Manager m WHERE m.department = emp.department)

So this returns all employees that earn more than all managers in the same department! You can also compare with some/any, like this

SELECT emp FROM Employee emp 
WHERE emp.salary > ANY (SELECT m.salary FROM Manager m WHERE m.department = emp.department)

So this returns all employees that earn more than any one Manager in the same department.

EXISTS Expressions

Another use of subqueries in JPQL is where you want to check on the existence of a particular thing. For example

SELECT DISTINCT emp FROM Employee emp
WHERE EXISTS (SELECT emp2 FROM Employee emp2 WHERE emp2 = emp.spouse)

So this returns the employees that have a partner also employed.


Range of Results

With JPQL you can select the range of results to be returned. For example if you have a web page and you are paginating the results of some search, you may want to get the results from a query in blocks of 20 say, with results 0 to 19 on the first page, then 20 to 39, etc. You can facilitate this as follows

Query q = em.createQuery("SELECT p FROM Person p WHERE p.age > 20");
q.setFirstResult(0);
q.setMaxResults(20);

So with this query we get results 0 to 19 inclusive.


Query Result

Whilst the majority of the time you will want to return instances of a candidate class, JPQL also allows you to return customised results. Consider the following example

Query q = em.createQuery("SELECT p.firstName, p.lastName FROM Person p WHERE p.age > 20");
List<Object[]> results = q.getResultList();

this returns the first and last name for each Person meeting that filter. Obviously we may have some container class that we would like the results returned in, so if we change the query to this

Query<PersonName> q = em.createQuery(
    "SELECT p.firstName, p.lastName FROM Person p WHERE p.age > 20", PersonName.class);
List<PersonName> results = q.getResultList();

so each result is a PersonName, holding the first and last name. This result class needs to match one of the following structures

  • Constructor taking arguments of the same types and the same order as the result clause. An instance of the result class is created using this constructor. For example
    public class PersonName
    {
        protected String firstName = null;
        protected String lastName = null;
    
        public PersonName(String first, String last)
        {
            this.firstName = first;
            this.lastName = last;
        }
    
        ...
    }
  • Default constructor, and setters for the different result columns, using the alias name for each column as the property name of the setter. For example
    public class PersonName
    {
        protected String firstName = null;
        protected String lastName = null;
    
        public PersonName()
        {
        }
    
        public void setFirstName(String first) {this.firstName = first;}
        public void setLastName(String last) {this.lastName = last;}
    
        ...
    }

Note that if the setter property name doesn't match the query result component name, you should use AS {alias} in the query so they are the same.


Query Execution

There are two ways to execute a JPQL query. When you know it will return 0 or 1 results you call

Object result = query.getSingleResult();

If however you know that the query will return multiple results, or you just don't know then you would call

List results = query.getResultList();

JPQL In-Memory queries

The typical use of a JPQL query is to translate it into the native query language of the datastore and return objects matched by the query. For many datastores it is simply impossible to support the full JPQL syntax in the datastore native query language and so it is necessary to evaluate the query in-memory. This means that we evaluate as much as we can in the datastore and then instantiate those objects and evaluate further in-memory. Here we document the current capabilities of in-memory evaluation in DataNucleus.

  • Subqueries using ALL, ANY, SOME, EXISTS are not currently supported
  • MEMBER OF syntax is not currently supported.

To enable evaluation in memory you specify the query hint datanucleus.query.evaluateInMemory to true as follows

query.setHint("datanucleus.query.evaluateInMemory","true");

Named Query

With the JPA API you can either define a query at runtime, or define it in the MetaData/annotations for a class and refer to it at runtime using a symbolic name. This second option means that the method of invoking the query at runtime is much simplified. To demonstrate the process, lets say we have a class called Product (something to sell in a store). We define the JPA Meta-Data for the class in the normal way, but we also have some query that we know we will require, so we define the following in the Meta-Data.

    <entity class="Product">
        ...
        <named-query name="SoldOut"><![CDATA[
        SELECT p FROM Product p WHERE p.status == "Sold Out"
        ]]></named-query>
    </entity>

or using annotations

@Entity
@NamedQuery(name="SoldOut", query="SELECT p FROM Product p WHERE p.status == 'Sold Out'")
public class Product {...}

Note that DataNucleus also supports specifying this using annotations in non-Entity classes. This is beyond the JPA spec, but is very useful in real applications

So we have a JPQL query called "SoldOut" defined for the class Product that returns all Products (and subclasses) that have a status of "Sold Out". Out of interest, what we would then do in our application to execute this query woule be

Query query = em.createNamedQuery("SoldOut");
List<Product> results = query.getResultList();

JPQL DELETE Queries

The JPA specification defines a mode of JPQL for deleting objects from the datastore.

DELETE Syntax

The syntax for deleting records is very similar to selecting them

DELETE FROM [<candidate-class>]
    [WHERE <filter>]

The "keywords" in the query are shown in UPPER CASE are case-insensitive.

Query query = em.createQuery("DELETE FROM Person p WHERE firstName = 'Fred'");
int numRowsDeleted = query.executeUpdate();

JPQL UPDATE Queries

The JPA specification defines a mode of JPQL for updating objects in the datastore.

UPDATE Syntax

The syntax for updating records is very similar to selecting them

UPDATE [<candidate-class>] SET item1=value1, item2=value2
    [WHERE <filter>]

The "keywords" in the query are shown in UPPER CASE are case-insensitive.

Query query = em.createQuery("UPDATE Person p SET p.salary = 10000 WHERE age = 18");
int numRowsUpdated = query.executeUpdate();

JPQL BNF Notation

The BNF defining the JPQL query language is shown below.

QL_statement ::= select_statement | update_statement | delete_statement
select_statement ::= select_clause from_clause [where_clause] [groupby_clause] [having_clause] [orderby_clause]

update_statement ::= update_clause [where_clause]

delete_statement ::= delete_clause [where_clause]

from_clause ::= FROM identification_variable_declaration
    {, {identification_variable_declaration | collection_member_declaration}}*
identification_variable_declaration ::= range_variable_declaration { join | fetch_join }*
range_variable_declaration ::= entity_name [AS] identification_variable

join ::= join_spec join_association_path_expression [AS] identification_variable
fetch_join ::= join_spec FETCH join_association_path_expression
join_spec::= [ LEFT [OUTER] | INNER ] JOIN
join_association_path_expression ::= join_collection_valued_path_expression | join_single_valued_path_expression
join_collection_valued_path_expression::=
    identification_variable.{single_valued_embeddable_object_field.}*collection_valued_field
join_single_valued_path_expression::=
    identification_variable.{single_valued_embeddable_object_field.}*single_valued_object_field
collection_member_declaration ::=
    IN (collection_valued_path_expression) [AS] identification_variable
qualified_identification_variable ::= KEY(identification_variable) | VALUE(identification_variable) |
    ENTRY(identification_variable)
single_valued_path_expression ::= qualified_identification_variable |
    state_field_path_expression | single_valued_object_path_expression
general_identification_variable ::= identification_variable | KEY(identification_variable) | 
    VALUE(identification_variable)

state_field_path_expression ::= general_identification_variable.{single_valued_object_field.}*state_field
single_valued_object_path_expression ::=
    general_identification_variable.{single_valued_object_field.}* single_valued_object_field
collection_valued_path_expression ::=
    general_identification_variable.{single_valued_object_field.}*collection_valued_field

update_clause ::= UPDATE entity_name [[AS] identification_variable] SET update_item {, update_item}*
update_item ::= [identification_variable.]{state_field | single_valued_object_field} = new_value
new_value ::= scalar_expression | simple_entity_expression | NULL

delete_clause ::= DELETE FROM entity_name [[AS] identification_variable]

select_clause ::= SELECT [DISTINCT] select_item {, select_item}*
select_item ::= select_expression [[AS] result_variable]
select_expression ::= single_valued_path_expression | scalar_expression | aggregate_expression | 
    identification_variable | OBJECT(identification_variable) | constructor_expression
constructor_expression ::= NEW constructor_name ( constructor_item {, constructor_item}* )
constructor_item ::= single_valued_path_expression | scalar_expression | aggregate_expression |
    identification_variable

aggregate_expression ::= { AVG | MAX | MIN | SUM } ([DISTINCT] state_field_path_expression) |
    COUNT ([DISTINCT] identification_variable | state_field_path_expression |
    single_valued_object_path_expression)

where_clause ::= WHERE conditional_expression
groupby_clause ::= GROUP BY groupby_item {, groupby_item}*
groupby_item ::= single_valued_path_expression | identification_variable
having_clause ::= HAVING conditional_expression
orderby_clause ::= ORDER BY orderby_item {, orderby_item}*
orderby_item ::= state_field_path_expression | result_variable [ ASC | DESC ]

subquery ::= simple_select_clause subquery_from_clause [where_clause] [groupby_clause] [having_clause]
subquery_from_clause ::= FROM subselect_identification_variable_declaration
    {, subselect_identification_variable_declaration | collection_member_declaration}*

subselect_identification_variable_declaration ::= identification_variable_declaration |
    derived_path_expression [AS] identification_variable {join}*|
    derived_collection_member_declaration
derived_path_expression ::=
    superquery_identification_variable.{single_valued_object_field.}*collection_valued_field |
    superquery_identification_variable.{single_valued_object_field.}*single_valued_object_field
derived_collection_member_declaration ::=
    IN superquery_identification_variable.{single_valued_object_field.}*collection_valued_field
simple_select_clause ::= SELECT [DISTINCT] simple_select_expression
simple_select_expression::= single_valued_path_expression | scalar_expression | aggregate_expression |
    identification_variable
scalar_expression ::= simple_arithmetic_expression | string_primary | enum_primary |
    datetime_primary | boolean_primary | case_expression | entity_type_expression
conditional_expression ::= conditional_term | conditional_expression OR conditional_term
conditional_term ::= conditional_factor | conditional_term AND conditional_factor
conditional_factor ::= [ NOT ] conditional_primary
conditional_primary ::= simple_cond_expression | (conditional_expression)
simple_cond_expression ::= comparison_expression | between_expression |
    in_expression | like_expression | null_comparison_expression | 
    empty_collection_comparison_expression | collection_member_expression | exists_expression
between_expression ::= 
    arithmetic_expression [NOT] BETWEEN arithmetic_expression AND arithmetic_expression |
    string_expression [NOT] BETWEEN string_expression AND string_expression |
    datetime_expression [NOT] BETWEEN datetime_expression AND datetime_expression
in_expression ::= {state_field_path_expression | type_discriminator} [NOT] IN
    { ( in_item {, in_item}* ) | (subquery) | collection_valued_input_parameter }
in_item ::= literal | single_valued_input_parameter
like_expression ::= string_expression [NOT] LIKE pattern_value [ESCAPE escape_character]
null_comparison_expression ::= {single_valued_path_expression | input_parameter} IS [NOT] NULL

empty_collection_comparison_expression ::= collection_valued_path_expression IS [NOT] EMPTY
collection_member_expression ::= entity_or_value_expression [NOT] MEMBER [OF] collection_valued_path_expression
entity_or_value_expression ::= single_valued_object_path_expression | state_field_path_expression |
    simple_entity_or_value_expression
simple_entity_or_value_expression ::= identification_variable | input_parameter | literal
exists_expression::= [NOT] EXISTS (subquery)
all_or_any_expression ::= { ALL | ANY | SOME} (subquery)
comparison_expression ::=
    string_expression comparison_operator {string_expression | all_or_any_expression} |
    boolean_expression { =|<>} {boolean_expression | all_or_any_expression} |
    enum_expression { =|<>} {enum_expression | all_or_any_expression} |
    datetime_expression comparison_operator
    {datetime_expression | all_or_any_expression} |
    entity_expression { = | <>} {entity_expression | all_or_any_expression} |
    arithmetic_expression comparison_operator
    {arithmetic_expression | all_or_any_expression} |
    entity_type_expression { =|<>} entity_type_expression}
comparison_operator ::= = | > | >= | < | <= | <>
arithmetic_expression ::= simple_arithmetic_expression | (subquery)
simple_arithmetic_expression ::= arithmetic_term | simple_arithmetic_expression { + | - } arithmetic_term
arithmetic_term ::= arithmetic_factor | arithmetic_term { * | / } arithmetic_factor
arithmetic_factor ::= [{ + | - }] arithmetic_primary
arithmetic_primary ::= state_field_path_expression | numeric_literal |
    (simple_arithmetic_expression) | input_parameter | functions_returning_numerics |
     aggregate_expression | case_expression
string_expression ::= string_primary | (subquery)
string_primary ::= state_field_path_expression | string_literal |
    input_parameter | functions_returning_strings | aggregate_expression | case_expression
datetime_expression ::= datetime_primary | (subquery)

datetime_primary ::= state_field_path_expression | input_parameter | functions_returning_datetime |
    aggregate_expression | case_expression | date_time_timestamp_literal
boolean_expression ::= boolean_primary | (subquery)
boolean_primary ::= state_field_path_expression | boolean_literal | input_parameter |
    case_expression
enum_expression ::= enum_primary | (subquery)
enum_primary ::= state_field_path_expression | enum_literal | input_parameter | case_expression
entity_expression ::= single_valued_object_path_expression | simple_entity_expression
simple_entity_expression ::= identification_variable | input_parameter
entity_type_expression ::= type_discriminator | entity_type_literal | input_parameter
type_discriminator ::= TYPE(identification_variable | single_valued_object_path_expression |
    input_parameter)
functions_returning_numerics::= LENGTH(string_primary) |
    LOCATE(string_primary, string_primary[, simple_arithmetic_expression]) |
    ABS(simple_arithmetic_expression) |
    SQRT(simple_arithmetic_expression) |
    MOD(simple_arithmetic_expression, simple_arithmetic_expression) |
    SIZE(collection_valued_path_expression) |
    INDEX(identification_variable)
functions_returning_datetime ::= CURRENT_DATE | CURRENT_TIME | CURRENT_TIMESTAMP

functions_returning_strings ::=
    CONCAT(string_primary, string_primary {, string_primary}*) |
    SUBSTRING(string_primary, simple_arithmetic_expression [, simple_arithmetic_expression]) |
    TRIM([[trim_specification] [trim_character] FROM] string_primary) |
    LOWER(string_primary) |
    UPPER(string_primary)
trim_specification ::= LEADING | TRAILING | BOTH
case_expression ::= general_case_expression | simple_case_expression | coalesce_expression |
    nullif_expression
general_case_expression::= CASE when_clause {when_clause}* ELSE scalar_expression END
when_clause::= WHEN conditional_expression THEN scalar_expression
simple_case_expression::= 
    CASE case_operand simple_when_clause {simple_when_clause}*
    ELSE scalar_expression
    END
case_operand::= state_field_path_expression | type_discriminator
simple_when_clause::= WHEN scalar_expression THEN scalar_expression
coalesce_expression::= COALESCE(scalar_expression {, scalar_expression}+)
nullif_expression::= NULLIF(scalar_expression, scalar_expression)