Issue Details (XML | Word | Printable)

Key: NUCRDBMS-390
Type: Bug Bug
Status: Closed Closed
Resolution: Fixed
Priority: Testcase Required Testcase Required
Assignee: Unassigned
Reporter: Areg Beketovski
Votes: 0
Watchers: 1
Operations

If you were logged in you would be able to see more operations.
DataNucleus Store RDBMS

"datanucleus.query.resultSizeMethod=count " produces COUNT() query without "DISTINCT" statement for JDOQL, resulting in wrong result size estimate

Created: 14/May/10 02:37 PM   Updated: 24/May/10 03:43 PM   Resolved: 18/May/10 02:20 PM
Component/s: Queries
Affects Version/s: 2.1.0.m2
Fix Version/s: 2.1.0.m3

File Attachments: 1. Zip Archive testcase_390_20100518.zip (4 kB)


Datastore: Microsoft SQL Server


 Description  « Hide
The following JDOQL query:

"SELECT FROM com.company.model.entity.Membership WHERE t.transactionDate >= s0 && t.transactionDate <= e0 && t.entity == this && t.reverseDate == null &&
(t.responseCode == reponseType || t.responseCode == null) &&
Math.abs(t.totalValue) >= 1000.0 VARIABLES Transaction t PARAMETERS java.util.Date s0, java.util.Date e0, String reponseType import com.company.model.transaction.Transaction"

is translated to SQL like this:

"SELECT DISTINCT T2.CREATIONDATE,T2.FOREIGNID,T2.ID,T2.LASTMODIFIEDDATE,T2."NUMBER",T0.CLOSEDATE,T0.DELETED,T0.MANUALPRIMARYADDRESS,T0.OPENDATE,T0.PEP,T0.STATUS,T0.TAXID,T0.MANUALRISK,T0.MEMBERSHIPTYPE,T0.DISCRIMINATOR
FROM ENTITY T0 CROSS JOIN "TRANSACTION" VAR_T INNER JOIN SUBJECT T2 ON T0.ID = T2.ID WHERE
((T0.DISCRIMINATOR = 'com.company.model.entity.Membership' OR T0.DISCRIMINATOR = 'com.company.model.entity.AggregatedMembership')) AND
VAR_T.TRANSACTIONDATE >= '2009-07-13 23:59:00.0' AND VAR_T.TRANSACTIONDATE <= '2009-07-20 23:59:00.0'
AND VAR_T.ENTITY_ID_OID = T0.ID AND VAR_T.REVERSEDATE IS NULL AND ((VAR_T.RESPONSECODE = 'accepted') OR (VAR_T.RESPONSECODE IS NULL)) AND
 ABS(VAR_T.TOTALVALUE) >= .1E4"

when setting "query.addExtension("datanucleus.query.resultSizeMethod", "count");" corresponding "COUNT" query generated is:

SELECT COUNT(T0.ID) FROM ENTITY T0 CROSS JOIN "TRANSACTION" VAR_T WHERE ((T0.DISCRIMINATOR = 'com.company.model.entity.Membership' OR T0.DISCRIMINATOR = 'com.company.model.entity.AggregatedMembership'))
AND VAR_T.TRANSACTIONDATE >= '2009-07-13 23:59:00.0' AND VAR_T.TRANSACTIONDATE <= '2009-07-20 23:59:00.0' AND VAR_T.ENTITY_ID_OID = T0.ID AND VAR_T.REVERSEDATE IS NULL AND ((VAR_T.RESPONSECODE = 'accepted')
OR (VAR_T.RESPONSECODE IS NULL)) AND ABS(VAR_T.TOTALVALUE) >= .1E4;

and the value returned by this query is incorrect (about 2000 items more than the original query returns). The correct query should use COUNT( DISTINCT T0.ID). The following query returns the correct number of items:

SELECT COUNT(DISTINCT T0.ID) FROM ENTITY T0 CROSS JOIN "TRANSACTION" VAR_T WHERE ((T0.DISCRIMINATOR = 'com.company.model.entity.Membership' OR T0.DISCRIMINATOR = 'com.company.model.entity.AggregatedMembership'))
AND VAR_T.TRANSACTIONDATE >= '2009-07-13 23:59:00.0' AND VAR_T.TRANSACTIONDATE <= '2009-07-20 23:59:00.0' AND VAR_T.ENTITY_ID_OID = T0.ID AND VAR_T.REVERSEDATE IS NULL AND ((VAR_T.RESPONSECODE = 'accepted')
OR (VAR_T.RESPONSECODE IS NULL)) AND ABS(VAR_T.TOTALVALUE) >= .1E4;



Sort Order: Ascending order - Click to sort in descending order
Areg Beketovski added a comment - 14/May/10 02:40 PM
Found the statment that generates "DISTINCT" for non-aggregate "SELECT" statements. This explicitly prevents "DISTINCT" for "COUNT()" statement (setting stmt.setDistinct(true) would be incorrect anyway - generates DISTINCT COUNT() which returns incorrect result).

org.datanucleus.store.rdbms.query.QueryToSQLMapper.compile()
....

else if (!explicitJoins && compilation.getExprResult() == null)
        {
            // Joins are made implicitly and no result so set distinct based on whether joining to other table groups
            if (stmt.getNumberOfTableGroups() > 1)
            {
                // Queries against an extent always consider only distinct candidate instances, regardless of
                // whether distinct is specified (JDO spec)
                stmt.setDistinct(true);
            }
        }
...

Wondering how to allow "COUNT" query generated in org.datanucleus.store.query.AbstractQueryResult.getSizeUsingMethod() to understand that "stmt.getNumberOfTableGroups()" and add "DISTINCT" inside the "COUNT",
without having reference to "stmt" object of the original query.

Areg Beketovski added a comment - 14/May/10 03:27 PM
Found possible solution - setting "resultDistinct" flag after "SQLStatement stmt" object is fully initialized in "org.datanucleus.store.rdbms.query.JDOQLQuery.compileQueryFull()":

....
        sqlMapper.compile();
        datastoreCompilation.setPrecompilable(sqlMapper.isPrecompilable());

        setResultDistinct(stmt.isDistinct());

        // Apply any range
        if (range != null)
...

Then in "org.datanucleus.store.query.AbstractQueryResult.getSizeUsingMethod()":

...
        Query countQuery = ec.getOMFContext().getQueryManager().newQuery("JDOQL", ec, query);
        countQuery.setResult("count( " + (query.getResultDistinct() ? "distinct" : "" ) + " this)");
        countQuery.setOrdering(null);
...


If this looks feasible, I will prepare a patch with trunk.

Andy Jefferson added a comment - 17/May/10 08:58 AM
Also please provide a testcase to demonstrate this, preferably a patch to test.jdo.datastore/test.jdo.application using our existing test samples

Areg Beketovski added a comment - 18/May/10 02:18 PM
Please see attached the testcase. This already works with trunk (great!), did not work on 2.1.0m3.

Andy Jefferson added a comment - 18/May/10 02:20 PM
> did not work on 2.1.0m3.

2.1.0.m3 is not released, so not sure how it could work on that;-)

Areg Beketovski added a comment - 18/May/10 03:19 PM
My fault, meant 2.1.0m3-SNAPSHOT from 17-May-2010 02:32 :-)