Issue Details

Key: NUCCORE-588
Type: Improvement
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Andy Jefferson
Reporter: Tom Zurkan
Votes: 1
Watchers: 1
DataNucleus Core

Be able to cache relationships.

Created: 21/Oct/10 11:19 PM   Updated: 23/Jan/11 11:57 AM   Resolved: 10/Dec/10 11:08 AM
Component/s: None
Affects Version/s: None
Fix Version/s: 2.2.0.release

File Attachments: 1. Java Source File CacheRelationshipTest.java (4 kB)
2. Java Source File CacheRelationshipTest.java (3 kB)
3. File jdostatemanager.diff (14 kB)
4. Java Source File JDOStateManager2.java (168 kB)
5. Java Source File RelationshipTest.java (3 kB)


Datastore: MySQL


Description
What we would like to do is cache relationship fields, so that the first time a relationship is accessed there is a join against the owner id, and on subsequent calls the relationship is already cached together with the id of the related object. That id would be used to look the object up in the cache and either return the cached object, or load it by id and cache it. The cached relationship is removed when a field is made dirty.

I have augmented the StateManager to accomplish this, but I am not sure I have gone down the best path. I am submitting the enclosed augmented JDOStateManager to show where I am trying to go with this. One thing to look at is that I have to use the OID to get the class of the object id I have, so that I can create hollow pms with StateManagers.

It is configurable via the cacheRelationships variable:
<persistence-property name="datanucleus.CacheRelationships" value="false"/>

Andy Jefferson added a comment - 22/Oct/10 06:20 AM
Attach what was previously in the "description"!

Andy Jefferson added a comment - 22/Oct/10 06:21 AM
DataNucleus already caches "relationships". How is this different ? Also "patch" format is far easier to read to decide what has changed

Tom Zurkan added a comment - 22/Oct/10 07:10 PM - edited
sorry, should have attached it the first time.

i don't see relationships in level 2 cache at all. please explain. how do i turn that on and use it? where would it be implemented? we are seeing that anytime a relationship is accessed, it goes through the datastore. is it just not in a release yet?

Andy Jefferson added a comment - 22/Oct/10 07:30 PM
JDO defines annotations and XML metadata to mark fields as cacheable (or not).
http://www.datanucleus.org/products/accessplatform_2_2/jdo/cache.html#level2

Look at org.datanucleus.cache.CachedPC and the code already in JDOStateManagerImpl for putting objects in the L2 cache and getting them out. Has been in releases for a very long time, so if you don't see relationships being cached then *define a testcase that demonstrates it*

Tom Zurkan added a comment - 22/Oct/10 07:44 PM
So, collections and individual 1-1 relationships are cacheable by default, and you are saying that they should be in the CachedPC relationship fields? I can easily reproduce that, as nothing ever goes into the CachedPC relationshipsFields map in our app. I will put together a test case today. thanks.

Tom Zurkan added a comment - 23/Oct/10 02:22 AM
ok, i created a junit test that showed that my pc object was put to level2 but the relations field was never updated. here is my class definition:


<class name="FTeam" persistence-capable-superclass="com.protrade.common.persistence.HBaseIdCreateTime">
    <field name="league" null-value="exception">
    </field>
    <field name="owner" persistence-modifier="persistent" null-value="exception">
    </field>
    <field name="name" persistence-modifier="persistent">
        <index />
    </field>
    <field name="rosterEntries" table="FTEAM_ROSTERENTRIES">
        <collection element-type="FScorer" />
        <join column="JDOID" />
        <element column="ROSTERENTRIES_JDOID" />
    </field>
</class>

The league is a 1-1 (M-1) here and the rosterEntries is a 1-M. I just had a simple junit test that went through these relationships as follows:

public void testTeamCache() {
    try {
        FLeague league = getFLeagueDao().loadByFieldNCS( FLeague.class, "name", "Bugbash 4" );

        FootballLeagueRoster roster = league.getRoster();

        FTeamId id = null;

        for (FTeam team : league.getTeams()) {
            if ("Your momma's so old".equals( team.getName() )) {
                Set<FScorer> scorers = team.getRosterEntries();

                team.getLeague();

                id = team.getFTeamId();
            }
        }

        FTeam team = getFLeagueDao().loadNCS( id );

        team.getLeague();

        team.getRosterEntries();
    }
    catch ( PersistenceException e ) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}


with an output from logging looking something like this:

2010-10-22 18:15:14,808 [main] DEBUG DataNucleus.Cache - CachePopulateFM.fetchObjectField this=com.protrade.ffootball.data.entities.team.FTeam@125b750 field=1 is having its SCO wrapper replaced prior to L2 caching.
2010-10-22 18:15:14,808 [main] DEBUG DataNucleus.Cache - Object "com.protrade.ffootball.data.entities.team.FTeam@125b750" (id="com.protrade.ffootball.data.entities.team.FTeam-12035695") added to Level 2 cache (loadedFlags="[YYNYNNY]", relationFields="null")


am i missing something?

thanks

 

Andy Jefferson added a comment - 23/Oct/10 08:12 AM
> loadedFlags="[YYNYNNY]", relationFields="null"
> am i missing something?

You mean the fact that half of the fields (field 2, 4, 5) aren't loaded, so consequently can't be cached ?
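
[Editorial illustration] The loadedFlags string in the log line can be decoded mechanically. The sketch below assumes, based only on the comment above, that each character stands for one field in field-number order starting at 0, with Y meaning loaded and N meaning not loaded. The class and method names are hypothetical and are not part of DataNucleus.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a DataNucleus class.
final class LoadedFlagsDecoder {
    /** Returns the zero-based positions marked 'N' in a string such as "[YYNYNNY]". */
    static List<Integer> unloadedFields(String loadedFlags) {
        String flags = loadedFlags.trim();
        // Strip the surrounding brackets used in the log output, if present.
        if (flags.startsWith("[") && flags.endsWith("]")) {
            flags = flags.substring(1, flags.length() - 1);
        }
        List<Integer> unloaded = new ArrayList<>();
        for (int i = 0; i < flags.length(); i++) {
            if (flags.charAt(i) == 'N') {
                unloaded.add(i);
            }
        }
        return unloaded;
    }

    public static void main(String[] args) {
        // The flags from the FTeam log entry quoted above.
        System.out.println(unloadedFields("[YYNYNNY]")); // prints [2, 4, 5]
    }
}
```

Under that assumption, the FTeam entry has fields 2, 4 and 5 unloaded, which matches the fields named in the comment.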


Why not just run the DN end-to-end tests "test.jdo.datastore" and look at the log, and you see log entries like

08:07:44,910 (main) DEBUG [DataNucleus.Cache] - Object "org.jpox.samples.types.container.ContainerItem@10fd7f6" (id="2[OID]org.jpox.samples.types.container.ContainerItem") added to Level 2 cache (loadedFlags="[YYYYY]", relationFields="null")
08:07:44,911 (main) DEBUG [DataNucleus.Cache] - Object "org.jpox.samples.types.linkedhashset.LinkedHashSet1@15f7107" (id="1[OID]org.jpox.samples.types.linkedhashset.LinkedHashSet1") added to Level 2 cache (loadedFlags="[YYY]", relationFields="[items]")
08:07:45,272 (main) DEBUG [DataNucleus.Cache] - Object "org.jpox.samples.types.linkedhashset.LinkedHashSet1@126d3df" (id="3[OID]org.jpox.samples.types.linkedhashset.LinkedHashSet1") updated in Level 2 cache (loadedFlags="[YYY]", relationFields="[items]")
08:07:45,476 (main) DEBUG [DataNucleus.Cache] - Object with id="6[OID]org.jpox.samples.types.linkedhashset.LinkedHashSet1" taken from Level 2 cache (loadedFlags="[YYY]", relationFields="[items]") [cache size = 6] - extracting into managed persistable object

08:07:50,425 (main) DEBUG [DataNucleus.Cache] - Object "org.jpox.samples.one_many.collection.ListHolder@10c6cfc" (id="10[OID]org.jpox.samples.one_many.collection.ListHolder") added to Level 2 cache (loadedFlags="[YYNYYYYYNYYYYYYY]", relationFields="[joinListPCShared2, joinListPCShared1, fkListPC2, fkListPCShared1, fkListPCShared2, fkListPC, joinListPC]")

Tom Zurkan added a comment - 25/Oct/10 07:36 PM
ok, so you are saying that all fields have to be loaded in order for relationship fields to be cached? if that is the case, then my request/bug is that i would like to be able to cache fields that are loaded when some fields are not.

Andy Jefferson added a comment - 26/Oct/10 08:01 AM
I'm saying that to cache a field (whether relation or not) it has to be loaded. This seems an obvious requirement, otherwise you force loading of a field just to cache it.

Tom Zurkan added a comment - 26/Oct/10 07:28 PM
yes, of course it has to be loaded and i understand what you are asking/saying. let me clarify. i used a bad example.

using the same jdo description look at it like this:
// i have an id and use it to load a object.
FTeamId id = new FTeamId(12035736L);

FTeam team = getFLeagueDao().loadNCS(id);
// at this point the team is obtained and added to level 1 and 2 cache. the relationship
// fields are not loaded but all primitives are:
// added to Level 2 cache (loadedFlags="[YYNYNNY]", relationFields="null")

System.out.println(team.getLeague().getName());
// at this point the league object is loaded using a join with the team oid.
// the league is added to level 1 cache and the team level 1 cache now has the league as well.
team = getFLeagueDao().loadNCS( team.getFTeamId() );

System.out.println(team.getLeague().getName());

What I am asking is that when the league is loaded, the L2 cache for the team->league relationship is updated and the league is added to level 2 cache. this means that subsequent calls after L1 has been cleared (such as after a request cycle) will use the relationship to find the OID and then look in L2 for that object or load it from the datastore directly without the join.

the example above does not put the league in the level 2 cache, and it does not update the relationship in CachedPC.
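
[Editorial illustration] The behaviour requested here can be modelled in a few lines. This is a minimal sketch under stated assumptions, not the attached StateManager and not how DataNucleus implements its L2 cache: every name (RelationCacheSketch, FakeDatastore, Level2Cache) is hypothetical, the league is reduced to its name, and the "datastore" is an in-memory map that only counts queries.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the request; none of these types exist in DataNucleus.
final class RelationCacheSketch {
    /** Stand-in for the datastore; counts queries so the effect of caching is visible. */
    static final class FakeDatastore {
        final Map<Long, Long> leagueIdByTeamId = new HashMap<>();
        final Map<Long, String> leagueNameById = new HashMap<>();
        int joinQueries = 0;
        int loadsById = 0;

        long findLeagueIdForTeam(long teamId) {
            joinQueries++; // models the join against the owner id
            return leagueIdByTeamId.get(teamId);
        }

        String loadLeagueName(long leagueId) {
            loadsById++; // models a direct load by id, with no join
            return leagueNameById.get(leagueId);
        }
    }

    /** Assumed L2 layout: related-object ids per owner, plus cached objects by id. */
    static final class Level2Cache {
        final Map<Long, Long> teamToLeagueId = new HashMap<>();
        final Map<Long, String> leaguesById = new HashMap<>();

        /** Drops the cached relationship, as the description asks for dirty fields. */
        void relationMadeDirty(long teamId) {
            teamToLeagueId.remove(teamId);
        }
    }

    static String leagueNameForTeam(long teamId, Level2Cache l2, FakeDatastore db) {
        Long leagueId = l2.teamToLeagueId.get(teamId);
        if (leagueId == null) {
            // First access: pay for the join once and remember the related id.
            leagueId = db.findLeagueIdForTeam(teamId);
            l2.teamToLeagueId.put(teamId, leagueId);
        }
        String name = l2.leaguesById.get(leagueId);
        if (name == null) {
            // Related object not in L2: load it by id and cache it.
            name = db.loadLeagueName(leagueId);
            l2.leaguesById.put(leagueId, name);
        }
        return name;
    }

    public static void main(String[] args) {
        FakeDatastore db = new FakeDatastore();
        db.leagueIdByTeamId.put(12035736L, 7L);
        db.leagueNameById.put(7L, "Bugbash 4");
        Level2Cache l2 = new Level2Cache();

        leagueNameForTeam(12035736L, l2, db); // join + load by id
        leagueNameForTeam(12035736L, l2, db); // served entirely from the cache
        System.out.println("joins=" + db.joinQueries + " loadsById=" + db.loadsById);
    }
}
```

In this model the second access issues no query at all, and if only the league object is evicted, the next access reloads it by id without repeating the join.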

does that make sense?


Andy Jefferson added a comment - 26/Oct/10 08:05 PM
I see no transactions. The L2 cache is only current set-up to update at commit.
Define a testcase that actually demonstrates something, otherwise it is futile going back and forth.

Tom Zurkan added a comment - 26/Oct/10 08:54 PM
will do. but, what i am asking for is essentially L2 being used outside a transaction like the example above.

Fernando Padilla added a comment - 26/Oct/10 09:02 PM
Hey, I'm also following this bug. :)

I think this is the best way to summarize it.

If DN goes through the trouble of executing an SQL query to get some data, that data should be captured into L2 so that it doesn't have to run that query again.


Huge performance gains up and down the stack. Every time you save a db query it's huge both on the webserver's cpu/response time, but also all of the database resources you save.. really really good stuff. :)

Tom Zurkan added a comment - 26/Oct/10 11:04 PM
yes, fernando has said it succinctly. what i am looking for is that all accessed data should be cached.

Tom Zurkan added a comment - 28/Oct/10 09:16 PM
The state manager that I provided does this. But, I am not sure that the direction is correct. Essentially, the state manager holds the relationships from cache and when a field is requested and not loaded, it gets the id from the relationship, and then loads the field from the datastore or from cache.

Andy Jefferson added a comment - 29/Oct/10 07:32 AM
The JDO spec defines adequately the L2 cache's role. What I need, as said earlier, is a testcase that demonstrates that it isn't caching particular data in some situation. As also mentioned earlier, the provided StateManager doesn't tell me what has changed, which is why patch format is the only thing that makes sense ... going hand in hand with the provided testcase ...

Fernando Padilla added a comment - 29/Oct/10 08:02 AM
Even if the spec says something like that, it's definitely not the most efficient/effective behavior.

My bet is that it says that you MUST update the L2 cache whenever an object is updated through transaction commit. But I would take a bet that it does not say that you MUST NOT update the L2 cache otherwise.



Can you point me to where in the spec it talks about the L2 cache only being populated during transactions? I just downloaded the full spec and can't seem to find any prescription to how the L2 cache is "supposed to work".

( I was searching through the latest: http://jcp.org/aboutJava/communityprocess/mrel/jsr243/index3.html )

Andy Jefferson added a comment - 29/Oct/10 08:07 AM
What? I said the JDO spec defines what the L2 cache is for so there is no need to repeat things here. I also said I want to see a testcase that demonstrates some point where it is not putting fields in the L2 cache, because I don't have one. I also said I wrote the L2 cache code around testcases that are transactional (since that is what JPOX used to be, solely) ... i.e we have no testcases that are non-tx and test the L2 cache (we have non-tx tests but for other things). Nowhere have I said it isn't supposed to work non-tx.

So I return to my first question. I need a testcase that demonstrates something ... the prerequisite for *any* issue

Fernando Padilla added a comment - 29/Oct/10 08:16 AM
Ah Ok. so very sorry. :)

I got confused between you saying "The L2 cache is only current set-up to update at commit" and then later saying "JDO spec defines adequately the L2 cache's role" when we kept pushing about how we wanted it to work.

sorry I see my mistake now.

I can see how you want to push back for us to give you a clear "unit test". That's rational, so I guess that's the compromise.



Can we give it to you in pseudo code? Or could you give us a hint, as to a particular unit test we could adapt for this? (sorry i haven't spent time with the unit test code yet).

Andy Jefferson added a comment - 29/Oct/10 08:25 AM
No problem. A testcase can be some sample classes (with some particular relations), some metadata (or annotations), and then sample persistence code (makePersistent, query) and something that says with this persistence operation field X of object Y is not being cached even though it was just loaded/accessed

[It wouldn't need necessarily to be a JUnit since to work out if a field is actually cached in the L2 cache could involve some work]

Fernando Padilla added a comment - 29/Oct/10 08:35 PM
So, do you want running code? like within DN's unit tests? Or just pseudo-like code that you will get running somehow?

ps - Either way, is there an easy way to print out a PC's loaded field array? So we can check it within the test?

Tom Zurkan added a comment - 30/Oct/10 01:00 AM
so, i svn diffed the jdostatemanager with what i had originally checked in as a pretty much straight copy. i'm hoping this gives you a good view of what i changed in the statemanager. let me know if it is helpful at all.

Andy Jefferson added a comment - 30/Oct/10 08:15 AM
Running tests ? As per
http://www.datanucleus.org/project/problem_jdo_testcase.html
is the preference, so either JUnit that uses our existing samples data from the test suites, if not that then a Main.java and the sample classes/metadata, and if not that then pseudo code.

Loaded/dirty fields, as per the foot of
http://www.datanucleus.org/products/accessplatform_2_2/jdo/object_lifecycle.html

Tom Zurkan added a comment - 03/Nov/10 02:56 AM - edited
ok, so i took your existing junit test for 1-1 relationships. i think it is pretty self-explanatory. i removed the transactions and load the preexisting loginAccount. then i access the login. then, i load the loginAccount again and get the login. you will notice that it does a join both times to get the login. please, let me know if you want further details. i did not include the datanucleus.properties. thanks.

Tom Zurkan added a comment - 08/Nov/10 08:15 PM
hey andy, i haven't heard anything since i posted the unit test. is this sufficient information for you?

Andy Jefferson added a comment - 08/Nov/10 10:01 PM - edited
Tom, not had much time recently. You don't say which test scenario you took that test from. test.jdo.datastore ? i.e does the sample have to use datastore id for it to work ? It doesn't persist any objects either so don't see how anything can work without knowing where it comes from

Tom Zurkan added a comment - 08/Nov/10 10:51 PM
yes, it is from test.jdo.datastore and yes, you would need to use a datastore id for it to work. alternatively, i can put the objects into the db as per the original unit test and then evict it from l2 cache before looking them up (and use a different pm so that it is not in L1). the point being that there will be a time where the insert happened at some other time, but, we need to read the data in without a transaction and have it go to L2 with relationships cached as well. i will adapt the original junit and send it as a patch since it sounds like that would help.

Andy Jefferson added a comment - 09/Nov/10 10:40 AM - edited
Ok, but I still don't know what is "DNLoginAccount", "DNLogin" ... presumably some copies of the "test.samples" "org.jpox.samples.one_one.unidir" classes. There is no "RelationshipTest" in "test.jdo.datastore" btw; it is in "test.jdo.orm.datastore".

Is this test a real JUnit test that will fail if the relations are not L2 cached ? If so it ought to go under "test.jdo.general" in CacheTest, since it is testing the cache. If not and just to create the situation in which log entries will show a problem then it doesn't matter what it's called.

Also, if the test is not going to "fail" in the JUnit sense, then put some statements to the log (using NucleusLogger.GENERAL.debug(...)) just before and just after where you say the field is not being L2 cached. This makes it clear what you're referring to

Tom Zurkan added a comment - 10/Nov/10 12:44 AM - edited
Ok Andy, I think I finally have it. This is a JUnit test that you can just drop into test.jdo.general. It shows the problem and fails as a JUnit test should. But, of course, no one would code something up like this except as a test. :) Let me know how you feel about this one. Thanks! CacheRelationshipTest.java

Tom Zurkan added a comment - 12/Nov/10 08:15 PM
hi andy, wondering if this last unit test highlights the issue. i know it only really tests if the relationship is in cache and not that it is used when the field is accessed. so, just wondering what your feedback is. thanks.

Tom Zurkan added a comment - 16/Nov/10 02:41 AM
ok, so now i am running off of your datanucleus trunk and i see that my junit test passes which is great news. is the relationship map being used when fetching a field? i am going to test that next and hopefully come up with a unit test that shows it. thanks for step one. :)

Tom Zurkan added a comment - 18/Nov/10 03:21 AM
ok, so, now i have added another unit test that will fail because the loginaccount.login call causes a datastore query. of course, i had to add the following line to my datanucleus.properties file: datanucleus.managedRuntime=true
Also added the plugin to the project so that i can get the query count:

<dependency>
   <groupId>org.datanucleus</groupId>
   <artifactId>datanucleus-management</artifactId>
   <version>1.0.2</version>
</dependency>

this covers both the adding of the field to cache and retrieving it. thanks!

Andy Jefferson added a comment - 10/Dec/10 11:08 AM
Presumed fixed in 2.2.0.release. All provided tests pass. If any other situation exists then please provide a further test