Issue Details (XML | Word | Printable)

Key: NUCMONGODB-67
Type: New Feature New Feature
Status: Open Open
Priority: Minor Minor
Assignee: Unassigned
Reporter: Chris Rued
Votes: 0
Watchers: 0
Operations

If you were logged in you would be able to see more operations.
DataNucleus Store MongoDB

Support @OneToMany with no List of ids stored at the "1" side.

Created: 20/Mar/12 07:37 PM   Updated: 15/Jun/12 09:10 AM
Component/s: Persistence, Query, Schema
Affects Version/s: None
Fix Version/s: None

File Attachments: 1. Text File NUCMONGODB-67-a.patch (3 kB)
2. Text File NUCMONGODB-67.patch (0.5 kB)


Datastore: MongoDB


 Description  « Hide
I'm not sure if this was a conscious decision for efficiency, or if this is a bug, but I found it surprising.

Suppose I have the following entity classes:

I have the following entity classes (only showing relevant portions):

    @Entity
    public class TestEntity implements Serializable {
        @OneToMany(mappedBy = "testEntity")
        private List<RelatedListElementEntity> relatedListElements;
    }

    @Entity
    public class RelatedListElementEntity implements Serializable {
        @ManyToOne
        private TestEntity testEntity;
    }




What I expected:
----------------
When I persist RelatedListElementEntity I expect to see something like:
    {
        "_id" : "8a998109363148060136314806f30002",
        "testEntity" : "8a998109363148060136314806f30001"
    } ,
    {
        "_id" : "8a99810936314806013631484c590003",
        "testEntity" : "8a998109363148060136314806f30001"
    }

When I persist TestEntity I expect to see something like:
    {
        "_id" : "8a998109363148060136314806f30001",
    }






What I got:
----------------
The RelatedListElementEntity looked good, but for TestEntity I see:
    {
        "_id" : "8a998109363148060136314806f30001",
        "relatedListElements" : [ "8a998109363148060136314806f30002", "8a99810936314806013631484c590003" ],
    }

(the list elements have been persisted into the document AND into the "owning" property).



Is this by design or a bug? If a bug, I could look into correcting it. It seems StoreFieldManager is the correct place to look, and we may want to add some checks into storeObjectField() to prevent this from being stored. One thought is to add some more checks into isStorable(), (which is called by storeObjectField()).

What do you think?


Andy Jefferson added a comment - 20/Mar/12 07:45 PM
The 1 side has a field that is a List of ***ids*** of the related objects. This means that they can be looked up easily. This is the same way we persist into HBase, and also GAE BigTable in its most recent version. Makes sense to me at least ;-)

Chris Rued added a comment - 20/Mar/12 07:51 PM
Simple patch that makes my small test case do the right thing:

Generated with the command:

    svn diff > NUCMONGODB-67.patch

From the directory:

    datanucleus\platform\store.mongodb\trunk\src\java\org\datanucleus\store\mongodb\fieldmanager

The patch simply adds an additional condition to isStorable so that false is returned if the mmd's getMappedBy property is non-null.

Chris Rued made changes - 20/Mar/12 07:51 PM
Field Original Value New Value
Attachment NUCMONGODB-67.patch [ 11611 ]
Chris Rued added a comment - 20/Mar/12 08:00 PM
Right. The way it is now, both sides have the ID's of the other side. The part that is confusing is that this seems redundant...I didn't expect the values from the non-owning side to be persisted at all (I thought this is more or less what the "owning" part meant).

The current implementation seems to prefer performance (when accessing the entities from the non-owning side) over document size and "normalization" (which may be a dirty word in Mongo circles(?) :-) )

In my use case this is just causing documents to be larger than necessary because this list can grow rather large, and I only ever access it from the "owning" side.



(now thinking about my patch, so far I've only worried about the storage of the relationship, not the retrieval...so it may well be incomplete).

Andy Jefferson added a comment - 20/Mar/12 08:04 PM
Yes, but what does that *achieve* ? It prevents storing of what is in the Collection/Map/array, so then when the user wants to retrieve it they get no data in that field.

I've no plans to change what is persisted; this is the default mode of persistence. This is what people have been using, and do that and you break people's data.

If however you want to provide another mode of persistence ... where perhaps you only store the info over in the related objects (and you have to add in indexes when storing Lists/arrays too otherwise you can't have ordering) like with RDBMS then fine ... but that means that you have to provide metadata processing to enable that mode of operation, and extra code in FetchFieldManager to instantiate the Collection on retrieval.

Andy Jefferson added a comment - 20/Mar/12 08:18 PM
Only other comments are
1. If you only access from the N side then simply don't put the Collection in the other side. QED

2. There is no problem with providing a different mode of persistence (for size, or whatever), the LDAP and RDBMS plugins do the same, to allow for some groups of users with differing requirements. Just need to think of all possible issues with each new "mode". For example, if you have a List then you can't have an element in the List more than once using your proposed mode (except if you have a join 'table' type feature too!). I can't see a simple way to support maps in this new way either.

Chris Rued added a comment - 20/Mar/12 08:21 PM - edited
Understood and agreed.

You don't want to break current installations. Storing the relationships redundantly isn't really too big a deal (disk space is pretty cheap). Plus ordering will be tricky (I think).

However, I have observed that it's possible in the current implementation for the two sides to become inconsistent by either:

   1. Doing rlee.setTestEntity(testEntitiy) without a corresponding testEntity.addRelatedListElementEntity(rlee); or
   2. Doing testEntity.addRelatedListElementEntity(rlee) without a corresponding rlee.setTestEntity(testEntitiy)

I suppose you could argue this is a programming error in the object model, but my (possibly faulty) understanding of the JPA mappedBy property was that a user should only have to set the owning side (and this will not work). It's nothing that I can't work around, just didn't match my expectation.

At this point, I don't know if it's worth the effort for me to do anything about it (I can work around it), and it sounds as though you feel the same. I'm happy to put in some time if you think it's worthwhile.

Thanks again!

[Edit: In case it wasn't clear, rlee above is short-hand for a persisted instance of RelatedListElementEntity and testEntity was a persisted instance of testEntity]

Chris Rued added a comment - 20/Mar/12 08:37 PM

    > 1. If you only access from the N side then simply don't put the Collection in the other side. QED

Yes, this was more-or-less my plan...we had some pre-existing code that set it on both sides (trying to keep the in-memory model correct so we can be a bit ignorant of the persistence), which is what started all of this for me :-)...



    > 2. There is no problem with providing a different mode of persistence (for size, or whatever), the LDAP and RDBMS plugins do the same,
    > to allow for some groups of users with differing requirements. Just need to think of all possible issues with each new "mode". For
    > example, if you have a List then you can't have an element in the List more than once using your proposed mode (except if you have a
    > join 'table' type feature too!). I can't see a simple way to support maps in this new way either.

Ideally I think I'd like to see something like this to provide more flexibility in the use cases supported, and to allow users to get expected behavior without having to understand the datastore (this may just be wishful thinking).

As far as I know, my current needs are met with what exists...we just need to be careful in a couple of cases to be sure we're storing the proper sides of the relationships we'd like to traverse.


Andy Jefferson made changes - 21/Mar/12 08:23 AM
Summary @OneToMany with mappedBy persisting collection on non-owning side Support @OneToMany with no List of ids stored at the "1" side.
Issue Type Bug [ 1 ] New Feature [ 2 ]
Component/s Persistence [ 10291 ]
Component/s Query [ 10290 ]
Component/s Schema [ 10313 ]
Chris Rued added a comment - 21/Mar/12 08:47 PM
A (very) partial patch which performs the retrieval from the non-owning side by executing a MongoDB query when the field is accessed. This is intended only as a suggestion of how it might be done, and is by no means complete enough to be merged.

Chris Rued made changes - 21/Mar/12 08:47 PM
Attachment NUCMONGODB-67-a.patch [ 11620 ]
Andy Jefferson made changes - 15/Jun/12 09:10 AM
Priority Major [ 3 ] Minor [ 4 ]