datanucleus-core is the base persistence implementation of DataNucleus. All other DataNucleus projects build on top of this, and so it is the pre-requisite for any DataNucleus-enabled application.
trunk can be checked out as follows
svn co https://svn.code.sf.net/p/datanucleus/code/platform/core/trunk core
datanucleus-core is downloadable as follows
datanucleus-core provides the framework for the persistence process. Whether you are using optimistic or pessimistic transactions governs which persistence process is followed.
With the optimistic route, when a user tries to persist an object, the object will not reach the datastore immediately. ObjectManager will call StateManager.makePersistent and this will run reachability on the object to make this object and any reachable persistable objects provisionally persistent. This means that their lifecycle state is PERSISTENT_NEW, but they haven't been flushed to the datastore.
When a user tries to delete an object, the object will not be removed from the datastore immediately. ObjectManager will call StateManager.deletePersistent and this will run reachability on the object to make this object and any reachable persistable objects provisionally deleted. This means that their lifecycle state is PERSISTENT_DELETED, but they haven't been flushed to the datastore.
When the user calls flush or tx.commit, the list of objects with outstanding changes is processed one by one. Each outstanding persist results in a call to StorePersistenceHandler.insertObject for that object. Similarly, each outstanding delete results in a call to StorePersistenceHandler.deleteObject, and the same goes for any updates. Within each datastore persistence handler, the handler can detect that related objects are already provisionally persistent or provisionally deleted, and so does not need to cascade to them.
With the pessimistic route, when a user tries to persist an object, the object will reach the datastore immediately. ObjectManager will call StateManager.makePersistent and this will change the lifecycle state to PERSISTENT_NEW, and will relay the call to StorePersistenceHandler.insertObject. It is the responsibility of this method to handle cascading to related objects.
When a user tries to delete an object, the object will be removed from the datastore immediately. ObjectManager will call StateManager.deletePersistent and this will change the lifecycle state to PERSISTENT_DELETED, and relay the call to StorePersistenceHandler.deleteObject. It is the responsibility of this method to handle cascading to related objects.
When the user calls flush or tx.commit then the list of objects with outstanding changes will be processed one by one. In general for the pessimistic route there will be little to do here.
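The difference between the two routes can be pictured with a minimal sketch. It is not DataNucleus code: RecordingHandler and PersistenceRouteSketch are hypothetical stand-ins, objects are reduced to string ids, and reachability and lifecycle states are left out. It only shows when the handler gets called in each route.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a store persistence handler; it only records the calls it receives. */
final class RecordingHandler {
    final List<String> calls = new ArrayList<>();

    void insertObject(String id) { calls.add("insert " + id); }

    void deleteObject(String id) { calls.add("delete " + id); }
}

/** Hypothetical sketch of the two routes (assumption: objects are plain string ids). */
final class PersistenceRouteSketch {
    private final boolean optimistic;
    private final RecordingHandler handler;
    private final List<Runnable> pending = new ArrayList<>();

    PersistenceRouteSketch(boolean optimistic, RecordingHandler handler) {
        this.optimistic = optimistic;
        this.handler = handler;
    }

    void makePersistent(String id) { route(() -> handler.insertObject(id)); }

    void deletePersistent(String id) { route(() -> handler.deleteObject(id)); }

    /** Optimistic: remember the operation for later. Pessimistic: relay it to the handler now. */
    private void route(Runnable operation) {
        if (optimistic) {
            pending.add(operation);
        } else {
            operation.run();
        }
    }

    /** Processes the outstanding operations one by one; a no-op on the pessimistic route. */
    void flush() {
        for (Runnable operation : pending) {
            operation.run();
        }
        pending.clear();
    }

    public static void main(String[] args) {
        RecordingHandler handler = new RecordingHandler();
        PersistenceRouteSketch optimisticRoute = new PersistenceRouteSketch(true, handler);
        optimisticRoute.makePersistent("A");
        System.out.println("before flush: " + handler.calls);
        optimisticRoute.flush();
        System.out.println("after flush: " + handler.calls);
    }
}
```

With the optimistic flag set, the handler sees nothing until flush is called; with it unset, each call reaches the handler straight away and flush has nothing left to do.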
DataNucleus provides flexibility with logging. You can choose whether to use the popular Log4J or JDK1.4 logging for example. Moreover, DataNucleus allows you to log messages to various categories, allowing users to filter the logged messages by these categories.
The NucleusLogger class provides the central registry of logging categories used by DataNucleus. It provides an accessor for retrieving a particular logging category -- used as follows
NucleusLogger.METADATA.info("my log message");
The various categories are defined in the NucleusLogger class. Only add new ones after discussion with other developers. NucleusLogger will decide whether Log4J, JDK1.4 logging or another mechanism should be used; you don't need to do anything in your code.
NucleusLogger allows you to log messages at various severity levels. These are DEBUG, INFO, WARN, ERROR, FATAL. Each message is logged at a particular level to a category (as described above).
To log a message is very simple. See below for a few examples
NucleusLogger.DATASTORE_SCHEMA.info("my log message");
NucleusLogger.DATASTORE_SCHEMA.error("my log message");
if (NucleusLogger.DATASTORE_SCHEMA.isDebugEnabled())
{
    NucleusLogger.DATASTORE_SCHEMA.debug("my log message at debug level");
}
Please refer to the Log4J Manual for details of what you can do with a Log4J Logger. To see how you can use the logging from a user's perspective, refer to the User Logging Guide for DataNucleus AccessPlatform.
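To illustrate why categories and the level check are useful, here is a small sketch using only the JDK's java.util.logging. It does not use NucleusLogger. The category names "Example.Datastore.Schema" and "Example.MetaData" are made up for the example, and treating DEBUG as the JDK level FINE is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class CategoryLoggingSketch {
    /** Collects "category: message" strings instead of writing to a console or file. */
    static final class CapturingHandler extends Handler {
        final List<String> lines = new ArrayList<>();

        @Override public void publish(LogRecord record) {
            lines.add(record.getLoggerName() + ": " + record.getMessage());
        }

        @Override public void flush() { }

        @Override public void close() { }
    }

    static List<String> demo() {
        // Hypothetical category names; one JDK logger per category.
        Logger schema = Logger.getLogger("Example.Datastore.Schema");
        Logger metadata = Logger.getLogger("Example.MetaData");
        CapturingHandler handler = new CapturingHandler();
        for (Logger logger : new Logger[] {schema, metadata}) {
            logger.setUseParentHandlers(false);
            logger.addHandler(handler);
        }
        // Filter per category: detailed output for schema, warnings and above for metadata.
        schema.setLevel(Level.FINE);
        metadata.setLevel(Level.WARNING);

        schema.info("schema info");
        // The guard avoids building the message string when the level is disabled.
        if (schema.isLoggable(Level.FINE)) {
            schema.fine("schema detail for " + 3 + " tables");
        }
        metadata.info("metadata info");       // dropped: below WARNING for this category
        metadata.warning("metadata warning");

        schema.removeHandler(handler);
        metadata.removeHandler(handler);
        return handler.lines;
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

The same idea applies whichever logging backend is selected: the level is configured per category, and the guard around the debug call keeps disabled categories cheap.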
The DataNucleus system is internationalisable, hence messages (to log files or exceptions) can be displayed in multiple languages. Currently DataNucleus contains localisation files in the default locale (English), but can be extended easily by adding localisation files in languages such as Spanish, French, etc. The internationalisation operates around the org.datanucleus.util.Localiser class, which is responsible for generating the messages in the specified locale. Each class needs to instantiate a Localiser
private static final Localiser LOCALISER=Localiser.getInstance("org.datanucleus.store.Localisation",
MyClass.class);
LOCALISER.msg("012345", schemaName, autoStartMechanism)
012345=Initialising Schema "{0}" using "{1}" auto-start option
012345=Inicializando la esquema "{0}" con la opción de empezar "{1}"
If you want to extend this to another language and contribute the files for your language, you need to find all files named "Localisation.properties" and provide an alternative variant. The key ones are
Note that the second argument used in constructing the Localiser is important for OSGi. It has to be a class in the same OSGi bundle as the Localisation.properties file.
You will find alternates in Spanish already present named "Localisation_es.properties", so if you wanted to create a French localisation then provide "Localisation_fr.properties".
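The way a message key, its "{0}"-style placeholders and a language variant fit together can be sketched with the JDK alone. This is not the Localiser implementation: LocalisedMessageSketch is a hypothetical class, the properties text is supplied inline rather than read from Localisation.properties files, and the use of java.text.MessageFormat for the substitution is an assumption of the sketch.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.text.MessageFormat;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

final class LocalisedMessageSketch {
    /** Language code -> message patterns; the empty string is the default (English) variant. */
    private final Map<String, Properties> bundles = new HashMap<>();

    void addBundle(String language, String propertiesText) {
        Properties patterns = new Properties();
        try {
            patterns.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        bundles.put(language, patterns);
    }

    /** Looks up the pattern for the language, falling back to the default variant. */
    String msg(String language, String key, Object... args) {
        Properties patterns = bundles.get(language);
        String pattern = patterns == null ? null : patterns.getProperty(key);
        if (pattern == null) {
            pattern = bundles.get("").getProperty(key);
        }
        return MessageFormat.format(pattern, args);
    }

    static LocalisedMessageSketch example() {
        LocalisedMessageSketch messages = new LocalisedMessageSketch();
        messages.addBundle("",
            "012345=Initialising Schema \"{0}\" using \"{1}\" auto-start option\n");
        messages.addBundle("es",
            "012345=Inicializando la esquema \"{0}\" con la opci\\u00f3n de empezar \"{1}\"\n");
        return messages;
    }

    public static void main(String[] args) {
        LocalisedMessageSketch messages = example();
        System.out.println(messages.msg("", "012345", "MYSCHEMA", "Classes"));
        System.out.println(messages.msg("es", "012345", "MYSCHEMA", "Classes"));
        // No French variant has been added, so this falls back to the default.
        System.out.println(messages.msg("fr", "012345", "MYSCHEMA", "Classes"));
    }
}
```

Adding a new language in this sketch means adding one more bundle with the same keys, which mirrors providing a "Localisation_fr.properties" next to the default file.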
Further references: International Components for Unicode for Java
DataNucleus provides a generic query processing engine. It provides for compilation of string-based query languages, and additionally allows in-memory evaluation of these queries. This is very useful when providing support for new datastores that either don't have a native query language (so the only alternative is for DataNucleus to evaluate the queries), or where it will take some time to map the compiled query to the equivalent query in the native language of the datastore.
When a user invokes a query, using the JDO/JPA APIs, they are providing either
The first step is to convert these two forms into the constituent clauses. It is assumed that a string-based query is of the form
SELECT {resultClause} FROM {fromClause} WHERE {filterClause}
GROUP BY {groupingClause} HAVING {havingClause}
ORDER BY {orderClause}
The two primary supported query languages have helper classes to provide this migration from
the single-string query form into the individual clauses. These can be found in
org.datanucleus.query.JDOQLSingleStringParser
and org.datanucleus.query.JPQLSingleStringParser
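The split of a single-string query into clauses can be pictured with a deliberately naive sketch. It is not how JDOQLSingleStringParser or JPQLSingleStringParser work: ClauseSplitterSketch is a hypothetical class that assumes the keywords appear in the order shown above, are separated by single spaces, and never occur inside string literals or subqueries.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ClauseSplitterSketch {
    private static final String[] KEYWORDS =
        {"SELECT", "FROM", "WHERE", "GROUP BY", "HAVING", "ORDER BY"};

    /** Returns keyword -> clause text, in the order the keywords occur in the query. */
    static Map<String, String> split(String query) {
        Map<String, String> clauses = new LinkedHashMap<>();
        String currentKeyword = null;
        int clauseStart = 0;
        for (String keyword : KEYWORDS) {
            int position = indexOfWord(query, keyword, clauseStart);
            if (position < 0) {
                continue; // this clause is not present in the query
            }
            if (currentKeyword != null) {
                clauses.put(currentKeyword, query.substring(clauseStart, position).trim());
            }
            currentKeyword = keyword;
            clauseStart = position + keyword.length();
        }
        if (currentKeyword != null) {
            clauses.put(currentKeyword, query.substring(clauseStart).trim());
        }
        return clauses;
    }

    /** Case-insensitive search for a keyword delimited by whitespace or the string boundaries. */
    private static int indexOfWord(String text, String word, int from) {
        for (int pos = from; pos + word.length() <= text.length(); pos++) {
            if (!text.regionMatches(true, pos, word, 0, word.length())) {
                continue;
            }
            int end = pos + word.length();
            boolean startOk = pos == 0 || Character.isWhitespace(text.charAt(pos - 1));
            boolean endOk = end == text.length() || Character.isWhitespace(text.charAt(end));
            if (startOk && endOk) {
                return pos;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String query = "SELECT field1, field2 FROM MyClass WHERE field1 > 5 ORDER BY field2";
        split(query).forEach((keyword, clause) -> System.out.println(keyword + " -> " + clause));
    }
}
```

A real parser also has to cope with quoting, nesting and language-specific keywords, which is why each query language has its own helper class.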
So we have a series of clauses and we want to compile them. So what does this mean?
Well, in simple terms, we are going to convert the individual clauses from above into
expression tree(s) so that they can be evaluated. The end result of a compilation is
an org.datanucleus.query.compiler.QueryCompilation
So if you think about a typical query you may have
SELECT field1, field2 FROM MyClass
The query compilation of a particular clause has 2 stages
and compilation is performed by a JavaQueryCompiler, so look at
org.datanucleus.query.compiler.JDOQLCompiler
and org.datanucleus.query.compiler.JPQLCompiler
These each have a Parser that performs the extraction of the different components of the
clauses and generation of the Node tree. Once a Node tree is generated it can then be converted
into the compiled Expression tree; this is handled inside the JavaQueryCompiler.
The other part of a query compilation is the
org.datanucleus.query.symbol.SymbolTable
which is a lookup table (map) of identifiers and their values. So, for example, an input
parameter will have a name, so it has an entry in the table, and its value is stored there.
This is then used during evaluation.
Intuitively it is more efficient to evaluate a query within the datastore, since fewer result objects need instantiating in order to determine the results. To evaluate a compiled query in the datastore there needs to be a compiler that takes the generic expression compilation and converts it into a native query. Note that you aren't forced to evaluate the whole of the query in the datastore; you could evaluate just the filter clause there. This would be done where the datastore's native language provides only limited query capabilities. For example with db4o we evaluate the filter and ordering in the datastore, using their SODA query language. The remaining clauses can be evaluated on the resultant objects in-memory (see below). For a datastore like RDBMS it should be possible to evaluate the whole query in-datastore.
Evaluation of queries in-memory assumes that we have a series of "candidate" objects.
These are either user-input to the query itself, or retrieved from the datastore. We then
use the in-memory evaluator
org.datanucleus.query.evaluator.memory.InMemoryExpressionEvaluator.
This takes in each candidate object one-by-one and evaluates whichever of the query clauses
are desired to be evaluated. For example we could just evaluate the filter clause.
Evaluation makes use of the values of the fields of the candidate objects (and related objects)
and uses the SymbolTable for values of parameters etc. Where a candidate fails a particular
clause in the filter then it is excluded from the results.
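As a rough illustration of evaluating a filter in memory, the sketch below builds a tiny expression tree and applies it to each candidate in turn, reading parameter values from a symbol table. It is not the InMemoryExpressionEvaluator: InMemoryFilterSketch and its Expr interface are hypothetical, candidates are simplified to maps of field name to value, and the symbol table is a plain map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class InMemoryFilterSketch {
    /** One node of a (much simplified, hypothetical) compiled expression tree. */
    interface Expr {
        Object eval(Map<String, Object> candidate, Map<String, Object> symbols);
    }

    /** Value of a field of the candidate. */
    static Expr field(String name) { return (candidate, symbols) -> candidate.get(name); }

    /** Value of a named parameter, looked up in the symbol table. */
    static Expr param(String name) { return (candidate, symbols) -> symbols.get(name); }

    static Expr greaterThan(Expr left, Expr right) {
        return (candidate, symbols) ->
            ((Number) left.eval(candidate, symbols)).doubleValue()
                > ((Number) right.eval(candidate, symbols)).doubleValue();
    }

    static Expr and(Expr left, Expr right) {
        return (candidate, symbols) ->
            Boolean.TRUE.equals(left.eval(candidate, symbols))
                && Boolean.TRUE.equals(right.eval(candidate, symbols));
    }

    /** Takes each candidate one by one and excludes those that fail the filter. */
    static List<Map<String, Object>> filter(List<Map<String, Object>> candidates,
                                            Expr filter, Map<String, Object> symbols) {
        List<Map<String, Object>> results = new ArrayList<>();
        for (Map<String, Object> candidate : candidates) {
            if (Boolean.TRUE.equals(filter.eval(candidate, symbols))) {
                results.add(candidate);
            }
        }
        return results;
    }

    static Map<String, Object> candidate(String name, int price) {
        Map<String, Object> fields = new HashMap<>();
        fields.put("name", name);
        fields.put("price", price);
        return fields;
    }

    public static void main(String[] args) {
        // Filter: price > :minPrice && :maxPrice > price
        Expr filter = and(greaterThan(field("price"), param("minPrice")),
                          greaterThan(param("maxPrice"), field("price")));
        Map<String, Object> symbols = new HashMap<>();
        symbols.put("minPrice", 10);
        symbols.put("maxPrice", 30);
        List<Map<String, Object>> candidates =
            Arrays.asList(candidate("pen", 5), candidate("book", 20), candidate("lamp", 35));
        System.out.println(filter(candidates, filter, symbols));
    }
}
```

The same loop could evaluate only part of a query, for instance when the datastore has already applied the filter and only the ordering is left to do in memory.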
There are two primary ways to return results to the user.
To make use of the second route, consider extending the class org.datanucleus.store.query.AbstractQueryResult and implement the key methods. Also, for the iterator, you can extend org.datanucleus.store.query.AbstractQueryResultIterator.
When a persistable class is persisted and has a field of a second-class type (Collection, Map, Date, etc) then DataNucleus needs to know when the user calls operations on it to change the contents of the object. To do this, at the first reference to the field once enlisted in a transaction, DataNucleus will replace the field value with a proxy wrapper wrapping the real object. This has no effect for the user in that the field is still castable to the same type as they had in that field, but all operations are intercepted.
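The interception idea can be sketched in a few lines. This is not a DataNucleus wrapper: DirtyTrackingSketch, Owner and TrackedList are hypothetical names, the owner is reduced to a set of dirty field names, and only three mutators are intercepted where a real wrapper would need to cover every mutating operation (including iterators).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DirtyTrackingSketch {
    /** Hypothetical stand-in for whatever tracks the dirty fields of the owning object. */
    static final class Owner {
        final Set<String> dirtyFields = new HashSet<>();

        void markDirty(String fieldName) { dirtyFields.add(fieldName); }
    }

    /** Still an ArrayList, so user code can hold and cast it exactly as before. */
    static final class TrackedList<E> extends ArrayList<E> {
        private final Owner owner;
        private final String fieldName;

        TrackedList(Owner owner, String fieldName, Collection<? extends E> initialValues) {
            super(initialValues); // copying the initial values does not count as a change
            this.owner = owner;
            this.fieldName = fieldName;
        }

        @Override public boolean add(E element) {
            owner.markDirty(fieldName);
            return super.add(element);
        }

        @Override public boolean remove(Object element) {
            boolean changed = super.remove(element);
            if (changed) {
                owner.markDirty(fieldName);
            }
            return changed;
        }

        @Override public void clear() {
            owner.markDirty(fieldName);
            super.clear();
        }
    }

    public static void main(String[] args) {
        Owner owner = new Owner();
        List<String> names = new TrackedList<>(owner, "names", Arrays.asList("a"));
        System.out.println("dirty before: " + owner.dirtyFields);
        names.add("b");
        System.out.println("dirty after: " + owner.dirtyFields);
    }
}
```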
By default when a container field is replaced by a second-class object (SCO) wrapper it will be enabled to cache the values in that field. This means that once the values are loaded in that field there will be no need to make any call to the datastore unless changing the container. This gives significant speed ups when compared to relaying all calls via the datastore. You can change to not use caching by setting either
This is implemented in a typical SCO proxy wrapper by using the SCOUtils method useContainerCache() which determines if caching is required, and by having a method load() on all proxy wrapper container classes.
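A minimal sketch of the caching behaviour follows. It is not the DataNucleus implementation: CachingContainerSketch is a hypothetical class, the datastore is replaced by a Supplier, and a plain boolean flag stands in for the decision made by useContainerCache(). The counter exists only to make the number of datastore reads visible.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

final class CachingContainerSketch<E> {
    private final Supplier<List<E>> datastore; // hypothetical stand-in for the backing store
    private final boolean useCache;            // stand-in for the useContainerCache() decision
    private List<E> cache;
    private boolean loaded;
    int datastoreReads;                        // only here to make the effect observable

    CachingContainerSketch(Supplier<List<E>> datastore, boolean useCache) {
        this.datastore = datastore;
        this.useCache = useCache;
    }

    /** Loads all values into the cache; does nothing if caching is off or already loaded. */
    void load() {
        if (useCache && !loaded) {
            cache = readFromDatastore();
            loaded = true;
        }
    }

    int size() { return currentValues().size(); }

    boolean contains(Object element) { return currentValues().contains(element); }

    /** Cached values when caching is on, otherwise a fresh read for every call. */
    private List<E> currentValues() {
        if (useCache) {
            load();
            return cache;
        }
        return readFromDatastore();
    }

    private List<E> readFromDatastore() {
        datastoreReads++;
        return new ArrayList<>(datastore.get());
    }

    public static void main(String[] args) {
        Supplier<List<String>> source = () -> Arrays.asList("a", "b");
        CachingContainerSketch<String> cached = new CachingContainerSketch<>(source, true);
        cached.size();
        cached.contains("a");
        System.out.println("reads with cache: " + cached.datastoreReads);
        CachingContainerSketch<String> uncached = new CachingContainerSketch<>(source, false);
        uncached.size();
        uncached.contains("a");
        System.out.println("reads without cache: " + uncached.datastoreReads);
    }
}
```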
JDO and JPA provide mechanisms for specifying whether fields are loaded lazily (when required) or whether they are loaded eagerly (when the object is first met). DataNucleus follows these specifications but also allows the user to override the lazy loading for a SCO container. For example if a collection field was marked as being part of the default fetch group it should be loaded eagerly which means that when the owning object is instantiated the collection is loaded up too. If the user overrides the lazy loading for that field in that situation to make it lazy, DataNucleus will instantiate the owning object and instantiate the collection but leave it marked as "to be loaded" and the elements will be loaded up when needed. You can change the lazy loading setting via
When DataNucleus is using an optimistic transaction it attempts to delay all datastore operations until commit is called on the transaction or flush is called on the PersistenceManager/EntityManager. This implies a change to operation of SCO proxy wrappers in that they must queue up all mutating operations (add, clear, remove etc) until such a time as they need to be sent to the datastore. All SCO proxy wrappers have a List of queued operations for this purpose.
All code for the actual queued operations is stored under org.datanucleus.sco.queued.
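The queueing of mutating operations can be sketched as follows. None of these names come from DataNucleus: QueuedOperationsSketch, QueuedOperation and QueueingList are hypothetical, and an ordinary in-memory list plays the part of the datastore.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class QueuedOperationsSketch {
    /** One queued mutating operation, applied to the backing store when flushed. */
    interface QueuedOperation<E> {
        void perform(List<E> backingStore);
    }

    static final class QueueingList<E> {
        private final List<E> backingStore; // hypothetical stand-in for the datastore
        private final List<E> cached;       // what the user sees in the meantime
        private final List<QueuedOperation<E>> queue = new ArrayList<>();
        private final boolean queueing;     // true for the optimistic case

        QueueingList(List<E> backingStore, boolean queueing) {
            this.backingStore = backingStore;
            this.cached = new ArrayList<>(backingStore);
            this.queueing = queueing;
        }

        void add(E element) {
            cached.add(element);
            submit(store -> store.add(element));
        }

        void remove(E element) {
            cached.remove(element);
            submit(store -> store.remove(element));
        }

        void clear() {
            cached.clear();
            submit(List::clear);
        }

        /** Queue the operation when queueing, otherwise send it to the backing store now. */
        private void submit(QueuedOperation<E> operation) {
            if (queueing) {
                queue.add(operation);
            } else {
                operation.perform(backingStore);
            }
        }

        /** Sends the queued operations to the backing store in the order they were made. */
        void flush() {
            for (QueuedOperation<E> operation : queue) {
                operation.perform(backingStore);
            }
            queue.clear();
        }

        List<E> values() { return new ArrayList<>(cached); }
    }

    public static void main(String[] args) {
        List<String> datastore = new ArrayList<>(Arrays.asList("a"));
        QueueingList<String> list = new QueueingList<>(datastore, true);
        list.add("b");
        list.remove("a");
        System.out.println("datastore before flush: " + datastore);
        list.flush();
        System.out.println("datastore after flush: " + datastore);
    }
}
```

Keeping the operations in order matters: replaying them out of sequence at flush time could leave the backing store in a different state from the one the user saw.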
There are actually two sets of SCO wrappers in DataNucleus. The first set provides lazy loading, queueing, etc. The second set consists of simple wrappers that intercept operations and mark the field as dirty in the StateManager. This second set is for use with datastores such as db4o that don't utilise backing stores and just want to know when the field is dirty and hence should be written.
All code for the simple SCO wrappers is stored under org.datanucleus.sco.simple.
DataNucleus relies on classes implementing PersistenceCapable and Detachable. Users could do this manually, but we provide the byte-code enhancement option. The DataNucleus Enhancer is structured to first determine from the input which classes need to be enhanced, and then to enhance each class using the selected ClassEnhancer. DataNucleus has the JDOClassEnhancer, which enhances classes according to the JDO bytecode enhancement contract.
ASM is very lightweight and fast, and operates using the same pattern as a SAX parser: it uses a Visitor pattern. First the class is visited, then fields and methods, and finally an "end" point where you can add on any new fields/methods etc. The JDOClassEnhancer uses the JDOClassVisitor to obtain information about a class to be enhanced and adds on all required fields/methods.
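The visit order described above can be mimicked in plain Java without ASM on the classpath. The sketch below is not the ASM API and not the JDOClassVisitor: the ClassEvents interface is hypothetical, its method names only echo the visitor style with heavily simplified signatures, and "addedField" and "addedMethod" are invented member names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class VisitorEnhancerSketch {
    /** Hypothetical, simplified event interface in the visitor style. */
    interface ClassEvents {
        void visit(String className);
        void visitField(String name);
        void visitMethod(String name);
        void visitEnd();
    }

    /** Records the events it receives; plays the role of whatever writes the class out. */
    static final class Recorder implements ClassEvents {
        final List<String> events = new ArrayList<>();

        @Override public void visit(String className) { events.add("class " + className); }
        @Override public void visitField(String name) { events.add("field " + name); }
        @Override public void visitMethod(String name) { events.add("method " + name); }
        @Override public void visitEnd() { events.add("end"); }
    }

    /** Passes every event through unchanged and adds extra members at the "end" point. */
    static final class AddingVisitor implements ClassEvents {
        private final ClassEvents next;

        AddingVisitor(ClassEvents next) { this.next = next; }

        @Override public void visit(String className) { next.visit(className); }
        @Override public void visitField(String name) { next.visitField(name); }
        @Override public void visitMethod(String name) { next.visitMethod(name); }

        @Override public void visitEnd() {
            next.visitField("addedField");   // invented name, for illustration only
            next.visitMethod("addedMethod"); // invented name, for illustration only
            next.visitEnd();
        }
    }

    /** Plays the role of the reader: class first, then fields and methods, then the end. */
    static void read(String className, List<String> fields, List<String> methods,
                     ClassEvents visitor) {
        visitor.visit(className);
        for (String field : fields) {
            visitor.visitField(field);
        }
        for (String method : methods) {
            visitor.visitMethod(method);
        }
        visitor.visitEnd();
    }

    public static void main(String[] args) {
        Recorder recorder = new Recorder();
        read("MyClass", Arrays.asList("name"), Arrays.asList("getName"),
             new AddingVisitor(recorder));
        for (String event : recorder.events) {
            System.out.println(event);
        }
    }
}
```

Because the added members are emitted just before the end event, the original fields and methods pass through untouched, which is the property that makes this style convenient for enhancement.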
A very useful utility when developing with ASM is its "Bytecode Outline" Eclipse plugin. To install it, simply add an "Eclipse Update site" to your Eclipse config with the URL "http://download.forge.objectweb.org/eclipse-update/" and the name "ObjectWeb". You then install the "Bytecode Outline" plugin. Once you have it installed, select "Window" -> "Show View" -> "Other" -> "Java : Bytecode". This provides a window showing the Java bytecode for the class being edited. If you click on the "ASM" button in this window it shows you the ASM commands you would need to create the class, or a particular method/field. This makes developing new ASMClassMethod implementations a doddle: just create a class with the method you want generating and then cut and paste the ASM code in.
If you ever need to check the byte-code enhanced class for correctness you can always decompile it back to the Java file. This can be done with a bytecode decompiler such as JD. Unpack the JD-GUI download so that you have the following
and invoke the following command
jd-gui
Then select "Open" and choose a class file; JD-GUI shows the decompiled Java code.