Key: NUCHBASE-78
Type: New Feature
Status: Closed
Resolution: Fixed
Priority: Minor
Assignee: Unassigned
Reporter: nicolas
Votes: 0
Watchers: 0
DataNucleus Store HBase

Support for BloomFilters, inMemory and some other options on column families when creating a table

Created: 07/Dec/12 06:36 PM   Updated: 01/Jan/13 09:22 AM   Resolved: 10/Dec/12 08:20 PM
Component/s: Schema
Affects Version/s: 3.2.0.m1
Fix Version/s: 3.2.0.m2

File Attachments: 1. XML File hbase.xml (7 kB)
2. Text File Store_HBase-addExtensionsToHBase.patch (46 kB)
3. Text File Test_added_extensions_for_HBase.patch (12 kB)

Environment: Any

Forum Thread URL: See thread http://www.datanucleus.org/servlet/forum/viewthread_thread,7300
Datastore: HBase
Severity: Development


Description

When creating a column family in HBase, it is not possible to specify any of the optional fields available:
HColumnDescriptor(byte[] familyName, int minVersions, int maxVersions, boolean keepDeletedCells, String compression, boolean encodeOnDisk, String dataBlockEncoding, boolean inMemory, boolean blockCacheEnabled, int blocksize, int timeToLive, String bloomFilter, int scope)

As such, it would be useful to expose these options as JDO metadata extensions.

nicolas added a comment - 07/Dec/12 06:40 PM
Patch for store.hbase.

Please review it and tell me if it is OK. Some more tests are most likely required for the parser. I can add those later if you agree with the implementation.

Apply Store_HBase-addExtensionsToHBase.patch at store.hbase level.
Apply Test_added_extensions_for_HBase.patch at C:\projects\svn\datanucleus\test\accessplatform\trunk\test.jdo.hbase level.

Tell me if there is a problem with the patches.

I am not sure where to document this, as documenting it in the HBaseUtils Javadoc would not expose it to the user. The HBase wiki maybe? Or do you have another proposal?

nicolas added a comment - 07/Dec/12 06:48 PM
Added support for the following attributes:
    inMemory,
    bloomFilter,
    maxVersions,
    keepDeletedCells,
    compression,
    blockCacheEnabled,
    timeToLive

To define these attributes on the @PersistenceCapable class, it is possible to use annotations. The format is similar to that of a properties file.

For example, to indicate that the bloom filter is ROW for a column family, the @Extension annotation key takes the form:
hbase.columnFamily.<name of family>.<option>= <value>
Where:
- option: One of the above defined attributes (inMemory, bloomFilter,...)
- name of family: The column family affected.
- value: The corresponding value for this attribute.
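
The key format above can be sketched as a small parser. This is only an illustration, not the code from the attached patch: the class name ColumnFamilyOptionKey and its fields are hypothetical, and splitting at the last dot (so that a family name may itself contain dots) is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not the class from the patch.
final class ColumnFamilyOptionKey {
    static final String PREFIX = "hbase.columnFamily.";

    // Option names taken from the attribute list in this comment.
    static final Set<String> OPTIONS = new LinkedHashSet<>(Arrays.asList(
            "inMemory", "bloomFilter", "maxVersions", "keepDeletedCells",
            "compression", "blockCacheEnabled", "timeToLive"));

    final String family;
    final String option;

    private ColumnFamilyOptionKey(String family, String option) {
        this.family = family;
        this.option = option;
    }

    /** Returns the parsed key, or null if the key is not a recognised column family option. */
    static ColumnFamilyOptionKey parse(String key) {
        if (key == null || !key.startsWith(PREFIX)) {
            return null;
        }
        String rest = key.substring(PREFIX.length());
        // Assumption: the option is the part after the last dot.
        int dot = rest.lastIndexOf('.');
        if (dot <= 0 || dot == rest.length() - 1) {
            return null; // missing family or missing option
        }
        String option = rest.substring(dot + 1);
        if (!OPTIONS.contains(option)) {
            return null;
        }
        return new ColumnFamilyOptionKey(rest.substring(0, dot), option);
    }

    public static void main(String[] args) {
        ColumnFamilyOptionKey k = parse("hbase.columnFamily.meta.bloomFilter");
        System.out.println(k.family + " -> " + k.option); // prints "meta -> bloomFilter"
    }
}
```

Keys that do not start with the prefix, or that name an option outside the list above, yield null in this sketch, so a caller could simply skip them.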


Example usage:
@Extensions(
        {
                @Extension(vendorName = "datanucleus", key = "hbase.columnFamily.meta.bloomFilter", value = "ROW"),
                @Extension(vendorName = "datanucleus", key = "hbase.columnFamily.meta.inMemory", value = "true")
        }
)

Here "meta" is the column family, so the above sets bloomFilter to "ROW" and inMemory to true for that family.


Example of annotated entity:
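import javax.jdo.annotations.Column;
import javax.jdo.annotations.Extension;
import javax.jdo.annotations.Extensions;
import javax.jdo.annotations.PersistenceCapable;
import javax.jdo.annotations.PrimaryKey;

@PersistenceCapable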
@Extensions(
        {
                @Extension(vendorName = "datanucleus", key = "hbase.columnFamily.meta.bloomFilter", value = "ROW"),
                @Extension(vendorName = "datanucleus", key = "hbase.columnFamily.meta.inMemory", value = "true")
        }
)
public class ServiceProfile {

    @PrimaryKey
    private long id;

    // column family data, name of attribute blob
    @Column(name = "data:blob")
    private String blob;

    // column family meta, name of attribute firstName
    @Column(name = "meta:firstName")
    private String firstName;

    // column family meta, name of attribute lastName
    @Column(name = "meta:lastName")
    private String lastName;
}

Andy Jefferson added a comment - 08/Dec/12 11:03 AM
Thanks. A clarification: is the copyright notice for the new files supposed to be Ericsson, or yourself? (One has Ericsson, and one has your name.) Any problem with just using the text of the Apache 2 license and putting your name against that? I doubt there's a problem if not, though it is obviously simpler if all code is Apache 2.

nicolas added a comment - 08/Dec/12 11:50 AM - edited
My mistake... No problem. I work for Ericsson but did this myself. Apply the Apache 2 license.

If there is anything else I can do just say.

Andy Jefferson added a comment - 08/Dec/12 12:18 PM
SVN trunk now has your patches, thanks. I made the following minimal changes:
1. Put the license as Apache 2 on all new files, with you as the original author.
2. Renamed org.datanucleus.store.hbase.extension to org.datanucleus.store.hbase.metadata since these are metadata extensions (as opposed to query extensions, or other extensions). Also made use of MetaData.VENDOR_NAME.
3. Did not use HBaseUtilsTest since it did nothing (though we could do with some tests for methods in that class, just never had the time).
4. Didn't include all of the pom.xml changes since only junit was needed (not jdo-api, datanucleus-test-framework).
5. Put the test entity under "src/java" to match other test suites.

All builds and runs ok.


The only thing needed now is to document the extensions, in the following file
http://datanucleus.svn.sourceforge.net/viewvc/datanucleus/documentation/accessplatform/trunk/src/site/xdoc/datastores/hbase.xml?revision=16046&view=markup

Could you provide either a patch to that file (to add some basic info on the available extensions) or a block of text that I can put in? Thx.

nicolas added a comment - 08/Dec/12 01:27 PM
HBaseUtilsTest was there because I thought I would start with that for the unit tests. Then I forgot to remove it. Same reason for the additions to the pom.xml. Thanks for removing those.

Great that all is in svn!

Will do the documentation a bit later today or Monday.


nicolas added a comment - 10/Dec/12 06:57 PM
Here is the documentation.

\datanucleus\documentation\accessplatform\trunk\src\site\xdoc\datastores

Tell me if it is OK.

Andy Jefferson added a comment - 10/Dec/12 08:20 PM
All in SVN trunk now, thx

nicolas added a comment - 10/Dec/12 09:08 PM
Thank you!

Tell me if there is something I could help with.

Andy Jefferson added a comment - 11/Dec/12 02:59 PM
If you want to help with any other work on DataNucleus, either look through JIRA for an issue of interest to you, or look at how you use DataNucleus (and the datastores you use it with), think about what other features it is lacking, and then add issues in JIRA to track any work.