com.bigdata.htree
Class HTree

java.lang.Object
  extended by com.bigdata.htree.AbstractHTree
      extended by com.bigdata.htree.HTree
All Implemented Interfaces:
ICheckpointProtocol, IIndexLocalCounter, ICounterSetAccess, ICommitter

public class HTree
extends AbstractHTree
implements IIndexLocalCounter

An mutable persistence capable extensible hash tree.

Version:
$Id$
Author:
Bryan Thompson
See Also:
A Robust Scheme for Multilevel Extendible Hashing by Sven Helmer, Thomas Neumann, Guido Moerkotte. ISCIS 2003: 220-227. TODO It should be possible to define an native int32 hash table in parallel to the unsigned byte[] hash table simply by having an alternative descent passing an int32 key all the way down and using the variant of getBits() method which operates on the int32 values. We could also optimize the storage and retrieval of the int32 keys, perhaps with a custom mutable bucket page and mutable bucket data implementation for int32 keys. This optimization is likely to be quite worth while as the majority of use cases for the hash tree use int32 keys. TODO It is quite possible to define a range query interface for the hash tree. You have to use an order preserving hash function, which is external to the HTree implementation. Internally, the HTree must either double-link the pages or crawl the directory structure. TODO The keys should be declared as a computed key based on the data fields in the record. The {@link HTree} supports arbitrary bit length keys, but can be optimized for int32 keys easily enough. TODO Instrument performance counters for structural modifications, insert, remove, etc. per {@link BTreeCounters}.

Nested Class Summary
 
Nested classes/interfaces inherited from class com.bigdata.htree.AbstractHTree
AbstractHTree.HTreePageStateException
 
Field Summary
protected  AtomicLong counter
          The mutable counter exposed by #getCounter()}.
protected  long nentries
          The #of entries in the HTree.
protected  long nleaves
          The #of BucketPages in the HTree.
protected  long nnodes
          The #of DirectoryPage in the HTree.
protected  long recordVersion
          The value of the record version number that will be assigned to the next node or leaf written onto the backing store.
 
Fields inherited from class com.bigdata.htree.AbstractHTree
addressBits, bucketSlots, dumpLog, ERROR_CLOSED, ERROR_LESS_THAN_ZERO, ERROR_READ_ONLY, ERROR_TOO_LARGE, ERROR_TRANSIENT, metadata, ndistinctOnWriteRetentionQueue, nodeSer, readOnly, root, store, writeRetentionQueue
 
Constructor Summary
HTree(IRawStore store, Checkpoint checkpoint, IndexMetadata metadata, boolean readOnly)
          Required constructor form for HTree and any derived subclasses.
 
Method Summary
protected  void _reopen()
          This method is responsible for setting up the root leaf (either new or read from the store), the bloom filter, etc.
 int activeBucketPages()
          Deprecated. 
 int activeDirectoryPages()
          Deprecated. 
 HTree asReadOnly()
          Returns an immutable view of this HTree.
 boolean contains(byte[] key)
          Return true iff there is at least one tuple in the hash tree having the specified key.
 boolean contains(int key)
           
 boolean contains(Object obj)
           
static HTree create(IRawStore store, HTreeIndexMetadata metadata)
          Create a new HTree or derived class.
static HTree createTransient(HTreeIndexMetadata metadata)
          Create a new HTree or derived class that is fully transient (NO backing IRawStore).
protected  void fireDirtyEvent()
          Fire an event to the listener (iff set).
 boolean flush()
          Flush the nodes of the BTree to the backing store.
 Checkpoint getCheckpoint()
          Returns the most recent Checkpoint record.
 ICounter getCounter()
          Returns an ICounter.
 IDirtyListener getDirtyListener()
          Return the IDirtyListener.
 long getEntryCount()
          The #of tuples in the HTree.
 long getLastCommitTime()
          The timestamp associated with the last IAtomicStore.commit() in which writes buffered by this index were made restart-safe on the backing store.
 long getLeafCount()
          The #of BucketPages in the HTree (not buddy hash buckets, but the pages on which they appear).
 long getMetadataAddr()
          The address at which the most recent IndexMetadata record was written.
 long getNodeCount()
          The #of DirectoryPages in the HTree (not buddy hash tables, but the pages on which they appear).
 String getPageInfo()
          Deprecated. 
 long getRecordVersion()
          The value of the record version number that will be assigned to the next node or leaf written onto the backing store.
 long getRevisionTimestamp()
           
 long getRootAddr()
          The address of the last written root of the persistent data structure -or- 0L if there is no root.
 long handleCommit(long commitTime)
          Flush dirty state to the store in preparation for an atomic commit and return the address from which the persistence capable data structure may be reloaded.
 byte[] insert(byte[] key, byte[] value)
          Insert a tuple into the hash tree.
 void insert(int key, byte[] val)
           
 void insert(Object obj)
           
protected  void insertRawTuple(com.bigdata.htree.BucketPage src, int slot)
          Insert a tuple into the hash tree.
static HTree load(IRawStore store, long addrCheckpoint, boolean readOnly)
          Load an instance of a HTree or derived class from the store.
protected  com.bigdata.htree.AbstractPage locatePageForKey(byte[] key)
           
 ITupleIterator lookupAll(byte[] key)
          Return an iterator which will visit each tuple in the index having the specified key.
 ITupleIterator lookupAll(int key)
          Return an iterator which will visit each tuple in the index having the specified key.
 byte[] lookupFirst(byte[] key)
          Return the first value for the key.
 byte[] lookupFirst(int key)
          Return the first value for the key.
 boolean needsCheckpoint()
          Return true iff changes would be lost unless the B+Tree is flushed to the backing store using writeCheckpoint().
 long rangeCount()
          The #of index entries.
protected  int recycle(long addr)
          Recycle (aka delete) the allocation.
 byte[] remove(byte[] key)
          Removes a single entry matching the key supplied.
 void removeAll()
          Remove all entries in the index.
 int removeAll(byte[] key)
           
 ICloseableIterator<?> scan()
          Visit all entries in the index in the natural order of the index.
protected  void setCheckpoint(Checkpoint checkpoint)
          Sets the checkpoint and initializes the mutable fields from the checkpoint record.
 void setDirtyListener(IDirtyListener listener)
          Set or clear the listener (there can be only one).
 void setIndexMetadata(HTreeIndexMetadata indexMetadata)
          Method updates the index metadata associated with this BTree.
 void setLastCommitTime(long lastCommitTime)
          Sets the lastCommitTime.
 long writeCheckpoint()
          Checkpoint operation must #flush() dirty nodes, dirty persistent data structures, etc, write a new Checkpoint record on the backing store, save a reference to the current Checkpoint, and return the address of that Checkpoint record.
 Checkpoint writeCheckpoint2()
          Checkpoint operation must #flush() dirty nodes, dirty persistent data structures, etc, write a new Checkpoint record on the backing store, save a reference to the current Checkpoint, and return the address of that Checkpoint record.
 
Methods inherited from class com.bigdata.htree.AbstractHTree
assertNotReadOnly, assertNotTransient, checkConsistency, checkConsistency, close, decodeRecordAddr, dump, dump, encodeRecordAddr, getAddressBits, getBtreeCounters, getCounters, getIndexMetadata, getRoot, getStore, isOpen, isReadOnly, isTransient, rangeIterator, readNodeOrLeaf, reopen, setBTreeCounters, toString, touch, values, writeNodeOrLeaf, writeNodeRecursive
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

nnodes

protected long nnodes
The #of DirectoryPage in the HTree. This is ONE (1) for a new HTree.


nleaves

protected long nleaves
The #of BucketPages in the HTree. This is one (1) for a new HTree (one directory page and one bucket page).


nentries

protected long nentries
The #of entries in the HTree. This is ZERO (0) for a new HTree.


recordVersion

protected long recordVersion
The value of the record version number that will be assigned to the next node or leaf written onto the backing store. This number is incremented each time a node or leaf is written onto the backing store. The initial value is ZERO (0). The first value assigned to a node or leaf will be ZERO (0).


counter

protected AtomicLong counter
The mutable counter exposed by #getCounter()}.

Note: This is protected so that it will be visible to Checkpoint which needs to write the actual value store in this counter into its serialized record (without the partition identifier).

Constructor Detail

HTree

public HTree(IRawStore store,
             Checkpoint checkpoint,
             IndexMetadata metadata,
             boolean readOnly)
Required constructor form for HTree and any derived subclasses. This constructor is used both to create a new HTree, and to load a HTree from the store using a Checkpoint record.

Parameters:
store - The store.
checkpoint - A Checkpoint record for that HTree.
metadata - The metadata record for that HTree.
readOnly - When true the HTree will be immutable.
See Also:
HTree#create(IRawStore, IndexMetadata), load(IRawStore, long, boolean)
Method Detail

getNodeCount

public final long getNodeCount()
Description copied from class: AbstractHTree
The #of DirectoryPages in the HTree (not buddy hash tables, but the pages on which they appear).

Specified by:
getNodeCount in class AbstractHTree

getLeafCount

public final long getLeafCount()
Description copied from class: AbstractHTree
The #of BucketPages in the HTree (not buddy hash buckets, but the pages on which they appear).

Specified by:
getLeafCount in class AbstractHTree

getEntryCount

public final long getEntryCount()
Description copied from class: AbstractHTree
The #of tuples in the HTree.

Specified by:
getEntryCount in class AbstractHTree

getCheckpoint

public final Checkpoint getCheckpoint()
Description copied from interface: ICheckpointProtocol
Returns the most recent Checkpoint record.

Specified by:
getCheckpoint in interface ICheckpointProtocol
Returns:
The most recent Checkpoint record and never null.

getRecordVersion

public final long getRecordVersion()
Description copied from interface: ICheckpointProtocol
The value of the record version number that will be assigned to the next node or leaf written onto the backing store. This number is incremented each time a node or leaf is written onto the backing store. The initial value is ZERO (0). The first value assigned to a node or leaf will be ZERO (0). TODO Nobody is actually incrementing this value right now.

Specified by:
getRecordVersion in interface ICheckpointProtocol

getMetadataAddr

public final long getMetadataAddr()
Description copied from interface: ICheckpointProtocol
The address at which the most recent IndexMetadata record was written.

Specified by:
getMetadataAddr in interface ICheckpointProtocol

getRootAddr

public final long getRootAddr()
Description copied from interface: ICheckpointProtocol
The address of the last written root of the persistent data structure -or- 0L if there is no root. A 0L return may be an indication that an empty data structure will be created on demand.

Specified by:
getRootAddr in interface ICheckpointProtocol

needsCheckpoint

public boolean needsCheckpoint()
Return true iff changes would be lost unless the B+Tree is flushed to the backing store using writeCheckpoint().

Note: In order to avoid needless checkpoints this method will return false if:

Returns:
true true iff changes would be lost unless the B+Tree was flushed to the backing store using writeCheckpoint().

setIndexMetadata

public final void setIndexMetadata(HTreeIndexMetadata indexMetadata)
Method updates the index metadata associated with this BTree. The new metadata record will be written out as part of the next index writeCheckpoint().

Note: this method should be used with caution.

Parameters:
indexMetadata - The new metadata description for the BTree.
Throws:
IllegalArgumentException - if the new value is null
IllegalArgumentException - if the new IndexMetadata record has already been written on the store - see IndexMetadata.getMetadataAddr()
UnsupportedOperationException - if the index is read-only.

handleCommit

public long handleCommit(long commitTime)
Description copied from interface: ICommitter
Flush dirty state to the store in preparation for an atomic commit and return the address from which the persistence capable data structure may be reloaded.

Specified by:
handleCommit in interface ICommitter
Parameters:
commitTime - The timestamp assigned to the commit.
Returns:
The address of the record from which the persistence capable data structure may be reloaded. If no changes have been made then the previous address should be returned as it is still valid.

getCounter

public ICounter getCounter()
Returns an ICounter. The ICounter is mutable iff the BTree is mutable. All ICounters returned by this method report and increment the same underlying counter.

Note: When the BTree is part of a scale-out index then the counter will assign values within a namespace defined by the partition identifier.

Specified by:
getCounter in interface IIndexLocalCounter

setCheckpoint

protected void setCheckpoint(Checkpoint checkpoint)
Sets the checkpoint and initializes the mutable fields from the checkpoint record. In order for this operation to be atomic, the caller must be synchronized on the HTree or otherwise guaranteed to have exclusive access, e.g., during the ctor or when the HTree is mutable and access is therefore required to be single-threaded.


_reopen

protected void _reopen()
Description copied from class: AbstractHTree
This method is responsible for setting up the root leaf (either new or read from the store), the bloom filter, etc. It is invoked by AbstractHTree.reopen() once AbstractHTree.root has been show to be null with double-checked locking. When invoked in this context, the caller is guaranteed to hold a lock on this. This is done to ensure that at most one thread gets to re-open the index from the backing store.

Specified by:
_reopen in class AbstractHTree

getLastCommitTime

public final long getLastCommitTime()
Description copied from class: AbstractHTree
The timestamp associated with the last IAtomicStore.commit() in which writes buffered by this index were made restart-safe on the backing store. The lastCommitTime is set when the index is loaded from the backing store and updated after each commit. It is ZERO (0L) when HTree is first created and will remain ZERO (0L) until the HTree is committed. If the backing store does not support atomic commits, then this value will always be ZERO (0L).

Specified by:
getLastCommitTime in class AbstractHTree

getRevisionTimestamp

public final long getRevisionTimestamp()

setLastCommitTime

public final void setLastCommitTime(long lastCommitTime)
Description copied from interface: ICheckpointProtocol
Sets the lastCommitTime.

Note: The lastCommitTime is set by a combination of the AbstractJournal and Name2Addr based on the actual commitTime of the commit during which an Name2Addr.Entry for that index was last committed. It is set for both historical index reads and unisolated index reads using Name2Addr.Entry.commitTime. The lastCommitTime for an unisolated index will advance as commits are performed with that index.

Specified by:
setLastCommitTime in interface ICheckpointProtocol
Parameters:
lastCommitTime - The timestamp of the last committed state of this index.

getDirtyListener

public final IDirtyListener getDirtyListener()
Description copied from interface: ICheckpointProtocol
Return the IDirtyListener.

Specified by:
getDirtyListener in interface ICheckpointProtocol

setDirtyListener

public final void setDirtyListener(IDirtyListener listener)
Description copied from interface: ICheckpointProtocol
Set or clear the listener (there can be only one).

Specified by:
setDirtyListener in interface ICheckpointProtocol
Parameters:
listener - The listener.

fireDirtyEvent

protected final void fireDirtyEvent()
Fire an event to the listener (iff set).


flush

public final boolean flush()
Flush the nodes of the BTree to the backing store. After invoking this method the root of the BTree will be clean.

Note: This does NOT flush all persistent state. See writeCheckpoint() which also handles the optional bloom filter, the IndexMetadata, and the Checkpoint record itself.

Returns:
true if anything was written.

asReadOnly

public HTree asReadOnly()
Returns an immutable view of this HTree. If BTree is already read-only, then this instance is returned. Otherwise, a read-only BTree is loaded from the last checkpoint and returned.

Throws:
IllegalStateException - If the BTree is dirty.
TODO:
The Checkpoint could hold a WeakReference singleton to the read-only view loaded from that checkpoint.

writeCheckpoint

public final long writeCheckpoint()
Checkpoint operation must #flush() dirty nodes, dirty persistent data structures, etc, write a new Checkpoint record on the backing store, save a reference to the current Checkpoint, and return the address of that Checkpoint record.

Note: A checkpoint by itself is NOT an atomic commit. The commit protocol is at the store level and uses Checkpoints to ensure that the state of the persistence capable data structure is current on the backing store.

Specified by:
writeCheckpoint in interface ICheckpointProtocol
Returns:
The address at which the Checkpoint record for the persistence capable was written onto the store. The data structure can be reloaded from this Checkpoint record.
See Also:
#load(IRawStore, long)

writeCheckpoint2

public final Checkpoint writeCheckpoint2()
Checkpoint operation must #flush() dirty nodes, dirty persistent data structures, etc, write a new Checkpoint record on the backing store, save a reference to the current Checkpoint, and return the address of that Checkpoint record.

Note: A checkpoint by itself is NOT an atomic commit. The commit protocol is at the store level and uses Checkpoints to ensure that the state of the persistence capable data structure is current on the backing store.

Specified by:
writeCheckpoint2 in interface ICheckpointProtocol
Returns:
The Checkpoint record for the persistent data structure which was written onto the store. The persistent data structure can be reloaded from this Checkpoint record.
See Also:
#load(IRawStore, long)

contains

public boolean contains(Object obj)

contains

public boolean contains(int key)

contains

public boolean contains(byte[] key)
Return true iff there is at least one tuple in the hash tree having the specified key.

Parameters:
key - The key.
Returns:
true iff the hash tree contains at least one tuple with that key.
Throws:
IllegalArgumentException - if the key is null. TODO Parallel to insert(), consider a contains() signature which permits testing for a specific value as well. Maybe in a wrapper class?

lookupFirst

public byte[] lookupFirst(int key)
Return the first value for the key.

Parameters:
key - The key.
Returns:
The first value for the key -or- null if there are no tuples in the index having that key. Note that the return value is not diagnostic if the application allows null values into the index.

lookupFirst

public byte[] lookupFirst(byte[] key)
Return the first value for the key.

Parameters:
key - The key.
Returns:
The first value for the key -or- null if there are no tuples in the index having that key. Note that the return value is not diagnostic if the application allows null values into the index.

lookupAll

public ITupleIterator lookupAll(int key)
Return an iterator which will visit each tuple in the index having the specified key.

Parameters:
key - The key.
Returns:
The iterator and never null.

lookupAll

public ITupleIterator lookupAll(byte[] key)
Return an iterator which will visit each tuple in the index having the specified key.

Parameters:
key - The key.
Returns:
The iterator and never null.

insert

public void insert(Object obj)

insert

public void insert(int key,
                   byte[] val)

insert

public byte[] insert(byte[] key,
                     byte[] value)
Insert a tuple into the hash tree. Tuples with duplicate keys and even tuples with duplicate keys and values are allowed and will result in multiple tuples.

Parameters:
key - The key.
value - The value.
Returns:
null (always). TODO If the application wants to restrict the hash tree to such that it does not contain duplicate tuples then it must first search in the tree for an exact match (key and value). It is easier to do that from within the insert logic so expand the method signature to pass an insert enum {ALLDUPS,DUPKEYS,NODUPS}.

insertRawTuple

protected void insertRawTuple(com.bigdata.htree.BucketPage src,
                              int slot)
Insert a tuple into the hash tree. Tuples with duplicate keys and even tuples with duplicate keys and values are allowed and will result in multiple tuples.

Parameters:
src - The source BucketPage src
slot - The slot in the source BucketPage whose tuple will be inserted (really copied).

remove

public byte[] remove(byte[] key)
Removes a single entry matching the key supplied. The HTree provides no guarantee that the order in which a value with a duplicate key is added will be the order in which it would be removed - FIFO. It is more likely to be in LIFO

Parameters:
key -
Returns:
the value removed or null if none found

removeAll

public int removeAll(byte[] key)

locatePageForKey

protected com.bigdata.htree.AbstractPage locatePageForKey(byte[] key)

scan

public final ICloseableIterator<?> scan()
Description copied from interface: ICheckpointProtocol
Visit all entries in the index in the natural order of the index.

Specified by:
scan in interface ICheckpointProtocol

removeAll

public void removeAll()
Description copied from interface: ICheckpointProtocol
Remove all entries in the index.

Specified by:
removeAll in interface ICheckpointProtocol

rangeCount

public long rangeCount()
Description copied from class: AbstractHTree
The #of index entries.

Specified by:
rangeCount in interface ICheckpointProtocol
Specified by:
rangeCount in class AbstractHTree
Returns:
The #of tuples in the index.
See Also:
IRangeQuery.rangeCount()

create

public static HTree create(IRawStore store,
                           HTreeIndexMetadata metadata)
Create a new HTree or derived class. This method works by writing the IndexMetadata record on the store and then loading the HTree from the IndexMetadata record.

Parameters:
store - The store.
metadata - The metadata record.
Returns:
The newly created HTree.
Throws:
IllegalStateException - If you attempt to create two HTree objects from the same metadata record since the metadata address will have already been noted on the IndexMetadata object. You can use IndexMetadata.clone() to obtain a new copy of the metadata object with the metadata address set to 0L.
IllegalStateException - if the IndexTypeEnum in the supplied IndexMetadata object is not IndexTypeEnum.BTree.
See Also:
load(IRawStore, long, boolean)

createTransient

public static HTree createTransient(HTreeIndexMetadata metadata)
Create a new HTree or derived class that is fully transient (NO backing IRawStore).

Fully transient HTrees provide the functionality of a HTree without a backing persistence store. Internally, reachable nodes and leaves of the transient HTree use hard references to ensure that remain strongly reachable. Deleted nodes and leaves simply clear their references and will be swept by the garbage collector shortly thereafter.

Operations which attempt to write on the backing store will fail.

While nodes and leaves are never persisted, the keys and values of the transient HTree are unsigned byte[]s. This means that application keys and values are always converted into unsigned byte[]s before being stored in the HTree. Hence if an object that is inserted into the HTree and then looked up using the HTree API, you WILL NOT get back the same object reference.

Note: CLOSING A TRANSIENT INDEX WILL DISCARD ALL DATA!

Parameters:
metadata - The metadata record.
Returns:
The transient HTree.

load

public static HTree load(IRawStore store,
                         long addrCheckpoint,
                         boolean readOnly)
Load an instance of a HTree or derived class from the store. The HTree or derived class MUST declare a constructor with the following signature: className(IRawStore store, Checkpoint checkpoint, IndexMetadata metadata, boolean readOnly)

Parameters:
store - The store.
addrCheckpoint - The address of a Checkpoint record for the index.
readOnly - When true the BTree will be marked as read-only. Marking has some advantages relating to the locking scheme used by Node.getChild(int) since the root node is known to be read-only at the time that it is allocated as per-child locking is therefore in place for all nodes in the read-only BTree. It also results in much higher concurrency for AbstractBTree.touch(AbstractNode).
Returns:
The HTree or derived class loaded from that Checkpoint record.
Throws:
IllegalArgumentException - if store is null.

recycle

protected int recycle(long addr)
Recycle (aka delete) the allocation. This method also adjusts the #of bytes released in the BTreeCounters.

Parameters:
addr - The address to be recycled.
Returns:
The #of bytes which were recycled and ZERO (0) if the address is IAddressManager.NULL.

activeBucketPages

@Deprecated
public final int activeBucketPages()
Deprecated. 

Move to test suite.


activeDirectoryPages

@Deprecated
public final int activeDirectoryPages()
Deprecated. 

Move to test suite.


getPageInfo

@Deprecated
public String getPageInfo()
Deprecated. 

Move to test suite.



Copyright © 2006-2012 SYSTAP, LLC. All Rights Reserved.