com.bigdata.mdi
Class MetadataIndex

java.lang.Object
  extended by com.bigdata.btree.AbstractBTree
      extended by com.bigdata.btree.BTree
          extended by com.bigdata.mdi.MetadataIndex
All Implemented Interfaces:
IAutoboxBTree, IIndex, ILinearList, ILocalBTreeView, IRangeQuery, ISimpleBTree, ICommitter, IMetadataIndex

public class MetadataIndex
extends BTree
implements IMetadataIndex

A metadata index for the partitions of a distributed index. There is one metadata index for each distributed index. Each key of the metadata index is the first key that would be directed into the corresponding index partition, i.e., a separator key (this is just the standard B+Tree semantics). The values are serialized PartitionLocator objects.

Note: At this time the recommended scale-out approach for the metadata index is to place the metadata indices on a MetadataService (the same MetadataService may be used for any number of scale-out indices) and to replicate the state of the MetadataService onto failover MetadataServices. Since the MetadataIndex may grow without bound, you simply need to have enough disk on hand for it (the size requirements are quite modest). Further, the MetadataService MUST NOT be used to hold the data for the scale-out indices themselves, since the MetadataIndex cannot undergo IResourceManager.overflow().

One advantage of this approach is that the MetadataIndex is guaranteed to hold all historical states of the partition definitions for each index - effectively it is an immortal store for the partition metadata. On the other hand it is not possible to compact the metadata index without taking the database offline.

Version:
$Id: MetadataIndex.java 2265 2009-10-26 12:51:06Z thompsonbry $
Author:
Bryan Thompson
TODO:
The MetadataIndex does NOT support either overflow (it may NOT be a FusedView) NOR key-range splits. There are several issues involved:

(a) How to track the next partition identifier to be assigned to an index partition for the managed index. Currently this value is written in the MetadataIndex.MetadataIndexCheckpoint record and is propagated to the new backing store on overflow. However, if the metadata index is split into partitions then additional care MUST be taken to use only the value of that field on the 'meta-meta' index.

(b) How to locate the partitions of the metadata index itself. One way to locate the metadata-index partitions is to hash partition the metadata index; range queries can then be flooded to all partitions. The number of metadata service nodes can be changed by a suitable broadcast event in which clients have to change to the new hash basis. This feature can be generalized to provide hash partitioned indices as well as key-range partitioned indices.

A metadata index can be recovered by a distributed process running over the data services. Each data service reports all index partitions. The reports are collected and the index is rebuilt from the reports, much like a map/reduce job.


Nested Class Summary
static class MetadataIndex.MetadataIndexCheckpoint
          Extends the Checkpoint record to store the next partition identifier to be assigned by the metadata index.
static class MetadataIndex.MetadataIndexMetadata
          Extends the IndexMetadata record to hold the metadata template for the managed scale-out index.
static class MetadataIndex.PartitionLocatorTupleSerializer
          Used to (de-)serialize PartitionLocators in the MetadataIndex.
 
Nested classes/interfaces inherited from class com.bigdata.btree.BTree
BTree.Counter, BTree.LeafCursor, BTree.NodeFactory, BTree.PartitionedCounter, BTree.Stack
 
Field Summary
 
Fields inherited from class com.bigdata.btree.BTree
counter, height, nentries, nleaves, nnodes
 
Fields inherited from class com.bigdata.btree.AbstractBTree
branchingFactor, debug, DEBUG, dumpLog, ERROR_CLOSED, ERROR_LESS_THAN_ZERO, ERROR_READ_ONLY, ERROR_TOO_LARGE, ERROR_TRANSIENT, INFO, log, metadata, ndistinctOnWriteRetentionQueue, nodeSer, root, store, storeCache, writeRetentionQueue
 
Fields inherited from interface com.bigdata.btree.IRangeQuery
ALL, CURSOR, DEFAULT, DELETED, FIXED_LENGTH_SUCCESSOR, KEYS, NONE, PARALLEL, READONLY, REMOVEALL, REVERSE, VALS
 
Constructor Summary
MetadataIndex(IRawStore store, Checkpoint checkpoint, IndexMetadata metadata)
          Required ctor.
 
Method Summary
static MetadataIndex create(IRawStore store, UUID indexUUID, IndexMetadata managedIndexMetadata)
          Create a new MetadataIndex.
 PartitionLocator find(byte[] key)
          Find and return the partition spanning the given key.
 PartitionLocator get(byte[] key)
          The partition with that separator key or null (exact match on the separator key).
 MetadataIndex.MetadataIndexMetadata getIndexMetadata()
          Returns the metadata record for this btree.
 IndexMetadata getScaleOutIndexMetadata()
          The metadata template for the scale-out index managed by this metadata index.
 int incrementAndGetNextPartitionId()
          Returns the value to be assigned to the next partition created on this MetadataIndex and then increments the counter.
 boolean needsCheckpoint()
          Extended to require a checkpoint if incrementAndGetNextPartitionId() has been invoked.
 void staleLocator(PartitionLocator locator)
          Passes the notice along to the view.
 
Methods inherited from class com.bigdata.btree.BTree
_reopen, create, createTransient, fireDirtyEvent, flush, getBloomFilter, getCheckpoint, getCounter, getDirtyListener, getEntryCount, getHeight, getLastCommitTime, getLeafCount, getMutableBTree, getNodeCount, getSourceCount, getSources, getStore, handleCommit, isReadOnly, load, load, newLeafCursor, newLeafCursor, readBloomFilter, removeAll, setDirtyListener, setIndexMetadata, setLastCommitTime, setReadOnly, writeCheckpoint, writeCheckpoint2
 
Methods inherited from class com.bigdata.btree.AbstractBTree
assertNotReadOnly, assertNotTransient, close, contains, contains, dump, dump, getBranchingFactor, getBtreeCounters, getContainsTuple, getCounters, getDynamicCounterSet, getLookupTuple, getNodeSerializer, getResourceMetadata, getRightMostNode, getRoot, getRootOrFinger, getStaticCounterSet, getUtilization, getWriteTuple, indexOf, insert, insert, insert, isOpen, isTransient, keyAt, lookup, lookup, lookup, rangeCheck, rangeCopy, rangeCount, rangeCount, rangeCount, rangeCountExact, rangeCountExactWithDeleted, rangeIterator, rangeIterator, rangeIterator, rangeIterator, rangeIterator, readNodeOrLeaf, remove, remove, remove, reopen, setBTreeCounters, submit, submit, submit, toString, touch, valueAt, valueAt, writeNodeOrLeaf, writeNodeRecursive
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface com.bigdata.btree.IRangeQuery
rangeCount, rangeCount, rangeCountExact, rangeCountExactWithDeleted, rangeIterator, rangeIterator, rangeIterator
 
Methods inherited from interface com.bigdata.btree.IIndex
getCounters, getResourceMetadata, submit, submit, submit
 
Methods inherited from interface com.bigdata.btree.ISimpleBTree
contains, insert, lookup, remove
 
Methods inherited from interface com.bigdata.btree.IAutoboxBTree
contains, insert, lookup, remove
 

Constructor Detail

MetadataIndex

public MetadataIndex(IRawStore store,
                     Checkpoint checkpoint,
                     IndexMetadata metadata)
Required ctor.

Parameters:
store -
checkpoint -
metadata -
Method Detail

getIndexMetadata

public MetadataIndex.MetadataIndexMetadata getIndexMetadata()
Description copied from class: AbstractBTree
Returns the metadata record for this btree.

Note: If the B+Tree is read-only then the metadata object will be cloned to avoid potential modification. However, only a single cloned copy of the metadata record will be shared between all callers for a given instance of this class.

Specified by:
getIndexMetadata in interface IIndex
Specified by:
getIndexMetadata in interface IMetadataIndex
Overrides:
getIndexMetadata in class AbstractBTree
Returns:
The metadata record for this btree and never null.
See Also:
IMetadataIndex.getScaleOutIndexMetadata()

getScaleOutIndexMetadata

public IndexMetadata getScaleOutIndexMetadata()
Description copied from interface: IMetadataIndex
The metadata template for the scale-out index managed by this metadata index.

Specified by:
getScaleOutIndexMetadata in interface IMetadataIndex

incrementAndGetNextPartitionId

public int incrementAndGetNextPartitionId()
Returns the value to be assigned to the next partition created on this MetadataIndex and then increments the counter. The counter will be made restart-safe iff the index is dirty, the index is registered as an ICommitter, and the store on which the index is stored is committed.

Note: The metadata index uses a 32-bit partition identifier rather than the BTree.getCounter(). The reason is that the Counter uses the partition identifier in the high word and a partition local counter in the low word. Therefore we have to centralize the assignment of the partition identifier, even when the metadata index is itself split into partitions. Requests for partition identifiers need to be directed to the root partition (L0) for the MetadataIndex.
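
The word layout described in the note above can be illustrated with a short sketch. It only shows how a 32-bit partition identifier in the high word and a 32-bit partition-local counter in the low word combine into one 64-bit value. The class and method names are hypothetical and are not part of the bigdata API; the actual BTree.PartitionedCounter implementation may differ.

```java
/**
 * Illustrative sketch only (hypothetical names, not the bigdata API):
 * packs a partition identifier and a partition-local counter into a long.
 */
public class PartitionedCounterSketch {

    /** Partition identifier in the high word, local counter in the low word. */
    public static long combine(int partitionId, int localCounter) {
        // Mask the low word so a negative local counter does not sign-extend
        // into the high word.
        return ((long) partitionId << 32) | (localCounter & 0xFFFFFFFFL);
    }

    /** Recover the partition identifier (high word). */
    public static int partitionId(long value) {
        return (int) (value >>> 32);
    }

    /** Recover the partition-local counter (low word). */
    public static int localCounter(long value) {
        return (int) value;
    }

    public static void main(String[] args) {
        long v = combine(3, 7);
        System.out.println(Long.toHexString(v)); // prints 300000007
        System.out.println(partitionId(v) + " " + localCounter(v)); // prints 3 7
    }
}
```

Because the partition identifier occupies the high word, two partitions that were handed the same identifier would produce colliding counter values, which is why the assignment has to be centralized.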


create

public static MetadataIndex create(IRawStore store,
                                   UUID indexUUID,
                                   IndexMetadata managedIndexMetadata)
Create a new MetadataIndex.

Parameters:
store - The backing store.
indexUUID - The unique identifier for the metadata index.
managedIndexMetadata - The metadata template for the managed scale-out index.

needsCheckpoint

public boolean needsCheckpoint()
Extended to require a checkpoint if incrementAndGetNextPartitionId() has been invoked.

Overrides:
needsCheckpoint in class BTree
Returns:
true iff changes would be lost unless the B+Tree was flushed to the backing store using BTree.writeCheckpoint().
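
As a rough illustration of why this override is needed: handing out a partition identifier changes state that lives in the checkpoint record, even when no tuple in the index has been modified. The sketch below models that with hypothetical stand-in classes. How the real MetadataIndex detects the change is an assumption here, and none of these names come from the bigdata API.

```java
/**
 * Illustrative sketch only (hypothetical classes, not the bigdata API):
 * a checkpoint is required once a partition identifier has been handed out.
 */
public class CheckpointSketch {

    /** Stand-in for a B+Tree that is dirty only after a tuple write. */
    public static class BaseTree {
        protected boolean dirty;

        public void insert() { dirty = true; }

        public boolean needsCheckpoint() { return dirty; }

        public void writeCheckpoint() { dirty = false; }
    }

    /** Stand-in for the metadata index with its partition identifier counter. */
    public static class MetadataTree extends BaseTree {
        private int nextPartitionId;        // next identifier to hand out
        private int checkpointedPartitionId; // value as of the last checkpoint

        /** Returns the current value, then increments the counter. */
        public int incrementAndGetNextPartitionId() {
            return nextPartitionId++;
        }

        @Override
        public boolean needsCheckpoint() {
            // Assumption: compare against the last checkpointed value.
            return nextPartitionId != checkpointedPartitionId
                    || super.needsCheckpoint();
        }

        @Override
        public void writeCheckpoint() {
            checkpointedPartitionId = nextPartitionId;
            super.writeCheckpoint();
        }
    }

    public static void main(String[] args) {
        MetadataTree tree = new MetadataTree();
        System.out.println(tree.needsCheckpoint()); // prints false
        tree.incrementAndGetNextPartitionId();
        System.out.println(tree.needsCheckpoint()); // prints true
    }
}
```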

get

public PartitionLocator get(byte[] key)
Description copied from interface: IMetadataIndex
The partition with that separator key or null (exact match on the separator key).

Specified by:
get in interface IMetadataIndex
Parameters:
key - The separator key (the first key that would go into that partition).
Returns:
The partition with that separator key or null.

find

public PartitionLocator find(byte[] key)
Description copied from interface: IMetadataIndex
Find and return the partition spanning the given key.

Specified by:
find in interface IMetadataIndex
Parameters:
key - A key (optional). When null the locator for the last index partition will be returned.
Returns:
The partition spanning the given key or null if there are no partitions defined.
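
The difference between get(byte[]) and find(byte[]) can be sketched with a sorted map keyed by separator keys: get is an exact match, while find returns the entry with the greatest separator key that is less than or equal to the probe key. This is a minimal model, not the MetadataIndex implementation. The class name, the use of String in place of PartitionLocator, the unsigned byte ordering, and the empty separator key for the first partition are all assumptions made for the example.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

/**
 * Illustrative sketch only (hypothetical class, not the bigdata API):
 * models get() and find() over separator keys with a TreeMap.
 */
public class LocatorLookupSketch {

    /** Assumed ordering: unsigned lexicographic comparison of byte[] keys. */
    static final Comparator<byte[]> UNSIGNED = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return Integer.compare(a.length, b.length);
    };

    // A String stands in for a PartitionLocator.
    private final TreeMap<byte[], String> locators = new TreeMap<>(UNSIGNED);

    public void put(byte[] separatorKey, String locator) {
        locators.put(separatorKey, locator);
    }

    /** Exact match on the separator key, or null. */
    public String get(byte[] key) {
        return locators.get(key);
    }

    /** Locator of the partition spanning the key, or null if none is defined. */
    public String find(byte[] key) {
        if (locators.isEmpty()) return null;
        if (key == null) return locators.lastEntry().getValue();
        Map.Entry<byte[], String> e = locators.floorEntry(key);
        return e == null ? null : e.getValue();
    }

    /** Two partitions: [empty key, "m") and ["m", end of key space). */
    public static LocatorLookupSketch sample() {
        LocatorLookupSketch idx = new LocatorLookupSketch();
        idx.put(new byte[0], "p0");
        idx.put(new byte[] {'m'}, "p1");
        return idx;
    }

    public static void main(String[] args) {
        LocatorLookupSketch idx = sample();
        System.out.println(idx.find(new byte[] {'a'})); // prints p0
        System.out.println(idx.get(new byte[] {'a'}));  // prints null
        System.out.println(idx.find(null));             // prints p1
    }
}
```

In this model the key "a" is not a separator key, so get returns null, while find still resolves it to the partition whose key range covers it.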

staleLocator

public void staleLocator(PartitionLocator locator)
Passes the notice along to the view, which caches de-serialized locators and needs to drop them from its cache if they become stale.

Specified by:
staleLocator in interface IMetadataIndex
Parameters:
locator - The locator.


Copyright © 2006-2009 SYSTAP, LLC. All Rights Reserved.