java.lang.Object
  com.bigdata.btree.AbstractBTree
    com.bigdata.btree.BTree
      com.bigdata.mdi.MetadataIndex
public class MetadataIndex
extends BTree
implements IMetadataIndex
A metadata index for the partitions of a distributed index. There is one
metadata index for each distributed index. Each key of the metadata index is
the first key that would be directed into the corresponding index segment,
i.e., a separator key (this is just the standard B+Tree semantics).
The values are serialized PartitionLocator objects.
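The separator-key layout described above can be illustrated with a minimal sketch that uses only the Java standard library. It is not the bigdata implementation: the class `SeparatorKeyLookupSketch`, its helper `k`, and the use of an `Integer` partition id in place of a serialized PartitionLocator are all assumptions made for illustration. It shows how an exact-match lookup (as in `get`) differs from a spanning lookup (as in `find`), which selects the greatest separator key that is less than or equal to the probe key.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a metadata index; NOT the bigdata API. */
final class SeparatorKeyLookupSketch {

    // Keys are compared as unsigned byte[]s (Arrays.compareUnsigned, Java 9+).
    private static final Comparator<byte[]> UNSIGNED = Arrays::compareUnsigned;

    // Separator key -> partition id (assumption: an id stands in for a locator).
    private final TreeMap<byte[], Integer> partitions = new TreeMap<>(UNSIGNED);

    void put(byte[] separatorKey, int partitionId) {
        partitions.put(separatorKey, partitionId);
    }

    /** Exact match on the separator key, or null. */
    Integer get(byte[] separatorKey) {
        return partitions.get(separatorKey);
    }

    /**
     * The partition spanning the key: the entry with the greatest separator
     * key less than or equal to the key. A null key selects the last partition.
     */
    Integer find(byte[] key) {
        Map.Entry<byte[], Integer> e =
                (key == null) ? partitions.lastEntry() : partitions.floorEntry(key);
        return (e == null) ? null : e.getValue();
    }

    /** Helper (hypothetical): encode a string as a key. */
    static byte[] k(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        SeparatorKeyLookupSketch mdi = new SeparatorKeyLookupSketch();
        mdi.put(new byte[0], 0); // first partition starts at the empty key
        mdi.put(k("m"), 1);      // second partition starts at "m"
        System.out.println(mdi.find(k("apple"))); // spanned by partition 0
        System.out.println(mdi.get(k("apple")));  // not a separator key
    }
}
```

In this sketch `find` never misses once a partition with the empty separator key exists, while `get` only succeeds for keys that are themselves separator keys.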
Note: At this time the recommended scale-out approach for the metadata index
is to place the metadata indices on a MetadataService (the same
MetadataService may be used for an arbitrary number of scale-out indices)
and to replicate the state of the MetadataService onto
failover MetadataServices. Since the MetadataIndex may grow
without bound, you simply need to have enough disk on hand for it (the size
requirements are quite modest). Further, the MetadataService MUST NOT
be used to hold the data for the scale-out indices themselves since the
MetadataIndex cannot undergo IResourceManager.overflow().
One advantage of this approach is that the MetadataIndex is
guaranteed to hold all historical states of the partition definitions for
each index - effectively it is an immortal store for the partition metadata.
On the other hand it is not possible to compact the metadata index without
taking the database offline.
MetadataIndex supports NEITHER overflow (it may NOT
be a FusedView) NOR key-range splits. There are several issues
involved:
(a) How to track the next partition identifier to be assigned to an
index partition for the managed index. Currently this value is written
in the MetadataIndex.MetadataIndexCheckpoint record and is propagated to the
new backing store on overflow. However, if the metadata index is split
into partitions then additional care MUST be taken to use only the
value of that field on the 'meta-meta' index.
(b) How to locate the partitions of the metadata index itself.
One way to locate the metadata-index partitions is to hash partition the metadata index; range queries can then be flooded to all partitions. The number of metadata service nodes can be changed by a suitable broadcast event in which clients have to change to the new hash basis. This feature can be generalized to provide hash partitioned indices as well as key-range partitioned indices.
A metadata index can be recovered by a distributed process running over the data services. Each data service reports all of its index partitions. The reports are collected and the index is rebuilt from the reports, much like a map/reduce job.
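The hash-partitioning idea in (b) can be sketched as follows. This is only an illustration of the routing rule, not a feature of the class documented here: the class `HashPartitionSketch`, both method names, and the choice of `Arrays.hashCode` as the hash function are assumptions. A point lookup is routed to a single partition, while a range query has to go to every partition because hashing does not preserve key order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical routing rule for a hash-partitioned metadata index. */
final class HashPartitionSketch {

    /**
     * Maps a key onto one of n partitions. The value n plays the role of the
     * "hash basis": changing it changes where keys are routed, which is why
     * clients would have to be told about the new basis.
     */
    static int partitionOf(byte[] key, int n) {
        // floorMod keeps the result in [0, n) even for negative hash codes.
        return Math.floorMod(Arrays.hashCode(key), n);
    }

    /** A range query cannot be routed by hash, so it is sent to all partitions. */
    static List<Integer> partitionsForRange(int n) {
        List<Integer> all = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            all.add(i);
        }
        return all;
    }

    public static void main(String[] args) {
        byte[] key = {1, 2, 3};
        System.out.println("point lookup -> partition " + partitionOf(key, 4));
        System.out.println("range query  -> partitions " + partitionsForRange(4));
    }
}
```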
| Nested Class Summary | |
|---|---|
| static class | MetadataIndex.MetadataIndexCheckpoint: Extends the Checkpoint record to store the next partition identifier to be assigned by the metadata index. |
| static class | MetadataIndex.MetadataIndexMetadata: Extends the IndexMetadata record to hold the metadata template for the managed scale-out index. |
| static class | MetadataIndex.PartitionLocatorTupleSerializer: Used to (de-)serialize PartitionLocators in the MetadataIndex. |
| Nested classes/interfaces inherited from class com.bigdata.btree.BTree |
|---|
BTree.Counter, BTree.LeafCursor, BTree.NodeFactory, BTree.PartitionedCounter, BTree.Stack |
| Field Summary |
|---|
| Fields inherited from class com.bigdata.btree.BTree |
|---|
counter, height, nentries, nleaves, nnodes |
| Fields inherited from class com.bigdata.btree.AbstractBTree |
|---|
branchingFactor, debug, DEBUG, dumpLog, ERROR_CLOSED, ERROR_LESS_THAN_ZERO, ERROR_READ_ONLY, ERROR_TOO_LARGE, ERROR_TRANSIENT, INFO, log, metadata, ndistinctOnWriteRetentionQueue, nodeSer, root, store, storeCache, writeRetentionQueue |
| Fields inherited from interface com.bigdata.btree.IRangeQuery |
|---|
ALL, CURSOR, DEFAULT, DELETED, FIXED_LENGTH_SUCCESSOR, KEYS, NONE, PARALLEL, READONLY, REMOVEALL, REVERSE, VALS |
| Constructor Summary | |
|---|---|
| MetadataIndex(IRawStore store, Checkpoint checkpoint, IndexMetadata metadata) | Required ctor. |
| Method Summary | |
|---|---|
| static MetadataIndex | create(IRawStore store, UUID indexUUID, IndexMetadata managedIndexMetadata): Create a new MetadataIndex. |
| PartitionLocator | find(byte[] key): Find and return the partition spanning the given key. |
| PartitionLocator | get(byte[] key): The partition with that separator key or null (exact match on the separator key). |
| MetadataIndex.MetadataIndexMetadata | getIndexMetadata(): Returns the metadata record for this btree. |
| IndexMetadata | getScaleOutIndexMetadata(): The metadata template for the scale-out index managed by this metadata index. |
| int | incrementAndGetNextPartitionId(): Returns the value to be assigned to the next partition created on this MetadataIndex and then increments the counter. |
| boolean | needsCheckpoint(): Extended to require a checkpoint if incrementAndGetNextPartitionId() has been invoked. |
| void | staleLocator(PartitionLocator locator): Passes the notice along to the view. |
| Methods inherited from class com.bigdata.btree.BTree |
|---|
_reopen, create, createTransient, fireDirtyEvent, flush, getBloomFilter, getCheckpoint, getCounter, getDirtyListener, getEntryCount, getHeight, getLastCommitTime, getLeafCount, getMutableBTree, getNodeCount, getSourceCount, getSources, getStore, handleCommit, isReadOnly, load, load, newLeafCursor, newLeafCursor, readBloomFilter, removeAll, setDirtyListener, setIndexMetadata, setLastCommitTime, setReadOnly, writeCheckpoint, writeCheckpoint2 |
| Methods inherited from class com.bigdata.btree.AbstractBTree |
|---|
assertNotReadOnly, assertNotTransient, close, contains, contains, dump, dump, getBranchingFactor, getBtreeCounters, getContainsTuple, getCounters, getDynamicCounterSet, getLookupTuple, getNodeSerializer, getResourceMetadata, getRightMostNode, getRoot, getRootOrFinger, getStaticCounterSet, getUtilization, getWriteTuple, indexOf, insert, insert, insert, isOpen, isTransient, keyAt, lookup, lookup, lookup, rangeCheck, rangeCopy, rangeCount, rangeCount, rangeCount, rangeCountExact, rangeCountExactWithDeleted, rangeIterator, rangeIterator, rangeIterator, rangeIterator, rangeIterator, readNodeOrLeaf, remove, remove, remove, reopen, setBTreeCounters, submit, submit, submit, toString, touch, valueAt, valueAt, writeNodeOrLeaf, writeNodeRecursive |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Methods inherited from interface com.bigdata.btree.IRangeQuery |
|---|
rangeCount, rangeCount, rangeCountExact, rangeCountExactWithDeleted, rangeIterator, rangeIterator, rangeIterator |
| Methods inherited from interface com.bigdata.btree.IIndex |
|---|
getCounters, getResourceMetadata, submit, submit, submit |
| Methods inherited from interface com.bigdata.btree.ISimpleBTree |
|---|
contains, insert, lookup, remove |
| Methods inherited from interface com.bigdata.btree.IAutoboxBTree |
|---|
contains, insert, lookup, remove |
| Constructor Detail |
|---|
public MetadataIndex(IRawStore store,
Checkpoint checkpoint,
IndexMetadata metadata)
Parameters:
store -
checkpoint -
metadata -
| Method Detail |
|---|
public MetadataIndex.MetadataIndexMetadata getIndexMetadata()
Description copied from class: AbstractBTree
Returns the metadata record for this btree.
Note: If the B+Tree is read-only then the metadata object will be cloned to avoid potential modification. However, only a single cloned copy of the metadata record will be shared between all callers for a given instance of this class.
Specified by: getIndexMetadata in interface IIndex
Specified by: getIndexMetadata in interface IMetadataIndex
Overrides: getIndexMetadata in class AbstractBTree
Returns: The metadata record for this btree and never null.
See Also: IMetadataIndex.getScaleOutIndexMetadata()
public IndexMetadata getScaleOutIndexMetadata()
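Description copied from interface: IMetadataIndex
The metadata template for the scale-out index managed by this metadata index.
Specified by: getScaleOutIndexMetadata in interface IMetadataIndex
public int incrementAndGetNextPartitionId()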
Returns the value to be assigned to the next partition created on this
MetadataIndex and then increments the counter. The counter will
be made restart-safe iff the index is dirty, the index is registered as
an ICommitter, and the store on which the index is stored is
committed.
Note: The metadata index uses a 32-bit partition identifier rather than
the BTree.getCounter(). The reason is that the Counter uses
the partition identifier in the high word and a partition local counter
in the low word. Therefore we have to centralize the assignment of the
partition identifier, even when the metadata index is itself split into
partitions. Requests for partition identifiers need to be directed to the
root partition (L0) for the MetadataIndex.
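The note above can be made concrete with a small sketch. It assumes a 64-bit counter whose high 32 bits hold the partition identifier and whose low 32 bits hold a partition-local counter, which is how the note describes the layout. The class `PartitionCounterSketch`, its method names other than those mirrored from this page, and the starting value of 0 are hypothetical and are not taken from the bigdata source. The sketch also shows a single allocator handing out identifiers, which is why assignment has to be centralized.

```java
/** Hypothetical illustration of the counter layout; NOT the bigdata classes. */
final class PartitionCounterSketch {

    // Assumption: identifiers start at 0 in this sketch.
    private int nextPartitionId = 0;

    // Set once an identifier has been handed out and not yet checkpointed.
    private boolean dirty = false;

    /** Returns the value for the next partition, then increments the counter. */
    int incrementAndGetNextPartitionId() {
        dirty = true;
        return nextPartitionId++;
    }

    /** A checkpoint is needed if an identifier was assigned since the last one. */
    boolean needsCheckpoint() {
        return dirty;
    }

    /** Partition identifier in the high word, partition-local counter in the low word. */
    static long pack(int partitionId, int localCounter) {
        return ((long) partitionId << 32) | (localCounter & 0xFFFFFFFFL);
    }

    static int partitionId(long counter) {
        return (int) (counter >>> 32);
    }

    static int localCounter(long counter) {
        return (int) counter;
    }

    public static void main(String[] args) {
        PartitionCounterSketch allocator = new PartitionCounterSketch();
        int first = allocator.incrementAndGetNextPartitionId();
        int second = allocator.incrementAndGetNextPartitionId();
        // Two partitions can both use local counter 7 without colliding,
        // provided their partition identifiers differ.
        System.out.println(Long.toHexString(pack(first, 7)));
        System.out.println(Long.toHexString(pack(second, 7)));
    }
}
```

If two allocators handed out the same partition identifier, the packed counter values of the two partitions would collide, which is the reason the note directs all requests to one place.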
public static MetadataIndex create(IRawStore store,
UUID indexUUID,
IndexMetadata managedIndexMetadata)
Create a new MetadataIndex.
Parameters:
store - The backing store.
indexUUID - The unique identifier for the metadata index.
managedIndexMetadata - The metadata template for the managed scale-out index.
public boolean needsCheckpoint()
Extended to require a checkpoint if incrementAndGetNextPartitionId() has been
invoked.
Overrides: needsCheckpoint in class BTree
Returns: true iff changes would be lost unless the
B+Tree was flushed to the backing store using
BTree.writeCheckpoint().
public PartitionLocator get(byte[] key)
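Description copied from interface: IMetadataIndex
The partition with that separator key or null (exact match
on the separator key).
Specified by: get in interface IMetadataIndex
Parameters:
key - The separator key (the first key that would go into that
partition).
Returns: The partition with that separator key or null.
public PartitionLocator find(byte[] key)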
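Description copied from interface: IMetadataIndex
Find and return the partition spanning the given key.
Specified by: find in interface IMetadataIndex
Parameters:
key - A key (optional). When null the locator for the
last index partition will be returned.
Returns: The partition spanning the given key or null if
there are no partitions defined.
public void staleLocator(PartitionLocator locator)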
Passes the notice along to the view. It caches de-serialized
locators and needs to drop them from its cache if they become stale.
Specified by: staleLocator in interface IMetadataIndex
Parameters:
locator - The locator.
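Why a cache of de-serialized locators needs a stale notice can be shown with a minimal sketch. Everything in it is hypothetical: the class `LocatorCacheSketch`, the nested `Locator` type, the `"id@service"` text encoding, and the use of a string key are assumptions made for illustration, and the real method takes a PartitionLocator rather than a key.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical cache of de-serialized locators; NOT the bigdata view class. */
final class LocatorCacheSketch {

    /** Hypothetical de-serialized form of a locator. */
    static final class Locator {
        final int partitionId;
        final String dataService;

        Locator(int partitionId, String dataService) {
            this.partitionId = partitionId;
            this.dataService = dataService;
        }
    }

    // Serialized locators as stored in the index (assumed "id@service" encoding).
    private final Map<String, String> stored = new HashMap<>();

    // De-serialized locators, kept to avoid decoding on every read.
    private final Map<String, Locator> cache = new HashMap<>();

    /** Number of times a locator was decoded from its serialized form. */
    int decodes = 0;

    /** Updates the stored record only; the cached copy is left untouched. */
    void write(String separatorKey, int partitionId, String dataService) {
        stored.put(separatorKey, partitionId + "@" + dataService);
    }

    /** Returns the cached locator if present, otherwise decodes and caches it. */
    Locator read(String separatorKey) {
        Locator cached = cache.get(separatorKey);
        if (cached != null) {
            return cached;
        }
        String s = stored.get(separatorKey);
        if (s == null) {
            return null;
        }
        int at = s.indexOf('@');
        decodes++;
        Locator loc = new Locator(Integer.parseInt(s.substring(0, at)), s.substring(at + 1));
        cache.put(separatorKey, loc);
        return loc;
    }

    /** Drops the cached copy so that the next read decodes the current record. */
    void staleLocator(String separatorKey) {
        cache.remove(separatorKey);
    }

    public static void main(String[] args) {
        LocatorCacheSketch view = new LocatorCacheSketch();
        view.write("m", 1, "ds1");
        view.read("m");            // decoded and cached
        view.write("m", 1, "ds2"); // partition moved; cache is now stale
        view.staleLocator("m");    // notice received, cached copy dropped
        System.out.println(view.read("m").dataService);
    }
}
```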