|
||||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | |||||||||
java.lang.Objectcom.bigdata.service.ndx.AbstractSplitter
public abstract class AbstractSplitter
Basic implementation - you only need to provide resolution for the
IMetadataIndex.
| Field Summary | |
|---|---|
protected static org.apache.log4j.Logger |
log
|
| Constructor Summary | |
|---|---|
AbstractSplitter()
|
|
| Method Summary | |
|---|---|
protected abstract IMetadataIndex |
getMetadataIndex(long ts)
Return the IMetadataIndex that will be used to compute the
Splits |
LinkedList<Split> |
splitKeys(long ts,
int fromIndex,
int toIndex,
byte[][] keys)
Identify the Splits for an ordered array of keys such that there
is one Split per index partition spanned by the data. |
LinkedList<Split> |
splitKeys(long ts,
int fromIndex,
int toIndex,
KVO[] a)
Reshape the data into an unsigned byte[][] and then invoke splitKeys(long, int, int, byte[][]). |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
protected static final transient org.apache.log4j.Logger log
| Constructor Detail |
|---|
public AbstractSplitter()
| Method Detail |
|---|
protected abstract IMetadataIndex getMetadataIndex(long ts)
IMetadataIndex that will be used to compute the
Splits
ts - The timestamp of the IMetadataIndex view.
IMetadataIndex.
public LinkedList<Split> splitKeys(long ts,
int fromIndex,
int toIndex,
byte[][] keys)
Splits for an ordered array of keys such that there
is one Split per index partition spanned by the data.
Find the partition for the first key. Check the last key, if it is in the
same partition then then this is the simplest case and we can just send
the data along.
Otherwise, perform a binary search on the remaining keys looking for the index of the first key GTE the right separator key for that partition. The batch for this partition is formed from all keys from the first key for that partition up to but excluding the index position identified by the binary search (if there is a match; if there is a miss, then the binary search result needs to be converted into a key index and that will be the last key for the current partition).
Examine the next key and repeat the process until all keys have been allocated to index partitions.
Note: Split points MUST respect the "row" identity for a sparse row store, but we get that constraint by maintaining the index partition boundaries in agreement with the split point constraints for the index.
Note: The splitter always detect keys out of order and will throw an
IllegalArgumentException. This is done since it is otherwise too
easy for applications to produce unordered data which would then quietly
violate this expectation if we relied on asserts.
splitKeys in interface ISplitterts - The timestamp for the IMetadataIndex view that will be
applied to choose the Splits.fromIndex - The index of the first key in keys to be processed
(inclusive).toIndex - The index of the last key in keys to be processed.keys - An array of keys. Each key is an interpreted as an unsigned
byte[]. All keys must be non-null. The keys must be in sorted
order.
Splits that you can use to form requests based on
the identified first/last key and partition identified by this
process.Arrays.sort(Object[], int, int, java.util.Comparator),
BytesUtil.compareBytes(byte[], byte[])IMetadataIndex.find(byte[]) since that operation will be an
RMI in a distributed federation. The find(byte[] key) operation is
difficult to cache since it locates the index partition that would
span the key and many, many different keys could fit into that same
index partition. The only effective cache technique may be an LRU
that scans ~10 caches locators to see if any of them is a match
before reaching out to the remote IMetadataService. Or
perhaps the locators can be cached in a local BTree and a miss
there would result in a read through to the remote
IMetadataService but then we have the problem of figuring
out when to release locators if the client is long-lived.
public LinkedList<Split> splitKeys(long ts,
int fromIndex,
int toIndex,
KVO[] a)
splitKeys(long, int, int, byte[][]).
splitKeys in interface ISplitterts - The timestamp for the IMetadataIndex view that will be
applied to choose the Splits.fromIndex - The index of the first key in keys to be processed
(inclusive).toIndex - The index of the last key in keys to be processed.
Splits that you can use to form requests based on
the identified first/last key and partition identified by this
process.
|
||||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | |||||||||