com.bigdata.relation.accesspath
Class AbstractAccessPath<R>

java.lang.Object
  extended by com.bigdata.relation.accesspath.AbstractAccessPath<R>
All Implemented Interfaces:
IAccessPath<R>, Iterable<R>
Direct Known Subclasses:
MagicAccessPath, SPOAccessPath

public abstract class AbstractAccessPath<R>
extends Object
implements IAccessPath<R>

Abstract base class for type-specific IAccessPath implementations.

Version:
$Id: AbstractAccessPath.java 2265 2009-10-26 12:51:06Z thompsonbry $
Author:
Bryan Thompson
TODO:
This needs to be more generalized so that you can use a index that is best without being optimal by specifying a low-level filter to be applied to the index. That requires a means to dynamically filter out the elements we do not want from the key-range scan - the filtering should of course be done at the IDataService.

Nested Class Summary
static class AbstractAccessPath.ElementFilter<R>
          Align the predicate's IElementFilter constraint with ITupleFilter so that the IElementFilter can be evaluated close to the data by an ITupleIterator.
 
Field Summary
protected  int chunkCapacity
           
protected  int chunkOfChunksCapacity
           
protected  FilterConstructor<R> filter
          The filter derived from the IElementFilter.
protected  int flags
          Iterator flags.
protected  int fullyBufferedReadThreshold
           
protected  IIndexManager indexManager
          Access to the index, resource locator, executor service, etc.
protected  IKeyOrder<R> keyOrder
          Index order (the relation namespace plus the index order and the option partitionId constraint on the predicate identify the index).
protected static org.apache.log4j.Logger log
           
protected static int MAX_FULLY_BUFFERED_READ_LIMIT
          The maximum limit that is allowed for a fully-buffered read.
protected  IIndex ndx
          The index.
protected  IPredicate<R> predicate
          Predicate (the resource name on the predicate is the relation namespace).
protected  long timestamp
          Timestamp of the view.
 
Constructor Summary
protected AbstractAccessPath(IIndexManager indexManager, long timestamp, IPredicate<R> predicate, IKeyOrder<R> keyOrder, IIndex ndx, int flags, int chunkOfChunksCapacity, int chunkCapacity, int fullyBufferedReadThreshold)
           
 
Method Summary
protected  void assertInitialized()
           
protected  IChunkedOrderedIterator<R> asynchronousIterator(Iterator<R> src)
          Asynchronous read using a BlockingBuffer.
 int getChunkCapacity()
           
 int getChunkOfChunksCapacity()
           
 byte[] getFromKey()
          The key corresponding to the inclusive lower bound for the IAccessPath null if there is no lower bound.
 IIndex getIndex()
          The index selected for the access path.
 IIndexManager getIndexManager()
           
 IKeyOrder<R> getKeyOrder()
          The order in which the elements will be visited.
 IPredicate<R> getPredicate()
          The constraints on the IAccessPath.
 long getTimestamp()
           
 byte[] getToKey()
          The key corresponding to the exclusive upper bound for the IAccessPath -or- null if there is no upper bound.
 AbstractAccessPath<R> init()
          Required post-ctor initialization.
 boolean isEmpty()
          True iff the access path is empty (there are no matches for the IPredicate) This is more conclusive than #rangeCount() since you MAY have a non-zero range count when the key range is in fact empty (there may be "deleted" index entries within the key range).
 boolean isFullyBoundForKey()
          true iff all elements in the predicate which are required to generate the key are bound to constants.
 IChunkedOrderedIterator<R> iterator()
          An iterator visiting elements using the natural order of the index selected for the IPredicate.
 IChunkedOrderedIterator<R> iterator(int limit, int capacity)
          An iterator visiting elements using the natural order of the index selected for the IPredicate.
 IChunkedOrderedIterator<R> iterator(long offset, long limit, int capacity)
          An iterator visiting elements using the natural order of the index selected for the IPredicate.
 long rangeCount(boolean exact)
          Note: When there is a IPredicate.getConstraint() on the IPredicate the exact range count will apply that constraint as a filter during traversal.
 ITupleIterator<R> rangeIterator()
          The raw iterator for traversing the selected index within the key range implied by IPredicate.
protected  ITupleIterator<R> rangeIterator(int capacity, int flags, IFilterConstructor<R> filter)
           
 long removeAll()
          This implementation removes all tuples that would be visited by the access path from the backing index.
protected  void setFromKey(byte[] fromKey)
           
protected  void setToKey(byte[] toKey)
           
protected  IChunkedOrderedIterator<R> synchronousIterator(long offset, long limit, Iterator<R> src)
          Fully buffers all elements that would be visited by the IAccessPath iterator.
 String toString()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

log

protected static final org.apache.log4j.Logger log

indexManager

protected final IIndexManager indexManager
Access to the index, resource locator, executor service, etc.


timestamp

protected final long timestamp
Timestamp of the view.


predicate

protected final IPredicate<R> predicate
Predicate (the resource name on the predicate is the relation namespace).


keyOrder

protected final IKeyOrder<R> keyOrder
Index order (the relation namespace plus the index order and the option partitionId constraint on the predicate identify the index).


ndx

protected final IIndex ndx
The index.


flags

protected final int flags
Iterator flags.


chunkOfChunksCapacity

protected final int chunkOfChunksCapacity

chunkCapacity

protected final int chunkCapacity

fullyBufferedReadThreshold

protected final int fullyBufferedReadThreshold

MAX_FULLY_BUFFERED_READ_LIMIT

protected static final int MAX_FULLY_BUFFERED_READ_LIMIT
The maximum limit that is allowed for a fully-buffered read. The asynchronousIterator(Iterator) will always be used above this limit.

See Also:
Constant Field Values

filter

protected final FilterConstructor<R> filter
The filter derived from the IElementFilter.

Constructor Detail

AbstractAccessPath

protected AbstractAccessPath(IIndexManager indexManager,
                             long timestamp,
                             IPredicate<R> predicate,
                             IKeyOrder<R> keyOrder,
                             IIndex ndx,
                             int flags,
                             int chunkOfChunksCapacity,
                             int chunkCapacity,
                             int fullyBufferedReadThreshold)
Parameters:
indexManager - Access to the indices, resource locators, executor service, etc.
timestamp - The timestamp of the index view.
predicate - The constraints on the access path.
keyOrder - The order in which the elements would be visited for this access path.
ndx - The index on which the access path is reading.
flags - The default IRangeQuery flags.
chunkOfChunksCapacity - The #of chunks that can be held by an IBuffer that is the target or one or more producers. This is generally a small number on the order of the #of parallel producers that might be writing on the IBuffer since the capacity of the UnsynchronizedArrayBuffers is already quite large (10k or better elements, defining a single "chunk" from a single producer).
chunkCapacity - The maximum size for a single chunk (generally 10k or better).
fullyBufferedReadThreshold - If the estimated remaining rangeCount for an iterator(long, long, int) is LTE this threshold then we will do a fully buffered (synchronous) read. Otherwise we will do an asynchronous read.
Method Detail

isFullyBoundForKey

public boolean isFullyBoundForKey()
true iff all elements in the predicate which are required to generate the key are bound to constants.


getChunkCapacity

public int getChunkCapacity()
See Also:
AbstractResource.getChunkCapacity()

getChunkOfChunksCapacity

public int getChunkOfChunksCapacity()
See Also:
AbstractResource.getChunkOfChunksCapacity()

getFromKey

public byte[] getFromKey()
The key corresponding to the inclusive lower bound for the IAccessPath null if there is no lower bound.

This MUST be set by the concrete subclass using setFromKey(byte[]) BEFORE calling init() - it MAY be set to a null value.


getToKey

public byte[] getToKey()
The key corresponding to the exclusive upper bound for the IAccessPath -or- null if there is no upper bound.

This MUST be set by the concrete subclass using setFromKey(byte[]) BEFORE calling init() - it MAY be set to a null value.


setFromKey

protected void setFromKey(byte[] fromKey)

setToKey

protected void setToKey(byte[] toKey)

getKeyOrder

public IKeyOrder<R> getKeyOrder()
Description copied from interface: IAccessPath
The order in which the elements will be visited.

Specified by:
getKeyOrder in interface IAccessPath<R>

toString

public String toString()
Overrides:
toString in class Object

assertInitialized

protected final void assertInitialized()
Throws:
IllegalStateException - unless init() has been invoked.

init

public AbstractAccessPath<R> init()
Required post-ctor initialization.

Returns:
this

getIndexManager

public IIndexManager getIndexManager()

getTimestamp

public long getTimestamp()

getPredicate

public IPredicate<R> getPredicate()
Description copied from interface: IAccessPath
The constraints on the IAccessPath.

Specified by:
getPredicate in interface IAccessPath<R>

getIndex

public IIndex getIndex()
Description copied from interface: IAccessPath
The index selected for the access path.

Note: The access path may incorporate additional constraints from the specified IPredicate that are not present on the IIndex returned by this method.

Specified by:
getIndex in interface IAccessPath<R>

isEmpty

public boolean isEmpty()
Description copied from interface: IAccessPath
True iff the access path is empty (there are no matches for the IPredicate) This is more conclusive than #rangeCount() since you MAY have a non-zero range count when the key range is in fact empty (there may be "deleted" index entries within the key range).

Specified by:
isEmpty in interface IAccessPath<R>
TODO:
for scale-out, it may be better to implement isEmpty() without specifying a capacity of ONE (1) and then caching the returned iterator. This could avoid an expensive RMI test if we invoke iterator() shortly after isEmpty() returns false.

iterator

public final IChunkedOrderedIterator<R> iterator()
Description copied from interface: IAccessPath
An iterator visiting elements using the natural order of the index selected for the IPredicate. This is equivalent to
 iterator(0L, 0L, 0)
 
since an offset of ZERO (0L) means no offset, a limit of ZERO (0L) means no limit and a capacity of ZERO (0) means whatever is the default capacity.

Note: Filters should be specified when the IAccessPath is constructed so that they will be evaluated on the IDataService rather than materializing the elements and then filtering then. This can be accomplished by adding the filter as an IElementFilter on the IPredicate when requesting IAccessPath.

Specified by:
iterator in interface IAccessPath<R>
Specified by:
iterator in interface Iterable<R>
Returns:
The iterator.
See Also:
IRelation.getAccessPath(IPredicate)

iterator

public final IChunkedOrderedIterator<R> iterator(int limit,
                                                 int capacity)
Description copied from interface: IAccessPath
An iterator visiting elements using the natural order of the index selected for the IPredicate.

Specified by:
iterator in interface IAccessPath<R>
Parameters:
limit - The maximum #of elements that will be visited -or- ZERO (0) if there is no limit.
capacity - The maximum capacity for the buffer used by the iterator. When ZERO(0), a default capacity will be used. When a limit is specified, the capacity will never exceed the limit.
Returns:
The iterator.

iterator

public final IChunkedOrderedIterator<R> iterator(long offset,
                                                 long limit,
                                                 int capacity)
Description copied from interface: IAccessPath
An iterator visiting elements using the natural order of the index selected for the IPredicate.

The offset and limit together describe an optional slice that will be visited by the iterator. When a slice is specified, the iterator will count off the elements accepted by the IPredicate up to the offset, but not materialize them. Elements by the IPredicate starting with the offset and up to (but not including) offset+limit will be materialized for the client. The iterator will halt processing after observing offset+limit accepted elements. Note that slices for JOINs (vs a simple IAccessPath scan) are handled by IQueryOptions for an IRule.

The meaning of "accepted" is that: (a) the elements lie in the key-range constraint implied by the IPredicate; and (b) the elements pass any optional constraints that the IPredicate imposes.

Specified by:
iterator in interface IAccessPath<R>
Parameters:
offset - The first element accepted by the iterator that it will visit (materialize for the client). The offset must be non-negative. This is ZERO (0L) to visit all accepted elements.
limit - The last element accepted by the iterator that it will visit (materialize for the client). The limit must be non-negative. This is ZERO (0L) to visit all accepted elements (the value Long.MAX_VALUE is interpreted exactly like ZERO(0L)).
capacity - The maximum capacity for the buffer used by the iterator. When ZERO(0), a default capacity will be used. When a limit is specified, the capacity will never exceed the limit.
Returns:
The iterator. FIXME The offset and limit should probably be rolled into the predicate and removed from the IAccessPath. This way they will be correctly applied when IAccessPath.isEmpty() is implemented using the IAccessPath.iterator() to determine if any elements can be visited.
Throws:
RejectedExecutionException - if the iterator is run asynchronously and the ExecutorService is shutdown or has a maximum capacity and is saturated. FIXME Support both offset and limit for asynchronous iterators. right now this will force the use of the synchronousIterator(long, long, Iterator) when the offset or limit are non-zero, but that is only permitted up to a limit of MAX_FULLY_BUFFERED_READ_LIMIT. FIXME in order to support large limits we need to verify that the asynchronous iterator can correctly handle REMOVEALL and that incremental materialization up to the [limit] will not effect the semantics for REMOVEALL or the other iterator flags (per above). (In fact, the asynchronous iterator does not support either [offset] or [limit] at this time). FIXME write unit tests for slice handling by this method and modify the SAIL integration to use it for SLICE on an IAccessPath scan. Note that there are several IAccessPath implementations and they all need to be tested with SLICE. Those tests should be located in com.bigdata.rdf.spo.TestSPOAccessPath. FIXME The offset and limit should probably be rolled into the predicate and removed from the IAccessPath. This way they will be correctly applied when isEmpty() is implemented using the iterator() to determine if any

synchronousIterator

protected final IChunkedOrderedIterator<R> synchronousIterator(long offset,
                                                               long limit,
                                                               Iterator<R> src)
Fully buffers all elements that would be visited by the IAccessPath iterator.

Parameters:
accessPath - The access path (including the triple pattern).
offset - The first element that will be materialized (non-negative).
limit - The maximum #of elements that will be materialized (must be positive, so use a range count before calling this method if there was no limit specified by the caller). FIXME pass the offset and limit into the source iterator and remove them from this method's signature. This will require a change to the IRangeQuery API and ITupleIterator impls.

asynchronousIterator

protected final IChunkedOrderedIterator<R> asynchronousIterator(Iterator<R> src)
Asynchronous read using a BlockingBuffer.

Parameters:
src - The source iterator.
Returns:
Throws:
RejectedExecutionException - if the ExecutorService is shutdown or has a maximum capacity and is saturated.

rangeCount

public final long rangeCount(boolean exact)
Note: When there is a IPredicate.getConstraint() on the IPredicate the exact range count will apply that constraint as a filter during traversal. However, the constraint is ignored for the fast range count.

Specified by:
rangeCount in interface IAccessPath<R>
Parameters:
exact - When true, the result will be an exact count and may require a key-range scan. When false, the result will be an upper bound IFF delete markers are provisioned for the backing index (delete markers are required for transactions and for scale-out indices).
See Also:
IRangeQuery

rangeIterator

public final ITupleIterator<R> rangeIterator()
Description copied from interface: IAccessPath
The raw iterator for traversing the selected index within the key range implied by IPredicate.

Note: The access path may incorporate additional constraints from the specified IPredicate that are not present on the raw ITupleIterator returned by this method.

Specified by:
rangeIterator in interface IAccessPath<R>

rangeIterator

protected ITupleIterator<R> rangeIterator(int capacity,
                                          int flags,
                                          IFilterConstructor<R> filter)

removeAll

public long removeAll()
This implementation removes all tuples that would be visited by the access path from the backing index.

Note: If you are maintaining multiple indices then you MUST override this method to remove the data from each of those indices.

Specified by:
removeAll in interface IAccessPath<R>
Returns:
The #of elements that were removed.


Copyright © 2006-2009 SYSTAP, LLC. All Rights Reserved.