com.bigdata.btree
Class AbstractChunkedTupleIterator<E>

java.lang.Object
  extended by com.bigdata.btree.AbstractChunkedTupleIterator<E>
All Implemented Interfaces:
ITupleIterator<E>, Iterator<ITuple<E>>
Direct Known Subclasses:
ChunkedLocalRangeIterator, RawDataServiceTupleIterator

public abstract class AbstractChunkedTupleIterator<E>
extends Object
implements ITupleIterator<E>

A chunked iterator that proceeds one ResultSet at a time. It introduces the concept of a continuationQuery() so that the iterator can materialize the tuples using a sequence of queries that progress through the index until all tuples in the key range have been visited.
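
The chunk-at-a-time progression described above can be sketched as follows. This is a minimal illustration and not the class's actual implementation: ChunkSketch, ChunkedScanSketch, query() and scan() are hypothetical names, a TreeMap of strings stands in for the index, and appending "\0" stands in for computing the successor of the last key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical stand-in for a ResultSet: one chunk of keys plus an "exhausted" marker. */
class ChunkSketch {
    final List<String> keys;
    final boolean exhausted; // true when the key range holds nothing beyond this chunk

    ChunkSketch(List<String> keys, boolean exhausted) {
        this.keys = keys;
        this.exhausted = exhausted;
    }
}

class ChunkedScanSketch {
    /** Reads at most capacity keys from [fromKey, toKey); a null bound means unbounded. */
    static ChunkSketch query(NavigableMap<String, String> index,
                             String fromKey, String toKey, int capacity) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be positive");
        NavigableMap<String, String> range = index;
        if (fromKey != null) range = range.tailMap(fromKey, true);
        if (toKey != null) range = range.headMap(toKey, false);
        List<String> out = new ArrayList<>();
        for (String key : range.keySet()) {
            if (out.size() == capacity) {
                return new ChunkSketch(out, false); // chunk is full and more keys remain
            }
            out.add(key);
        }
        return new ChunkSketch(out, true);
    }

    /** Visits the whole range: one original query, then continuation queries as needed. */
    static List<String> scan(NavigableMap<String, String> index,
                             String fromKey, String toKey, int capacity) {
        List<String> visited = new ArrayList<>();
        ChunkSketch chunk = query(index, fromKey, toKey, capacity);
        visited.addAll(chunk.keys);
        while (!chunk.exhausted) {
            String last = chunk.keys.get(chunk.keys.size() - 1);
            // last + "\0" is the smallest string that sorts after last.
            chunk = query(index, last + "\0", toKey, capacity);
            visited.addAll(chunk.keys);
        }
        return visited;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> index = new TreeMap<>();
        for (String k : new String[] {"a", "b", "c", "d", "e"}) index.put(k, k);
        System.out.println(scan(index, "b", "e", 2)); // prints [b, c, d]
    }
}
```

In this sketch the capacity bounds each query, so a range of n keys costs roughly n / capacity queries.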

Version:
$Id: AbstractChunkedTupleIterator.java 2265 2009-10-26 12:51:06Z thompsonbry $
Author:
Bryan Thompson

Field Summary
protected  int capacity
          This controls the #of results per data service query.
protected static boolean DEBUG
           
protected static String ERR_NO_KEYS
          Error message used by #getKey() when the iterator was not provisioned to request keys from the data service.
protected static String ERR_NO_VALS
          Error message used by #getValue() when the iterator was not provisioned to request values from the data service.
protected  boolean exhausted
          When true, the entire key range specified by the client has been visited and the iterator is exhausted (i.e., all done).
protected  IFilterConstructor filter
          Optional filter.
protected  int flags
          These flags control whether keys and/or values are requested.
protected  byte[] fromKey
          The first key to visit -or- null iff no lower bound.
protected static boolean INFO
           
protected  int lastVisited
          The index of the last entry visited in the current ResultSet.
protected  byte[] lastVisitedKeyInPriorResultSet
          This gets set by continuationQuery() to the value of the key for the then current tuple.
protected static org.apache.log4j.Logger log
           
protected  int nqueries
          The #of range query operations executed.
protected  long nvisited
          The #of entries visited so far.
protected  ResultSet rset
          The current result set.
protected  byte[] toKey
          The first key to NOT visit -or- null iff no upper bound.
 
Constructor Summary
AbstractChunkedTupleIterator(byte[] fromKey, byte[] toKey, int capacity, int flags, IFilterConstructor filter)
           
 
Method Summary
protected  void continuationQuery()
          Issues a "continuation" query against the same index.
protected  void deleteBehind()
           
protected abstract  void deleteBehind(int n, Iterator<byte[]> keys)
          Batch delete the index entries identified by keys and clear the list.
protected abstract  void deleteLast(byte[] key)
          Delete the index entry identified by key.
 void flush()
          Method flushes any queued deletes.
protected  long getCommitTime()
          The timestamp returned by the initial ResultSet.
protected  int getDefaultCapacity()
          The capacity used by default when the caller specified 0 as the capacity for the iterator.
 int getQueryCount()
          The #of queries issued so far.
protected abstract  boolean getReadConsistent()
          When true, getCommitTime() is used to ensure that continuationQuery()s run against the same commit point for the local index partition, thereby producing a read-consistent view even when the iterator is ITx.READ_COMMITTED.
protected  long getReadTime()
          Return the timestamp used for continuationQuery()s.
protected abstract  ResultSet getResultSet(long timestamp, byte[] fromKey, byte[] toKey, int capacity, int flags, IFilterConstructor filter)
          Abstract method must return the next ResultSet based on the supplied parameter values.
protected abstract  long getTimestamp()
          The timestamp for the operation as specified by the ctor (this is used for remote index queries but not when running against a local index).
 long getVisitedCount()
          The #of entries visited so far (not the #of entries scanned, which can be much greater if a filter is in use).
 boolean hasNext()
          There are three levels at which we need to test in order to determine if the total iterator is exhausted.
 ITuple<E> next()
          Advance the iterator and return the ITuple from which you can extract the data and metadata for next entry.
protected  void rangeQuery()
          Issues the original range query.
protected abstract  IBlock readBlock(int sourceIndex, long addr)
          Return an object that may be used to read the block from the backing store per the contract for ITuple.readBlock(long)
 void remove()
          Queues a request to remove the entry under the most recently visited key.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

log

protected static final transient org.apache.log4j.Logger log

INFO

protected static final transient boolean INFO

DEBUG

protected static final transient boolean DEBUG

ERR_NO_KEYS

protected static final transient String ERR_NO_KEYS
Error message used by #getKey() when the iterator was not provisioned to request keys from the data service.

See Also:
Constant Field Values

ERR_NO_VALS

protected static final transient String ERR_NO_VALS
Error message used by #getValue() when the iterator was not provisioned to request values from the data service.

See Also:
Constant Field Values

fromKey

protected final byte[] fromKey
The first key to visit -or- null iff no lower bound.


toKey

protected final byte[] toKey
The first key to NOT visit -or- null iff no upper bound.


capacity

protected final int capacity
This controls the #of results per data service query.


flags

protected final int flags
These flags control whether keys and/or values are requested. If neither keys nor values are requested, then this is just a range count operation and you might as well use rangeCount instead.
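
As an illustration of how such a bit mask can be tested, here is a minimal sketch. The constants KEYS and VALS and their bit values are assumptions made for this example; the page does not list the actual flag constants or their values.

```java
class FlagsSketch {
    // Hypothetical bit values chosen for illustration only.
    static final int KEYS = 1 << 0;
    static final int VALS = 1 << 1;

    static boolean keysRequested(int flags) {
        return (flags & KEYS) != 0;
    }

    static boolean valsRequested(int flags) {
        return (flags & VALS) != 0;
    }

    /** Neither keys nor values requested: the scan degenerates to a range count. */
    static boolean isCountOnly(int flags) {
        return (flags & (KEYS | VALS)) == 0;
    }

    public static void main(String[] args) {
        System.out.println(keysRequested(KEYS | VALS)); // prints true
        System.out.println(isCountOnly(0));             // prints true
    }
}
```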


filter

protected final IFilterConstructor filter
Optional filter.


nqueries

protected int nqueries
The #of range query operations executed.


rset

protected ResultSet rset
The current result set. For each index partition spanned by the overall key range supplied by the client, we will issue at least one range query against that index partition. Once all entries in a result set have been consumed by the client, we test the result set to see whether or not it exhausted the entries that could be matched for that index partition. If not, then we will issue a "continuation" query against the same index partition. If we are scanning forward, then the continuation query will start (fromKey) from the successor of the last key scanned. If we are scanning backwards, then fromKey will be the leftSeparator of the index partition and toKey will be the last key scanned.

Note: A result set will be empty if there are no entries (after filtering) that lie within the key range in a given index partition. It is possible for any of the result sets to be empty. Consider a case of static partitioning of an index into N partitions. When the index is empty, a range query of the entire index will still query each of the N partitions. However, since the index is empty none of the partitions will have any matching entries and all result sets will be empty.

See Also:
rangeQuery(), continuationQuery()
TODO:
it would be useful if the ResultSet reported the maximum length for the keys and for the values. This could be used to right size the buffers which otherwise we have to let grow until they are of sufficient capacity.
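
For the forward-scanning case, the bounds of a continuation query might be computed as in the sketch below. It assumes that keys are compared as unsigned byte[] values, that toKey is an exclusive upper bound (as documented for the toKey field), and that the successor of a key is the key with a zero byte appended. ContinuationBoundsSketch, successor() and forwardBounds() are hypothetical names, not part of this class.

```java
import java.util.Arrays;

class ContinuationBoundsSketch {
    /** Smallest key strictly greater than key in unsigned byte[] order: key plus a trailing 0x00. */
    static byte[] successor(byte[] key) {
        return Arrays.copyOf(key, key.length + 1); // copyOf pads the new slot with a zero byte
    }

    /** Returns {fromKey, toKey} for a forward continuation query; toKey is left unchanged. */
    static byte[][] forwardBounds(byte[] lastKeyScanned, byte[] toKey) {
        return new byte[][] { successor(lastKeyScanned), toKey };
    }

    public static void main(String[] args) {
        byte[][] bounds = forwardBounds(new byte[] {1, 2}, new byte[] {9});
        // prints [1, 2, 0] .. [9]
        System.out.println(Arrays.toString(bounds[0]) + " .. " + Arrays.toString(bounds[1]));
    }
}
```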

nvisited

protected long nvisited
The #of entries visited so far.


lastVisited

protected int lastVisited
The index of the last entry visited in the current ResultSet. This is reset to -1 each time we obtain a new ResultSet.


exhausted

protected boolean exhausted
When true, the entire key range specified by the client has been visited and the iterator is exhausted (i.e., all done).


lastVisitedKeyInPriorResultSet

protected byte[] lastVisitedKeyInPriorResultSet
This gets set by continuationQuery() to the value of the key for the then current tuple. This is used by remove() in the edge case where lastVisited is -1 because a continuation query has been issued but next() has not yet been invoked. It is cleared by next() so that it does not hang around.
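
The edge case can be pictured with the following sketch of how remove() might choose its target key. It is an assumption-laden illustration: RemoveKeySketch and keyToRemove() are hypothetical, and the exception thrown when no key is available is a guess modeled on the java.util.Iterator contract.

```java
class RemoveKeySketch {
    /**
     * Picks the key that a remove() call should target.
     * currentKeys stands in for the keys of the current ResultSet.
     */
    static byte[] keyToRemove(byte[][] currentKeys, int lastVisited,
                              byte[] lastVisitedKeyInPriorResultSet) {
        if (lastVisited >= 0) {
            return currentKeys[lastVisited]; // normal case: a tuple of this chunk was visited
        }
        if (lastVisitedKeyInPriorResultSet != null) {
            // A continuation query was issued but next() has not been called yet.
            return lastVisitedKeyInPriorResultSet;
        }
        throw new IllegalStateException("next() has not been called");
    }

    public static void main(String[] args) {
        byte[][] chunk = { {10}, {20} };
        System.out.println(keyToRemove(chunk, 1, null)[0]);            // prints 20
        System.out.println(keyToRemove(chunk, -1, new byte[] {5})[0]); // prints 5
    }
}
```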

Constructor Detail

AbstractChunkedTupleIterator

public AbstractChunkedTupleIterator(byte[] fromKey,
                                    byte[] toKey,
                                    int capacity,
                                    int flags,
                                    IFilterConstructor filter)
Method Detail

getTimestamp

protected abstract long getTimestamp()
The timestamp for the operation as specified by the ctor (this is used for remote index queries but not when running against a local index).


getCommitTime

protected long getCommitTime()
The timestamp returned by the initial ResultSet.


getReadConsistent

protected abstract boolean getReadConsistent()
When true, getCommitTime() is used to ensure that continuationQuery()s run against the same commit point for the local index partition, thereby producing a read-consistent view even when the iterator is ITx.READ_COMMITTED. When false, continuationQuery()s use whatever value is returned by getTimestamp(). Read-consistent semantics for a partitioned index are achieved using the timestamp returned by IIndexStore.getLastCommitTime() rather than ITx.READ_COMMITTED.


getReadTime

protected final long getReadTime()
Return the timestamp used for continuationQuery()s. The value returned depends on whether or not getReadConsistent() is true. When consistent reads are required the timestamp will be the ResultSet.getCommitTime() for the initial ResultSet. Otherwise it is the value returned by getTimestamp().

Throws:
IllegalStateException - if getReadConsistent() is true and the initial ResultSet has not been read since the commitTime for that ResultSet is not yet available.
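
The selection rule described above can be sketched as a pure function. ReadTimeSketch, readTime() and the NO_COMMIT_TIME sentinel are hypothetical; how the real class records that the initial ResultSet has not yet been read is not stated on this page.

```java
class ReadTimeSketch {
    /** Hypothetical sentinel meaning "the initial ResultSet has not been read yet". */
    static final long NO_COMMIT_TIME = -1L;

    /**
     * @param readConsistent stands in for getReadConsistent()
     * @param commitTime     stands in for getCommitTime() of the initial ResultSet
     * @param timestamp      stands in for getTimestamp()
     */
    static long readTime(boolean readConsistent, long commitTime, long timestamp) {
        if (!readConsistent) {
            return timestamp; // continuation queries reuse the caller's timestamp
        }
        if (commitTime == NO_COMMIT_TIME) {
            throw new IllegalStateException("initial ResultSet has not been read");
        }
        return commitTime; // pin continuation queries to the same commit point
    }

    public static void main(String[] args) {
        System.out.println(readTime(false, NO_COMMIT_TIME, 42L)); // prints 42
        System.out.println(readTime(true, 1000L, 42L));           // prints 1000
    }
}
```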

getQueryCount

public int getQueryCount()
The #of queries issued so far.


getVisitedCount

public long getVisitedCount()
The #of entries visited so far (not the #of entries scanned, which can be much greater if a filter is in use).


getDefaultCapacity

protected int getDefaultCapacity()
The capacity used by default when the caller specified 0 as the capacity for the iterator.


getResultSet

protected abstract ResultSet getResultSet(long timestamp,
                                          byte[] fromKey,
                                          byte[] toKey,
                                          int capacity,
                                          int flags,
                                          IFilterConstructor filter)
Abstract method must return the next ResultSet based on the supplied parameter values.

Parameters:
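timestamp - The timestamp for the query.
fromKey - The first key to visit -or- null iff no lower bound.
toKey - The first key to NOT visit -or- null iff no upper bound.
capacity - The #of results requested for this query.
flags - The flags that control whether keys and/or values are requested.
filter - The optional filter.
Returns:
The next ResultSet for the supplied parameter values.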

rangeQuery

protected void rangeQuery()
Issues the original range query.


continuationQuery

protected void continuationQuery()
Issues a "continuation" query against the same index. This is invoked iff there are no entries left to visit in the current ResultSet but ResultSet.isExhausted() is [false], indicating that there is more data available.


hasNext

public boolean hasNext()
There are three levels at which we need to test in order to determine if the total iterator is exhausted. First, we need to test to see if there are more entries remaining in the current ResultSet. If not and the ResultSet is NOT exhausted, then we issue a continuationQuery() against the same index partition. If the ResultSet is exhausted, then we test to see whether or not we have visited all index partitions. If so, then the iterator is exhausted. Otherwise we issue a range query against the #nextPartition().

Specified by:
hasNext in interface Iterator<ITuple<E>>
Returns:
True iff the iterator is not exhausted.
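
The three levels can be sketched with an in-memory model in which each index partition is a sorted list of keys. This is an illustration under stated assumptions, not the actual implementation: ThreeLevelIteratorSketch and its fields are hypothetical, and the sketch issues its first query lazily from hasNext() for brevity.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.NoSuchElementException;

class ThreeLevelIteratorSketch {
    private final List<List<String>> partitions; // hypothetical: sorted keys per index partition
    private final int capacity;                  // max keys per simulated query
    private int partition = -1;                  // index of the partition being read
    private int offset = 0;                      // keys already fetched from that partition
    private List<String> chunk = Collections.emptyList(); // stands in for the current ResultSet
    private int lastVisited = -1;                // index of the last entry visited in chunk
    int nqueries = 0;                            // #of simulated queries issued

    ThreeLevelIteratorSketch(List<List<String>> partitions, int capacity) {
        if (capacity < 1) throw new IllegalArgumentException("capacity must be positive");
        this.partitions = partitions;
        this.capacity = capacity;
    }

    /** Fetches the next chunk of the current partition. */
    private void query() {
        List<String> keys = partitions.get(partition);
        int end = Math.min(offset + capacity, keys.size());
        chunk = keys.subList(offset, end);
        offset = end;
        lastVisited = -1; // reset for each new chunk
        nqueries++;
    }

    boolean hasNext() {
        while (true) {
            // Level 1: entries remain in the current chunk.
            if (lastVisited + 1 < chunk.size()) return true;
            // Level 2: the partition has more data, so issue a continuation query.
            if (partition >= 0 && offset < partitions.get(partition).size()) {
                query();
                continue;
            }
            // Level 3: move to the next partition, or stop if none is left.
            if (partition + 1 >= partitions.size()) return false;
            partition++;
            offset = 0;
            query();
        }
    }

    String next() {
        if (!hasNext()) throw new NoSuchElementException();
        return chunk.get(++lastVisited);
    }

    public static void main(String[] args) {
        List<List<String>> parts = Arrays.asList(
                Arrays.asList("a", "b", "c"),
                Collections.<String>emptyList(),
                Arrays.asList("d"));
        ThreeLevelIteratorSketch it = new ThreeLevelIteratorSketch(parts, 2);
        StringBuilder sb = new StringBuilder();
        while (it.hasNext()) sb.append(it.next());
        System.out.println(sb + " queries=" + it.nqueries); // prints abcd queries=4
    }
}
```

Note that the empty middle partition still costs one query, which matches the observation that any result set may be empty.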

next

public ITuple<E> next()
Description copied from interface: ITupleIterator
Advance the iterator and return the ITuple from which you can extract the data and metadata for next entry.

Note: An ITupleIterator will generally return the same ITuple reference on each invocation of this method. The caller is responsible for copying out any data or metadata of interest before calling ITupleIterator.next() again. See TupleFilter which is aware of this and can be used to stack filters safely.

Specified by:
next in interface ITupleIterator<E>
Specified by:
next in interface Iterator<ITuple<E>>
Returns:
The ITuple containing the data and metadata for the current index entry.
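
The consequence of the shared reference is easiest to see in a small stand-alone model. MutableTuple and the other names below are hypothetical stand-ins and do not reflect the real ITuple API; the sketch only shows why data must be copied out before the next call.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class ReusedTupleSketch {
    /** Hypothetical mutable tuple that the iterator overwrites on every next(). */
    static final class MutableTuple {
        byte[] key;
    }

    /** Iterator that hands out the SAME tuple instance each time. */
    static Iterator<MutableTuple> iterate(final byte[][] keys) {
        final MutableTuple shared = new MutableTuple();
        return new Iterator<MutableTuple>() {
            private int i = 0;
            public boolean hasNext() { return i < keys.length; }
            public MutableTuple next() {
                if (!hasNext()) throw new NoSuchElementException();
                shared.key = keys[i++];
                return shared;
            }
            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    /** Wrong: every list element is the same object and shows only the last key. */
    static List<MutableTuple> keepReferences(byte[][] keys) {
        List<MutableTuple> out = new ArrayList<>();
        for (Iterator<MutableTuple> it = iterate(keys); it.hasNext();) out.add(it.next());
        return out;
    }

    /** Right: copy the data of interest before advancing the iterator. */
    static List<byte[]> copyKeys(byte[][] keys) {
        List<byte[]> out = new ArrayList<>();
        for (Iterator<MutableTuple> it = iterate(keys); it.hasNext();) out.add(it.next().key.clone());
        return out;
    }

    public static void main(String[] args) {
        byte[][] keys = { {1}, {2} };
        System.out.println(keepReferences(keys).get(0).key[0]); // prints 2
        System.out.println(copyKeys(keys).get(0)[0]);           // prints 1
    }
}
```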

remove

public void remove()
Queues a request to remove the entry under the most recently visited key. If the iterator is exhausted then the entry will be deleted immediately. Otherwise the requests will be queued until the current ResultSet is exhausted and then a batch delete will be done for the queue.

Specified by:
remove in interface Iterator<ITuple<E>>

flush

public void flush()
Method flushes any queued deletes. You MUST do this if you are only processing part of the buffered capacity of the iterator and you are deleting some index entries. Failure to flush() under these circumstances will result in some buffered deletes never being applied.
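
The interaction between remove() and flush() can be sketched with a small model. QueuedDeleteSketch is hypothetical: a TreeSet stands in for the index, and the point at which the real iterator applies its batch is simplified to an explicit flush() call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class QueuedDeleteSketch {
    final TreeSet<String> index;                           // hypothetical backing index
    private final List<String> queued = new ArrayList<>(); // deletes not yet applied
    boolean exhausted = false;                             // set once the key range is done

    QueuedDeleteSketch(TreeSet<String> index) {
        this.index = index;
    }

    /** Mirrors remove(): delete at once if exhausted, otherwise queue for a later batch. */
    void remove(String lastVisitedKey) {
        if (exhausted) {
            index.remove(lastVisitedKey);
        } else {
            queued.add(lastVisitedKey);
        }
    }

    /** Mirrors flush(): apply all queued deletes as one batch and clear the queue. */
    void flush() {
        index.removeAll(queued);
        queued.clear();
    }

    public static void main(String[] args) {
        QueuedDeleteSketch it = new QueuedDeleteSketch(new TreeSet<>(Arrays.asList("a", "b", "c")));
        it.remove("a");
        System.out.println(it.index); // prints [a, b, c] because the delete is only queued
        it.flush();
        System.out.println(it.index); // prints [b, c]
    }
}
```

A caller that stops iterating before the current chunk is consumed would call flush() last, for example in a finally block, so that queued deletes are not lost.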


deleteBehind

protected void deleteBehind()

deleteBehind

protected abstract void deleteBehind(int n,
                                     Iterator<byte[]> keys)
Batch delete the index entries identified by keys and clear the list.

Parameters:
n - The #of keys to be deleted.
keys - The keys to be deleted.

deleteLast

protected abstract void deleteLast(byte[] key)
Delete the index entry identified by key.

Parameters:
key - A key.

readBlock

protected abstract IBlock readBlock(int sourceIndex,
                                    long addr)
Return an object that may be used to read the block from the backing store per the contract for ITuple.readBlock(long)

Parameters:
sourceIndex - The value from ITuple.getSourceIndex().
addr - The value supplied to ITuple.readBlock(long).


Copyright © 2006-2009 SYSTAP, LLC. All Rights Reserved.