com.bigdata.striterator
Class DistinctFilter<E>

java.lang.Object
  extended by com.bigdata.striterator.DistinctFilter<E>
All Implemented Interfaces:
IChunkConverter<E,E>

Deprecated. Superseded by JVMDistinctBindingSetsOp.

public abstract class DistinctFilter<E>
extends Object
implements IChunkConverter<E,E>

A filter that imposes a DISTINCT constraint on the ISolutions generated by an IRule. The filter is optimized for the case where the source iterator visits only a single chunk. Otherwise, the filter is implemented using a BTree backed by a TemporaryStore.

When more than one chunk is processed, ISolutions are transformed into unsigned byte[] keys and each key is looked up in the BTree. If the key is NOT found, it is inserted into the BTree and the solution is passed by the filter. Otherwise, the solution is rejected by the filter. The backing BTree is closed when the filter is finalized, but until then it holds a hard reference to the TemporaryStore. Solutions are processed in chunks for efficient ordered reads and writes on the BTree.
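
The multi-chunk behavior described above can be illustrated with a minimal in-memory sketch. It is not the bigdata implementation: a java.util.TreeSet ordered by unsigned byte[] comparison stands in for the BTree and TemporaryStore, and the names DistinctSketch, filterChunk and keyFn are hypothetical. It assumes Java 9 or later for Arrays.compareUnsigned.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Function;

// Hypothetical stand-in for the multi-chunk path; not the bigdata implementation.
final class DistinctSketch<E> {

    // Sorted set of unsigned byte[] keys; plays the role of the BTree.
    private final TreeSet<byte[]> seen = new TreeSet<byte[]>(Arrays::compareUnsigned);

    // Plays the role of getSortKey(E).
    private final Function<E, byte[]> keyFn;

    DistinctSketch(Function<E, byte[]> keyFn) {
        this.keyFn = keyFn;
    }

    // Filter one chunk: an element passes only the first time its key is seen.
    List<E> filterChunk(List<E> chunk) {
        List<E> out = new ArrayList<>();
        for (E e : chunk) {
            // add() returns true only if the key was absent, so it tests and inserts in one step.
            if (seen.add(keyFn.apply(e))) {
                out.add(e);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        DistinctSketch<String> f =
                new DistinctSketch<>(s -> s.getBytes(StandardCharsets.UTF_8));
        System.out.println(f.filterChunk(Arrays.asList("a", "b", "a"))); // [a, b]
        System.out.println(f.filterChunk(Arrays.asList("b", "c")));      // [c]
    }
}
```

The set of seen keys persists across chunks, which is why "b" is rejected in the second chunk. An in-memory set does not scale the way a disk-backed BTree does; the sketch only shows the test-then-insert logic.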

Version:
$Id: DistinctFilter.java 6130 2012-03-15 10:31:25Z thompsonbry $
Author:
Bryan Thompson
TODO:
A statistical distinct filter could be implemented using a Bloom filter INSTEAD of a BTree, but the Bloom filter parameters MUST be chosen so that the probability of a false positive is low enough to satisfy the application criteria. Even so, such a filter will always have a non-zero chance of incorrectly rejecting a solution that has NOT been seen by the filter. Since a Bloom filter can under-generate in this way, it could only be applied in very specialized circumstances, e.g., it might be OK for text search.
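
The trade-off in this TODO can be made concrete with a small sketch. Nothing below exists in bigdata: BloomDistinctSketch and accept are hypothetical names, and the hashing scheme (a second hash derived from Arrays.hashCode) is an assumption chosen for brevity, not a tuned design. The point is only that a key already seen is always rejected, while an unseen key may also be rejected when all of its bits happen to be set.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical Bloom-filter-based distinct filter; a sketch, not a bigdata class.
final class BloomDistinctSketch {

    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    BloomDistinctSketch(int numBits, int numHashes) {
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    // Returns true if the key passes the filter (it was definitely not seen before).
    // Returns false if the key was seen before OR if it collides on every bit
    // (a false positive, i.e. an unseen solution that is wrongly dropped).
    boolean accept(byte[] key) {
        int h1 = Arrays.hashCode(key);
        // Assumed second hash for double hashing; forced odd so it is never zero.
        int h2 = Integer.rotateLeft(h1 * 0x9E3779B9, 15) | 1;
        boolean allSet = true;
        for (int i = 0; i < numHashes; i++) {
            int idx = Math.floorMod(h1 + i * h2, numBits);
            if (!bits.get(idx)) {
                allSet = false;
                bits.set(idx);
            }
        }
        return !allSet;
    }

    public static void main(String[] args) {
        // A deliberately undersized filter (one bit) makes the false positive deterministic.
        BloomDistinctSketch tiny = new BloomDistinctSketch(1, 1);
        System.out.println(tiny.accept(new byte[] {1})); // true: first key always passes
        System.out.println(tiny.accept(new byte[] {2})); // false: unseen key wrongly rejected
    }
}
```

With realistic sizes the false positive rate falls, but it never reaches zero, which is the under-generation the TODO warns about.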

Constructor Summary
DistinctFilter(IIndexManager indexManager)
          Deprecated.  
 
Method Summary
 E[] convert(IChunkedOrderedIterator<E> src)
          Deprecated. Convert the next chunk of element(s) from the source iterator into target element(s).
protected abstract  byte[] getSortKey(E e)
          Deprecated. Return an unsigned byte[] key that is a representation of the visited element.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DistinctFilter

public DistinctFilter(IIndexManager indexManager)
Deprecated. 
Parameters:
indexManager - Used to lazily obtain a TemporaryStore.
Method Detail

convert

public E[] convert(IChunkedOrderedIterator<E> src)
Deprecated. 
Description copied from interface: IChunkConverter
Convert the next chunk of element(s) from the source iterator into target element(s).

Note: This method will only be invoked if ChunkedConvertingIterator.hasNext() reports true for the source iterator.

Note: Iterators are single-threaded so the implementation of this method does not need to be thread-safe.

Specified by:
convert in interface IChunkConverter<E,E>
Parameters:
src - The source iterator.
Returns:
The target chunk (not null, but may be empty).

getSortKey

protected abstract byte[] getSortKey(E e)
Deprecated. 
Return an unsigned byte[] key that is a representation of the visited element. Elements are judged for distinctness in terms of the generated sort key.

Parameters:
e - The visited element.
Returns:
The unsigned byte[] key.
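
As an illustration of what a getSortKey implementation might produce, the sketch below encodes a hypothetical element made of long identifiers into an unsigned byte[] key. The element shape, the SortKeySketch class and the sortKey helper are assumptions for this example and do not use the bigdata key-building classes. Equal elements must yield equal keys and different elements different keys. Flipping the sign bit additionally keeps unsigned byte order consistent with signed long order, which is a common convention for keys stored in an ordered index. Arrays.compareUnsigned requires Java 9 or later.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical key encoder; a sketch of the kind of key getSortKey(E) could return.
final class SortKeySketch {

    // Encode a fixed-length tuple of long identifiers as an unsigned byte[] key.
    static byte[] sortKey(long... ids) {
        // ByteBuffer is big-endian by default, so the most significant byte comes first.
        ByteBuffer buf = ByteBuffer.allocate(ids.length * Long.BYTES);
        for (long id : ids) {
            // Flip the sign bit so that unsigned byte order matches signed long order.
            buf.putLong(id ^ Long.MIN_VALUE);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] k1 = sortKey(1L, 2L, 3L);
        byte[] k2 = sortKey(1L, 2L, 3L);
        byte[] k3 = sortKey(-1L, 2L, 3L);
        System.out.println(Arrays.equals(k1, k2));              // true: same element, same key
        System.out.println(Arrays.compareUnsigned(k3, k1) < 0); // true: -1 sorts before 1
    }
}
```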


Copyright © 2006-2012 SYSTAP, LLC. All Rights Reserved.