com.bigdata.rdf.lexicon
Class LexiconRelation

java.lang.Object
  extended by com.bigdata.relation.AbstractResource<IRelation<E>>
      extended by com.bigdata.relation.AbstractRelation<BigdataValue>
          extended by com.bigdata.rdf.lexicon.LexiconRelation
All Implemented Interfaces:
IDatatypeURIResolver, IMutableRelation<BigdataValue>, IMutableResource<IRelation<BigdataValue>>, IRelation<BigdataValue>, ILocatableResource<IRelation<BigdataValue>>

public class LexiconRelation
extends AbstractRelation<BigdataValue>
implements IDatatypeURIResolver

The LexiconRelation handles all things related to the indices mapping external RDF Values onto IVs (internal values) and provides methods for efficient materialization of external RDF Values from IVs.

Version:
$Id: LexiconRelation.java 6352 2012-06-14 18:08:09Z thompsonbry $
Author:
Bryan Thompson

Nested Class Summary
 
Nested classes/interfaces inherited from class com.bigdata.relation.AbstractResource
AbstractResource.Options
 
Field Summary
static String NAME_LEXICON_RELATION
          Constant for the LexiconRelation namespace component.
 
Constructor Summary
LexiconRelation(IIndexManager indexManager, String namespace, Long timestamp, Properties properties)
          Note: The term:id and id:term indices MUST use unisolated write operations to ensure consistency without write-write conflicts.
 
Method Summary
 long addTerms(BigdataValue[] values, int numTerms, boolean readOnly)
          Batch insert of terms into the database.
 Iterator<Value> blobsIterator()
          Visits all RDF Values in the LexiconKeyOrder.BLOBS index in BlobIV order (efficient index scan).
 void buildSubjectCentricTextIndex()
          Utility method to (re-)build the subject-based full text index.
 void create()
          Create any logically contained resources (relations, indices).
 long delete(IChunkedOrderedIterator<BigdataValue> itr)
          Note: This method is part of the mutation API.
 void destroy()
          Destroy any logically contained resources (relations, indices).
protected  Class<IExtensionFactory> determineExtensionFactoryClass()
           
protected  Class<ISubjectCentricTextIndexer> determineSubjectCentricTextIndexerClass()
           
protected  Class<IValueCentricTextIndexer> determineTextIndexerClass()
           
protected  Class<BigdataValueFactory> determineValueFactoryClass()
           
 boolean exists()
           
 IIndex getBlobsIndex()
           
protected  IndexMetadata getBlobsIndexMetadata(String name)
          Return the IndexMetadata for the BLOBS index.
 int getBlobsThreshold()
          Return the threshold at which a literal would be stored in the LexiconKeyOrder.BLOBS index.
 AbstractTripleStore getContainer()
          Strengthens the return type.
 Class<BigdataValue> getElementClass()
          Return the class for the generic type of this relation.
 IIndex getId2TermIndex()
           
protected  IndexMetadata getId2TermIndexMetadata(String name)
          Return the IndexMetadata for the ID2TERM index.
 IIndex getIndex(IKeyOrder<? extends BigdataValue> keyOrder)
          Overridden to use local cache of the index reference.
 Set<String> getIndexNames()
          Return the fully qualified name of each index maintained by this relation.
 TimeZone getInlineDateTimesTimeZone()
          Return the default time zone to be used for inlining.
 IV getInlineIV(Value value)
          Attempt to convert the value to an inline internal value.
 IV getIV(Value value)
          Deprecated. Not even the unit tests should be doing this.
 IKeyOrder<BigdataValue> getKeyOrder(IPredicate<BigdataValue> p)
          Return the IKeyOrder for the predicate corresponding to the perfect access path.
 Iterator<IKeyOrder<BigdataValue>> getKeyOrders()
          Return the IKeyOrders corresponding to the registered indices for this relation.
 ILexiconConfiguration<BigdataValue> getLexiconConfiguration()
          Return the lexiconConfiguration instance.
 int getMaxInlineStringLength()
          Return the maximum length of a string value that may be inlined into the statement indices.
 LexiconKeyOrder getPrimaryKeyOrder()
          Return the IKeyOrder for the primary index for the relation.
 IValueCentricTextIndexer<?> getSearchEngine()
          A factory returning the softly held singleton for the FullTextIndex.
 ISubjectCentricTextIndexer<?> getSubjectCentricSearchEngine()
          A factory returning the softly held singleton for the FullTextIndex representing the subject-centric full text index.
 BigdataValue getTerm(IV iv)
          Note: BNodes are not stored in the reverse lexicon and are recognized using AbstractTripleStore#isBNode(long).
 IIndex getTerm2IdIndex()
           
protected  IndexMetadata getTerm2IdIndexMetadata(String name)
          Return the IndexMetadata for the TERM2ID index.
 int getTermIdBitsToReverse()
          The #of low bits from the term identifier that are reversed and rotated into the high bits when it is assigned.
 Map<IV<?,?>,BigdataValue> getTerms(Collection<IV<?,?>> ivs)
          Batch resolution of internal values to BigdataValues.
 Map<IV<?,?>,BigdataValue> getTerms(Collection<IV<?,?>> ivs, int termsChunksSize, int blobsChunkSize)
          Batch resolution of internal values to BigdataValues.
 BigdataValueFactory getValueFactory()
          The canonical BigdataValueFactoryImpl reference (JVM wide) for the lexicon namespace.
 LexiconRelation init()
          The default implementation only logs the event.
 long insert(IChunkedOrderedIterator<BigdataValue> itr)
          Note: This method is part of the mutation API.
 boolean isBlob(Value v)
          Return true iff this Value would be stored in the LexiconKeyOrder.BLOBS index.
 boolean isInlineDateTimes()
          Return true if xsd:dateTime literals are being inlined into the statement indices.
 boolean isInlineLiterals()
          Return true if datatype literals are being inlined into the statement indices.
 boolean isStoreBlankNodes()
          true iff blank nodes are being stored in the lexicon's forward index.
 boolean isSubjectCentricTextIndex()
          true iff the subject-centric full text index is enabled.
 boolean isTextIndex()
          true iff the full text index is enabled.
 IAccessPath<BigdataValue> newAccessPath(IIndexManager localIndexManager, IPredicate<BigdataValue> predicate, IKeyOrder<BigdataValue> keyOrder)
          Necessary for lexicon joins, which are injected into query plans as necessary by the query planner.
 BigdataValue newElement(List<BOp> a, IBindingSet bindingSet)
          Note: This method is part of the mutation API.
 Iterator<IV> prefixScan(Literal lit)
          A scan of all literals having the given literal as a prefix.
 Iterator<IV> prefixScan(Literal[] lits)
          A scan of all literals having any of the given literals as a prefix.
 void rebuildTextIndex()
          Utility method to (re-)build the full text index.
 BigdataURI resolve(URI uri)
          Returns a fully resolved datatype URI with the IV set.
 
Methods inherited from class com.bigdata.relation.AbstractRelation
getAccessPath, getAccessPath, getAccessPath, getFQN, getFQN, getIndex, getIndex, newIndexMetadata
 
Methods inherited from class com.bigdata.relation.AbstractResource
acquireExclusiveLock, getBareProperties, getChunkCapacity, getChunkOfChunksCapacity, getChunkTimeout, getCommitTime, getContainerNamespace, getExecutorService, getFullyBufferedReadThreshold, getIndexManager, getMaxParallelSubqueries, getNamespace, getProperties, getProperty, getProperty, getTimestamp, isForceSerialExecution, toString, unlock
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface com.bigdata.relation.IRelation
getExecutorService, getIndexManager
 
Methods inherited from interface com.bigdata.relation.locator.ILocatableResource
getContainerNamespace, getNamespace, getTimestamp
 

Field Detail

NAME_LEXICON_RELATION

public static final transient String NAME_LEXICON_RELATION
Constant for the LexiconRelation namespace component.

Note: To obtain the fully qualified name of an index in the LexiconRelation you need to append a "." to the relation's namespace, then this constant, then a "." and then the local name of the index.

See Also:
AbstractRelation.getFQN(IKeyOrder), Constant Field Values
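
The naming rule in the note above can be sketched as plain string concatenation. This is an illustration only: the class name, the helper name, and the constant value "lex" are assumptions made for the example and are not taken from this page. In real code, AbstractRelation.getFQN(IKeyOrder) is the documented way to obtain the name.

```java
// Sketch only: the value "lex" and all names here are assumptions for illustration.
final class LexiconIndexNames {
    /** Hypothetical stand-in for LexiconRelation.NAME_LEXICON_RELATION. */
    static final String NAME_LEXICON_RELATION = "lex";

    /** Joins the namespace, the relation constant and the local index name with "." separators. */
    static String fullyQualifiedName(String namespace, String localIndexName) {
        return namespace + "." + NAME_LEXICON_RELATION + "." + localIndexName;
    }
}
```

Under these assumptions, LexiconIndexNames.fullyQualifiedName("kb", "TERM2ID") yields "kb.lex.TERM2ID".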
Constructor Detail

LexiconRelation

public LexiconRelation(IIndexManager indexManager,
                       String namespace,
                       Long timestamp,
                       Properties properties)
Note: The term:id and id:term indices MUST use unisolated write operations to ensure consistency without write-write conflicts. The only exception would be a read-historical view.

Parameters:
indexManager -
namespace -
timestamp -
properties -
Method Detail

determineValueFactoryClass

protected Class<BigdataValueFactory> determineValueFactoryClass()

determineTextIndexerClass

protected Class<IValueCentricTextIndexer> determineTextIndexerClass()

determineSubjectCentricTextIndexerClass

protected Class<ISubjectCentricTextIndexer> determineSubjectCentricTextIndexerClass()

determineExtensionFactoryClass

protected Class<IExtensionFactory> determineExtensionFactoryClass()

getValueFactory

public BigdataValueFactory getValueFactory()
The canonical BigdataValueFactoryImpl reference (JVM wide) for the lexicon namespace.


getContainer

public AbstractTripleStore getContainer()
Strengthens the return type.

Overrides:
getContainer in class AbstractResource<IRelation<BigdataValue>>
Returns:
The container -or- null if there is no container.

exists

public boolean exists()

init

public LexiconRelation init()
Description copied from class: AbstractResource
The default implementation only logs the event.

Specified by:
init in interface ILocatableResource<IRelation<BigdataValue>>
Overrides:
init in class AbstractResource<IRelation<BigdataValue>>

create

public void create()
Description copied from interface: IMutableResource
Create any logically contained resources (relations, indices). There is no presumption that ILocatableResource.init() is suitable for invocation from IMutableResource.create(). Instead, you are responsible for invoking ILocatableResource.init() from this method IFF it is appropriate to reuse its initialization logic.

Specified by:
create in interface IMutableResource<IRelation<BigdataValue>>
Overrides:
create in class AbstractResource<IRelation<BigdataValue>>

destroy

public void destroy()
Description copied from interface: IMutableResource
Destroy any logically contained resources (relations, indices).

Specified by:
destroy in interface IMutableResource<IRelation<BigdataValue>>
Overrides:
destroy in class AbstractResource<IRelation<BigdataValue>>

isInlineLiterals

public final boolean isInlineLiterals()
Return true if datatype literals are being inlined into the statement indices.


getMaxInlineStringLength

public final int getMaxInlineStringLength()
Return the maximum length of a string value that may be inlined into the statement indices.


isInlineDateTimes

public final boolean isInlineDateTimes()
Return true if xsd:dateTime literals are being inlined into the statement indices.


getInlineDateTimesTimeZone

public final TimeZone getInlineDateTimesTimeZone()
Return the default time zone to be used for inlining.


getTermIdBitsToReverse

public final int getTermIdBitsToReverse()
The #of low bits from the term identifier that are reversed and rotated into the high bits when it is assigned.

See Also:
AbstractTripleStore.Options#TERMID_BITS_TO_REVERSE
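
One way to read "reversed and rotated into the high bits" is sketched below. This is an assumption about the transformation, not the library's actual code: the low n bits are bit-reversed and placed in the top n bits, and the remaining bits shift down. The class and method names are hypothetical.

```java
// Sketch under stated assumptions; not the LexiconRelation implementation.
final class TermIdBits {
    /**
     * Reverses the low n bits of id and moves them into the high n bits.
     * The remaining (64 - n) bits shift down to fill the low end.
     */
    static long reverseLowBitsIntoHigh(long id, int n) {
        if (n < 0 || n > 63) {
            throw new IllegalArgumentException("n must be in [0,63]");
        }
        if (n == 0) {
            return id;
        }
        long low = id & ((1L << n) - 1);
        // Long.reverse maps bit i to bit 63-i, so the low n bits land in the top n bits.
        long high = Long.reverse(low);
        return high | (id >>> n);
    }
}
```

With n = 2, the sequential identifiers 1, 2 and 3 map to values whose top two bits are 10, 01 and 11 respectively, so consecutive assignments end up far apart in key order.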

isStoreBlankNodes

public final boolean isStoreBlankNodes()
true iff blank nodes are being stored in the lexicon's forward index.

See Also:
AbstractTripleStore.Options#STORE_BLANK_NODES

isTextIndex

public final boolean isTextIndex()
true iff the full text index is enabled.

See Also:
AbstractTripleStore.Options#TEXT_INDEX

isSubjectCentricTextIndex

public final boolean isSubjectCentricTextIndex()
true iff the subject-centric full text index is enabled.

See Also:
AbstractTripleStore.Options#SUBJECT_CENTRIC_TEXT_INDEX

getIndex

public IIndex getIndex(IKeyOrder<? extends BigdataValue> keyOrder)
Overridden to use local cache of the index reference.

Specified by:
getIndex in interface IRelation<BigdataValue>
Overrides:
getIndex in class AbstractRelation<BigdataValue>
Parameters:
keyOrder - The natural index order.
Returns:
The index -or- null iff the index does not exist as of the timestamp for this view of the relation.
See Also:
AbstractRelation.getIndex(String)

getTerm2IdIndex

public final IIndex getTerm2IdIndex()

getId2TermIndex

public final IIndex getId2TermIndex()

getBlobsIndex

public final IIndex getBlobsIndex()

getSearchEngine

public IValueCentricTextIndexer<?> getSearchEngine()
A factory returning the softly held singleton for the FullTextIndex.

See Also:
AbstractTripleStore.Options#TEXT_INDEX
TODO:
replace with the use of the IResourceLocator since it already imposes a canonicalizing mapping for the index name and timestamp within a JVM.
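
A "softly held singleton" factory can be sketched with java.lang.ref.SoftReference. The class below is a generic illustration with hypothetical names and says nothing about how LexiconRelation actually caches its text indexer.

```java
import java.lang.ref.SoftReference;
import java.util.function.Supplier;

// Sketch only: a generic softly held singleton, not the LexiconRelation code.
final class SoftSingleton<T> {
    private final Supplier<T> factory;
    private SoftReference<T> ref = new SoftReference<>(null);

    SoftSingleton(Supplier<T> factory) {
        this.factory = factory;
    }

    /** Returns the cached instance, re-creating it if the garbage collector cleared it. */
    synchronized T get() {
        T value = ref.get();
        if (value == null) {
            value = factory.get();
            ref = new SoftReference<>(value);
        }
        return value;
    }
}
```

While a caller holds a strong reference to the returned object, repeated calls to get() return the same instance. Once no strong reference remains, the JVM may reclaim it under memory pressure and the next call builds a fresh one.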

getSubjectCentricSearchEngine

public ISubjectCentricTextIndexer<?> getSubjectCentricSearchEngine()
A factory returning the softly held singleton for the FullTextIndex representing the subject-centric full text index.

See Also:
AbstractTripleStore.Options#TEXT_INDEX
TODO:
replace with the use of the IResourceLocator since it already imposes a canonicalizing mapping for the index name and timestamp within a JVM.

getTerm2IdIndexMetadata

protected IndexMetadata getTerm2IdIndexMetadata(String name)
Return the IndexMetadata for the TERM2ID index.

Parameters:
name - The name of the index.
Returns:
The IndexMetadata.

getId2TermIndexMetadata

protected IndexMetadata getId2TermIndexMetadata(String name)
Return the IndexMetadata for the ID2TERM index.

Parameters:
name - The name of the index.
Returns:
The IndexMetadata.
See Also:
Load, closure and query performance in 1.1.x versus 1.0.x

getBlobsIndexMetadata

protected IndexMetadata getBlobsIndexMetadata(String name)
Return the IndexMetadata for the BLOBS index.

Parameters:
name - The name of the index.
Returns:
The IndexMetadata.

getIndexNames

public Set<String> getIndexNames()
Description copied from interface: IRelation
Return the fully qualified name of each index maintained by this relation.

Specified by:
getIndexNames in interface IRelation<BigdataValue>
Returns:
An immutable set of the index names for the relation.

getKeyOrders

public Iterator<IKeyOrder<BigdataValue>> getKeyOrders()
Description copied from interface: IRelation
Return the IKeyOrders corresponding to the registered indices for this relation. [rather than getIndexNames?]

Specified by:
getKeyOrders in interface IRelation<BigdataValue>

getPrimaryKeyOrder

public LexiconKeyOrder getPrimaryKeyOrder()
Description copied from interface: IRelation
Return the IKeyOrder for the primary index for the relation.

Specified by:
getPrimaryKeyOrder in interface IRelation<BigdataValue>

newElement

public BigdataValue newElement(List<BOp> a,
                               IBindingSet bindingSet)
Note: This method is part of the mutation API. It is primarily (at this point, only) invoked by the rule execution layer and, at present, no rules can entail terms into the lexicon.

Specified by:
newElement in interface IRelation<BigdataValue>
Parameters:
a - An ordered list of variables and/or constants.
bindingSet - A set of bindings.
Returns:
The new element.
Throws:
UnsupportedOperationException

getElementClass

public Class<BigdataValue> getElementClass()
Description copied from interface: IRelation
Return the class for the generic type of this relation. This information is used to dynamically create arrays of that generic type.

Specified by:
getElementClass in interface IRelation<BigdataValue>

delete

public long delete(IChunkedOrderedIterator<BigdataValue> itr)
Note: This method is part of the mutation API. It is primarily (at this point, only) invoked by the rule execution layer and, at present, no rules can entail terms into the lexicon.

Specified by:
delete in interface IMutableRelation<BigdataValue>
Parameters:
itr - An iterator visiting the elements to be removed. Existing elements in the relation having a key equal to the key formed from the visited elements will be removed from the relation.
Returns:
The #of elements that were actually removed from the relation.
Throws:
UnsupportedOperationException

insert

public long insert(IChunkedOrderedIterator<BigdataValue> itr)
Note: This method is part of the mutation API. It is primarily (at this point, only) invoked by the rule execution layer and, at present, no rules can entail terms into the lexicon.

Specified by:
insert in interface IMutableRelation<BigdataValue>
Parameters:
itr - An iterator visiting the elements to be written.
Returns:
The #of elements that were actually written on the relation.
Throws:
UnsupportedOperationException

prefixScan

public Iterator<IV> prefixScan(Literal lit)
A scan of all literals having the given literal as a prefix.

Parameters:
lit - A literal.
Returns:
An iterator visiting the term identifiers for the matching Literals.
TODO:
Prefix scan only visits the TERM2ID index (blobs and inline literals will not be observed). This should be mapped onto a free text index query instead. In order to have the same semantics we must also verify that (a) the prefix match is at the start of the literal; and (b) the match is contiguous.

prefixScan

public Iterator<IV> prefixScan(Literal[] lits)
A scan of all literals having any of the given literals as a prefix.

Parameters:
lits - An array of literals.
Returns:
An iterator visiting the term identifiers for the matching Literals.
TODO:
Prefix scan only visits the TERM2ID index (blobs and inline literals will not be observed). This should be mapped onto a free text index query instead. In order to have the same semantics we must also verify that (a) the prefix match is at the start of the literal; and (b) the match is contiguous.
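
The idea of a prefix scan over a sorted term index can be sketched with a java.util.TreeMap standing in for TERM2ID. The String keys, Long identifiers, and class name are assumptions for illustration. The real index uses encoded keys and returns IVs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;

// Sketch only: a sorted map plays the role of the TERM2ID index.
final class PrefixScanSketch {
    /** Returns the identifiers of all keys that start with any of the given prefixes. */
    static List<Long> prefixScan(NavigableMap<String, Long> term2id, String... prefixes) {
        List<Long> ids = new ArrayList<>();
        for (String prefix : prefixes) {
            // tailMap positions the scan at the first key >= prefix.
            // Keys sharing the prefix are contiguous from that point on.
            for (Map.Entry<String, Long> e : term2id.tailMap(prefix, true).entrySet()) {
                if (!e.getKey().startsWith(prefix)) {
                    break; // left the key range for this prefix
                }
                ids.add(e.getValue());
            }
        }
        return ids;
    }
}
```

Each prefix costs one seek plus a scan of only the matching keys. In this sketch, overlapping prefixes report the same identifier more than once.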

resolve

public BigdataURI resolve(URI uri)
Returns a fully resolved datatype URI with the IV set.

IExtensions handle encoding and decoding of inline literals for custom datatypes, however to do so they need the IV for the custom datatype. By passing an instance of this interface to the IExtension, it will be able to resolve its datatype URI(s) and cache them for future use.

The URIs used by IExtensions MUST be pre-declared by the Vocabulary.

This interface is implemented by the LexiconRelation.

Specified by:
resolve in interface IDatatypeURIResolver
Returns:
The fully resolved term
See Also:
IDatatypeURIResolver

isBlob

public boolean isBlob(Value v)
Return true iff this Value would be stored in the LexiconKeyOrder.BLOBS index.

Parameters:
v - The value.
Returns:
true if it is a "large value" according to the configuration of the lexicon.
See Also:
AbstractTripleStore.Options#BLOBS_THRESHOLD

getBlobsThreshold

public int getBlobsThreshold()
Return the threshold at which a literal would be stored in the LexiconKeyOrder.BLOBS index.

See Also:
AbstractTripleStore.Options#BLOBS_THRESHOLD

addTerms

public long addTerms(BigdataValue[] values,
                     int numTerms,
                     boolean readOnly)
Batch insert of terms into the database.

Note: Duplicate BigdataValue references and BigdataValues that already have an assigned term identifier are ignored by this operation.

Note: This implementation is designed to use unisolated batch writes on the terms and ids indices that guarantee consistency.

If the full text index is enabled, then the terms will also be inserted into the full text index.

Parameters:
values - An array whose elements [0:numTerms-1] will be inserted.
numTerms - The #of terms to insert.
readOnly - When true, unknown terms will not be inserted into the database. Otherwise unknown terms are inserted into the database.
Returns:
The #of distinct terms lacking a pre-assigned term identifier. If writes were permitted, then this is also the #of terms written onto the index.
TODO:
If we refactor the search index shortly to use a [token,S,P,O,(C)] key then search will become co-threaded with the assertion and retraction of statements (writes on the statement indices) rather than with ID2TERM writes.
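
The contract described above (duplicates and already-identified values are ignored, readOnly suppresses writes, and the return value counts distinct terms lacking an identifier) can be modelled in memory. The Term class, the HashMap index, and the sequential identifiers are all assumptions for illustration. This is not the batch write procedure itself.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: an in-memory model of the addTerms contract.
final class AddTermsSketch {
    /** Hypothetical stand-in for BigdataValue: a label plus an optional identifier. */
    static final class Term {
        final String label;
        Long id; // null until a term identifier is assigned
        Term(String label) { this.label = label; }
    }

    private final Map<String, Long> term2id = new HashMap<>();
    private long nextId = 1;

    long addTerms(Term[] values, int numTerms, boolean readOnly) {
        // Group the terms that still lack an identifier by label, which removes duplicates.
        Map<String, List<Term>> pending = new LinkedHashMap<>();
        for (int i = 0; i < numTerms; i++) {
            Term t = values[i];
            if (t.id != null) {
                continue; // already has an identifier: ignored
            }
            pending.computeIfAbsent(t.label, k -> new ArrayList<>()).add(t);
        }
        for (Map.Entry<String, List<Term>> e : pending.entrySet()) {
            Long id = term2id.get(e.getKey());
            if (id == null && !readOnly) {
                id = nextId++;
                term2id.put(e.getKey(), id);
            }
            for (Term t : e.getValue()) {
                t.id = id; // stays null for unknown terms when readOnly
            }
        }
        return pending.size(); // distinct terms lacking a pre-assigned identifier
    }
}
```

Calling the method with readOnly set to true acts as a pure lookup: known terms receive their identifier and unknown terms are left unresolved.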

rebuildTextIndex

public void rebuildTextIndex()
Utility method to (re-)build the full text index. This is a high latency operation for a database of any significant size. You must be using the unisolated view of the AbstractTripleStore for this operation. AbstractTripleStore.Options#TEXT_INDEX must be enabled. This operation is only supported when the IValueCentricTextIndexer uses the FullTextIndex class.


getTerms

public final Map<IV<?,?>,BigdataValue> getTerms(Collection<IV<?,?>> ivs)
Batch resolution of internal values to BigdataValues.

Parameters:
ivs - A collection of internal values
Returns:
A map from internal value to the BigdataValue. If an internal value was not resolved then the map will not contain an entry for that internal value.
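
The return contract (no map entry for an unresolved internal value) is worth handling explicitly on the caller side. The sketch below models it with ordinary collections. The generic helper and its class name are hypothetical, and a plain Map stands in for the reverse index.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Sketch only: models "unresolved keys are absent from the result".
final class GetTermsSketch {
    static <K, V> Map<K, V> getTerms(Collection<K> ivs, Map<K, V> id2term) {
        Map<K, V> resolved = new HashMap<>();
        for (K iv : ivs) {
            V value = id2term.get(iv);
            if (value != null) {
                resolved.put(iv, value); // unresolved IVs get no entry
            }
        }
        return resolved;
    }
}
```

A caller therefore checks containsKey (or tests get for null) for each requested internal value instead of assuming the result has the same size as the input.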

buildSubjectCentricTextIndex

public void buildSubjectCentricTextIndex()
Utility method to (re-)build the subject-based full text index. This is a high latency operation for a database of any significant size. You must be using the unisolated view of the AbstractTripleStore for this operation. AbstractTripleStore.Options#TEXT_INDEX must be enabled. This operation is only supported when the ITextIndexer uses the FullTextIndex class.

The subject-based full text index is one that rolls up the normal object-based full text index into a similarly structured index that captures relevancy across subjects. Instead of one entry per literal, its entries have the form (t,s) => s.len, termWeight, where s is the subject's IV. The term weight has the same interpretation, but it is computed across all literals which are linked to that subject and which contain the given token. This index basically pre-computes the (?s ?p ?o) join that sometimes follows the (?o bd:search "xyz") request.

Truth Maintenance

We will need to perform truth maintenance on the subject-centric text index, that is, the index will need to be updated as statements are added and removed (to the extent that those statements involve a literal in the object position). Adding a statement is the easier case because we will never need to remove entries from the index; we can simply write over them with new relevance values. All that is involved with truth maintenance for adding a statement is taking a post-commit snapshot of the subject in the statement and running it through the indexer (a "subject-refresh").

The same "subject-refresh" will be necessary for truth maintenance for removal, but an additional step will be necessary beforehand - the index entries associated with the deleted subject/object (tokens+subject) will need to be removed in case the token appears only in the removed literal. After this pruning step the subject can be refreshed in the index exactly the same as for truth maintenance on add.

It looks like the right place to hook in truth maintenance for add is AbstractTripleStore.addStatements(AbstractTripleStore, boolean, IChunkedOrderedIterator, com.bigdata.relation.accesspath.IElementFilter) after the ISPOs are added to the SPORelation. Likewise, the place to hook in truth maintenance for delete is AbstractTripleStore.removeStatements(IChunkedOrderedIterator, boolean) after the ISPOs are removed from the SPORelation.
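
The add and remove procedures described above can be sketched with in-memory maps. Everything here is an assumption made for illustration: subjects and literals are plain strings, tokens are whitespace-separated words, and the "weight" is just an occurrence count. The actual design is only proposed on this page, so none of this reflects existing code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: subject-refresh on add, prune-then-refresh on remove.
final class SubjectTextIndexSketch {
    /** token -> (subject -> weight). */
    final Map<String, Map<String, Integer>> index = new HashMap<>();
    /** subject -> literals currently linked to it. */
    final Map<String, List<String>> literals = new HashMap<>();

    void addStatement(String subject, String literal) {
        literals.computeIfAbsent(subject, k -> new ArrayList<>()).add(literal);
        refresh(subject); // entries are simply overwritten with new weights
    }

    void removeStatement(String subject, String literal) {
        List<String> lits = literals.get(subject);
        if (lits == null || !lits.remove(literal)) {
            return; // nothing was removed
        }
        // Prune first: a token may have appeared only in the removed literal.
        for (String token : tokenize(literal)) {
            Map<String, Integer> bySubject = index.get(token);
            if (bySubject != null) {
                bySubject.remove(subject);
                if (bySubject.isEmpty()) {
                    index.remove(token);
                }
            }
        }
        refresh(subject);
    }

    /** Recomputes the weights for one subject from its current literals. */
    private void refresh(String subject) {
        Map<String, Integer> counts = new HashMap<>();
        for (String lit : literals.getOrDefault(subject, Collections.emptyList())) {
            for (String token : tokenize(lit)) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            index.computeIfAbsent(e.getKey(), k -> new HashMap<>()).put(subject, e.getValue());
        }
    }

    static String[] tokenize(String text) {
        return text.toLowerCase().split("\\s+");
    }
}
```

The asymmetry matches the text: adding never deletes index entries, while removing must delete the (token, subject) pairs of the removed literal before the refresh restores those still supported by other literals.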


getTerms

public final Map<IV<?,?>,BigdataValue> getTerms(Collection<IV<?,?>> ivs,
                                                int termsChunksSize,
                                                int blobsChunkSize)
Batch resolution of internal values to BigdataValues.

Parameters:
ivs - A collection of internal values
Returns:
A map from internal value to the BigdataValue. If an internal value was not resolved then the map will not contain an entry for that internal value.
See Also:
getTerms(Collection)

getTerm

public final BigdataValue getTerm(IV iv)
Note: BNodes are not stored in the reverse lexicon and are recognized using AbstractTripleStore#isBNode(long).

Note: Statement identifiers (when enabled) are not stored in the reverse lexicon and are recognized using AbstractTripleStore#isStatement(IV). If the term identifier is recognized as being, in fact, a statement identifier, then it is externalized as a BNode. This fits rather well with the notion in a quad store that the context position may be either a URI or a BNode and the fact that you can use BNodes to "stamp" statement identifiers.

Note: Handles both unisolatable and isolatable indices.

Note: Sets BigdataValue.getIV() as a side-effect.

Note: this always mints a new BNode instance when the term identifier identifies a BNode or a Statement.

Returns:
The BigdataValue -or- null iff there is no BigdataValue for that term identifier in the lexicon.

getIV

public final IV getIV(Value value)
Deprecated. Not even the unit tests should be doing this.

WARNING DO NOT USE OUTSIDE OF THE UNIT TESTS: This method is extremely inefficient for scale-out as it does one RMI per request!

Note: If BigdataValue.getIV() is set, that value is returned immediately. Otherwise this method tries to obtain an inline internal value for the value. Failing that, it looks up the term identifier in the index and sets it on the value as a side-effect.

See Also:
getTerms(Collection): Use this method to resolve Values to their IVs efficiently.
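
The resolution order in the note above (cached IV first, then inline conversion, then index lookup) can be sketched as follows. The Val class, the string-encoded IVs, and the rule that only integer labels are inlined are all assumptions for this toy model.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: models the three-step lookup order, not the real IV types.
final class GetIVSketch {
    /** Hypothetical stand-in for a BigdataValue: a label plus a cached IV. */
    static final class Val {
        final String label;
        String iv; // null until resolved
        Val(String label) { this.label = label; }
    }

    /** Stand-in for the forward index (term to identifier). */
    final Map<String, String> term2id = new HashMap<>();

    String getIV(Val v) {
        if (v.iv != null) {
            return v.iv; // 1. already set on the value
        }
        String iv = inlineIV(v.label); // 2. try the inline representation
        if (iv == null) {
            iv = term2id.get(v.label); // 3. fall back to the index lookup
        }
        v.iv = iv; // side effect; stays null if the term is unknown
        return iv;
    }

    /** Assumption: in this toy model only labels that parse as a long can be inlined. */
    static String inlineIV(String label) {
        try {
            return "inline:" + Long.parseLong(label);
        } catch (NumberFormatException e) {
            return null;
        }
    }
}
```

Only step 3 touches an index, which is why a per-value call is costly when the index is remote and batch resolution is preferred.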

getInlineIV

public final IV getInlineIV(Value value)
Attempt to convert the value to an inline internal value. If the caller provides a BigdataValue and this method is successful, then the IV will be set as a side-effect on the BigdataValue.

Parameters:
value - The value to convert
Returns:
The inline internal value, or null if it cannot be converted
See Also:
ILexiconConfiguration.createInlineIV(Value)

blobsIterator

public Iterator<Value> blobsIterator()
Visits all RDF Values in the LexiconKeyOrder.BLOBS index in BlobIV order (efficient index scan).


getLexiconConfiguration

public ILexiconConfiguration<BigdataValue> getLexiconConfiguration()
Return the lexiconConfiguration instance. Used to determine how to encode and decode terms in the key space.


getKeyOrder

public IKeyOrder<BigdataValue> getKeyOrder(IPredicate<BigdataValue> p)
Return the IKeyOrder for the predicate corresponding to the perfect access path. A perfect access path is one where the bound values in the predicate form a prefix in the key space of the corresponding index.

This implementation examines the predicate, looking at the LexiconKeyOrder.SLOT_IV and LexiconKeyOrder.SLOT_TERM slots, and chooses the appropriate index based on the IV and/or Value which it finds bound. When both slots are bound it prefers the index for the IV => Value mapping as that index will be faster (ID2TERM has a shorter key and higher fan-out than TERM2ID).

Specified by:
getKeyOrder in interface IRelation<BigdataValue>
Returns:
The IKeyOrder for the perfect access path -or- null if there is no index which provides a perfect access path for that predicate.
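
The selection rule described above reduces to a small decision on which slots are bound. The enum and method below are hypothetical stand-ins. The real method inspects an IPredicate and returns a LexiconKeyOrder.

```java
// Sketch only: the decision table for choosing an index by bound slots.
final class KeyOrderChoiceSketch {
    enum KeyOrder { TERM2ID, ID2TERM }

    /** Returns the preferred index, or null when neither slot is bound. */
    static KeyOrder choose(boolean ivBound, boolean termBound) {
        if (ivBound) {
            return KeyOrder.ID2TERM; // also preferred when both slots are bound
        }
        if (termBound) {
            return KeyOrder.TERM2ID;
        }
        return null; // no perfect access path
    }
}
```

Testing the IV slot first is what encodes the stated preference for ID2TERM in the fully bound case.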

newAccessPath

public IAccessPath<BigdataValue> newAccessPath(IIndexManager localIndexManager,
                                               IPredicate<BigdataValue> predicate,
                                               IKeyOrder<BigdataValue> keyOrder)
Necessary for lexicon joins, which are injected into query plans as necessary by the query planner. You can use a LexPredicate to perform either a forward (BigdataValue to IV) or reverse (IV to BigdataValue) lookup. Either lookup will cache the BigdataValue on the IV as a side effect.

Note: If you query with an IV or BigdataValue that is already cached (either on one another or in the termsCache) then the cached value will be returned (fast path).

Note: Blank nodes will not unify with themselves unless you are using told blank node semantics.

Note: This has the side effect of caching materialized BigdataValues on IVs using IVCache.setValue(BigdataValue) for use in downstream operators that need materialized values to evaluate properly. The query planner is responsible for managing when we materialize and cache values. This keeps us from wiring BigdataValue onto IVs all the time.

The lexicon has a single TERMS index. The keys are BlobIVs formed from the VTE of the BigdataValue, BigdataValue#hashCode(), and a collision counter. The value is the BigdataValue as serialized by the BigdataValueSerializer.

There are four possible ways to query this index using the LexPredicate.

lex(-BigdataValue,+IV)
The IV is given and its BigdataValue will be sought.
lex(+BigdataValue,-IV)
The BigdataValue is given and its IV will be sought. This case requires a key-range scan with a filter. It has to scan the collision bucket and filter for the specified Value. We get the collision bucket by creating a prefix key for the Value (using its VTE and hashCode). This will either return the IV for that Value or nothing.
lex(+BigdataValue,+IV)
The predicate is fully bound. In this case we can immediately verify that the Value is consistent with the IV (same VTE and hashCode) and then do a point lookup on the IV.
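
The three lookups listed above can be sketched against a sorted map whose keys pack a type code, a hash code and a collision counter. The 8/32/16-bit layout, the long-valued key, and the use of plain strings as values are assumptions for illustration. The real keys are BlobIVs and the values are serialized BigdataValues.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch only: (vte, hashCode, counter) keys with a collision-bucket scan.
final class BlobKeySketch {
    private final TreeMap<Long, String> terms = new TreeMap<>();

    /** Packs vte (8 bits), hash (32 bits) and counter (16 bits) into one sortable key. */
    static long key(int vte, int hash, int counter) {
        return ((long) (vte & 0xFF) << 48) | ((hash & 0xFFFFFFFFL) << 16) | (counter & 0xFFFF);
    }

    /** The collision bucket: every key sharing the (vte, hash) prefix. */
    private NavigableMap<Long, String> bucket(int vte, int hash) {
        return terms.subMap(key(vte, hash, 0), true, key(vte, hash, 0xFFFF), true);
    }

    /** Inserts the value if absent and returns its key (counter overflow is not handled). */
    long add(int vte, String value) {
        Long existing = lookupKey(vte, value);
        if (existing != null) {
            return existing;
        }
        long k = key(vte, value.hashCode(), bucket(vte, value.hashCode()).size());
        terms.put(k, value);
        return k;
    }

    /** lex(+BigdataValue,-IV): scan the collision bucket and filter for the value. */
    Long lookupKey(int vte, String value) {
        for (Map.Entry<Long, String> e : bucket(vte, value.hashCode()).entrySet()) {
            if (e.getValue().equals(value)) {
                return e.getKey();
            }
        }
        return null;
    }

    /** lex(-BigdataValue,+IV): a point lookup on the key. */
    String lookupValue(long key) {
        return terms.get(key);
    }

    /** lex(+BigdataValue,+IV): check the (vte, hash) prefix, then do a point lookup. */
    boolean contains(int vte, String value, long key) {
        if ((key >>> 16) != (key(vte, value.hashCode(), 0) >>> 16)) {
            return false;
        }
        return value.equals(terms.get(key));
    }
}
```

The strings "Aa" and "BB" have the same String.hashCode(), so adding both exercises the collision path: they land in the same bucket and differ only in the counter.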

Overrides:
newAccessPath in class AbstractRelation<BigdataValue>
Parameters:
localIndexManager - The local index manager (optional, except when there is a request for a shard local access path in scale-out).
predicate - The predicate used to request the access path.
keyOrder - The index which the access path will use.
Returns:
The access path.
See Also:
LexAccessPatternEnum, LexPredicate, LexiconKeyOrder


Copyright © 2006-2012 SYSTAP, LLC. All Rights Reserved.