java.lang.Object
  com.bigdata.relation.AbstractResource<IRelation<E>>
    com.bigdata.relation.AbstractRelation<BigdataValue>
      com.bigdata.rdf.lexicon.LexiconRelation
public class LexiconRelation
extends AbstractRelation<BigdataValue>
The LexiconRelation handles all things related to the indices mapping
RDF Values onto internal 64-bit term identifiers.
The term2id index has all the distinct terms ever asserted. Those "terms"
include {s:p:o} keys for statements IFF statement identifiers are in use.
However, BNodes are NOT stored in the forward index, even though the
forward index is used to assign globally unique term identifiers for blank
nodes. See BigdataValueFactoryImpl.createBNode().
The id2term index only has URIs and Literals. It CANNOT be
used to resolve either BNodes or statement identifiers. In fact,
there is NO means to resolve either a statement identifier or a blank node.
Both are always assigned (consistently) within a context in which their
referent (if any) is defined. For a statement identifier the referent MUST be
defined by an instance of the statement itself. The RIO parser integration
and the IStatementBuffer implementations handle all of this.
See KeyBuilder.Options for properties that control how the sort keys
are generated for the URIs and Literals.
| Nested Class Summary |
|---|
| Nested classes/interfaces inherited from class com.bigdata.relation.AbstractResource |
|---|
AbstractResource.Options |
| Field Summary | | |
|---|---|---|
| protected static org.apache.log4j.Logger | log | |
| static String | NAME_LEXICON_RELATION | Constant for the LexiconRelation namespace component. |
| Constructor Summary | |
|---|---|
| LexiconRelation(IIndexManager indexManager, String namespace, Long timestamp, Properties properties) | Note: The term:id and id:term indices MUST use unisolated write operations to ensure consistency without write-write conflicts. |
| Method Summary | | |
|---|---|---|
| void | addStatementIdentifiers(ISPO[] a, int n) | Assign unique statement identifiers to triples. |
| void | addTerms(BigdataValue[] terms, int numTerms, boolean readOnly) | Batch insert of terms into the database. |
| void | create() | Create any logically contained resources (relations, indices). |
| long | delete(IChunkedOrderedIterator<BigdataValue> itr) | Note: this method is part of the mutation api. |
| void | destroy() | Destroy any logically contained resources (relations, indices). |
| StringBuilder | dumpTerms() | Dumps the lexicon in a variety of ways. |
| boolean | exists() | |
| KVO<BigdataValue>[] | generateSortKeys(LexiconKeyBuilder keyBuilder, BigdataValue[] terms, int numTerms) | Generate the sort keys for the terms. |
| IAccessPath<BigdataValue> | getAccessPath(IPredicate<BigdataValue> predicate) | Return the best IAccessPath for a relation given a predicate with zero or more unbound variables. |
| AbstractTripleStore | getContainer() | Strengthens the return type. |
| Class<BigdataValue> | getElementClass() | Return the class for the generic type of this relation. |
| IIndex | getId2TermIndex() | |
| protected IndexMetadata | getId2TermIndexMetadata(String name) | |
| IIndex | getIndex(IKeyOrder<? extends BigdataValue> keyOrder) | Overridden to return the hard reference for the index. |
| Set<String> | getIndexNames() | Return the fully qualified name of each index maintained by this relation. |
| FullTextIndex | getSearchEngine() | A factory returning the softly held singleton for the FullTextIndex. |
| BigdataValue | getTerm(long id) | Note: BNodes are not stored in the reverse lexicon and are recognized using AbstractTripleStore.isBNode(long). |
| IIndex | getTerm2IdIndex() | |
| protected IndexMetadata | getTerm2IdIndexMetadata(String name) | |
| long | getTermId(Value value) | Note: If BigdataValue.getTermId() is set, then returns that value immediately. |
| int | getTermIdBitsToReverse() | The #of low bits from the term identifier that are reversed and rotated into the high bits when it is assigned. |
| Map<Long,BigdataValue> | getTerms(Collection<Long> ids) | Batch resolution of term identifiers to BigdataValues. |
| BigdataValueFactoryImpl | getValueFactory() | The canonical BigdataValueFactoryImpl reference (JVM wide) for the lexicon namespace. |
| Iterator<Value> | idTermIndexScan() | Iterator visits all terms in order by their assigned term identifiers (efficient index scan, but the terms are not in term order). |
| protected void | indexTermText(int capacity, Iterator<BigdataValue> itr) | Add the terms to the full text index so that we can do fast lookup of the corresponding term identifiers. |
| long | insert(IChunkedOrderedIterator<BigdataValue> itr) | Note: this method is part of the mutation api. |
| boolean | isStoreBlankNodes() | true iff blank nodes are being stored in the lexicon's forward index. |
| boolean | isTextIndex() | true iff the full text index is enabled. |
| BigdataValue | newElement(IPredicate<BigdataValue> predicate, IBindingSet bindingSet) | Note: this method is part of the mutation api. |
| Iterator<Long> | prefixScan(Literal lit) | A scan of all literals having the given literal as a prefix. |
| Iterator<Long> | prefixScan(Literal[] lits) | A scan of all literals having any of the given literals as a prefix. |
| Iterator<Long> | termIdIndexScan() | Iterator visits all term identifiers in order by the term key (efficient index scan). |
| Iterator<Value> | termIterator() | Visits all terms in term key order (random index operation). |
| Methods inherited from class com.bigdata.relation.AbstractRelation |
|---|
getFQN, getIndex, newIndexMetadata |
| Methods inherited from class com.bigdata.relation.AbstractResource |
|---|
acquireExclusiveLock, getChunkCapacity, getChunkOfChunksCapacity, getChunkTimeout, getContainerNamespace, getExecutorService, getFullyBufferedReadThreshold, getIndexManager, getMaxParallelSubqueries, getNamespace, getProperties, getProperty, getProperty, getTimestamp, isForceSerialExecution, isNestedSubquery, toString, unlock |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
| Methods inherited from interface com.bigdata.relation.IRelation |
|---|
getExecutorService, getIndexManager |
| Methods inherited from interface com.bigdata.relation.locator.ILocatableResource |
|---|
getContainerNamespace, getNamespace, getTimestamp |
| Field Detail |
|---|
protected static final org.apache.log4j.Logger log
public static final transient String NAME_LEXICON_RELATION
Constant for the LexiconRelation namespace component.
Note: To obtain the fully qualified name of an index in the
LexiconRelation you need to append a "." to the relation's
namespace, then this constant, then a "." and then the local name of the
index.
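The naming rule in this note amounts to joining three parts with "." separators. The following is a minimal sketch of that rule only: the helper name fqn, the sample namespace "kb", the sample component "lex" (standing in for the value of NAME_LEXICON_RELATION, which this page does not show) and the local index name "TERM2ID" are all assumptions made for illustration.

```java
final class FqnSketch {

    /** Joins namespace, relation name component and local index name with '.' separators. */
    static String fqn(String namespace, String relationComponent, String localIndexName) {
        return namespace + "." + relationComponent + "." + localIndexName;
    }

    public static void main(String[] args) {
        // All three arguments are hypothetical sample values.
        System.out.println(fqn("kb", "lex", "TERM2ID"));
    }
}
```

In application code the name would normally be obtained from the relation itself rather than assembled by hand.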
See Also:
AbstractRelation.getFQN(IKeyOrder), Constant Field Values
| Constructor Detail |
|---|
public LexiconRelation(IIndexManager indexManager,
String namespace,
Long timestamp,
Properties properties)
Parameters:
indexManager -
namespace -
timestamp -
properties -
| Method Detail |
|---|
public BigdataValueFactoryImpl getValueFactory()
The canonical BigdataValueFactoryImpl reference (JVM wide) for the
lexicon namespace.
public AbstractTripleStore getContainer()
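Overrides:
getContainer in class AbstractResource<IRelation<BigdataValue>>
Returns:
The container, or null if there is no container.
public boolean exists()
public void create()
Description copied from interface: IMutableResource
Create any logically contained resources (relations, indices).
Specified by:
create in interface IMutableResource<IRelation<BigdataValue>>
Overrides:
create in class AbstractResource<IRelation<BigdataValue>>
public void destroy()
Description copied from interface: IMutableResource
Destroy any logically contained resources (relations, indices).
Specified by:
destroy in interface IMutableResource<IRelation<BigdataValue>>
Overrides:
destroy in class AbstractResource<IRelation<BigdataValue>>
public final int getTermIdBitsToReverse()
The #of low bits from the term identifier that are reversed and rotated into the high bits when it is assigned.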
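The summary for getTermIdBitsToReverse() describes low bits of a term identifier being reversed and rotated into the high bits. The sketch below is one plausible reading of that sentence on a plain 64-bit long. It is an assumption, not the library's code: the method name is hypothetical, and any bits the real implementation reserves for other purposes are ignored here.

```java
final class TermIdBitsSketch {

    /**
     * Moves the n low bits of id into the n high bits in reversed order and
     * shifts the remaining bits down. Illustration only.
     */
    static long reverseLowBitsIntoHigh(long id, int n) {
        if (n < 0 || n > 63) {
            throw new IllegalArgumentException("n must be in [0, 63]");
        }
        if (n == 0) {
            return id; // nothing to reverse
        }
        long low = id & ((1L << n) - 1); // isolate the n low bits
        long high = Long.reverse(low);   // bit i moves to bit 63 - i
        return high | (id >>> n);        // remaining bits fill the low positions
    }

    public static void main(String[] args) {
        for (long id = 1; id <= 4; id++) {
            System.out.println(id + " -> 0x" + Long.toHexString(reverseLowBitsIntoHigh(id, 2)));
        }
    }
}
```

In this sketch, with n = 2, the consecutive inputs 1, 2 and 3 map to values that differ in their two highest bits, so sequential counters no longer sort next to each other.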
See Also:
AbstractTripleStore.Options#TERMID_BITS_TO_REVERSE
public final boolean isStoreBlankNodes()
true iff blank nodes are being stored in the lexicon's
forward index.
See Also:
AbstractTripleStore.Options#STORE_BLANK_NODES
public final boolean isTextIndex()
true iff the full text index is enabled.
See Also:
AbstractTripleStore.Options#TEXT_INDEX
public IIndex getIndex(IKeyOrder<? extends BigdataValue> keyOrder)
Overrides:
getIndex in class AbstractRelation<BigdataValue>
Parameters:
keyOrder - The natural index order.
Returns:
null iff the index does not exist
as of the timestamp for this view of the relation.
FIXME For efficiency the concrete implementations need to override this,
saving a hard reference to the index and then using a switch-like
construct to return the correct hard reference. This behavior should be
encapsulated.
public final IIndex getTerm2IdIndex()
public final IIndex getId2TermIndex()
public FullTextIndex getSearchEngine()
A factory returning the softly held singleton for the FullTextIndex.
See Also:
Options#TEXT_INDEX
IResourceLocator since it
already imposes a canonicalizing mapping for the index name
and timestamp inside of a JVM.
protected IndexMetadata getTerm2IdIndexMetadata(String name)
protected IndexMetadata getId2TermIndexMetadata(String name)
public Set<String> getIndexNames()
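Description copied from interface: IRelation
Return the fully qualified name of each index maintained by this relation.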
public IAccessPath<BigdataValue> getAccessPath(IPredicate<BigdataValue> predicate)
Description copied from interface: IRelation
Return the best IAccessPath for a relation given a predicate with
zero or more unbound variables.
If there is an IIndex that directly corresponds to the natural
order implied by the variable pattern on the predicate then the access
path should use that index. Otherwise you should choose the best index
given the constraints and make sure that the IAccessPath
incorporates additional filters that will allow you to filter out the
irrelevant ITuples during the scan - this is very important when
the index is remote!
If there are any IElementFilters then the access path MUST
incorporate those constraints such that only elements that satisfy the
constraints may be visited.
Whether the constraints arise because of the lack of a perfect index for
the access path or because they were explicitly specified for the
IPredicate, those constraints should be translated into
constraints imposed on the underlying ITupleIterator and sent
with it to be evaluated local to the data.
Note: Filters should be specified when the IAccessPath is
constructed so that they will be evaluated on the data service rather than
materializing the elements and then filtering them. This can be
accomplished by adding the filter as a constraint on the predicate when
specifying the access path.
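The difference between filtering during the scan and filtering after materialization can be shown with plain java.util iterators. This is a sketch of the general idea only: the class and method names are hypothetical, nothing here uses IAccessPath, IElementFilter or a data service, and the "tuples" are ordinary strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

final class FilteredScanSketch {

    /** Wraps a source iterator so that rejected elements are dropped while scanning. */
    static <E> Iterator<E> filteredScan(Iterator<E> source, Predicate<? super E> filter) {
        return new Iterator<E>() {
            private E pending;     // next accepted element, if any
            private boolean ready; // true when pending holds an unconsumed element

            @Override
            public boolean hasNext() {
                while (!ready && source.hasNext()) {
                    E candidate = source.next();
                    if (filter.test(candidate)) {
                        pending = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            @Override
            public E next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                ready = false;
                E result = pending;
                pending = null;
                return result;
            }
        };
    }

    /** Collects whatever the iterator yields; only accepted elements ever reach the caller. */
    static <E> List<E> drain(Iterator<E> it) {
        List<E> out = new ArrayList<>();
        while (it.hasNext()) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tuples = Arrays.asList("a1", "b2", "a3");
        System.out.println(drain(filteredScan(tuples.iterator(), s -> s.startsWith("a"))));
    }
}
```

When the wrapped iterator runs next to the data, rejected elements never cross the wire, which is the point the note makes about remote indices.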
Parameters:
predicate - The constraint on the elements to be visited.
Returns:
The best IAccessPath for that IPredicate.
Throws:
UnsupportedOperationException - for the LexiconRelation.
public BigdataValue newElement(IPredicate<BigdataValue> predicate,
IBindingSet bindingSet)
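Parameters:
predicate - The predicate that is the head of some IRule.
bindingSet - A set of bindings for that IRule.
Throws:
UnsupportedOperationException
public Class<BigdataValue> getElementClass()
Description copied from interface: IRelation
Return the class for the generic type of this relation.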
public long delete(IChunkedOrderedIterator<BigdataValue> itr)
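Parameters:
itr - An iterator visiting the elements to be removed. Existing
elements in the relation having a key equal to the key formed
from the visited elements will be removed from the relation.
Throws:
UnsupportedOperationException
public long insert(IChunkedOrderedIterator<BigdataValue> itr)
Parameters:
itr - An iterator visiting the elements to be written.
Throws:
UnsupportedOperationException
public Iterator<Long> prefixScan(Literal lit)
A scan of all literals having the given literal as a prefix.
Parameters:
lit - A literal.
Returns:
An iterator visiting the term identifiers of the matching Literals.
public Iterator<Long> prefixScan(Literal[] lits)
A scan of all literals having any of the given literals as a prefix.
Parameters:
lits - An array of literals.
Returns:
An iterator visiting the term identifiers of the matching Literals.
IElementFilter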
applied to the lexicon. This would let it be used directly from
IRules. (There is no direct dependency on this class other
than for access to the index, and the rules already provide that).
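A prefix scan over a sorted index, which is what the two prefixScan methods are described as doing, can be sketched with a java.util.TreeMap standing in for the term:id index. The sketch is an assumption-laden illustration: it compares literal labels as plain Java strings rather than as generated sort keys, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

final class PrefixScanSketch {

    /** Returns the identifiers of all keys that start with the prefix, in key order. */
    static List<Long> prefixScan(NavigableMap<String, Long> term2id, String prefix) {
        List<Long> ids = new ArrayList<>();
        // Keys sharing a prefix are contiguous in sorted order, starting at the prefix itself.
        for (Map.Entry<String, Long> e : term2id.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // first key past the prefix range ends the scan
            }
            ids.add(e.getValue());
        }
        return ids;
    }

    public static void main(String[] args) {
        NavigableMap<String, Long> term2id = new TreeMap<>();
        term2id.put("app", 5L);
        term2id.put("apple", 1L);
        term2id.put("application", 2L);
        term2id.put("apply", 3L);
        term2id.put("banana", 4L);
        System.out.println(prefixScan(term2id, "appl"));
    }
}
```

The scan touches only the contiguous key range for the prefix and stops at the first key outside it, which is why a sorted index suits this kind of query.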
public final KVO<BigdataValue>[] generateSortKeys(LexiconKeyBuilder keyBuilder,
BigdataValue[] terms,
int numTerms)
Parameters:
keyBuilder - The object used to generate the sort keys.
terms - The terms whose sort keys will be generated.
numTerms - The #of terms in that array.
Returns:
Note that KVO.val is null until we know
that we need to write it on the reverse index.
See Also:
LexiconKeyBuilder
public void addTerms(BigdataValue[] terms,
int numTerms,
boolean readOnly)
Note: Duplicate BigdataValue references and BigdataValues
that already have an assigned term identifier are ignored by this
operation.
Note: This implementation is designed to use unisolated batch writes on the terms and ids index that guarantee consistency.
If the full text index is enabled, then the terms will also be inserted into the full text index.
Parameters:
terms - An array whose elements [0:numTerms-1] will be inserted.
numTerms - The #of terms to insert.
readOnly - When true, unknown terms will not be inserted
into the database. Otherwise unknown terms are inserted into
the database.
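The documented addTerms behaviour (duplicates and already-identified values are skipped, and unknown terms are written only when readOnly is false) can be modelled in memory. This is a toy model under stated assumptions, not the library's batch implementation: the Term class, the NULL marker of 0, the sequential id counter and the HashMap are all hypothetical, and the sketch assumes that known terms are still resolved in read-only mode.

```java
import java.util.HashMap;
import java.util.Map;

final class AddTermsSketch {

    /** Hypothetical marker meaning "no term identifier assigned". */
    static final long NULL = 0L;

    /** Hypothetical stand-in for a value that caches its term identifier. */
    static final class Term {
        final String text;
        long termId = NULL;

        Term(String text) {
            this.text = text;
        }
    }

    private final Map<String, Long> term2id = new HashMap<>();
    private long nextId = 1L;

    void addTerms(Term[] terms, int numTerms, boolean readOnly) {
        for (int i = 0; i < numTerms; i++) {
            Term t = terms[i];
            if (t.termId != NULL) {
                continue; // already has an identifier, including duplicate references
            }
            Long id = term2id.get(t.text);
            if (id == null && !readOnly) {
                id = nextId++; // unknown term: assign and record an identifier
                term2id.put(t.text, id);
            }
            if (id != null) {
                t.termId = id; // set on the term as a side-effect
            }
        }
    }

    int size() {
        return term2id.size();
    }

    public static void main(String[] args) {
        AddTermsSketch lexicon = new AddTermsSketch();
        Term a = new Term("a");
        lexicon.addTerms(new Term[] { a, a }, 2, false);
        System.out.println(a.termId + " " + lexicon.size());
    }
}
```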
public void addStatementIdentifiers(ISPO[] a,
int n)
Each distinct StatementEnum.Explicit {s,p,o} is assigned a unique
statement identifier using the LexiconKeyOrder.TERM2ID index. The
assignment of statement identifiers is kept consistent by using an
unisolated atomic write operation similar to
addTerms(BigdataValue[], int, boolean).
Note: Statement identifiers are NOT inserted into the reverse (id:term)
index. Instead, they are written into the values associated with the
{s,p,o} in each of the statement indices. That is handled by
AbstractTripleStore.addStatements(AbstractTripleStore, boolean, IChunkedOrderedIterator, IElementFilter),
which is also responsible for invoking this method in order to have the
statement identifiers on hand before it writes on the statement indices.
Note: The caller's ISPO[] is sorted into SPO order as a
side-effect.
Note: The statement identifiers are assigned to the ISPOs as a
side-effect.
Note: SIDs are NOT supported for quads, so this code is never executed for quads.
protected void indexTermText(int capacity,
Iterator<BigdataValue> itr)
Add the terms to the full text index so that we can do fast lookup of the
corresponding term identifiers. Literals that have a language code
property are parsed using a tokenizer appropriate for the specified
language family. Other literals and URIs are tokenized using the default
Locale.
Parameters:
itr - Iterator visiting the terms to be indexed.
See Also:
#textSearch(String, String)
public final Map<Long,BigdataValue> getTerms(Collection<Long> ids)
Batch resolution of term identifiers to BigdataValues.
Parameters:
ids - A collection of term identifiers.
Returns:
A map from each resolved term identifier to its BigdataValue. If a
term identifier was not resolved then the map will not contain an
entry for that term identifier.
public final BigdataValue getTerm(long id)
Note: BNodes are not stored in the reverse lexicon and are
recognized using AbstractTripleStore.isBNode(long).
Note: Statement identifiers (when enabled) are not stored in the reverse
lexicon and are recognized using
AbstractTripleStore.isStatement(long). If the term identifier is
recognized as being, in fact, a statement identifier, then it is
externalized as a BNode. This fits rather well with the notion
in a quad store that the context position may be either a URI or
a BNode and the fact that you can use BNodes to "stamp"
statement identifiers.
Note: Handles both unisolatable and isolatable indices.
Note: Sets BigdataValue.getTermId() as a side-effect.
Note: this always mints a new BNode instance when the term
identifier identifies a BNode or a Statement.
Returns:
The BigdataValue -or- null iff there is no
BigdataValue for that term identifier in the lexicon.
public final long getTermId(Value value)
Note: If BigdataValue.getTermId() is set, then returns that value
immediately. Otherwise looks up the termId in the index and
sets the term identifier as a
side-effect.
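The short-circuit described for getTermId(Value) can be sketched as a cached lookup. This is an illustration under assumptions, not the library's code: the Term class, the NULL marker of 0 for "unassigned or unknown", the HashMap index and the lookup counter are hypothetical and exist only to make the fast path visible.

```java
import java.util.HashMap;
import java.util.Map;

final class GetTermIdSketch {

    /** Hypothetical marker meaning "no term identifier". */
    static final long NULL = 0L;

    /** Hypothetical stand-in for a value that caches its term identifier. */
    static final class Term {
        final String text;
        long termId = NULL;

        Term(String text) {
            this.text = text;
        }
    }

    final Map<String, Long> term2id = new HashMap<>();
    int indexLookups = 0; // counts how often the index is actually consulted

    long getTermId(Term t) {
        if (t.termId != NULL) {
            return t.termId; // already set on the value: return immediately
        }
        indexLookups++;
        Long id = term2id.get(t.text);
        if (id == null) {
            return NULL; // not found in the index
        }
        t.termId = id; // cache on the value as a side-effect
        return id;
    }

    public static void main(String[] args) {
        GetTermIdSketch lexicon = new GetTermIdSketch();
        lexicon.term2id.put("x", 42L);
        Term t = new Term("x");
        lexicon.getTermId(t);
        lexicon.getTermId(t);
        System.out.println(lexicon.indexLookups);
    }
}
```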
public Iterator<Value> idTermIndexScan()
See Also:
termIdIndexScan(), termIterator()
public Iterator<Long> termIdIndexScan()
public Iterator<Value> termIterator()
Note: While this operation visits the terms in their index order it is
significantly less efficient than idTermIndexScan(). This is
because the keys in the term:id index are formed using an irreversible
technique such that it is not possible to re-materialize the term from
the key. Therefore visiting the terms in term order requires traversal of
the term:id index (so that you are in term order) plus term-by-term
resolution against the id:term index (to decode the term). Since the two
indices are not mutually ordered, that resolution will result in random
hits on the id:term index.
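The cost difference described in this note can be seen in a two-map model. The sketch assumes sorted maps as stand-ins for the two indices and treats the term:id keys as opaque sort keys that are never decoded; the class and method names mirror the documented ones but are hypothetical implementations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

final class TermOrderScanSketch {

    /** Term-key order: scan term:id, then resolve every identifier against id:term. */
    static List<String> termIterator(NavigableMap<String, Long> term2id, Map<Long, String> id2term) {
        List<String> out = new ArrayList<>();
        for (Long id : term2id.values()) { // identifiers arrive in sort-key order
            out.add(id2term.get(id));      // one point lookup per term
        }
        return out;
    }

    /** Identifier order: a single sequential pass over id:term. */
    static List<String> idTermIndexScan(NavigableMap<Long, String> id2term) {
        return new ArrayList<>(id2term.values());
    }

    public static void main(String[] args) {
        NavigableMap<String, Long> term2id = new TreeMap<>();
        NavigableMap<Long, String> id2term = new TreeMap<>();
        String[] terms = { "banana", "cherry", "apple" };
        for (int i = 0; i < terms.length; i++) {
            // The "key:" prefix marks the key as an opaque sort key in this model.
            term2id.put("key:" + terms[i], (long) (i + 1));
            id2term.put((long) (i + 1), terms[i]);
        }
        System.out.println(termIterator(term2id, id2term));
        System.out.println(idTermIndexScan(id2term));
    }
}
```

Because identifiers come out of the first map in sort-key order rather than identifier order, the lookups against the second map jump around its key space, which matches the random hits the note describes.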
public StringBuilder dumpTerms()