com.bigdata.rdf.lexicon
Class Term2IdWriteProc

java.lang.Object
  extended by com.bigdata.btree.proc.AbstractIndexProcedure
      extended by com.bigdata.btree.proc.AbstractKeyArrayIndexProcedure
          extended by com.bigdata.rdf.lexicon.Term2IdWriteProc
All Implemented Interfaces:
IIndexProcedure, IKeyArrayIndexProcedure, IParallelizableIndexProcedure, IMutableRelationIndexWriteProcedure, Externalizable, Serializable

public class Term2IdWriteProc
extends AbstractKeyArrayIndexProcedure
implements IParallelizableIndexProcedure, IMutableRelationIndexWriteProcedure

This unisolated operation inserts terms into the term:id index, assigning identifiers to terms as a side-effect. The use of this operation MUST be followed by the use of Id2TermWriteProc to ensure that the reverse mapping from id to term is defined before any statements are inserted using the assigned term identifiers. The client MUST NOT make assertions using the assigned term identifiers until the corresponding Id2TermWriteProc operation has succeeded.

The client may fail for any reason after the forward mapping has been made restart-safe and before the reverse mapping has been made restart-safe. For the lexicon to remain consistent in that case, clients MUST always use a successful Term2IdWriteProc followed by a successful Id2TermWriteProc before inserting statements using term identifiers into the statement indices. In particular, a client MUST NOT treat a lookup against the terms index as satisfactory evidence that the term also exists in the reverse mapping.
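The ordering rule above can be illustrated with a minimal in-memory sketch. It does not use the bigdata API: the class LexiconOrderingSketch, its two maps, and the methods forward, reverse and resolveForStatements are hypothetical names introduced only for this illustration. The point it shows is that the reverse write is repeated even when the forward lookup finds the term, which covers a client that failed between the two writes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for the two lexicon indices; not the bigdata API.
class LexiconOrderingSketch {
    final Map<String, Long> term2id = new TreeMap<>(); // forward mapping
    final Map<Long, String> id2term = new HashMap<>(); // reverse mapping
    private long counter = 0;

    // Forward write: look the term up, assigning a new identifier if absent.
    synchronized long forward(String term) {
        Long id = term2id.get(term);
        if (id == null) {
            id = ++counter;
            term2id.put(term, id);
        }
        return id;
    }

    // Reverse write: idempotent, so repeating it after a failure is harmless.
    synchronized void reverse(long id, String term) {
        id2term.put(id, term);
    }

    // Always run both writes, even when the forward lookup finds the term.
    long resolveForStatements(String term) {
        long id = forward(term);
        reverse(id, term);
        return id;
    }

    public static void main(String[] args) {
        LexiconOrderingSketch lex = new LexiconOrderingSketch();
        // Simulate a client that fails after the forward write only.
        long id = lex.forward("http://example.org/a");
        System.out.println("reverse entry present: " + lex.id2term.containsKey(id));
        // A later client repeats both steps and repairs the reverse mapping.
        long again = lex.resolveForStatements("http://example.org/a");
        System.out.println("same id: " + (again == id)
                + ", reverse entry present: " + lex.id2term.containsKey(id));
    }
}
```

Only the identifier returned by resolveForStatements would be used when writing statements in this sketch.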

Note that it is perfectly possible that a concurrent client will overlap in the terms being inserted. The results will always be fully consistent if the rules of the road are observed since (a) unisolated operations are single-threaded; (b) term identifiers are assigned in an unisolated atomic operation by Term2IdWriteProc; and (c) the reverse mapping is made consistent with the assignments made/discovered by the forward mapping.
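The overlap case can be sketched as follows, assuming a single in-memory index whose lookup-or-assign step is serialized with a Java monitor in place of the unisolated, single-threaded index operation. OverlapSketch, assign and runTwoClients are hypothetical names, not part of bigdata. Two clients submit overlapping term sets concurrently, and each shared term ends up with exactly one identifier regardless of which client reached it first.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the terms index; the monitor models point (a) and (b).
class OverlapSketch {
    private final Map<String, Long> term2id = new TreeMap<>();
    private long counter = 0;

    // Atomic lookup-or-assign: only one caller at a time runs this body.
    synchronized long assign(String term) {
        Long id = term2id.get(term);
        if (id == null) {
            id = ++counter;
            term2id.put(term, id);
        }
        return id;
    }

    // Runs two clients concurrently and returns the identifiers each one observed.
    static long[][] runTwoClients(String[] first, String[] second) {
        OverlapSketch ndx = new OverlapSketch();
        long[] ids1 = new long[first.length];
        long[] ids2 = new long[second.length];
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < first.length; i++) ids1[i] = ndx.assign(first[i]);
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < second.length; i++) ids2[i] = ndx.assign(second[i]);
        });
        t1.start();
        t2.start();
        try {
            t1.join(); // join() also makes the array writes visible to this thread
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new long[][] { ids1, ids2 };
    }
}
```

The identifiers themselves depend on thread scheduling; only the agreement between clients on shared terms is guaranteed by this sketch.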

Note: The Term2IdWriteProc and Id2TermWriteProc operations may be analyzed as a batch variant of the following pseudo code.

  
  for each term:

    termId = null;

    synchronized (ndx) {

      counter = ndx.getCounter();

      termId = ndx.lookup(term.key);

      if (termId == null) {

        termId = counter.inc();

        ndx.insert(term.key, termId);

      }

    }
  
 
In addition, the actual operations against scale-out indices are performed on index partitions rather than on the whole index.
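A batch, per-partition variant of the pseudo code might look like the sketch below. Everything in it is an assumption made for illustration: the class PartitionedLexiconSketch, the use of String keys instead of serialized byte[] keys, the fixed separator key "m", and in particular the way an identifier is formed from a partition id in the high 32 bits and a partition-local counter in the low 32 bits. It is not the bigdata implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

class PartitionedLexiconSketch {
    // One hypothetical index partition with its own sorted map and local counter.
    static final class Partition {
        final int partitionId;
        final TreeMap<String, Long> ndx = new TreeMap<>();
        long localCounter = 0;

        Partition(int partitionId) {
            this.partitionId = partitionId;
        }

        // Batch form of the pseudo code: one atomic pass over all keys.
        synchronized long[] applyBatch(String[] keys) {
            long[] ids = new long[keys.length];
            for (int i = 0; i < keys.length; i++) {
                Long id = ndx.get(keys[i]);
                if (id == null) {
                    // Assumed layout: partition id in the high word, local counter in the low word.
                    id = ((long) partitionId << 32) | ++localCounter;
                    ndx.put(keys[i], id);
                }
                ids[i] = id;
            }
            return ids;
        }
    }

    final Partition[] partitions = { new Partition(0), new Partition(1) };

    // Hypothetical routing rule: keys below "m" go to partition 0, the rest to partition 1.
    Partition route(String key) {
        return key.compareTo("m") < 0 ? partitions[0] : partitions[1];
    }

    // Splits the keys by partition, applies one batch per partition, and
    // scatters the identifiers back into the caller's key order.
    long[] write(String[] keys) {
        long[] out = new long[keys.length];
        for (Partition p : partitions) {
            List<Integer> positions = new ArrayList<>();
            for (int i = 0; i < keys.length; i++) {
                if (route(keys[i]) == p) positions.add(i);
            }
            String[] batch = new String[positions.size()];
            for (int j = 0; j < batch.length; j++) batch[j] = keys[positions.get(j)];
            long[] ids = p.applyBatch(batch);
            for (int j = 0; j < ids.length; j++) out[positions.get(j)] = ids[j];
        }
        return out;
    }
}
```

Because each partition draws from its own counter, the partitions in this sketch never need to coordinate to keep identifiers distinct.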

Version:
$Id: Term2IdWriteProc.java 5062 2011-08-20 23:37:29Z mrpersonick $
Author:
Bryan Thompson
See Also:
Serialized Form

Nested Class Summary
static class Term2IdWriteProc.Result
          Object encapsulates the discovered / assigned term identifiers and provides efficient serialization for communication of those data to the client.
static class Term2IdWriteProc.Term2IdWriteProcConstructor
           
 
Nested classes/interfaces inherited from class com.bigdata.btree.proc.AbstractKeyArrayIndexProcedure
AbstractKeyArrayIndexProcedure.ResultBitBuffer, AbstractKeyArrayIndexProcedure.ResultBitBufferCounter, AbstractKeyArrayIndexProcedure.ResultBitBufferHandler, AbstractKeyArrayIndexProcedure.ResultBuffer, AbstractKeyArrayIndexProcedure.ResultBufferHandler
 
Field Summary
protected static org.apache.log4j.Logger log
           
 
Fields inherited from class com.bigdata.btree.proc.AbstractKeyArrayIndexProcedure
DEBUG
 
Constructor Summary
  Term2IdWriteProc()
          De-serialization constructor.
protected Term2IdWriteProc(IRabaCoder keySer, int fromIndex, int toIndex, byte[][] keys, boolean readOnly, boolean storeBlankNodes, int scaleOutTermIdBitsToReverse)
           
 
Method Summary
 Object apply(IIndex ndx)
          For each term whose serialized key is mapped to the current index partition, look up the term in the terms index.
 boolean isReadOnly()
          Return true iff the procedure asserts that it will not write on the index.
 boolean isStoreBlankNodes()
           
protected  void readMetadata(ObjectInput in)
          Reads metadata written by AbstractKeyArrayIndexProcedure.writeMetadata(ObjectOutput).
static VTE VTE(byte code)
           
protected  void writeMetadata(ObjectOutput out)
          Writes metadata (not the keys or values, but just other metadata used by the procedure).
 
Methods inherited from class com.bigdata.btree.proc.AbstractKeyArrayIndexProcedure
getKey, getKeyCount, getKeys, getValue, getValues, readExternal, writeExternal
 
Methods inherited from class com.bigdata.btree.proc.AbstractIndexProcedure
getKeyBuilder
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

log

protected static final org.apache.log4j.Logger log
Constructor Detail

Term2IdWriteProc

public Term2IdWriteProc()
De-serialization constructor.


Term2IdWriteProc

protected Term2IdWriteProc(IRabaCoder keySer,
                           int fromIndex,
                           int toIndex,
                           byte[][] keys,
                           boolean readOnly,
                           boolean storeBlankNodes,
                           int scaleOutTermIdBitsToReverse)
Method Detail

isReadOnly

public final boolean isReadOnly()
Description copied from interface: IIndexProcedure
Return true iff the procedure asserts that it will not write on the index. When true, the procedure may be run against a view of the index that is read-only or which allows concurrent processes to read on the same index object. When false, the procedure will be run against a mutable view of the index (assuming that the procedure is executed in a context that has access to a mutable index view).

Specified by:
isReadOnly in interface IIndexProcedure

isStoreBlankNodes

public final boolean isStoreBlankNodes()

apply

public Object apply(IIndex ndx)
For each term whose serialized key is mapped to the current index partition, look up the term in the terms index. If it is there, then note its assigned termId. Otherwise, use the partition-local counter to assign the term identifier, note the term identifier so that it can be communicated back to the client, and insert the {term,termId} entry into the terms index.

Specified by:
apply in interface IIndexProcedure
Parameters:
ndx - The terms index.
Returns:
The Term2IdWriteProc.Result, which contains the discovered / assigned term identifiers.
TODO: no point sending bnodes when readOnly.
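
The behaviour described for apply can be sketched in isolation as follows. This is not the actual method body: ApplySketch is a hypothetical class, String keys stand in for the serialized byte[] keys, and the sentinel NULL = 0L for a term that is not found in read-only mode is an assumption of this sketch. The sketch also assumes, based on the isReadOnly contract, that a read-only invocation reports unknown terms without assigning identifiers.

```java
import java.util.TreeMap;

class ApplySketch {
    // Hypothetical sentinel returned for unknown terms when readOnly is true.
    static final long NULL = 0L;

    final TreeMap<String, Long> ndx = new TreeMap<>();
    long counter = 0;

    // Processes keys[fromIndex..toIndex) and returns one identifier per key.
    synchronized long[] apply(String[] keys, int fromIndex, int toIndex, boolean readOnly) {
        long[] ids = new long[toIndex - fromIndex];
        for (int i = fromIndex; i < toIndex; i++) {
            Long id = ndx.get(keys[i]);
            if (id == null) {
                if (readOnly) {
                    // Read-only: report "not found" and leave the index untouched.
                    ids[i - fromIndex] = NULL;
                    continue;
                }
                // Writable: assign from the local counter and record the entry.
                id = ++counter;
                ndx.put(keys[i], id);
            }
            ids[i - fromIndex] = id;
        }
        return ids;
    }
}
```

A caller of this sketch would first try a read-only pass and fall back to a writable pass only for the keys that came back as NULL.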

readMetadata

protected void readMetadata(ObjectInput in)
                     throws IOException,
                            ClassNotFoundException
Description copied from class: AbstractKeyArrayIndexProcedure
Reads metadata written by AbstractKeyArrayIndexProcedure.writeMetadata(ObjectOutput).

Overrides:
readMetadata in class AbstractKeyArrayIndexProcedure
Throws:
IOException
ClassNotFoundException

writeMetadata

protected void writeMetadata(ObjectOutput out)
                      throws IOException
Writes metadata (not the keys or values, but just other metadata used by the procedure).

The default implementation writes toIndex - fromIndex, which is the number of keys.

Overrides:
writeMetadata in class AbstractKeyArrayIndexProcedure
Parameters:
out -
Throws:
IOException
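
A subclass that carries extra flags would typically write them after the key count and read them back in the same order. The sketch below shows that round trip using only java.io. MetadataSketch is a hypothetical class; the field names mirror the constructor parameters listed on this page, but the order and encoding used here are assumptions and may differ from what Term2IdWriteProc actually writes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

class MetadataSketch {
    int keyCount; // stands in for toIndex - fromIndex
    boolean readOnly;
    boolean storeBlankNodes;
    int scaleOutTermIdBitsToReverse;

    // Writes the key count first, then the procedure-specific flags (assumed order).
    void writeMetadata(ObjectOutput out) throws IOException {
        out.writeInt(keyCount);
        out.writeBoolean(readOnly);
        out.writeBoolean(storeBlankNodes);
        out.writeInt(scaleOutTermIdBitsToReverse);
    }

    // Reads the fields back in exactly the order they were written.
    void readMetadata(ObjectInput in) throws IOException {
        keyCount = in.readInt();
        readOnly = in.readBoolean();
        storeBlankNodes = in.readBoolean();
        scaleOutTermIdBitsToReverse = in.readInt();
    }

    // Serializes to a byte array and deserializes into a fresh instance.
    static MetadataSketch roundTrip(MetadataSketch src) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                src.writeMetadata(out);
            }
            MetadataSketch copy = new MetadataSketch();
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                copy.readMetadata(in);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Keeping the write and read methods symmetric is what allows the de-serialization constructor to start from an empty instance.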

VTE

public static final VTE VTE(byte code)


Copyright © 2006-2011 SYSTAP, LLC. All Rights Reserved.