com.bigdata.rdf.spo
Class FastRDFValueCoder2

java.lang.Object
  extended by com.bigdata.rdf.spo.FastRDFValueCoder2
All Implemented Interfaces:
IRabaCoder, Externalizable, Serializable

public class FastRDFValueCoder2
extends Object
implements Externalizable, IRabaCoder

Coder for the statement index with inference enabled but without SIDs. Each statement's value is encoded in 4 bits. The first bit is the override flag. The next two bits are the statement type {inferred, explicit, or axiom}. The last bit is not used. The bit sequence 0111 is a placeholder for a null value and de-serializes to null. The 4-bit value is just the low nibble of StatementEnum.code(). This "nibble" encoding makes it fast and easy to extract the value from the coded record. The first value is stored in the low nibble of the first byte, the next in the high nibble, and the one after that in the low nibble of the next byte.
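The nibble layout described above can be sketched as follows. This is a minimal illustration, not the coder's actual implementation: the class name NibblePackingSketch and its methods are hypothetical, and the only details taken from the description are the low-nibble-first ordering and the 0111 null placeholder.

```java
// Hypothetical sketch of low-nibble-first packing; not the bigdata implementation.
final class NibblePackingSketch {

    /** Bit pattern 0111, the null placeholder named in the class description. */
    static final int NULL_NIBBLE = 0x7;

    /**
     * Packs one 4-bit value per entry. Even indices land in the low nibble of a
     * byte, odd indices in the high nibble of the same byte.
     */
    static byte[] pack(int[] nibbles) {
        // Two values per byte, rounding up for an odd count.
        byte[] packed = new byte[(nibbles.length + 1) / 2];
        for (int i = 0; i < nibbles.length; i++) {
            int v = nibbles[i] & 0x0f; // keep only the low 4 bits
            if ((i & 1) == 0) {
                packed[i >> 1] |= (byte) v;        // low nibble
            } else {
                packed[i >> 1] |= (byte) (v << 4); // high nibble
            }
        }
        return packed;
    }

    /** Extracts the 4-bit value at the given index without unpacking the array. */
    static int get(byte[] packed, int index) {
        int b = packed[index >> 1] & 0xff; // avoid sign extension
        return (index & 1) == 0 ? (b & 0x0f) : (b >>> 4);
    }
}
```

With this layout, three values occupy two bytes, and reading any value costs one array access, one mask, and at most one shift.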

Note: the 'override' flag is NOT stored in the statement indices, but it is passed by the procedure that writes on the statement indices so that we can decide whether or not to override the type when the statement is pre-existing in the index.
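As a rough sketch of the idea in this note, the writer can strip the override flag before the value is stored and use it only to decide what to store. Everything here is an assumption made for illustration: the class OverrideFlagSketch, the position of the override bit (taken to be the high bit of the nibble), and the policy of keeping the existing type when override is not set. None of it is confirmed by this page.

```java
// Hypothetical sketch; the bit position and the no-override policy are assumptions.
final class OverrideFlagSketch {

    /** ASSUMPTION: the override flag is the high bit of the 4-bit value. */
    static final int OVERRIDE_MASK = 0x8;

    /**
     * Decides which nibble to store for a statement.
     *
     * @param existing the nibble already stored in the index, or -1 if the
     *                 statement is not yet in the index
     * @param incoming the nibble passed by the writer, possibly with the
     *                 override flag set
     * @return the nibble to store; it never carries the override flag
     */
    static int resolve(int existing, int incoming) {
        boolean override = (incoming & OVERRIDE_MASK) != 0;
        int incomingType = incoming & ~OVERRIDE_MASK & 0x0f; // flag is not stored
        if (existing < 0 || override) {
            return incomingType; // new statement, or caller forces the type
        }
        return existing; // ASSUMED policy: keep the pre-existing type
    }
}
```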

Note: this coder cannot be used if AbstractTripleStore.Options#STATEMENT_IDENTIFIERS is enabled.

Version:
$Id: FastRDFValueCoder2.java 2265 2009-10-26 12:51:06Z thompsonbry $
Author:
Bryan Thompson
See Also:
StatementEnum, Serialized Form
TODO:
Fast coder for SIDs+type? E.g., SID[size] followed by nibble[size]?
A mutable coded value raba could be implemented for the statement indices. With a fixed bit length per value, we can represent the data in m/2 bytes. This is also true for things like TERM2ID where the values could be represented as a long[].

Field Summary
protected static org.apache.log4j.Logger log
           
 
Constructor Summary
FastRDFValueCoder2()
          Sole constructor (handles de-serialization also).
 
Method Summary
 ICodedRaba decode(AbstractFixedByteArrayBuffer data)
          Return an IRaba which can access the coded data.
 AbstractFixedByteArrayBuffer encode(IRaba raba, DataOutputBuffer buf)
          Encode the data.
 ICodedRaba encodeLive(IRaba raba, DataOutputBuffer buf)
          Encode the data, returning an ICodedRaba.
 boolean isKeyCoder()
          No.
 boolean isValueCoder()
          Yes.
 void readExternal(ObjectInput in)
           
 void writeExternal(ObjectOutput out)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

log

protected static final org.apache.log4j.Logger log
Constructor Detail

FastRDFValueCoder2

public FastRDFValueCoder2()
Sole constructor (handles de-serialization also).

Method Detail

isKeyCoder

public final boolean isKeyCoder()
No.

Specified by:
isKeyCoder in interface IRabaCoder

isValueCoder

public final boolean isValueCoder()
Yes.

Specified by:
isValueCoder in interface IRabaCoder

writeExternal

public void writeExternal(ObjectOutput out)
                   throws IOException
Specified by:
writeExternal in interface Externalizable
Throws:
IOException

readExternal

public void readExternal(ObjectInput in)
                  throws IOException,
                         ClassNotFoundException
Specified by:
readExternal in interface Externalizable
Throws:
IOException
ClassNotFoundException

encode

public AbstractFixedByteArrayBuffer encode(IRaba raba,
                                           DataOutputBuffer buf)
Description copied from interface: IRabaCoder
Encode the data.

Note: Implementations of this method are typically heavy. While it is always valid to encode an IRaba with IRabaCoder.encode(IRaba, DataOutputBuffer), DO NOT invoke this arbitrarily on data which may already be coded. The ICodedRaba interface will always be implemented for coded data.

Specified by:
encode in interface IRabaCoder
Parameters:
raba - The data.
buf - A buffer on which the coded data will be written.
Returns:
A slice onto the post-condition state of the caller's buffer whose view corresponds to the coded record. This may be written directly onto an output stream or the slice may be converted to an exact fit byte[].
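Because coded data always implements ICodedRaba, a caller can avoid a redundant, heavy encode with a type check. The sketch below illustrates that guard using small stand-in interfaces (Raba, CodedRaba, RabaCoder) that are hypothetical simplifications and do not reproduce the real IRaba, ICodedRaba, or IRabaCoder signatures.

```java
// Hypothetical stand-ins for the real interfaces, for illustration only.
final class EncodeOnceSketch {

    /** Stand-in for IRaba. */
    interface Raba {
        int size();
    }

    /** Stand-in for ICodedRaba; the real data() does not return a byte[]. */
    interface CodedRaba extends Raba {
        byte[] data();
    }

    /** Stand-in for IRabaCoder, reduced to a single encode call. */
    interface RabaCoder {
        CodedRaba encode(Raba raba);
    }

    /** Encodes only when the data is not already in coded form. */
    static CodedRaba encodeIfNeeded(RabaCoder coder, Raba raba) {
        if (raba instanceof CodedRaba) {
            return (CodedRaba) raba; // already coded, skip the heavy encode
        }
        return coder.encode(raba);
    }
}
```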

encodeLive

public ICodedRaba encodeLive(IRaba raba,
                             DataOutputBuffer buf)
Description copied from interface: IRabaCoder
Encode the data, returning an ICodedRaba. Implementations of this method should be optimized for the very common use case where the caller requires immediate access to the coded data record. In that case, many of the IRabaCoder implementations can be optimized by passing the underlying decoding object directly into an alternative constructor for the ICodedRaba. The byte[] slice for the coded data record is available from ICodedRaba.data().

This method covers the vast majority of the use cases for coding data, which is to code B+Tree keys or values for a node or leaf that has been evicted from the AbstractBTree's write retention queue. The common use case is to wrap a coded record that was read from an IRawStore. The IndexSegmentBuilder is a special case, since the coded record will not be used other than to write it on the disk.

Specified by:
encodeLive in interface IRabaCoder

decode

public ICodedRaba decode(AbstractFixedByteArrayBuffer data)
Description copied from interface: IRabaCoder
Return an IRaba which can access the coded data. In general, implementations SHOULD NOT materialize a backing byte[][]. Instead, the implementation should access the data in place within the caller's buffer. Frequently used fields MAY be cached, but the whole point of the IRabaCoder is to minimize the in-memory footprint for the B+Tree by using a coded (aka compressed) representation of the keys and values whenever possible.

Specified by:
decode in interface IRabaCoder
Parameters:
data - The record containing the coded data.
Returns:
A view of the coded data.
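The "access the data in place" guidance can be illustrated with a read-only view over a packed nibble region. This is a sketch under stated assumptions, not the class returned by decode: NibbleViewSketch is a hypothetical name, the offset parameter stands in for whatever header the real record carries, and returning each value as a one-byte array is an assumption. Only the low-nibble-first order and the 0111 null placeholder come from the class description.

```java
// Hypothetical read-only view; it never copies or unpacks the caller's buffer.
final class NibbleViewSketch {

    private static final int NULL_NIBBLE = 0x7; // bit pattern 0111

    private final byte[] buf; // the caller's buffer, shared rather than copied
    private final int off;    // offset of the first packed byte (assumed layout)
    private final int size;   // number of 4-bit values in the record

    NibbleViewSketch(byte[] buf, int off, int size) {
        this.buf = buf;
        this.off = off;
        this.size = size;
    }

    int size() {
        return size;
    }

    boolean isNull(int index) {
        return nibble(index) == NULL_NIBBLE;
    }

    /** Materializes a single value on demand; null for the 0111 placeholder. */
    byte[] get(int index) {
        int v = nibble(index);
        return v == NULL_NIBBLE ? null : new byte[] {(byte) v};
    }

    private int nibble(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index=" + index);
        }
        int b = buf[off + (index >> 1)] & 0xff;
        return (index & 1) == 0 ? (b & 0x0f) : (b >>> 4);
    }
}
```

The view holds three fields regardless of how many values the record contains, which is the footprint property the description asks implementations to preserve.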


Copyright © 2006-2009 SYSTAP, LLC. All Rights Reserved.