com.bigdata.btree.raba.codec
Class FrontCodedRabaCoder

java.lang.Object
  extended by com.bigdata.btree.raba.codec.FrontCodedRabaCoder
All Implemented Interfaces:
IRabaCoder, Externalizable, Serializable
Direct Known Subclasses:
FrontCodedRabaCoder.DefaultFrontCodedRabaCoder

public class FrontCodedRabaCoder
extends Object
implements IRabaCoder, Externalizable

Class provides (de-)compression for logical byte[][]s based on front coding. The data MUST be ordered. null values are not allowed.
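To make the idea of front coding concrete, the following is a minimal sketch, not the bigdata implementation. It assumes the simplest possible layout: each entry stores the number of leading bytes it shares with the previous entry plus the remaining suffix, and every ratio-th entry is stored in full. The class `FrontCodingSketch`, its nested `Entry` type, and its method names are hypothetical and exist only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative front coding of sorted byte[]s (hypothetical, not the bigdata code). */
class FrontCodingSketch {

    /** One coded entry: the count of bytes shared with the previous key, plus the suffix. */
    static final class Entry {
        final int prefixLen;
        final byte[] suffix;
        Entry(int prefixLen, byte[] suffix) {
            this.prefixLen = prefixLen;
            this.suffix = suffix;
        }
    }

    /** Length of the common prefix of a and b. */
    static int commonPrefix(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        int i = 0;
        while (i < n && a[i] == b[i]) i++;
        return i;
    }

    /** Codes sorted keys; every ratio-th key is stored in full (prefixLen == 0). */
    static List<Entry> encode(byte[][] sorted, int ratio) {
        List<Entry> out = new ArrayList<>();
        for (int i = 0; i < sorted.length; i++) {
            if (sorted[i] == null) throw new IllegalArgumentException("null not allowed");
            int p = (i % ratio == 0) ? 0 : commonPrefix(sorted[i - 1], sorted[i]);
            out.add(new Entry(p, Arrays.copyOfRange(sorted[i], p, sorted[i].length)));
        }
        return out;
    }

    /** Rebuilds the keys by copying each shared prefix from the previously decoded key. */
    static byte[][] decode(List<Entry> coded) {
        byte[][] out = new byte[coded.size()][];
        for (int i = 0; i < out.length; i++) {
            Entry e = coded.get(i);
            byte[] key = new byte[e.prefixLen + e.suffix.length];
            if (e.prefixLen > 0) System.arraycopy(out[i - 1], 0, key, 0, e.prefixLen);
            System.arraycopy(e.suffix, 0, key, e.prefixLen, e.suffix.length);
            out[i] = key;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[][] keys = {
            "car".getBytes(StandardCharsets.US_ASCII),
            "card".getBytes(StandardCharsets.US_ASCII),
            "care".getBytes(StandardCharsets.US_ASCII),
            "cat".getBytes(StandardCharsets.US_ASCII)
        };
        List<Entry> coded = encode(keys, 2);
        for (Entry e : coded) {
            System.out.println(e.prefixLen + " + " + new String(e.suffix, StandardCharsets.US_ASCII));
        }
        System.out.println(Arrays.deepEquals(keys, decode(coded)));
    }
}
```

With a ratio of 2, "card" is stored as a shared prefix of 3 plus "d", while "care" falls on a ratio boundary and is stored in full. The sketch also shows why the data must be ordered: the savings come from adjacent keys sharing long prefixes.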

Version:
$Id: FrontCodedRabaCoder.java 2966 2010-06-03 18:31:54Z thompsonbry $
Author:
Bryan Thompson
See Also:
Serialized Form

Nested Class Summary
static class FrontCodedRabaCoder.DefaultFrontCodedRabaCoder
          A pre-parameterized version of the FrontCodedRabaCoder which is used as the default IRabaCoder for B+Tree keys for both nodes and leaves.
 
Field Summary
protected static org.apache.log4j.Logger log
           
 
Constructor Summary
FrontCodedRabaCoder()
          De-serialization ctor.
FrontCodedRabaCoder(int ratio)
           
 
Method Summary
 ICodedRaba decode(AbstractFixedByteArrayBuffer data)
          Return an IRaba which can access the coded data.
 AbstractFixedByteArrayBuffer encode(IRaba raba, DataOutputBuffer buf)
          Encode the data.
 ICodedRaba encodeLive(IRaba raba, DataOutputBuffer buf)
          Encode the data, returning an ICodedRaba.
 boolean isKeyCoder()
          Return true if this implementation can code B+Tree keys (supports search on the coded representation).
 boolean isValueCoder()
          Return true if this implementation can code B+Tree values (allows nulls).
 void readExternal(ObjectInput in)
           
 String toString()
           
 void writeExternal(ObjectOutput out)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

log

protected static final org.apache.log4j.Logger log
Constructor Detail

FrontCodedRabaCoder

public FrontCodedRabaCoder()
De-serialization ctor.


FrontCodedRabaCoder

public FrontCodedRabaCoder(int ratio)
Parameters:
ratio - The ratio as defined by ByteArrayFrontCodedList. For front-coding, compression trades directly for search performance. Every ratio-th byte[] is fully coded. Binary search is used on the fully coded byte[]s and will identify a bucket of ratio front-coded values. Linear search is then performed within the bucket of front-coded values in which the key would be found if it is present. Therefore the ratio is also the maximum number of steps in the linear scan.

Let m := n / ratio, where n is the number of entries in the byte[][] (the size of the total search problem), m is the size of the binary search problem, and ratio is the size of the linear search problem. Solving for ratio, we have: ratio := n / m. Some examples:

 m = n(64)/ratio(16) = 4
 
 m = n(64)/ratio(8) = 8
 
 m = n(64)/ratio(6) ≈ 11
 
 m = n(64)/ratio(4) = 16
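
The two-phase search described above can be sketched as follows. This is an illustration under simplifying assumptions, not the library's code: it searches a plain sorted byte[][] rather than a coded record, treats indices 0, ratio, 2*ratio, ... as the fully coded bucket heads, and rounds m up so that a partial last bucket is counted. The class `BucketSearchSketch` and its methods are hypothetical names.

```java
/** Illustrative binary-then-linear search over buckets of size ratio (hypothetical). */
class BucketSearchSketch {

    /** Unsigned lexicographic comparison of two byte[]s. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = (a[i] & 0xff) - (b[i] & 0xff);
            if (c != 0) return c;
        }
        return a.length - b.length;
    }

    /** Size of the binary search problem: m = ceil(n / ratio). */
    static int bucketCount(int n, int ratio) {
        return (n + ratio - 1) / ratio;
    }

    /** Returns the index of key in sorted, or -1 if it is absent. */
    static int search(byte[][] sorted, int ratio, byte[] key) {
        int m = bucketCount(sorted.length, ratio);
        // Phase 1: binary search over the bucket heads for the last head <= key.
        int lo = 0, hi = m - 1, bucket = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (compare(sorted[mid * ratio], key) <= 0) {
                bucket = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (bucket < 0) return -1; // key sorts before the first entry
        // Phase 2: linear scan inside the bucket, at most ratio comparisons.
        int end = Math.min(sorted.length, (bucket + 1) * ratio);
        for (int i = bucket * ratio; i < end; i++) {
            int c = compare(sorted[i], key);
            if (c == 0) return i;
            if (c > 0) break; // passed the position where key would be
        }
        return -1;
    }

    public static void main(String[] args) {
        // Reproduces the m values from the examples for n = 64.
        for (int ratio : new int[] {16, 8, 6, 4}) {
            System.out.println("ratio=" + ratio + " m=" + bucketCount(64, ratio));
        }
    }
}
```

A larger ratio shrinks the binary search and improves compression, but lengthens the linear scan, since values inside a bucket of a front-coded record must be reconstructed one after another.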
 
Method Detail

toString

public String toString()
Overrides:
toString in class Object

isKeyCoder

public final boolean isKeyCoder()
Description copied from interface: IRabaCoder
Return true if this implementation can code B+Tree keys (supports search on the coded representation). Note that some implementations can code either keys or values.

Specified by:
isKeyCoder in interface IRabaCoder

isValueCoder

public final boolean isValueCoder()
Description copied from interface: IRabaCoder
Return true if this implementation can code B+Tree values (allows nulls). Note that some implementations can code either keys or values.

Specified by:
isValueCoder in interface IRabaCoder

readExternal

public void readExternal(ObjectInput in)
                  throws IOException,
                         ClassNotFoundException
Specified by:
readExternal in interface Externalizable
Throws:
IOException
ClassNotFoundException

writeExternal

public void writeExternal(ObjectOutput out)
                   throws IOException
Specified by:
writeExternal in interface Externalizable
Throws:
IOException
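
The de-serialization constructor exists because java.io.Externalizable requires a public no-argument constructor: the stream first creates an empty instance and then calls readExternal to restore its state. The sketch below shows that round trip for a coder-like object holding only a ratio. The class `RatioCoderSketch` is hypothetical, and writing the ratio as a single int is an assumption made for illustration; the actual serialized form of FrontCodedRabaCoder is not described here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

/** Illustrative Externalizable holder for a ratio (hypothetical, not the bigdata class). */
class RatioCoderSketch implements Externalizable {

    private int ratio;

    /** De-serialization constructor: must be public and take no arguments. */
    public RatioCoderSketch() {
    }

    public RatioCoderSketch(int ratio) {
        this.ratio = ratio;
    }

    public int getRatio() {
        return ratio;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(ratio); // assumed layout: the ratio as one int
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        ratio = in.readInt();
    }

    /** Serializes and de-serializes the object in memory. */
    static RatioCoderSketch roundTrip(RatioCoderSketch coder) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(coder);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (RatioCoderSketch) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new RatioCoderSketch(8)).getRatio());
    }
}
```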

encodeLive

public ICodedRaba encodeLive(IRaba raba,
                             DataOutputBuffer buf)
Description copied from interface: IRabaCoder
Encode the data, returning an ICodedRaba. Implementations of this method should be optimized for the very common use case where the caller requires immediate access to the coded data record. In that case, many of the IRabaCoder implementations can be optimized by passing the underlying decoding object directly into an alternative constructor for the ICodedRaba. The byte[] slice for the coded data record is available from ICodedRaba.data().

This method covers the vast majority of the use cases for coding data, which is to code B+Tree keys or values for a node or leaf that has been evicted from the AbstractBTree's write retention queue. The common use case is to wrap a coded record that was read from an IRawStore. The IndexSegmentBuilder is a special case, since the coded record will not be used other than to write it on the disk.

Specified by:
encodeLive in interface IRabaCoder

encode

public AbstractFixedByteArrayBuffer encode(IRaba raba,
                                           DataOutputBuffer buf)
Description copied from interface: IRabaCoder
Encode the data.

Note: Implementations of this method are typically heavy. While it is always valid to encode an IRaba using IRabaCoder.encode(IRaba, DataOutputBuffer), DO NOT invoke this arbitrarily on data which may already be coded. The ICodedRaba interface will always be implemented for coded data.

Specified by:
encode in interface IRabaCoder
Parameters:
raba - The data.
buf - A buffer on which the coded data will be written.
Returns:
A slice onto the post-condition state of the caller's buffer whose view corresponds to the coded record. This may be written directly onto an output stream or the slice may be converted to an exact fit byte[].

decode

public ICodedRaba decode(AbstractFixedByteArrayBuffer data)
Description copied from interface: IRabaCoder
Return an IRaba which can access the coded data. In general, implementations SHOULD NOT materialize a backing byte[][]. Instead, the implementation should access the data in place within the caller's buffer. Frequently used fields MAY be cached, but the whole point of the IRabaCoder is to minimize the in-memory footprint for the B+Tree by using a coded (aka compressed) representation of the keys and values whenever possible.

Specified by:
decode in interface IRabaCoder
Parameters:
data - The record containing the coded data.
Returns:
A view of the coded data.


Copyright © 2006-2011 SYSTAP, LLC. All Rights Reserved.