com.bigdata.sparse
Class KeyDecoder

java.lang.Object
  extended by com.bigdata.sparse.KeyDecoder

public class KeyDecoder
extends Object

A utility class that decodes a key in a SparseRowStore into the KeyType of the primary key, the column name, and the timestamp.

The exact schema name is not recoverable because it is encoded using a non-reversible algorithm (it is a sort key generated by a Unicode collator). Likewise, the primary key can be decoded for primitive data types, but for a Unicode KeyType we can only identify the bytes corresponding to the primary key; we cannot decode them (they are also a sort key generated by a Unicode collator).

The column name is NOT stored with Unicode compression, so it can be decoded without loss (it is encoded into bytes using UTF-8 and those bytes are written directly into the key). This means that column names are NOT ordered according to the Unicode collator. In practice this is not a problem since we never assume order for that part of the key. The SparseRowStore relies only on {columnName,timestamp} defining the semantics of distinct keys for a given {schema,primaryKey} prefix.
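The difference between the two encodings can be illustrated with a small sketch. It is not taken from the KeyDecoder source: the class name ColumnNameEncodingSketch and its methods are hypothetical, and the JDK's java.text.Collator at PRIMARY strength stands in for whatever Unicode collator the SparseRowStore is actually configured with.

```java
import java.nio.charset.StandardCharsets;
import java.text.Collator;
import java.util.Locale;

/** Hypothetical illustration only; not part of com.bigdata.sparse. */
final class ColumnNameEncodingSketch {

    /** UTF-8 bytes, as described for column names: reversible. */
    static byte[] encodeUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String decodeUtf8(byte[] b) {
        return new String(b, StandardCharsets.UTF_8);
    }

    /**
     * Sort key from a JDK collator (an assumed stand-in for the real
     * collator). Distinct strings may share a sort key, so the original
     * text cannot be recovered from it.
     */
    static byte[] sortKey(String s) {
        Collator c = Collator.getInstance(Locale.US);
        c.setStrength(Collator.PRIMARY);
        return c.getCollationKey(s).toByteArray();
    }

    /** Unsigned lexicographic comparison of two byte sequences. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }
}
```

With these helpers, decodeUtf8(encodeUtf8(name)) returns the original name, while sortKey("abc") and sortKey("ABC") are byte-for-byte identical at PRIMARY strength. Comparing the UTF-8 bytes orders "Zebra" before "apple", whereas the collator orders "apple" before "Zebra", which is why column names do not follow collation order.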

The encoded schema name is followed by the KeyType.getByteCode() and then by a nul byte. By searching for the nul byte we can identify the end of the encoded schema name and also the data type of the primary key. Most kinds of primary keys have a fixed length encoding, e.g., Long, Double, etc. However, Unicode primary keys have a variable length encoding, which makes decoding more complex. Since the keys need to reflect the total sort order, we cannot include the byte count of the primary key in the key itself. The only reasonable approach is to append a byte sequence to the key that never occurs within the generated Unicode sort keys. We use a nul byte for this purpose since it is not emitted by most Unicode collation implementations, as it would cause grief for C-language strings.
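The scan described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the KeyDecoder implementation: the class KeyLayoutSketch, the type codes TYPE_LONG and TYPE_UNICODE, and the 8-byte width used for a Long primary key are all hypothetical values chosen for the example. The real byte codes come from KeyType.getByteCode().

```java
import java.util.Arrays;

/** Hypothetical illustration only; not part of com.bigdata.sparse. */
final class KeyLayoutSketch {

    // Assumed, non-zero type codes used only in this sketch.
    static final byte TYPE_LONG = 1;
    static final byte TYPE_UNICODE = 2;

    /** Index of the first nul byte at or after 'from', or -1 if none. */
    static int indexOfNul(byte[] key, int from) {
        for (int i = from; i < key.length; i++) {
            if (key[i] == 0) return i;
        }
        return -1;
    }

    /** Index of the nul that follows the type code byte. */
    private static int typeTerminator(byte[] key) {
        int nul = indexOfNul(key, 0);
        if (nul < 1) throw new IllegalArgumentException("no type code found");
        return nul;
    }

    /** The encoded schema name: everything before the type code byte. */
    static byte[] schemaBytes(byte[] key) {
        return Arrays.copyOfRange(key, 0, typeTerminator(key) - 1);
    }

    /** The type code: the byte immediately before the first nul. */
    static byte typeCode(byte[] key) {
        return key[typeTerminator(key) - 1];
    }

    /** Length of {schema, type code, nul, primary key [, nul]}. */
    static int prefixLength(byte[] key) {
        int pkStart = typeTerminator(key) + 1;
        byte type = typeCode(key);
        if (type == TYPE_LONG) {
            // Fixed-length key: it may itself contain nul bytes.
            if (pkStart + 8 > key.length) throw new IllegalArgumentException("truncated key");
            return pkStart + 8;
        }
        if (type == TYPE_UNICODE) {
            // Variable-length key: terminated by the next nul byte.
            int end = indexOfNul(key, pkStart);
            if (end < 0) throw new IllegalArgumentException("unterminated primary key");
            return end + 1;
        }
        throw new IllegalArgumentException("unknown type code: " + type);
    }
}
```

Note that only the first nul is located by searching. For a fixed-length primary key the remaining offsets follow from the type, so nul bytes inside the primary key are harmless; for a Unicode primary key a second search finds the terminator.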

Version:
$Id: KeyDecoder.java 2265 2009-10-26 12:51:06Z thompsonbry $
Author:
Bryan Thompson
See Also:
Schema.fromKey(IKeyBuilder, Object), KeyType.getKeyType(byte), AtomicRowWriteRead, AtomicRowRead

Field Summary
 long timestamp
          The decoded timestamp on the column value.
 
Constructor Summary
KeyDecoder(byte[] key)
           
 
Method Summary
 String getColumnName()
          The decoded column name.
 byte[] getPrefix()
          Returns the head of the key corresponding to the encoded schema name, the primary key's KeyType, and the primary key (including any terminating nul byte).
 Object getPrimaryKey()
          The decoded primary key.
 KeyType getPrimaryKeyType()
          The decoded KeyType for the primary key.
 byte[] getSchemaBytes()
          The bytes from the key that represent the encoded name of the Schema.
 long getTimestamp()
          The decoded timestamp on the column value.
 String toString()
          Shows some of the data that is extracted.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

timestamp

public final long timestamp
The decoded timestamp on the column value.

Constructor Detail

KeyDecoder

public KeyDecoder(byte[] key)
Method Detail

getSchemaBytes

public byte[] getSchemaBytes()
The bytes from the key that represent the encoded name of the Schema.


getPrimaryKeyType

public final KeyType getPrimaryKeyType()
The decoded KeyType for the primary key.


getPrimaryKey

public Object getPrimaryKey()
The decoded primary key.

Throws:
UnsupportedOperationException - if the primary key can not be decoded (e.g., for KeyType.Unicode keys).
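
A caller that may encounter Unicode primary keys can fall back to the raw prefix bytes when decoding is not supported. The sketch below shows one possible pattern. The DecodedKey interface is a hypothetical stand-in that mirrors only getPrimaryKey() and getPrefix(), so that the example compiles without the library; the describe method is likewise an assumption made for illustration.

```java
/** Hypothetical illustration only; not part of com.bigdata.sparse. */
final class PrimaryKeyFallbackSketch {

    /** Stand-in exposing two of the accessors documented on this page. */
    interface DecodedKey {
        Object getPrimaryKey();
        byte[] getPrefix();
    }

    /** Describes the primary key, or the prefix length if it cannot be decoded. */
    static String describe(DecodedKey k) {
        try {
            return "primaryKey=" + k.getPrimaryKey();
        } catch (UnsupportedOperationException ex) {
            // The bytes are still identifiable even though they cannot be decoded.
            return "primaryKey=<not decodable>, prefixLength=" + k.getPrefix().length;
        }
    }
}
```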

getColumnName

public final String getColumnName()
The decoded column name.


getTimestamp

public long getTimestamp()
The decoded timestamp on the column value. The semantics of the timestamp depend entirely on the application. When the application provides timestamps, they are application defined long integers. When the application requests auto-timestamps, they are generated by the data service.


getPrefix

public byte[] getPrefix()
Returns the head of the key corresponding to the encoded schema name, the primary key's KeyType, and the primary key (including any terminating nul byte).

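Returns:
The head of the key: the encoded schema name, the primary key's KeyType, and the primary key (including any terminating nul byte).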

toString

public String toString()
Shows some of the data that is extracted.

Overrides:
toString in class Object


Copyright © 2006-2009 SYSTAP, LLC. All Rights Reserved.