com.bigdata.btree.raba.codec
Class CanonicalHuffmanRabaCoder.AbstractCodingSetup

java.lang.Object
  extended by com.bigdata.btree.raba.codec.CanonicalHuffmanRabaCoder.AbstractCodingSetup
Direct Known Subclasses:
CanonicalHuffmanRabaCoder.RabaCodingSetup
Enclosing class:
CanonicalHuffmanRabaCoder

protected abstract static class CanonicalHuffmanRabaCoder.AbstractCodingSetup
extends Object

Abstract base class for preparing a logical byte[][] for coding.

Version:
$Id: CanonicalHuffmanRabaCoder.java 2547 2010-03-24 20:44:07Z thompsonbry $
Author:
Bryan Thompson

Constructor Summary
protected CanonicalHuffmanRabaCoder.AbstractCodingSetup()
           
 
Method Summary
protected  it.unimi.dsi.fastutil.bytes.Byte2IntOpenHashMap buildSymbolTable(int[] frequency, int[] packedFrequency, byte[] symbol2byte)
          Build the symbol table, populating the packedFrequency and symbol2byte arrays as a side effect.
 int byte2symbol(byte b)
          Mapping from byte values to symbol indices.
abstract  it.unimi.dsi.compression.HuffmanCodec codec()
          The codec used to encode and decode the logical byte[][].
abstract  it.unimi.dsi.compression.HuffmanCodec.DecoderInputs decoderInputs()
          The data required to reconstruct the decoder.
protected  int[] getFrequencyCount(IRaba raba)
          Create a frequency table reporting the #of occurrences of every possible byte value.
protected  int[] getPackedFrequencyCount(IRaba raba)
          Return a dense array of the non-zero frequency counts in byte value order.
abstract  int getSymbolCount()
          Return the #of distinct symbols used to generate the code.
protected  int getSymbolCount(int[] frequency)
          Compute the number of distinct bytes.
protected static String printCodeBook(it.unimi.dsi.bits.BitVector[] codeWords, com.bigdata.btree.raba.codec.CanonicalHuffmanRabaCoder.Symbol2Byte symbol2byte)
          Format the code book as a multi-line string.
 byte symbol2byte(int symbol)
          Mapping from symbol indices to byte values.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

CanonicalHuffmanRabaCoder.AbstractCodingSetup

protected CanonicalHuffmanRabaCoder.AbstractCodingSetup()
Method Detail

getSymbolCount

public abstract int getSymbolCount()
Return the #of distinct symbols used to generate the code.


codec

public abstract it.unimi.dsi.compression.HuffmanCodec codec()
The codec used to encode and decode the logical byte[][].


decoderInputs

public abstract it.unimi.dsi.compression.HuffmanCodec.DecoderInputs decoderInputs()
The data required to reconstruct the decoder.


printCodeBook

protected static String printCodeBook(it.unimi.dsi.bits.BitVector[] codeWords,
                                      com.bigdata.btree.raba.codec.CanonicalHuffmanRabaCoder.Symbol2Byte symbol2byte)
Format the code book as a multi-line string.

Parameters:
codeWords - The code words.
symbol2byte - The mapping from symbol indices to byte values.
Returns:
A representation of the code book.
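
The following is a minimal sketch of formatting a code book as a multi-line string. It is not the library's implementation: the class name CodeBookSketch, the use of String bit patterns in place of it.unimi.dsi.bits.BitVector, the use of a plain byte[] in place of Symbol2Byte, and the tab-separated layout are all assumptions made for illustration.

```java
// Hypothetical helper, not part of CanonicalHuffmanRabaCoder.
final class CodeBookSketch {

    /**
     * Formats one line per symbol: symbol index, unsigned byte value, code word.
     * The code words are given as strings of '0' and '1' (an assumption; the
     * real method receives BitVector instances).
     */
    static String printCodeBook(String[] codeWords, byte[] symbol2byte) {
        final StringBuilder sb = new StringBuilder();
        for (int symbol = 0; symbol < codeWords.length; symbol++) {
            sb.append(symbol)
              .append('\t')
              .append(symbol2byte[symbol] & 0xff) // show the byte as 0..255
              .append('\t')
              .append(codeWords[symbol])
              .append('\n');
        }
        return sb.toString();
    }
}
```

A caller would typically pass the result to a logger at debug level when inspecting the generated code.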

getPackedFrequencyCount

protected int[] getPackedFrequencyCount(IRaba raba)
Return a dense array of the non-zero frequency counts in byte value order. The length of the array is the #of distinct symbols appearing in the input.

Parameters:
raba - The logical byte[][].
Returns:
The packed frequency counts.
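
As a rough illustration of the packing step, the sketch below drops the zero entries from a 256-element frequency table while preserving index order. The class and method names are hypothetical, and it assumes that the table is indexed by byte value so that index order is byte value order; the actual method starts from an IRaba rather than a precomputed table.

```java
// Hypothetical helper, not part of CanonicalHuffmanRabaCoder.
final class PackedFrequencySketch {

    /** Returns the non-zero entries of the frequency table, in index order. */
    static int[] pack(int[] frequency) {
        // First pass: count the distinct symbols (non-zero entries).
        int nsymbols = 0;
        for (int f : frequency) {
            if (f != 0) nsymbols++;
        }
        // Second pass: copy the non-zero counts into a dense array.
        final int[] packed = new int[nsymbols];
        int i = 0;
        for (int f : frequency) {
            if (f != 0) packed[i++] = f;
        }
        return packed;
    }
}
```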

getFrequencyCount

protected int[] getFrequencyCount(IRaba raba)
Create a frequency table reporting the #of occurrences of every possible byte value.

Parameters:
raba - The data.
Returns:
A 256-element array giving the frequency of each byte value. Values not observed will have a zero frequency count.
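
A frequency table of this shape could be computed as in the sketch below. It works on a plain byte[][] rather than an IRaba, skips null elements, and indexes the table by the unsigned byte value (b & 0xff). Those three choices and the class name are assumptions for illustration; the library may index the table differently.

```java
// Hypothetical helper, not part of CanonicalHuffmanRabaCoder.
final class FrequencyCountSketch {

    /** Counts how often each byte value occurs across all non-null elements. */
    static int[] frequencyCount(byte[][] data) {
        final int[] frequency = new int[256];
        for (byte[] element : data) {
            if (element == null) {
                continue; // assumption: a null element contributes no bytes
            }
            for (byte b : element) {
                frequency[b & 0xff]++; // map -128..127 onto 0..255
            }
        }
        return frequency;
    }
}
```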

getSymbolCount

protected int getSymbolCount(int[] frequency)
Compute the number of distinct bytes.

Parameters:
frequency - An array of 256 elements giving the frequency of occurrence for each possible byte value.
Returns:
The #of non-zero elements in that array.

buildSymbolTable

protected it.unimi.dsi.fastutil.bytes.Byte2IntOpenHashMap buildSymbolTable(int[] frequency,
                                                                           int[] packedFrequency,
                                                                           byte[] symbol2byte)
Build the symbol table, populating the packedFrequency and symbol2byte arrays as a side effect.

Parameters:
frequency - An array of 256 frequency counts. Each element of the array gives the frequency of occurrence of the corresponding byte value.
packedFrequency - The non-zero symbol frequency counts. This array is correlated with the packed symbol table. The array must be pre-allocated by the caller with one element per distinct symbol (see getSymbolCount(int[])).
symbol2byte - The forward lookup symbol table. The array must be pre-allocated by the caller with one element per distinct symbol (see getSymbolCount(int[])).
Returns:
The reverse symbol table.
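
The contract described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: it returns a java.util.Map instead of a Byte2IntOpenHashMap, assigns symbol indices in increasing unsigned byte value order, and uses a hypothetical class name.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of CanonicalHuffmanRabaCoder.
final class SymbolTableSketch {

    /**
     * Fills the caller-allocated packedFrequency and symbol2byte arrays and
     * returns the reverse table (byte value to symbol index).
     */
    static Map<Byte, Integer> buildSymbolTable(int[] frequency,
                                               int[] packedFrequency,
                                               byte[] symbol2byte) {
        final Map<Byte, Integer> byte2symbol = new HashMap<>();
        int symbol = 0;
        for (int i = 0; i < frequency.length; i++) {
            if (frequency[i] == 0) {
                continue; // byte value never observed: no symbol assigned
            }
            final byte b = (byte) i;
            packedFrequency[symbol] = frequency[i]; // side effect
            symbol2byte[symbol] = b;                // side effect (forward table)
            byte2symbol.put(b, symbol);             // reverse table
            symbol++;
        }
        return byte2symbol;
    }
}
```

Both output arrays are indexed by symbol, which is why they must be sized to the number of distinct symbols before the call.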

byte2symbol

public int byte2symbol(byte b)
Mapping from byte values to symbol indices.

Parameters:
b - The byte value.
Returns:
The symbol used to code that byte value, or -1 if the byte value was not assigned to any symbol.

symbol2byte

public byte symbol2byte(int symbol)
Mapping from symbol indices to byte values.

Parameters:
symbol - The symbol index.
Returns:
The byte value.
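
The two mappings are inverses of each other for every assigned symbol. The sketch below shows one way to hold them, using a 256-entry int[] filled with -1 for the reverse direction. The class name and the array-based reverse table are assumptions for illustration; the source only states that unassigned byte values map to -1.

```java
import java.util.Arrays;

// Hypothetical helper, not part of CanonicalHuffmanRabaCoder.
final class SymbolMappingSketch {

    private final byte[] symbol2byte;
    private final int[] byte2symbol = new int[256];

    SymbolMappingSketch(byte[] symbol2byte) {
        this.symbol2byte = symbol2byte.clone();
        Arrays.fill(byte2symbol, -1); // -1 marks "no symbol assigned"
        for (int symbol = 0; symbol < symbol2byte.length; symbol++) {
            byte2symbol[symbol2byte[symbol] & 0xff] = symbol;
        }
    }

    /** Returns the symbol index for the byte value, or -1 if unassigned. */
    int byte2symbol(byte b) {
        return byte2symbol[b & 0xff];
    }

    /** Returns the byte value coded by the given symbol index. */
    byte symbol2byte(int symbol) {
        return symbol2byte[symbol];
    }
}
```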


Copyright © 2006-2011 SYSTAP, LLC. All Rights Reserved.