com.bigdata.rawstore
Class WormAddressManager

java.lang.Object
  extended by com.bigdata.rawstore.WormAddressManager
All Implemented Interfaces:
IAddressManager
Direct Known Subclasses:
IndexSegmentAddressManager

public class WormAddressManager
extends Object
implements IAddressManager

Encapsulates logic for operations on an opaque long integer comprising a byte offset and a byte count suitable for use in a WORM (Write Once, Read Many) IRawStore. Both the byte offset and the byte count of the record are stored directly in the opaque identifier. Note that the maximum byte offset only indirectly governs the maximum #of records that can be written on a store: the maximum #of records also depends on the actual lengths of the individual records that are written, since the byte offset of the next record increases by the size of the last record written.

The constructor defines where the split is between the high and the low bits and therefore the maximum byte offset and the maximum byte count. This allows an IRawStore implementation to parameterize its handling of addresses to trade off the #of distinct offsets at which it can store records against the size of those records.

The offset is stored in the high bits of the long integer while the byte count is stored in the low bits. This means that two addresses encoded by a WormAddressManager with the same split point can be placed into a total ordering by their offset without being decoded.
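
The layout described above can be sketched as follows. This is a minimal illustration rather than the class's actual source: the class name AddressLayoutSketch and the helper pack are hypothetical, and the 42-bit split is chosen only for the example. Because an offset that reaches the top bit makes the encoded long negative, the sketch compares addresses with Long.compareUnsigned; how bigdata itself compares encoded addresses is not stated on this page.

```java
/** Hypothetical sketch; not the WormAddressManager source. */
final class AddressLayoutSketch {

    /**
     * Packs the offset into the high bits and the byte count into the low bits.
     * Assumes both arguments are non-negative and already range checked.
     */
    static long pack(int offsetBits, int nbytes, long offset) {
        final int byteCountBits = 64 - offsetBits;
        return (offset << byteCountBits) | nbytes;
    }

    public static void main(String[] args) {
        final int offsetBits = 42; // example split, an assumption of this sketch
        final long a = pack(offsetBits, 4000, 1024L); // earlier record, larger size
        final long b = pack(offsetBits, 12, 5024L);   // later record, smaller size
        // The offset occupies the high bits, so it dominates the comparison.
        System.out.println(Long.compareUnsigned(a, b) < 0); // prints true
    }
}
```

The record sizes do not affect the outcome unless the offsets are equal, which is what allows the encoded values to be ordered without decoding.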

Version:
$Id: WormAddressManager.java 5832 2012-01-06 17:55:31Z martyncutcher $
Author:
Bryan Thompson

Field Summary
protected static String _NULL_
          Used to represent a null reference by toString(long).
static int MAX_OFFSET_BITS
          The maximum #of bits that may be used to encode an offset (this leaves 4 bits for the byte count, so the maximum record size is only 15 bytes).
static int MIN_OFFSET_BITS
          The minimum #of bits that may be used to encode an offset as an unsigned integer (31).
static int SCALE_OUT_OFFSET_BITS
          The #of offset bits that must be used in order to support 64M (67,108,864 bytes) blobs (38).
static int SCALE_UP_OFFSET_BITS
          The #of offset bits that allows byte offsets of up to 4,398,046,511,103 (4 terabytes minus one) and a maximum record size of 4,194,303 (4 megabytes minus one).
 
Fields inherited from interface com.bigdata.rawstore.IAddressManager
NULL
 
Constructor Summary
WormAddressManager(int offsetBits)
          Construct an IAddressManager that will allocate a specified #of bits to the offset and use the remaining bits for the byte count component.
 
Method Summary
 boolean assertByteCount(int nbytes)
          Range check the byte count.
 boolean assertOffset(long offset)
          Range check the byte offset.
static boolean assertOffsetBits(int offsetBits)
          Range checks the #of offset bits.
 int getByteCount(long addr)
          The length of the datum in bytes.
 int getMaxByteCount()
          The maximum byte count that may be represented.
static int getMaxByteCount(int offsetBits)
          Compute the maximum byte count (aka record size) allowed for a given #of bits dedicated to the byte offset.
 long getMaxOffset()
          The maximum byte offset that may be represented.
 long getOffset(long addr)
          The byte offset of the datum. Note: overridden by IndexSegmentAddressManager.
 int getOffsetBits()
          Return the #of bits that are allocated to the offset.
 long getPhysicalAddress(long addr)
          Determine the unencoded physical address.
static void main(String[] args)
          Displays a table of offset bits and the corresponding maximum byte offset and maximum byte count (aka record size) that a store may address for a given #of offset bits.
 long toAddr(int nbytes, long offset)
          Converts a byte count and offset into a long integer.
 String toString()
          A human readable representation of the state of the WormAddressManager.
 String toString(long addr)
          A human readable representation of the address.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

_NULL_

protected static final String _NULL_
Used to represent a null reference by toString(long).

See Also:
Constant Field Values

MIN_OFFSET_BITS

public static final int MIN_OFFSET_BITS
The minimum #of bits that may be used to encode an offset as an unsigned integer (31). This value MUST be used when the IRawStore implementation is backed by an in-memory array since an array index may not have more than 31 unsigned bits (the equivalent of 32 signed bits).

See Also:
Constant Field Values

MAX_OFFSET_BITS

public static final int MAX_OFFSET_BITS
The maximum #of bits that may be used to encode an offset (this leaves 4 bits for the byte count, so the maximum record size is only 15 bytes). This is not a useful record size for BTree data, but it might be useful for some custom data structures.

See Also:
Constant Field Values

SCALE_UP_OFFSET_BITS

public static final int SCALE_UP_OFFSET_BITS
The #of offset bits that allows byte offsets of up to 4,398,046,511,103 (4 terabytes minus one) and a maximum record size of 4,194,303 (4 megabytes minus one).

This is a good value when deploying a scale-up solution. For the scale-up deployment scenario you will have monolithic indices on a journal, providing effectively a WORM or "immortal" database. It is highly unlikely to have records as large as 4M with the modest branching factors used on a BTree backed by a Journal. At the same time, 4T is as much storage as we could reasonably expect to be able to address on a single file system.

See Also:
Constant Field Values

SCALE_OUT_OFFSET_BITS

public static final int SCALE_OUT_OFFSET_BITS
The #of offset bits that must be used in order to support 64M (67,108,864 bytes) blobs (38).

This is a good value when deploying a scale-out solution. For the scale-out deployment scenario you will have key-range partitioned indices automatically distributed among data services available on a cluster. The journal files are never permitted to grow very large for scale-out deployments. Instead, the journal periodically overflows, generating index segments which capture historical views. The larger record size (64M) also supports the distributed repository and map/reduce processing models.

See Also:
Constant Field Values
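
The limits quoted for SCALE_UP_OFFSET_BITS and SCALE_OUT_OFFSET_BITS follow directly from the bit split, and the sketch below recomputes them. The class and method names are hypothetical. The value 38 is stated above for scale-out; the value 42 for scale-up is inferred from the quoted limits (2^42 - 1 and 2^22 - 1) and is not read from the source.

```java
/** Hypothetical sketch that recomputes the limits quoted for the constants above. */
final class OffsetBitsLimitsSketch {

    /** Largest offset representable in offsetBits unsigned bits. */
    static long maxOffset(int offsetBits) {
        return (1L << offsetBits) - 1;
    }

    /** Number of distinct values representable in the remaining (64 - offsetBits) bits. */
    static long byteCountValues(int offsetBits) {
        return 1L << (64 - offsetBits);
    }

    public static void main(String[] args) {
        // 42 offset bits (inferred): the scale-up limits.
        System.out.println(maxOffset(42));           // 4398046511103
        System.out.println(byteCountValues(42) - 1); // 4194303
        // 38 offset bits (stated): 26 bits remain for the byte count.
        System.out.println(byteCountValues(38));     // 67108864
    }
}
```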
Constructor Detail

WormAddressManager

public WormAddressManager(int offsetBits)
Construct an IAddressManager that will allocate a specified #of bits to the offset and use the remaining bits for the byte count component.

Parameters:
offsetBits - An integer defining how many bits will be used for the offset component, which determines the maximum byte offset and thereby indirectly limits the #of records that may be stored. The remaining bits are used for the byte count, so this also determines the maximum #of bytes that may be stored in a record.
Method Detail

getOffsetBits

public final int getOffsetBits()
Return the #of bits that are allocated to the offset.


getMaxOffset

public final long getMaxOffset()
The maximum byte offset that may be represented.


getMaxByteCount

public final int getMaxByteCount()
The maximum byte count that may be represented.


assertOffsetBits

public static final boolean assertOffsetBits(int offsetBits)
Range checks the #of offset bits.

Parameters:
offsetBits - The #of offset bits.
Returns:
true if offsetBits is in range.
Throws:
IllegalArgumentException - if the parameter is out of range.

getMaxByteCount

public static final int getMaxByteCount(int offsetBits)
Compute the maximum byte count (aka record size) allowed for a given #of bits dedicated to the byte offset.

Parameters:
offsetBits - The #of bits to be used to represent the byte offset.
Returns:
The maximum byte count that can be represented.

assertByteCount

public final boolean assertByteCount(int nbytes)
Range check the byte count.

Parameters:
nbytes - The byte count.
Returns:
true if the byte count is in range.
Throws:
IllegalArgumentException - if the byte count is out of range.

assertOffset

public final boolean assertOffset(long offset)
Range check the byte offset.

Parameters:
offset - The byte offset.
Returns:
true if the offset is in range.
Throws:
IllegalArgumentException - if the offset is out of range.
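
A range check that returns true on success and throws on failure can be placed inside an assert statement, so that it is skipped when assertions are disabled. The sketch below illustrates that pattern under stated assumptions: the class RangeCheckSketch is hypothetical, the limit is passed as a parameter rather than held as instance state, and whether callers in bigdata actually use the check this way is not documented on this page.

```java
/** Hypothetical sketch of a range check usable inside an assert statement. */
final class RangeCheckSketch {

    /** Returns true when offset is in [0, maxOffset]; otherwise throws. */
    static boolean assertOffset(long offset, long maxOffset) {
        if (offset < 0 || offset > maxOffset) {
            throw new IllegalArgumentException(
                    "offset=" + offset + " is not in [0:" + maxOffset + "]");
        }
        return true;
    }

    public static void main(String[] args) {
        final long maxOffset = (1L << 42) - 1; // example limit, an assumption
        // Only evaluated when the JVM runs with -ea.
        assert assertOffset(1024L, maxOffset);
        try {
            assertOffset(maxOffset + 1, maxOffset);
        } catch (IllegalArgumentException expected) {
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```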

toAddr

public final long toAddr(int nbytes,
                         long offset)
Description copied from interface: IAddressManager
Converts a byte count and offset into a long integer.

Specified by:
toAddr in interface IAddressManager
Parameters:
nbytes - The byte count.
offset - The byte offset.
Returns:
The long integer.

getByteCount

public final int getByteCount(long addr)
Description copied from interface: IAddressManager
The length of the datum in bytes. This must be the actual length of the record on the disk, not the length of the caller's byte[]. This is necessary in order to support transparent checksums and/or compression for records in the IRawStore.

Specified by:
getByteCount in interface IAddressManager
Parameters:
addr - The opaque identifier that is the within-store locator for some datum.
Returns:
The byte count of that datum.

getOffset

public long getOffset(long addr)
The byte offset of the datum. Note: overridden by IndexSegmentAddressManager.

Specified by:
getOffset in interface IAddressManager
Parameters:
addr - The opaque identifier that is the within-store locator for some datum.
Returns:
The offset of that datum.
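
A round trip through toAddr, getOffset, and getByteCount might look like the sketch below. It mirrors the documented layout (offset in the high bits, byte count in the low bits) but is not the actual source: AddressDecodeSketch is a hypothetical stand-in, it performs no range checks, and it ignores any special handling of the NULL address.

```java
/** Hypothetical sketch; mirrors the documented layout, not the actual source. */
final class AddressDecodeSketch {
    private final int byteCountBits;
    private final long byteCountMask;

    AddressDecodeSketch(int offsetBits) {
        this.byteCountBits = 64 - offsetBits;
        this.byteCountMask = (1L << byteCountBits) - 1; // ones in the low bits
    }

    /** Assumes non-negative, range-checked arguments. */
    long toAddr(int nbytes, long offset) {
        return (offset << byteCountBits) | nbytes;
    }

    int getByteCount(long addr) {
        return (int) (addr & byteCountMask);
    }

    long getOffset(long addr) {
        // Unsigned shift: the top bit belongs to the offset, it is not a sign.
        return addr >>> byteCountBits;
    }

    public static void main(String[] args) {
        AddressDecodeSketch am = new AddressDecodeSketch(42); // example split
        long addr = am.toAddr(512, 8192L);
        System.out.println(am.getOffset(addr) + " " + am.getByteCount(addr)); // 8192 512
    }
}
```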

getPhysicalAddress

public long getPhysicalAddress(long addr)
Description copied from interface: IAddressManager
Determine the unencoded physical address.

Specified by:
getPhysicalAddress in interface IAddressManager
Parameters:
addr - The encoded address.
Returns:
The unencoded address offset.

toString

public String toString(long addr)
Description copied from interface: IAddressManager
A human readable representation of the address.

Specified by:
toString in interface IAddressManager
Parameters:
addr - The opaque identifier that is the within-store locator for some datum.
Returns:
A human readable representation.

toString

public String toString()
A human readable representation of the state of the WormAddressManager.

Overrides:
toString in class Object

main

public static void main(String[] args)
Displays a table of offset bits and the corresponding maximum byte offset and maximum byte count (aka record size) that a store may address for a given #of offset bits. This table may be used to choose how to parameterize the WormAddressManager, and hence an IRawStore using that WormAddressManager, so as to best leverage the 64-bit long integer as a persistent locator into the store.

Parameters:
args - unused.
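
A table of this kind can be approximated with the sketch below. The output format of the real main(String[]) is not shown on this page, so the column layout here is invented for illustration. The class name is hypothetical, the upper bound of 60 offset bits is inferred from the "leaves 4 bits" remark on MAX_OFFSET_BITS, and clamping the byte count to Integer.MAX_VALUE is an assumption made because byte counts are typed as int.

```java
import java.util.Locale;

/** Hypothetical sketch of a limits table; the real output format may differ. */
final class OffsetBitsTableSketch {

    /** One row: offset bits, maximum byte offset, maximum byte count. */
    static String row(int offsetBits) {
        final long maxOffset = (1L << offsetBits) - 1;
        final long maxCount = (1L << (64 - offsetBits)) - 1;
        // Assumption: clamp to int, since the byte count is an int in the API.
        final int maxByteCount = (int) Math.min(maxCount, Integer.MAX_VALUE);
        return String.format(Locale.ROOT, "%4d %20d %12d",
                offsetBits, maxOffset, maxByteCount);
    }

    public static void main(String[] args) {
        System.out.println(String.format(Locale.ROOT, "%4s %20s %12s",
                "bits", "maxOffset", "maxByteCount"));
        // 31 is the documented minimum; 60 is inferred as the maximum.
        for (int offsetBits = 31; offsetBits <= 60; offsetBits++) {
            System.out.println(row(offsetBits));
        }
    }
}
```

Reading down the rows shows the trade-off: each additional offset bit doubles the addressable extent of the store and halves the largest record that can be written.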


Copyright © 2006-2011 SYSTAP, LLC. All Rights Reserved.