java.lang.Object
  com.bigdata.btree.keys.KeyBuilder
public class KeyBuilder
A class that may be used to form multi-component keys but which does not support Unicode. An instance of this class is quite light-weight and SHOULD be used when Unicode support is not required.
Note: Avoid any dependencies within this class on the ICU libraries so that the code may run without those libraries when they are not required.
See Also: Compute the successor of a value before encoding it as a component of a key, Compute the successor of an encoded key.
| Nested Class Summary | |
|---|---|
| static interface | KeyBuilder.Options: Configuration options for DefaultKeyBuilderFactory and the KeyBuilder factory methods. |
| Field Summary | |
|---|---|
| protected byte[] | buf: The key buffer. |
| static int | DEFAULT_INITIAL_CAPACITY: The default capacity of the key buffer. |
| protected static boolean | INFO |
| protected int | len: A non-negative integer specifying the #of bytes of data in the buffer that contain valid data starting from position zero (0). |
| protected static org.apache.log4j.Logger | log |
| byte | pad: The default pad character (a space). |
| protected UnicodeSortKeyGenerator | sortKeyGenerator: The object used to generate sort keys from Unicode strings (optional). |
| Fields inherited from interface com.bigdata.btree.keys.IKeyBuilder |
|---|
maxlen |
| Constructor Summary | |
|---|---|
| | KeyBuilder(): Creates a key builder with an initial buffer capacity of 1024 bytes. |
| | KeyBuilder(int initialCapacity): Creates a key builder with the specified initial buffer capacity. |
| protected | KeyBuilder(UnicodeSortKeyGenerator sortKeyGenerator, int len, byte[] buf): Creates a key builder using an existing buffer with some data (designated constructor). |
| Method Summary | |
|---|---|
| IKeyBuilder | append(byte v): Converts the signed byte to an unsigned byte and appends it to the key. |
| IKeyBuilder | append(byte[] a): Appends an array of bytes - the bytes are treated as unsigned values. |
| IKeyBuilder | append(double d): Appends a double precision floating point value by first converting it into a signed long integer using Double.doubleToLongBits(double), converting that value into a two's-complement number and then appending the bytes in big-endian order into the key buffer. |
| IKeyBuilder | append(float f): Appends a single precision floating point value by first converting it into a signed integer using Float.floatToIntBits(float), converting that value into a two's-complement number and then appending the bytes in big-endian order into the key buffer. |
| IKeyBuilder | append(int v): Appends a signed integer to the key by first converting it to a lexicographic ordering as an unsigned integer and then appending it into the buffer as 4 bytes using a big-endian order. |
| IKeyBuilder | append(int off, int len, byte[] a): Append len bytes starting at off in a to the key buffer. |
| IKeyBuilder | append(long v): Appends a signed long integer to the key by first converting it to a lexicographic ordering as an unsigned long integer and then appending it into the buffer as 8 bytes using a big-endian order. |
| IKeyBuilder | append(Object val): Append the value to the buffer, encoding it as appropriate based on the class of the object. |
| IKeyBuilder | append(short v): Appends a signed short integer to the key by first converting it to a two's-complement representation supporting unsigned byte[] comparison and then appending it into the buffer as 2 bytes using a big-endian order. |
| IKeyBuilder | append(String s): Encodes a Unicode string using the configured KeyBuilder.Options.COLLATOR and appends the resulting sort key to the buffer (without a trailing nul byte). |
| IKeyBuilder | append(UUID uuid): Appends the UUID to the key using the MSB and then the LSB. |
| IKeyBuilder | appendASCII(String s): Encodes a Unicode string by assuming that its contents are ASCII characters. |
| IKeyBuilder | appendNul(): Append an unsigned zero byte to the key. |
| IKeyBuilder | appendText(String text, boolean unicode, boolean successor): Encodes a variable length text field into the buffer. |
| IKeyBuilder | appendUnsigned(byte v) |
| static byte[] | asSortKey(Object val): Utility method converts an application key to a sort key (an unsigned byte[] that imposes the same sort order). |
| protected static byte[] | createBuffer(int initialCapacity): Create a buffer of the specified initial capacity. |
| static long | d2l(double d): Encodes a double precision floating point value as an int64 value that has the same total ordering (you can compare two doubles encoded by this method and the long values will have the same ordering as the double values). |
| static String | decodeASCII(byte[] key, int off, int len): Decodes an ASCII string from a key. |
| static byte | decodeByte(int v): Converts an unsigned byte into a signed byte. |
| static double | decodeDouble(byte[] key, int off) |
| static float | decodeFloat(byte[] key, int off) |
| static int | decodeInt(byte[] buf, int off): Decodes a signed int value as encoded by append(int). |
| static long | decodeLong(byte[] buf, int off): Decodes a signed long value as encoded by append(long). |
| static short | decodeShort(byte[] buf, int off): Decodes a signed short value as encoded by append(short). |
| static byte | encodeByte(int v): Converts a signed byte into an unsigned byte. |
| void | ensureCapacity(int capacity): Ensure that the buffer capacity is at least capacity total bytes. |
| void | ensureFree(int len): Ensure that at least len bytes are free in the buffer. |
| static int | f2i(float f): Encodes a floating point value as an int32 value that has the same total ordering (you can compare two floats encoded by this method and the int values will have the same ordering as the float values). |
| byte[] | getBuffer() |
| byte[] | getKey(): Return the encoded key. |
| int | getLength(): The #of bytes of data in the key. |
| byte[] | getSortKey(Object val): Return an unsigned byte[] sort key. |
| UnicodeSortKeyGenerator | getSortKeyGenerator(): The object responsible for generating sort keys from Unicode strings. |
| boolean | isUnicodeSupported(): Return true iff Unicode is supported by this object (returns false if only ASCII support is configured). |
| static IKeyBuilder | newInstance() |
| static IKeyBuilder | newInstance(int initialCapacity): Create an instance for ASCII keys with the specified initial capacity. |
| static IKeyBuilder | newInstance(int capacity, CollatorEnum collatorChoice, Locale locale, Object strength, DecompositionEnum mode): Create a new instance that optionally supports Unicode sort keys. |
| static IKeyBuilder | newUnicodeInstance(): Create a factory for IKeyBuilder instances configured using the system properties. |
| static IKeyBuilder | newUnicodeInstance(Properties properties): Create a factory for IKeyBuilder instances configured according to the specified properties. |
| void | position(int pos): Sets the position to any non-negative length less than the current capacity of the buffer. |
| IKeyBuilder | reset(): Reset the key length to zero before building another key. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
protected static final org.apache.log4j.Logger log
protected static final boolean INFO
public static final int DEFAULT_INITIAL_CAPACITY
protected int len
protected byte[] buf
protected final UnicodeSortKeyGenerator sortKeyGenerator
Note: When null the IKeyBuilder does NOT support Unicode
and the optional Unicode methods will all throw an
UnsupportedOperationException.
public final byte pad
Note: Any character may be chosen as the pad character as long as it has
a one byte representation. In practice this means you can choose 0x20 (a
space) or 0x00 (a nul). This limit arises in
appendText(String, boolean, boolean) which assumes that it can
write a pad character (or its successor) in one byte. 0xff will NOT work
since its successor is not defined within a bit string of length 8.
| Constructor Detail |
|---|
public KeyBuilder()
Creates a key builder with an initial buffer capacity of 1024 bytes.
public KeyBuilder(int initialCapacity)
Creates a key builder with the specified initial buffer capacity.
Parameters: initialCapacity - The initial capacity of the internal byte[] used to construct keys. When zero (0) the DEFAULT_INITIAL_CAPACITY will be used.
protected KeyBuilder(UnicodeSortKeyGenerator sortKeyGenerator,
int len,
byte[] buf)
Creates a key builder using an existing buffer with some data (designated constructor).
Parameters:
sortKeyGenerator - The object used to generate sort keys from Unicode strings (when null Unicode collation support is disabled).
len - The #of bytes of data in the provided buffer.
buf - The buffer, with len pre-existing bytes of valid data. The buffer reference is used directly rather than making a copy of the data.
| Method Detail |
|---|
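protected static byte[] createBuffer(int initialCapacity)
Create a buffer of the specified initial capacity.
Parameters: initialCapacity - The initial size of the buffer.
Throws: IllegalArgumentException - if the initial capacity is negative.
public final int getLength()
The #of bytes of data in the key.
Specified by: getLength in interface IKeyBuilder
public final byte[] getBuffer()
public final void position(int pos)
Sets the position to any non-negative length less than the current capacity of the buffer.
public final IKeyBuilder append(int off,
int len,
byte[] a)
Append len bytes starting at off in a to the key buffer.
Specified by: append in interface IKeyBuilder
Parameters:
off - The offset.
len - The #of bytes to append.
a - The array containing the bytes to append.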
public final void ensureFree(int len)
Ensure that at least len bytes are free in the buffer. The buffer may be grown by this operation but it will not be truncated.
This operation is equivalent to ensureCapacity(this.len + len), and the latter is often used as an optimization.
Parameters: len - The minimum #of free bytes.
public final void ensureCapacity(int capacity)
Ensure that the buffer capacity is at least capacity total bytes. The buffer may be grown by this operation but it will not be truncated.
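The relationship between ensureFree and ensureCapacity can be illustrated with a minimal sketch. GrowableKeyBuffer is a hypothetical stand-in class, not the KeyBuilder source, and the doubling growth policy is an assumption: the documentation only states that the buffer may grow and is never truncated.

```java
import java.util.Arrays;

/** Hypothetical stand-in for the key buffer; NOT the KeyBuilder implementation. */
final class GrowableKeyBuffer {
    byte[] buf; // backing array, analogous to the documented buf field
    int len;    // #of valid bytes starting at position zero, analogous to len

    GrowableKeyBuffer(int initialCapacity) {
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("negative capacity");
        }
        buf = new byte[initialCapacity];
    }

    /** Grow (never shrink) the buffer so that buf.length >= capacity. */
    void ensureCapacity(int capacity) {
        if (capacity < 0) {
            throw new IllegalArgumentException("negative capacity");
        }
        if (buf.length < capacity) {
            // Assumed policy: at least double, to amortize repeated appends.
            int newCapacity = Math.max(capacity, buf.length * 2);
            buf = Arrays.copyOf(buf, newCapacity); // existing bytes are preserved
        }
    }

    /** Equivalent to ensureCapacity(this.len + len). */
    void ensureFree(int len) {
        ensureCapacity(this.len + len);
    }

    /** Append one byte, growing the buffer when it is full. */
    GrowableKeyBuffer append(byte b) {
        ensureFree(1);
        buf[len++] = b;
        return this;
    }
}
```

A caller that knows the total size of a key up front can call ensureCapacity once rather than paying for a capacity check on every append, which is presumably the optimization the description refers to.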
Parameters: capacity - The minimum #of bytes in the buffer.
public final byte[] getKey()
Return the encoded key. Note that keys are donated to the btree so it is important to allocate new keys when running in the same process space. When using a network API, the API provides the necessary decoupling.
Specified by: getKey in interface IKeyBuilder
See Also: BytesUtil.compareBytes(byte[], byte[])
public final IKeyBuilder reset()
Reset the key length to zero before building another key.
Specified by: reset in interface IKeyBuilder
public final boolean isUnicodeSupported()
Return true iff Unicode is supported by this object (returns false if only ASCII support is configured).
Specified by: isUnicodeSupported in interface IKeyBuilder
public final UnicodeSortKeyGenerator getSortKeyGenerator()
The object responsible for generating sort keys from Unicode strings.
The UnicodeSortKeyGenerator -or- null if Unicode is not supported by this KeyBuilder instance.
public final IKeyBuilder append(String s)
Encodes a Unicode string using the configured KeyBuilder.Options.COLLATOR
and appends the resulting sort key to the buffer (without a trailing nul
byte).
Note: The SuccessorUtil.successor(String) of a string is formed
by appending a trailing nul character. However, since
IDENTICAL appears to be required to differentiate between
a string and its successor (with the trailing nul
character), you MUST form the sort key first and then its successor (by
appending a trailing nul). Failure to follow this pattern
will lead to the successor of the key comparing as EQUAL to the key. For
example,
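IKeyBuilder keyBuilder = ...;
String s = "foo";
// getKey() returns the encoded byte[]; append(...) itself returns the IKeyBuilder.
byte[] fromKey = keyBuilder.reset().append( s ).getKey();
// right: form the sort key first, then append the nul byte.
byte[] toKey = keyBuilder.reset().append( s ).appendNul().getKey();
// wrong! the successor is formed before the sort key, so it may compare as EQUAL to fromKey.
byte[] wrongToKey = keyBuilder.reset().append( s+"\0" ).getKey();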
Specified by: append in interface IKeyBuilder
Parameters: s - A string.
See Also: SuccessorUtil.successor(String), SuccessorUtil.successor(byte[])
FIXME update the javadoc further to speak to handling of multi-field keys.
public IKeyBuilder appendASCII(String s)
Encodes a Unicode string by assuming that its contents are ASCII characters.
Note: This method is potentially much faster than the Unicode aware
IKeyBuilder.append(String). However, this method is NOT Unicode aware and
non-ASCII characters will not be encoded correctly. This method MUST NOT
be mixed with keys whose corresponding component is encoded by the
Unicode aware methods, e.g., IKeyBuilder.append(String).
Specified by: appendASCII in interface IKeyBuilder
Parameters: s - A String containing US-ASCII characters.
public static String decodeASCII(byte[] key,
int off,
int len)
Decodes an ASCII string from a key.
Parameters:
key - The key.
off - The offset of the start of the string.
len - The #of bytes to decode (one byte per character).
See Also: appendASCII(String)
public IKeyBuilder appendText(String text,
boolean unicode,
boolean successor)
Encodes a variable length text field into the buffer. The text is truncated to IKeyBuilder.maxlen characters. The sort keys for
strings that differ after truncation solely in the #of trailing
pad characters will be identical (trailing pad characters are
implicit out to IKeyBuilder.maxlen characters).
Note: Trailing pad characters are normalized to a representation as a single pad character (1 byte) followed by the #of actual or implied trailing pad characters represented as an unsigned short integer (2 bytes). This technique serves to keep multi-field keys with embedded variable length text fields aligned such that the field following a variable length text field does not bleed into the lexicographic ordering of the variable length text field.
Note: While the ASCII encoding happens to use one byte for each character, that is NOT true of the Unicode encoding. The space requirements for the Unicode encoding depend on the text, the Locale, the collator strength, and the collator decomposition mode.
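The pad normalization described above can be sketched for the ASCII case only. PaddedTextField and its maxlen parameter are hypothetical names standing in for the KeyBuilder internals and IKeyBuilder.maxlen. The sketch shows the layout (text bytes, one pad byte, 2-byte pad count) and is not a statement of the exact byte values KeyBuilder emits; successor handling is omitted.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the pad-normalized layout; NOT the KeyBuilder source. */
final class PaddedTextField {
    static final byte PAD = 0x20; // the documented default pad character (a space)

    /**
     * Encodes text as: ASCII bytes (trailing pads stripped), one pad byte,
     * then the #of actual or implied trailing pads as a big-endian unsigned short.
     * maxlen is an assumed parameter and must fit in an unsigned short.
     */
    static byte[] encode(String text, int maxlen) {
        if (maxlen < 0 || maxlen > 0xffff) {
            throw new IllegalArgumentException("maxlen must fit in 2 bytes");
        }
        // Truncate to maxlen characters.
        String t = text.length() > maxlen ? text.substring(0, maxlen) : text;
        // Strip trailing pad characters: they are counted, not stored.
        int n = t.length();
        while (n > 0 && t.charAt(n - 1) == (char) PAD) {
            n--;
        }
        // Non-ASCII characters are replaced by '?' here; this sketch is ASCII-only.
        byte[] chars = t.substring(0, n).getBytes(StandardCharsets.US_ASCII);
        int npad = maxlen - n; // actual plus implied trailing pads
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(chars, 0, chars.length);
        out.write(PAD);
        out.write((npad >>> 8) & 0xff); // high byte of the unsigned short
        out.write(npad & 0xff);         // low byte of the unsigned short
        return out.toByteArray();
    }
}
```

Under these assumptions, "ab" and "ab" followed by three spaces produce identical bytes for the same maxlen, which matches the statement that strings differing only in trailing pad characters share a sort key.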
Note: The successor option is designed to encapsulate some
trickiness around forming the successor of a variable length text field
embedded in a multi-field key. In particular, simply appending a
nul byte will NOT work (it works fine when the text field
is the last field in the key or when it is the only component in the
key). This approach breaks encapsulation of the field boundaries such
that the resulting "successor" is actually ordered before the original
key. This happens because you introduce a 0x0 byte right on the boundary
of the next field, effectively causing the next field to have a smaller
value. Consider the following example (in hex) where "|" represents the
end of the "text" field:
ab cd | 12
if you compute the successor by appending a nul byte to the text field
you get
ab cd | 00 12
which is ordered before the original key!
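The hex example above can be checked directly with an unsigned byte-wise comparison. This is a minimal sketch: NulSuccessorPitfall and its compare helper are hypothetical names written for illustration, and the helper only mirrors the unsigned byte[] ordering that the documentation attributes to keys.

```java
/** Hypothetical demonstration of the nul-byte pitfall; not part of the library. */
final class NulSuccessorPitfall {
    /** Original key: a 2-byte text field (ab cd) followed by the field 12. */
    static final byte[] KEY = {(byte) 0xab, (byte) 0xcd, 0x12};

    /** Naive "successor": a nul byte appended to the text field. */
    static final byte[] NAIVE_SUCCESSOR = {(byte) 0xab, (byte) 0xcd, 0x00, 0x12};

    /** Lexicographic comparison treating each byte as an unsigned value. */
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x < y ? -1 : 1;
            }
        }
        // A shorter key that is a prefix of the other orders first.
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        // The third byte decides: 0x00 < 0x12, so the naive successor sorts first.
        System.out.println(compare(NAIVE_SUCCESSOR, KEY) < 0);
    }
}
```

The comparison is decided at the byte where the nul was inserted, before the following field is ever reached, which is why the successor has to be formed inside the text field encoding rather than by appending to it.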
Specified by: appendText in interface IKeyBuilder
Parameters:
text - The text.
unicode - When true the text is interpreted as Unicode according to the KeyBuilder.Options.COLLATOR option. Otherwise it is interpreted as ASCII.
successor - When true, the successor of the text will be encoded. Otherwise the text will be encoded.
See Also: IKeyBuilder, http://www.unicode.org/reports/tr10/tr10-10.html#Interleaved_Levels
public final IKeyBuilder append(byte[] a)
Appends an array of bytes - the bytes are treated as unsigned values.
Specified by: append in interface IKeyBuilder
Parameters: a - The array of bytes.
public final IKeyBuilder append(double d)
Appends a double precision floating point value by first converting it into a signed long integer using Double.doubleToLongBits(double),
converting that value into a two's-complement number and then appending
the bytes in big-endian order into the key buffer.
Note: this converts -0d and +0d to the same key.
Specified by: append in interface IKeyBuilder
Parameters: d - The double-precision floating point value.
public static double decodeDouble(byte[] key,
int off)
public final IKeyBuilder append(float f)
Appends a single precision floating point value by first converting it into a signed integer using Float.floatToIntBits(float),
converting that value into a two's-complement number and then appending
the bytes in big-endian order into the key buffer.
Note: this converts -0f and +0f to the same key.
Specified by: append in interface IKeyBuilder
Parameters: f - The single-precision floating point value.
public static float decodeFloat(byte[] key,
int off)
public final IKeyBuilder append(UUID uuid)
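Appends the UUID to the key using the MSB and then the LSB.
Specified by: append in interface IKeyBuilder
Parameters: uuid - The UUID.
public final IKeyBuilder append(long v)
Appends a signed long integer to the key by first converting it to a lexicographic ordering as an unsigned long integer and then appending it into the buffer as 8 bytes using a big-endian order.
Specified by: append in interface IKeyBuilder
public final IKeyBuilder append(int v)
Appends a signed integer to the key by first converting it to a lexicographic ordering as an unsigned integer and then appending it into the buffer as 4 bytes using a big-endian order.
Specified by: append in interface IKeyBuilder
public final IKeyBuilder append(short v)
Appends a signed short integer to the key by first converting it to a two's-complement representation supporting unsigned byte[] comparison and then appending it into the buffer as 2 bytes using a big-endian order.
Specified by: append in interface IKeyBuilder
public final IKeyBuilder appendUnsigned(byte v)
public final IKeyBuilder append(byte v)
Converts the signed byte to an unsigned byte and appends it to the key.
Specified by: append in interface IKeyBuilder
Parameters: v - The signed byte.
public final IKeyBuilder appendNul()
Append an unsigned zero byte to the key.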
Specified by: appendNul in interface IKeyBuilder
public static final byte[] asSortKey(Object val)
Utility method converts an application key to a sort key (an unsigned byte[] that imposes the same sort order).
Note: This method is thread-safe.
Note: Strings are Unicode safe for the default locale. See
Locale.getDefault(). If you require a specific locale or
different locales at different times or for different indices then you
MUST provision and apply your own KeyBuilder.
Parameters: val - An application key.
Returns: null iff the key is null. If the key is a byte[], then the byte[] itself will be returned.
public byte[] getSortKey(Object val)
Return an unsigned byte[] sort key.
Specified by: getSortKey in interface ISortKeyBuilder
Parameters: val - Some object (required).
public IKeyBuilder append(Object val)
Append the value to the buffer, encoding it as appropriate based on the class of the object. Supported classes include UUID and Unicode Strings.
Specified by: append in interface IKeyBuilder
Parameters: val - The value.
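public static byte encodeByte(int v)
Converts a signed byte into an unsigned byte.
Parameters: v - The signed byte.
public static byte decodeByte(int v)
Converts an unsigned byte into a signed byte.
Parameters: v - The unsigned byte.
public static long d2l(double d)
Encodes a double precision floating point value as an int64 value that has the same total ordering (you can compare two doubles encoded by this method and the long values will have the same ordering as the double values). The value is first converted with Double.doubleToLongBits(double), and the resulting long is then converted into a two's complement number.
See Comparing floating point numbers by Bruce Dawson.
Parameters: d - The double precision floating point value.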
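The order-preserving conversion described for d2l can be sketched as follows. OrderedDouble, encode, and decode are hypothetical names. The sign handling follows the general technique from the Bruce Dawson article cited above and is an assumption about how the conversion is done, not a copy of the KeyBuilder source.

```java
/** Hypothetical sketch of an order-preserving double-to-long mapping. */
final class OrderedDouble {
    /**
     * Maps a double to a long such that numeric order of the doubles matches
     * signed order of the longs. IEEE 754 bit patterns are sign-magnitude, so
     * negative values are remapped to two's complement.
     */
    static long encode(double d) {
        long bits = Double.doubleToLongBits(d);
        if (bits < 0) {
            // Sign bit set: bits == MIN_VALUE + magnitude, so this yields -magnitude.
            bits = 0x8000000000000000L - bits;
        }
        return bits;
    }

    /** Inverse of encode; note that -0d comes back as +0d. */
    static double decode(long v) {
        if (v < 0) {
            v = 0x8000000000000000L - v;
        }
        return Double.longBitsToDouble(v);
    }
}
```

With this mapping -0d and +0d both encode to zero, which is consistent with the note on append(double) that the two values produce the same key.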
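public static int f2i(float f)
Encodes a floating point value as an int32 value that has the same total ordering (you can compare two floats encoded by this method and the int values will have the same ordering as the float values). The value is first converted with Float.floatToIntBits(float), and the resulting int is then converted into a two's complement number.
See Comparing floating point numbers by Bruce Dawson.
Parameters: f - The floating point value.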
public static long decodeLong(byte[] buf,
int off)
Decodes a signed long value as encoded by append(long).
Parameters:
buf - The buffer containing the encoded key.
off - The offset at which to decode the key.
public static int decodeInt(byte[] buf,
int off)
Decodes a signed int value as encoded by append(int).
Parameters:
buf - The buffer containing the encoded key.
off - The offset at which to decode the key.
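The encode and decode pairing for signed integers can be sketched as below. OrderedInt is a hypothetical class. Flipping the sign bit is one common way to obtain an unsigned ordering and is assumed here; the documentation only states that the value is converted to a lexicographic ordering as an unsigned integer and written as 4 big-endian bytes.

```java
/** Hypothetical sketch of an order-preserving int encoding; not the library code. */
final class OrderedInt {
    /** Encodes v as 4 big-endian bytes whose unsigned order matches signed int order. */
    static byte[] encode(int v) {
        int u = v ^ 0x80000000; // flip the sign bit: MIN_VALUE -> 0, -1 -> 0x7fffffff
        return new byte[] {
            (byte) (u >>> 24), (byte) (u >>> 16), (byte) (u >>> 8), (byte) u
        };
    }

    /** Decodes a value written by encode, starting at offset off. */
    static int decode(byte[] buf, int off) {
        int u = ((buf[off] & 0xff) << 24)
              | ((buf[off + 1] & 0xff) << 16)
              | ((buf[off + 2] & 0xff) << 8)
              | (buf[off + 3] & 0xff);
        return u ^ 0x80000000; // flip the sign bit back
    }

    /** Lexicographic comparison treating each byte as an unsigned value. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x < y ? -1 : 1;
            }
        }
        return Integer.compare(a.length, b.length);
    }
}
```

Big-endian order matters here: the most significant byte must come first so that a byte-wise comparison reaches the sign information before the low-order bytes.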
public static short decodeShort(byte[] buf,
int off)
Decodes a signed short value as encoded by append(short).
Parameters:
buf - The buffer containing the encoded key.
off - The offset at which to decode the key.
public static IKeyBuilder newInstance()
public static IKeyBuilder newInstance(int initialCapacity)
Create an instance for ASCII keys with the specified initial capacity.
Parameters: initialCapacity - The initial capacity.
public static IKeyBuilder newUnicodeInstance()
Create a factory for IKeyBuilder instances configured using the
system properties. The factory will support Unicode unless
CollatorEnum.ASCII is explicitly specified for the
KeyBuilder.Options.COLLATOR property.
Parameters: properties - The properties to be used (optional). When null the System properties are used.
Throws: UnsupportedOperationException - The ICU library was required but was not located. Make sure that the ICU JAR is on the classpath. See KeyBuilder.Options.COLLATOR.
Note: If you are trying to use ICU4JNI then that has to be locatable as a native library. How you do this is different for Windows and Un*x.
See Also: KeyBuilder.Options
public static IKeyBuilder newUnicodeInstance(Properties properties)
Create a factory for IKeyBuilder instances configured according
to the specified properties. Any properties NOT explicitly given
will be defaulted from System.getProperties(). The pre-defined
properties KeyBuilder.Options.USER_LANGUAGE, KeyBuilder.Options.USER_COUNTRY,
and KeyBuilder.Options.USER_VARIANT MAY be overridden. The factory will
support Unicode unless CollatorEnum.ASCII is explicitly specified
for the KeyBuilder.Options.COLLATOR property.
Parameters: properties - The properties to be used (optional). When null the System properties are used.
Throws: UnsupportedOperationException - The ICU library was required but was not located. Make sure that the ICU JAR is on the classpath. See KeyBuilder.Options.COLLATOR.
Note: If you are trying to use ICU4JNI then that has to be locatable as a native library. How you do this is different for Windows and Un*x.
See Also: KeyBuilder.Options
public static IKeyBuilder newInstance(int capacity,
CollatorEnum collatorChoice,
Locale locale,
Object strength,
DecompositionEnum mode)
Create a new instance that optionally supports Unicode sort keys.
Parameters:
capacity - The initial capacity of the buffer. When zero (0) the DEFAULT_INITIAL_CAPACITY will be used.
collatorChoice - Identifies the collator that will be used to generate sort keys from Unicode values.
locale - When null the default locale will be used.
strength - Either an Integer or a StrengthEnum specifying the strength to be set on the collator object (optional). When null the default strength of the collator will not be overridden.
mode - The decomposition mode to be set on the collator object (optional). When null the default decomposition mode of the collator will not be overridden.
Throws: UnsupportedOperationException - The ICU library was required but was not located. Make sure that the ICU JAR is on the classpath.
Note: If you are trying to use ICU4JNI then that has to be locatable as a native library. How you do this is different for Windows and Un*x.