java.lang.Object
  com.bigdata.sparse.SparseRowStore
public class SparseRowStore
A client-side class that knows how to use an IIndex to provide an
efficient data model in which a logical row is stored as one or more entries
in the IIndex. Operations are provided for atomic read and write of
logical rows. While the scan operations are always consistent (they will never
reveal data from a row that is undergoing concurrent modification), they do NOT
cause concurrent atomic row writes to block. This means that rows that would
be visited by a scan MAY be modified before the scan reaches those rows and
the client will see the updates.
The SparseRowStore requires that you declare the KeyType for the
primary key so that it may impose a consistent total ordering over the
generated keys in the index.
There is no intrinsic reason why column values must be strongly typed.
Therefore, by default column values are loosely typed. However, column values
MAY be constrained by a Schema.
This class builds keys using the sparse row store design pattern. Each logical row is modeled as an ordered set of index entries whose keys are formed as:
[schemaName][primaryKey][columnName][timestamp]
and the values are the value for a given column for that primary key.
Timestamps are either generated by the application, in which case they define the semantics of a write-write conflict, or assigned by the index on write. In the latter case, write-write conflicts never arise. Regardless of how timestamps are generated, the use of the timestamp in the key requires that applications specify filters that are applied during row scans to limit the data points actually returned as part of the row. For example, a filter may return only the most recent column values no later than a given timestamp for all columns of some primary key.
For example, assume records with the following columns:
[employee][12][DateOfHire][t0] : [4/30/02]
[employee][12][DateOfHire][t1] : [4/30/05]
[employee][12][Employer][t0] : [SAIC]
[employee][12][Employer][t1] : [SYSTAP]
[employee][12][Id][t0] : [12]
[employee][12][Name][t0] : [Bryan Thompson]
In order to read the logical row whose last update was t0,
the caller would specify t0 as the toTime of interest.
The values read in this example would be {<DateOfHire, t0, 4/30/02>,
<Employer, t0, SAIC>, <Id, t0, 12>, <Name, t0, Bryan
Thompson>}.
Likewise, in order to read the logical row whose last update was t1, the caller would specify t1 as the toTime of interest. The values read in this example would be {<DateOfHire, t1, 4/30/05>, <Employer, t1, SYSTAP>, <Id, t0, 12>, <Name, t0, Bryan Thompson>}. Notice that values written at t0 and not overwritten or deleted by t1 are present in the resulting logical row.
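The timestamp-filtered read in the example above can be modeled with a sorted map. The following is a minimal sketch and not the bigdata implementation: the class `RowAsOfSketch`, the string key encoding with a NUL separator, and the numeric stand-ins t0 = 0 and t1 = 1 are all assumptions made for illustration. It treats the timestamp of interest as inclusive, following the "no later than" wording above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative model only; it does not use the bigdata API. */
final class RowAsOfSketch {

    /** Separator between key components (an assumption of this sketch). */
    static final char SEP = '\0';

    /** Build a key as [schemaName][primaryKey][columnName][timestamp]. */
    static String key(String schema, String primaryKey, String column, long timestamp) {
        // Zero padding keeps non-negative timestamps in numeric order as strings.
        return schema + SEP + primaryKey + SEP + column + SEP
                + String.format("%019d", timestamp);
    }

    /** The six index entries from the example, with t0 = 0 and t1 = 1. */
    static TreeMap<String, Object> sampleIndex() {
        TreeMap<String, Object> ndx = new TreeMap<>();
        ndx.put(key("employee", "12", "DateOfHire", 0L), "4/30/02");
        ndx.put(key("employee", "12", "DateOfHire", 1L), "4/30/05");
        ndx.put(key("employee", "12", "Employer", 0L), "SAIC");
        ndx.put(key("employee", "12", "Employer", 1L), "SYSTAP");
        ndx.put(key("employee", "12", "Id", 0L), "12");
        ndx.put(key("employee", "12", "Name", 0L), "Bryan Thompson");
        return ndx;
    }

    /** Most recent value per column whose timestamp is no later than asOf. */
    static Map<String, Object> readAsOf(TreeMap<String, Object> ndx, String schema,
            String primaryKey, long asOf) {
        String prefix = schema + SEP + primaryKey + SEP;
        Map<String, Object> row = new LinkedHashMap<>();
        // Entries of one logical row are contiguous because they share a prefix.
        for (Map.Entry<String, Object> e
                : ndx.subMap(prefix, prefix + '\uffff').entrySet()) {
            String rest = e.getKey().substring(prefix.length());
            int i = rest.lastIndexOf(SEP);
            String column = rest.substring(0, i);
            long timestamp = Long.parseLong(rest.substring(i + 1));
            if (timestamp <= asOf) {
                // Ascending key order means a later timestamp overwrites an earlier one.
                row.put(column, e.getValue());
            }
        }
        return row;
    }
}
```

Calling `readAsOf(sampleIndex(), "employee", "12", 0L)` in this model yields the t0 row, and passing `1L` yields the t1 row in which only DateOfHire and Employer have changed. Note that the parameter documentation further down describes toTime as an exclusive bound, so the real API may differ from this inclusive sketch.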
Note: Very large objects should be stored in the BigdataFileSystem
(distributed, atomic, versioned, chunked file system) and the identifier for
that object can then be stored in the row store.
SparseRowStore. A caching layer in the web app could be used to
reduce any hotspots.
Version:
$Id: SparseRowStore.java 2265 2009-10-26 12:51:06Z thompsonbry $

| Field Summary | |
|---|---|
| protected boolean | DEBUG: True iff the log level is DEBUG or less. |
| protected boolean | INFO: True iff the log level is INFO or less. |
| protected static org.apache.log4j.Logger | log |

| Fields inherited from interface com.bigdata.sparse.IRowStoreConstants |
|---|
AUTO_TIMESTAMP, AUTO_TIMESTAMP_UNIQUE, CURRENT_ROW, MAX_TIMESTAMP, MIN_TIMESTAMP |
| Constructor Summary | |
|---|---|
| SparseRowStore(IIndex ndx) | Create a client-side abstraction that treats an IIndex as a SparseRowStore. |

| Method Summary | |
|---|---|
| ITPS | delete(Schema schema, Object primaryKey): Atomic delete of all property values for the current logical row. |
| ITPS | delete(Schema schema, Object primaryKey, long fromTime, long toTime, long writeTime, INameFilter filter): Atomic delete of all property values for the logical row. |
| Object | get(Schema schema, Object primaryKey, String name): Return the current binding for the named property. |
| IIndex | getIndex(): The backing index. |
| Iterator<? extends ITPS> | rangeIterator(Schema schema): A logical row scan. |
| Iterator<? extends ITPS> | rangeIterator(Schema schema, Object fromKey, Object toKey): A logical row scan. |
| Iterator<? extends ITPS> | rangeIterator(Schema schema, Object fromKey, Object toKey, INameFilter filter): A logical row scan. |
| Iterator<? extends ITPS> | rangeIterator(Schema schema, Object fromKey, Object toKey, int capacity, long fromTime, long toTime, INameFilter nameFilter): A logical row scan. |
| Map<String,Object> | read(Schema schema, Object primaryKey): Read the most recent logical row from the index. |
| Map<String,Object> | read(Schema schema, Object primaryKey, INameFilter filter): Read the most recent logical row from the index. |
| ITPS | read(Schema schema, Object primaryKey, long fromTime, long toTime, INameFilter filter): Read a logical row from the index. |
| Map<String,Object> | write(Schema schema, Map<String,Object> propertySet): Atomic write with atomic read-back of the post-update state of the logical row. |
| Map<String,Object> | write(Schema schema, Map<String,Object> propertySet, long writeTime): Atomic write with atomic read-back of the post-update state of the logical row. |
| TPS | write(Schema schema, Map<String,Object> propertySet, long writeTime, INameFilter filter, IPrecondition precondition): Atomic write with atomic read of the then current post-condition state of the logical row. |
| TPS | write(Schema schema, Map<String,Object> propertySet, long fromTime, long toTime, long writeTime, INameFilter filter, IPrecondition precondition): Atomic write with atomic read of the post-condition state of the logical row. |

| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
protected static final org.apache.log4j.Logger log
protected final boolean INFO
True iff the log level is INFO or less.
protected final boolean DEBUG
True iff the log level is DEBUG or less.
| Constructor Detail |
|---|
public SparseRowStore(IIndex ndx)
Create a client-side abstraction that treats an IIndex as a
SparseRowStore.
Parameters:
ndx - The index.

| Method Detail |
|---|
public IIndex getIndex()
The backing index.
public Object get(Schema schema,
Object primaryKey,
String name)
Return the current binding for the named property.
Parameters:
schema - The Schema governing the logical row.
primaryKey - The primary key that identifies the logical row.
name - The property name.
Returns:
null iff the property is not bound.
See Also:
AbstractAtomicRowReadOrWrite.getCurrentValue(IIndex, Schema, Object, String)
public Map<String,Object> read(Schema schema,
Object primaryKey)
Read the most recent logical row from the index.
Parameters:
schema - The Schema governing the logical row.
primaryKey - The primary key that identifies the logical row.
Returns:
null IFF there are no property values for that
logical row (including no deleted property values, no property
values that are excluded due to their timestamps, and no property
values that are excluded due to a property name filter). A
null return is a strong guarantee that NO data
existed in the row store at the time of the read for the given
schema and primaryKey.
public Map<String,Object> read(Schema schema,
Object primaryKey,
INameFilter filter)
Read the most recent logical row from the index.
Parameters:
schema - The Schema governing the logical row.
primaryKey - The primary key that identifies the logical row.
filter - An optional filter.
Returns:
null IFF there are no property values for that
logical row (including no deleted property values, no property
values that are excluded due to their timestamps, and no property
values that are excluded due to a property name filter). A
null return is a strong guarantee that NO data
existed in the row store at the time of the read for the given
schema and primaryKey.
public ITPS read(Schema schema,
Object primaryKey,
long fromTime,
long toTime,
INameFilter filter)
Read a logical row from the index.
Parameters:
schema - The Schema governing the logical row.
primaryKey - The primary key that identifies the logical row.
fromTime - The first timestamp for which timestamped property values will
be accepted.
toTime - The first timestamp for which timestamped property values will
NOT be accepted -or- IRowStoreConstants.CURRENT_ROW to
accept only the most current binding whose timestamp is GTE
fromTime.
filter - An optional filter that may be used to select values for
property names accepted by the filter.
Returns:
null IFF there are no
property values for that logical row (including no deleted
property values, no property values that are excluded due to
their timestamps, and no property values that are excluded due to
a property name filter). A null return is a strong
guarantee that NO data existed in the row store at the time of
the read for the given schema and primaryKey.
Throws:
IllegalArgumentException - if the schema is null.
IllegalArgumentException - if the primaryKey is null.
IllegalArgumentException - if the fromTime and/or toTime are invalid.
See Also:
ITimestampPropertySet#asMap(), which returns the most current bindings.
ITimestampPropertySet#asMap(long), which returns the most current bindings
as of the specified timestamp.
IRowStoreConstants.CURRENT_ROW
IRowStoreConstants.MIN_TIMESTAMP
IRowStoreConstants.MAX_TIMESTAMP
public Map<String,Object> write(Schema schema,
Map<String,Object> propertySet)
Atomic write with atomic read-back of the post-update state of the logical row.
Note: In order to cause a column value for a row to be deleted you MUST
specify a null column value for that column.
Note: The value of the primaryKey is written each time the logical row is updated, and the timestamp associated with the value for the primaryKey property tells you the timestamp of each row revision.
Parameters:
schema - The Schema governing the logical row.
propertySet - The column names and values for that row.
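The delete-by-null convention can be illustrated with a small model of a row history. This is only a sketch under stated assumptions: `TombstoneSketch` and its `Entry` class are hypothetical stand-ins and are not the library's ITPS or ITPV types. It shows how a null written at a later timestamp hides a column when the history is collapsed into a Map, while the history itself still records when the delete happened.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative model only; it does not use the bigdata API. */
final class TombstoneSketch {

    /** One timestamped column value; a null value marks a delete. */
    static final class Entry {
        final String column;
        final long timestamp;
        final Object value;

        Entry(String column, long timestamp, Object value) {
            this.column = column;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /**
     * Collapse a history into the current row. The history is assumed to be
     * ordered by ascending timestamp within each column.
     */
    static Map<String, Object> asMap(List<Entry> history) {
        Map<String, Object> row = new LinkedHashMap<>();
        for (Entry e : history) {
            if (e.value == null) {
                row.remove(e.column); // a null hides any earlier binding
            } else {
                row.put(e.column, e.value);
            }
        }
        return row;
    }
}
```

In this model, a history of `Employer=SAIC` at time 0 followed by `Employer=null` at time 1 collapses to a row with no Employer column, yet the second entry remains in the list with its timestamp.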
public Map<String,Object> write(Schema schema,
Map<String,Object> propertySet,
long writeTime)
Atomic write with atomic read-back of the post-update state of the logical row.
Parameters:
schema - The Schema governing the logical row.
propertySet - The column names and values for that row.
writeTime - The timestamp to use for the row -or-
IRowStoreConstants.AUTO_TIMESTAMP if the timestamp
will be generated by the server -or-
IRowStoreConstants.AUTO_TIMESTAMP_UNIQUE if a
federation-wide unique timestamp will be generated by the
server.
public TPS write(Schema schema,
Map<String,Object> propertySet,
long writeTime,
INameFilter filter,
IPrecondition precondition)
Atomic write with atomic read of the then current post-condition state of the logical row.
Note: In order to cause a column value for a row to be deleted you MUST
specify a null column value for that column. A
null will be written under the key for the column value
with a new timestamp. This is interpreted as a deleted property value
when the row is simplified as a Map. If you examine the
ITPS you can see the ITPV with the null
value and the timestamp of the delete.
Note: The value of the primaryKey is written each time the logical row is updated, and the timestamp associated with the value for the primaryKey property tells you the timestamp of each row revision.
Note: If the caller specified a timestamp, then that timestamp is used by the atomic read. If the timestamp was assigned by the server, then the server-assigned timestamp is used by the atomic read.
Note: You can verify pre-conditions for the logical row on the server. Among other things, this could be used to reject an update if someone has modified the logical row since you last read some value.
Parameters:
schema - The Schema governing the logical row.
propertySet - The column names and values for that row. The primaryKey as
identified by the Schema MUST be present in the
propertySet.
writeTime - The timestamp to use for the row -or-
IRowStoreConstants.AUTO_TIMESTAMP if the timestamp
will be generated by the server -or-
IRowStoreConstants.AUTO_TIMESTAMP_UNIQUE if a
federation-wide unique timestamp will be generated by the
server.
filter - An optional filter used to select the property values that
will be returned (this has no effect on the atomic write).
precondition - When present, the pre-condition state of the row will be read
and offered to the IPrecondition. If the
IPrecondition fails, then the atomic write will NOT be
performed and the pre-condition state of the row will be
returned. If the IPrecondition succeeds, then the
atomic write will be performed and the post-condition state of
the row will be returned. Use TPS.isPreconditionOk()
to determine whether or not the write was performed.
Returns:
null iff there is no data for the
primaryKey (per the contract for an atomic read).
If an optional IPrecondition was specified and the
IPrecondition was NOT satisfied, then the write
operation was NOT performed and the result is the pre-condition
state of the logical row (which, again, will be null
IFF there is NO data for the primaryKey).
See Also:
ITPS.getWriteTimestamp()
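The precondition contract described for this method resembles a compare-and-set. The following is a minimal in-memory sketch of that contract and not the library's implementation: `PreconditionSketch`, its `Result` class, and the use of `java.util.function.Predicate` in place of IPrecondition are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Illustrative model only; it does not use the bigdata API. */
final class PreconditionSketch {

    /** Row state returned by a write, plus whether the write was applied. */
    static final class Result {
        final Map<String, Object> row; // null iff there is no data for the key
        final boolean preconditionOk;

        Result(Map<String, Object> row, boolean preconditionOk) {
            this.row = row;
            this.preconditionOk = preconditionOk;
        }
    }

    private final Map<Object, Map<String, Object>> rows = new HashMap<>();

    /**
     * Test the precondition and apply the write in one synchronized step.
     * The predicate is offered the current row, which is null when absent.
     */
    synchronized Result write(Object primaryKey, Map<String, Object> propertySet,
            Predicate<Map<String, Object>> precondition) {
        Map<String, Object> current = rows.get(primaryKey);
        if (precondition != null && !precondition.test(current)) {
            // Rejected: return the pre-condition state and leave the row alone.
            return new Result(current == null ? null : copy(current), false);
        }
        Map<String, Object> updated = new HashMap<>();
        if (current != null) {
            updated.putAll(current);
        }
        updated.putAll(propertySet);
        rows.put(primaryKey, updated);
        // Accepted: return the post-condition state.
        return new Result(copy(updated), true);
    }

    private static Map<String, Object> copy(Map<String, Object> m) {
        return new HashMap<>(m);
    }
}
```

A caller in this model would inspect `preconditionOk` before trusting that its update was stored, which mirrors the documented role of TPS.isPreconditionOk().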
public TPS write(Schema schema,
Map<String,Object> propertySet,
long fromTime,
long toTime,
long writeTime,
INameFilter filter,
IPrecondition precondition)
Atomic write with atomic read of the post-condition state of the logical row.
Note: In order to cause a column value for a row to be deleted you MUST
specify a null column value for that column. A
null will be written under the key for the column value
with a new timestamp. This is interpreted as a deleted property value
when the row is simplified as a Map. If you examine the
ITPS you can see the ITPV with the null
value and the timestamp of the delete.
Note: The value of the primaryKey is written each time the logical row is updated, and the timestamp associated with the value for the primaryKey property tells you the timestamp of each row revision.
Note: If the caller specified a timestamp, then that timestamp is used by the atomic read. If the timestamp was assigned by the server, then the server-assigned timestamp is used by the atomic read.
Note: You can verify pre-conditions for the logical row on the server. Among other things, this could be used to reject an update if someone has modified the logical row since you last read some value.
Parameters:
schema - The Schema governing the logical row.
propertySet - The column names and values for that row. The primaryKey as
identified by the Schema MUST be present in the
propertySet.
fromTime - During pre-condition and post-condition reads, the
first timestamp for which timestamped property values will be
accepted.
toTime - During pre-condition and post-condition reads, the
first timestamp for which timestamped property values will NOT
be accepted -or- IRowStoreConstants.CURRENT_ROW to
accept only the most current binding whose timestamp is GTE
fromTime.
writeTime - The timestamp to use for the row -or-
IRowStoreConstants.AUTO_TIMESTAMP if the timestamp
will be generated by the server -or-
IRowStoreConstants.AUTO_TIMESTAMP_UNIQUE if a
federation-wide unique timestamp will be generated by the
server.
filter - An optional filter used to select the property values that
will be returned (this has no effect on the atomic write).
precondition - When present, the pre-condition state of the row will be read
and offered to the IPrecondition. If the
IPrecondition fails, then the atomic write will NOT be
performed and the pre-condition state of the row will be
returned. If the IPrecondition succeeds, then the
atomic write will be performed and the post-condition state of
the row will be returned. Use TPS.isPreconditionOk()
to determine whether or not the write was performed.
Returns:
null IFF there is NO
data for the primaryKey.
If an optional IPrecondition was specified and the
IPrecondition was NOT satisfied, then the write
operation was NOT performed and the result is the pre-condition
state of the logical row (which, again, will be null
IFF there is NO data for the primaryKey).
Throws:
UnsupportedOperationException - if a property has an auto-increment type and the
ValueType of the property does not support
auto-increment.
UnsupportedOperationException - if a property has an auto-increment type but there is no
successor in the value space of that property.
See Also:
ITPS.getWriteTimestamp()
ITimestampService with an
implementation that always returns a caller-given constant, another
that uses the local system clock, another that uses the system
clock but ensures that it never hands off the same timestamp twice
in a row, and another that resolves the global timestamp service.
It is also possible that the timestamp behavior should be defined
by the Schema and therefore factored out of this method
signature.
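One of the timestamp variants mentioned above is a clock that follows the system clock but never hands out the same timestamp twice. A minimal sketch of that idea is shown here; `UniqueClockSketch` is a hypothetical name, it is local to one JVM, and it is not the federation-wide service behind AUTO_TIMESTAMP_UNIQUE.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative model only; it does not use the bigdata API. */
final class UniqueClockSketch {

    /** The last timestamp handed out by this instance. */
    private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);

    /**
     * Return the current time in milliseconds, bumped past the previous
     * result when the system clock has not advanced (or has gone backwards).
     */
    long next() {
        final long now = System.currentTimeMillis();
        return last.updateAndGet(prev -> Math.max(prev + 1, now));
    }
}
```

Because each result is strictly greater than the previous one, two writes stamped by the same instance can never collide on the timestamp component of the key. The trade-off is that under a burst of calls the returned values can run ahead of the wall clock.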
public ITPS delete(Schema schema,
Object primaryKey)
Atomic delete of all property values for the current logical row.
Parameters:
schema - The schema.
primaryKey - The primary key for the logical row.
public ITPS delete(Schema schema,
Object primaryKey,
long fromTime,
long toTime,
long writeTime,
INameFilter filter)
Atomic delete of all property values for the logical row.
null, and the read property values are
returned.
Parameters:
schema - The schema.
primaryKey - The primary key for the logical row.
fromTime - During pre-condition and post-condition reads, the
first timestamp for which timestamped property values will be
accepted.
toTime - During pre-condition and post-condition reads, the
first timestamp for which timestamped property values will NOT
be accepted -or- IRowStoreConstants.CURRENT_ROW to
accept only the most current binding whose timestamp is GTE
fromTime.
writeTime - The timestamp that will be written into the "deleted" entries
-or- IRowStoreConstants.AUTO_TIMESTAMP if the
timestamp will be generated by the server -or-
IRowStoreConstants.AUTO_TIMESTAMP_UNIQUE if a
federation-wide unique timestamp will be generated by the
server.
filter - An optional filter used to select the property values that
will be deleted.
Returns:
ITPS.getWriteTimestamp() will report
the timestamp assigned to the deleted entries used to overwrite
these property values in the store.
IPrecondition., unit tests.
public Iterator<? extends ITPS> rangeIterator(Schema schema)
A logical row scan.
Parameters:
schema - The Schema governing the logical row.
public Iterator<? extends ITPS> rangeIterator(Schema schema,
Object fromKey,
Object toKey)
A logical row scan.
Parameters:
schema - The Schema governing the logical row.
fromKey - The value of the primary key for the lower bound (inclusive) of
the key range -or- null iff there is no lower
bound.
toKey - The value of the primary key for the upper bound (exclusive) of
the key range -or- null iff there is no upper
bound.
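The key-range convention (inclusive lower bound, exclusive upper bound, null for an open end) maps directly onto the standard `java.util.NavigableMap` views. The sketch below uses a plain sorted map keyed by primary key as a stand-in for the index; `KeyRangeSketch` is a hypothetical helper and not part of the library.

```java
import java.util.NavigableMap;

/** Illustrative model only; it does not use the bigdata API. */
final class KeyRangeSketch {

    /**
     * View of the rows whose primary key lies in [fromKey, toKey).
     * A null bound means that end of the range is open.
     */
    static <K, V> NavigableMap<K, V> range(NavigableMap<K, V> rows, K fromKey, K toKey) {
        if (fromKey == null && toKey == null) {
            return rows; // unbounded scan
        }
        if (fromKey == null) {
            return rows.headMap(toKey, false); // no lower bound
        }
        if (toKey == null) {
            return rows.tailMap(fromKey, true); // no upper bound
        }
        return rows.subMap(fromKey, true, toKey, false);
    }
}
```

With rows keyed 10, 12 and 15, a range of (12, 15) visits only key 12, a range of (null, 12) visits only key 10, and a range of (12, null) visits keys 12 and 15.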
public Iterator<? extends ITPS> rangeIterator(Schema schema,
Object fromKey,
Object toKey,
INameFilter filter)
A logical row scan.
Parameters:
schema - The Schema governing the logical row.
fromKey - The value of the primary key for the lower bound (inclusive) of
the key range -or- null iff there is no lower
bound.
toKey - The value of the primary key for the upper bound (exclusive) of
the key range -or- null iff there is no upper
bound.
filter - An optional filter.
public Iterator<? extends ITPS> rangeIterator(Schema schema,
Object fromKey,
Object toKey,
int capacity,
long fromTime,
long toTime,
INameFilter nameFilter)
A logical row scan.
Parameters:
schema - The Schema governing the logical row.
fromKey - The value of the primary key for the lower bound (inclusive) of
the key range -or- null iff there is no lower
bound.
toKey - The value of the primary key for the upper bound (exclusive) of
the key range -or- null iff there is no upper
bound.
capacity - When non-zero, this is the maximum number of logical rows that will
be read atomically. This is only an upper bound. The actual
number of logical rows in an atomic read depends on a variety of
factors.
fromTime - The first timestamp for which timestamped property values will
be accepted.
toTime - The first timestamp for which timestamped property values will
NOT be accepted -or- IRowStoreConstants.CURRENT_ROW to
accept only the most current binding whose timestamp is GTE
fromTime.
nameFilter - An optional filter used to select the property(s) of interest.