public interface IRootBlockView
Interface for a root block on the journal. The root block provides metadata about the journal. The journal has two root blocks, which are written in alternating order according to the Challis algorithm. Each root block includes a field at its head and at its tail whose value is strictly increasing. This field is often referred to as the root block "timestamp", but in practice we use the commit counter. On restart, the root block is chosen (a) whose strictly increasing fields agree; and (b) whose value for those fields is greater. This protects against both crashes and partial writes of the root block itself.
The commit counter is a store-local, strictly increasing, non-negative long integer (commit counters are distinct for each store regardless of whether the stores are part of the same distributed database). The commit counters MUST be strictly increasing (a) so that they place the commit records into a total ordering; (b) so that the more current root block may be chosen by comparing the value of the field in each of the two root blocks; and (c) so that a partial write of a root block may be detected by the presence of different values for the field at the head and tail of a given root block. The commit counter is also used as the field written at the head and tail of each root block according to the Challis algorithm. If those fields are the same then the root block is assumed to have been completely written.
Note that random data may still result in an identical value during a partial write. This possibility is guarded against by storing the checksum of the root block.
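The selection rule described above can be sketched as follows. This is only an illustration under stated assumptions: the RootBlock class, its fields, and the choose() method are hypothetical stand-ins invented for this example and are not part of IRootBlockView or its implementations.

```java
import java.util.Optional;

public class RootBlockChooser {

    /** Hypothetical, simplified stand-in for a root block (NOT the real IRootBlockView). */
    public static final class RootBlock {
        public final long headCounter;   // commit counter written at the head
        public final long tailCounter;   // commit counter written at the tail
        public final boolean checksumOk; // result of verifying the stored checksum

        public RootBlock(long headCounter, long tailCounter, boolean checksumOk) {
            this.headCounter = headCounter;
            this.tailCounter = tailCounter;
            this.checksumOk = checksumOk;
        }

        /** Treated as completely written only if head and tail agree and the checksum verifies. */
        public boolean isComplete() {
            return headCounter == tailCounter && checksumOk;
        }
    }

    /**
     * Choose the root block to use on restart: among the completely written
     * root blocks, prefer the one with the greater commit counter.
     */
    public static Optional<RootBlock> choose(RootBlock rb0, RootBlock rb1) {
        boolean ok0 = rb0 != null && rb0.isComplete();
        boolean ok1 = rb1 != null && rb1.isComplete();
        if (ok0 && ok1) {
            return Optional.of(rb0.headCounter > rb1.headCounter ? rb0 : rb1);
        }
        if (ok0) return Optional.of(rb0);
        if (ok1) return Optional.of(rb1);
        return Optional.empty(); // neither root block is usable
    }
}
```

A root block whose head and tail counters differ is treated as a partial write and is skipped, so the older but intact root block is chosen instead.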
The first and last commit times are persisted in each root block in order to
support both unisolated commits and transactions, whether in a local or a
distributed database. These "times" are generated by the appropriate
ITransactionManagerService service, which is responsible both for assigning
transaction start times (which are in fact the transaction identifier) and
transaction commit times, which are stored in root blocks of the various
stores that participate in a given database and reported via
getFirstCommitTime() and getLastCommitTime(). While these
do not strictly speaking have to be "times" they do have to be assigned using
the same measure as the transaction identifiers, so either a coordinated time
server or a strictly increasing counter. Regardless, we need to know "when" a
transaction commits as well as "when" it starts whether we measure "when"
using a counter or a clock. Also note that we need to assign "commit times"
even when the operation is unisolated. This means that we have to coordinate
an unisolated commit on a store that is part of a distributed database with
the centralized transaction manager. This should be done as part of the group
commit since we are waiting at that point anyway to optimize IO by minimizing
syncs to disk.
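As a rough illustration of the requirement that start times and commit times come from one strictly increasing measure, the sketch below hands out values that never repeat or go backwards even if the clock does. The MonotonicTimestamps class is hypothetical and is not the ITransactionManagerService implementation; it only shows how a clock and a counter can be combined.

```java
import java.util.concurrent.atomic.AtomicLong;

public class MonotonicTimestamps {

    // Last value handed out; 0L means nothing has been assigned yet.
    private final AtomicLong last = new AtomicLong(0L);

    /**
     * Return a value strictly greater than every value returned before,
     * using the supplied clock reading when it is far enough ahead.
     */
    public long next(long clockMillis) {
        return last.updateAndGet(prev -> Math.max(prev + 1, clockMillis));
    }

    /** Convenience overload that reads the system clock. */
    public long next() {
        return next(System.currentTimeMillis());
    }
}
```

Drawing both transaction start times and commit times from the same instance keeps them comparable, which is the property the paragraph above relies on.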
Note that some file systems or disks can re-order writes issued by the
application and write the data in a more efficient order. This can cause the
root blocks to be written before the application data is stable on disk. The
Options.DOUBLE_SYNC option exists to defeat this behavior and ensure
restart-safety for such systems.
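The write ordering that a double sync enforces can be sketched with java.nio as shown below. This is a minimal sketch, not the journal's actual commit code: the DoubleSyncCommit class, the commit() method, and the offsets passed to it are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class DoubleSyncCommit {

    /**
     * Write the application data, sync, then write the root block and sync
     * again, so the root block cannot reach the disk before the data.
     */
    public static void commit(FileChannel ch, ByteBuffer data, long dataOffset,
                              ByteBuffer rootBlock, long rootBlockOffset) throws IOException {
        writeFully(ch, data, dataOffset);
        ch.force(true); // first sync: application data is stable on disk
        writeFully(ch, rootBlock, rootBlockOffset);
        ch.force(true); // second sync: root block is stable on disk
    }

    /** FileChannel.write may write fewer bytes than requested, so loop until done. */
    private static void writeFully(FileChannel ch, ByteBuffer buf, long pos) throws IOException {
        while (buf.hasRemaining()) {
            pos += ch.write(buf, pos);
        }
    }
}
```

The cost is one extra sync per commit, which is why this behavior is an option rather than the default described here.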
| Method Summary | |
|---|---|
| ByteBuffer | asReadOnlyBuffer() - A read-only buffer whose contents are the root block. |
| long | getCloseTime() - The timestamp assigned as the time at which writes were disallowed for the journal. |
| long | getCommitCounter() - The commit counter is a positive long integer that is strictly local to the store. |
| long | getCommitRecordAddr() - Return the address at which the ICommitRecord for this root block is stored. |
| long | getCommitRecordIndexAddr() - The address of the root of the CommitRecordIndex. |
| long | getCreateTime() - The timestamp assigned as the creation time for the journal. |
| long | getFirstCommitTime() - The database wide timestamp of first commit on the store -or- 0L if there have been no commits. |
| long | getLastCommitTime() - The database wide timestamp of the most recent commit on the store or 0L iff there have been no commits. |
| long | getNextOffset() - The next offset at which a data item would be written on the store. |
| int | getOffsetBits() - The #of bits in a 64-bit long integer address that are dedicated to the byte offset into the store. |
| UUID | getUUID() - The unique journal identifier |
| int | getVersion() - The root block version number. |
| boolean | isRootBlock0() - There are two root blocks and they are written in an alternating order. |
| void | valid() - Assertion throws exception unless the root block is valid. |
| Method Detail |
|---|
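void valid() throws RootBlockException
Assertion throws exception unless the root block is valid.
Throws: RootBlockException
boolean isRootBlock0()
There are two root blocks and they are written in an alternating order.
int getVersion()
The root block version number.
long getNextOffset()
The next offset at which a data item would be written on the store.
long getFirstCommitTime()
The database wide timestamp of first commit on the store -or- 0L if there have been no commits.
long getLastCommitTime()
The database wide timestamp of the most recent commit on the store or 0L iff there have been no commits.
long getCommitCounter()
The commit counter is a positive long integer that is strictly local to the store. It is also the value written at the head and tail of each root block (the Challis field).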
long getCommitRecordAddr()
Return the address at which the ICommitRecord for this root block
is stored. The ICommitRecords are stored separately from the
root block so that they may be indexed by the commit timestamps. This is
necessary in order to be able to quickly recover the root addresses for a
given commit timestamp, which is a feature used to support transactional
isolation.
Note: When a logical journal may overflow onto more than one physical
journal then the address of the ICommitRecord MAY refer to a
historical physical journal and care MUST be exercised to resolve the
address against the appropriate journal file.
Returns: The address at which the ICommitRecord for this root
block is stored.
long getCommitRecordIndexAddr()
The address of the root of the CommitRecordIndex. The
CommitRecordIndex contains the ordered addresses of the
historical ICommitRecords on the Journal. The address
of the CommitRecordIndex is stored directly in the root block
rather than the ICommitRecord since we cannot obtain this
address until after we have formatted and written the
ICommitRecord.
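UUID getUUID()
The unique journal identifier.
int getOffsetBits()
The #of bits in a 64-bit long integer address that are dedicated to the byte offset into the store.
See Also: WormAddressManager
long getCreateTime()
The timestamp assigned as the creation time for the journal.
long getCloseTime()
The timestamp assigned as the time at which writes were disallowed for the journal.
ByteBuffer asReadOnlyBuffer()
A read-only buffer whose contents are the root block. [...] ByteBuffer that is
returned by this method.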
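To make the role of getOffsetBits() concrete, the following sketch packs a byte offset and a byte count into one 64-bit long. It assumes the offset occupies the high offsetBits bits and the byte count occupies the remaining low bits; that layout, the OffsetBitsAddress class, and its method names are assumptions for illustration and are not taken from WormAddressManager.

```java
public class OffsetBitsAddress {

    private final int offsetBits;     // bits reserved for the byte offset
    private final int byteCountBits;  // remaining bits, used for the byte count
    private final long byteCountMask; // mask selecting the low byteCountBits bits

    public OffsetBitsAddress(int offsetBits) {
        if (offsetBits < 1 || offsetBits > 63) {
            throw new IllegalArgumentException("offsetBits must be in [1,63]");
        }
        this.offsetBits = offsetBits;
        this.byteCountBits = 64 - offsetBits;
        this.byteCountMask = (1L << byteCountBits) - 1;
    }

    /** Pack an offset and a byte count into one long (assumed layout: offset high, count low). */
    public long toAddr(long offset, int byteCount) {
        if (offset < 0 || (offset >>> offsetBits) != 0) {
            throw new IllegalArgumentException("offset does not fit in " + offsetBits + " bits");
        }
        if (byteCount < 0 || byteCount > byteCountMask) {
            throw new IllegalArgumentException("byteCount does not fit in " + byteCountBits + " bits");
        }
        return (offset << byteCountBits) | byteCount;
    }

    /** Extract the byte offset from the high bits. */
    public long getOffset(long addr) {
        return addr >>> byteCountBits;
    }

    /** Extract the byte count from the low bits. */
    public int getByteCount(long addr) {
        return (int) (addr & byteCountMask);
    }
}
```

Under this layout, more offset bits allow a larger store at the cost of a smaller maximum record size, and fewer offset bits give the reverse trade-off.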