com.bigdata.service
Interface ITxCommitProtocol

All Superinterfaces:
Remote
All Known Subinterfaces:
IDataService, IMetadataService
All Known Implementing Classes:
AbstractEmbeddedDataService, DataServer.AdministrableDataService, DataService, EmbeddedFederation.EmbeddedDataServiceImpl, EmbeddedMetadataService, LocalDataServiceFederation.LocalDataServiceImpl, MetadataServer.AdministrableMetadataService, MetadataService

public interface ITxCommitProtocol
extends Remote

Remote interface by which the ITransactionService manages the state of transactions on the distributed IDataServices.

Version:
$Id: ITxCommitProtocol.java 2265 2009-10-26 12:51:06Z thompsonbry $
Author:
Bryan Thompson

Method Summary
 void abort(long tx)
          Request abort of the transaction by the data service.
 void prepare(long tx, long revisionTime)
          Request that the IDataService participate in a 3-phase commit.
 void setReleaseTime(long releaseTime)
          Notify a data service that it MAY release data required to support views for up to the specified releaseTime.
 long singlePhaseCommit(long tx)
          Request commit of the transaction by the data service.
 

Method Detail

setReleaseTime

void setReleaseTime(long releaseTime)
                    throws IOException
Notify a data service that it MAY release data required to support views for up to the specified releaseTime. This is the mechanism by which read locks are released. In effect, a read lock is a requirement that the releaseTime not be advanced as far as the start time of the transaction holding that read lock. Periodically and as transactions complete, the transaction manager will advance the releaseTime, thereby releasing read locks.

Parameters:
releaseTime - The new release time (strictly advanced by the transaction manager).
Throws:
IllegalStateException - if the read lock is set to a time earlier than its current value.
IOException - if there is an RMI problem.
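
The read-lock rule described above can be illustrated with a minimal sketch. ReleaseTimeTracker and isReadLockHeld are hypothetical names introduced only for this example; they are not part of the bigdata API, and the real data service may track the release time differently.

```java
/** Hypothetical helper for illustration; not part of the bigdata API. */
final class ReleaseTimeTracker {

    /** The most recent release time accepted by this (simulated) data service. */
    private long releaseTime = 0L;

    /**
     * Mirrors the documented contract: the release time may not move backwards.
     * An equal value is accepted here because the documented exception only
     * covers a time earlier than the current value (an assumption of this sketch).
     */
    synchronized void setReleaseTime(long newReleaseTime) {
        if (newReleaseTime < releaseTime) {
            throw new IllegalStateException(
                    "releaseTime would move backwards: " + newReleaseTime
                            + " < " + releaseTime);
        }
        releaseTime = newReleaseTime;
    }

    /**
     * A read lock is intact while the release time has not been advanced
     * as far as the start time of the transaction holding it.
     */
    synchronized boolean isReadLockHeld(long txStartTime) {
        return releaseTime < txStartTime;
    }
}
```

With a release time of 100, a transaction that started at 150 still holds its read lock, while the view for a start time of 100 may already have been released.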

abort

void abort(long tx)
           throws IOException
Request abort of the transaction by the data service. This message is sent in response to ITransactionService.abort(long) to each IDataService on which the transaction has written. It is NOT sent for read-only transactions since they have no local state on the IDataServices.

Parameters:
tx - The transaction identifier.
Throws:
IllegalArgumentException - if the transaction has not been started on this data service.
IOException - if there is an RMI problem.

singlePhaseCommit

long singlePhaseCommit(long tx)
                       throws InterruptedException,
                              ExecutionException,
                              IOException
Request commit of the transaction by the data service. In the case where the transaction is entirely contained on the data service this method may be used to both prepare (validate) and commit the transaction (a single phase commit). Otherwise a 2-/3-phase commit is required and a separate prepare(long, long) message MUST be used.

Parameters:
tx - The transaction identifier.
Returns:
The commit time assigned to that transaction.
Throws:
IllegalArgumentException - if the transaction is read-only.
IllegalStateException - if the transaction is not known to the data service.
InterruptedException - if interrupted.
ExecutionException - This will wrap a ValidationError if validation fails.
IOException - if there is an RMI problem.
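
A caller might separate a failed commit from an RMI problem along the following lines. This is a sketch only: SinglePhaseCommitter is a hypothetical stand-in that copies the method signature documented above so that the example compiles on its own, and the -1L return value is an arbitrary convention chosen for the example.

```java
import java.io.IOException;
import java.util.concurrent.ExecutionException;

/** Hypothetical stand-in for the singlePhaseCommit method documented above. */
interface SinglePhaseCommitter {
    long singlePhaseCommit(long tx)
            throws InterruptedException, ExecutionException, IOException;
}

final class SinglePhaseCommitExample {

    /**
     * Returns the commit time assigned to the transaction, or -1L if the
     * commit task failed. Interrupts and RMI problems are left to the caller.
     */
    static long tryCommit(SinglePhaseCommitter service, long tx)
            throws InterruptedException, IOException {
        try {
            return service.singlePhaseCommit(tx);
        } catch (ExecutionException e) {
            // Per the documentation, e.getCause() is a ValidationError when
            // validation fails. Other causes are possible, so a real caller
            // would inspect the cause before deciding how to react.
            return -1L;
        }
    }
}
```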

prepare

void prepare(long tx,
             long revisionTime)
             throws Throwable,
                    IOException
Request that the IDataService participate in a 3-phase commit.

When the IDataService is sent the prepare(long, long) message it executes a task which will handle commit processing for the transaction. That task MUST hold exclusive locks for the unisolated indices to which the transaction write sets will be applied. While holding those locks, the task must first validate the transaction's write set and then merge down the write set onto the corresponding unisolated indices using the specified revisionTime and checkpoint the indices in order to reduce all possible sources of latency. Note that each IDataService is able to independently prepare exactly those parts of the transaction's write set which are mapped onto index partitions hosted by a given IDataService.

Once validation is complete and all possible steps have been taken to reduce sources of latency (e.g., checkpointing the indices and pre-extending the store if necessary), the task notifies the ITransactionService that it has prepared using ITransactionService.prepared(long). The ITransactionService will wait until all tasks have prepared. If a task CANNOT prepare the transaction, then it MUST throw an exception out of its prepare(long, long) method.

Once all tasks have sent an ITransactionService.prepared(long) message to the ITransactionService, it will assign a commitTime to the transaction and permit those methods to return that commitTime to the IDataServices. Once the task receives the assigned commit time, it must obtain an exclusive write lock for the live journal (this is a higher requirement than just an exclusive lock on the necessary indices and will lock out all other write requests for the journal), register the checkpointed indices on the commit list and then request a commit of the journal using the specified commitTime. The task then notifies the transaction service that it has completed its commit using ITransactionService.committed(long) and awaits a response. If the ITransactionService indicates that the commit was not successful, the task rolls back the live journal to the prior commit point and throws an exception out of prepare(long, long).
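
The "prepared" rendezvous described above, in which every participant blocks until all have prepared and then receives the same commit time, can be modelled with a java.util.concurrent.CyclicBarrier. This is a simplified sketch and not the bigdata implementation: PreparedBarrier and its clock parameter are hypothetical, the barrier is in-process rather than remote, and failure handling is omitted.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.function.LongSupplier;

/** Hypothetical in-process model of the "prepared" barrier. */
final class PreparedBarrier {

    private final CyclicBarrier barrier;

    /** Assigned exactly once, by the barrier action, when all have arrived. */
    private volatile long commitTime;

    /**
     * @param participants number of data services taking part in the commit
     * @param clock        source of the commit time (an assumption of this sketch)
     */
    PreparedBarrier(int participants, LongSupplier clock) {
        // The barrier action runs once, after the last participant arrives
        // and before any participant is released.
        this.barrier = new CyclicBarrier(participants,
                () -> commitTime = clock.getAsLong());
    }

    /**
     * Called by a participant after its local prepare has succeeded. Blocks
     * until every participant has prepared, then returns the commit time.
     */
    long prepared() throws InterruptedException, BrokenBarrierException {
        barrier.await();
        return commitTime;
    }
}
```

Assigning the commit time inside the barrier action reflects the ordering in the text: no participant can observe a commit time until all of them have prepared.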

A sample flow for a successful distributed transaction commit is shown below. This example shows two IDataServices on which the client has written. (If the client only writes on a single data service then we use a single-phase commit protocol.)

 client -------+----txService----+--dataService1--+--dataService2--+...           
   | [1]                    
   | commit(tx) -------- + [2]
   |                     | prepare(tx,rev) +
   |                     | [3]             |
   |                     | prepare(tx,rev) ------------------+
   |                     |                 |                 |
   |                     | <--prepared(tx) +                 |
   |                     |                                   |
   |                     | <------------------- prepared(tx) +
   |                     |  
   |       "prepared" barrier [4]
   |                     |
   |                     | -- (commitTime) +  
   |                     | -------------------- (commitTime) +
   |                     | [5]             |                 |
   |                     | <--committed(tx)------------------+  
   |                     | [6]             |
   |                     | <--committed(tx)+
   |                     | 
   |       "committed" barrier [7]
   |                     | [8]
   |                     | ------ (success)+  
   |                     | [9]             |
   |                     | (void)----------+
   |                     |                 halt
   |                     | [10]             
   |                     | ------------------------ (success)+
   |                     | [11]                              |
   |                     | (void)----------------------------+
   |                [12] |                                   halt
   | (commitTime)--------+  
   |                   
 
There are many points in the protocol where commit processing can fail. However, there are two primary failure classifications that are of interest for error handling. Up until the first barrier is satisfied, there is no side-effect on the persistent state so error handling need only halt processing on the IDataServices, discard any local state associated with the transaction, and throw an exception out of prepare(long, long). Once the first barrier has been satisfied, persistent side-effects MAY occur. Error handling in this case must roll back the state of the live journal for each of the participating IDataServices. If error handling was performed in response to a local error, then the IDataService must throw that error out of prepare(long, long). However, if error handling was initiated because ITransactionService.committed(long) returned false then it should return normally (after rolling back the journal).
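
The failure classification in the paragraph above can be restated as a small decision function. The type and constant names below are hypothetical and exist only to make the three outcomes explicit; the sketch follows the rules as stated in that paragraph and does not perform any actual rollback.

```java
/** Hypothetical restatement of the error-handling rules; not bigdata code. */
final class PrepareErrorHandling {

    /** Where in the protocol the failure was detected. */
    enum Phase { BEFORE_PREPARED_BARRIER, AFTER_PREPARED_BARRIER }

    /** What the data service should do in response. */
    enum Action {
        /** No persistent side-effects yet: drop local state, throw from prepare. */
        DISCARD_AND_THROW,
        /** Local error after the barrier: roll back the journal, then throw. */
        ROLLBACK_AND_THROW,
        /** committed(tx) returned false: roll back the journal, return normally. */
        ROLLBACK_AND_RETURN
    }

    /**
     * @param phase      when the failure occurred
     * @param localError true if the failure originated on this data service,
     *                   false if the transaction service reported it
     */
    static Action onFailure(Phase phase, boolean localError) {
        if (phase == Phase.BEFORE_PREPARED_BARRIER) {
            return Action.DISCARD_AND_THROW;
        }
        return localError ? Action.ROLLBACK_AND_THROW : Action.ROLLBACK_AND_RETURN;
    }
}
```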

Parameters:
tx - The transaction identifier.
revisionTime - The timestamp that will be written into the ITuples when the write set of the validated transaction is merged down onto the unisolated indices.
Throws:
Throwable - if there is a problem during the execution of the commit protocol by the IDataService.
IOException - if there is an RMI problem.
TODO:
it may be possible to set the desired commit time on the abstract task (or a subclass specific to the distributed commit protocol) and then use that timestamp rather than requesting one from the ITransactionService in the group commit. This would allow us to use the normal commit processing.

If each distributed transaction gets its own commit time then we can not allow more than one distributed transaction into a given commit group. Therefore it seems that the ITransactionService would have to be able to assign the same commitTime to a set of distributed transactions that it knew were prepared together and would commit together. I can't quite see how that would work.

Failing that, we will need to exclude other tasks (or at least other distributed commit processing tasks) from the commit group.



Copyright © 2006-2009 SYSTAP, LLC. All Rights Reserved.