com.bigdata.resources
Class OverflowManager

java.lang.Object
  extended by com.bigdata.resources.ResourceEvents
      extended by com.bigdata.resources.StoreManager
          extended by com.bigdata.resources.IndexManager
              extended by com.bigdata.resources.OverflowManager
All Implemented Interfaces:
IResourceManager, IServiceShutdown
Direct Known Subclasses:
ResourceManager

public abstract class OverflowManager
extends IndexManager

Class encapsulates logic for handling journal overflow events. Overflow is triggered automatically when the user data extent on the journal nears a configured threshold. Once the preconditions for overflow are satisfied, the WriteExecutorServices for the journal are paused and all running tasks on those services are allowed to complete and commit. Once no writers are running, the WriteExecutorService triggers synchronous overflow. Synchronous overflow is a low-latency process which creates a new journal to absorb future writes, re-defines the views for all index partitions found on the old journal to include the new journal as their first source, and initiates a background thread performing asynchronous overflow post-processing.

Asynchronous overflow post-processing is responsible for identifying index partitions overflow (resulting in a split into two or more index partitions), index partition underflow (resulting in the join of the under-capacity index partition with its rightSibling), index partition moves (the index partition is moved to a different DataService), and index partition builds (an IndexSegment is created from the current view in what is effectively a compacting merge). Overflow processing is suspended during asynchronous post-processing, but is automatically re-enabled once post-processing completes.

Version:
$Id: OverflowManager.java 4523 2011-05-18 18:14:45Z thompsonbry $
Author:
Bryan Thompson

Nested Class Summary
static interface OverflowManager.IIndexPartitionTaskCounters
          Performance counters for the index partition tasks.
static interface OverflowManager.IOverflowManagerCounters
          Performance counters for the OverflowManager.
static interface OverflowManager.Options
          Options understood by the OverflowManager.
static class OverflowManager.ResourceScores
          Helper class reports performance counters of interest for this service.
 
Nested classes/interfaces inherited from class com.bigdata.resources.IndexManager
IndexManager.IIndexManagerCounters, IndexManager.IndexSegmentStats
 
Nested classes/interfaces inherited from class com.bigdata.resources.StoreManager
StoreManager.IStoreManagerCounters, StoreManager.ManagedJournal
 
Field Summary
protected  int accelerateSplitThreshold
           
protected  AtomicBoolean asyncOverflowEnabled
          A flag used to disable the asynchronous overflow processing for some unit tests.
protected  int buildServiceCorePoolSize
          The #of threads which will execute index partition build operations.
 AtomicBoolean compactingMerge
          A flag that may be set to force the next asynchronous overflow to perform a compacting merge for all indices that are not simply copied over to the new journal (the use of this flag significantly raises the time required for asynchronous overflow processing as all shard views must be made compact and SHOULD NOT be used for deployed federations).
protected  boolean compactingMergeWithAfterAction
          FIXME This is a temporary flag used to (dis|en)able the logic for executing various index partition operations as after actions for a compacting merge.
protected  int copyIndexThreshold
           
 AtomicBoolean forceOverflow
          Flag may be set to force overflow processing during the next group commit.
protected  boolean joinsEnabled
           
protected static org.apache.log4j.Logger log
          Logger.
protected  long maximumBuildSegmentBytes
           
protected  int maximumJournalsPerView
          Deprecated. merges are now performed in priority order while time remains in a given asynchronous overflow cycle.
protected  double maximumMovePercentOfSplit
           
protected  int maximumMoves
          Deprecated. Moves are now decided on a case by case basis. An alternative parameter might be introduced in the future to restrict the rate at which a DS can shed shards by moving them to other nodes.
protected  int maximumMovesPerTarget
          Deprecated. Moves are now decided on a case by case basis. An alternative parameter might be introduced in the future to restrict the rate at which a DS can shed shards by moving them to other nodes.

Note: This is also used to disable moves by some of the unit tests so we need a way to replace that functionality before this can be taken out.

protected  int maximumOptionalMergesPerOverflow
          Deprecated. merges are now performed in priority order while time remains in a given asynchronous overflow cycle.
protected  int maximumSegmentsPerView
          Deprecated. merges are now performed in priority order while time remains in a given asynchronous overflow cycle.
protected  int mergeServiceCorePoolSize
          The #of threads which will execute index partition merge operations.
protected  int minimumActiveIndexPartitions
           
protected  double movePercentCpuTimeThreshold
           
 long nominalShardSize
          Index partitions are split when they approach this size on the disk.
protected  AtomicBoolean overflowAllowed
          A flag used to disable overflow of the live journal until asynchronous post-processing of the old journal has been completed.
protected  boolean overflowCancelledWhenJournalFull
           
protected  OverflowCounters overflowCounters
          The "live" overflow counters which are maintained by the service.
protected  int overflowTasksConcurrent
          Deprecated. by mergeServiceCorePoolSize and buildServiceCorePoolSize
protected  double overflowThreshold
           
protected  long overflowTimeout
          The timeout for asynchronous overflow processing.
protected  double percentOfJoinThreshold
          FIXME configuration option.
protected  double percentOfSplitThreshold
           
protected  boolean scatterSplitEnabled
           
protected  String serviceName
          The name of the service (iff available).
 double shardOverextensionLimit
          If an index partition refuses to split it will be disabled once its size on disk (for a compact view) is greater than this multiplier.
protected  double tailSplitThreshold
           
 
Fields inherited from class com.bigdata.resources.IndexManager
buildTasks, concurrentBuildTaskCount, concurrentMergeTaskCount, staleLocatorCache
 
Fields inherited from class com.bigdata.resources.StoreManager
accelerateOverflowThreshold, bytesDeleted, bytesUnderManagement, dataDir, indexCacheLock, journalBytesUnderManagement, journalDeleteCount, journalReopenCount, journalsDir, lastCommitTimePreserved, lastOverflowTime, liveJournalRef, maximumJournalSizeAtOverflow, purgeResourcesMillis, segmentBytesUnderManagement, segmentsDir, segmentStoreDeleteCount, segmentStoreReopenCount, storeCache, tmpDir
 
Constructor Summary
OverflowManager(Properties properties)
           
 
Method Summary
protected  OverflowMetadata doSynchronousOverflow()
          Synchronous overflow processing.
 long getAsynchronousOverflowCount()
          #of asynchronous overflows that have taken place.
protected  double getHostCounter(String path, double defaultValue)
          Return the value of a host counter.
 OverflowCounters getOverflowCounters()
          Return a copy of the OverflowCounters.
protected  double getServiceCounter(String path, double defaultValue)
          Return the value of a service counter.
 long getSynchronousOverflowCount()
          #of synchronous overflows that have taken place.
 boolean isOverflowAllowed()
          true unless an overflow event is currently being processed.
 boolean isOverflowEnabled()
          true if overflow processing is enabled and false if overflow processing was disabled as a configuration option or if a maximum overflow count was configured and has been satisfied, in which case the live journal will NOT overflow.
 Future<Object> overflow()
          Core method for overflow with post-processing.
 boolean shouldOverflow()
          An overflow condition is recognized when the journal is within some declared percentage of Options.MAXIMUM_EXTENT.
 void shutdown()
          The service will no longer accept new requests, but existing requests will be processed (sychronous).
 void shutdownNow()
          The service will no longer accept new requests and will make a best effort attempt to terminate all existing requests and return ASAP.
 
Methods inherited from class com.bigdata.resources.IndexManager
buildIndexSegment, disableWrites, enableWrites, getIndex, getIndexCacheCapacity, getIndexCacheSize, getIndexCounters, getIndexCounters, getIndexOnStore, getIndexPartitionGone, getIndexRetentionTime, getIndexSegmentCacheCapacity, getIndexSegmentCacheSize, getIndexSources, getIndexSources, getStaleLocatorCount, isDisabledWrites, listIndexPartitions, markAndGetDelta, setIndexPartitionGone
 
Methods inherited from class com.bigdata.resources.StoreManager
addResource, assertNotOpen, assertOpen, assertRunning, awaitRunning, deleteResource, deleteResources, getBytesUnderManagement, getCommitTimeStrictlyGreaterThan, getConcurrencyManager, getDataDir, getDataDirFreeSpace, getIndexSegmentFile, getIndexSegmentFile, getJournal, getJournalBytesUnderManagement, getLiveJournal, getManagedJournalCount, getManagedSegmentCount, getProperties, getReleaseTime, getResourceService, getResourcesForTimestamp, getSegmentBytesUnderManagement, getStoreCacheSize, getStoreCounters, getTempDirFreeSpace, getTmpDir, isOpen, isRunning, isStarting, isTransient, munge, newFileFilter, nextTimestamp, openStore, overrideJournalExtent, purgeOldResources, purgeOldResources, retentionSetAdd, retentionSetRemove, setConcurrencyManager, setReleaseTime
 
Methods inherited from class com.bigdata.resources.ResourceEvents
closeJournal, closeTx, closeUnisolatedBTree, deleteJournal, dropUnisolatedBTree, extendJournal, isolateIndex, openJournal, openTx, openUnisolatedBTree
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface com.bigdata.journal.IResourceManager
getCounters, getDataService, getDataServiceUUID, getFederation
 

Field Detail

log

protected static final org.apache.log4j.Logger log
Logger.


compactingMergeWithAfterAction

protected final boolean compactingMergeWithAfterAction
FIXME This is a temporary flag used to (dis|en)able the logic for executing various index partition operations as after actions for a compacting merge.

See Also:
Constant Field Values

copyIndexThreshold

protected final int copyIndexThreshold
See Also:
OverflowManager.Options.COPY_INDEX_THRESHOLD

accelerateSplitThreshold

protected final int accelerateSplitThreshold
See Also:
OverflowManager.Options.ACCELERATE_SPLIT_THRESHOLD

percentOfSplitThreshold

protected final double percentOfSplitThreshold
See Also:
OverflowManager.Options.PERCENT_OF_SPLIT_THRESHOLD

percentOfJoinThreshold

protected final double percentOfJoinThreshold
FIXME configuration option.

See Also:
Constant Field Values

tailSplitThreshold

protected final double tailSplitThreshold
See Also:
OverflowManager.Options.TAIL_SPLIT_THRESHOLD

scatterSplitEnabled

protected final boolean scatterSplitEnabled
See Also:
OverflowManager.Options.SCATTER_SPLIT_ENABLED

joinsEnabled

protected final boolean joinsEnabled
See Also:
OverflowManager.Options.JOINS_ENABLED

minimumActiveIndexPartitions

protected final int minimumActiveIndexPartitions
See Also:
OverflowManager.Options.MINIMUM_ACTIVE_INDEX_PARTITIONS

maximumMoves

protected final int maximumMoves
Deprecated. Moves are now decided on a case by case basis. An alternative parameter might be introduced in the future to restrict the rate at which a DS can shed shards by moving them to other nodes.
See Also:
OverflowManager.Options.MAXIMUM_MOVES

maximumMovesPerTarget

protected final int maximumMovesPerTarget
Deprecated. Moves are now decided on a case by case basis. An alternative parameter might be introduced in the future to restrict the rate at which a DS can shed shards by moving them to other nodes.

Note: This is also used to disable moves by some of the unit tests so we need a way to replace that functionality before this can be taken out.

See Also:
OverflowManager.Options.MAXIMUM_MOVES_PER_TARGET

maximumMovePercentOfSplit

protected final double maximumMovePercentOfSplit
See Also:
OverflowManager.Options.MAXIMUM_MOVE_PERCENT_OF_SPLIT

movePercentCpuTimeThreshold

protected final double movePercentCpuTimeThreshold
See Also:
OverflowManager.Options.MOVE_PERCENT_CPU_TIME_THRESHOLD

maximumOptionalMergesPerOverflow

protected final int maximumOptionalMergesPerOverflow
Deprecated. merges are now performed in priority order while time remains in a given asynchronous overflow cycle.
The maximum #of optional compacting merge operations that will be performed during a single overflow event.

See Also:
OverflowManager.Options.MAXIMUM_OPTIONAL_MERGES_PER_OVERFLOW

maximumJournalsPerView

protected final int maximumJournalsPerView
Deprecated. merges are now performed in priority order while time remains in a given asynchronous overflow cycle.
See Also:
OverflowManager.Options.MAXIMUM_JOURNALS_PER_VIEW

maximumSegmentsPerView

protected final int maximumSegmentsPerView
Deprecated. merges are now performed in priority order while time remains in a given asynchronous overflow cycle.
See Also:
OverflowManager.Options.MAXIMUM_SEGMENTS_PER_VIEW

maximumBuildSegmentBytes

protected final long maximumBuildSegmentBytes
See Also:
OverflowManager.Options.MAXIMUM_BUILD_SEGMENT_BYTES

buildServiceCorePoolSize

protected final int buildServiceCorePoolSize
The #of threads which will execute index partition build operations.

See Also:
OverflowManager.Options.BUILD_SERVICE_CORE_POOL_SIZE

mergeServiceCorePoolSize

protected final int mergeServiceCorePoolSize
The #of threads which will execute index partition merge operations.

See Also:
OverflowManager.Options.MERGE_SERVICE_CORE_POOL_SIZE

serviceName

protected final String serviceName
The name of the service (iff available). This is used to help label thread pools and the like.


overflowThreshold

protected final double overflowThreshold
See Also:
OverflowManager.Options.OVERFLOW_THRESHOLD

overflowAllowed

protected final AtomicBoolean overflowAllowed
A flag used to disable overflow of the live journal until asynchronous post-processing of the old journal has been completed.

See Also:
AsynchronousOverflowTask

asyncOverflowEnabled

protected final AtomicBoolean asyncOverflowEnabled
A flag used to disable the asynchronous overflow processing for some unit tests.


forceOverflow

public final AtomicBoolean forceOverflow
Flag may be set to force overflow processing during the next group commit. The flag is cleared by overflow().

See Also:
DataService.forceOverflow(boolean, boolean)

compactingMerge

public final AtomicBoolean compactingMerge
A flag that may be set to force the next asynchronous overflow to perform a compacting merge for all indices that are not simply copied over to the new journal (the use of this flag significantly raises the time required for asynchronous overflow processing as all shard views must be made compact and SHOULD NOT be used for deployed federations). The state of the flag is cleared each time asynchronous overflow processing begins.

See Also:
DataService.forceOverflow(boolean, boolean)

overflowCounters

protected final OverflowCounters overflowCounters
The "live" overflow counters which are maintained by the service.


overflowTimeout

protected final long overflowTimeout
The timeout for asynchronous overflow processing.

See Also:
OverflowManager.Options.OVERFLOW_TIMEOUT

overflowTasksConcurrent

protected final int overflowTasksConcurrent
Deprecated. by mergeServiceCorePoolSize and buildServiceCorePoolSize
See Also:
OverflowManager.Options.OVERFLOW_TASKS_CONCURRENT

overflowCancelledWhenJournalFull

protected final boolean overflowCancelledWhenJournalFull
See Also:
OverflowManager.Options.OVERFLOW_CANCELLED_WHEN_JOURNAL_FULL

nominalShardSize

public final long nominalShardSize
Index partitions are split when they approach this size on the disk.

See Also:
OverflowManager.Options.NOMINAL_SHARD_SIZE
TODO:
Encapsulate with split accelerator factor when this is the first index partition for some scale-out index.

shardOverextensionLimit

public final double shardOverextensionLimit
If an index partition refuses to split it will be disabled once its size on disk (for a compact view) is greater than this multiplier. The most common cause for this is a bad ISimpleSplitHandler implementation provided by the application when it registered the index. By disallowing further writes on the shard we prevent it from dragging down performance for the entire data service and push the problem back on the application. In order to remedy this issue on a pre-existing index you must fix the split handler, register the new split handler on the MDS and on each shard on the index, and then re-enable writes for the index.

See Also:
Constant Field Values
TODO:
configuration option?
Constructor Detail

OverflowManager

public OverflowManager(Properties properties)
Parameters:
properties -
Method Detail

getOverflowCounters

public OverflowCounters getOverflowCounters()
Return a copy of the OverflowCounters.


getSynchronousOverflowCount

public long getSynchronousOverflowCount()
#of synchronous overflows that have taken place. This counter is incremented each time the synchronous overflow operation.

See Also:
getOverflowCounters()

getAsynchronousOverflowCount

public long getAsynchronousOverflowCount()
#of asynchronous overflows that have taken place. This counter is incremented each time the entire overflow operation is complete, including any post-processing of the old journal.

See Also:
getOverflowCounters()

isOverflowEnabled

public boolean isOverflowEnabled()
true if overflow processing is enabled and false if overflow processing was disabled as a configuration option or if a maximum overflow count was configured and has been satisfied, in which case the live journal will NOT overflow.

See Also:
OverflowManager.Options.OVERFLOW_ENABLED, OverflowManager.Options.OVERFLOW_MAX_COUNT

isOverflowAllowed

public boolean isOverflowAllowed()
true unless an overflow event is currently being processed.


shutdown

public void shutdown()
Description copied from interface: IServiceShutdown
The service will no longer accept new requests, but existing requests will be processed (sychronous). This method should await the termination of pending requests, but no longer than the timeout specified by IServiceShutdown.Options.SHUTDOWN_TIMEOUT. Implementations SHOULD be synchronized. If the service is aleady shutdown, then this method should be a NOP.

Specified by:
shutdown in interface IServiceShutdown
Overrides:
shutdown in class StoreManager

shutdownNow

public void shutdownNow()
Description copied from interface: IServiceShutdown
The service will no longer accept new requests and will make a best effort attempt to terminate all existing requests and return ASAP. This method should terminate any asynchronous processing, release all resources and return immediately. Implementations SHOULD be synchronized. If the service is aleady shutdown, then this method should be a NOP.

Specified by:
shutdownNow in interface IServiceShutdown
Overrides:
shutdownNow in class StoreManager

shouldOverflow

public boolean shouldOverflow()
An overflow condition is recognized when the journal is within some declared percentage of Options.MAXIMUM_EXTENT. However, this method will return false if overflow has been disabled or if there is an asynchronous overflow operation in progress.

Returns:
true if overflow processing should occur.

overflow

public Future<Object> overflow()
Core method for overflow with post-processing.

Note: This method does not test preconditions based on the extent of the journal.

Note: The caller is responsible for ensuring that this method is invoked with an exclusive lock on the write service.

Preconditions:

  1. Exclusive lock on the WriteExecutorService
  2. isOverflowAllowed()

Post-conditions:

  1. Overflowed onto new journal
  2. PostProcessOldJournal task was submitted.
  3. isOverflowAllowed() was set false and will remain false until PostProcessOldJournal

Returns:
The Future for the task handling post-processing of the old journal.
TODO:
write unit test for an overflow edge case in which we attempt to perform an read-committed task on a pre-existing index immediately after an overflow() and verify that a commit record exists on the new journal and that the read-committed task can read from the fused view of the new (empty) index on the new journal and the old index on the old journal.

doSynchronousOverflow

protected OverflowMetadata doSynchronousOverflow()
Synchronous overflow processing.

This is invoked once all preconditions have been satisfied.

Index partitions that have fewer than some threshold #of index entries will be copied onto the new journal. Otherwise the view of the index will be re-defined to place writes on the new journal and read historical data from the old journal.

This uses StoreManager.purgeOldResources() to delete old resources from the local file system that are no longer required as determined by StoreManager.setReleaseTime(long) and #getEffectiveReleaseTime().

Note: This method does NOT start a AsynchronousOverflowTask.

Note: You MUST have an exclusive lock on the WriteExecutorService before you invoke this method!

Returns:
Metadata about the overflow operation including whether or not asynchronous should be performed.

getHostCounter

protected double getHostCounter(String path,
                                double defaultValue)
Return the value of a host counter.

Parameters:
path - The path (relative to the host root).
defaultValue - The default value to use if the counter was not found.
Returns:
The value if found and otherwise the defaultValue.

getServiceCounter

protected double getServiceCounter(String path,
                                   double defaultValue)
Return the value of a service counter.

Parameters:
path - The path (relative to the service root).
defaultValue - The default value to use if the counter was not found.
Returns:
The value if found and otherwise the defaultValue.


Copyright © 2006-2012 SYSTAP, LLC. All Rights Reserved.