com.bigdata.relation
Interface AbstractResource.Options

All Known Subinterfaces:
AbstractTripleStore.Options, BigdataSail.Options, LocalTripleStore.Options, TempTripleStore.Options
Enclosing class:
AbstractResource<E>

public static interface AbstractResource.Options

Options for locatable resources.

Version:
$Id: AbstractResource.java 6041 2012-02-20 20:46:45Z thompsonbry $
Author:
Bryan Thompson
TODO:
Most of these options affect asynchronous iterators, access path behavior, and join behavior. These are general features for bigdata resources, but some of the code that supports them is still local to the RDF module. That can be fixed using an abstract base class for IJoinNexus and IJoinNexusFactory. Some of these defaults need to be re-examined; see the notes in the javadoc below.

Field Summary
static String CHUNK_CAPACITY
          Deprecated. by BOp annotations.
static String CHUNK_OF_CHUNKS_CAPACITY
          Deprecated. by BOp annotations.
static String CHUNK_TIMEOUT
          Deprecated. by BOp annotations.
static String DEFAULT_CHUNK_CAPACITY
          Deprecated. by BOp annotations.
static String DEFAULT_CHUNK_OF_CHUNKS_CAPACITY
          Deprecated. by BOp annotations.
static String DEFAULT_CHUNK_TIMEOUT
          Deprecated. by BOp annotations.
static String DEFAULT_FORCE_SERIAL_EXECUTION
          Deprecated. by BOp annotations.
static String DEFAULT_FULLY_BUFFERED_READ_THRESHOLD
          Deprecated. by BOp annotations.
static String DEFAULT_MAX_PARALLEL_SUBQUERIES
          Deprecated. by BOp annotations.
static String FORCE_SERIAL_EXECUTION
          Deprecated. by BOp annotations.
static String FULLY_BUFFERED_READ_THRESHOLD
          Deprecated. by BOp annotations.
static String MAX_PARALLEL_SUBQUERIES
          Deprecated. by BOp annotations.
 

Field Detail

CHUNK_OF_CHUNKS_CAPACITY

static final String CHUNK_OF_CHUNKS_CAPACITY
Deprecated. by BOp annotations.

Sets the maximum number of chunks from concurrent producers that can be buffered before an IBuffer containing chunks of ISolutions would block (default DEFAULT_CHUNK_OF_CHUNKS_CAPACITY). This is used to provision a BlockingQueue for BlockingBuffer. A value of ZERO (0) indicates that a SynchronousQueue should be used instead. The best value may be more than the number of concurrent producers if the producers are generating small chunks, e.g., because there are few solutions for a join subquery.
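As a rough illustration of the provisioning rule described above, the sketch below picks a queue implementation from the configured capacity. The names ChunkQueueSketch and newChunkQueue are hypothetical, and the use of ArrayBlockingQueue for the non-zero case is an assumption; this is not the BlockingBuffer implementation.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.SynchronousQueue;

final class ChunkQueueSketch {

    /**
     * Hypothetical helper: returns a queue for buffering "chunks of chunks".
     * ZERO (0) selects a SynchronousQueue (direct hand-off, no buffering);
     * any positive value selects a bounded queue of that capacity.
     */
    static <E> BlockingQueue<E> newChunkQueue(final int chunkOfChunksCapacity) {
        if (chunkOfChunksCapacity < 0)
            throw new IllegalArgumentException("capacity must be >= 0");
        if (chunkOfChunksCapacity == 0)
            return new SynchronousQueue<E>();
        // Assumption: a bounded array-backed queue stands in for the real choice.
        return new ArrayBlockingQueue<E>(chunkOfChunksCapacity);
    }
}
```

With a SynchronousQueue each producer blocks until a consumer takes its chunk, so a capacity of zero trades buffering for immediate back-pressure.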


DEFAULT_CHUNK_OF_CHUNKS_CAPACITY

static final String DEFAULT_CHUNK_OF_CHUNKS_CAPACITY
Deprecated. by BOp annotations.
Default for CHUNK_OF_CHUNKS_CAPACITY

See Also:
Constant Field Values

CHUNK_CAPACITY

static final String CHUNK_CAPACITY
Deprecated. by BOp annotations.

Sets the capacity of the IBuffers used to accumulate a chunk when evaluating rules, etc. (default DEFAULT_CHUNK_CAPACITY). Note that many processes use a BlockingBuffer to accumulate "chunks of chunks".

See Also:
CHUNK_OF_CHUNKS_CAPACITY

DEFAULT_CHUNK_CAPACITY

static final String DEFAULT_CHUNK_CAPACITY
Deprecated. by BOp annotations.
Default for CHUNK_CAPACITY

Note: This used to be 20k, but chunks of chunks works better than just a large chunk.

See Also:
Constant Field Values

CHUNK_TIMEOUT

static final String CHUNK_TIMEOUT
Deprecated. by BOp annotations.
The timeout in milliseconds that the BlockingBuffer will wait for another chunk to combine with the current chunk before returning the current chunk (default DEFAULT_CHUNK_TIMEOUT). This may be ZERO (0) to disable the chunk combiner.
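The chunk combiner behavior described above can be sketched roughly as follows. This is a simplified illustration, not the BlockingBuffer code: the names ChunkCombinerSketch and nextChunk are hypothetical, and the sketch assumes the timeout bounds the total time spent waiting for additional chunks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

final class ChunkCombinerSketch {

    /**
     * Hypothetical combiner: takes one chunk, then waits up to
     * chunkTimeoutMillis (in total) for further chunks to append to it.
     * ZERO (0) disables combining and returns the first chunk as-is.
     */
    static <E> List<E> nextChunk(final BlockingQueue<List<E>> queue,
            final long chunkTimeoutMillis) throws InterruptedException {
        // Blocks until the first chunk is available.
        final List<E> current = new ArrayList<E>(queue.take());
        if (chunkTimeoutMillis == 0)
            return current; // combiner disabled
        final long deadline = System.nanoTime()
                + TimeUnit.MILLISECONDS.toNanos(chunkTimeoutMillis);
        while (true) {
            final long remaining = deadline - System.nanoTime();
            if (remaining <= 0)
                break; // timeout budget used up
            final List<E> next = queue.poll(remaining, TimeUnit.NANOSECONDS);
            if (next == null)
                break; // no further chunk arrived in time
            current.addAll(next);
        }
        return current;
    }
}
```

A larger timeout yields fewer, larger chunks at the cost of added latency for the first results.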


DEFAULT_CHUNK_TIMEOUT

static final String DEFAULT_CHUNK_TIMEOUT
Deprecated. by BOp annotations.
The default for CHUNK_TIMEOUT.

See Also:
Constant Field Values
TODO:
this is probably much larger than we want. Try 10ms.

FULLY_BUFFERED_READ_THRESHOLD

static final String FULLY_BUFFERED_READ_THRESHOLD
Deprecated. by BOp annotations.
If the estimated rangeCount for an AccessPath.iterator() is less than or equal to this threshold, then a fully buffered (synchronous) iterator is used. Otherwise an asynchronous iterator is used, whose capacity is governed by CHUNK_OF_CHUNKS_CAPACITY.
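The decision described above amounts to a single comparison. The sketch below shows it in isolation; the type and method names (IteratorChoiceSketch, Mode, choose) are hypothetical and do not appear in the AccessPath API.

```java
final class IteratorChoiceSketch {

    /** Hypothetical labels for the two iterator strategies. */
    enum Mode { FULLY_BUFFERED, ASYNCHRONOUS }

    /**
     * Chooses the fully buffered (synchronous) iterator when the estimated
     * range count is less than or equal to the threshold, and the
     * asynchronous iterator otherwise.
     */
    static Mode choose(final long estimatedRangeCount,
            final long fullyBufferedReadThreshold) {
        return estimatedRangeCount <= fullyBufferedReadThreshold
                ? Mode.FULLY_BUFFERED
                : Mode.ASYNCHRONOUS;
    }
}
```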


DEFAULT_FULLY_BUFFERED_READ_THRESHOLD

static final String DEFAULT_FULLY_BUFFERED_READ_THRESHOLD
Deprecated. by BOp annotations.
Default for FULLY_BUFFERED_READ_THRESHOLD

See Also:
Constant Field Values
TODO:
figure out how good this value is.

FORCE_SERIAL_EXECUTION

static final String FORCE_SERIAL_EXECUTION
Deprecated. by BOp annotations.
When true ("true"), rule sets will be forced to execute sequentially even when they are not flagged as a sequential program.
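Because every option on this interface is a String constant, a boolean option such as this one is presumably read as a String-valued property and then parsed. The sketch below illustrates that pattern with java.util.Properties only; the helper name and the property key used in the usage note are hypothetical, and the actual key and default come from the constants documented here.

```java
import java.util.Properties;

final class BooleanOptionSketch {

    /**
     * Hypothetical helper: reads a String-valued property, falling back to
     * the supplied String default, and parses it as a boolean. Only the
     * value "true" (ignoring case) yields true.
     */
    static boolean getBoolean(final Properties properties, final String key,
            final String defaultValue) {
        return Boolean.parseBoolean(properties.getProperty(key, defaultValue));
    }
}
```

For example, after `properties.setProperty("example.forceSerialExecution", "true")` (a made-up key), `getBoolean(properties, "example.forceSerialExecution", "false")` evaluates to true.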

TODO:
The following discussion applies to the AbstractTripleStore and should be relocated.

The #CLOSURE_CLASS option defaults to FastClosure, which has very little possible parallelism (it is mostly a sequential program by nature). For that reason, FORCE_SERIAL_EXECUTION defaults to false since the overhead of parallel execution is more likely to lower the observed performance with such limited possible parallelism. However, when using FullClosure the benefits of parallelism MAY justify its overhead.

The following data are for LUBM datasets.

 U1  Fast Serial   : closure =  2250ms; 2765, 2499, 2530
 U1  Fast Parallel : closure =  2579ms; 2514, 2594
 U1  Full Serial   : closure = 10437ms.
 U1  Full Parallel : closure = 10843ms.
 
 U10 Fast Serial   : closure = 41203ms, 39171ms (38594, 35360 when running in caller's thread rather than on the executorService).
 U10 Fast Parallel : closure = 30722ms. 
 U10 Full Serial   : closure = 108110ms.
 U10 Full Parallel : closure = 248550ms.
 
Note that the only rules in the fast closure program that have potential parallelism are RuleFastClosure5 and RuleFastClosure6 and these rules are not being triggered by these datasets, so there is in fact NO potential parallelism (in the data) for these datasets.

It is possible that a machine with more cores would perform better under the "full" closure program with parallel rule execution (these data were collected on a laptop with 2 cores) since performance tends to be CPU bound for small data sets. However, the benefit of the "fast" closure program is so large that there is little reason to consider parallel rule execution for the "full" closure program.

Collect new timings for this option. The LUBM performance has basically doubled since these data were collected. Look further into ways in which overhead might be reduced for rule parallelism and also for when rule parallelism is not enabled.

Rename as parallel_rule_execution.


DEFAULT_FORCE_SERIAL_EXECUTION

static final String DEFAULT_FORCE_SERIAL_EXECUTION
Deprecated. by BOp annotations.
See Also:
Constant Field Values

MAX_PARALLEL_SUBQUERIES

static final String MAX_PARALLEL_SUBQUERIES
Deprecated. by BOp annotations.
The maximum number of subqueries for the first join dimension that will be issued in parallel. Use ZERO (0) to avoid submitting tasks to the ExecutorService entirely and ONE (1) to submit a single task at a time to the ExecutorService.
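One way to picture the three cases described above (zero, one, and more than one) is the sketch below. It is an illustration under stated assumptions rather than the join implementation: the names ParallelSubquerySketch and runAll are hypothetical, and submitting the subqueries in fixed-size batches is an assumed strategy for bounding parallelism.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

final class ParallelSubquerySketch {

    /**
     * Hypothetical runner: evaluates the subqueries and returns their
     * results in submission order, issuing at most maxParallelSubqueries
     * at a time.
     */
    static <T> List<T> runAll(final List<Callable<T>> subqueries,
            final int maxParallelSubqueries, final ExecutorService service)
            throws Exception {
        if (maxParallelSubqueries < 0)
            throw new IllegalArgumentException("maxParallelSubqueries must be >= 0");
        final List<T> results = new ArrayList<T>();
        if (maxParallelSubqueries == 0) {
            // ZERO: run in the caller's thread; the ExecutorService is never used.
            for (Callable<T> subquery : subqueries)
                results.add(subquery.call());
            return results;
        }
        // Assumed strategy: submit batches of at most maxParallelSubqueries tasks.
        // With ONE (1), a single task at a time is handed to the service.
        for (int i = 0; i < subqueries.size(); i += maxParallelSubqueries) {
            final int end = Math.min(i + maxParallelSubqueries, subqueries.size());
            // invokeAll blocks until every task in the batch has completed.
            final List<Future<T>> futures = service.invokeAll(subqueries.subList(i, end));
            for (Future<T> future : futures)
                results.add(future.get());
        }
        return results;
    }
}
```

Batching keeps the sketch short but waits for the slowest task in each batch; a semaphore-guarded submit loop would keep the service busier at the cost of more code.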


DEFAULT_MAX_PARALLEL_SUBQUERIES

static final String DEFAULT_MAX_PARALLEL_SUBQUERIES
Deprecated. by BOp annotations.
See Also:
Constant Field Values


Copyright © 2006-2012 SYSTAP, LLC. All Rights Reserved.