public static interface AbstractResource.Options
Options for locatable resources.
IJoinNexus and IJoinNexusFactory. Some of these defaults need to be re-examined; see the notes in the Javadoc below.

| Field Summary | |
|---|---|
| static String | CHUNK_CAPACITY: Sets the capacity of the IBuffers used to accumulate a chunk when evaluating rules, etc. (default DEFAULT_CHUNK_CAPACITY). |
| static String | CHUNK_OF_CHUNKS_CAPACITY: Set the maximum #of chunks from concurrent producers that can be buffered before an IBuffer containing chunks of ISolutions would block (default DEFAULT_CHUNK_OF_CHUNKS_CAPACITY). |
| static String | CHUNK_TIMEOUT: The timeout in milliseconds that the BlockingBuffer will wait for another chunk to combine with the current chunk before returning the current chunk (default DEFAULT_CHUNK_TIMEOUT). |
| static String | DEFAULT_CHUNK_CAPACITY: Default for CHUNK_CAPACITY. |
| static String | DEFAULT_CHUNK_OF_CHUNKS_CAPACITY: Default for CHUNK_OF_CHUNKS_CAPACITY. |
| static String | DEFAULT_CHUNK_TIMEOUT: The default for CHUNK_TIMEOUT. |
| static String | DEFAULT_FORCE_SERIAL_EXECUTION |
| static String | DEFAULT_FULLY_BUFFERED_READ_THRESHOLD: Default for FULLY_BUFFERED_READ_THRESHOLD. |
| static String | DEFAULT_MAX_PARALLEL_SUBQUERIES |
| static String | FORCE_SERIAL_EXECUTION: When true ("true"), rule sets will be forced to execute sequentially even when they are not flagged as a sequential program. |
| static String | FULLY_BUFFERED_READ_THRESHOLD: If the estimated rangeCount for an AbstractAccessPath.iterator() is less than or equal to this threshold, then use a fully buffered (synchronous) iterator. |
| static String | MAX_PARALLEL_SUBQUERIES: The maximum #of subqueries for the first join dimension that will be issued in parallel. |
| static String | NESTED_SUBQUERY: Boolean option that controls the JOIN evaluation strategy. |

| Field Detail |
|---|
static final String CHUNK_OF_CHUNKS_CAPACITY
Set the maximum #of chunks from concurrent producers that can be
buffered before an IBuffer containing chunks of
ISolutions would block (default
DEFAULT_CHUNK_OF_CHUNKS_CAPACITY). This is used to
provision a BlockingQueue for BlockingBuffer. A
value of ZERO(0) indicates that a SynchronousQueue should be
used instead. The best value may be more than the #of concurrent
producers if the producers are generating small chunks, e.g., because
there are few solutions for a join subquery.
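The queue provisioning described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class `ChunkQueueSketch` and the helper `newQueue` are hypothetical names, and the choice of `ArrayBlockingQueue` for the bounded case is an assumption (the source only says a BlockingQueue is provisioned).

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.SynchronousQueue;

// Hypothetical helper, not part of the documented API.
class ChunkQueueSketch {

    /**
     * Returns a queue sized by the chunk-of-chunks capacity.
     * ZERO (0) selects a SynchronousQueue (direct hand-off, no buffering);
     * any positive value selects a bounded queue of that capacity.
     * ArrayBlockingQueue is an assumption made for this sketch.
     */
    static <E> BlockingQueue<E> newQueue(int chunkOfChunksCapacity) {
        if (chunkOfChunksCapacity < 0) {
            throw new IllegalArgumentException("capacity must be >= 0");
        }
        if (chunkOfChunksCapacity == 0) {
            return new SynchronousQueue<E>();
        }
        return new ArrayBlockingQueue<E>(chunkOfChunksCapacity);
    }

    public static void main(String[] args) {
        BlockingQueue<int[]> buffered = newQueue(2);
        // Two chunks fit without blocking; a third offer is rejected.
        System.out.println(buffered.offer(new int[] {1}));
        System.out.println(buffered.offer(new int[] {2}));
        System.out.println(buffered.offer(new int[] {3}));
    }
}
```

With a positive capacity, producers only block once that many chunks are waiting, which is why a value above the number of producers can help when chunks are small.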
static final String DEFAULT_CHUNK_OF_CHUNKS_CAPACITY
Default for CHUNK_OF_CHUNKS_CAPACITY.
static final String CHUNK_CAPACITY
Sets the capacity of the IBuffers used to accumulate a chunk
when evaluating rules, etc. (default DEFAULT_CHUNK_CAPACITY). Note
that many processes use a BlockingBuffer to accumulate
"chunks of chunks".
See also: CHUNK_OF_CHUNKS_CAPACITY
static final String DEFAULT_CHUNK_CAPACITY
Default for CHUNK_CAPACITY.
Note: This used to be 20k, but chunks of chunks works better than just a large chunk.
static final String CHUNK_TIMEOUT
The timeout in milliseconds that the BlockingBuffer will wait
for another chunk to combine with the current chunk before returning
the current chunk (default DEFAULT_CHUNK_TIMEOUT). This may
be ZERO (0) to disable the chunk combiner.
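The chunk combiner behavior can be illustrated with a small sketch. It is written against plain `java.util.concurrent` queues rather than BlockingBuffer itself, so the class `ChunkCombinerSketch`, the method `nextChunk`, and the `chunkCapacity` cut-off are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a chunk combiner; not the BlockingBuffer code.
class ChunkCombinerSketch {

    /**
     * Takes one chunk, then waits up to chunkTimeoutMillis for further
     * chunks to merge into it. A timeout of ZERO (0) disables combining.
     * The result may exceed chunkCapacity by part of the last merged chunk.
     */
    static <E> List<E> nextChunk(BlockingQueue<List<E>> queue,
                                 long chunkTimeoutMillis,
                                 int chunkCapacity) {
        List<E> combined = new ArrayList<>();
        try {
            combined.addAll(queue.take()); // block for the first chunk
            if (chunkTimeoutMillis <= 0) {
                return combined; // combiner disabled
            }
            long deadline = System.nanoTime()
                    + TimeUnit.MILLISECONDS.toNanos(chunkTimeoutMillis);
            while (combined.size() < chunkCapacity) {
                long remaining = deadline - System.nanoTime();
                if (remaining <= 0) {
                    break; // timeout expired
                }
                List<E> next = queue.poll(remaining, TimeUnit.NANOSECONDS);
                if (next == null) {
                    break; // nothing arrived in time
                }
                combined.addAll(next);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
        }
        return combined;
    }
}
```

A longer timeout trades latency for larger chunks; a zero timeout returns each chunk as soon as it is available.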
static final String DEFAULT_CHUNK_TIMEOUT
The default for CHUNK_TIMEOUT.
static final String FULLY_BUFFERED_READ_THRESHOLD
If the estimated rangeCount for an AbstractAccessPath.iterator() is less than or equal to this threshold, then use
a fully buffered (synchronous) iterator. Otherwise use an
asynchronous iterator whose capacity is governed by
CHUNK_OF_CHUNKS_CAPACITY.
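The threshold test reduces to a single comparison, sketched below. The class `ReadStrategySketch`, its enum, and the method `choose` are hypothetical names introduced for this example; they do not appear in the documented API.

```java
// Hypothetical sketch of the iterator selection rule.
class ReadStrategySketch {

    enum Strategy { FULLY_BUFFERED, ASYNCHRONOUS }

    /**
     * Chooses the fully buffered (synchronous) iterator when the estimated
     * range count is less than or equal to the threshold, and the
     * asynchronous iterator otherwise.
     */
    static Strategy choose(long estimatedRangeCount, long fullyBufferedReadThreshold) {
        return estimatedRangeCount <= fullyBufferedReadThreshold
                ? Strategy.FULLY_BUFFERED
                : Strategy.ASYNCHRONOUS;
    }
}
```

Note that the comparison is inclusive: a range count exactly equal to the threshold still selects the fully buffered iterator.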
static final String DEFAULT_FULLY_BUFFERED_READ_THRESHOLD
Default for FULLY_BUFFERED_READ_THRESHOLD.
static final String FORCE_SERIAL_EXECUTION
When true ("true"),
rule sets will be forced to execute sequentially even when they are
not flagged as a sequential program.
AbstractTripleStore. and should be relocated.
The #CLOSURE_CLASS option defaults to
FastClosure, which has very little possible
parallelism (it is mostly a sequential program by nature). For
that reason, FORCE_SERIAL_EXECUTION defaults to
false since the overhead of parallel execution
is more likely to lower the observed performance with such
limited possible parallelism. However, when using
FullClosure the benefits of parallelism MAY justify its
overhead.
The following data are for LUBM datasets.
U1 Fast Serial : closure = 2250ms; 2765, 2499, 2530
U1 Fast Parallel : closure = 2579ms; 2514, 2594
U1 Full Serial : closure = 10437ms.
U1 Full Parallel : closure = 10843ms.
U10 Fast Serial : closure = 41203ms, 39171ms (38594, 35360 when running in caller's thread rather than on the executorService).
U10 Fast Parallel : closure = 30722ms.
U10 Full Serial : closure = 108110ms.
U10 Full Parallel : closure = 248550ms.

Note that the only rules in the fast closure program that have potential parallelism are
RuleFastClosure5 and
RuleFastClosure6 and these rules are not being triggered by
these datasets, so there is in fact NO potential parallelism (in the
data) for these datasets.
It is possible that a machine with more cores would perform better under the "full" closure program with parallel rule execution (these data were collected on a laptop with 2 cores) since performance tends to be CPU bound for small data sets. However, the benefit of the "fast" closure program is so large that there is little reason to consider parallel rule execution for the "full" closure program.
Collect new timings for this option. The LUBM performance has basically doubled since these data were collected. Look further into ways in which overhead might be reduced for rule parallelism and also for when rule parallelism is not enabled.
Rename as parallel_rule_execution.
static final String DEFAULT_FORCE_SERIAL_EXECUTION
static final String MAX_PARALLEL_SUBQUERIES
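The maximum #of subqueries for the first join dimension that will be issued in parallel. Use ZERO (0) to avoid the
ExecutorService entirely and ONE (1) to submit a single task
at a time to the ExecutorService.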
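One way to picture this limit is the sketch below. It assumes that a value of ZERO (0) means the subqueries run in the caller's thread with no ExecutorService, and that a positive value bounds the number of tasks running at once. The class `SubqueryRunnerSketch` and the method `runAll` are hypothetical, and a fixed thread pool stands in for whatever service the join tasks actually use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch; not the NestedSubqueryWithJoinThreadsTask code.
class SubqueryRunnerSketch {

    /**
     * Runs all subqueries and returns their results in submission order.
     * maxParallel == 0: run in the caller's thread (assumed meaning).
     * maxParallel >= 1: at most maxParallel subqueries run concurrently.
     */
    static <T> List<T> runAll(List<Callable<T>> subqueries, int maxParallel) {
        if (maxParallel < 0) {
            throw new IllegalArgumentException("maxParallel must be >= 0");
        }
        List<T> results = new ArrayList<>();
        try {
            if (maxParallel == 0) {
                for (Callable<T> subquery : subqueries) {
                    results.add(subquery.call());
                }
                return results;
            }
            ExecutorService service = Executors.newFixedThreadPool(maxParallel);
            try {
                // invokeAll blocks until every task has completed.
                for (Future<T> future : service.invokeAll(subqueries)) {
                    results.add(future.get());
                }
            } finally {
                service.shutdown();
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

With `maxParallel` set to ONE (1) the pool has a single worker, so the tasks still pass through an ExecutorService but execute one at a time.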
NestedSubqueryWithJoinThreadsTask behaves as stated,
but may be refactored to allow this parallelism per join
dimension. The JoinMasterTask interprets this as a
per-join dimension parallelism (the parallelism limit is
currently imposed by a per JoinTask
ExecutorService, which must be explicitly enabled in
the code).
static final String DEFAULT_MAX_PARALLEL_SUBQUERIES
static final String NESTED_SUBQUERY
Boolean option that controls the JOIN evaluation strategy. When
true, NestedSubqueryWithJoinThreadsTask is
used to compute joins. When false,
JoinMasterTask is used instead (aka pipeline joins).
Note: The default depends on the deployment mode. Nested subquery
joins are somewhat faster for local data (temporary stores, journals,
and a federation that does not support scale-out). However, pipeline
joins are MUCH faster for scale-out so they are used by default
whenever IBigdataFederation.isScaleOut() reports
true.
Note: Cold query performance for complex high volume queries appears to be better for the pipeline join, so it may make sense to use the pipeline join even for local data.
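The deployment-dependent default can be sketched as a property lookup with a computed fallback. Everything named here is an assumption for illustration: the class `JoinStrategySketch`, the method `useNestedSubquery`, and in particular the property key, which is a placeholder and not the actual value of NESTED_SUBQUERY.

```java
import java.util.Properties;

// Hypothetical sketch of resolving the join strategy option.
class JoinStrategySketch {

    // Placeholder key; the real property name is not given in this page.
    static final String NESTED_SUBQUERY_KEY = "example.nestedSubquery";

    /**
     * Returns true when nested subquery joins should be used.
     * An explicit property value wins; otherwise nested subquery joins
     * are the default for local data and pipeline joins for scale-out.
     */
    static boolean useNestedSubquery(Properties properties, boolean isScaleOut) {
        String value = properties.getProperty(NESTED_SUBQUERY_KEY);
        if (value != null) {
            return Boolean.parseBoolean(value);
        }
        return !isScaleOut;
    }

    public static void main(String[] args) {
        Properties properties = new Properties();
        System.out.println(useNestedSubquery(properties, false));
        System.out.println(useNestedSubquery(properties, true));
        // Force pipeline joins even for local data.
        properties.setProperty(NESTED_SUBQUERY_KEY, "false");
        System.out.println(useNestedSubquery(properties, false));
    }
}
```

Setting the option explicitly to "false" on a local store corresponds to the case raised in the note above, where pipeline joins may be preferable for cold, complex, high volume queries.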