com.bigdata.rdf.sparql.ast
Interface QueryHints


public interface QueryHints

Query hints are directives understood by the SPARQL end point. A query hint appears in the SPARQL query as a "virtual triple". A query hint is declared in a QueryHintScope, which specifies the parts of the SPARQL query to which it will be applied. A list of the common directives is declared by this interface. (Query hints declared elsewhere are generally for internal use only.) Note that not all query hints are permitted in all scopes.

Version:
$Id: QueryHints.java 6313 2012-05-09 15:31:31Z thompsonbry $
Author:
Bryan Thompson
See Also:
QueryHintScope, QueryHintRegistry

Field Summary
static String ACCESS_PATH_SAMPLE_LIMIT
          The #of samples to take when comparing the cost of a SCAN with an IN filter to as-bound evaluation for each graph in the data set (default 100).
static String ACCESS_PATH_SCAN_AND_FILTER
          For named and default graph access paths where access path cost estimation is disabled by setting the ACCESS_PATH_SAMPLE_LIMIT to ZERO (0), this query hint determines whether a SCAN + FILTER or a PARALLEL SUBQUERY (aka as-bound data set join) approach is used.
static String ANALYTIC
          When true, enables all query hints pertaining to analytic query patterns.
static String AT_ONCE
          Query hint indicating whether or not a JOIN (including SERVICE, SUB-SELECT, etc) should be run as an "at-once" operator.
static String CHUNK_SIZE
          The target chunk size (aka vector size) for the operator.
static String CUTOFF_LIMIT
          Used to mark a statement pattern with a cutoff limit for how many elements (maximum) should be read from its access path.
static int DEFAULT_ACCESS_PATH_SAMPLE_LIMIT
          Note: Set to ZERO to disable AP sampling for default and named graphs.
static boolean DEFAULT_ACCESS_PATH_SCAN_AND_FILTER
          Note: To ALWAYS use either SCAN + FILTER or PARALLEL subquery, set DEFAULT_ACCESS_PATH_SAMPLE_LIMIT to ZERO (0) and set this to the desired method for named graph and default graph evaluation.
static boolean DEFAULT_ANALYTIC
           
static boolean DEFAULT_HASH_JOIN
           
static boolean DEFAULT_MERGE_JOIN
           
static boolean DEFAULT_NATIVE_DISTINCT_SOLUTIONS
           
static boolean DEFAULT_NATIVE_DISTINCT_SPO
           
static long DEFAULT_NATIVE_DISTINCT_SPO_THRESHOLD
           
static boolean DEFAULT_NATIVE_HASH_JOINS
           
static double DEFAULT_OPTIMISTIC
           
static QueryOptimizerEnum DEFAULT_OPTIMIZER
           
static boolean DEFAULT_REIFICATION_DONE_RIGHT
           
static boolean DEFAULT_REMOTE_APS
           
static boolean DEFAULT_SOLUTION_SET_CACHE
           
static String HASH_JOIN
          Query hint to use a hash join against the access path for a given predicate.
static String MAX_PARALLEL
          The maximum parallelism for the operator within the query.
static String MERGE_JOIN
          When true, a merge-join pattern will be recognized if it appears in a join group.
static String NAMESPACE
          The namespace for the bigdata query hints.
static String NATIVE_DISTINCT_SOLUTIONS
          When true, will use the version of DISTINCT SOLUTIONS based on the HTree and the native (C process) heap.
static String NATIVE_DISTINCT_SPO
          When true and the range count of the default graph access path exceeds the NATIVE_DISTINCT_SPO_THRESHOLD, will use the version of DISTINCT SPO for a hash join against a DEFAULT GRAPH access path based on the HTree and the native (C process) heap.
static String NATIVE_DISTINCT_SPO_THRESHOLD
          The minimum range count for a default graph access path before the native DISTINCT SPO filter will be used.
static String NATIVE_HASH_JOINS
          When true, use hash index operations based on the HTree and backed by the native (C process) heap.
static String OPTIMISTIC
          Query hint sets the optimistic threshold for the static join order optimizer.
static String OPTIMIZER
          Specify the join order optimizer.
static String QUERYID
          The UUID to be assigned to the IRunningQuery (optional).
static String RANGE_SAFE
          Used to mark a predicate as "range safe" - that is, we can safely apply the range bop to constrain the predicate.
static String REIFICATION_DONE_RIGHT
          Option controls whether or not the proposed SPARQL extension for reification done right is enabled.
static String REMOTE_APS
          When true, force the use of REMOTE access paths in scale-out joins.
static String RUN_FIRST
          This query hint may be applied to any IJoinNode and marks a particular join to be run first among the joins in a particular group.
static String RUN_LAST
          This query hint may be applied to any IJoinNode and marks a particular join to be run last among the joins in a particular group.
static String RUN_ONCE
          Query hint indicating whether or not a Sub-Select should be transformed into a named subquery, lifting its evaluation out of the main body of the query and replacing the subquery with an INCLUDE.
static String SOLUTION_SET_CACHE
          Option controls whether or not the bigdata extension to SPARQL Update for named solution sets is enabled.
 

Field Detail

NAMESPACE

static final String NAMESPACE
The namespace for the bigdata query hints.

See Also:
Constant Field Values

OPTIMIZER

static final String OPTIMIZER
Specify the join order optimizer. For example, you can disable the query optimizer within some join group using
 hint:Group hint:optimizer "None".
 
Disabling the join order optimizer can be useful if you have a query for which the static optimizer is producing an inefficient join ordering. With the query optimizer disabled for that query, the joins will be run in the order given. This makes it possible for you to decide on the right join ordering for that query.
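
As a hedged illustration, the sketch below builds a SPARQL query string in which the join order optimizer is disabled for one join group, so the two joins run in the order written. Only the hint:Group hint:optimizer "None" directive comes from the text above. The class name, the example.org triple patterns, and the hint: prefix IRI are assumptions made for illustration; check the value of NAMESPACE under Constant Field Values before relying on the prefix.

```java
final class OptimizerHintExample {

    // Assumed prefix IRI for the query hints; verify against QueryHints.NAMESPACE.
    static final String HINT_PREFIX =
        "PREFIX hint: <http://www.bigdata.com/queryHints#>\n";

    /** Builds a query whose single join group runs its joins in the order written. */
    static String buildQuery() {
        return HINT_PREFIX
            + "SELECT ?person ?name WHERE {\n"
            // Disable the static join order optimizer for this group only.
            + "  hint:Group hint:optimizer \"None\" .\n"
            // Hypothetical patterns: the more selective join is written first.
            + "  ?person <http://example.org/worksFor> <http://example.org/acme> .\n"
            + "  ?person <http://example.org/name> ?name .\n"
            + "}";
    }

    public static void main(String[] args) {
        System.out.println(buildQuery());
    }
}
```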

See Also:
QueryOptimizerEnum, Constant Field Values

DEFAULT_OPTIMIZER

static final QueryOptimizerEnum DEFAULT_OPTIMIZER

OPTIMISTIC

static final String OPTIMISTIC
Query hint sets the optimistic threshold for the static join order optimizer.

See Also:
Constant Field Values

DEFAULT_OPTIMISTIC

static final double DEFAULT_OPTIMISTIC

ANALYTIC

static final String ANALYTIC
When true, enables all query hints pertaining to analytic query patterns. When false, those features are disabled.

Note: This query hint MUST be applied in the QueryHintScope.Query. Hash indices are often created by one operator and then consumed by another so the same kinds of hash indices MUST be used throughout the query.

 hint:Query hint:analytic "true".
 

See Also:
NATIVE_DISTINCT_SPO, NATIVE_DISTINCT_SOLUTIONS, NATIVE_HASH_JOINS, MERGE_JOIN, Constant Field Values

DEFAULT_ANALYTIC

static final boolean DEFAULT_ANALYTIC
See Also:
Constant Field Values

NATIVE_DISTINCT_SOLUTIONS

static final String NATIVE_DISTINCT_SOLUTIONS
When true, will use the version of DISTINCT SOLUTIONS based on the HTree and the native (C process) heap. When false, use the version based on a JVM collection class. The JVM version does not scale up as well, but it offers higher concurrency.

See Also:
Constant Field Values

DEFAULT_NATIVE_DISTINCT_SOLUTIONS

static final boolean DEFAULT_NATIVE_DISTINCT_SOLUTIONS
See Also:
Constant Field Values

NATIVE_DISTINCT_SPO

static final String NATIVE_DISTINCT_SPO
When true and the range count of the default graph access path exceeds the NATIVE_DISTINCT_SPO_THRESHOLD, will use the version of DISTINCT SPO for a hash join against a DEFAULT GRAPH access path based on the HTree and the native (C process) heap. When false, use the version based on a JVM collection class. The JVM version does not scale up as well.

See Also:
Constant Field Values

DEFAULT_NATIVE_DISTINCT_SPO

static final boolean DEFAULT_NATIVE_DISTINCT_SPO
See Also:
Constant Field Values

NATIVE_DISTINCT_SPO_THRESHOLD

static final String NATIVE_DISTINCT_SPO_THRESHOLD
The minimum range count for a default graph access path before the native DISTINCT SPO filter will be used.

See Also:
NATIVE_DISTINCT_SPO, Constant Field Values

DEFAULT_NATIVE_DISTINCT_SPO_THRESHOLD

static final long DEFAULT_NATIVE_DISTINCT_SPO_THRESHOLD
See Also:
Constant Field Values

NATIVE_HASH_JOINS

static final String NATIVE_HASH_JOINS
When true, use hash index operations based on the HTree and backed by the native (C process) heap. When false, use hash index operations based on the Java collection classes. The HTree is more scalable but has higher overhead for small cardinality hash joins.

Note: This query hint MUST be applied in the QueryHintScope.Query. Hash indices are often created by one operator and then consumed by another so the same kinds of hash indices MUST be used throughout the query.

See Also:
Constant Field Values

DEFAULT_NATIVE_HASH_JOINS

static final boolean DEFAULT_NATIVE_HASH_JOINS
See Also:
Constant Field Values

MERGE_JOIN

static final String MERGE_JOIN
When true, a merge-join pattern will be recognized if it appears in a join group. When false, this can still be selectively enabled using a query hint.

See Also:
Constant Field Values

DEFAULT_MERGE_JOIN

static final boolean DEFAULT_MERGE_JOIN
See Also:
Constant Field Values

REMOTE_APS

static final String REMOTE_APS
When true, force the use of REMOTE access paths in scale-out joins. This is intended as a tool when analyzing query patterns in scale-out. It should normally be false.

See Also:
Constant Field Values

DEFAULT_REMOTE_APS

static final boolean DEFAULT_REMOTE_APS
See Also:
https://sourceforge.net/apps/trac/bigdata/ticket/380#comment:4, Constant Field Values

ACCESS_PATH_SAMPLE_LIMIT

static final String ACCESS_PATH_SAMPLE_LIMIT
The #of samples to take when comparing the cost of a SCAN with an IN filter to as-bound evaluation for each graph in the data set (default 100). The samples are taken from the data set. Each sample is a graph (aka context) in the data set. The range counts and estimated cost to visit the AP for each of the sampled contexts are combined to estimate the total cost of visiting all of the contexts in the NG or DG access path.

When ZERO (0), no cost estimation will be performed and the named graph or default graph join will always use the approach specified by the boolean ACCESS_PATH_SCAN_AND_FILTER.
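
The decision described above can be sketched roughly as follows. This is not the library's implementation. The class, enum, and parameter names are hypothetical; the linear extrapolation from the sampled graphs to the whole data set is an assumption about how the per-sample costs are "combined"; and the mapping of true to SCAN + FILTER is inferred from the hint's name only.

```java
import java.util.List;

final class AccessPathChoiceSketch {

    enum Strategy { SCAN_AND_FILTER, PARALLEL_SUBQUERY }

    /**
     * @param sampleLimit   value of the sample limit hint (0 disables estimation)
     * @param scanAndFilter value of the boolean scan-and-filter hint
     * @param sampledCosts  estimated as-bound cost for each sampled graph
     * @param graphCount    total number of graphs in the data set
     * @param scanCost      estimated cost of one scan with an IN filter
     */
    static Strategy choose(int sampleLimit, boolean scanAndFilter,
                           List<Double> sampledCosts, int graphCount, double scanCost) {
        int n = Math.min(sampleLimit, sampledCosts.size());
        if (n == 0) {
            // No estimation: the boolean hint alone decides (assumed mapping).
            return scanAndFilter ? Strategy.SCAN_AND_FILTER : Strategy.PARALLEL_SUBQUERY;
        }
        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += sampledCosts.get(i);
        }
        // Assumption: extrapolate the mean sampled cost to every graph.
        double asBoundTotal = (sum / n) * graphCount;
        return asBoundTotal < scanCost ? Strategy.PARALLEL_SUBQUERY : Strategy.SCAN_AND_FILTER;
    }

    public static void main(String[] args) {
        System.out.println(choose(0, true, List.of(), 10, 5.0));
        System.out.println(choose(100, true, List.of(1.0, 3.0), 10, 50.0));
    }
}
```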

See Also:
Constant Field Values

DEFAULT_ACCESS_PATH_SAMPLE_LIMIT

static final int DEFAULT_ACCESS_PATH_SAMPLE_LIMIT
Note: Set to ZERO to disable AP sampling for default and named graphs.

See Also:
Constant Field Values

ACCESS_PATH_SCAN_AND_FILTER

static final String ACCESS_PATH_SCAN_AND_FILTER
For named and default graph access paths where access path cost estimation is disabled by setting the ACCESS_PATH_SAMPLE_LIMIT to ZERO (0), this query hint determines whether a SCAN + FILTER or a PARALLEL SUBQUERY (aka as-bound data set join) approach is used.

See Also:
Constant Field Values

DEFAULT_ACCESS_PATH_SCAN_AND_FILTER

static final boolean DEFAULT_ACCESS_PATH_SCAN_AND_FILTER
Note: To ALWAYS use either SCAN + FILTER or PARALLEL subquery, set DEFAULT_ACCESS_PATH_SAMPLE_LIMIT to ZERO (0) and set this to the desired method for named graph and default graph evaluation. Note that you MAY still override this behavior within a given scope using a query hint.

See Also:
Constant Field Values

QUERYID

static final String QUERYID
The UUID to be assigned to the IRunningQuery (optional). This query hint makes it possible for the application to assign the UUID under which the query will run. This can be used to locate the IRunningQuery using its UUID and gather metadata about the query during its evaluation. The IRunningQuery may be used to monitor the query or even cancel a query.

The UUID of each query MUST be distinct. When using this query hint the application assumes responsibility for applying UUID.randomUUID() to generate a unique UUID for the query. The application may then discover the IRunningQuery using QueryEngineFactory.getQueryController(com.bigdata.journal.IIndexManager) and QueryEngine.getQuery(UUID).

Note: The openrdf iteration interface has a close() method, but this cannot be invoked until hasNext() has run and the first solution has been materialized. For queries which use an "at-once" operator, such as ORDER BY, the query will run to completion before hasNext() returns. This means that it is effectively impossible to interrupt a running query which uses an ORDER BY clause from the SAIL. However, applications MAY use this query hint to discover the IRunningQuery interface and cancel the query.

 hint:Query hint:queryId "36cff615-aaea-418a-bb47-006699702e45"
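
A minimal sketch of the application side, assuming the hint is embedded in the query text as shown above. The class name, the withQueryId helper, and the hint: prefix IRI are hypothetical and used for illustration only; the UUID.randomUUID() call is the step the text asks the application to perform.

```java
import java.util.UUID;

final class QueryIdHintExample {

    // Assumed prefix IRI for the query hints; verify against QueryHints.NAMESPACE.
    static final String HINT_PREFIX =
        "PREFIX hint: <http://www.bigdata.com/queryHints#>\n";

    /** Wraps a group body in a SELECT that carries the caller's query UUID. */
    static String withQueryId(UUID queryId, String groupBody) {
        return HINT_PREFIX
            + "SELECT * WHERE {\n"
            + "  hint:Query hint:queryId \"" + queryId + "\" .\n"
            + groupBody + "\n"
            + "}";
    }

    public static void main(String[] args) {
        // The application is responsible for generating a distinct UUID per query.
        UUID queryId = UUID.randomUUID();
        String query = withQueryId(queryId, "  ?s ?p ?o .");
        System.out.println(query);
        // Keep queryId: it is the key for later locating the running query,
        // for example to monitor or cancel it.
    }
}
```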
 

See Also:
https://sourceforge.net/apps/trac/bigdata/ticket/283, Constant Field Values

RUN_FIRST

static final String RUN_FIRST
This query hint may be applied to any IJoinNode and marks a particular join to be run first among the joins in a particular group. Only one "run first" join is permitted in a given group. This query hint is not permitted on optional joins.

See Also:
Constant Field Values

RUN_LAST

static final String RUN_LAST
This query hint may be applied to any IJoinNode and marks a particular join to be run last among the joins in a particular group. Only one "run last" join is permitted in a given group.

See Also:
Constant Field Values

RUN_ONCE

static final String RUN_ONCE
Query hint indicating whether or not a Sub-Select should be transformed into a named subquery, lifting its evaluation out of the main body of the query and replacing the subquery with an INCLUDE. This is similar to 'at-once' evaluation, but creates a different query plan by lifting out a named subquery. It is also only supported for a Sub-Select. The AT_ONCE query hint can be applied to other things as well.

When true, the subquery will be lifted out. When false, the subquery will not be lifted unless other semantics require that it be lifted out regardless.

For example, the following may be used to lift out the sub-select in which it appears into a NamedSubqueryRoot. The lifted expression will be executed exactly once.

 hint:SubQuery hint:runOnce "true" .
 

See Also:
AT_ONCE, Constant Field Values

AT_ONCE

static final String AT_ONCE
Query hint indicating whether or not a JOIN (including SERVICE, SUB-SELECT, etc) should be run as an "at-once" operator. All solutions for an "at-once" operator are materialized before the operator is evaluated. It is then evaluated against those materialized solutions exactly once.

Note: "At-once" evaluation is a general property of the query engine. This query hint does not change the structure of the query plan, but simply serves as a directive to the query engine that it should buffer all source solutions before running the operator. This is more general purpose than the RUN_ONCE query hint.
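
The difference between chunk-wise and "at-once" evaluation can be illustrated with a toy model. The sketch below is not the query engine's code: the class and method names are hypothetical, and an operator is modeled simply as a Consumer over a list of solutions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class AtOnceSketch {

    /** Pipelined: the operator is invoked once per incoming chunk. */
    static <T> int runPipelined(List<List<T>> chunks, Consumer<List<T>> operator) {
        int invocations = 0;
        for (List<T> chunk : chunks) {
            operator.accept(chunk);
            invocations++;
        }
        return invocations;
    }

    /** At-once: buffer every source solution, then invoke the operator exactly once. */
    static <T> int runAtOnce(List<List<T>> chunks, Consumer<List<T>> operator) {
        List<T> buffer = new ArrayList<>();
        for (List<T> chunk : chunks) {
            buffer.addAll(chunk);
        }
        operator.accept(buffer);
        return 1;
    }

    public static void main(String[] args) {
        List<List<String>> chunks = List.of(List.of("a", "b"), List.of("c"));
        Consumer<List<String>> op = solutions -> System.out.println("operator saw " + solutions);
        System.out.println("pipelined invocations: " + runPipelined(chunks, op));
        System.out.println("at-once invocations: " + runAtOnce(chunks, op));
    }
}
```

In both cases the shape of the plan is the same; only the buffering in front of the operator differs, which is the distinction the note above draws against RUN_ONCE.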

TODO: "Blocked" evaluation. Blocked evaluation is similar to at-once evaluation but lacks the strong guarantee that the operator will run exactly once. For blocked evaluation, the solutions to be fed to the operator are buffered up to a memory limit. If that memory limit is reached, then the buffered solutions are vectored through the operator. If all solutions can be buffered within the memory limit then "at-once" and "blocked" evaluation amount to the same thing.

See Also:
Constant Field Values

CHUNK_SIZE

static final String CHUNK_SIZE
The target chunk size (aka vector size) for the operator.

Note: The vectored query engine will buffer multiple chunks for an operator before the producer(s) (the operator(s) feeding into the annotated operator) must block.

See Also:
BufferAnnotations.CHUNK_CAPACITY, Constant Field Values

MAX_PARALLEL

static final String MAX_PARALLEL
The maximum parallelism for the operator within the query.

See Also:
PipelineOp.Annotations#MAX_PARALLEL, Constant Field Values

HASH_JOIN

static final String HASH_JOIN
Query hint to use a hash join against the access path for a given predicate. Hash joins should be enabled once it is recognized that the #of as-bound probes of the predicate will approach or exceed the range count of the predicate.

Note: HashJoinAnnotations.JOIN_VARS MUST also be specified for the predicate. The join variable(s) are variables which are (a) bound by the predicate and (b) are known bound in the source solutions. The query planner has the necessary context to figure this out based on the structure of the query plan and the join evaluation order.
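
As a rough sketch of the two ideas above, the join variables can be modeled as a set intersection, and the enabling condition as a comparison of probe count against range count. The class and method names are hypothetical, variables are modeled as plain strings, and the >= test is a simplification of "approach or exceed"; the query planner's actual logic is not shown in this documentation.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class JoinVarsSketch {

    /** Variables that the predicate binds and that are already bound in the source solutions. */
    static Set<String> joinVars(Set<String> predicateVars, Set<String> knownBound) {
        Set<String> result = new LinkedHashSet<>(predicateVars);
        result.retainAll(knownBound); // keep only the variables present in both sets
        return result;
    }

    /** Simplified heuristic: prefer a hash join once probes reach the range count. */
    static boolean preferHashJoin(long asBoundProbes, long rangeCount) {
        return asBoundProbes >= rangeCount;
    }

    public static void main(String[] args) {
        Set<String> predicateVars = new LinkedHashSet<>(Set.of("s", "o"));
        Set<String> knownBound = new LinkedHashSet<>(Set.of("s", "x"));
        System.out.println(joinVars(predicateVars, knownBound));
        System.out.println(preferHashJoin(1000L, 500L));
    }
}
```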

See Also:
Constant Field Values

DEFAULT_HASH_JOIN

static final boolean DEFAULT_HASH_JOIN
See Also:
Constant Field Values

SOLUTION_SET_CACHE

static final String SOLUTION_SET_CACHE
Option controls whether or not the bigdata extension to SPARQL Update for named solution sets is enabled.

See Also:
SPARQL Update, Constant Field Values

DEFAULT_SOLUTION_SET_CACHE

static final boolean DEFAULT_SOLUTION_SET_CACHE
See Also:
Constant Field Values

REIFICATION_DONE_RIGHT

static final String REIFICATION_DONE_RIGHT
Option controls whether or not the proposed SPARQL extension for reification done right is enabled.

See Also:
Reification Done Right, Constant Field Values

DEFAULT_REIFICATION_DONE_RIGHT

static final boolean DEFAULT_REIFICATION_DONE_RIGHT
See Also:
Constant Field Values

RANGE_SAFE

static final String RANGE_SAFE
Used to mark a predicate as "range safe" - that is, we can safely apply the range bop to constrain the predicate. This can only be used currently when there is a single datatype for attribute values.

See Also:
Constant Field Values

CUTOFF_LIMIT

static final String CUTOFF_LIMIT
Used to mark a statement pattern with a cutoff limit for how many elements (maximum) should be read from its access path. This effectively limits the input into the join.

See Also:
Annotations#CUTOFF_LIMIT, Constant Field Values


Copyright © 2006-2012 SYSTAP, LLC. All Rights Reserved.