com.bigdata.bop.joinGraph
Class PartitionedJoinGroup

java.lang.Object
  extended by com.bigdata.bop.joinGraph.PartitionedJoinGroup

Deprecated. Superseded by StaticAnalysis_CanJoin, which is a port of this code to the AST mode.

public class PartitionedJoinGroup
extends Object

Class accepts a join group and partitions it into a join graph and a tail plan.

A join group consists of an ordered collection of IPredicates and an unordered collection of IConstraints. IPredicates representing non-optional joins are extracted into a JoinGraph along with any IConstraints whose variables are guaranteed to be bound by the implied joins.

The remainder of the IPredicates and IConstraints form a "tail plan". IConstraints in the tail plan are attached to the last IPredicate at which their variable(s) MIGHT have become bound.
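The partitioning described above can be illustrated with a minimal sketch. It assumes hypothetical stand-in types (`Pred`, `Constraint`) that reduce a predicate or constraint to a set of variable names; they are not the bigdata IPredicate or IConstraint interfaces, and the sketch ignores "runFirst" predicates and the head plan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-ins; these are NOT the bigdata IPredicate/IConstraint types.
final class PartitionSketch {

    /** A predicate reduced to a name, an "optional" flag, and its variables. */
    static final class Pred {
        final String name;
        final boolean optional;
        final Set<String> vars;

        Pred(String name, boolean optional, String... vars) {
            this.name = name;
            this.optional = optional;
            this.vars = new LinkedHashSet<>(Arrays.asList(vars));
        }
    }

    /** A constraint reduced to a name and the variables it reads. */
    static final class Constraint {
        final String name;
        final Set<String> vars;

        Constraint(String name, String... vars) {
            this.name = name;
            this.vars = new LinkedHashSet<>(Arrays.asList(vars));
        }
    }

    final List<Pred> joinGraph = new ArrayList<>();
    final List<Pred> tailPlan = new ArrayList<>();
    final List<Constraint> joinGraphConstraints = new ArrayList<>();
    final List<Constraint> tailPlanConstraints = new ArrayList<>();
    final Set<String> joinGraphVars = new LinkedHashSet<>();

    PartitionSketch(List<Pred> preds, List<Constraint> constraints) {
        // Required joins go to the join graph; optional joins keep their
        // given order in the tail plan.
        for (Pred p : preds) {
            if (p.optional) {
                tailPlan.add(p);
            } else {
                joinGraph.add(p);
                joinGraphVars.addAll(p.vars);
            }
        }
        // A constraint travels with the join graph only when every one of its
        // variables is guaranteed to be bound by the required joins.
        for (Constraint c : constraints) {
            if (joinGraphVars.containsAll(c.vars)) {
                joinGraphConstraints.add(c);
            } else {
                tailPlanConstraints.add(c);
            }
        }
    }
}
```

In this sketch a constraint over a variable that only an optional predicate can bind ends up with the tail plan, which mirrors the distinction the class description draws between "guaranteed to be bound" and "MIGHT have become bound".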

Author:
Bryan Thompson
TODO:
Things like LET can also bind variables. So can a subquery. Analysis of those will tell us whether the variable will definitely or conditionally become bound (I am assuming that a LET can conditionally leave a variable unbound). See Bind.
The runFirst flag on the expander (for free text search): this should be an annotation. This can be a [headPlan]. [There can be constraints which are evaluated against the head plan. They need to get attached to the joins generated for the head plan. MikeP writes: There is a free text search access path that replaces the actual access path for the predicate, which is meaningless in and of itself because the P is magical.]

Constructor Summary
PartitionedJoinGroup(IPredicate<?>[] sourcePreds, IConstraint[] constraints)
          Deprecated. Analyze a set of IPredicates representing "runFirst", optional joins, and non-optional joins which may be freely reordered together with a collection of IConstraints and partition them into a join graph and a tail plan.
 
Method Summary
static boolean canJoin(IPredicate<?> p1, IPredicate<?> p2)
          Deprecated. Return true iff two predicates can join on the basis of at least one variable which is shared directly by those predicates.
static boolean canJoinUsingConstraints(IPredicate<?>[] path, IPredicate<?> vertex, IConstraint[] constraints)
          Deprecated. Return true iff a predicate may be used to extend a join path on the basis of at least one variable which is shared either directly or via one or more constraints which may be attached to the predicate when it is added to the join path.
 IPredicate<?>[] getJoinGraph()
          Deprecated. The IPredicates in the join graph (required joins).
 IConstraint[] getJoinGraphConstraints()
          Deprecated. The IConstraints to be applied to the IPredicates in the join graph.
 IConstraint[] getJoinGraphConstraints(int[] pathIds, boolean pathIsComplete)
          Deprecated. Return the set of constraints which should be attached to the last join in the given join path.
static IConstraint[][] getJoinGraphConstraints(IPredicate<?>[] path, IConstraint[] joinGraphConstraints, IVariable<?>[] knownBoundVars, boolean pathIsComplete)
          Deprecated. Given a join path, return the set of constraints to be associated with each join in that join path.
 Set<IVariable<?>> getJoinGraphVars()
          Deprecated. The set of variables bound by the non-optional predicates (either the head plan or the join graph).
static PipelineOp getQuery(BOpIdFactory idFactory, boolean distinct, IVariable<?>[] selected, IPredicate<?>[] preds, IConstraint[] constraints)
          Deprecated. Generate a query plan from an ordered collection of predicates.
 IPredicate<?>[] getTailPlan()
          Deprecated. The IPredicates representing optional joins.
 IConstraint[] getTailPlanConstraints(int bopId)
          Deprecated. Return the set of IConstraints which should be evaluated when an identified predicate having SPARQL optional semantics is evaluated.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PartitionedJoinGroup

public PartitionedJoinGroup(IPredicate<?>[] sourcePreds,
                            IConstraint[] constraints)
Deprecated. 
Analyze a set of IPredicates representing "runFirst", optional joins, and non-optional joins which may be freely reordered together with a collection of IConstraints and partition them into a join graph and a tail plan. The resulting data structure can efficiently answer a variety of queries regarding joins, join paths, and constraints and can be used to formulate a complete query when combined with a desired join ordering.

Parameters:
knownBound - A set of variables which are known to be bound on entry.
sourcePreds - The predicates.
constraints - The constraints.
Throws:
IllegalArgumentException - if the source predicates array is null.
IllegalArgumentException - if the source predicates array is empty.
IllegalArgumentException - if any element of the source predicates array is null.
Method Detail

getJoinGraphVars

public Set<IVariable<?>> getJoinGraphVars()
Deprecated. 
The set of variables bound by the non-optional predicates (either the head plan or the join graph).


getJoinGraph

public IPredicate<?>[] getJoinGraph()
Deprecated. 
The IPredicates in the join graph (required joins).


getJoinGraphConstraints

public IConstraint[] getJoinGraphConstraints()
Deprecated. 
The IConstraints to be applied to the IPredicates in the join graph. Each IConstraint should be applied as soon as all of its variable(s) are known to be bound. The constraints are not attached to the IPredicates in the join graph because the evaluation order of those IPredicates is not yet known (it will be determined by a query optimizer when it decides on an evaluation order for those joins).


getJoinGraphConstraints

public IConstraint[] getJoinGraphConstraints(int[] pathIds,
                                             boolean pathIsComplete)
Deprecated. 
Return the set of constraints which should be attached to the last join in the given join path. All joins in the join path must be non-optional joins (that is, part of either the head plan or the join graph).

The rule followed by this method is that each constraint will be attached to the first non-optional join at which all of its variables are known to be bound. It is assumed that constraints are attached to each join in the join path by a consistent logic, e.g., as dictated by this method.

Parameters:
pathIds - An ordered array of predicate identifiers representing a specific sequence of non-optional joins (the join path).
pathIsComplete - true iff the path represents a complete join path. When true, any constraints which have not already been attached will be attached to the last predicate in the join path.
Returns:
The constraints which should be attached to the last join in the join path.
Throws:
IllegalArgumentException - if the join path is null.
IllegalArgumentException - if the join path is empty.
IllegalArgumentException - if any element of the join path is null.
IllegalArgumentException - if any predicate specified in the join path is not known to this class.
IllegalArgumentException - if any predicate specified in the join path is optional.
TODO:
Implement (or refactor) the logic to decide which variables need to be propagated and which can be dropped. This decision logic will need to be available to the runtime query optimizer.
This does not pay attention to the head plan. If there can be constraints on the head plan then either this should be modified such that it can decide where they attach or we need to have a method which does the same thing for the head plan.

getJoinGraphConstraints

public static IConstraint[][] getJoinGraphConstraints(IPredicate<?>[] path,
                                                      IConstraint[] joinGraphConstraints,
                                                      IVariable<?>[] knownBoundVars,
                                                      boolean pathIsComplete)
Deprecated. 
Given a join path, return the set of constraints to be associated with each join in that join path. Only those constraints whose variables are known to be bound will be attached.

Parameters:
path - The join path.
joinGraphConstraints - The constraints to be applied to the join path (optional).
knownBoundVars - Variables that are known to be bound as inputs to this join graph (parent queries).
pathIsComplete - true iff the path represents a complete join path. When true, any constraints which have not already been attached will be attached to the last predicate in the join path.
Returns:
The constraints to be paired with each element of the join path.
Throws:
IllegalArgumentException - if the join path is null.
IllegalArgumentException - if the join path is empty.
IllegalArgumentException - if any element of the join path is null.
IllegalArgumentException - if any element of the join graph constraints is null.
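
The attachment rule documented for getJoinGraphConstraints (each constraint goes to the first join at which all of its variables are known to be bound, with leftovers going to the last join when the path is complete) can be sketched as follows. This is an illustration under simplifying assumptions, not the class's implementation: predicates and constraints are modelled as plain sets of variable names, and `attach` and `vars` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; predicates and constraints are reduced to variable sets.
final class ConstraintAttachmentSketch {

    /** Convenience helper that builds a variable set. */
    static Set<String> vars(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    /**
     * Returns, for each join in the path, the indices of the constraints
     * attached to that join.
     */
    static List<List<Integer>> attach(List<Set<String>> path,
                                      List<Set<String>> constraints,
                                      Set<String> knownBound,
                                      boolean pathIsComplete) {
        if (path == null || path.isEmpty()) {
            throw new IllegalArgumentException("join path is null or empty");
        }
        Set<String> bound = new HashSet<>(knownBound);
        boolean[] attached = new boolean[constraints.size()];
        List<List<Integer>> result = new ArrayList<>();
        for (int i = 0; i < path.size(); i++) {
            // Variables of this join are bound once the join has run.
            bound.addAll(path.get(i));
            boolean lastJoin = (i == path.size() - 1);
            List<Integer> here = new ArrayList<>();
            for (int j = 0; j < constraints.size(); j++) {
                if (attached[j]) {
                    continue; // already attached to an earlier join
                }
                boolean allBound = bound.containsAll(constraints.get(j));
                // On a complete path, leftovers are forced onto the last join.
                if (allBound || (lastJoin && pathIsComplete)) {
                    attached[j] = true;
                    here.add(j);
                }
            }
            result.add(here);
        }
        return result;
    }
}
```

Because each constraint is marked once it has been attached, it appears on exactly one join, which is the "consistent logic" the instance method's description relies on.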

getTailPlan

public IPredicate<?>[] getTailPlan()
Deprecated. 
The IPredicates representing optional joins. Any IConstraints having variable(s) NOT bound by the required joins will already have been attached to the last IPredicate in the tail plan in which their variable(s) MIGHT have been bound.


getTailPlanConstraints

public IConstraint[] getTailPlanConstraints(int bopId)
Deprecated. 
Return the set of IConstraints which should be evaluated when an identified predicate having SPARQL optional semantics is evaluated. For constraints whose variables are not known to be bound when entering the tail plan, the constraint should be evaluated at the last predicate for which its variables MIGHT become bound.

Parameters:
bopId - The identifier for an IPredicate appearing in the tail plan.
Returns:
The set of constraints to be imposed by the join which evaluates that predicate. This will be an empty array if there are no constraints which can be imposed when that predicate is evaluated.
Throws:
IllegalArgumentException - if there is no such predicate in the tail plan.
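
One plausible reading of "the last predicate for which its variables MIGHT become bound" is sketched below. It is an interpretation, not the documented implementation: tail-plan predicates are modelled as sets of variable names, `lastPossibleBinder` is a hypothetical helper, and the `-1` return value is an assumption used here to mean "no tail-plan predicate is involved".

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; tail-plan predicates are reduced to variable sets.
final class TailPlanConstraintSketch {

    /** Convenience helper that builds a variable set. */
    static Set<String> vars(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    /**
     * Returns the index of the last tail-plan predicate that might bind a
     * variable of the constraint which the join graph does not already bind,
     * or -1 when no tail-plan predicate mentions such a variable.
     */
    static int lastPossibleBinder(List<Set<String>> tailPlan,
                                  Set<String> joinGraphVars,
                                  Set<String> constraintVars) {
        // Only variables NOT bound by the required joins are of interest.
        Set<String> open = new HashSet<>(constraintVars);
        open.removeAll(joinGraphVars);
        int last = -1;
        for (int i = 0; i < tailPlan.size(); i++) {
            // An optional join may or may not bind its variables, so every
            // predicate mentioning an open variable is a candidate.
            if (!Collections.disjoint(tailPlan.get(i), open)) {
                last = i;
            }
        }
        return last;
    }
}
```

Attaching at the last candidate rather than the first means the constraint is evaluated only after every optional join that could still bind one of its variables has had the chance to do so.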

canJoin

public static boolean canJoin(IPredicate<?> p1,
                              IPredicate<?> p2)
Deprecated. 
Return true iff two predicates can join on the basis of at least one variable which is shared directly by those predicates. Only the operands of the predicates are considered.

Note: This method will only identify joins where the predicates directly share at least one variable. However, joins are also possible when the predicates share variables via one or more constraint(s). Use canJoinUsingConstraints to identify such joins.

Note: Any two predicates may join regardless of the presence of shared variables. However, such joins will produce the full cross product of the binding sets selected by each predicate. As such, they should be run last and this method will not return true for such predicates.

Note: This method is more efficient than BOpUtility.getSharedVars(BOp, BOp) because it does not materialize the sets of shared variables. However, it only considers the operands of the IPredicates and is thus more restricted than BOpUtility.getSharedVars(BOp, BOp) as well.

Parameters:
p1 - A predicate.
p2 - Another predicate.
Returns:
true iff the predicates share at least one variable as an operand.
Throws:
IllegalArgumentException - if either reference is null.
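
The idea behind canJoin, testing for a directly shared variable without materializing the set of shared variables, can be sketched as below. The model is an assumption made for illustration: a predicate is a list of operand strings and an operand counts as a variable when it starts with `?`. The real method works on IPredicate operands.

```java
import java.util.List;

// Hypothetical sketch; operands are strings and "?name" denotes a variable.
final class CanJoinSketch {

    static boolean isVar(String operand) {
        return operand != null && operand.startsWith("?");
    }

    /**
     * Returns true iff the two operand lists share at least one variable.
     * The scan stops at the first match and never builds a set of shared
     * variables.
     */
    static boolean canJoin(List<String> p1, List<String> p2) {
        if (p1 == null || p2 == null) {
            throw new IllegalArgumentException("predicate is null");
        }
        for (String a : p1) {
            if (!isVar(a)) {
                continue; // constants never establish a join
            }
            for (String b : p2) {
                if (a.equals(b)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Two predicates that share only a constant are reported as not joinable here, which matches the note that cross-product joins are deliberately excluded.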

canJoinUsingConstraints

public static boolean canJoinUsingConstraints(IPredicate<?>[] path,
                                              IPredicate<?> vertex,
                                              IConstraint[] constraints)
Deprecated. 
Return true iff a predicate may be used to extend a join path on the basis of at least one variable which is shared either directly or via one or more constraints which may be attached to the predicate when it is added to the join path. The join path is used to decide which variables are known to be bound, which in turn decides which constraints may be run. Unlike the case when the variable is directly shared between the two predicates, a join involving a constraint requires us to know which variables are already bound so we can know when the constraint may be attached.

Note: Use canJoin(IPredicate, IPredicate) instead to identify joins based on a variable which is directly shared.

Note: Any two predicates may join regardless of the presence of shared variables. However, such joins will produce the full cross product of the binding sets selected by each predicate. As such, they should be run last and this method will not return true for such predicates.

Parameters:
path - A join path containing at least one predicate.
vertex - A predicate which is being considered as an extension of that join path.
constraints - A set of zero or more constraints (optional). Constraints are attached dynamically once the variables which they use are bound. Hence, a constraint will always share a variable with any predicate to which it is attached. If any constraints are attached to the given vertex and they share a variable which has already been bound by the join path, then the vertex may join with the join path even if it does not directly bind that variable.
Returns:
true iff the vertex can join with the join path via a shared variable.
Throws:
IllegalArgumentException - if the join path is null.
IllegalArgumentException - if the join path is empty.
IllegalArgumentException - if any element in the join path is null.
IllegalArgumentException - if the vertex is null.
IllegalArgumentException - if the vertex is already part of the join path.
IllegalArgumentException - if any element in the optional constraints array is null.
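
The following sketch shows one way to read the canJoinUsingConstraints description: the join path determines which variables are bound, and a constraint links the vertex to the path if it becomes runnable only once the vertex is added and it also uses a variable the path has already bound. This is an interpretation under simplifying assumptions (predicates and constraints as sets of variable names, hypothetical helper names), not the class's implementation, and it omits the check that the vertex is not already on the path.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; predicates and constraints are reduced to variable sets.
final class CanJoinUsingConstraintsSketch {

    /** Convenience helper that builds a variable set. */
    static Set<String> vars(String... names) {
        return new HashSet<>(Arrays.asList(names));
    }

    static boolean canJoinUsingConstraints(List<Set<String>> path,
                                           Set<String> vertex,
                                           List<Set<String>> constraints) {
        if (path == null || path.isEmpty() || vertex == null) {
            throw new IllegalArgumentException("path or vertex is invalid");
        }
        // Variables known to be bound by the joins already on the path.
        Set<String> bound = new HashSet<>();
        for (Set<String> p : path) {
            bound.addAll(p);
        }
        // A directly shared variable is sufficient.
        if (!Collections.disjoint(bound, vertex)) {
            return true;
        }
        // Variables bound once the vertex has been added to the path.
        Set<String> boundWithVertex = new HashSet<>(bound);
        boundWithVertex.addAll(vertex);
        for (Set<String> c : constraints) {
            boolean runnableWithVertex = boundWithVertex.containsAll(c);
            boolean needsVertex = !bound.containsAll(c);
            boolean touchesPath = !Collections.disjoint(c, bound);
            // The constraint attaches at the vertex and ties it to the path.
            if (runnableWithVertex && needsVertex && touchesPath) {
                return true;
            }
        }
        return false;
    }
}
```

For example, with a path binding `x`, a vertex binding `y`, and a constraint over `x` and `y`, the sketch reports a join even though the two predicates share no variable directly.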

getQuery

public static PipelineOp getQuery(BOpIdFactory idFactory,
                                  boolean distinct,
                                  IVariable<?>[] selected,
                                  IPredicate<?>[] preds,
                                  IConstraint[] constraints)
Deprecated. 
Generate a query plan from an ordered collection of predicates.

Parameters:
distinct - true iff only the distinct solutions are desired.
selected - The variable(s) to be projected out of the join graph.
preds - The join path which will be used to execute the join graph.
constraints - The constraints on the join graph.
Returns:
The query plan.
FIXME Select only those variables required by downstream processing or explicitly specified by the caller (in the case when this is a subquery, the caller has to declare which variables are selected and will be returned out of the subquery).
FIXME For scale-out, we need to either mark the join's evaluation context based on whether or not the access path is local or remote (and whether the index is key-range distributed or hash partitioned).
FIXME Add a method to generate a runnable query plan from the collection of predicates and constraints on the PartitionedJoinGroup together with an ordering over the join graph. This is a bit different for the join graph and the optionals in the tail plan. The join graph itself should either be a JoinGraph operator which gets evaluated at run time or reordered by whichever optimizer is selected for the query (query hints).
TODO:
The order of the IPredicates in the tail plan is currently unchanged from their given order (optional joins without constraints cannot reduce the selectivity of the query). However, it could be worthwhile to run optionals with constraints before those without constraints since the constraints can reduce the selectivity of the query. If we do this, then we need to reorder the optionals based on the partial order imposed by what variables they MIGHT bind (which are not bound by the join graph).
Multiple runFirst predicates can be evaluated in parallel unless they have shared variables. When there are no shared variables, construct a TEE pattern such that evaluation proceeds in parallel. When there are shared variables, the runFirst predicates must be ordered based on those shared variables (at which point, it is probably an error to flag them as runFirst).


Copyright © 2006-2012 SYSTAP, LLC. All Rights Reserved.