public interface IHashJoinUtility
Interface for hash index build and hash join operations.
This interface also supports DISTINCT SOLUTIONS filters. For this use case, the
caller uses the filterSolutions(ICloseableIterator, BOpStats, IBuffer)
method.
| Method Summary | |
|---|---|
| long | acceptSolutions(ICloseableIterator<IBindingSet[]> itr, BOpStats stats): Buffer solutions on a hash index. |
| long | filterSolutions(ICloseableIterator<IBindingSet[]> itr, BOpStats stats, IBuffer<IBindingSet> sink): Filter solutions, writing only the DISTINCT solutions onto the sink. |
| IVariable<?> | getAskVar(): The variable bound based on whether or not a solution survives an "EXISTS" graph pattern (optional). |
| IConstraint[] | getConstraints(): The join constraints (optional). |
| JoinTypeEnum | getJoinType(): Return the type safe enumeration indicating what kind of operation is to be performed. |
| IVariable<?>[] | getJoinVars(): The join variables. |
| long | getRightSolutionCount(): Return the #of solutions in the hash index. |
| IVariable<?>[] | getSelectVars(): The variables to be retained (optional, all variables are retained if not specified). |
| void | hashJoin(ICloseableIterator<IBindingSet> leftItr, IBuffer<IBindingSet> outputBuffer): Do a hash join between a stream of source solutions (left) and a hash index (right). |
| void | hashJoin2(ICloseableIterator<IBindingSet> leftItr, IBuffer<IBindingSet> outputBuffer, IConstraint[] constraints): Variant hash join method allows the caller to impose different constraints or additional constraints. |
| boolean | isEmpty(): Return true iff there are no solutions in the hash index. |
| void | mergeJoin(IHashJoinUtility[] others, IBuffer<IBindingSet> outputBuffer, IConstraint[] constraints, boolean optional): Perform an N-way merge join. |
| void | outputJoinSet(IBuffer<IBindingSet> out): Output the solutions which joined. |
| void | outputOptionals(IBuffer<IBindingSet> outputBuffer): Identify and output the optional solutions. |
| void | outputSolutions(IBuffer<IBindingSet> out): Output the solutions buffered in the hash index. |
| void | release(): Discard the hash index. |
| Method Detail |
|---|
JoinTypeEnum getJoinType()
IVariable<?> getAskVar()
See Also: HashJoinAnnotations.ASK_VAR
IVariable<?>[] getJoinVars()
See Also: HashJoinAnnotations.JOIN_VARS
IVariable<?>[] getSelectVars()
See Also: JoinAnnotations.SELECT
IConstraint[] getConstraints()
See Also: JoinAnnotations.CONSTRAINTS
boolean isEmpty()
Returns: true iff there are no solutions in the hash index.
long getRightSolutionCount()
void release()
long acceptSolutions(ICloseableIterator<IBindingSet[]> itr,
BOpStats stats)
When optional:=true, solutions which do not have a binding
for one or more of the join variables will be inserted into the hash
index anyway using hashCode:=1. This allows the solutions to
be discovered when we scan the hash index and the set of solutions which
did join to identify the optional solutions.
Parameters:
itr - The source from which the solutions will be drained.
stats - The statistics to be updated as the solutions are buffered on
the hash index.
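The treatment of solutions with unbound join variables can be illustrated with a minimal sketch. It does not use the library's types: AcceptSolutionsSketch, its Map-based solutions and the hash function are hypothetical stand-ins, and dropping such solutions when the join is not optional is an assumption, since the text above only describes the optional case.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stand-in for a hash index; not the library's implementation. */
class AcceptSolutionsSketch {

    /** Hash code assigned to solutions with an unbound join variable (hashCode:=1 above). */
    static final int UNBOUND_HASH = 1;

    /** Hash code -> bucket of buffered solutions. */
    final Map<Integer, List<Map<String, Object>>> index = new HashMap<>();

    final String[] joinVars;
    final boolean optional;

    AcceptSolutionsSketch(String[] joinVars, boolean optional) {
        this.joinVars = joinVars;
        this.optional = optional;
    }

    /** Hashes the join variable bindings; returns null if any join variable is unbound. */
    Integer hashOf(Map<String, Object> solution) {
        int h = 17;
        for (String v : joinVars) {
            Object value = solution.get(v);
            if (value == null) {
                return null;
            }
            h = 31 * h + value.hashCode();
        }
        return h;
    }

    /** Buffers the solutions on the index and returns how many were inserted. */
    long acceptSolutions(List<Map<String, Object>> solutions) {
        long inserted = 0;
        for (Map<String, Object> solution : solutions) {
            Integer h = hashOf(solution);
            if (h == null) {
                if (!optional) {
                    continue; // ASSUMPTION: dropped when the join is not optional
                }
                h = UNBOUND_HASH; // still discoverable by a later scan of the index
            }
            index.computeIfAbsent(h, k -> new ArrayList<>()).add(solution);
            inserted++;
        }
        return inserted;
    }
}
```

Because the unbound solutions share one well-known bucket, a later scan of the whole index still visits them, which is what allows them to be reported as optional solutions.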
long filterSolutions(ICloseableIterator<IBindingSet[]> itr,
BOpStats stats,
IBuffer<IBindingSet> sink)
Parameters:
itr - The source solutions.
stats - The stats to be updated.
sink - The sink.
void hashJoin(ICloseableIterator<IBindingSet> leftItr,
IBuffer<IBindingSet> outputBuffer)
Note: Some JoinTypeEnums have side-effects on the join state. For
these joins, once this method has been invoked for the final time, you must
then invoke either outputOptionals(IBuffer) (Optional or
NotExists) or outputJoinSet(IBuffer) (Exists).
Parameters:
leftItr - A stream of solutions to be joined against the hash index
(left).
outputBuffer - Where to write the solutions which join.
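The calling protocol in the note above can be sketched as follows: the join is invoked once per chunk of left solutions, each call records which right solutions joined, and the follow-up method is invoked once after the final call. This is a simplified illustration only. HashJoinProtocolSketch is hypothetical, it uses a single join variable and Map-based solutions, it assumes every solution binds that variable and that no other variable is shared, and it does not model how each JoinTypeEnum decides what to write to the output.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the hashJoin / outputJoinSet call order; not the library's code. */
class HashJoinProtocolSketch {

    final String joinVar; // ASSUMPTION: one join variable, bound in every solution

    /** Right solutions, bucketed by the value of the join variable. */
    final Map<Object, List<Map<String, Object>>> rightIndex = new HashMap<>();

    /** Join state: the right solutions which joined at least once (by identity). */
    final Set<Map<String, Object>> joinSet =
            Collections.newSetFromMap(new IdentityHashMap<>());

    HashJoinProtocolSketch(String joinVar, List<Map<String, Object>> rightSolutions) {
        this.joinVar = joinVar;
        for (Map<String, Object> r : rightSolutions) {
            rightIndex.computeIfAbsent(r.get(joinVar), k -> new ArrayList<>()).add(r);
        }
    }

    /** Joins one chunk of left solutions. May be called many times. */
    void hashJoin(List<Map<String, Object>> leftChunk, List<Map<String, Object>> out) {
        for (Map<String, Object> left : leftChunk) {
            List<Map<String, Object>> bucket = rightIndex.get(left.get(joinVar));
            if (bucket == null) {
                continue; // nothing in the index for this key
            }
            for (Map<String, Object> right : bucket) {
                Map<String, Object> merged = new HashMap<>(right);
                merged.putAll(left); // safe only because joinVar is the sole shared variable
                out.add(merged);
                joinSet.add(right); // side effect on the join state
            }
        }
    }

    /** Called once, after the final hashJoin call. */
    void outputJoinSet(List<Map<String, Object>> out) {
        out.addAll(joinSet);
    }
}
```

Calling outputJoinSet before the last chunk has been joined would report an incomplete join set, which is why the follow-up call has to wait for the final invocation.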
void hashJoin2(ICloseableIterator<IBindingSet> leftItr,
IBuffer<IBindingSet> outputBuffer,
IConstraint[] constraints)
Note: Some JoinTypeEnums have side-effects on the join state. For
these joins, once this method has been invoked for the final time, you must
then invoke either outputOptionals(IBuffer) (Optional or
NotExists) or outputJoinSet(IBuffer) (Exists).
Parameters:
leftItr - A stream of solutions to be joined against the hash index
(left).
outputBuffer - Where to write the solutions which join.
constraints - Constraints attached to this join (optional). Any constraints
specified here are combined with those specified in the
constructor.
void mergeJoin(IHashJoinUtility[] others,
IBuffer<IBindingSet> outputBuffer,
IConstraint[] constraints,
boolean optional)
The merge join takes a set of solution sets in the same order and having the same join variables. It examines the next solution in order for each solution set and compares them. For each solution set which reported a solution having the same join variables as that earliest solution, it outputs the cross product and advances the iterator on that solution set.
The iterators draining the source solution sets need to be synchronized such that we consider only solutions having the same hash code in each cycle of the MERGE JOIN. The synchronization step is different depending on whether or not the MERGE JOIN is OPTIONAL.
If the MERGE JOIN is REQUIRED, then we want to synchronize the source solution iterators on the next lowest key (aka hash code) which they all have in common.
If the MERGE JOIN is OPTIONAL, then we want to synchronize the source solution iterators on the next lowest key (aka hash code) which appears for any source iterator. Solutions will not be drawn from iterators not having that key in that pass.
Note that each hash code may be an alias for solutions having different values for their join variables. Such solutions will not join. However, only solutions having the same values for the hash code can join. Thus, by proceeding with synchronized iterators and operating only on solutions having the same hash code in each round, we will consider all solutions which COULD join with one another in each round.
Note: If the solutions are not in a stable and mutually consistent order
by hash code in the hash indices then the solutions in each hash index
MUST be SORTED before proceeding. (The HTree maintains solutions
in such an order but the JVM collections do not.)
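The two synchronization rules can be sketched in isolation. The example below is an illustration only: MergeJoinSyncSketch is hypothetical, each source is reduced to a sorted array of distinct hash codes, and the method returns just the sequence of hash codes visited, one per round. Joining the solutions within a round, and rejecting those whose join variable values differ despite an equal hash code, is left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of MERGE JOIN key synchronization; not the library's code. */
class MergeJoinSyncSketch {

    /**
     * @param keys     one sorted array of distinct hash codes per source
     * @param optional true to visit every key of any source, false to visit
     *                 only the keys which all sources have in common
     * @return the hash code considered in each round, in order
     */
    static List<Integer> syncKeys(int[][] keys, boolean optional) {
        int[] pos = new int[keys.length]; // cursor into each source
        List<Integer> rounds = new ArrayList<>();
        while (true) {
            boolean anyExhausted = false;
            boolean allExhausted = true;
            int min = Integer.MAX_VALUE;
            int max = Integer.MIN_VALUE;
            for (int i = 0; i < keys.length; i++) {
                if (pos[i] >= keys[i].length) {
                    anyExhausted = true;
                    continue;
                }
                allExhausted = false;
                min = Math.min(min, keys[i][pos[i]]);
                max = Math.max(max, keys[i][pos[i]]);
            }
            // REQUIRED stops as soon as one source runs out; OPTIONAL drains them all.
            if (allExhausted || (!optional && anyExhausted)) {
                break;
            }
            if (optional) {
                // Next lowest key of ANY source; only sources at that key take part.
                rounds.add(min);
                for (int i = 0; i < keys.length; i++) {
                    if (pos[i] < keys[i].length && keys[i][pos[i]] == min) {
                        pos[i]++;
                    }
                }
            } else if (min == max) {
                // All sources agree on the key: this round can join.
                rounds.add(min);
                for (int i = 0; i < keys.length; i++) {
                    pos[i]++;
                }
            } else {
                // Skip lagging sources forward to the largest current key.
                for (int i = 0; i < keys.length; i++) {
                    while (pos[i] < keys[i].length && keys[i][pos[i]] < max) {
                        pos[i]++;
                    }
                }
            }
        }
        return rounds;
    }
}
```

For sources with hash codes {1, 3, 5, 7}, {3, 4, 7} and {2, 3, 7, 9}, the REQUIRED case visits 3 and 7, while the OPTIONAL case visits 1, 2, 3, 4, 5, 7 and 9. Both walks rely on every source presenting its hash codes in the same sorted order, which is the reason for the sorting requirement in the note above.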
Parameters:
others - The other solution sets to be joined. All instances must be of
the same concrete type as this.
outputBuffer - Where to write the solutions.
constraints - The join constraints.
optional - true iff the join is optional.
void outputOptionals(IBuffer<IBindingSet> outputBuffer)
Optionals are identified using a joinSet containing each right solution which joined with at least one left solution. The total set of right solutions is then scanned once. For each right solution, we probe the joinSet. If the right solution did not join, then it is output now as an optional join.
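The scan described above amounts to a single pass with one set probe per right solution. A minimal sketch, using a hypothetical OutputOptionalsSketch helper and plain Java collections in place of the hash index and IBuffer:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the optional scan; not the library's code. */
class OutputOptionalsSketch {

    /**
     * @param rightSolutions every solution buffered in the hash index
     * @param joinSet        the right solutions which joined at least once
     * @return the right solutions which never joined, in scan order
     */
    static <S> List<S> optionals(List<S> rightSolutions, Set<S> joinSet) {
        List<S> out = new ArrayList<>();
        for (S right : rightSolutions) {
            if (!joinSet.contains(right)) { // probe the joinSet
                out.add(right);             // did not join: output as optional
            }
        }
        return out;
    }
}
```

The result is only correct once every left solution has been joined, since a right solution that has not joined yet might still join with a later one.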
Parameters:
outputBuffer - Where to write the optional solutions.
void outputSolutions(IBuffer<IBindingSet> out)
Parameters:
out - Where to write the solutions.
void outputJoinSet(IBuffer<IBindingSet> out)
Parameters:
out - Where to write the solutions.