com.bigdata.relation.rule.eval.pipeline
Class JoinStats

java.lang.Object
  extended by com.bigdata.relation.rule.eval.pipeline.JoinStats
All Implemented Interfaces:
Serializable

public class JoinStats
extends Object
implements Serializable

Statistics about processing for a single join dimension as reported by a single JoinTask. Each JoinTask handles a single index partition, so the JoinStats for those index partitions need to be aggregated by the JoinMasterTask.
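The aggregation described above can be sketched as follows. This is only an illustration under stated assumptions, not the JoinMasterTask implementation: `PartitionStatsSketch` and `JoinStatsAggregationSketch` are hypothetical stand-ins that mirror a few of the documented fields. Counters are summed, fanIn and fanOut keep the maximum observed value, and partitionCount is the number of distinct partition identifiers reported.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for JoinStats; only a few documented fields are mirrored.
class PartitionStatsSketch {
    final int partitionId; // -1 when aggregated across index partitions
    final int orderIndex;
    int fanIn, fanOut, partitionCount;
    long bindingSetsIn, bindingSetsOut, elementCount;

    PartitionStatsSketch(int partitionId, int orderIndex) {
        this.partitionId = partitionId;
        this.orderIndex = orderIndex;
    }
}

class JoinStatsAggregationSketch {
    // Folds the per-partition reports for one join dimension into a single total.
    static PartitionStatsSketch aggregate(int orderIndex, List<PartitionStatsSketch> reports) {
        PartitionStatsSketch total = new PartitionStatsSketch(-1, orderIndex);
        Set<Integer> partitions = new HashSet<>();
        for (PartitionStatsSketch r : reports) {
            if (r.orderIndex != orderIndex) {
                continue; // report belongs to a different join dimension
            }
            total.fanIn = Math.max(total.fanIn, r.fanIn);    // maximum observed
            total.fanOut = Math.max(total.fanOut, r.fanOut); // maximum observed
            total.bindingSetsIn += r.bindingSetsIn;
            total.bindingSetsOut += r.bindingSetsOut;
            total.elementCount += r.elementCount;
            partitions.add(r.partitionId); // distinct index partitions only
        }
        total.partitionCount = partitions.size();
        return total;
    }

    public static void main(String[] args) {
        PartitionStatsSketch p3 = new PartitionStatsSketch(3, 1);
        p3.fanIn = 2;
        p3.bindingSetsIn = 10;
        PartitionStatsSketch p7 = new PartitionStatsSketch(7, 1);
        p7.fanIn = 5;
        p7.bindingSetsIn = 4;
        PartitionStatsSketch total = aggregate(1, List.of(p3, p7));
        // prints "fanIn=5, bindingSetsIn=14, partitionCount=2"
        System.out.println("fanIn=" + total.fanIn + ", bindingSetsIn=" + total.bindingSetsIn
                + ", partitionCount=" + total.partitionCount);
    }
}
```

A join task that is closed and re-opened for the same index partition would report twice under this sketch; its counters are summed, but the partition is counted once.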

Version:
$Id$
Author:
Bryan Thompson
See Also:
Serialized Form

Field Summary
 long accessPathCount
          The #of IAccessPaths read.
 long accessPathDups
          The #of duplicate IAccessPaths that were eliminated by a JoinTask.
 long bindingSetChunksIn
          The #of binding set chunks read from all source JoinTasks.
 long bindingSetChunksOut
          The #of IBindingSet chunks written onto the next join dimension (aka the #of solution chunks written iff this is the last join dimension in the evaluation order).
 long bindingSetsIn
          The #of binding sets read from all source JoinTasks.
 long bindingSetsOut
          The #of IBindingSets written onto the next join dimension (aka the #of solutions written iff this is the last join dimension).
 long chunkCount
          #of chunks visited over all access paths.
 long elementCount
          #of elements visited over all chunks.
 int fanIn
          The maximum observed fan in for this join dimension (maximum #of sources observed writing on any join task for this join dimension).
 int fanOut
          The maximum observed fan out for this join dimension (maximum #of sinks on which any join task is writing for this join dimension).
 AtomicLong mutationCount
          The mutationCount is the #of solutions output by the JoinTask(s) for the last join dimension of a mutation operation that were not already present in the target relation.
 int orderIndex
          The index in the evaluation order whose statistics are reported here.
 int partitionCount
          The #of index partitions for which join tasks were created for this join dimension.
 int partitionId
          The index partition for which these statistics were collected or -1 if the statistics are aggregated across index partitions.
 long startTime
          The timestamp associated with the start of execution for the join dimension.
 
Constructor Summary
JoinStats(int orderIndex)
          Ctor variant used by the JoinMasterTask to aggregate statistics across the index partitions for a given join dimension.
JoinStats(int partitionId, int orderIndex)
          Ctor variant used by a JoinTask to self-report.
 
Method Summary
 String toString()
           
static StringBuilder toString(IRule rule, IRuleState ruleState, JoinStats[] a)
          Formats the array of JoinStats into a CSV table view.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

startTime

public final long startTime
The timestamp associated with the start of execution for the join dimension. This is not aggregated. The timestamp is assigned when the JoinStats object is created. That corresponds either to the start of the distributed JoinMasterTask execution (aggregated level) or to the start of some specific JoinTask (detail level).


partitionId

public final int partitionId
The index partition for which these statistics were collected or -1 if the statistics are aggregated across index partitions.


orderIndex

public final int orderIndex
The index in the evaluation order whose statistics are reported here.


fanIn

public int fanIn
The maximum observed fan in for this join dimension (maximum #of sources observed writing on any join task for this join dimension). This is not necessarily the "actual" fan in for the join dimension, for two reasons: join tasks may be closed and new join tasks re-opened for the same query, join dimension, and index partition; and each join task for the same join dimension could, in principle, have a different fan in based on the actual binding sets propagated.


fanOut

public int fanOut
The maximum observed fan out for this join dimension (maximum #of sinks on which any join task is writing for this join dimension). This is not necessarily the "actual" fan out for the join dimension, for two reasons: join tasks may be closed and new join tasks re-opened for the same query, join dimension, and index partition; and each join task for the same join dimension could, in principle, have a different fan out based on the actual binding sets propagated.


partitionCount

public int partitionCount
The #of index partitions for which join tasks were created for this join dimension. This is computed by explicitly tracking the distinct index partition identifiers reported for the join dimension. This is the "real" fan out for the prior join dimension.


bindingSetChunksIn

public long bindingSetChunksIn
The #of binding set chunks read from all source JoinTasks.


bindingSetsIn

public long bindingSetsIn
The #of binding sets read from all source JoinTasks.


accessPathCount

public long accessPathCount
The #of IAccessPaths read. This will differ from bindingSetsIn iff the same IBindingSet is read from more than one source and the JoinTask is able to recognize the duplication and collapse it by removing the duplicate(s).


accessPathDups

public long accessPathDups
The #of duplicate IAccessPaths that were eliminated by a JoinTask. Duplicate IAccessPaths arise when the source JoinTask(s) generate the same bindings on the IPredicate for a join dimension. Duplicates are detected by a JoinTask when it generates a chunk of distinct JoinTask.AccessPathTasks from a chunk of IBindingSets read from its source JoinTask(s).

Note: While the IPredicates for those tasks may have the same bindings, the source IBindingSets typically (always?) carry bindings that are not represented in the bound IPredicate, so they are combined under a single JoinTask.AccessPathTask. This reduces redundant reads on an IAccessPath while producing exactly the same output IBindingSets that would have been produced if the duplicate IAccessPaths had not been identified.
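The duplicate elimination described above can be sketched as a grouping step. This is a minimal illustration, not the JoinTask code: binding sets are modeled as plain maps, the predicate is modeled as a list of variable names, and `AccessPathDedupSketch` and its method are hypothetical. Binding sets that bind the predicate's variables identically share one group (one access path read), and each group keeps every source binding set so that the output is unchanged.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class AccessPathDedupSketch {
    // Groups a chunk of binding sets by the values they bind on the predicate's variables.
    // Each map key stands for one distinct access path; its value lists the binding sets
    // that will all be joined against the result of that single read.
    static Map<List<String>, List<Map<String, String>>> groupByBoundPredicate(
            List<String> predicateVars, List<Map<String, String>> chunk) {
        Map<List<String>, List<Map<String, String>>> tasks = new LinkedHashMap<>();
        for (Map<String, String> bindingSet : chunk) {
            List<String> key = new ArrayList<>();
            for (String var : predicateVars) {
                key.add(bindingSet.get(var)); // null when the variable is unbound
            }
            tasks.computeIfAbsent(key, k -> new ArrayList<>()).add(bindingSet);
        }
        return tasks;
    }

    public static void main(String[] args) {
        List<Map<String, String>> chunk = List.of(
                Map.of("x", "a", "z", "1"),
                Map.of("x", "a", "z", "2"), // same binding on x, different z
                Map.of("x", "b", "z", "3"));
        Map<List<String>, List<Map<String, String>>> tasks =
                groupByBoundPredicate(List.of("x", "y"), chunk);
        long accessPathCount = tasks.size();               // 2 distinct access paths
        long accessPathDups = chunk.size() - tasks.size(); // 1 duplicate eliminated
        System.out.println(accessPathCount + " access paths, " + accessPathDups + " duplicates");
    }
}
```

In this sketch the two counters sum to the number of binding sets in the chunk; whether that identity holds for the real counters is an assumption.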


chunkCount

public long chunkCount
#of chunks visited over all access paths.


elementCount

public long elementCount
#of elements visited over all chunks.


bindingSetsOut

public long bindingSetsOut
The #of IBindingSets written onto the next join dimension (aka the #of solutions written iff this is the last join dimension).

Note: An IBindingSet can be written onto more than one index partition for the next join dimension, so one generated IBindingSet MAY result in N >= 1 "binding sets out". This occurs when the IAccessPath required to read on the next IPredicate in the evaluation order spans more than one index partition.
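The counting rule in the note above can be illustrated with a small sketch. Everything here is an assumption made for illustration: index partitions are modeled as contiguous integer key ranges identified by their inclusive lower bounds, and `FanOutSketch.partitionsSpanned` is a hypothetical helper rather than part of this API. A binding set whose access path covers a key range contributes one "binding set out" per partition that the range touches.

```java
import java.util.TreeMap;

class FanOutSketch {
    // lowerBounds maps each partition's inclusive lower key bound to its partition id.
    // Returns the #of partitions touched by the inclusive key range [fromKey, toKey].
    static int partitionsSpanned(TreeMap<Integer, Integer> lowerBounds, int fromKey, int toKey) {
        Integer first = lowerBounds.floorKey(fromKey); // partition containing fromKey
        if (first == null || toKey < fromKey) {
            throw new IllegalArgumentException("key range is not covered by the partitions");
        }
        return lowerBounds.subMap(first, true, toKey, true).size();
    }

    public static void main(String[] args) {
        TreeMap<Integer, Integer> lowerBounds = new TreeMap<>();
        lowerBounds.put(0, 0);   // partition 0 holds keys [0, 100)
        lowerBounds.put(100, 1); // partition 1 holds keys [100, 200)
        lowerBounds.put(200, 2); // partition 2 holds keys from 200 upward

        long bindingSetsOut = 0;
        bindingSetsOut += partitionsSpanned(lowerBounds, 10, 20);  // one partition
        bindingSetsOut += partitionsSpanned(lowerBounds, 50, 150); // two partitions
        // Two generated binding sets, three "binding sets out".
        System.out.println("bindingSetsOut=" + bindingSetsOut);
    }
}
```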


bindingSetChunksOut

public long bindingSetChunksOut
The #of IBindingSet chunks written onto the next join dimension (aka the #of solution chunks written iff this is the last join dimension in the evaluation order).


mutationCount

public AtomicLong mutationCount
The mutationCount is the #of solutions output by the JoinTask(s) for the last join dimension of a mutation operation that were not already present in the target relation. This value is always zero (0L) for a query.

Note: The mutationCount MUST be obtained from IBuffer.flush() for the buffer on which the JoinTask(s) for the last join dimension write their solutions. For mutation, this buffer is obligated to report the #of elements whose state was changed in the target relation. Failure to obey this contract can result in non-termination of fixed point closure operations, since closure only terminates once a round reports no changes.
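To see why the contract matters for termination, consider the following sketch of a fixed point loop. It is a toy model under stated assumptions, not the bigdata closure code: `SolutionBuffer` is a hypothetical buffer whose `flush()` returns only the number of elements that actually changed the target relation, and the rule computes a transitive closure over edges encoded as strings such as "a->b". The loop stops when a round reports a mutation count of zero. If `flush()` instead returned the number of solutions written, the count would never reach zero once any solution is re-derived, and the loop would not terminate.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

class FixedPointSketch {
    // Hypothetical buffer: flush() reports only elements that changed the relation.
    static final class SolutionBuffer {
        private final Set<String> relation;
        private final List<String> pending = new ArrayList<>();

        SolutionBuffer(Set<String> relation) {
            this.relation = relation;
        }

        void add(String element) {
            pending.add(element);
        }

        long flush() {
            long changed = 0;
            for (String e : pending) {
                if (relation.add(e)) {
                    changed++; // counted only if not already present
                }
            }
            pending.clear();
            return changed;
        }
    }

    // One round of the rule (a->b), (b->c) => (a->c); returns the mutation count.
    static long round(Set<String> relation) {
        SolutionBuffer buffer = new SolutionBuffer(relation);
        for (String left : relation) {
            for (String right : relation) {
                String[] p = left.split("->");
                String[] q = right.split("->");
                if (p[1].equals(q[0])) {
                    buffer.add(p[0] + "->" + q[1]);
                }
            }
        }
        return buffer.flush(); // the relation is modified only after iteration ends
    }

    // Runs rounds until no round changes the relation; returns the #of rounds.
    static int closure(Set<String> relation) {
        AtomicLong mutationCount = new AtomicLong();
        int rounds = 0;
        do {
            mutationCount.set(round(relation));
            rounds++;
        } while (mutationCount.get() > 0);
        return rounds;
    }

    public static void main(String[] args) {
        Set<String> relation = new HashSet<>(List.of("a->b", "b->c", "c->d"));
        int rounds = closure(relation);
        System.out.println(rounds + " rounds, " + relation.size() + " edges");
    }
}
```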

See Also:
RuleStats.mutationCount
Constructor Detail

JoinStats

public JoinStats(int orderIndex)
Ctor variant used by the JoinMasterTask to aggregate statistics across the index partitions for a given join dimension.

Parameters:
orderIndex - The index in the evaluation order.

JoinStats

public JoinStats(int partitionId,
                 int orderIndex)
Ctor variant used by a JoinTask to self-report.

Parameters:
partitionId - The index partition identifier.
orderIndex - The index in the evaluation order.
Method Detail

toString

public String toString()
Overrides:
toString in class Object

toString

public static StringBuilder toString(IRule rule,
                                     IRuleState ruleState,
                                     JoinStats[] a)
Formats the array of JoinStats into a CSV table view.

Parameters:
rule - The IRule whose JoinStats are being reported.
ruleState - Contains details about evaluation order for the IPredicates in the tail of the rule, the access paths that were used, etc.
a - The JoinStats.
Returns:
The table view.
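As a rough illustration of a CSV table view, the sketch below formats one row per statistics object. The column set and order are assumptions chosen for the example, and `JoinStatsCsvSketch` and its `Row` type are hypothetical stand-ins; the actual columns produced by this method are not documented here.

```java
class JoinStatsCsvSketch {
    // Hypothetical row type mirroring four of the documented fields.
    static final class Row {
        final int orderIndex;
        final int partitionId;
        final long bindingSetsIn;
        final long bindingSetsOut;

        Row(int orderIndex, int partitionId, long bindingSetsIn, long bindingSetsOut) {
            this.orderIndex = orderIndex;
            this.partitionId = partitionId;
            this.bindingSetsIn = bindingSetsIn;
            this.bindingSetsOut = bindingSetsOut;
        }
    }

    // Writes a header line followed by one comma-separated line per row.
    static StringBuilder toCsv(Row[] a) {
        StringBuilder sb = new StringBuilder();
        sb.append("orderIndex,partitionId,bindingSetsIn,bindingSetsOut\n");
        for (Row r : a) {
            sb.append(r.orderIndex).append(',')
              .append(r.partitionId).append(',')
              .append(r.bindingSetsIn).append(',')
              .append(r.bindingSetsOut).append('\n');
        }
        return sb;
    }

    public static void main(String[] args) {
        Row[] rows = { new Row(0, -1, 1, 25), new Row(1, -1, 25, 7) };
        System.out.print(toCsv(rows));
    }
}
```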


Copyright © 2006-2011 SYSTAP, LLC. All Rights Reserved.