java.lang.Object
  com.bigdata.service.mapred.AbstractMapTask
public abstract class AbstractMapTask
Abstract base class for IMapTasks.
Note: The presumption is that there is a distinct instance of the map task for each task executed and that each task is executed within a single-threaded environment.
Note: Any declared fields are materialized on both the master and the service. Make a field transient unless it must be sent to the service, and do not initialize anything large on the master unless it is held in a transient field. Lazy initialization works well here, since the initialization then happens only on the service.
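The note above can be illustrated with a minimal sketch. It assumes the task is shipped to the service with standard Java serialization, which this page does not confirm, and `LazyFieldTask` is a hypothetical stand-in rather than a real subclass of AbstractMapTask.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical task class, standing in for an AbstractMapTask subclass.
class LazyFieldTask implements Serializable {
    private static final long serialVersionUID = 1L;

    // Small state that must travel from the master to the service.
    final String source;

    // Large scratch state: transient, so it is never serialized.
    private transient byte[] buffer;

    LazyFieldTask(String source) {
        this.source = source;
    }

    // Allocate on first use; a task that is only constructed on the
    // master never pays for the allocation.
    byte[] getBuffer() {
        if (buffer == null) {
            buffer = new byte[1 << 20];
        }
        return buffer;
    }

    boolean isBufferAllocated() {
        return buffer != null;
    }

    // Round-trip through Java serialization (an assumption of this sketch)
    // to simulate sending the task to a service.
    static LazyFieldTask roundTrip(LazyFieldTask task) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(task);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (LazyFieldTask) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        LazyFieldTask onMaster = new LazyFieldTask("input-0001.txt");
        LazyFieldTask onService = roundTrip(onMaster);
        System.out.println(onService.source);
        System.out.println(onService.isBufferAllocated());
        System.out.println(onService.getBuffer().length);
    }
}
```

Even if the buffer has been allocated on the master, the deserialized copy starts with a null buffer and allocates its own on first use.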
| Field Summary | |
|---|---|
| protected IHashFunction | hashFunction |
| protected int | nreduce |
| protected Object | source |
| protected UUID | uuid |
| Fields inherited from interface com.bigdata.service.mapred.IMapTask |
|---|
| log |
| Constructor Summary | |
|---|---|
| protected | AbstractMapTask(UUID uuid, Object source, Integer nreduce, IHashFunction hashFunction) |
| Method Summary | |
|---|---|
| protected DataOutputBuffer | getDataOutputBuffer() - The values may be formatted using this utility class. |
| int[] | getHistogram() - Return the histogram of the #of tuples in each output partition. |
| protected IKeyBuilder | getKeyBuilder() - The KeyBuilder MUST be used by the IMapTask so that the generated keys will have a total ordering determined by their interpretation as an unsigned byte[]. |
| Object | getSource() - The source from which the map task will read its data. |
| int | getTupleCount() - The #of tuples written by the task. |
| com.bigdata.service.mapred.Tuple[] | getTuples() - Return the tuples. |
| UUID | getUUID() - The unique identifier for the task. |
| void | output(byte[] val) - Hash partitions the tuple based on the key already in keyBuilder into one of nreduce output buckets. |
| protected void | output(int partition, byte[] key, byte[] val) - Output a key-value pair (tuple) to the appropriate reduce task. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
protected final UUID uuid
protected final Object source
protected final int nreduce
protected final IHashFunction hashFunction
| Constructor Detail |
|---|
protected AbstractMapTask(UUID uuid,
Object source,
Integer nreduce,
IHashFunction hashFunction)
Parameters:
uuid - The UUID of the map task. This MUST be the same UUID each time a map task is re-executed for a given input. The UUID (together with the tuple counter) is used to generate a key that makes the map operation "retry safe". That is, the operation may be executed one or more times and the result will be the same. This guarantee arises because the values for identical keys are overwritten during the reduce operation.
source - The source from which the map task will read its data. This is commonly a File in a networked file system but other kinds of sources may be supported.
nreduce - The #of reduce tasks that are being fed by this map task.
hashFunction - The hash function used to hash partition the tuples generated by the map task into the input sink for each of the reduce tasks.
| Method Detail |
|---|
protected IKeyBuilder getKeyBuilder()
The KeyBuilder MUST be used by the IMapTask so that
the generated keys will have a total ordering determined by their
interpretation as an unsigned byte[].
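To make the ordering requirement concrete, here is a minimal sketch of comparing keys as unsigned byte[]. `UnsignedByteOrder` is a hypothetical helper written for this illustration; it is not part of the bigdata API and does not show how KeyBuilder encodes keys.

```java
import java.util.Arrays;

// Hypothetical helper: orders keys by their interpretation as unsigned byte[].
class UnsignedByteOrder {
    // Compare byte by byte, treating each byte as a value in 0..255.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x < y ? -1 : 1;
            }
        }
        // A key that is a proper prefix of another sorts first.
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        byte[][] keys = { {(byte) 0x80}, {0x7f}, {0x01, 0x00}, {0x01} };
        Arrays.sort(keys, UnsignedByteOrder::compare);
        for (byte[] k : keys) {
            System.out.println(Arrays.toString(k));
        }
    }
}
```

Under this ordering the byte 0x80 sorts after 0x7f, whereas a signed comparison would place it first. That difference is why keys need an encoding designed for unsigned comparison.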
protected DataOutputBuffer getDataOutputBuffer()
The values may be formatted using this utility class. For example:
valBuilder.reset().append(foo).toByteArray();
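The reset/append/toByteArray pattern can be sketched with standard library classes. The DataOutputBuffer API itself is not documented on this page, so `ValueBuilderSketch` below is a hypothetical stand-in built on java.io.DataOutputStream, and its method names only mirror the one-line example above.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a reusable value buffer; not the real DataOutputBuffer.
class ValueBuilderSketch {
    private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    private final DataOutputStream out = new DataOutputStream(bytes);

    // Discard any previously written value so the buffer can be reused.
    ValueBuilderSketch reset() {
        bytes.reset();
        return this;
    }

    // Append a big-endian int32.
    ValueBuilderSketch append(int v) {
        try {
            out.writeInt(v);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return this;
    }

    // Append a string in the modified UTF-8 format of DataOutputStream.writeUTF.
    ValueBuilderSketch append(String s) {
        try {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return this;
    }

    // Copy out the bytes written since the last reset.
    byte[] toByteArray() {
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        ValueBuilderSketch valBuilder = new ValueBuilderSketch();
        byte[] val = valBuilder.reset().append(42).append("foo").toByteArray();
        System.out.println(val.length);
    }
}
```

Reusing one buffer per task instance fits the single-threaded presumption stated for this class, since no other thread can observe the buffer between reset() and toByteArray().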
public UUID getUUID()
The unique identifier for the task. Note: if a task is retried then the new instance of that task MUST have the same identifier.
Specified by: getUUID in interface ITask
public Object getSource()
The source from which the map task will read its data. This is commonly a File in a networked file system but other kinds of sources may be supported.
public com.bigdata.service.mapred.Tuple[] getTuples()
public int getTupleCount()
public void output(byte[] val)
Hash partitions the tuple based on the key already in keyBuilder
into one of nreduce output buckets. Forms a unique key using the
data already in keyBuilder and appending the task UUID and the
int32 tuple counter. Finally, invokes output(int, byte[], byte[]) to
output the key-value pair. The resulting key preserves the key order,
groups all keys with the same value for the same map task, and finally
distinguishes individual key-value pairs using the tuple counter.
Parameters:
val - The value for the tuple.
See Also: output(int, byte[], byte[])
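The key construction described for output(byte[] val) can be sketched as follows. This is an illustration only: `MapKeySketch`, `makeKey`, and `partition` are hypothetical names, the hash is a stand-in for the IHashFunction (whose contract is not shown on this page), and the byte layout is an assumption that does not reproduce the encoding KeyBuilder actually uses.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.UUID;

// Hypothetical sketch of the unique-key layout: application key + task UUID + int32 counter.
class MapKeySketch {
    // Append the task UUID (16 bytes) and the tuple counter (4 bytes) to the
    // application key. ByteBuffer writes big-endian by default.
    static byte[] makeKey(byte[] appKey, UUID taskId, int counter) {
        return ByteBuffer.allocate(appKey.length + 16 + 4)
                .put(appKey)
                .putLong(taskId.getMostSignificantBits())
                .putLong(taskId.getLeastSignificantBits())
                .putInt(counter)
                .array();
    }

    // Stand-in partitioner: hash only the application key so that all tuples
    // with the same application key land in the same bucket in [0:nreduce).
    static int partition(byte[] appKey, int nreduce) {
        return Math.floorMod(Arrays.hashCode(appKey), nreduce);
    }

    public static void main(String[] args) {
        UUID taskId = UUID.randomUUID();
        byte[] appKey = {0x61};
        byte[] key = makeKey(appKey, taskId, 0);
        System.out.println(key.length);
        System.out.println(partition(appKey, 4));
    }
}
```

Because the key depends only on the application key, the task UUID, and the tuple counter, a retried task that reuses the same UUID regenerates identical keys. Its output then overwrites the earlier attempt during the reduce operation instead of duplicating it.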
protected void output(int partition,
byte[] key,
byte[] val)
Output a key-value pair (tuple) to the appropriate reduce task.
Parameters:
partition - The output partition in [0:nreduce).
key - The complete key. The key MUST be encoded such that the keys may be placed into a total order by interpreting them as an unsigned byte[]. See KeyBuilder.
val - The value. The value encoding is essentially arbitrary but the DataOutputBuffer may be helpful here.
public int[] getHistogram()
Return the histogram of the #of tuples in each output partition.
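A minimal sketch of how a per-partition histogram and a tuple count could be maintained as tuples are output. `PartitionHistogram` is a hypothetical class written for illustration; this page does not show how AbstractMapTask tracks these values internally.

```java
import java.util.Arrays;

// Hypothetical bookkeeping for the #of tuples written to each output partition.
class PartitionHistogram {
    private final int[] histogram;
    private int tupleCount;

    PartitionHistogram(int nreduce) {
        histogram = new int[nreduce];
    }

    // Record one tuple written to the given partition in [0:nreduce).
    void record(int partition) {
        if (partition < 0 || partition >= histogram.length) {
            throw new IndexOutOfBoundsException(
                    "partition " + partition + " not in [0:" + histogram.length + ")");
        }
        histogram[partition]++;
        tupleCount++;
    }

    // Return a copy so callers cannot modify the internal counts.
    int[] getHistogram() {
        return histogram.clone();
    }

    int getTupleCount() {
        return tupleCount;
    }

    public static void main(String[] args) {
        PartitionHistogram h = new PartitionHistogram(3);
        h.record(0);
        h.record(2);
        h.record(2);
        System.out.println(Arrays.toString(h.getHistogram()));
        System.out.println(h.getTupleCount());
    }
}
```

The entries of the histogram always sum to the tuple count, which gives a quick consistency check on the partitioning.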