com.bigdata.service.mapred.tasks
Class ExtractKeywords

java.lang.Object
  extended by com.bigdata.service.mapred.AbstractMapTask
      extended by com.bigdata.service.mapred.AbstractFileInputMapTask
          extended by com.bigdata.service.mapred.tasks.ExtractKeywords
All Implemented Interfaces:
IMapTask, ITask, Serializable

public class ExtractKeywords
extends AbstractFileInputMapTask

Tokenizes an input file, writing {key, term} tuples. The key is a compressed Unicode sort key. The value is the UTF-8 serialization of the term, which can be deserialized to recover the exact Unicode term.
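
The tuple layout described above can be illustrated with a minimal, self-contained sketch. It does not use this class or any bigdata API. The names `KeywordTupleSketch`, `Tuple`, and `extract` are hypothetical, the JDK `java.text.Collator` stands in for whatever key builder the task actually uses (its sort keys are not compressed), and the tokenization rule (split on anything that is not a letter or digit) is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; not part of com.bigdata.service.mapred.
class KeywordTupleSketch {

    /** One {key, term} tuple: a sort key plus the UTF-8 bytes of the term. */
    static final class Tuple {
        final byte[] key;
        final byte[] val;

        Tuple(byte[] key, byte[] val) {
            this.key = key;
            this.val = val;
        }
    }

    /** Splits the text into terms and builds one tuple per term occurrence. */
    static List<Tuple> extract(String text, Collator collator) {
        List<Tuple> tuples = new ArrayList<>();
        // Assumed tokenization rule: runs of letters and decimal digits are terms.
        for (String term : text.split("[^\\p{L}\\p{Nd}]+")) {
            if (term.isEmpty()) {
                continue; // split() yields an empty first element on a leading delimiter
            }
            // Stand-in for the Unicode sort key (JDK collation key, uncompressed).
            byte[] key = collator.getCollationKey(term).toByteArray();
            // The value is the UTF-8 serialization of the term.
            byte[] val = term.getBytes(StandardCharsets.UTF_8);
            tuples.add(new Tuple(key, val));
        }
        return tuples;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ROOT);
        for (Tuple t : extract("Tokenizes an input file", collator)) {
            String term = new String(t.val, StandardCharsets.UTF_8);
            System.out.println(term + " -> " + t.key.length + " key bytes");
        }
    }
}
```

Because equal terms yield equal sort keys under the same collator, repeated occurrences of a term share a key, which is what would let a downstream task such as CountKeywords group them.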

Version:
$Id: ExtractKeywords.java 2265 2009-10-26 12:51:06Z thompsonbry $
Author:
Bryan Thompson
See Also:
CountKeywords, Serialized Form

Field Summary
static String UTF8
          The encoding used to serialize the term (the value of each tuple).
 
Fields inherited from class com.bigdata.service.mapred.AbstractMapTask
hashFunction, nreduce, source, uuid
 
Fields inherited from interface com.bigdata.service.mapred.IMapTask
log
 
Constructor Summary
ExtractKeywords(UUID uuid, Object source, Integer nreduce, IHashFunction hashFunction)
           
 
Method Summary
 void input(File file, InputStream is)
           
 
Methods inherited from class com.bigdata.service.mapred.AbstractFileInputMapTask
input
 
Methods inherited from class com.bigdata.service.mapred.AbstractMapTask
getDataOutputBuffer, getHistogram, getKeyBuilder, getSource, getTupleCount, getTuples, getUUID, output, output
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

UTF8

public static final transient String UTF8
The encoding used to serialize the term (the value of each tuple).

See Also:
Constant Field Values
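
As a hedged illustration of why the value can be decoded back to the exact term, the sketch below round-trips a string through UTF-8 using only the JDK. The class and method names are hypothetical, and `StandardCharsets.UTF_8` is used in place of this class's `UTF8` constant, whose value is not shown on this page.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not part of com.bigdata.service.mapred.
class Utf8RoundTripSketch {

    /** Serializes a term to the bytes that would be stored as the tuple value. */
    static byte[] serialize(String term) {
        return term.getBytes(StandardCharsets.UTF_8);
    }

    /** Recovers the term from the stored tuple value. */
    static String deserialize(byte[] val) {
        return new String(val, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String term = "na\u00efve"; // contains one non-ASCII character
        byte[] val = serialize(term);
        System.out.println("bytes: " + val.length);
        System.out.println("round trip equal: " + term.equals(deserialize(val)));
    }
}
```

The round trip is exact for well-formed strings. A Java `String` containing an unpaired surrogate is not well-formed Unicode and will not survive encoding unchanged.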
Constructor Detail

ExtractKeywords

public ExtractKeywords(UUID uuid,
                       Object source,
                       Integer nreduce,
                       IHashFunction hashFunction)
Method Detail

input

public void input(File file,
                  InputStream is)
           throws Exception
Specified by:
input in class AbstractFileInputMapTask
Throws:
Exception


Copyright © 2006-2009 SYSTAP, LLC. All Rights Reserved.