com.bigdata.rdf.load
Class MappedRDFDataLoadMaster<S extends MappedRDFDataLoadMaster.JobState,T extends AbstractAsynchronousClientTask<U,V,L>,U,L extends ClientLocator,V extends Serializable>

java.lang.Object
  extended by com.bigdata.service.jini.master.TaskMaster<S,T,U>
      extended by com.bigdata.service.jini.master.MappedTaskMaster<S,T,L,U,V>
          extended by com.bigdata.rdf.load.MappedRDFDataLoadMaster<S,T,U,L,V>
All Implemented Interfaces:
Callable<Void>

public class MappedRDFDataLoadMaster<S extends MappedRDFDataLoadMaster.JobState,T extends AbstractAsynchronousClientTask<U,V,L>,U,L extends ClientLocator,V extends Serializable>
extends MappedTaskMaster<S,T,L,U,V>

Distributed bulk loader for RDF data. Creates or (re-)opens the AbstractTripleStore, loads the optional ontology, and starts the clients. The clients load any data found in JobState#dataDir and keep running until the master is canceled. Files are optionally deleted after they have been loaded successfully, and the closure may optionally be computed.

Version:
$Id: MappedRDFDataLoadMaster.java 6045 2012-02-27 17:33:44Z thompsonbry $
Author:
Bryan Thompson
TODO:
Support loading files from URLs, BFS, etc. This can be achieved via subclassing and overriding MappedTaskMaster.newClientTask(int) and newJobState(String, Configuration) as necessary.
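
The order of steps in the class description, and the subclassing route mentioned in the TODO, can be illustrated with a minimal sketch. Every type and method below is a hypothetical stand-in rather than the bigdata API; in particular, placing the closure step after the clients have been started is an assumption, and the "tasks" are plain strings so that the ordering is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical stand-in for the master. None of these types are the
 * bigdata classes; the sketch only mirrors the order of steps in the
 * class description.
 */
class LoadMasterSketch {

    /** Records each step so the order can be inspected. */
    final List<String> steps = new ArrayList<>();

    final boolean hasOntology;
    final boolean computeClosure;

    LoadMasterSketch(boolean hasOntology, boolean computeClosure) {
        this.hasOntology = hasOntology;
        this.computeClosure = computeClosure;
    }

    /** Template method: fixed order of steps, overridable factory hook. */
    final List<String> run(int clientCount) {
        steps.add("openTripleStore");
        if (hasOntology) {
            steps.add("loadOntology");
        }
        for (int i = 0; i < clientCount; i++) {
            steps.add(newClientTask(i));
        }
        if (computeClosure) {
            // Assumption: closure is computed once loading is done.
            steps.add("computeClosure");
        }
        return steps;
    }

    /** Factory hook; the default stands in for a file-based load task. */
    protected String newClientTask(int clientIndex) {
        return "fileLoadTask-" + clientIndex;
    }
}

/** Hypothetical subclass that swaps in a different kind of client task. */
class UrlLoadMasterSketch extends LoadMasterSketch {

    UrlLoadMasterSketch(boolean hasOntology, boolean computeClosure) {
        super(hasOntology, computeClosure);
    }

    @Override
    protected String newClientTask(int clientIndex) {
        return "urlLoadTask-" + clientIndex;
    }
}
```

Only the factory hook is overridden in the subclass; the sequence of steps stays in the base class, which is the same division of labor the TODO suggests for newClientTask and newJobState.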

Nested Class Summary
static interface MappedRDFDataLoadMaster.ConfigurationOptions
          Configuration options for the MappedRDFDataLoadMaster.
static class MappedRDFDataLoadMaster.JobState
          The job description for a MappedRDFDataLoadMaster.
 
Nested classes/interfaces inherited from class com.bigdata.service.jini.master.TaskMaster
TaskMaster.DiscoveredServices
 
Field Summary
protected static org.apache.log4j.Logger log
           
 
Fields inherited from class com.bigdata.service.jini.master.TaskMaster
fed
 
Constructor Summary
MappedRDFDataLoadMaster(JiniFederation fed)
           
 
Method Summary
protected  void beginJob(S jobState)
          Extended to open/create the KB.
protected  AbstractTripleStore createTripleStore()
          Create the AbstractTripleStore specified by MappedRDFDataLoadMaster.ConfigurationOptions.NAMESPACE using the properties associated with the TaskMaster.JobState#component.
protected  StringBuilder getKBInfo(AbstractTripleStore tripleStore)
          Return various interesting metadata about the KB state.
protected  void loadOntology(AbstractTripleStore tripleStore)
          Loads the file or directory specified by MappedRDFDataLoadMaster.ConfigurationOptions.ONTOLOGY into the ITripleStore.
static void main(String[] args)
          Runs the master.
protected  T newClientTask(INotifyOutcome<V,L> notifyProxy, L locator)
          The default creates RDFFileLoadTask instances.
protected  S newJobState(String component, net.jini.config.Configuration config)
          Return a TaskMaster.JobState.
 AbstractTripleStore openTripleStore()
          Create/re-open the repository.
protected  void runJob()
          Extended to support optional load, closure, and reporting.
 void showProperties(AbstractTripleStore tripleStore)
          Dump some properties of interest.
 
Methods inherited from class com.bigdata.service.jini.master.MappedTaskMaster
newClientTask, newResourceBuffer
 
Methods inherited from class com.bigdata.service.jini.master.TaskMaster
allDone, attachPerformanceCounters, awaitAll, call, cancelAll, detachPerformanceCounters, error, execute, forceOverflow, getFederation, getJobState, innerMain, notifyOutcome, setupJob, startClients, success, tearDownJob
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

log

protected static final org.apache.log4j.Logger log
Constructor Detail

MappedRDFDataLoadMaster

public MappedRDFDataLoadMaster(JiniFederation fed)
                        throws net.jini.config.ConfigurationException
Throws:
net.jini.config.ConfigurationException
Method Detail

main

public static void main(String[] args)
                 throws net.jini.config.ConfigurationException,
                        ExecutionException,
                        InterruptedException,
                        org.apache.zookeeper.KeeperException
Runs the master. SIGTERM (normal kill) or SIGINT (^C) will cancel the job, including any running clients. Use -Dbigdata.component to override the configuration component name.

Parameters:
args - The Configuration and any overrides.
Throws:
net.jini.config.ConfigurationException
ExecutionException
InterruptedException
org.apache.zookeeper.KeeperException
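
The -Dbigdata.component override described for main can be pictured as a system-property lookup with a fallback. The sketch below is an illustration only: the helper class is hypothetical, and the assumption that the default component name is the fully qualified class name of the master is not confirmed by this page.

```java
import java.util.Properties;

/** Hypothetical helper; not part of the bigdata API. */
final class ComponentNameSketch {

    /** System property named in the documentation for main(). */
    static final String COMPONENT_PROPERTY = "bigdata.component";

    private ComponentNameSketch() {
    }

    /**
     * Returns the override from the given properties when it is present
     * and non-blank, otherwise the supplied default component name.
     */
    static String resolveComponent(Properties props, String defaultComponent) {
        final String override = props.getProperty(COMPONENT_PROPERTY);
        if (override == null || override.trim().isEmpty()) {
            return defaultComponent;
        }
        return override.trim();
    }

    public static void main(String[] args) {
        // Assumed default: the fully qualified class name of the master.
        final String dflt = "com.bigdata.rdf.load.MappedRDFDataLoadMaster";
        System.out.println(resolveComponent(System.getProperties(), dflt));
    }
}
```

Running the sketch with `java -Dbigdata.component=my.Component ComponentNameSketch` would select `my.Component`, where `my.Component` is a made-up name used for illustration.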

runJob

protected void runJob()
               throws Exception
Extended to support optional load, closure, and reporting.

Overrides:
runJob in class MappedTaskMaster<S extends MappedRDFDataLoadMaster.JobState,T extends AbstractAsynchronousClientTask<U,V,L>,L extends ClientLocator,U,V extends Serializable>
Throws:
Exception - Client execution problem.
InterruptedException - Master interrupted awaiting clients.

getKBInfo

protected StringBuilder getKBInfo(AbstractTripleStore tripleStore)
Return various interesting metadata about the KB state.


beginJob

protected void beginJob(S jobState)
                 throws Exception
Extended to open/create the KB.

Overrides:
beginJob in class TaskMaster<S extends MappedRDFDataLoadMaster.JobState,T extends AbstractAsynchronousClientTask<U,V,L>,U>
Throws:
Exception
See Also:
TaskMaster.ConfigurationOptions.INDEX_DUMP_DIR, TaskMaster.ConfigurationOptions.INDEX_DUMP_NAMESPACE

openTripleStore

public AbstractTripleStore openTripleStore()
                                    throws net.jini.config.ConfigurationException
Create/re-open the repository.

Throws:
net.jini.config.ConfigurationException

createTripleStore

protected AbstractTripleStore createTripleStore()
                                         throws net.jini.config.ConfigurationException
Create the AbstractTripleStore specified by MappedRDFDataLoadMaster.ConfigurationOptions.NAMESPACE using the properties associated with the TaskMaster.JobState#component.

Returns:
The AbstractTripleStore
Throws:
net.jini.config.ConfigurationException

loadOntology

protected void loadOntology(AbstractTripleStore tripleStore)
                     throws IOException
Loads the file or directory specified by MappedRDFDataLoadMaster.ConfigurationOptions.ONTOLOGY into the ITripleStore.

Throws:
IOException

showProperties

public void showProperties(AbstractTripleStore tripleStore)
Dump some properties of interest.


newClientTask

protected T newClientTask(INotifyOutcome<V,L> notifyProxy,
                          L locator)
The default creates RDFFileLoadTask instances.

Specified by:
newClientTask in class MappedTaskMaster<S extends MappedRDFDataLoadMaster.JobState,T extends AbstractAsynchronousClientTask<U,V,L>,L extends ClientLocator,U,V extends Serializable>
Parameters:
notifyProxy - The proxy for the object to which the client must deliver notice of success or failure for each processed resource.
locator - The locator for the client on which the task will be executed.
Returns:
The client task.
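
The contract described for notifyProxy (one notice of success or failure per processed resource), together with the optional delete-after-load behavior from the class description, might look roughly like the following. This is a sketch under stated assumptions: OutcomeListenerSketch, ResourceLoaderSketch, and FileLoadTaskSketch are hypothetical names, not the real INotifyOutcome or RDFFileLoadTask, and the real task runs asynchronously on a remote client rather than in a simple loop.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical callback; not the real INotifyOutcome interface. */
interface OutcomeListenerSketch {
    void success(Path resource);
    void error(Path resource, Throwable cause);
}

/** Hypothetical loader standing in for parsing RDF into the KB. */
interface ResourceLoaderSketch {
    void load(Path resource) throws IOException;
}

/** Hypothetical client task: load each file, then report the outcome. */
final class FileLoadTaskSketch {

    private final ResourceLoaderSketch loader;
    private final OutcomeListenerSketch listener;
    private final boolean deleteAfter;

    FileLoadTaskSketch(ResourceLoaderSketch loader,
                       OutcomeListenerSketch listener,
                       boolean deleteAfter) {
        this.loader = loader;
        this.listener = listener;
        this.deleteAfter = deleteAfter;
    }

    /** Delivers exactly one notice per resource. */
    void process(List<Path> resources) {
        for (Path resource : resources) {
            try {
                loader.load(resource);
                if (deleteAfter) {
                    // Delete only after the load succeeded; a failed
                    // delete is reported as an error for this resource.
                    Files.deleteIfExists(resource);
                }
                listener.success(resource);
            } catch (IOException | RuntimeException e) {
                // A failed resource is left in place and reported.
                listener.error(resource, e);
            }
        }
    }
}
```

A failure on one resource does not stop the loop, so the listener still hears about every remaining file.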

newJobState

protected S newJobState(String component,
                        net.jini.config.Configuration config)
                                                          throws net.jini.config.ConfigurationException
Description copied from class: TaskMaster
Return a TaskMaster.JobState.

Specified by:
newJobState in class TaskMaster<S extends MappedRDFDataLoadMaster.JobState,T extends AbstractAsynchronousClientTask<U,V,L>,U>
Parameters:
component - The component.
config - The configuration.
Returns:
The TaskMaster.JobState.
Throws:
net.jini.config.ConfigurationException


Copyright © 2006-2012 SYSTAP, LLC. All Rights Reserved.