public class DataReuseEngine extends Engine implements Refiner
In the first pass, we determine all the jobs whose output files exist in the Replica Catalog. An output file with the transfer flag set to false is treated as equivalent to a file existing in the Replica Catalog if
- the output file is not an input to any of the children of the job.
In the second pass, we remove the jobs whose output files exist in the Replica Catalog and try to cascade the deletion upwards to the parent jobs. We traverse the workflow breadth first, bottom up. A node is marked for deletion if
- it was already marked for deletion in pass 1, OR
- ALL of its children have been marked for deletion AND the node's output files have transfer flags set to false.
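The second-pass marking rule can be illustrated with a small, self-contained sketch. It is not the Pegasus implementation: the class `CascadeSketch`, the method `markForDeletion`, and the plain map-based graph are hypothetical stand-ins for `Graph` and `GraphNode`. The sketch also assumes that the cascade rule applies only to nodes that actually have children, since leaf nodes are handled by the first pass.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration of the second pass; not the Pegasus code. */
final class CascadeSketch {

    /**
     * @param children      node id -> ids of its children (every node must appear as a key)
     * @param transfers     ids of nodes with at least one output whose transfer flag is true
     * @param markedInPass1 ids of nodes marked for deletion in the first pass
     * @return ids of all nodes marked for deletion after cascading upwards
     */
    static Set<String> markForDeletion(Map<String, List<String>> children,
                                       Set<String> transfers,
                                       Set<String> markedInPass1) {
        // Build parent links and count, per node, the children not yet visited.
        Map<String, List<String>> parents = new HashMap<>();
        Map<String, Integer> pending = new HashMap<>();
        for (Map.Entry<String, List<String>> e : children.entrySet()) {
            pending.put(e.getKey(), e.getValue().size());
            for (String child : e.getValue()) {
                parents.computeIfAbsent(child, k -> new ArrayList<>()).add(e.getKey());
            }
        }

        // Start the bottom-up traversal at the leaves.
        Deque<String> queue = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : pending.entrySet()) {
            if (e.getValue() == 0) {
                queue.add(e.getKey());
            }
        }

        Set<String> marked = new HashSet<>(markedInPass1);
        while (!queue.isEmpty()) {
            String node = queue.remove();
            List<String> kids = children.get(node);
            // Assumption: the cascade rule only applies to nodes that have children.
            boolean allChildrenMarked = !kids.isEmpty() && marked.containsAll(kids);
            if (allChildrenMarked && !transfers.contains(node)) {
                marked.add(node);
            }
            // A parent is visited only after all of its children have been visited.
            for (String parent : parents.getOrDefault(node, List.of())) {
                if (pending.merge(parent, -1, Integer::sum) == 0) {
                    queue.add(parent);
                }
            }
        }
        return marked;
    }
}
```

For a chain A → B → C in which C was marked in pass 1, B transfers nothing and A transfers its output, the sketch marks B in addition to C and leaves A in the workflow.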
Modifier and Type | Class and Description |
---|---|
class | DataReuseEngine.BooleanBag: A bag implementation that can be used to hold a boolean value associated with the graph node. |
Modifier and Type | Field and Description |
---|---|
private List<Job> | mAllDeletedJobs: List of all jobs deleted during workflow reduction. |
private ADag | mWorkflow: The workflow object being worked upon. |
private XMLProducer | mXMLStore: The XML Producer object that records the actions. |
mBag, mLogger, mLogMsg, mOutputPool, mPoolFile, mPOptions, mProps, mRLIUrl, mSiteStore, mTCFile, mTCHandle, mTCMode, REGISTRATION_UNIVERSE, TRANSFER_UNIVERSE
Constructor and Description |
---|
DataReuseEngine(ADag orgDag, PegasusBag bag): The constructor. |
Modifier and Type | Method and Description |
---|---|
protected Graph | cascadeDeletionUpwards(Graph workflow, List<GraphNode> originalJobsInRC): Cascade the deletion of the jobs upwards in the workflow. |
List<Job> | getDeletedJobs(): Returns all the jobs deleted from the workflow after the reduction algorithm has run. |
List<Job> | getDeletedLeafJobs(): Returns all the deleted jobs that happen to be leaf nodes. |
private List<GraphNode> | getJobsInRC(Graph workflow, Set filesInRC): Returns all the jobs whose output files exist in the Replica Catalog. |
ADag | getWorkflow(): Returns a reference to the workflow that is being refined by the refiner. |
XMLProducer | getXMLProducer(): Returns a reference to the XMLProducer that generates the XML fragment capturing the actions of the refiner. |
ADag | reduceWorkflow(ADag workflow, ReplicaCatalogBridge rcb): Reduces the workflow on the basis of the existence of LFNs in the replica catalog. |
Graph | reduceWorkflow(Graph workflow, ReplicaCatalogBridge rcb): Reduces the workflow on the basis of the existence of LFNs in the replica catalog. |
protected boolean | transferOutput(GraphNode node): Returns whether a user wants output transferred for a node or not. |
addVector, appendArrayList, loadProperties, printVector, stringInList, stringInPegVector, stringInVector, vectorToString
private List<Job> mAllDeletedJobs
private XMLProducer mXMLStore
private ADag mWorkflow
public DataReuseEngine(ADag orgDag, PegasusBag bag)
orgDag - the original Dag object.
bag - the bag of initialization objects.
public ADag getWorkflow()
getWorkflow in interface Refiner
public XMLProducer getXMLProducer()
getXMLProducer in interface Refiner
public ADag reduceWorkflow(ADag workflow, ReplicaCatalogBridge rcb)
workflow - the workflow to be reduced.
rcb - instance of the replica catalog bridge.
public Graph reduceWorkflow(Graph workflow, ReplicaCatalogBridge rcb)
workflow - the workflow to be reduced.
rcb - instance of the replica catalog bridge.
public List<Job> getDeletedJobs()
Returns a List of Job objects for all jobs deleted from the workflow.
public List<Job> getDeletedLeafJobs()
Returns a List of Job objects for the deleted leaf jobs.
private List<GraphNode> getJobsInRC(Graph workflow, Set filesInRC)
workflow - the workflow object.
filesInRC - Set of String objects corresponding to the logical filenames of files that are found to be in the Replica Catalog.
See also: org.griphyn.cPlanner.classes.Job
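The first-pass selection described for this class can be sketched as follows. This is one reading of the description, not the actual body of getJobsInRC: the names `FirstPassSketch`, `OutputFile`, `jobsInRC`, and `childInputs` are hypothetical. The sketch assumes a job qualifies only when every one of its output files either is in the Replica Catalog or has its transfer flag set to false and is not consumed by a child. It also assumes that a job without output files never qualifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration of the first pass; not the Pegasus code. */
final class FirstPassSketch {

    /** Hypothetical output-file descriptor: logical file name plus transfer flag. */
    static final class OutputFile {
        final String lfn;
        final boolean transfer;

        OutputFile(String lfn, boolean transfer) {
            this.lfn = lfn;
            this.transfer = transfer;
        }
    }

    /**
     * @param outputs     job id -> output files produced by that job
     * @param childInputs job id -> logical file names consumed by the job's children
     * @param filesInRC   logical file names found in the Replica Catalog
     * @return ids of the jobs whose outputs all count as existing in the catalog
     */
    static List<String> jobsInRC(Map<String, List<OutputFile>> outputs,
                                 Map<String, Set<String>> childInputs,
                                 Set<String> filesInRC) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<OutputFile>> e : outputs.entrySet()) {
            List<OutputFile> files = e.getValue();
            if (files.isEmpty()) {
                continue; // assumption: jobs without outputs are never reused
            }
            Set<String> consumed = childInputs.getOrDefault(e.getKey(), Set.of());
            boolean allPresent = true;
            for (OutputFile f : files) {
                boolean inCatalog = filesInRC.contains(f.lfn);
                // A non-transferred file that no child reads counts as present.
                boolean treatedAsPresent = !f.transfer && !consumed.contains(f.lfn);
                if (!inCatalog && !treatedAsPresent) {
                    allPresent = false;
                    break;
                }
            }
            if (allPresent) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

The returned list follows the iteration order of the `outputs` map, so callers that need a stable order should pass an ordered map.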
protected Graph cascadeDeletionUpwards(Graph workflow, List<GraphNode> originalJobsInRC)
A node is marked for deletion if it is already marked for deletion, OR if ALL of its children have been marked for deletion AND the node's output files have transfer flags set to false.
workflow - the workflow to be reduced.
originalJobsInRC - list of nodes found to be in the Replica Catalog.
protected boolean transferOutput(GraphNode node)
node - the GraphNode.
Copyright © 2011 The University of Southern California. All Rights Reserved.