public class InPlace extends Object implements CleanupStrategy
Modifier and Type | Field and Description |
---|---|
static String |
CLEANUP_JOB_PREFIX
The prefix for a cleanup job ID, i.e. the prefix plus the parent compute job ID
becomes the ID of the cleanup job.
|
static int |
DEFAULT_CLUSTERED_CLEANUP_JOBS_PER_LEVEL
The default value for the number of clustered cleanup jobs created per
level.
|
static String |
DEFAULT_MAX_JOBS_FOR_CLEANUP_CATEGORY
The default value for the maxjobs variable for the category of cleanup
jobs.
|
private int |
mCleanupJobsPerLevel
The number of cleanup jobs to be created per level.
|
private int |
mCleanupJobsSize
The number of cleanup jobs clustered into one clustered cleanup job.
|
private HashSet |
mDoNotClean
HashSet of files that should not be cleaned up.
|
private CleanupImplementation |
mImpl
The handle to the CleanupImplementation instance that creates the jobs for us.
|
private LogManager |
mLogger
The handle to the logging object used for logging.
|
private int |
mMaxDepth
The maximum depth of any job in the workflow; useful for a priority queue
implementation in an array.
|
private PegasusProperties |
mProps
The handle to the properties passed to Pegasus.
|
private HashMap |
mResMap
The mapping of siteHandle to all the jobs that are mapped to it:
siteHandle (String) to Set.
|
private HashMap |
mResMapLeaves
The mapping of siteHandle to the subset of the jobs mapped to it that are
leaves in the workflow: siteHandle (String) to Set.
|
private HashMap |
mResMapRoots
The mapping of siteHandle to the subset of the jobs mapped to it that are
roots in the workflow: siteHandle (String) to Set.
|
private boolean |
mUseSizeFactor
A boolean indicating whether we prefer to use the size factor or the num
factor.
|
VERSION
Constructor and Description |
---|
InPlace()
The default constructor.
|
Modifier and Type | Method and Description |
---|---|
Graph |
addCleanupJobs(Graph workflow)
Adds cleanup jobs to the workflow.
|
private void |
addCleanUpJobs(String site,
Set leaves,
Graph workflow)
Adds cleanup jobs for the part of the workflow scheduled to a particular site.
A breadth-first search strategy is implemented, based on the depth of the job
in the workflow.
|
protected void |
applyJobPriorities(Graph workflow)
Adds job priorities to the jobs in the workflow on the basis of
the levels in the traversal order given by the iterator.
|
private List<GraphNode> |
clusterCleanupGraphNodes(List<GraphNode> cleanupNodes,
HashMap cleanedBy,
String site,
int level)
Takes in a list of cleanup nodes, one per node (compute/stageout job)
whose files need to be deleted, and clusters them into a smaller set
of cleanup nodes.
|
private GraphNode |
createClusteredCleanupGraphNode(List<GraphNode> nodes,
HashMap cleanedBy,
String site,
int level,
int index)
Creates a clustered cleanup graph node that aggregates multiple cleanup nodes
into one node.
|
protected String |
generateCleanupID(Job job)
Returns the identifier that is to be assigned to the cleanup job.
|
String |
generateClusteredJobID(String site,
int level,
int index)
Generates an ID for a clustered cleanup job.
|
private int |
getClusterSize(int size)
Returns the number of cleanup jobs clustered into one job per level.
|
String |
getDefaultCleanupMaxJobsPropertyKey()
Returns the property key that can be used to set the max jobs for the
default category associated with the cleanup jobs.
|
protected String |
getSiteForCleanup(Job job)
Returns the site to be used for the cleanup algorithm.
|
void |
initialize(PegasusBag bag,
CleanupImplementation impl)
Initializes the class.
|
protected void |
reduceDependency(GraphNode node)
Reduces the number of edges between the node and its parents.
|
protected void |
reset()
Resets the internal data structures.
|
private void |
setDepth_ResMap(List roots)
A BFS implementation to set the depth value (roots have depth 1) and also
to populate mResMap, mResMapLeaves, and mResMapRoots, which contain all the
jobs that are assigned to a particular resource.
|
protected boolean |
typeNeedsCleanUp(GraphNode node)
Checks to see which job types are required to be looked at for cleanup.
|
protected boolean |
typeNeedsCleanUp(int type)
Checks to see which job types are required to be looked at for cleanup.
|
protected boolean |
typeStageOut(int type)
Checks to see if job type is a stageout job type.
|
public static final String CLEANUP_JOB_PREFIX
public static final String DEFAULT_MAX_JOBS_FOR_CLEANUP_CATEGORY
public static final int DEFAULT_CLUSTERED_CLEANUP_JOBS_PER_LEVEL
private HashMap mResMap
private HashMap mResMapLeaves
private HashMap mResMapRoots
private int mMaxDepth
private HashSet mDoNotClean
private CleanupImplementation mImpl
private PegasusProperties mProps
private LogManager mLogger
private int mCleanupJobsPerLevel
private int mCleanupJobsSize
private boolean mUseSizeFactor
public void initialize(PegasusBag bag, CleanupImplementation impl)
initialize in interface CleanupStrategy
bag - bag of initialization objects
impl - the implementation instance that creates cleanup jobs
public Graph addCleanupJobs(Graph workflow)
addCleanupJobs in interface CleanupStrategy
workflow - the workflow to add cleanup jobs to.
protected void reset()
private void setDepth_ResMap(List roots)
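The traversal summarized for this method (roots have depth 1, jobs grouped by the site they are mapped to) can be illustrated with a minimal sketch. The `Node` class, the `setDepths` helper, and the site names are hypothetical stand-ins, not the Pegasus `GraphNode` API, and this version assigns each node the depth at which a plain BFS first reaches it, which may differ from what the actual method does.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class DepthBySiteSketch {
    /** Hypothetical stand-in for a workflow node; not the Pegasus GraphNode class. */
    static final class Node {
        final String id;
        final String site;
        final List<Node> children = new ArrayList<>();
        int depth = 0; // 0 means "not visited yet"

        Node(String id, String site) {
            this.id = id;
            this.site = site;
        }
    }

    /**
     * Breadth-first traversal from the roots. Roots get depth 1, every other node
     * gets the depth at which it is first reached. Node ids are grouped by site
     * in bySite. Returns the maximum depth seen.
     */
    static int setDepths(List<Node> roots, Map<String, Set<String>> bySite) {
        Deque<Node> queue = new ArrayDeque<>();
        for (Node root : roots) {
            root.depth = 1;
            queue.addLast(root);
        }
        int maxDepth = 0;
        while (!queue.isEmpty()) {
            Node current = queue.removeFirst();
            maxDepth = Math.max(maxDepth, current.depth);
            bySite.computeIfAbsent(current.site, k -> new TreeSet<>()).add(current.id);
            for (Node child : current.children) {
                if (child.depth == 0) { // enqueue each node only once
                    child.depth = current.depth + 1;
                    queue.addLast(child);
                }
            }
        }
        return maxDepth;
    }

    public static void main(String[] args) {
        Node a = new Node("a", "siteA");
        Node b = new Node("b", "siteA");
        Node c = new Node("c", "siteB");
        a.children.add(b);
        b.children.add(c);
        Map<String, Set<String>> bySite = new HashMap<>();
        int maxDepth = setDepths(Arrays.asList(a), bySite);
        System.out.println("max depth = " + maxDepth + ", jobs by site = " + bySite);
    }
}
```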
roots - List of GraphNode objects that are roots
private void addCleanUpJobs(String site, Set leaves, Graph workflow)
site - the site ID
leaves - the leaf jobs that are scheduled to site
workflow - the Graph into which new cleanup jobs can be added
protected void reduceDependency(GraphNode node)
For the node X, look at the parents of X. For each parent Y, check whether any other parent Z of X is an ancestor of Y, i.e. whether a path exists between Y and Z. If such a path exists, then the edge from Z to X is redundant and can be removed.
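A minimal sketch of this kind of edge reduction, assuming the workflow is a DAG represented as a plain map from each node to its direct parents. The `reduce` helper and the map representation are illustrative assumptions, not the class's actual data structures.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ReduceDependencySketch {
    /**
     * parents maps a node id to the ids of its direct parents (assumed to be a DAG).
     * Removes from the parents of x every parent that is also an ancestor of
     * another parent of x, and returns the set of parents that were removed.
     */
    static Set<String> reduce(Map<String, Set<String>> parents, String x) {
        Set<String> redundant = new TreeSet<>();
        Set<String> direct = parents.get(x);
        if (direct == null || direct.isEmpty()) {
            return redundant;
        }
        Set<String> visited = new HashSet<>();
        Deque<String> queue = new ArrayDeque<>(direct);
        while (!queue.isEmpty()) {
            String y = queue.removeFirst();
            if (!visited.add(y)) {
                continue; // already expanded
            }
            Set<String> grandParents = parents.get(y);
            if (grandParents == null) {
                continue;
            }
            for (String z : grandParents) {
                if (direct.contains(z)) {
                    redundant.add(z); // z reaches x through y, so z -> x is redundant
                }
                queue.addLast(z); // keep climbing towards the roots
            }
        }
        direct.removeAll(redundant);
        return redundant;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> parents = new HashMap<>();
        parents.put("x", new TreeSet<>(Arrays.asList("y", "z")));
        parents.put("y", new TreeSet<>(Arrays.asList("z")));
        System.out.println("removed parents: " + reduce(parents, "x"));
        System.out.println("remaining parents of x: " + parents.get("x"));
    }
}
```

In the `main` example, `z` is both a parent of `x` and a parent of `y`, so ordering between `z` and `x` is already enforced through `y`.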
node - the node whose parent edges need to be reduced.
protected void applyJobPriorities(Graph workflow)
workflow - the workflow on which to apply job priorities.
protected String generateCleanupID(Job job)
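As described for CLEANUP_JOB_PREFIX, the cleanup job ID is the prefix followed by the ID of the parent compute job. A minimal sketch of that concatenation follows; the prefix value and the sample job ID are placeholders chosen for illustration, since the actual constant value is not shown in this document.

```java
class CleanupIdSketch {
    // Placeholder value for illustration only; the real constant's value is an assumption here.
    static final String CLEANUP_JOB_PREFIX = "clean_up_";

    /** Builds a cleanup job ID as prefix + parent job ID. */
    static String generateCleanupID(String parentJobId) {
        return CLEANUP_JOB_PREFIX + parentJobId;
    }

    public static void main(String[] args) {
        // "analyze_ID0000003" is a made-up parent job ID.
        System.out.println(generateCleanupID("analyze_ID0000003"));
    }
}
```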
job - the job with which the cleanup job is primarily associated.
public String generateClusteredJobID(String site, int level, int index)
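The ID of a clustered cleanup job is derived from the site, the workflow level, and the index of the job on that level. The exact format is not given in this document, so the sketch below uses a hypothetical layout (prefix, site, level, index joined with underscores) purely to show how the three parameters can yield a unique ID per clustered job.

```java
class ClusteredIdSketch {
    // Placeholder prefix; the real constant's value is an assumption here.
    static final String CLEANUP_JOB_PREFIX = "clean_up_";

    /** Hypothetical ID layout: prefix + site + "_level_" + level + "_" + index. */
    static String generateClusteredJobID(String site, int level, int index) {
        return CLEANUP_JOB_PREFIX + site + "_level_" + level + "_" + index;
    }

    public static void main(String[] args) {
        // Two clustered jobs on the same site and level differ only by index.
        System.out.println(generateClusteredJobID("local", 3, 0));
        System.out.println(generateClusteredJobID("local", 3, 1));
    }
}
```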
site - the site associated with the cleanup jobs
level - the level of the workflow
index - the index of the job on that level
protected boolean typeNeedsCleanUp(GraphNode node)
node - the graph node
protected boolean typeNeedsCleanUp(int type)
type - the type of the job.
protected boolean typeStageOut(int type)
type - the type of the job.
protected String getSiteForCleanup(Job job)
job - the job
public String getDefaultCleanupMaxJobsPropertyKey()
private List<GraphNode> clusterCleanupGraphNodes(List<GraphNode> cleanupNodes, HashMap cleanedBy, String site, int level)
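Clustering takes the per-job cleanup nodes of one level and groups them into a smaller number of nodes. One simple way to do this, shown here as an assumption rather than the class's actual strategy, is to split the list into consecutive chunks of a fixed cluster size. The generic `partition` helper is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ClusterPartitionSketch {
    /** Splits nodes into consecutive clusters of at most clusterSize elements. */
    static <T> List<List<T>> partition(List<T> nodes, int clusterSize) {
        if (clusterSize < 1) {
            throw new IllegalArgumentException("clusterSize must be at least 1");
        }
        List<List<T>> clusters = new ArrayList<>();
        for (int start = 0; start < nodes.size(); start += clusterSize) {
            int end = Math.min(start + clusterSize, nodes.size());
            clusters.add(new ArrayList<>(nodes.subList(start, end)));
        }
        return clusters;
    }

    public static void main(String[] args) {
        // Five made-up cleanup node ids grouped two at a time.
        List<String> cleanupNodes = Arrays.asList("c1", "c2", "c3", "c4", "c5");
        System.out.println(partition(cleanupNodes, 2));
    }
}
```

Each resulting chunk would then be turned into one clustered node, which is the role createClusteredCleanupGraphNode plays in this class.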
cleanupNodes - List of stub cleanup nodes created corresponding to a job in the workflow that needs cleanup. The cleanup jobs have content as a CleanupJobContent
cleanedBy - a map that tracks which file was deleted by which cleanup job
site - the site associated with the cleanup jobs
level - the level of the workflow
private GraphNode createClusteredCleanupGraphNode(List<GraphNode> nodes, HashMap cleanedBy, String site, int level, int index)
nodes - list of cleanup nodes that are to be aggregated
cleanedBy - a map that tracks which file was deleted by which cleanup job
site - the site associated with the cleanup jobs
level - the level of the workflow
index - the index of the cleanup job for that level
private int getClusterSize(int size)
size - the number of cleanup jobs created by the algorithm before clustering for the level.
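The fields mUseSizeFactor, mCleanupJobsSize, and mCleanupJobsPerLevel suggest two ways of choosing the cluster size: a fixed number of jobs per cluster, or a fixed number of clustered jobs per level. The sketch below is one plausible reading of that choice, not the confirmed implementation; the parameter names mirror the fields, but passing them as arguments is an assumption made to keep the example self-contained.

```java
class ClusterSizeSketch {
    /**
     * Returns how many cleanup jobs go into one clustered job.
     * With the size factor, the cluster size is fixed. Otherwise the jobs of the
     * level are spread over cleanupJobsPerLevel clusters, rounding up.
     */
    static int getClusterSize(int size, boolean useSizeFactor,
                              int cleanupJobsSize, int cleanupJobsPerLevel) {
        if (useSizeFactor) {
            return cleanupJobsSize;
        }
        // Integer ceiling of size / cleanupJobsPerLevel.
        return (size + cleanupJobsPerLevel - 1) / cleanupJobsPerLevel;
    }

    public static void main(String[] args) {
        // 10 cleanup jobs on a level, at most 4 clustered jobs per level.
        System.out.println(getClusterSize(10, false, 2, 4));
        // Same level, but with a fixed cluster size of 2.
        System.out.println(getClusterSize(10, true, 2, 4));
    }
}
```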