Job Dependency Control
Out of necessity or for better throughput, an application may spawn a series of batch jobs that must run in a specific order. In general, these job dependencies can be controlled with qsub’s “-hold_jid” switch (see man qsub). The procedure described below applies to both single- and multi-processor batch jobs. Here are two examples:
- Example 1. All batch jobs in the group run in a specific sequence.
scc1% qsub -N job1 script1
scc1% qsub -N job2 -hold_jid job1 script2
scc1% qsub -N job3 -hold_jid job2 script3
- Example 2. A designated job must wait until the remaining jobs in the group have completed (aka post-processing).
In this example, lastjob won’t start until job1, job2, and job3 have completed.
scc1% qsub -N job1 script1
scc1% qsub -N job2 script2
scc1% qsub -N job3 script3
scc1% qsub -N lastjob -hold_jid "job*" script4
In both examples, using “-N” to assign job names makes job identification easier because the names of the referenced jobs are known in advance. This is especially helpful in the second example, where a single wildcard pattern ("job*") matches job1, job2, and job3. Note that the above procedures also apply to parallel batch jobs (i.e., those submitted with the -pe switch). The manual job-by-job submission shown above is for conceptual demonstration; in practice, incorporating all the steps into a script is more practical. For a complete example, visit Running Multiple MATLAB Tasks.
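As a rough illustration of scripting the two patterns above, the following is a minimal sketch rather than a site-provided tool. The function names submit_chain and submit_with_postprocessing and the QSUB variable are hypothetical, introduced only for this example; the qsub switches are the ones already shown (-N and -hold_jid). By default the sketch only prints the qsub commands it would run, so it can be tried safely before submitting anything.

```shell
#!/bin/bash
# Dry run by default: QSUB expands to "echo qsub", so commands are printed,
# not submitted. Set QSUB=qsub in the environment to submit for real.
# (QSUB and both function names are hypothetical, for illustration only.)
QSUB=${QSUB:-echo qsub}

# Example 1 pattern: each job is held until the job submitted before it ends.
# Jobs are named job1, job2, ... in the order the scripts are given.
submit_chain() {
    local prev="" i=1 script name
    for script in "$@"; do
        name="job$i"
        if [ -z "$prev" ]; then
            $QSUB -N "$name" "$script"
        else
            $QSUB -N "$name" -hold_jid "$prev" "$script"
        fi
        prev="$name"
        i=$((i + 1))
    done
}

# Example 2 pattern: the first argument is the post-processing script; the
# remaining scripts are submitted without dependencies, and the final job
# is held on all of them through the "job*" wildcard.
submit_with_postprocessing() {
    local last="$1" i=1 script
    shift
    for script in "$@"; do
        $QSUB -N "job$i" "$script"
        i=$((i + 1))
    done
    $QSUB -N lastjob -hold_jid "job*" "$last"
}

# Print the three commands of Example 1.
submit_chain script1 script2 script3
```

Calling submit_with_postprocessing script4 script1 script2 script3 instead would print the four commands of Example 2. Because both functions reuse the names job1, job2, and so on, running both in the same session could make the "job*" wildcard match jobs from the other group; choosing a distinct name prefix per group avoids that.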