{"id":137978,"date":"2021-12-03T15:42:09","date_gmt":"2021-12-03T20:42:09","guid":{"rendered":"http:\/\/www.bu.edu\/tech\/?page_id=137978"},"modified":"2024-08-28T10:48:00","modified_gmt":"2024-08-28T14:48:00","slug":"advanced-batch","status":"publish","type":"page","link":"https:\/\/www.bu.edu\/tech\/support\/research\/system-usage\/running-jobs\/advanced-batch\/","title":{"rendered":"Advanced Batch System Usage"},"content":{"rendered":"<h2>Content<\/h2>\n<ul>\n<li><a href=\"#depend\">Job dependency control<\/a><\/li>\n<li><a href=\"#array\">Submitting Array Jobs<\/a><\/li>\n<li><a href=\"#sge_request\">Customize qsub settings with .sge_request<\/a><\/li>\n<li><a href=\"#jobenv\">Job Environment<\/a><\/li>\n<\/ul>\n<h2 style=\"margin-bottom: 1.em; margin-top: 2.5em;\"><a name=\"depend\"><\/a>Job dependency control<\/h2>\n<p>Out of necessity or for better throughput, an application may spawn a series of batch jobs which may be required to run in a specific order. For these applications, the job dependency can be controlled, in general, with the <nobr><code><span class=\"command\">qsub<\/span> -hold_jid<\/code><\/nobr> option (see <a href=\"http:\/\/scv.bu.edu\/cgi-bin\/perl\/manscript\/SCC\/qsub\/1\">man qsub<\/a>). The procedure described below is applicable to both single- and multi-processor batch jobs. 
Here are two examples:<\/p>\n<ul>\n<li><b>Example 1.<\/b> All batch jobs in the group need to run in a specific sequence.\n<pre class=\"code-block\"><code><span class=\"prompt\">scc1$<\/span> <span class=\"command\">qsub -N<\/span> <span class=\"placeholder\">job1 script1<\/span>\r\n<span class=\"prompt\">scc1$<\/span> <span class=\"command\">qsub -N<\/span> <span class=\"placeholder\">job2<\/span> -hold_jid <span class=\"placeholder\">job1 script2<\/span>\r\n<span class=\"prompt\">scc1$<\/span> <span class=\"command\">qsub -N<\/span> <span class=\"placeholder\">job3<\/span> -hold_jid <span class=\"placeholder\">job2 script3<\/span><\/code><\/pre>\n<\/li>\n<li><b>Example 2.<\/b> A designated job must wait until the remaining jobs in the group have completed (for example, a post-processing step).<br \/>\nIn this example, <code><span class=\"placeholder\">lastjob<\/span><\/code> won&#8217;t start until <code><span class=\"placeholder\">job1, job2,<\/span><\/code> and <code><span class=\"placeholder\">job3<\/span><\/code> have completed.\n<pre class=\"code-block\"><code><span class=\"prompt\">scc1$<\/span> <span class=\"command\">qsub -N<\/span> <span class=\"placeholder\">job1 script1<\/span>\r\n<span class=\"prompt\">scc1$<\/span> <span class=\"command\">qsub -N<\/span> <span class=\"placeholder\">job2 script2<\/span>\r\n<span class=\"prompt\">scc1$<\/span> <span class=\"command\">qsub -N<\/span> <span class=\"placeholder\">job3 script3<\/span>\r\n<span class=\"prompt\">scc1$<\/span> <span class=\"command\">qsub -N<\/span> <span class=\"placeholder\">lastjob<\/span> -hold_jid <span class=\"placeholder\">\"job*\" script4<\/span><\/code><\/pre>\n<\/li>\n<\/ul>\n<p>In both examples, using the &#8220;<code><span class=\"command\">-N<\/span><\/code>&#8221; option to assign job names makes job identification easier because the names of the referenced jobs are known in advance; this is especially helpful in the second example, where a wildcard (*) matches all three jobs at once. 
Note that the above procedures are also applicable to parallel batch jobs (<i>i.e.<\/i> with the <code>-pe<\/code> switch). The manual job-by-job submission shown above is a conceptual demonstration; in practice, it is more convenient to put all the steps into a script. For a complete example, visit <a href=\"https:\/\/www.bu.edu\/tech\/support\/research\/software-and-programming\/common-languages\/matlab\/matlab-batch#ARRAYJOB\">Running Multiple Batch Jobs With qsub Array Job Option<\/a>.<\/p>\n<h2 style=\"margin-bottom: 1.em; margin-top: 2.5em;\"><a name=\"array\"><\/a>Submitting Array Jobs<\/h2>\n<p>If you submit many jobs at the same time that are largely identical, you should submit them as array jobs. Array jobs are easier to manage, faster to submit, and greatly reduce the load on the scheduler. An array job executes multiple independent copies of the same job script. These multiple copies are referred to as &#8220;tasks&#8221; and are scheduled independently as resources become available, i.e. the tasks are not scheduled all at once. The number of tasks to be executed is set using the <nobr><code>-t <span class=\"placeholder\">start-end[:step]<\/span><\/code> option<\/nobr> to the <code><span class=\"command\">qsub<\/span><\/code> command, where <code><span class=\"placeholder\">start<\/span><\/code> is the index of the first task (it must be 1 or greater; it cannot be 0), <code><span class=\"placeholder\">end<\/span><\/code> is the index number of the last task, and <code><span class=\"placeholder\">step<\/span><\/code> is an optional step size (step size defaults to 1 if unspecified). Here&#8217;s an example of using this command:<\/p>\n<pre class=\"code-block\"><code><span class=\"prompt\">scc1$<\/span> <span class=\"command\">qsub -t 1-25 <\/span><span class=\"placeholder\">myscript.sh<\/span>\r\n<\/code><\/pre>\n<p>The above command will submit an array job consisting of 25 tasks, numbered from 1 to 25. 
Since the step size was not specified, the default step size of 1 will be used. Each task will independently execute the <code><span class=\"placeholder\">myscript.sh<\/span><\/code> job file. The batch system sets the <code>SGE_TASK_ID<\/code> environment variable, which can be used inside the script to pass the task ID to the program. Below is an example of how you can utilize that environment variable in the job file:<\/p>\n<pre class=\"code-block\"><code><span class=command>#!\/bin\/bash -l<\/span>\r\n<span class=\"comment\"># Specify that we will be running an Array job with 25 tasks numbered 1-25<\/span>\r\n#$ -t 1-25\r\n<span class=\"comment\"># Request 1 core for my job<\/span>\r\n#$ -pe omp 1\r\n<span class=\"comment\"># Give a name to my job<\/span>\r\n#$ -N my_array_job\r\n<span class=\"comment\"># Join the output and error streams<\/span>\r\n#$ -j y\r\n\r\n<span class=\"comment\"># Run my R script and give it the $SGE_TASK_ID environment \r\n# variable as a command-line argument<\/span>\r\n<span class=\"command\">Rscript<\/span> <span class=\"placeholder\">myRfile.R<\/span> $SGE_TASK_ID\r\n<\/code><\/pre>\n<p>There is a more <a href=\"https:\/\/www.bu.edu\/tech\/support\/research\/system-usage\/running-jobs\/batch-script-examples\/#ARRAY\"> advanced example code here <\/a>as well.<\/p>\n<p>When running the <code><span class=\"command\">qstat -u<\/span> <span class=\"placeholder\"> USER_ID <\/span><\/code> command to check on the status of the Array job, the running tasks will be listed as separate lines, each with the corresponding task ID visible under the &#8220;ja-task-ID&#8221; column. All the remaining tasks that are not running, i.e. 
are queued, will be listed as a single line, their task IDs will be aggregated also under the &#8220;ja-task-ID&#8221; column (see the code block below, at the far right).<\/p>\n<pre class=\"code-block\"><code><span class=\"prompt\">[aaly@scc1]$<\/span> <span class=\"command\">qstat -u<\/span> <span class=\"placeholder\">aaly<\/span>\r\njob-ID  prior   name       user         state submit\/start at     queue                          slots ja-task-ID \r\n-----------------------------------------------------------------------------------------------------------------\r\n4960245 0.11489 my_array_j aaly         r     02\/25\/2021 20:31:37 johnsonlab.q-pub@scc-cb4.scc.b    14 1\r\n4960245 0.11012 my_array_j aaly         r     02\/25\/2021 20:31:37 saimath-pub@scc-gc4.scc.bu.edu    14 2\r\n4960245 0.10774 my_array_j aaly         r     02\/25\/2021 20:31:37 saimath-pub@scc-gc4.scc.bu.edu    14 3\r\n4960245 0.10630 my_array_j aaly         r     02\/25\/2021 20:31:37 montilab-pub@scc-zl4.scc.bu.ed    14 4\r\n4960245 0.10535 my_array_j aaly         r     02\/25\/2021 20:31:37 montilab-pub@scc-zl4.scc.bu.ed    14 5\r\n4960245 0.10535 my_array_j aaly         r     02\/25\/2021 20:31:37 boas-pub@scc-x07.scc.bu.edu       14 6\r\n4960245 0.10535 my_array_j aaly         r     02\/25\/2021 20:31:37 boas-pub@scc-x07.scc.bu.edu       14 7\r\n4960245 0.10535 my_array_j aaly         r     02\/25\/2021 20:31:37 peloso-pub@scc-tr4.scc.bu.edu     14 8\r\n4960245 0.10535 my_array_j aaly         r     02\/25\/2021 20:31:37 kolaczyk-pub@scc-ym1.scc.bu.ed    14 9\r\n4960245 0.10535 my_array_j aaly         r     02\/25\/2021 20:31:37 anderssongroup-pub@scc-zk1.scc    14 10\r\n4960245 0.10535 my_array_j aaly         r     02\/25\/2021 20:31:37 apolkovnikov-pub@scc-zk2.scc.b    14 11\r\n4960245 0.00000 my_array_j aaly         qw    02\/25\/2021 20:29:33                                   14 12-25:1\r\n<\/code><\/pre>\n<p>In the above output of <code><span class=\"command\">qstat<\/span><\/code> 
the submitted array job is composed of 25 tasks. Tasks 1-11 are running, and each is listed as a separate entry. Tasks 12-25 are still in the queue waiting to be scheduled; they appear as a single entry.<\/p>\n<p>Notice that all the tasks carry the same job &#8220;name&#8221;; here they are called <code><span class=\"placeholder\">my_array_job<\/span><\/code> (there&#8217;s no room to display the entire name in the column, so it is truncated). It is important to give a distinct name to your array jobs. In addition, it is not recommended to assign a name to the output file using the <code>-o<\/code> option in the qsub file, because all the tasks would then write to the same file.<\/p>\n<p>If an array job was submitted by mistake, delete all of its tasks with:<\/p>\n<pre class=\"code-block\">\r\n<code><span class=\"prompt\">scc1$<\/span> <span class=\"command\">qdel<\/span> <span class=\"placeholder\">job_ID<\/span>\r\n<\/code><\/pre>\n<p>Once the array job has finished, there will be an output file for each task that has run. 
Use the <code><span class=\"command\">ls<\/span><\/code> command to list them:<\/p>\n<pre class=\"code-block\"><code><span class=\"prompt\">[aaly@scc1]$<\/span> <span class=\"command\">ls<\/span>\r\nmyscript.sh               my_array_job.o4960245.12  my_array_job.o4960245.16  my_array_job.o4960245.2   my_array_job.o4960245.23  my_array_job.o4960245.4  my_array_job.o4960245.8\r\nmy_array_job.o4960245.1   my_array_job.o4960245.13  my_array_job.o4960245.17  my_array_job.o4960245.20  my_array_job.o4960245.24  my_array_job.o4960245.5  my_array_job.o4960245.9\r\nmy_array_job.o4960245.10  my_array_job.o4960245.14  my_array_job.o4960245.18  my_array_job.o4960245.21  my_array_job.o4960245.25  my_array_job.o4960245.6  myRfile.R\r\nmy_array_job.o4960245.11  my_array_job.o4960245.15  my_array_job.o4960245.19  my_array_job.o4960245.22  my_array_job.o4960245.3   my_array_job.o4960245.7\r\n<\/code><\/pre>\n<p>The output file naming format is <code><span class=\"placeholder\">job_name.o&lt;job_id&gt;.&lt;task_id&gt;<\/span><\/code>. In the above example, the job name is <code><span class=\"placeholder\">my_array_job<\/span><\/code>, the job id is <code><span class=\"placeholder\">4960245<\/span><\/code>, and the task id is between 1 and 25. Thus, we see the corresponding output file for each task in the array job.<\/p>\n<h2 style=\"margin-bottom: 1.em; margin-top: 2.5em;\"><a name=\"sge_request\"><\/a>Customize qsub settings with .sge_request<\/h2>\n<p>With the current batch scheduler, a user may create a <code>.sge_request<\/code> file (in their home directory) to customize preferred or frequently used <code><span class=\"command\">qsub<\/span><\/code> settings. These settings will be in effect for all subsequent <code><span class=\"command\">qsub<\/span><\/code> (or <code><span class=\"command\">qsh<\/span><\/code>) batch submissions. 
Here is what the <code>.sge_request<\/code> file might look like:<\/p>\n<pre class=\"code-block\"><code><span class=\"comment\"># .sge_request file must reside in home directory\r\n#<\/span>\r\n<span class=\"comment\"># Send me mail when job gets aborted or ends normally<\/span>\r\n-m ae\r\n\r\n<span class=\"comment\"># I want email sent to this email address (instead of default BU email)<\/span>\r\n-M <span class=\"placeholder\">myname@gmail.com<\/span>\r\n\r\n<span class=\"comment\"># I have multiple projects, I want jobs charged to this project<\/span>\r\n-P <span class=\"placeholder\">projectname<\/span><\/code><\/pre>\n<p>Your batch script, <code><span class=\"placeholder\">my_batch_script<\/span><\/code>, need not include those options already defined in <code>.sge_request<\/code>. However, if it does, the options in the script take precedence over those in <code>.sge_request<\/code>. Furthermore, any option given on the <code><span class=\"command\">qsub<\/span><\/code> command line supersedes the same option in both <code>.sge_request<\/code> and <code>my_batch_script<\/code>. Here is an example:<\/p>\n<pre class=\"code-block\"><code><span class=\"prompt\">scc1$<\/span> <span class=\"command\">qsub<\/span> -m a -l h_rt=48:00:00 <span class=\"placeholder\">my_batch_script<\/span><\/code><\/pre>\n<p>The above batch job will send email only if the job is aborted, not when it ends normally, overriding the <code>-m ae<\/code> setting in <code>.sge_request<\/code>. 
All other options specified in <code>.sge_request<\/code> and on the command line also take effect.<\/p>\n<h2 style=\"margin-bottom: 1.em; margin-top: 2.5em;\"><a name=\"jobenv\"><\/a>Job Environment<\/h2>\n<p>A few <i>pseudo<\/i> environment variables may be used in the paths specified with the <code>-e<\/code> and <code>-o<\/code> options:<\/p>\n<p><code>$HOME<\/code> &#8211; home directory on execution machine<br \/>\n<code>$USER<\/code> &#8211; user ID of job owner<br \/>\n<code>$JOB_ID<\/code> &#8211; current job ID<br \/>\n<code>$JOB_NAME<\/code> &#8211; current job name (see -N option)<br \/>\n<code>$HOSTNAME<\/code> &#8211; name of the execution host<br \/>\n<code>$TASK_ID<\/code> &#8211; array job task index number<\/p>\n<p>These <i>pseudo<\/i> environment variables can only be used in the above scheduler options and cannot be used elsewhere in the script. They should not be confused with the regular environment variables listed below, which cannot be used within scheduler options but can be used within the script:<\/p>\n<p><code>SGE_O_HOST<\/code> &#8211; The name of the host on which the submitting client is running.<br \/>\n<code>SGE_TASK_ID<\/code> &#8211; The index number of the current array job task.<br \/>\n<code>SGE_TASK_FIRST<\/code> &#8211; The index number of the first array job task.<br \/>\n<code>SGE_TASK_LAST<\/code> &#8211; The index number of the last array job task.<br \/>\n<code>SGE_TASK_STEPSIZE<\/code> &#8211; The step size of the array job specification.<br \/>\n<code>HOME<\/code> &#8211; The user&#8217;s home directory path.<br \/>\n<code>HOSTNAME<\/code> &#8211; The hostname of the node on which the job is running.<br \/>\n<code>JOB_ID<\/code> &#8211; A unique identifier assigned by the scheduler when the job was submitted.<br \/>\n<code>JOB_NAME<\/code> &#8211; The job name. 
<br \/>\n<code>NHOSTS<\/code>      &#8211; The number of hosts in use by a parallel job.<br \/>\n<code>NSLOTS<\/code>       &#8211; The number of queue slots in use by a parallel job.<\/p>\n<p>See the manual of the <code>qsub<\/code> command for the full list of the SGE environment.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Content Job dependency control Submitting Array Jobs Customize qsub settings with .sge_request Job Environment Job dependency control Out of necessity or for better throughput, an application may spawn a series of batch jobs which may be required to run in a specific order. For these applications, the job dependency can be controlled, in general, with&#8230;<\/p>\n","protected":false},"author":1692,"featured_media":0,"parent":137962,"menu_order":8,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/137978"}],"collection":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/users\/1692"}],"replies":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/comments?post=137978"}],"version-history":[{"count":4,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/137978\/revisions"}],"predecessor-version":[{"id":153998,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/137978\/revisions\/153998"}],"up":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/137962"}],"wp:attachment":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/media?parent=137978"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}