{"id":137976,"date":"2021-12-03T15:41:04","date_gmt":"2021-12-03T20:41:04","guid":{"rendered":"http:\/\/www.bu.edu\/tech\/?page_id=137976"},"modified":"2023-09-28T08:05:51","modified_gmt":"2023-09-28T12:05:51","slug":"allocating-memory-for-your-job","status":"publish","type":"page","link":"https:\/\/www.bu.edu\/tech\/support\/research\/system-usage\/running-jobs\/allocating-memory-for-your-job\/","title":{"rendered":"Allocating Memory for your Job"},"content":{"rendered":"<h2>Estimating Memory Demands<\/h2>\n<p>It can be challenging to estimate how much memory your job will require before submission. Benchmarking tests for specific applications can provide a guide, but initially it is best to run your job and review the SCC&#8217;s job status reports. Each job is allocated <b>virtual memory<\/b> throughout the job&#8217;s runtime. <b>Virtual memory<\/b> is the amount of memory the job requires to run, and its usage can be viewed with three commands: <code><span class=\"command\">qstat<\/span><\/code>, <code><span class=\"command\">top<\/span><\/code>, and <code><span class=\"command\">qacct<\/span><\/code>. <code><span class=\"command\">qstat<\/span><\/code> and <code><span class=\"command\">top<\/span><\/code> allow you to monitor your jobs&#8217; processes in real time, while <code><span class=\"command\">qacct<\/span><\/code> provides a full report after a job has finished. Guidelines for submitting batch jobs with large memory requirements are available <a href=\"https:\/\/www.bu.edu\/tech\/support\/research\/system-usage\/running-jobs\/batch-script-examples\/#MEMORY\">here<\/a>.<\/p>\n<ul>\n<li><a href=\"#QSTAT\">qstat<\/a><\/li>\n<li><a href=\"#TOP\">top<\/a><\/li>\n<li><a href=\"#QACCT\">qacct<\/a><\/li>\n<\/ul>\n<h2 style=\"margin-bottom: 1.em;\"><a id=\"QSTAT\" name=\"QSTAT\"><\/a>qstat<\/h2>\n<p><code><span class=\"command\">qstat<\/span><\/code> is an SGE command that reports the status of jobs submitted to the cluster. 
To see more details of a specific job running on the cluster, you will need to run <code><span class=\"command\">qstat<\/span><\/code> with the <code><span class=\"command\">-j<\/span> <span class=\"placeholder\">job_ID<\/span><\/code> flag, specifying the <code><span class=\"placeholder\">job_ID<\/span><\/code> assigned to your job, which can be found with <code><span class=\"command\">qstat -u<\/span> <span class=\"placeholder\">userID<\/span><\/code>.<\/p>\n<pre class=\"code-block\"><code><span class=\"prompt\">scc % <\/span><span class=\"command\">qstat -u <\/span><span class=\"placeholder\">userID<\/span>\r\n<span class=\"output\">job-ID  prior   name       user         state submit\/start at     queue                          slots ja-task-ID\r\n-----------------------------------------------------------------------------------------------------------------\r\n<span class=\"highlight\"><span style=\"color: #1a206d; font-weight: 800;\">4717015<\/span><\/span> 0.10072 my_job1   userID      r     03\/01\/2018 09:35:08 p8@scc-pf2.scc.bu.edu              8\r\n4717016 0.10072 my_job2   userID      r     03\/01\/2018 09:35:08 p8@scc-pf2.scc.bu.edu              8\r\n<\/span>\r\n<span class=\"prompt\">scc % <\/span><span class=\"command\">qstat <\/span>-j <span class=\"placeholder\">4717015<\/span>\r\n<span class=\"output\">==============================================================\r\njob_number:                 4717015\r\nexec_file:                  job_scripts\/4717016\r\nsubmission_time:            Thu Mar  1 09:34:35 2018\r\nowner:                      userID\r\n...\r\njob_name:                   my_job1\r\nstdout_path_list:           NONE:NONE:\/projectnb\/scv\/userID\/scripts\/log\/\r\njobshare:                   0\r\nenv_list:                   PATH=\/projectnb\/scv\/userID\/scripts:\/bin:\/usr\/bin:\/usr\/local\/sbin:\/usr\/sbi\r\njob_args:                   sub001\r\nscript_file:                my_job1.qsub\r\nparallel environment:  omp8 range: 
8\r\nverify_suitable_queues:     2\r\nproject:                    scv\r\n<span class=\"highlight\"><span style=\"color: #1a206d; font-weight: 800;\">usage    1:<\/span>                 cpu=7:58:19, mem=5319.88221 GBs, io=66.36036, vmem=15.690G,<span style=\"color: #1a206d; font-weight: 800;\"> maxvmem=15.886G<\/span><\/span>\r\nscheduling info:            (Collecting of scheduler job information is turned off)\r\n<\/span><\/code><\/pre>\n<p>The <b>usage 1<\/b> line contains the <b>maxvmem<\/b> which reports the maximum virtual memory that has been used during the cpu runtime. In this example, <code><span class=\"placeholder\">my_job1<\/span><\/code> requires 16GB of total memory during the first 8 hours of runtime.<\/p>\n<h2 style=\"margin-bottom: 1.em;\"><a id=\"TOP\" name=\"TOP\"><\/a>top<\/h2>\n<p><code><span class=\"command\">top<\/span><\/code> is a command that shows the active processes on a system. In order to see your active processes on the compute node your job is running on, you will need to run <code><span class=\"command\">top<\/span><\/code> on that compute node. We can do this by remotely accessing the compute node running your job using <code><span class=\"command\">ssh<\/span><\/code>. 
The compute node running your job can be identified using the <code><span class=\"command\">qstat -u<\/span> <span class=\"placeholder\">userID<\/span><\/code> command.<\/p>\n<pre class=\"code-block\"><code><span class=\"prompt\">scc % <\/span><span class=\"command\">qstat -u<\/span> <span class=\"placeholder\">userID<\/span>\r\n<span class=\"output\">job-ID  prior   name       user         state submit\/start at     queue                          slots ja-task-ID\r\n-----------------------------------------------------------------------------------------------------------------\r\n4717015 0.10072 my_job1   userID      r     03\/01\/2018 09:35:08 p8@<span class=\"highlight\"><span style=\"color: #1a206d; font-weight: 800;\">scc-pf2<\/span><\/span>.scc.bu.edu              8\r\n4717016 0.10072 my_job2   userID      r     03\/01\/2018 09:35:08 p8@scc-pf2.scc.bu.edu              8\r\n<\/span>\r\n<span class=\"prompt\">scc % <\/span><span class=\"command\">ssh -t <\/span><span class=\"placeholder\">scc-pf2 <\/span>'<span class=\"command\">top -u <\/span><span class=\"placeholder\">userID<\/span>'\r\n<span class=\"output\">top - 14:37:07 up 40 days, 16:19,  7 users,  load average: 0.11, 0.21, 0.14\r\nTasks: 418 total,   2 running, 416 sleeping,   0 stopped,   0 zombie\r\nCpu(s):  1.3%us,  0.1%sy,  0.0%ni, 98.5%id,  0.1%wa,  0.0%hi,  0.0%si,  0.0%st\r\nMem:  132064072k total, 110925644k used, 21138428k free,   358992k buffers\r\nSwap:  8388604k total,    33376k used,  8355228k free, 107315220k cached\r\n\r\n   PID USER      PR  NI<span class=\"highlight\" style=\"color: #1a206d; font-weight: 800;\">  VIRT   RES <\/span> SHR S %CPU %MEM    TIME+  COMMAND\r\n 37182 userID    20   0<span class=\"highlight\"> 13396  1416 <\/span> 852 R  3.9  0.0   0:00.03 top\r\n 36370 userID    20   0<span class=\"highlight\"> 77648  4756 <\/span>1080 S  0.0  0.0   0:01.99 sshd\r\n 36510 userID    20   0<span class=\"highlight\"> 10.7g  2.1g <\/span> 78m S  0.0  1.3   0:22.97 
my_matrix1.m\r\n 36510 userID    20   0<span class=\"highlight\"> 10.7g  2.1g <\/span> 78m S  0.0  1.3   0:22.97 my_matrix2.m\r\n 36510 userID    20   0<span class=\"highlight\"> 10.7g  2.1g <\/span> 78m S  0.0  1.3   0:22.97 my_matrix3.m\r\n 36510 userID    20   0<span class=\"highlight\"> 10.7g  2.1g <\/span> 78m S  0.0  1.3   0:22.97 my_matrix4.m\r\n 36371 userID    20   0<span class=\"highlight\">  9676  1916 <\/span>1384 S  0.0  0.0   0:00.03 bash\r\n 36475 userID    20   0<span class=\"highlight\"> 30944  5456 <\/span>2708 S  0.0  0.0   0:00.08 fslwish8.4\r\n 36502 userID    20   0<span class=\"highlight\"> 13432  1232 <\/span> 904 S  0.0  0.0   0:00.05 freeview\r\n 36504 userID    20   0<span class=\"highlight\">  600m   59m <\/span> 34m S  0.0  0.0   0:05.91 freeview.bin\r\n 36510 userID    20   0<span class=\"highlight\"> 3872m  412m <\/span> 78m S  0.0  0.3   0:22.97 MATLAB\r\n 37181 userID    20   0<span class=\"highlight\"> 92872  1840 <\/span> 872 S  0.0  0.0   0:00.00 sshd\r\n<\/span><\/code><\/pre>\n<blockquote style=\"border-left: 5px solid #eeeeee; margin: 15px 30px 0px 10px; padding-left: 20px;\"><p><em><span style=\"font-size: 90%;\"><b>Note:<\/b> In this example, the compute node is <code><span class=\"placeholder\">scc-pf2<\/span><\/code><\/span>, which will need to be changed to the compute node allocated to your job. This is reported in the &#8216;queue&#8217; column of the <code><span class=\"command\">qstat -u<\/span> <span class=\"placeholder\">userID<\/span><\/code> command output.<\/em><\/p><\/blockquote>\n<p><b>VIRT<\/b> and <b>RES<\/b> represent the total amount of allocated memory (virtual) and the actual physical memory in use (resident) for each process, respectively. In this example, four MATLAB scripts are running in parallel: my_matrix1.m, my_matrix2.m, my_matrix3.m, and my_matrix4.m. Each of these processes has been allocated 10.7GB of memory. 
You would need to request a minimum of 44GB for four cores, or 11GB per core:<\/p>\n<pre><code class=\"code-block\" style=\"padding: 0em 1em 0em 1em;\"><span style=\"color: #1a6d20; font-weight: 800;\">#!\/bin\/bash -l<\/span>\r\n<span style=\"color: #1a206d;\">#$ -P my_project<\/span>\r\n<span style=\"color: #1a206d;\">#$ -N my_matlab_job<\/span>\r\n<span class=\"highlight\"><span style=\"color: #1a206d; font-weight: 800;\">#$ -l mem_per_core=11G<\/span><\/span>\r\n<span class=\"highlight\"><span style=\"color: #1a206d; font-weight: 800;\">#$ -pe omp 4<\/span><\/span><\/code><\/pre>\n<h2 style=\"margin-bottom: 1.em;\"><a id=\"QACCT\" name=\"QACCT\"><\/a>qacct<\/h2>\n<pre class=\"code-block\"><code><span class=\"prompt\">scc % <\/span><span class=\"command\">qacct <\/span>-o <span class=\"placeholder\">userID <\/span>-d<span class=\"placeholder\"> 1 <\/span> -j\r\n<span class=\"output\">==============================================================\r\nqname        p-int\r\nhostname     scc-pi2.scc.bu.edu\r\ngroup        scv\r\nowner        userID\r\nproject      scv\r\ndepartment   defaultdepartment\r\njobname      my_job\r\njobnumber    4035924\r\n...\r\nqsub_time    Thu Jan 25 14:45:36 2018\r\nstart_time   Thu Jan 25 14:46:15 2018\r\nend_time     Fri Jan 26 02:46:16 2018\r\ngranted_pe   NONE\r\n<span class=\"highlight\"><span style=\"color: #1a206d; font-weight: 800;\">slots        8<\/span><\/span>\r\n...\r\ncpu          202.390\r\nmem          7478.277\r\nio           0.348\r\niow          0.000\r\n<span class=\"highlight\"><span style=\"color: #1a206d; font-weight: 800;\">maxvmem      63.953G<\/span><\/span>\r\n...\r\n<\/span><\/code><\/pre>\n<blockquote style=\"border-left: 5px solid #eeeeee; margin: 15px 30px 0px 10px; padding-left: 20px;\"><p><em><span style=\"font-size: 90%;\"><b>Note:<\/b> In this example, <code><span class=\"command\">-d<\/span><\/code> is the number of days of job summaries you want to view. 
See <code><span class=\"command\">man qacct<\/span><\/code> for more details.<\/span><\/em><\/p><\/blockquote>\n<p>The <b>slots<\/b> variable reports the number of cores requested for this job, and <b>maxvmem<\/b> reports the maximum virtual memory used by this job. In this example, <code><span class=\"placeholder\">my_job<\/span><\/code> would need to request 64GB of total memory, or 8GB per core, to run optimally:<\/p>\n<pre><code class=\"code-block\" style=\"padding: 0em 1em 0em 1em;\"><span style=\"color: #1a6d20; font-weight: 800;\">#!\/bin\/bash -l<\/span>\r\n<span style=\"color: #1a206d;\">#$ -P my_project<\/span>\r\n<span style=\"color: #1a206d;\">#$ -N my_job<\/span>\r\n<span class=\"highlight\"><span style=\"color: #1a206d; font-weight: 800;\">#$ -l mem_per_core=8G<\/span><\/span>\r\n<span class=\"highlight\"><span style=\"color: #1a206d; font-weight: 800;\">#$ -pe omp 8<\/span><\/span><\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>Estimating Memory Demands It can be challenging to estimate how much memory your job will require before submission. Benchmarking tests are available for specific applications that can provide a guide but initially it is best to run your job and review the SCC&#8217;s job status reports. 
Each job is allocated virtual memory throughout the job&#8217;s&#8230;<\/p>\n","protected":false},"author":1692,"featured_media":0,"parent":137962,"menu_order":7,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/137976"}],"collection":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/users\/1692"}],"replies":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/comments?post=137976"}],"version-history":[{"count":5,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/137976\/revisions"}],"predecessor-version":[{"id":147900,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/137976\/revisions\/147900"}],"up":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/137962"}],"wp:attachment":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/media?parent=137976"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}