{"id":150675,"date":"2024-03-01T14:43:19","date_gmt":"2024-03-01T19:43:19","guid":{"rendered":"http:\/\/www.bu.edu\/tech\/?page_id=150675"},"modified":"2024-05-14T13:37:51","modified_gmt":"2024-05-14T17:37:51","slug":"gpus","status":"publish","type":"page","link":"https:\/\/www.bu.edu\/tech\/support\/research\/computing-resources\/external\/access\/gpus\/","title":{"rendered":"ACCESS GPU Resources"},"content":{"rendered":"<p><a name=\"top\"><\/a><\/p>\n<style>\n  p { text-align: justify; }\n  .image-container { float: right; width: 48%; margin-left: 20px; }\n<\/style>\n<div style=\"width:100%;\">\n<div class=\"image-container\">\n    <img loading=\"lazy\" src=\"\/tech\/files\/2024\/04\/delta_front_1-e1713292069940-636x280.png\" alt=\"\" width=\"636\" height=\"280\" class=\"alignnone size-medium wp-image-151607\" \/>\n  <\/div>\n<p>Some deep learning applications can be trained on a single GPU, while others require multiple GPUs. In general, SCC users are limited to a maximum of 4 GPUs on a single node unless they have access to reserved buy-in resources. If your application requires more GPUs than are available on the SCC, we recommend requesting resources from <a href=\"https:\/\/access-ci.org\/\" rel=\"nofollow\">ACCESS<\/a>. For information about obtaining credits to use these resources, please see this <a href=\"https:\/\/www.bu.edu\/tech\/support\/research\/computing-resources\/external\/access\/\">page<\/a>. We recommend the NCSA Delta system for distributed multi-node GPU computations. The full list of resource providers can be found <a href=\"https:\/\/access-ci.org\/resource-providers\/\">here<\/a>.<\/p>\n<p>Each of these clusters uses the <a href=\"https:\/\/slurm.schedmd.com\/sbatch.html\">SLURM<\/a> workload manager to schedule batch jobs. An overview of each system's hardware is provided below. 
For details on using SLURM on these clusters and for complete hardware specifications, follow the documentation links in each section. The Delta section below links to a GitHub repository with two example codes that demonstrate how to run multi-GPU, multi-node distributed computations on the Delta system.<\/p>\n<\/div>\n<p><strong>Sections<\/strong><\/p>\n<ul>\n<li><a href=\"#delta\">Delta<\/a><\/li>\n<li><a href=\"#other\">Additional ACCESS GPU resources<\/a><\/li>\n<\/ul>\n<p>&nbsp;<\/p>\n<p><a name=\"delta\"><\/a><\/p>\n<h2>Delta<\/h2>\n<p>The University of Illinois NCSA <a href=\"https:\/\/docs.ncsa.illinois.edu\/systems\/delta\/en\/latest\/index.html\">Delta<\/a> system is designed to run applications on GPU nodes or hybrid CPU-GPU nodes. The Delta system comprises 5 node types, 4 of which are GPU node types:<\/p>\n<table width=\"715\" height=\"315\">\n<tbody>\n<tr>\n<td>Number of nodes<\/td>\n<td>Number of GPUs per node<\/td>\n<td>GPU type<\/td>\n<td>Memory per GPU<\/td>\n<\/tr>\n<tr>\n<td>100<\/td>\n<td>4<\/td>\n<td>NVIDIA A40<\/td>\n<td>48 GB<\/td>\n<\/tr>\n<tr>\n<td>100<\/td>\n<td>4<\/td>\n<td>NVIDIA A100<\/td>\n<td>40 GB<\/td>\n<\/tr>\n<tr>\n<td>6<\/td>\n<td>8<\/td>\n<td>NVIDIA A100<\/td>\n<td>40 GB<\/td>\n<\/tr>\n<tr>\n<td>1<\/td>\n<td>8<\/td>\n<td>AMD MI100<\/td>\n<td>32 GB<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Users can log on to the Delta system by following the instructions at this <a href=\"https:\/\/docs.ncsa.illinois.edu\/systems\/delta\/en\/latest\/user_guide\/login.html\">link<\/a>. Delta users can connect to the command-line login nodes with SSH. 
Alternatively, Delta provides an Open OnDemand interface similar to the one on the SCC.<\/p>\n<div style=\"margin-left: 20px;\">\n<h3>Example GPU codes for Delta<\/h3>\n<p>Follow this GitHub <a href=\"https:\/\/github.com\/bu-rcs\/access-use-case-demos\">link<\/a> for documentation on the example codes.<\/div>\n<p>&nbsp;<\/p>\n<p><a name=\"other\"><\/a><\/p>\n<h2>Additional ACCESS GPU resources<\/h2>\n<div style=\"margin-left: 20px;\">\n<h3>Rockfish<\/h3>\n<p>The JHU <a href=\"https:\/\/www.arch.jhu.edu\/support\/\">Rockfish<\/a> system is a community-shared cluster housed at the Maryland Advanced Research Computing Center in Baltimore. The available GPU nodes are:<\/p>\n<table width=\"715\" height=\"134\">\n<tbody>\n<tr>\n<td>Number of nodes<\/td>\n<td>Number of GPUs per node<\/td>\n<td>GPU type<\/td>\n<td>Memory per GPU<\/td>\n<\/tr>\n<tr>\n<td>18<\/td>\n<td>4<\/td>\n<td>NVIDIA A100<\/td>\n<td>40 GB<\/td>\n<\/tr>\n<tr>\n<td>6<\/td>\n<td>4<\/td>\n<td>NVIDIA A100<\/td>\n<td>80 GB<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3>FASTER<\/h3>\n<p>The TAMU <a href=\"https:\/\/hprc.tamu.edu\/kb\/User-Guides\/FASTER\/#compute-nodes\">FASTER<\/a> system is a Dell x86 HPC cluster consisting of 180 compute nodes. Researchers with allocations on FASTER can request up to 10 composable GPUs. Composable means that GPU resources are attached to a compute node on the fly. 
The GPU types that can be composed onto the compute nodes are:<\/p>\n<table width=\"715\" height=\"118\">\n<tbody>\n<tr>\n<td>Number of GPUs<\/td>\n<td>GPU type<\/td>\n<td>Memory per GPU<\/td>\n<\/tr>\n<tr>\n<td>200<\/td>\n<td>NVIDIA T4<\/td>\n<td>16 GB<\/td>\n<\/tr>\n<tr>\n<td>40<\/td>\n<td>NVIDIA A100<\/td>\n<td>40 GB<\/td>\n<\/tr>\n<tr>\n<td>8<\/td>\n<td>NVIDIA A10<\/td>\n<td>24 GB<\/td>\n<\/tr>\n<tr>\n<td>4<\/td>\n<td>NVIDIA A30<\/td>\n<td>24 GB<\/td>\n<\/tr>\n<tr>\n<td>8<\/td>\n<td>NVIDIA A40<\/td>\n<td>48 GB<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>&nbsp;<\/p>\n<div style=\"float: right;\">\n<p id=\"last-modified-timestamp\" style=\"margin: 0;\">Last updated: Loading&#8230;<\/p>\n<\/div>\n<p><script>\r\ndocument.addEventListener('DOMContentLoaded', function() {\r\n    \/\/ Get the content of the meta tag\r\n    var lastUpdatedContent = document.querySelector('meta[name=\"last-updated\"]').content;\r\n    \r\n    \/\/ Parse the content into a Date object\r\n    var lastUpdatedDate = new Date(lastUpdatedContent);\r\n    \r\n    \/\/ Format the date\r\n    var formattedDate = lastUpdatedDate.toLocaleDateString(undefined, { year: 'numeric', month: 'long', day: 'numeric' });\r\n    \r\n    \/\/ Update the HTML element with the formatted date\r\n    document.getElementById('last-modified-timestamp').innerHTML = 'Last updated: ' + formattedDate;\r\n});\r\n<\/script><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Some deep learning applications can be trained on a single GPU, while others require multiple GPUs. In general, SCC users are limited to a maximum of 4 GPUs on a single node unless they have access to reserved buy-in resources. 
If your application requires more GPUs than are available on the SCC, we recommend requesting&#8230;<\/p>\n","protected":false},"author":1692,"featured_media":0,"parent":47915,"menu_order":2,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/150675"}],"collection":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/users\/1692"}],"replies":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/comments?post=150675"}],"version-history":[{"count":21,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/150675\/revisions"}],"predecessor-version":[{"id":152223,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/150675\/revisions\/152223"}],"up":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/47915"}],"wp:attachment":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/media?parent=150675"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}