Programming for GPUs using OpenACC in C/C++ : TechWeb : Boston University

Introduction

OpenACC is a directive-based API for parallelizing code on accelerators such as NVIDIA GPUs. By contrast, OpenMP is primarily an API for shared-memory parallel processing on CPUs. OpenACC is designed to provide a simple yet powerful way to use accelerators without significant programming effort. Programmers insert OpenACC directives before specific code sections, typically loops, to offload them to the GPU; the compiler then generates and optimizes the parallel code for the target device. In many GPU computing cases, the programming effort with OpenACC is much less than with NVIDIA’s CUDA programming model. For many large existing codes, rewriting them in CUDA is impractical, if not impossible. In those cases, OpenACC offers a pragmatic alternative.
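As a minimal sketch of what "inserting a directive before a loop" looks like, the routine below offloads a simple vector update. The function name saxpy and its arguments are illustrative assumptions, not code from the SCC tutorial. A compiler without OpenACC support ignores the pragma and runs the loop serially on the CPU, producing the same result.

```c
/* Hypothetical example routine: computes y[i] = a * x[i] + y[i]
   for i in [0, n). */
void saxpy(int n, float a, const float *restrict x, float *restrict y)
{
    /* Ask the compiler to run the loop iterations in parallel on the
       accelerator. The data clauses describe the transfers:
       x is only copied to the device; y is copied in and back out. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Calling saxpy(n, 2.0f, x, y) from ordinary host code is all that is needed; the loop body itself is unchanged from the serial version, which is the main appeal of the directive approach for existing codes.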

What you need to know or do on the SCC

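To use OpenACC, compile your C (or C++) code with the PGI compiler pgcc (pgc++ for C++). The PGI (formerly Portland Group, Inc.) compilers are now distributed as part of the NVIDIA HPC SDK, where the same compilers are also available as nvc and nvc++. You need to load a module to use them: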
```
scc1% module load nvidia-hpc/2023-23.5
```
After this, you can proceed with compilation. For example:
```
scc1% pgcc -o mycode -acc -Minfo mycode.c
```
In the above example, -acc enables the OpenACC directives, while -Minfo prints compiler feedback, including which loops were parallelized for the accelerator. For details, see the man page of pgcc:
```
scc1% man pgcc
```
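One way to confirm that a build actually had OpenACC enabled is to test the _OPENACC macro, which the OpenACC specification requires a conforming compiler to define (as a yyyymm version date) when directives are active. The sketch below assumes nothing beyond that macro; the helper names openacc_version and report_openacc are hypothetical.

```c
#include <stdio.h>

/* Returns the OpenACC version date (yyyymm) the compiler implements,
   or 0 when the code was compiled without OpenACC enabled. */
long openacc_version(void)
{
#ifdef _OPENACC
    return _OPENACC;
#else
    return 0L;
#endif
}

/* Hypothetical helper: reports whether OpenACC directives are active. */
void report_openacc(void)
{
    long version = openacc_version();
    if (version > 0) {
        printf("OpenACC enabled (version date %ld)\n", version);
    } else {
        printf("OpenACC not enabled: directives are being ignored\n");
    }
}
```

Calling report_openacc() at program start makes it easy to spot a build where the OpenACC flag was accidentally left out, since such a build still compiles and runs, only without GPU offload.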
To submit your code (compiled with OpenACC directives) to an SCC node with GPUs:
```
scc1% qsub -l gpus=1 -b y mycode
```
In the above example, -l gpus=1 requests 1 GPU (and, in the absence of a multiprocessor request, 1 CPU), and -b y tells qsub that mycode is a binary executable rather than a job script.
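To check from inside a job that the requested GPU is visible, a program can query the OpenACC runtime. The following is a minimal sketch, assuming the standard runtime routine acc_get_num_devices and the device type acc_device_not_host from openacc.h; the wrapper name accelerator_count is hypothetical. When compiled without OpenACC, it returns -1 instead of calling the runtime.

```c
#ifdef _OPENACC
#include <openacc.h>
#endif

/* Hypothetical helper: returns the number of non-host (accelerator)
   devices visible to the OpenACC runtime, or -1 if the program was
   compiled without OpenACC enabled. */
int accelerator_count(void)
{
#ifdef _OPENACC
    return acc_get_num_devices(acc_device_not_host);
#else
    return -1;
#endif
}
```

A result of 0 in a batch job would suggest the job did not land on a GPU node or that no GPU was granted, which is worth checking before investigating performance.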

Additional examples of GPU batch jobs are available here.

OpenACC Tutorial

Please refer to the RCS tutorial slides for OpenACC programming.

Relevant Links

OpenACC 3.0 specification

OpenACC Consulting

RCS staff scientific programmers can help you with your OpenACC code tuning. For assistance, please send email to help@scc.bu.edu.