If you are new to GPU computing with MATLAB, see the Useful Links section at the bottom of this page.

On the Shared Computing Cluster (SCC), a number of nodes are equipped with GPUs. MATLAB's Parallel Computing Toolbox provides functions, such as gpuArray and gather, that move data to a GPU and run computations there for better performance. The following example multiplies two matrices on both the Client and the GPU:

%  gpuExample.m  %
function gpuExample(A, B)
% gpuExample(A, B)
% Computes the matrix product of A and B on the Client and on the GPU,
% reports the wallclock time of each step, and checks that both results
% agree (to 5 decimal places).
% A - MATLAB Client array (N x N)
% B - MATLAB Client array (N x N)
% Usage example:
% >> N = 3000; A = rand(N); B = rand(N);
% >> gpuExample(A, B)

tic
C = A*B;    % matrix product on Client
tC = toc;
% copy A and B from Client to GPU
a = gpuArray(A); b = gpuArray(B);
tic
c = a*b;    % matrix product on GPU
tgpu = toc; % GPU work may still be in flight here, so tgpu can understate the compute time
tic
CC = gather(c);   % copy data from GPU to Client (waits for the GPU to finish)
tg = toc;

disp(['Matrix multiply time on Client is ' num2str(tC)])
disp(['Matrix multiply time on GPU is ' num2str(tgpu)])
disp(['Time for gathering data from GPU back to Client is ' num2str(tg)])

% Verify that GPU and Client computations agree
tol = 1e-5;
if any(abs(CC(:) - C(:)) > tol)   % compare every element, not column by column
    disp('Matrix product on Client and GPU disagree')
else
    disp('Matrix product on Client and GPU agree')
end

end   % function gpuExample
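
Because MATLAB can return from a GPU operation before the GPU has finished, a tic/toc pair around a single GPU statement may report a misleadingly small time. The following is a minimal sketch of two ways to get a more representative figure, assuming a MATLAB release that provides timeit and gputimeit. The function name timeGpuMultiply is hypothetical and is not part of the original example.

```matlab
% timeGpuMultiply.m (hypothetical helper, for illustration only)
function [tCpu, tGpu, tManual] = timeGpuMultiply(N)
% Time an N x N matrix product on the Client and on the GPU.
A = rand(N); B = rand(N);

% timeit calls the function several times and returns a typical run time
tCpu = timeit(@() A*B);

% copy the operands to the GPU once, outside the timed region
a = gpuArray(A); b = gpuArray(B);

% gputimeit waits for the GPU to finish before it stops the clock
tGpu = gputimeit(@() a*b);

% Manual alternative: block until the GPU is idle before calling toc
dev = gpuDevice;
tic
c = a*b;        %#ok<NASGU> result is only needed for timing
wait(dev);
tManual = toc;
end
```

For example, [tCpu, tGpu] = timeGpuMultiply(3000) returns both times in seconds. The manual measurement is a single run, so it can include one-time startup costs that the repeated measurements smooth out.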

There are two ways to run the GPU code:

  • For debugging and code development, run MATLAB in an interactive batch session:
    MATLAB code that contains GPU instructions must run on an SCC node with at least 1 GPU (the SCC login nodes do not have any).

    1. First, launch an interactive batch session with:
      scc1% qrsh -l gpus=1

      The “-l gpus=1” option requests 1 GPU. Unless you explicitly request a different runtime limit, the default twelve-hour wallclock limit applies.

    2. Once an SCC node with a GPU becomes available, the interactive batch job is accepted and a new X window appears.
    3. Launch MATLAB from this window:
      scc-ha2% matlab &

      In the MATLAB window, call the gpuExample function:

      >> N=3000; A=rand(N); B=rand(N);
      >> gpuExample(A, B)
      Matrix multiply time on Client is 1.236
      Matrix multiply time on GPU is 0.000501
      Time for gathering data from GPU back to Client is 0.20443
      Matrix product on Client and GPU agree
  • For production, run the MATLAB job in batch:

    Create a batch script mybatch as follows:

    # Batch submission procedure: 
    # scc1% qsub mybatch
    # Note: A line of the form "#$ qsub_option" is interpreted
    #       by qsub as if "qsub_option" were passed to qsub on
    #       the command line.
    # Set the hard runtime (aka wallclock) limit for this job,
    # default is 12 hours. Format: -l h_rt=HH:MM:SS
    #$ -l h_rt=12:00:00
    # Merge stderr into the stdout file, to reduce clutter.
    #$ -j y
    # Specifies number of GPUs wanted
    #$ -l gpus=1
    # end of qsub options
    matlab -nodisplay -singleCompThread -r "N=3000;gpuExample(rand(N),rand(N));exit"
    # end of script

    In the last statement, the string enclosed in double quotes (") is a sequence of valid MATLAB commands, which may include calls to your own application m-files (without the .m suffix). The MATLAB exit command is required to quit MATLAB and end the batch job.

    Submit the batch job using the above batch script, which requests 1 CPU and 1 GPU. For other options, please visit GPU Computing on the SCF.

    scc1% qsub mybatch

    Use qstat to query the status of your job:

    scc1% qstat -u userid
    job-ID  prior   name       user         state submit/start at     queue . . .
     477578 0.00000 mybatch   userid           qw    03/14/2013 08:50:06
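
Both ways of running the code depend on MATLAB seeing a GPU on the node, and the batch job ends only when MATLAB reaches the exit command. The following is a minimal sketch of a wrapper that checks for a GPU and catches errors, so that a failure is reported instead of leaving MATLAB waiting at a prompt. The function name runGpuExample and its messages are hypothetical and are not part of the original example. It assumes gpuExample.m is on the MATLAB path.

```matlab
% runGpuExample.m (hypothetical wrapper, for illustration only)
function ok = runGpuExample(N)
% Run gpuExample on N x N random matrices if a GPU is visible.
% Returns true on success and false otherwise.
ok = false;
try
    if gpuDeviceCount < 1
        disp('No GPU is visible to MATLAB; was the job submitted with -l gpus=1?')
        return
    end
    dev = gpuDevice;                    % select and query the default GPU
    disp(['Using GPU: ' dev.Name])
    gpuExample(rand(N), rand(N));
    ok = true;
catch err
    % Report the problem and return normally so the caller can still exit
    disp(['GPU job failed: ' err.message])
end
end
```

In the batch script, the quoted command string could then read "runGpuExample(3000); exit". Because the wrapper catches errors, the exit command is still reached when the computation fails.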

Useful Links

  1. GPU Programming in MATLAB
  2. GPU Computing on the SCF
  3. GPU Computing with MATLAB webinar
  4. MATLAB GPU Computing Support webpages