The login and compute nodes on the SCC use a variety of CPU architectures. The available compilers offer optimization options that take advantage of specific features of a CPU architecture to produce faster programs. The great majority of compute nodes use Intel processors. A small number of compute nodes use AMD processors with the Bulldozer and Epyc architectures, although access to these is limited to particular groups. The complete list of compute nodes, with the CPU model and CPU architecture of each, can be found on the Technical Summary page.

One key way in which CPU architectures differ from each other is in the instructions they provide that use additional processor hardware for specialized computations. These include single instruction, multiple data (SIMD) instruction sets. SIMD instructions perform the same operation on multiple pieces of data in parallel. For scientific computing, the main SIMD instruction sets of interest are those that apply to floating point calculations. Using SIMD instructions in compiled code can sometimes improve program performance by a factor of 2 to 10. On the SCC, a program compiled with options that optimize performance for a newer CPU architecture may be unable to run on older compute nodes, because those nodes lack the newer instructions.

On the SCC, the SIMD instruction sets that require attention when compiling are AVX, AVX2, and AVX-512. The AVX and AVX2 instructions, for example, can operate on up to 8 32-bit floating point numbers simultaneously in a single step.
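The figure of 8 follows from the register width: AVX and AVX2 operate on 256-bit registers, and AVX-512 on 512-bit registers. A minimal Python sketch of that arithmetic is shown here; the function name `simd_lanes` is made up for illustration and is not part of any SCC tool.

```python
def simd_lanes(register_bits, element_bits):
    """Number of elements of a given width that fit in one SIMD register."""
    if register_bits % element_bits != 0:
        raise ValueError("element width must divide the register width")
    return register_bits // element_bits


# AVX and AVX2 use 256-bit registers; AVX-512 uses 512-bit registers.
print(simd_lanes(256, 32))  # 32-bit (single precision) values per AVX/AVX2 register
print(simd_lanes(256, 64))  # 64-bit (double precision) values per AVX/AVX2 register
print(simd_lanes(512, 32))  # 32-bit values per AVX-512 register
```

The lane count is an upper bound on the speedup from a single instruction. Real programs usually see less, since not every loop can be vectorized.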

SIMD Instructions by CPU Architecture

When using optimization options with compilers on the SCC, take care that the compiled program contains only instructions supported by the architecture of the compute node that will execute it. For some floating-point intensive codes, compiling in support for particular SIMD instructions can dramatically improve program performance. The following table shows SIMD support by CPU architecture.

Intel Architecture   SSE4.2   AVX   AVX2   FMA4   AVX-512
Sandybridge          Yes      Yes   No     No     No
Ivybridge            Yes      Yes   No     No     No
Haswell              Yes      Yes   Yes    No     No
Broadwell            Yes      Yes   Yes    No     No
Skylake              Yes      Yes   Yes    No     Yes
Cascadelake          Yes      Yes   Yes    No     Yes
Icelake              Yes      Yes   Yes    No     Yes

AMD Architecture     SSE4.2   AVX   AVX2   FMA4   AVX-512
Bulldozer            Yes      Yes   No     Yes    No
Epyc                 Yes      Yes   Yes    No     No
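
The table can also be read as a lookup: given the instruction sets a program was compiled to use, which architectures can run it? The following Python sketch transcribes the table into a dictionary. The names `SIMD_SUPPORT`, `can_run`, and `compatible_architectures` are hypothetical helpers for illustration, not part of any SCC software.

```python
# SIMD support by CPU architecture, transcribed from the table above.
SIMD_SUPPORT = {
    "Sandybridge": {"SSE4.2", "AVX"},
    "Ivybridge":   {"SSE4.2", "AVX"},
    "Haswell":     {"SSE4.2", "AVX", "AVX2"},
    "Broadwell":   {"SSE4.2", "AVX", "AVX2"},
    "Skylake":     {"SSE4.2", "AVX", "AVX2", "AVX-512"},
    "Cascadelake": {"SSE4.2", "AVX", "AVX2", "AVX-512"},
    "Icelake":     {"SSE4.2", "AVX", "AVX2", "AVX-512"},
    "Bulldozer":   {"SSE4.2", "AVX", "FMA4"},
    "Epyc":        {"SSE4.2", "AVX", "AVX2"},
}


def can_run(architecture, required):
    """True if the architecture supports every required instruction set."""
    return set(required) <= SIMD_SUPPORT[architecture]


def compatible_architectures(required):
    """Architectures from the table that support all required instruction sets."""
    return [arch for arch in SIMD_SUPPORT if can_run(arch, required)]


# A program compiled with AVX2 instructions cannot run on Ivybridge nodes.
print(can_run("Ivybridge", {"AVX2"}))
print(compatible_architectures({"AVX2"}))
print(compatible_architectures({"AVX-512"}))
```

A program that restricts itself to SSE4.2 and AVX is compatible with every architecture listed, at the cost of not using the wider or newer instructions.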

When a compiler is told to generate SIMD instructions, the compute node that runs the software must have CPU support for those instructions. The SCC queue system has options for specifying the CPU architecture, and a batch job must request an architecture that matches the compiler options used for the job to run successfully.
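
One way to check what a particular node supports is to read the CPU flags that the Linux kernel reports in /proc/cpuinfo. The sketch below assumes a Linux x86 node. The helper names are hypothetical. The flag `avx512f` identifies only the AVX-512 Foundation subset, so a program using other AVX-512 extensions may need further checks.

```python
# Map the instruction-set names used in this document to the flag names
# reported on the "flags" line of /proc/cpuinfo on Linux x86 systems.
CPUINFO_FLAGS = {
    "SSE4.2": "sse4_2",
    "AVX": "avx",
    "AVX2": "avx2",
    "FMA4": "fma4",
    "AVX-512": "avx512f",  # Foundation subset only
}


def simd_support(cpuinfo_text):
    """Return {instruction set: bool} given the text of /proc/cpuinfo."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            break  # every core on a node reports the same flags
    return {name: flag in flags for name, flag in CPUINFO_FLAGS.items()}


def node_simd_support(path="/proc/cpuinfo"):
    """Read the flags of the node this script is running on (Linux only)."""
    with open(path) as f:
        return simd_support(f.read())
```

Calling `node_simd_support()` inside a batch job, before launching the main program, gives a quick way to confirm that the node assigned to the job supports the instruction sets the program was compiled with.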