12 Steps to a Fast Multipole Method on GPUs

by Dr Rio Yokota

Boston University

Fast multipole methods are used for gravitational N-body problems
Fast multipole methods are used for gravitational N-body problems

Why would I want to learn FMM?

The combination of algorithmic acceleration and hardware acceleration can have tremendous impact. The FMM is a fast algorithm for calculating matrix vector multiplications in O(N) time, and it runs very fast on GPUs. Its combination of high degree of parallelism and O(N) complexity make it an attractive solver for the Peta-scale and Exa-scale era. It has a wide range of applications, e.g. quantum mechanics, molecular dynamics, electrostatics, acoustics, structural mechanics, fluid mechanics, and astrophysics.

What are the prerequisites?

  • Basic understanding of linear algebra and calculus
  • Some experience with C or C++

The aim of this course

This course will provide a hands-on tutorial with simple exercises that will lead to a full FMM code that runs on GPUs. Each step is carefully planned so that there is no sudden jump in the difficulty. During the tutorial each attendee will have remote access to the GPU cluster in Nagasaki Japan. There will be Teaching Assistants walking around to assist you, so that no one is left behind.

The 12 steps

  • step 1   : direct N-body
  • step 2   : multipole expansion
  • step 3   : tree structure
  • step 4   : link lists, interaction lists
  • step 5   : treecode
  • step 6   : M2L translation
  • step 7   : single level FMM
  • step 8   : M2M,L2L translation
  • step 9   : multi-level FMM
  • step 10 : direct N-body on GPUs
  • step 11 : P2P on GPUs
  • step 12 : M2L on GPUs

Printable course description

PASI Yokota course

Supplementary material

Flow of FMM calculation
This diagram shows the flow of both the FMM and treecode algorithm, and introduces the various operations that take place.