Paper accepted in CiSE

in Publications
January 4th, 2012

A new paper authored by the ExaFMM team has been accepted, this time to appear in Computing in Science and Engineering, the joint publication of the IEEE Computer Society and the American Institute of Physics.

  • Title: “Hierarchical N-body Simulations with Auto-Tuning for Heterogeneous Systems”
  • Authors: Rio Yokota and Lorena A. Barba
  • To appear: Computing in Science and Engineering (CiSE), May/June 2012 issue.
  • Preprint: arXiv:1108.5815

This paper presents the new hybrid treecode/FMM formulation of ExaFMM, which maintains the O(N) complexity, but is able to perform both cell-cell and cell-particle interactions (i.e., it is both a treecode and an FMM code). The code also offers auto-tuning capability, being able to choose dynamically which type of interaction to perform: cell-cell, cell-particle or particle-particle. This feature is enabled by means of a dual-tree traversal technique, described in more detail in the Features section of this website.
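To make the idea concrete, here is a minimal Python sketch of a dual-tree traversal that picks an interaction type for each pair of cells. It is not the ExaFMM implementation: the 1D binary tree, the names `Cell`, `build`, `traverse` and `COST`, the acceptance parameter `theta`, and the cost constants are all assumptions chosen for illustration. The sketch only counts which interaction would be selected; it does not evaluate any kernels.

```python
from dataclasses import dataclass, field

# Hypothetical relative kernel costs (placeholders, not measured values).
# An auto-tuner would presumably obtain such numbers from the target machine.
COST = {
    "cell-cell": 100.0,        # one multipole-to-local translation
    "cell-particle": 10.0,     # per target particle
    "particle-particle": 1.0,  # per particle pair
}

@dataclass
class Cell:
    center: float
    radius: float
    n_particles: int
    children: list = field(default_factory=list)

def build(lo, hi, points, leaf_size=8):
    """Build a 1D binary tree over the points lying in [lo, hi)."""
    pts = [p for p in points if lo <= p < hi]
    cell = Cell(center=(lo + hi) / 2, radius=(hi - lo) / 2, n_particles=len(pts))
    if len(pts) > leaf_size:
        mid = (lo + hi) / 2
        halves = (build(lo, mid, pts, leaf_size), build(mid, hi, pts, leaf_size))
        cell.children = [c for c in halves if c.n_particles > 0]
    return cell

def traverse(target, source, theta, counts):
    """Dual-tree traversal: record the interaction chosen for each cell pair."""
    dist = abs(target.center - source.center)
    if target.radius + source.radius < theta * dist:
        # Well separated: any of the three interactions is valid, so take
        # the one with the lowest estimated cost.
        options = {
            "cell-cell": COST["cell-cell"],
            "cell-particle": COST["cell-particle"] * target.n_particles,
            "particle-particle": COST["particle-particle"]
            * target.n_particles * source.n_particles,
        }
        choice = min(options, key=options.get)
    elif not target.children and not source.children:
        # Two nearby leaves: only direct summation is accurate.
        choice = "particle-particle"
    else:
        # Too close to approximate: split the larger cell that can be split.
        split_source = bool(source.children) and (
            not target.children or source.radius >= target.radius
        )
        if split_source:
            for child in source.children:
                traverse(target, child, theta, counts)
        else:
            for child in target.children:
                traverse(child, source, theta, counts)
        return
    counts[choice] += 1
    counts["pairs"] += target.n_particles * source.n_particles

def count_interactions(points, theta=0.5):
    """Traverse the tree against itself and return the interaction counts."""
    root = build(0.0, 1.0, points)
    counts = {"cell-cell": 0, "cell-particle": 0, "particle-particle": 0, "pairs": 0}
    traverse(root, root, theta, counts)
    return counts
```

Calling `count_interactions([(i + 0.5) / 200 for i in range(200)])` returns how many interactions of each type were selected. The `pairs` entry should equal the square of the number of points, since every particle pair is covered by exactly one interaction.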

The paper also discusses the advantage of multipole algorithms on GPU hardware, using the roofline model to show how they compare with other algorithms. The recent work described also includes a many-GPU turbulence calculation on Tsubame 2.0 with 2048 GPUs, which achieved 0.5 petaflop/s.
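For readers unfamiliar with the roofline model, the short Python sketch below shows its basic form: attainable performance is bounded by either the peak compute rate or by memory bandwidth times the kernel's arithmetic intensity, whichever is lower. The function name `roofline` and all the numbers are hypothetical and are not taken from the paper.

```python
def roofline(peak_gflops, bandwidth_gbs, intensity):
    """Attainable GFlop/s for a kernel with the given arithmetic intensity.

    intensity is in flops per byte of memory traffic.
    """
    return min(peak_gflops, bandwidth_gbs * intensity)

# Hypothetical machine: 1000 GFlop/s peak, 150 GB/s memory bandwidth.
PEAK, BANDWIDTH = 1000.0, 150.0

# Hypothetical kernels: one with low and one with high arithmetic intensity.
low_intensity = roofline(PEAK, BANDWIDTH, 0.25)   # limited by memory bandwidth
high_intensity = roofline(PEAK, BANDWIDTH, 20.0)  # limited by peak compute
```

On this made-up machine, the low-intensity kernel is capped well below peak by memory bandwidth, while the high-intensity kernel can in principle reach the compute ceiling.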

From the Conclusions of the paper:

The fact that the current method can automatically choose the optimal interactions, on a given heterogeneous system, alleviates the user from two major burdens. Firstly, the user does not need to decide among treecode or FMM, predicting which algorithm will be faster for a particular application given the accuracy requirements—they are now one algorithm.  Secondly, there is no need to tweak parameters, e.g., particles per cell, in order to achieve optimal performance on GPUs—the same code can run on any machine without changing anything. This feature is a requirement to developing a black-box software library for fast N-body algorithms on heterogeneous systems, which is our goal.