Hybrid treecode/FMM

We developed a new treecode-FMM hybrid algorithm with auto-tuning capabilities, that is O(N) and chooses the most efficient type of interaction. Auto-tuning is achieved by dynamically selecting the kernels during runtime, which requires a generic and flexible algorithm for traversing the tree.

ExaFMM uses a dual tree traversal for the target and source tree. See Figure 2. In the initial step, both trees are formed and their root cells are paired and pushed to an empty stack. Further iterations consist of popping a pair of cells form the stack, splitting the larger one, and applying an acceptance criterion to determine if multipole interaction is valid or not. If not, a new pair is pushed to the stack.

Figure 2 shows three cases: the ones on the left and right satisfy the multipole acceptance criterion (MAC), while the one in the center does not and is pushed to the stack.

Figure 2—Dual-tree traversal: illustration of the stack-based procedure.

Figure 2 (copyrighted) —Dual-tree traversal: illustration of the stack-based procedure. From “Hierarchical N-body simulations with auto-tuning for heterogeneous systems”, Rio Yokota, L A Barba. Computing in Science and Engineering (CiSE), 3 January 2012, IEEE Computer Society, doi:10.1109/MCSE.2012.1.