{"id":356,"date":"2010-09-03T14:00:18","date_gmt":"2010-09-03T18:00:18","guid":{"rendered":"https:\/\/www.bu.edu\/pasi\/?page_id=356"},"modified":"2010-10-28T19:15:31","modified_gmt":"2010-10-28T23:15:31","slug":"12-steps-to-having-a-fast-multipole-method-on-gpus","status":"publish","type":"page","link":"https:\/\/www.bu.edu\/pasi\/courses\/12-steps-to-having-a-fast-multipole-method-on-gpus\/","title":{"rendered":"12 Steps to a Fast Multipole Method on GPUs"},"content":{"rendered":"<h4>by Dr Rio Yokota<\/h4>\n<p><strong>Boston University<\/strong><\/p>\n<figure id=\"attachment412\" aria-describedby=\"caption-attachment412\" style=\"width: 242px\" class=\"wp-caption alignright\"><a href=\"\/pasi\/files\/2010\/09\/star_cluster.png\"><img loading=\"lazy\" class=\"size-full wp-image-412\" title=\"star_cluster\" src=\"\/pasi\/files\/2010\/09\/star_cluster.png\" alt=\"Fast multipole methods are used for gravitational N-body problems\" width=\"232\" height=\"294\" \/><\/a><figcaption id=\"caption-attachment412\" class=\"wp-caption-text\">Fast multipole methods are used for gravitational N-body problems<\/figcaption><\/figure>\n<h4>Why would I want to learn FMM?<\/h4>\n<p>The combination of algorithmic acceleration and hardware acceleration can have tremendous impact. The FMM is a fast algorithm for calculating matrix vector multiplications in O(N) time, and it runs very fast on GPUs.\u00a0Its combination of high degree of parallelism and O(N) complexity make it an attractive solver for the Peta-scale and Exa-scale era.\u00a0It has a wide range of applications, <em>e.g.<\/em> quantum mechanics, molecular dynamics, electrostatics, acoustics, structural mechanics, fluid mechanics, and astrophysics.<\/p>\n<h3><span style=\"font-weight: normal;\">What are the prerequisites?<\/span><\/h3>\n<ul>\n<li>Basic understanding of linear algebra and calculus<\/li>\n<li>Some experience with C or C++<\/li>\n<\/ul>\n<h3><span style=\"font-weight: normal;\">The aim of this course<\/span><\/h3>\n<p>This course will provide a hands-on tutorial with simple exercises that will lead to a full FMM code that runs on GPUs. Each step is carefully planned so that there is no sudden jump in the difficulty. During the tutorial each attendee will have remote access to the GPU cluster in Nagasaki Japan. There will be Teaching Assistants walking around to assist you, so that no one is left behind.<\/p>\n<h3><span style=\"font-weight: normal;\">The 12 steps<\/span><\/h3>\n<ul>\n<li>step 1 \u00a0 : direct N-body<\/li>\n<li>step 2 \u00a0 : multipole expansion<\/li>\n<li>step 3 \u00a0 : tree structure<\/li>\n<li>step 4 \u00a0 : link lists, interaction lists<\/li>\n<li>step 5 \u00a0 : treecode<\/li>\n<li>step 6 \u00a0 : M2L translation<\/li>\n<li>step 7 \u00a0 : single level FMM<\/li>\n<li>step 8 \u00a0 : M2M,L2L translation<\/li>\n<li>step 9 \u00a0 : multi-level FMM<\/li>\n<li>step 10 : direct N-body on GPUs<\/li>\n<li>step 11 : P2P on GPUs<\/li>\n<li>step 12 : M2L on GPUs<\/li>\n<\/ul>\n<h3>Printable course description<\/h3>\n<p style=\"padding-left: 30px;\"><a href=\"\/pasi\/files\/2010\/09\/PASI-Yokota-course.pdf\">PASI Yokota course<\/a><\/p>\n<h2><span style=\"font-weight: normal;\">Supplementary material<\/span><\/h2>\n<ul>\n<li><a href=\"http:\/\/www.ics.uci.edu\/~ihler\/papers\/ihler_area.pdf\">http:\/\/www.ics.uci.edu\/~ihler\/papers\/ihler_area.pdf<\/a><\/li>\n<li><a href=\"http:\/\/math.nyu.edu\/faculty\/greengar\/shortcourse_fmm.pdf\">http:\/\/math.nyu.edu\/faculty\/greengar\/shortcourse_fmm.pdf<\/a><\/li>\n<li><a href=\"http:\/\/www.umiacs.umd.edu\/~ramani\/cmsc878R\/index.html\">http:\/\/www.umiacs.umd.edu\/~ramani\/cmsc878R\/index.html<\/a><\/li>\n<\/ul>\n<figure id=\"attachment363\" aria-describedby=\"caption-attachment363\" style=\"width: 646px\" class=\"wp-caption alignnone\"><a href=\"\/pasi\/files\/2010\/09\/Picture-3.png\"><img loading=\"lazy\" class=\"size-medium wp-image-363\" title=\"Flow of FMM calculation\" src=\"\/pasi\/files\/2010\/09\/Picture-3-636x381.png\" alt=\"Flow of FMM calculation\" width=\"636\" height=\"381\" srcset=\"https:\/\/www.bu.edu\/pasi\/files\/2010\/09\/Picture-3-636x381.png 636w, https:\/\/www.bu.edu\/pasi\/files\/2010\/09\/Picture-3.png 919w\" sizes=\"(max-width: 636px) 100vw, 636px\" \/><\/a><figcaption id=\"caption-attachment363\" class=\"wp-caption-text\">This diagram shows the flow of both the FMM and treecode algorithm, and introduces the various operations that take place.<\/figcaption><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>by Dr Rio Yokota Boston University Why would I want to learn FMM? The combination of algorithmic acceleration and hardware acceleration can have tremendous impact. The FMM is a fast algorithm for calculating matrix vector multiplications in O(N) time, and it runs very fast on GPUs.\u00a0Its combination of high degree of parallelism and O(N) complexity [&hellip;]<\/p>\n","protected":false},"author":3555,"featured_media":0,"parent":321,"menu_order":11,"comment_status":"closed","ping_status":"open","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/www.bu.edu\/pasi\/wp-json\/wp\/v2\/pages\/356"}],"collection":[{"href":"https:\/\/www.bu.edu\/pasi\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.bu.edu\/pasi\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/pasi\/wp-json\/wp\/v2\/users\/3555"}],"replies":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/pasi\/wp-json\/wp\/v2\/comments?post=356"}],"version-history":[{"count":28,"href":"https:\/\/www.bu.edu\/pasi\/wp-json\/wp\/v2\/pages\/356\/revisions"}],"predecessor-version":[{"id":888,"href":"https:\/\/www.bu.edu\/pasi\/wp-json\/wp\/v2\/pages\/356\/revisions\/888"}],"up":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/pasi\/wp-json\/wp\/v2\/pages\/321"}],"wp:attachment":[{"href":"https:\/\/www.bu.edu\/pasi\/wp-json\/wp\/v2\/media?parent=356"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}