{"id":78164,"date":"2014-05-08T14:58:38","date_gmt":"2014-05-08T18:58:38","guid":{"rendered":"http:\/\/www.bu.edu\/tech\/?page_id=78164"},"modified":"2022-06-07T10:22:02","modified_gmt":"2022-06-07T14:22:02","slug":"profiling","status":"publish","type":"page","link":"https:\/\/www.bu.edu\/tech\/support\/research\/software-and-programming\/programming\/profiling\/","title":{"rendered":"Profiling"},"content":{"rendered":"<p>Profiling your code means measuring how long various parts of your application run. You can do this manually by writing your own code, checking the time before and after execution. An alternative option is to instruct the compiler to record profiling information for every function and subroutine in the program. Then with an external tool, such as <strong>gprof<\/strong>, you can explore 1) the total time spent calling each method and 2) the average time spent for each call of each method. This provides the evidence needed for deciding which methods need work to improve their performance. <\/p>\n<h2>An example using gprof<\/h2>\n<p>Consider <a href=\"http:\/\/rcs.bu.edu\/examples\/fortran\/examples\/profiling\/example.f90\">this example<\/a>, where two subroutines are called multiple times in a main program. To profile this example, you need to:<\/p>\n<ol>\n<li>Compile this program using the <code>-pg<\/code> flag: <code><span class=\"command\">gfortran<\/span> -pg -o <span class=\"placeholder\">example.exe example.f90<\/span><\/code><\/li>\n<li>Run the resulting executable; it will produce a file <code>gmon.out<\/code>, which holds profiling information.<\/li>\n<li>Run the command <code><span class=\"command\">gprof<\/span> -b <span class=\"placeholder\">example.exe<\/span> gmon.out<\/code><br \/>\nThe <code>-b<\/code> flag tells <code><span class=\"command\">gprof<\/span><\/code> to suppress tons of extra help <em>blurbs<\/em>. Try this again without the flag, and you&#8217;ll understand why I advised using it. There is also a <code>-p<\/code> flag that requests only the &#8220;Flat profile&#8221; section, which I discuss below, and <code>-P<\/code> requests only the &#8220;Call graph&#8221; section.<\/li>\n<\/ol>\n<p>The resulting output looks like this:<\/p>\n<pre class=\"code-block\"><code><span class=\"prompt\">scc4%<\/span> <span class=\"command\">gprof<\/span> -b <span class=\"placeholder\">example.exe<\/span> gmon.out\r\n\r\n<span class=\"output\">Flat profile:\r\n\r\nEach sample counts as 0.01 seconds.\r\n  %   cumulative   self              self     total\r\n time   seconds   seconds    calls   s\/call   s\/call  name\r\n 64.25      5.52     5.52        5     1.10     1.10  do_slow_\r\n 36.47      8.65     3.13       10     0.31     0.31  do_fast_\r\n  0.00      8.65     0.00        1     0.00     8.65  MAIN__\r\n\r\n\r\n                        Call graph\r\n\r\n\r\ngranularity: each sample hit covers 2 byte(s) for 0.12% of 8.65 seconds\r\n\r\nindex % time    self  children    called     name\r\n                0.00    8.65       1\/1           main [2]\r\n[1]    100.0    0.00    8.65       1         MAIN__ [1]\r\n                5.52    0.00       5\/5           do_slow_ [3]\r\n                3.13    0.00      10\/10          do_fast_ [4]\r\n-----------------------------------------------\r\n                                                 <spontaneous>\r\n[2]    100.0    0.00    8.65                 main [2]\r\n                0.00    8.65       1\/1           MAIN__ [1]\r\n-----------------------------------------------\r\n                5.52    0.00       5\/5           MAIN__ [1]\r\n[3]     63.8    5.52    0.00       5         do_slow_ [3]\r\n-----------------------------------------------\r\n                3.13    0.00      10\/10          MAIN__ [1]\r\n[4]     36.2    3.13    0.00      10         do_fast_ [4]\r\n-----------------------------------------------\r\n\r\n\r\nIndex by function name\r\n\r\n   [1] MAIN__                  [4] do_fast_                [3] do_slow_\r\n<\/span><\/code><\/pre>\n<p>The above output shows a &#8220;Flat profile&#8221;, which orders the routines by the percent of the total time spent calling these routines. The &#8220;call graph&#8221; on the other hand, shows the total time spent calling each routing, including the time spent in routines called with in. For example, very little time is spent in the <code>MAIN__<\/code> program, as seen in the Flat Profile, self seconds column. But in the call graph, we see in section [2] that 8.65 seconds is spent as a result of calling the main program. This resulted in part from 5.52 seconds spent calling <code>do_slow_<\/code>, as seen in section [3].<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Profiling your code means measuring how long various parts of your application run. You can do this manually by writing your own code, checking the time before and after execution. An alternative option is to instruct the compiler to record profiling information for every function and subroutine in the program. Then with an external tool,&#8230;<\/p>\n","protected":false},"author":1692,"featured_media":0,"parent":64953,"menu_order":5,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/78164"}],"collection":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/users\/1692"}],"replies":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/comments?post=78164"}],"version-history":[{"count":23,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/78164\/revisions"}],"predecessor-version":[{"id":140700,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/78164\/revisions\/140700"}],"up":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/64953"}],"wp:attachment":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/media?parent=78164"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}