{"id":4618,"date":"2009-10-08T18:19:01","date_gmt":"2009-10-08T22:19:01","guid":{"rendered":"https:\/\/www.bu.edu\/tech\/research\/scv_import\/documentation\/software-help\/mathematics\/pct\/how_to_use_matlabpool\/"},"modified":"2019-07-25T16:31:27","modified_gmt":"2019-07-25T20:31:27","slug":"how_to_use_parpool","status":"publish","type":"page","link":"https:\/\/www.bu.edu\/tech\/support\/research\/software-and-programming\/common-languages\/matlab\/pct\/how_to_use_parpool\/","title":{"rendered":"How to Use parpool"},"content":{"rendered":"<ul>\n<li><a href=\"#parpool\">General Usage of parpool<\/a><\/li>\n<li><a href=\"#config\">Configuring parpool for Batch Submission<\/a><\/li>\n<\/ul>\n<h2><a name=\"parpool\"><\/a>General Usage of parpool<\/h2>\n<p>Invoking <strong>parpool<\/strong> submits a batch job to start a parallel environment.<\/p>\n<pre class=\"code-block\"><code><span class=\"prompt\">&gt;&gt;<\/span> <span class=\"command\">parpool(n)<\/span><\/code><\/pre>\n<p>where <code>n<\/code> is the number of labs (i.e. workers). If, in addition, <code><span class=\"command\">spmd<\/span><\/code> is invoked, then this parallel environment is almost equivalent to <code><span class=\"command\">pmode<\/span><\/code> but without the interactive GUI layout. By invoking <code><span class=\"command\">spmd<\/span><\/code>, <code>labindex<\/code> and <code>numlabs<\/code>, as are available in <code><span class=\"command\">pmode<\/span><\/code>, are now available. If <code><span class=\"command\">spmd<\/span><\/code> is not enabled (as in a <code><span class=\"command\">parfor<\/span><\/code> application), <code>labindex<\/code> is not needed and is not available. However, there are times when you might need to know the number of processors, <code>numlabs<\/code>. In that situation, you can do the following:<\/p>\n<pre class=\"code-block\"><code><span class=\"prompt\">&gt;&gt;<\/span>p = <span class=\"command\">gcp<\/span>\r\n<span class=\"prompt\">&gt;&gt;<\/span>n = <span class=\"command\">p.NumWorkers<\/span>\r\nn = 4\r\n<\/code><\/pre>\n<ul>\n<li><code><span class=\"command\">parfor<\/span><\/code> can be used with <code><span class=\"command\">parpool<\/span><\/code> (but not inside <code><span class=\"command\">spmd<\/span><\/code>). If <code><span class=\"command\">parpool(n)<\/span><\/code> is not explicitly executed before, the <code><span class=\"command\">parfor<\/span><\/code> will automatically launch as many workers as possible; that is, the number of workers will be equal to the maximum number of physical cores on the node.<\/li>\n<li>\n<pre class=\"code-block\"><code><span class=\"prompt\">&gt;&gt;<\/span> <span class=\"command\">parfor<\/span> i=1:4, disp(['myid is ' num2str(labindex) '; i = ' num2str(i)]),end  # <span style=\"color: #222222; font-family: Consolas;\">labindex<\/span> is not enabled in parfor\r\n<span class=\"output\">myid is 1; i = 4\r\nmyid is 1; i = 3\r\nmyid is 1; i = 2\r\nmyid is 1; i = 1<\/span>\r\n<span class=\"prompt\">&gt;&gt;<\/span> for j=1:4, disp(['myid is ' num2str(labindex) '; j = ' num2str(j)]),end\r\n<span class=\"output\">myid is 1; j = 1\r\nmyid is 1; j = 2\r\nmyid is 1; j = 3\r\nmyid is 1; j = 4<\/span>\r\n<span class=\"prompt\">&gt;&gt;<\/span> x = 0; <span class=\"command\">parfor<\/span> i=1:10, x = x + i; end   <span class=\"comment\">% reduction operation<\/span>\r\n<span class=\"prompt\">&gt;&gt;<\/span> y = []; <span class=\"command\">parfor<\/span> i=1:10, y = [y, i]; end   <span class=\"comment\">% concatenation<\/span>\r\n<span class=\"prompt\">&gt;&gt;<\/span> z = []; <span class=\"command\">parfor<\/span> i=1:10, z(i) = i; end    <span class=\"comment\">% slicing<\/span>\r\n<span class=\"prompt\">&gt;&gt;<\/span> f = zeros(1,50); f(1) =  1; f(2) = 2;\r\n<span class=\"prompt\">&gt;&gt;<\/span> <span class=\"command\">parfor<\/span> n=3:50\r\nf(n) = f(n-1) + f(n-2);\r\nend <span class=\"comment\">% Fibonacci numbers; not parallelizable because of data dependency. The op fails.<\/span>\r\n<span class=\"comment\">% Next is a reduction; but \"-\" violates associative rule, and the op fails<\/span>\r\n&gt;<span class=\"prompt\">&gt;<\/span> u = 0; <span class=\"command\">parfor<\/span> i=1:10, u = i - u; end<\/code><\/pre>\n<\/li>\n<li>\n<pre><code><span class=\"command\">spmd<\/span>\r\n<span class=\"prompt\">&gt;&gt;<\/span> <span class=\"command\">spmd<\/span> x = labindex; disp(['for myid = ' num2str(labindex) '; x = ' num2str(x)]),end\r\n<span class=\"output\">\r\nLab 1:\r\n  for myid = 1; x = 1\r\nLab 2:\r\n  for myid = 2; x = 2\r\nLab 3:\r\n  for myid = 3; x = 3\r\nLab 4:\r\n  for myid = 4; x = 4<\/span><\/code><\/pre>\n<p>The first column of output above are the rank numbers printed automatically by <code><span class=\"command\">spmd<\/span><\/code>. Without activating <code><span class=\"command\">spmd<\/span><\/code>, you have no access to <code>labindex<\/code> (or <code>numlabs<\/code>).<\/li>\n<\/ul>\n<ul>\n<li><code>&gt;&gt; <span class=\"command\">delete(gcp)<\/span><\/code> % to close all pre-existing or dangling parpool jobs<\/li>\n<li>A nondistributed array is a shared array. Any change made to the array by a lab causes the change to be seen by all labs.<\/li>\n<li>A replicated array (nondistributed) resides on the MATLAB client. This array is visible from the labs.<\/li>\n<li>A variant array resides, in its entirety, in an individual lab&#8217;s workspace. This is a composite array. A composite array can communicate between the MATLAB client and labs directly. A composite array can be manipulated from the client. In this case, the {labnumber} must always be used. Here is an example:\n<pre class=\"code-block\"><code><span class=\"prompt\">&gt;&gt;<\/span> A = Composite();\r\n<span class=\"prompt\">&gt;&gt;<\/span> A = magic(4);   <span class=\"comment\">% A is a replicate array<\/span>\r\n<span class=\"prompt\">&gt;&gt;<\/span> A{2} = ones(4);   <span class=\"comment\">% Change A on lab 2 directly from MATLAB client<\/span><\/code><\/pre>\n<\/li>\n<li>A codistributed array is a single array divided into segments (e.g., columns of a 2D array), each residing in the workspace of a lab.<\/li>\n<li>Codistributed arrays<br \/>\nArray of this type are partitioned into segments, each residing in a different lab. This results in memory saving, ideal for large arrays.<\/p>\n<pre class=\"code-block\"><code><span class=\"prompt\">&gt;&gt;<\/span> <span class=\"command\">spmd<\/span>\r\nA = [11:18; 21:28; 31:38; 41:48]\r\nD = codistributed(A)\r\nend\r\n<span class=\"prompt\">&gt;&gt;<\/span> <span class=\"command\">spmd<\/span>\r\nD\r\nsize(D)    <span class=\"comment\">% this will report size of the global array<\/span>\r\nL = getLocalPart(D)    <span class=\"comment\">% L is the same as D, locally<\/span>\r\nsize(L)    <span class=\"comment\">% this gives the size of the local array<\/span>\r\nend<\/code><\/pre>\n<\/li>\n<li>non-distributed arrays<br \/>\nArrays created on the MATLAB client, before or after parpool, are nondistributed arrays (replicated and variant arrays); entire array is stored on each lab.<\/li>\n<\/ul>\n<h4>Codistributed arrays<\/h4>\n<ul>\n<li>Why create a codistributed array?<br \/>\nFor more efficient parallel computing and memory usage.<\/li>\n<li>How to create a codistributed array?\n<ul>\n<li>Partitioning a larger array:\n<pre class=\"code-block\"><code><span class=\"prompt\">&gt;&gt;<\/span> <span class=\"command\">parpool(4)<\/span>\r\n<span class=\"prompt\">&gt;&gt;<\/span> <span class=\"command\">spmd<\/span>\r\nA = [11:18;21:28;31:38;41:48];   <span class=\"comment\">% replicate array<\/span>\r\nD = codistributed(A)  <span class=\"comment\">% codistributed<\/span>\r\nend <\/code><\/pre>\n<\/li>\n<li>Building from smaller arrays:\n<pre class=\"code-block\"><code><span class=\"prompt\">&gt;&gt;<\/span> <span class=\"command\">parpool(4)<\/span>\r\n<span class=\"prompt\">&gt;&gt;<\/span> <span class=\"command\">spmd<\/span>\r\nA = [11:13; 21:23; 31:33; 41:43] + (labindex-1)*3;  <span class=\"comment\">% variant array<\/span>\r\nD = codistributed(A);\r\nend<\/code><\/pre>\n<\/li>\n<li>Using MATLAB constructor functions:\n<pre class=\"code-block\"><code><span class=\"prompt\">&gt;&gt;<\/span> D = zeros(1000, codistributor());<\/code><\/pre>\n<\/li>\n<li><code><span class=\"command\">codistributed<\/span><\/code> is normally used in the parallel environment, like <code><span class=\"command\">spmd<\/span><\/code> and <code><span class=\"command\">pmode<\/span><\/code>.<\/li>\n<li><code><span class=\"command\">codistributed<\/span><\/code> is similar to the scatter function in MPI.<\/li>\n<li><code><span class=\"command\">gather<\/span><\/code> is the opposite of &#8220;codistributed&#8221;.<\/li>\n<li><code><span class=\"command\">numel<\/span><\/code> is supported for codistributed arrays. It returns the size of the codistributed arrays.<\/li>\n<\/ul>\n<\/li>\n<li>Indexing into a Codistributed Array:\n<pre class=\"code-block\"><code><span class=\"prompt\">&gt;&gt;<\/span> <span class=\"commmand\">spmd<\/span>\r\nA = [11:18; 21:28; 31:38; 41:48];   <span class=\"comment\">% A is a replicated array<\/span>\r\nD = codistributed(A);    <span class=\"comment\">% D is A distributed on labs; saves memory<\/span>\r\nL = getLocalPart(D);                   <span class=\"comment\">% L is a local copy of D<\/span>\r\nL(3,:)   <span class=\"comment\">% prints row 3 of local array<\/span>\r\nn = size(L,2);  <span class=\"comment\">% column size of L<\/span>\r\nL(3,1:n) <span class=\"comment\">% same as above<\/span>\r\nD(3,:)   <span class=\"comment\">% PCT treats D as if localPart(D)<\/span>\r\nD(3,1:end) <span class=\"comment\">% same as above<\/span>\r\nC = getCodistributor(D)  <span class=\"comment\">% obtain info of the codistributed array<\/span>\r\ns = C.Partition   <span class=\"comment\">% size of local partitions<\/span>\r\nlast = sum(s(1:labindex));       <span class=\"comment\">% global ending element index for local worker<\/span>\r\nfirst = last - s(labindex) + 1;  <span class=\"comment\">% global beginning element index<\/span>\r\nD(3,first:end) <span class=\"comment\">% first:end is global indexing; D is hence global<\/span>\r\nD(3,1:n)       <span class=\"comment\">% explicit indexing causes D to be treated as global;<\/span>\r\n               <span class=\"comment\">% fails on labs 2 to 4<\/span>\r\nend<\/code><\/pre>\n<\/li>\n<\/ul>\n<h2><a name=\"config\"><\/a>Configuring <code><span class=\"command\">parpool<\/span><\/code> for Batch Submission<\/h2>\n<p>In order to use <code><span class=\"command\">parpool<\/span><\/code> in a batch job, two things need to be configured: (1) the number of assigned cores to avoid the process reaper and (2) a temporary directory for log files to avoid simultaneous parpools from writing to the same directory. To do this, include the following code in your script replacing <code><span class=\"placeholder\">pipeline<\/span><\/code> with the name of your script and <code><span class=\"placeholder\">parameter<\/span><\/code> with some unique input available to your script such as your job ID:<\/p>\n<pre><code>\r\n% Set up 'local' parpool cluster\r\npc = <span class=\"command\">parcluster(<\/span>'local'<span class=\"command\">)<\/span>\r\n\r\n% Discover number of available cores assigned by SGE\r\nnCores = <span class=\"command\">str2num(getenv(<\/span>'NSLOTS'<span class=\"command\">))<\/span>\r\n\r\n% Setup directory for temporary parpool files\r\nparpool_tmpdir = ['~\/.matlab\/local_cluster_jobs\/<span class=\"placeholder\">pipeline<\/span>\/<span class=\"placeholder\">pipeline_<\/span>' <span class=\"placeholder\">parameter<\/span> ]\r\n<span class=\"command\">mkdir(<\/span>parpool_tmpdir<span class=\"command\">)<\/span>\r\n<span class=\"command\">pc.JobStorageLocation<\/span> = parpool_tmpdir\r\n\r\n% Start parpool\r\n<span class=\"command\">parpool(pc,nCores)<\/span>\r\n<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>General Usage of parpool Configuring parpool for Batch Submission General Usage of parpool Invoking parpool submits a batch job to start a parallel environment. &gt;&gt; parpool(n) where n is the number of labs (i.e. workers). If, in addition, spmd is invoked, then this parallel environment is almost equivalent to pmode but without the interactive GUI&#8230;<\/p>\n","protected":false},"author":1692,"featured_media":0,"parent":4613,"menu_order":1,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/4618"}],"collection":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/users\/1692"}],"replies":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/comments?post=4618"}],"version-history":[{"count":32,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/4618\/revisions"}],"predecessor-version":[{"id":122528,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/4618\/revisions\/122528"}],"up":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/pages\/4613"}],"wp:attachment":[{"href":"https:\/\/www.bu.edu\/tech\/wp-json\/wp\/v2\/media?parent=4618"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}