...

Official documentation is provided at http://www.mathworks.com/help/toolbox/distcomp/.

 

...

The Parallel Computing Toolbox and Distributed Computing Server are installed on the CIP pool's workstations. Follow these steps to configure MATLAB for parallel job execution on the CIP cluster:

 

  1. Start MATLAB and open the Configurations Manager via Parallel->Manage Configurations....
  2. Click on File->New->jobmanager
  3. Fill out the form as shown in the screenshot.
  4. Press OK
  5. In the Configurations Manager window click on Start Validation.
  6. All test stages must pass within a few minutes.
  7. Click on File->New->jobmanager
  8. Fill out the form as follows:
    • Configuration name: cip-cuda
    • Job manager hostname: cipserv.cip.ph.tum.de
    • Job manager name: CIP-CUDA-R2011a
  9. Press OK
  10. In the Configurations Manager window click on Start Validation.
  11. All test stages must pass within a few minutes.
  12. Close the Configurations Manager window.

Your MATLAB is now configured for parallel job execution.

The cip-cuda configuration executes your jobs only on CUDA-capable workers. At the moment, the only such worker is the workstation cuda1.

...

If you have a CIP account but want to use MATLAB on a private/university computer that does not belong to the CIP pool, you can still run MATLAB workers on the CIP pool workstations. You must first install MATLAB and the Parallel Computing Toolbox on your computer. Use the same instructions as above to configure your MATLAB installation.

The hostname of your computer must be valid and resolvable, and your computer must accept incoming TCP connections from the CIP subnet 10.152.84.0/24. You can view the hostname that MATLAB uses by entering pctconfig at the MATLAB prompt. If MATLAB uses the wrong hostname, you can change it by entering pctconfig('hostname', '<NEW-HOSTNAME>') at the MATLAB prompt. This command must be executed in every MATLAB session, so it may be useful to put it in your startup.m file.

...

The following MATLAB script creates twenty 1500x1500 random matrices, stores them in the cell array a, and inverts them in parallel, storing the results in the cell array ia.

 

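clear

%% data generation
n = 1500;            % matrix dimension
a = cell(1, 20);     % preallocate the cell array of input matrices
for i = 1:20
    a{i} = rand(n, n);
end

%% parallel run
ia = cell(1, 20);    % preallocate the cell array of inverses
parfor i = 1:numel(a)
    ia{i} = inv(a{i});   % iterations are independent, so they can run on different workers
end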

Save this code snippet as dcsdemo.m.

Before you run this script, you have to initialize the MATLAB pool of workers that will execute your code in parallel on a set of nodes. This is done by calling matlabpool [poolsize] from the MATLAB command prompt, where [poolsize] is the number of workers you wish to use. For example, matlabpool 4 starts four workers. When you are finished, call matlabpool close to stop the workers and release the licenses.

Here is a sample session:

>> matlabpool 4
Starting matlabpool using the 'cip' configuration ... connected to 4 labs.
>> dcsdemo
>> whos
  Name      Size                Bytes  Class     Attributes

  a         1x20            360002240  cell                
  i         1x1                     8  double              
  ia        1x20            360002240  cell                
  n         1x1                     8  double              

>> matlabpool close
Sending a stop signal to all the labs ... stopped.
>>

Of course you may also call matlabpool from your script.

For further information, please read the user's guide at http://www.mathworks.com/help/toolbox/distcomp/brb2x2l-1.html.

 

...

The underlying Sun Grid Engine is probably not able to allocate enough slots for your job.

Please run qstat -f | grep matlab.q and count the number of available hosts. Each host is represented by one row. Failed or overloaded hosts are marked with au or E in the last column; available hosts have no marker in the last column. Example output:

$ qstat -f | grep matlab.q
matlab.q@c2quad1.cip.ph.tum.de BIP   0/0/4          10.16    lx24-amd64    
matlab.q@c2quad2.cip.ph.tum.de BIP   0/0/4          4.07     lx24-amd64    
matlab.q@iseven1.cip.ph.tum.de BIP   0/0/1          0.55     lx24-amd64    
matlab.q@iseven2.cip.ph.tum.de BIP   0/0/1          8.51     lx24-amd64    
matlab.q@iseven3.cip.ph.tum.de BIP   0/0/1          0.32     lx24-amd64    
matlab.q@iseven4.cip.ph.tum.de BIP   0/0/1          0.26     lx24-amd64    au
matlab.q@iseven5.cip.ph.tum.de BIP   0/0/1          0.76     lx24-amd64    au
matlab.q@iseven6.cip.ph.tum.de BIP   0/0/1          0.47     lx24-amd64 

In this example output, 6 of the 8 hosts are available because iseven4 and iseven5 have failed.
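Counting the rows by hand is error-prone, so the check can be scripted. The following is only a minimal sketch: the function name count_available_hosts is hypothetical, and it assumes that a queue row has exactly five whitespace-separated fields when the state column is empty (as in the example output above) and a sixth field when a state such as au or E is present. It reads the qstat output from standard input and is shown here with the sample rows so that it can be tried without a grid engine.

```shell
#!/bin/sh
# Count matlab.q rows whose state column (6th field) is empty.
# Assumption: rows look like the example output above, i.e.
# queue, type, slots, load, arch, and an optional state marker.
count_available_hosts() {
    awk '/^matlab\.q@/ && NF == 5 { n++ } END { print n + 0 }'
}

# Sample rows copied from the example output above.
sample_output='matlab.q@c2quad1.cip.ph.tum.de BIP   0/0/4          10.16    lx24-amd64
matlab.q@c2quad2.cip.ph.tum.de BIP   0/0/4          4.07     lx24-amd64
matlab.q@iseven1.cip.ph.tum.de BIP   0/0/1          0.55     lx24-amd64
matlab.q@iseven2.cip.ph.tum.de BIP   0/0/1          8.51     lx24-amd64
matlab.q@iseven3.cip.ph.tum.de BIP   0/0/1          0.32     lx24-amd64
matlab.q@iseven4.cip.ph.tum.de BIP   0/0/1          0.26     lx24-amd64    au
matlab.q@iseven5.cip.ph.tum.de BIP   0/0/1          0.76     lx24-amd64    au
matlab.q@iseven6.cip.ph.tum.de BIP   0/0/1          0.47     lx24-amd64'

available=$(printf '%s\n' "$sample_output" | count_available_hosts)
echo "available hosts: $available"   # prints "available hosts: 6"
```

On a CIP workstation the same function could be fed live data with qstat -f | count_available_hosts.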

If you request more slots than there are available hosts, SGE will wait (forever) until the required number of hosts becomes available. This causes MATLAB to hang. To fix this problem, adjust the ClusterSize in your MATLAB Parallel Preferences. Click on Parallel->Manage Configurations..., then double-click on the CIP parallel configuration. In the Generic Scheduler Configuration Properties dialog, set Number of workers available to scheduler (ClusterSize) to a value equal to or lower than the number of available SGE hosts. Alternatively, write to trouble@ph.tum.de and request a reset of the failed hosts.