This post shows how I set up MATLAB to run parallel calculations on a multi-core server that is part of an existing cluster managed with Torque and Moab. I did this without making any changes to the cluster submission system.
The installation documentation for the MATLAB Parallel Computing Toolbox and Distributed Computing Server is poor. The instructions make many assumptions that don’t fit an existing “production” cluster environment. For example, they assume that a GUI is available to configure the cluster (this step is not required for my method!). Further, the instructions give the impression that MATLAB components must be “installed” on both the head node and the compute nodes by an administrator. THIS IS FALSE! Ordinary users can run concurrent (multicore) MATLAB jobs without administrative privileges!
The following method also allows you to run MATLAB with more than 8 or 12 cores. In fact, you can use 16 cores or more on a single machine! It may also be possible to distribute tasks across multiple nodes (hosts), but that’s for an upcoming post. See my repository of PBS submit script examples on GitHub.
Customize the script that launches the MATLAB Distributed Computing Server
Make a copy of the script
and modify it to suit your cluster. In my case, I only had to set the variables PIDBASE, LOCKBASE, LOGBASE, and CHECKPOINTBASE to point to locations in the /tmp directory of each compute node so that ordinary users can write files to those locations. My full script will appear on GitHub.
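As a rough sketch of what those four settings might look like in the customized copy, the excerpt below derives each location from the MATLABPARTEMP environment variable that the submit script exports. The subdirectory names (pid, lock, log, checkpoint) and the fallback value are my own illustrative assumptions, not MathWorks defaults, and you may need to create the directories before starting the daemon.

```shell
# Hypothetical excerpt from a customized copy of the mdce definitions script.
# MATLABPARTEMP is expected to be exported by the submit script; the fallback
# below is only an illustrative assumption so the excerpt works on its own.
MATLABPARTEMP="${MATLABPARTEMP:-/tmp/tmp${PBS_JOBID:-manual}}"

# Point every writable location at the per-job directory under /tmp.
# The subdirectory names are illustrative, not required by MATLAB.
PIDBASE="$MATLABPARTEMP/pid"
LOCKBASE="$MATLABPARTEMP/lock"
LOGBASE="$MATLABPARTEMP/log"
CHECKPOINTBASE="$MATLABPARTEMP/checkpoint"
```

Keeping everything under one per-job directory means a single recursive delete at the end of the job removes all of it.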
Create a submit script that runs MATLAB
I created a Torque submit script that automatically starts the MATLAB Distributed Computing Server daemon (mdce), a job manager, and a set of worker processes on the node. The full script will be available on GitHub; I will only highlight a few snippets below. First, start the mdce daemon on the compute node using the customized launch script from the previous step:
export MATLABPARTEMP=/tmp/tmp$PBS_JOBID
mdce start -clean -mdcedef /apps/CompVis/MATLAB/R2011a/toolbox/distcomp/bin/mdce_def_STOKES.sh
Note that I have exported an environment variable that points to a directory within /tmp that will hold all our temporary files; we will clean up this directory when the job is done. My custom mdce launch script needs this environment variable.
Next, start the job manager on the compute node:
startjobmanager -clean -name MyJobManager -v
Start worker daemons on the compute node:
for (( i=1; i<=$NP; i++ )); do
    echo "Starting worker process $i"
    startworker -jobmanagerhost localhost -jobmanager MyJobManager -name worker$i
done
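The worker loop relies on a variable NP holding the number of worker processes to start, which the snippets here do not define. One possible way to set it, assuming a single-node job, is to count the lines in the file that Torque names in $PBS_NODEFILE, which contains one line per allocated core. The helper name count_cores is hypothetical, and this is not necessarily how my full script does it.

```shell
# count_cores is a hypothetical helper: it prints the number of lines in the
# given node file (one line per allocated core), or 1 if the file is unreadable.
count_cores() {
    if [ -r "$1" ]; then
        wc -l < "$1" | tr -d ' '
    else
        echo 1
    fi
}

# Inside a Torque job, PBS_NODEFILE is set by the batch system.
NP=$(count_cores "${PBS_NODEFILE:-}")
echo "Will start $NP worker process(es)"
```

For a job that spans several nodes the file lists the cores of all nodes, so this count would only be appropriate for the single-node setup described in this post.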
Now, start MATLAB and run your code.
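In a batch job there is no desktop, so one option is to launch MATLAB non-interactively with the -nodisplay, -nosplash, and -r flags. The sketch below builds the command string passed to -r so that MATLAB always exits, even if your code throws an error; otherwise the job would hang at the MATLAB prompt and the cleanup steps would never run. The helper name matlab_batch_cmd and the script name my_analysis are placeholders I made up for illustration, and your MATLAB code still has to connect to MyJobManager itself.

```shell
# matlab_batch_cmd is a hypothetical helper: it wraps the name of a MATLAB
# script in try/catch so that MATLAB exits with status 1 on error, 0 otherwise.
matlab_batch_cmd() {
    printf 'try, %s, catch err, disp(getReport(err)), exit(1), end, exit(0)' "$1"
}

# Example use inside the submit script (my_analysis is a placeholder name):
#   matlab -nodisplay -nosplash -r "$(matlab_batch_cmd my_analysis)"
```

Because MATLAB's exit status reflects success or failure, the submit script can test it before deciding what to report.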
When the job is done, we need to shut down all the daemons in reverse order:
## Cleanup
sleep 3
for (( i=1; i<=$NP; i++ )); do
    echo "Stopping worker process $i"
    stopworker -clean -name worker$i
done
stopjobmanager -clean -name MyJobManager
mdce stop -clean
Finally, let’s clean up the /tmp directory on the compute node:
rm -rfv "$MATLABPARTEMP"
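If MATLAB crashes or the script exits early, the shutdown commands above are never reached and the daemons keep running on the node. A possible safeguard, which is my own suggestion rather than part of the original script, is to wrap the same commands in a shell function and register it with trap so that it runs whenever the script exits. The sketch assumes NP and MATLABPARTEMP are set as in the earlier steps.

```shell
NP="${NP:-1}"

# Shut everything down in reverse order of startup, then remove the temp files.
cleanup() {
    sleep 3
    for i in $(seq 1 "$NP"); do
        echo "Stopping worker process $i"
        stopworker -clean -name "worker$i"
    done
    stopjobmanager -clean -name MyJobManager
    mdce stop -clean
    # Only delete when the variable is set, to avoid removing the wrong path.
    if [ -n "${MATLABPARTEMP:-}" ]; then
        rm -rf "$MATLABPARTEMP"
    fi
}

# Run cleanup when the script exits, whether normally or after an error.
trap cleanup EXIT
```

Register the trap before starting the daemons, so that a failure during startup is cleaned up as well.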
Optional: Run MATLAB interactively
If you have used qsub -I … to obtain an interactive session on a compute node, you may manually run the commands above and run parallel commands in an interactive MATLAB session on the compute node. This approach is very helpful for debugging!