UMBC High Performance Computing Facility

Please note that this page is under construction. We are documenting the
240-node cluster maya that will be available after Summer 2014.
Currently, the 84-node cluster tara still operates independently,
until it becomes part of maya at the end of Summer 2014.
Please see the 2013 Resources Pages under the Resources tab for tara information.

How to run MATLAB programs on maya

For more information about the software, see the MATLAB website. To use MATLAB on the cluster, the MATLAB module must be loaded.

[araim1@maya-usr1 ~]$ module load matlab

% Generate two 100x100 matrices with random contents:
A=rand(100);
B=rand(100);

% Multiply the two matrices:
AB=A*B;

% Calculate the sum of the contents:
sumAB=sum(AB(:));

% Save the AB and sumAB variables to the Matlab save file out.mat:
save out.mat AB sumAB;

Download: ../code/matrixmultiply-matlab/matrixmultiply.m

#!/bin/bash
#SBATCH --job-name=matrixmultiply
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --partition=develop

matlab -nodisplay -r "matrixmultiply, exit"

Download: ../code/matrixmultiply-matlab/run.slurm

[araim1@maya-usr1 matrixmultiply-matlab]$ sbatch run.slurm
sbatch: Submitted batch job 2621
[araim1@maya-usr1 matrixmultiply-matlab]$

>> load out.mat
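As a quick sanity check on the saved results, the following minimal sketch inspects out.mat after the job has finished. It assumes out.mat is in the current directory and was written by matrixmultiply.m; the struct name S is only illustrative.

```matlab
% List the variables stored in out.mat without loading them.
whos('-file', 'out.mat')

% Load into a struct so that existing workspace variables are not overwritten.
S = load('out.mat');

% AB should be 100x100, and sumAB should match a freshly computed sum.
fprintf('AB is %d x %d\n', size(S.AB, 1), size(S.AB, 2));
fprintf('sumAB = %.6f, recomputed = %.6f\n', S.sumAB, sum(S.AB(:)));
```

Loading into a struct rather than directly into the workspace makes it explicit which variables came from the file.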

[araim1@maya-usr1 matrixmultiply-matlab]$ cat slurm.out

                            < M A T L A B (R) >
                  Copyright 1984-2008 The MathWorks, Inc.
                         Version 7.6.0.324 (R2008a)
                             February 10, 2008

  To get started, type one of these: helpwin, helpdesk, or demo.
  For product information, visit www.mathworks.com.

[araim1@maya-usr1 matrixmultiply-matlab]$

#!/bin/bash
#SBATCH --job-name=plotsine
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --partition=develop

matlab -nodisplay -r "plotsine, exit"

Download: ../code/plotsine-matlab/run.slurm

zero_to_2pi=linspace(0,2*pi,1000);
them_sine=sin(zero_to_2pi);
plot(zero_to_2pi,them_sine);
print -dpng sine.png
print -deps sine.eps
print -djpeg sine.jpeg

Download: ../code/plotsine-matlab/plotsine.m

[araim1@maya-usr1 plotsine-matlab]$ ls
run.slurm  plotsine.m  sine.eps  sine.jpeg  sine.png  slurm.err  slurm.out
[araim1@maya-usr1 plotsine-matlab]$

The encapsulated PostScript file (sine.eps) will be in greyscale since I used -deps instead of -depsc.
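To illustrate the difference between the two EPS devices, here is a minimal sketch that writes both a greyscale and a color version of the same plot. The output file names sine_grey.eps and sine_color.eps are hypothetical and are not part of the downloadable example.

```matlab
% Same curve as in plotsine.m; the output file names are hypothetical.
x = linspace(0, 2*pi, 1000);
plot(x, sin(x));

print('-deps',  'sine_grey.eps');    % greyscale EPS, the device used in plotsine.m
print('-depsc', 'sine_color.eps');   % color EPS
```

The functional form print('-depsc', ...) is equivalent to the command form used in plotsine.m and is convenient when the file name is held in a variable.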

First grab the C files memory.c and memory.h, which are also used in How to check memory usage.

The following C code is written in a specific form which Matlab can interface to. It will call the get_memory_usage_kb function defined in the C files above. The code retrieves the VmRSS and VmSize quantities for the current process (see How to check memory usage for more information), and returns them as a pair to Matlab.

#include <sys/types.h>
#include <unistd.h>
#include "mex.h"
#include "memory.h"

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    int* data;
    long vmrss;
    long vmsize;

    if (nrhs > 0)
    {
        mexErrMsgTxt("Too many input arguments.");
    }

    get_memory_usage_kb(&vmrss, &vmsize);

    plhs[0] = mxCreateNumericMatrix(1, 1, mxUINT32_CLASS, mxREAL);
    data = mxGetData(plhs[0]);
    data[0] = vmrss;

    plhs[1] = mxCreateNumericMatrix(1, 1, mxUINT32_CLASS, mxREAL);
    data = mxGetData(plhs[1]);
    data[0] = vmsize;
}

Download: ../code/check_memory-matlab/getmemusage.c

#!/bin/bash
#SBATCH --job-name=matlab-mex
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --partition=batch
#SBATCH --ntasks-per-node=1

matlab -nodisplay -r "mex getmemusage.c memory.c, exit"

Download: ../code/check_memory-matlab/run.slurm

[araim1@maya-usr1 check_memory-matlab]$ sbatch run.slurm
Submitted batch job 22421
[araim1@maya-usr1 check_memory-matlab]$ ls
getmemusage.c  getmemusage.mexa64  memory.c  memory.h  run.slurm  slurm.err  slurm.out
[araim1@maya-usr1 check_memory-matlab]$

[araim1@maya-usr1 check_memory-matlab]$ matlab -nodisplay

                            < M A T L A B (R) >
                  Copyright 1984-2009 The MathWorks, Inc.
                Version 7.9.0.529 (R2009b) 64-bit (glnxa64)
                              August 12, 2009

  To get started, type one of these: helpwin, helpdesk, or demo.
  For product information, visit www.mathworks.com.

>> [vmrss, vmsize] = getmemusage

vmrss =

      101140

vmsize =

      933132

>> A = rand(5000, 5000);
>> [vmrss, vmsize] = getmemusage

vmrss =

      298572

vmsize =

     1128448

>>
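The numbers in the session above can be checked against the size of the matrix A. The following sketch does the arithmetic, assuming the reported values are in kB (as the name get_memory_usage_kb suggests) and that each matrix entry is an 8-byte double; the variable names are illustrative only.

```matlab
% Values copied from the session above (assumed to be in kB).
vmrss_before = 101140;   vmsize_before = 933132;
vmrss_after  = 298572;   vmsize_after  = 1128448;

% A 5000 x 5000 matrix of doubles at 8 bytes per entry, converted to kB.
expected_kb = 5000 * 5000 * 8 / 1024;

fprintf('expected     : %.1f kB\n', expected_kb);
fprintf('VmRSS growth : %d kB\n', vmrss_after - vmrss_before);
fprintf('VmSize growth: %d kB\n', vmsize_after - vmsize_before);
```

The expected size is 195312.5 kB, while VmRSS grew by 197432 kB and VmSize by 195316 kB, so the observed growth is consistent with the allocation of A.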

- **Parallel for** provides a simple way to parallelize "for" loops.
- **spmd** provides a "single program multiple data" programming model, which can be thought of as a simplified MPI.
- Support for **distributed data structures**

poolobj = parpool(8);

spmd
    msg = sprintf('Hello world from process %d of %d', labindex, numlabs);
end

for i=1:poolobj.NumWorkers
    disp(msg{i});
end

delete(poolobj);

Download: ../code/matlab-pct-hello/driver.m

After the "spmd" block, the "msg" data is available as a data structure that can be manipulated in serial. Here it consists of eight strings; we loop through and print each one. Notice here that the number of workers can be accessed by "poolobj.NumWorkers".
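The same pattern, per-worker values computed inside "spmd" and then combined in serial, can also be used for numerical work. The following is a minimal sketch, not part of the downloadable example: it assumes the Parallel Computing Toolbox is available and that at least four cores have been requested from the scheduler, and the variable names are illustrative.

```matlab
poolobj = parpool(4);

spmd
    % Each worker sums its own strided slice of the integers 1..100.
    partial = sum(labindex:numlabs:100);
end

% Outside the spmd block, partial{i} holds the value computed on worker i.
total = 0;
for i = 1:poolobj.NumWorkers
    total = total + partial{i};
end
disp(total);   % 5050

delete(poolobj);
```

Since the slices labindex:numlabs:100 cover every integer from 1 to 100 exactly once, the combined total is 5050 regardless of the number of workers.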

#!/bin/bash
#SBATCH --job-name=matlab-pct
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --partition=develop
#SBATCH --ntasks-per-node=8

matlab -nodisplay -r "driver, exit"

Download: ../code/matlab-pct-hello/run.slurm

[araim1@maya-usr1 matlab-pct-hello]$ sbatch run.slurm
[araim1@maya-usr1 matlab-pct-hello]$ cat slurm.err
[araim1@maya-usr1 matlab-pct-hello]$ cat slurm.out

                            < M A T L A B (R) >
                  Copyright 1984-2014 The MathWorks, Inc.
                   R2014a (8.3.0.532) 64-bit (glnxa64)
                             February 11, 2014

To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.

Starting parallel pool (parpool) using the 'local' profile ... connected to 8 workers.
Hello world from process 1 of 8
Hello world from process 2 of 8
Hello world from process 3 of 8
Hello world from process 4 of 8
Hello world from process 5 of 8
Hello world from process 6 of 8
Hello world from process 7 of 8
Hello world from process 8 of 8
Parallel pool using the 'local' profile is shutting down.
[araim1@maya-usr1 matlab-pct-hello]$

poolobj = parpool(8);

x = zeros(1, 40);
parfor i = 1:40
    x(i) = i;
end

delete(poolobj);
x

Download: ../code/matlab-pct-parfor/driver.m

#!/bin/bash
#SBATCH --job-name=matlab-parallel-toolkit
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --partition=develop
#SBATCH --ntasks-per-node=8

matlab -nodisplay -r "driver, exit"

Download: ../code/matlab-pct-parfor/run.slurm

[araim1@maya-usr1 matlab-pct-parfor]$ sbatch run.slurm
[araim1@maya-usr1 matlab-pct-parfor]$ cat slurm.err
[araim1@maya-usr1 matlab-pct-parfor]$ cat slurm.out

                            < M A T L A B (R) >
                  Copyright 1984-2014 The MathWorks, Inc.
                   R2014a (8.3.0.532) 64-bit (glnxa64)
                             February 11, 2014

To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.

Starting parallel pool (parpool) using the 'local' profile ... connected to 8 workers.
Parallel pool using the 'local' profile is shutting down.

x =

  Columns 1 through 13

     1     2     3     4     5     6     7     8     9    10    11    12    13

  Columns 14 through 26

    14    15    16    17    18    19    20    21    22    23    24    25    26

  Columns 27 through 39

    27    28    29    30    31    32    33    34    35    36    37    38    39

  Column 40

    40

[araim1@maya-usr1 matlab-pct-parfor]$
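The example above writes each result into its own slot of x. A parfor loop can also accumulate a single value across iterations, which MATLAB treats as a reduction. The following is a minimal sketch under the same assumptions as the driver above (Parallel Computing Toolbox available, eight cores requested); it is not part of the downloadable example.

```matlab
poolobj = parpool(8);

total = 0;
parfor i = 1:40
    % total is a reduction variable: iterations may run in any order,
    % and MATLAB combines the partial sums from the workers.
    total = total + i;
end

delete(poolobj);
disp(total);   % 820
```

Because addition of these integers does not depend on the order of the iterations, the result is 820 no matter how the iterations are distributed.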

[hu6@maya-usr1 ~]$ module list
Currently Loaded Modulefiles:
  1) cuda60/toolkit/6.0.37   3) gcc/4.8.2
  2) matlab/r2014a           4) slurm/14.03.6

- **Computationally intensive** Heavy computation can be done on the GPU with little data transfer.
- **Massively parallel** Similar tasks are performed repeatedly on different data.

gpuDeviceCount
gpuDevice
A = ones(10, 'single', 'gpuArray');
B = 5 .* eye(10, 'single', 'gpuArray');
C = A * B;
C_host = gather(C);
C_host

Download: ../code/matlab-gpu/driver_gpu.m

#!/bin/bash
#SBATCH --job-name=matlab-gpu
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --partition=batch
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --gres=gpu

matlab -nodisplay -r "driver_gpu, exit"

Download: ../code/matlab-gpu/run_gpu.slurm

[hu6@maya-usr1 sec6_gpu]$ cat slurm.out

                            < M A T L A B (R) >
                  Copyright 1984-2014 The MathWorks, Inc.
                   R2014a (8.3.0.532) 64-bit (glnxa64)
                             February 11, 2014

To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.

ans =

     1

ans =

  CUDADevice with properties:

                      Name: 'Tesla K20m'
                     Index: 1
         ComputeCapability: '3.5'
            SupportsDouble: 1
             DriverVersion: 6
            ToolkitVersion: 5.5000
        MaxThreadsPerBlock: 1024
          MaxShmemPerBlock: 49152
        MaxThreadBlockSize: [1024 1024 64]
               MaxGridSize: [2.1475e+09 65535 65535]
                 SIMDWidth: 32
               TotalMemory: 5.0327e+09
                FreeMemory: 4.9211e+09
       MultiprocessorCount: 13
              ClockRateKHz: 705500
               ComputeMode: 'Default'
      GPUOverlapsTransfers: 1
    KernelExecutionTimeout: 0
          CanMapHostMemory: 1
           DeviceSupported: 1
            DeviceSelected: 1

C_host =

     5     5     5     5     5     5     5     5     5     5
     5     5     5     5     5     5     5     5     5     5
     5     5     5     5     5     5     5     5     5     5
     5     5     5     5     5     5     5     5     5     5
     5     5     5     5     5     5     5     5     5     5
     5     5     5     5     5     5     5     5     5     5
     5     5     5     5     5     5     5     5     5     5
     5     5     5     5     5     5     5     5     5     5
     5     5     5     5     5     5     5     5     5     5
     5     5     5     5     5     5     5     5     5     5

The command gpuDeviceCount returns the number of GPUs you currently have access to. My slurm file requested one node and one GPU, but you can request two GPUs by specifying --gres=gpu:2.
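When more than one GPU has been requested, the devices can be enumerated from within MATLAB. The following is a minimal sketch, assuming the Parallel Computing Toolbox is available; on a node without a GPU the loop simply does not execute.

```matlab
n = gpuDeviceCount;
fprintf('%d GPU(s) visible\n', n);

for k = 1:n
    % gpuDevice(k) selects device k and returns an object with its properties.
    d = gpuDevice(k);
    fprintf('GPU %d: %s, %.1f GB total memory\n', ...
            d.Index, d.Name, d.TotalMemory / 1e9);
end
```

Note that gpuDevice(k) changes the currently selected device, so an enumeration like this is best done before any gpuArray data is created.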

The command gpuDevice returns the properties of the GPU, including its type, the maximum block and thread sizes supported, the GPU memory, etc.

In the computation part, I set up matrix A with all entries equal to 1 and matrix B as a diagonal matrix with all diagonal entries equal to 5. Both matrices are already in GPU memory since I specified 'gpuArray' when creating them, but you can also copy existing data from the CPU to the GPU. Note that matrix C is calculated on the GPU and therefore stays in GPU memory; to display it or use it in other code that executes on the CPU, you have to copy it to CPU memory by calling gather().
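For the alternative mentioned above, copying existing data from the CPU to the GPU, a minimal sketch looks as follows. It assumes a supported GPU is available to the job; the names A_host and B_host are illustrative and do not appear in driver_gpu.m.

```matlab
% Create the matrices in CPU memory first.
A_host = ones(10, 'single');
B_host = 5 .* eye(10, 'single');

% Copy them to GPU memory.
A = gpuArray(A_host);
B = gpuArray(B_host);

% The multiplication runs on the GPU; C stays in GPU memory.
C = A * B;

% Copy the result back to CPU memory.
C_host = gather(C);
disp(isequal(C_host, 5 * ones(10, 'single')));
```

The result is the same matrix of fives as in the driver above; the only difference is that the input data makes one extra trip from CPU to GPU memory.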

Detailed documentation on GPU programming with MATLAB is available from GPU Computing.