Using Job Arrays
If you have a large number of similar, small- to medium-sized jobs, the use of a Slurm job array is recommended. It is convenient to use and allows the workload manager to schedule the work more efficiently than a large number of individual single-job submissions. A job array is created by adding the --array option to the sbatch command or by using #SBATCH --array= in your submission script.
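As a sketch, both of the following request a 100-task array; the script name job.sh is a placeholder for your own submission script:

```shell
# On the command line (overrides any --array line in the script):
sbatch --array=0-99 job.sh

# Or equivalently, as a directive inside job.sh itself:
#SBATCH --array=0-99
```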
The use of job arrays is best explained using an example. Let's assume that you have an executable called exec that is located at path_to_exec/ and that requires an input file with extension *.inp for running. Let's furthermore assume that you have a folder with 100 different *.inp files that you want to run. You can then start these 100 jobs as an array using the job script shown below.
#!/bin/bash
#SBATCH --job-name=jobArrayExample
#SBATCH --array=0-99
#SBATCH --time=00:05:00
#SBATCH --ntasks=1

# find the input file in the current folder that corresponds
# to this task's index
i=0
INP=""
while read line
do
    if [ "$SLURM_ARRAY_TASK_ID" -eq "$i" ]; then
        INP="$line"
        break
    fi
    (( i++ ))
done < <(ls *.inp)

if [ "$INP" ]; then
    path_to_exec/exec "$INP"
else
    echo "Job array size larger than no. of input files"
fi

The job array gets a single unique job ID. The IDs of the individual tasks are constructed by appending an index to the job ID of the entire array. These indices start at 0 and run to 99 (--array=0-99). They are also appended to the names of the stderr and stdout files.
When the above script is submitted with sbatch, Slurm launches one task per index given with the --array option; inside each task, the environment variable SLURM_ARRAY_TASK_ID is set to that task's index.
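The file-selection logic can be tested outside Slurm by setting SLURM_ARRAY_TASK_ID by hand. The sketch below uses a bash array in place of the while loop, which is an equivalent, more compact idiom; the file names and the task index are invented for the demonstration:

```shell
# Simulate one array task locally (no Slurm needed).
cd "$(mktemp -d)"
touch a.inp b.inp c.inp

export SLURM_ARRAY_TASK_ID=1   # pretend we are task 1
FILES=( *.inp )                # glob expands in sorted order
INP="${FILES[$SLURM_ARRAY_TASK_ID]}"
echo "$INP"                    # prints "b.inp"
```

Note that both the glob expansion here and the ls output in the job script enumerate the files in lexicographic order, so the two approaches pick the same file for a given index.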
If a few tasks of the array fail, say those with indices 4, 7, and 11, you can rerun just those by providing --array=4,7,11 to the sbatch command.
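For example, reusing the placeholder script name job.sh from above, the failed tasks would be resubmitted as:

```shell
# Resubmit only the tasks with indices 4, 7, and 11:
sbatch --array=4,7,11 job.sh
```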
It is not recommended to bundle more than 250 jobs in one job array.