Using Job Arrays
If you have a large number of similar, small- to medium-sized jobs, the use of a Slurm job array is recommended. It is convenient to use and allows the workload manager to schedule the work more efficiently than a large number of individual single-job submissions. A job array is created by adding the --array option to the sbatch command or by using #SBATCH --array= in your submission script.
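As a sketch, both of the following request a 100-task array; the script name job.sh is a placeholder for your own submission script:

```shell
# On the command line (overrides any --array line in the script):
sbatch --array=0-99 job.sh

# Or equivalently, as a directive inside job.sh itself:
#SBATCH --array=0-99
```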
The use of job arrays is best explained using an example. Let's assume that you have an executable called exec that is located at path_to_exec/ and that requires an input file with extension *.inp for running. Let's furthermore assume that you have a folder with 100 different *.inp files that you want to run. You can then start these 100 jobs as an array using the job script shown below.
#!/bin/bash
#SBATCH --job-name=jobArrayExample
#SBATCH --array=0-99
#SBATCH --time=00:05:00
#SBATCH --ntasks=1

# find the input file in the current folder that corresponds
# to this task's index
i=0
INP=""
while read line
do
    if [ "$SLURM_ARRAY_TASK_ID" -eq "$i" ]; then
        INP="$line"
        break
    fi
    (( i++ ))
done < <(ls *.inp)

if [ "$INP" ]; then
    path_to_exec/exec "$INP"
else
    echo "Job array size larger than no. of input files"
fi

The job array gets a single unique job ID. The IDs of the individual tasks are constructed by appending an index to the job ID of the entire array. These indices start at 0 and run to 99 (--array=0-99). They are also appended to the names of the stderr and stdout files.
When the above script is submitted with sbatch, Slurm launches one task per index given with the --array option; inside each task, the environment variable SLURM_ARRAY_TASK_ID is set to that task's index.
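The file-selection logic can be tested outside Slurm by setting SLURM_ARRAY_TASK_ID by hand. The sketch below uses a bash array in place of the while loop, which is an equivalent, more compact idiom; the file names and the task index are invented for the demonstration:

```shell
# Simulate one array task locally (no Slurm needed).
cd "$(mktemp -d)"
touch a.inp b.inp c.inp

export SLURM_ARRAY_TASK_ID=1   # pretend we are task 1
FILES=( *.inp )                # glob expands in sorted order
INP="${FILES[$SLURM_ARRAY_TASK_ID]}"
echo "$INP"                    # prints "b.inp"
```

Note that both the glob expansion here and the ls output in the job script enumerate the files in lexicographic order, so the two approaches pick the same file for a given index.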
If a few tasks of the array fail, say those with indices 4, 7, and 11, you can rerun just those by providing --array=4,7,11 to the sbatch command.
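For example, reusing the placeholder script name job.sh from above, the failed tasks would be resubmitted as:

```shell
# Resubmit only the tasks with indices 4, 7, and 11:
sbatch --array=4,7,11 job.sh
```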
It is not recommended to bundle more than 250 jobs in one job array.