Bundling Jobs
It is better to bundle many small application runs (`srun` invocations) together inside one larger job than to submit them as many separate small jobs.
An example job array is presented below. In this example, a huge series of files needs to be compressed, and no task depends on any of the others.
First, a script writes all the compression commands, one per line, to commands.list. Then, submit this job:
sbatch --array=0-$(($COUNT/24)) preparelist.sbatch
Where COUNT is the number of lines in commands.list and 24 is how many lines per array task.
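As a concrete sketch, a command list might be built like this (the directory and the choice of gzip are placeholders, not a prescribed layout):

```shell
#!/bin/bash
# Hypothetical: build commands.list with one independent compression
# command per line.
for f in data/*.dat; do
    echo "gzip -9 $f"
done > commands.list

# Size the array: integer division rounds down, and sed in the array
# script simply prints nothing past the last line of commands.list,
# so a COUNT that is not a multiple of 24 is still handled correctly.
COUNT=$(wc -l < commands.list)
echo "array range: 0-$(($COUNT/24))"
```

For example, with COUNT=100 the range is 0-4: five tasks nominally covering lines 1-120, with the last task picking up only lines 97-100.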
preparelist.sbatch:
#!/bin/bash -xe
#SBATCH -A emslXXXXXX
#SBATCH -N 1
#SBATCH -n 16
#SBATCH -t 2:00:00
#SBATCH --job-name PrepareArray
#SBATCH -p short

# Number of commands each array task handles.
NUM=24
# First (S) and last (E) line of commands.list for this task:
# task 0 takes lines 1-24, task 1 takes lines 25-48, and so on.
S=$(($SLURM_ARRAY_TASK_ID*$NUM+1))
E=$(((1+$SLURM_ARRAY_TASK_ID)*$NUM))
# Extract this task's slice; the task ID in the filename avoids
# collisions if two array tasks land on the same node.
sed -n ${S},${E}p commands.list > /scratch/commands.${SLURM_ARRAY_TASK_ID}
# GNU parallel runs each line of the file as its own command,
# keeping one command per core busy at a time.
parallel :::: /scratch/commands.${SLURM_ARRAY_TASK_ID}
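The start/end arithmetic above can be checked by hand for the first few task IDs:

```shell
# Each array task claims a disjoint 24-line slice of commands.list.
NUM=24
for SLURM_ARRAY_TASK_ID in 0 1 2; do
    S=$(($SLURM_ARRAY_TASK_ID*NUM+1))
    E=$(((1+$SLURM_ARRAY_TASK_ID)*NUM))
    echo "task $SLURM_ARRAY_TASK_ID: lines $S-$E"
done
# task 0: lines 1-24
# task 1: lines 25-48
# task 2: lines 49-72
```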
Each array task runs 24 of the commands from the list, and since this example asks for only one node per array task, each slice finishes comfortably within the time limit. This farms out a bunch of small tasks, and you only need to wait on a single job ID to know that everything is done.
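If further work should start only after the whole array completes, a dependent job can be chained on that single job ID. This is a sketch of a submission fragment, not runnable outside a Slurm cluster, and postprocess.sbatch is a hypothetical follow-up script:

```shell
# Submit the array and capture its job ID (--parsable prints just the ID).
JOBID=$(sbatch --parsable --array=0-$(($COUNT/24)) preparelist.sbatch)
# afterok on an array job ID is satisfied only when every array task
# has finished successfully.
sbatch --dependency=afterok:${JOBID} postprocess.sbatch
```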