Multi-Node Jobs
Large parallel Abaqus jobs can be run across multiple nodes using MPI.
Tip
Multi-node parallelism is best suited to Abaqus/Explicit. Abaqus/Standard (implicit) does not scale well across multiple nodes.
Job Script Template
Here is an example job submission script for a multi-node Abaqus job:
#!/usr/bin/bash -l
#
#SBATCH --job-name=my_job
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=28
#SBATCH --cpus-per-task=1
#SBATCH --time=0:10:00
#SBATCH --mem-per-cpu=4000M
#SBATCH --account=aero012345

# Load modules
module load apps/abaqus/2018
# module load languages/intel/2020-u4 # BlueCrystal (Phase 4)
# module load lang/intel-parallel-studio-xe/2020 # BluePebble

# Unset Slurm's global task IDs so that Abaqus's Platform MPI works
unset SLURM_GTIDS

# Get allocated nodes for Abaqus and build a list of [hostname, cores] pairs
env_file=abaqus_v6.env
node_list=$(scontrol show hostname "${SLURM_NODELIST}" | sort -u)
mp_host_list="["
for host in ${node_list}; do
    mp_host_list="${mp_host_list}['$host', ${SLURM_CPUS_ON_NODE}],"
done
# Replace the trailing comma with the closing bracket
mp_host_list=$(echo "${mp_host_list}" | sed -e 's/,$/]/')
# Note: >> appends, so clear stale mp_host_list lines before resubmitting from the same directory
echo "mp_host_list=${mp_host_list}" >> "${env_file}"

# Launch Abaqus (replace <job-name> and <usub-file> with your own values)
abaqus job=<job-name> cpus=$((SLURM_NTASKS_PER_NODE*SLURM_NNODES)) user=<usub-file> mp_mode=mpi double=both interactive
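To see what the host-list step writes into the environment file, the loop can be tried outside a job. This is a minimal sketch, assuming two hypothetical hostnames (node001, node002) and 28 cores per node; inside a real job these values come from scontrol and SLURM_CPUS_ON_NODE.

```shell
#!/usr/bin/env bash
# Stand-in values (assumptions for illustration only);
# in a real job Slurm supplies the node names and core count.
node_list="node001 node002"   # hypothetical hostnames
cpus_on_node=28

mp_host_list="["
for host in ${node_list}; do
    mp_host_list="${mp_host_list}['${host}', ${cpus_on_node}],"
done
# Replace the trailing comma with the closing bracket
mp_host_list=$(echo "${mp_host_list}" | sed -e 's/,$/]/')

echo "mp_host_list=${mp_host_list}"
# prints: mp_host_list=[['node001', 28],['node002', 28]]
```

The printed line has the same form as the one the job script appends to abaqus_v6.env.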
There are a number of important differences from the single-node job script example:
More than one node is requested
We request multiple tasks per node (distributed parallelism), instead of multiple CPUs per task (thread-based parallelism)
We have extra lines to inform Abaqus's Platform MPI about the nodes that have been allocated by the job scheduler
We must use mp_mode=mpi for multi-node parallelism
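As a rough check on what the template requests, the arithmetic behind the resource lines can be worked through by hand. The sketch below hard-codes the values from the #SBATCH lines of the template; the variable names are illustrative only and are not set by Slurm.

```shell
#!/usr/bin/env bash
# Values copied from the template's #SBATCH lines (illustrative variable names)
nodes=2
tasks_per_node=28
mem_per_cpu_mb=4000

# Total MPI tasks, which is the value passed to Abaqus as cpus=
total_cpus=$((nodes * tasks_per_node))
# Memory requested on each node, in MB (one CPU per task)
mem_per_node_mb=$((tasks_per_node * mem_per_cpu_mb))

echo "cpus=${total_cpus}"                  # prints: cpus=56
echo "memory per node=${mem_per_node_mb}M" # prints: memory per node=112000M
```

With one CPU per task, every core in the allocation runs its own MPI task, which is why the task count rather than the thread count sets the cpus value.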
How to use
Follow the same steps as described for the single-node example, except that you can scale the parallelism by changing the number of nodes (line 4 of the script).
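Options given on the sbatch command line take precedence over the matching #SBATCH lines, so the node count can also be changed at submission time without editing the script. The sketch below only assembles and prints the command; the script filename abaqus_multinode.sh is a hypothetical name for the template above.

```shell
#!/usr/bin/env bash
job_script=abaqus_multinode.sh   # hypothetical filename for the job script
nodes=4                          # desired number of nodes

# Build the submission command; run it directly on the cluster to submit
submit_cmd="sbatch --nodes=${nodes} ${job_script}"
echo "${submit_cmd}"
# prints: sbatch --nodes=4 abaqus_multinode.sh
```

Because the template computes cpus from SLURM_NTASKS_PER_NODE and SLURM_NNODES, the Abaqus core count should follow the new node count without further changes.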