Tasks

  1. Create and configure a prologue and an epilogue for tasks, following the documentation at https://slurm.schedmd.com/prolog_epilog.html. For example, a task prologue can show users the details of their jobs:

#!/bin/bash
#
# Task prologue script, to be run by slurmstepd.
# Lines written to stdout that start with "print " are copied to the task's stdout.

trap "exit 0" 1 2 3 15 20

# Skip the banner if the job's output file is already non-empty.
if [[ -n $SLURM_JOB_ID ]]; then
    SLURM_JOB_STDOUT=$(/usr/bin/scontrol show job "${SLURM_JOB_ID}" | grep -i stdout | cut -f2 -d '=')
    if [[ -s $SLURM_JOB_STDOUT ]]; then
        exit 0
    fi
fi

if [[ $SLURM_PROCID -eq 0 ]]
then
    echo "print ==================================================="
    echo "print Begin TASK Prologue $(date)"
    echo "print ==================================================="
    echo "print Job ID:           $SLURM_JOB_ID"
    echo "print Username:         $SLURM_JOB_USER"

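    if [[ -n $SLURM_JOB_GID ]]; then
        # Look the group up by numeric GID; a plain grep would also match
        # other lines that merely contain the same digits.
        GROUP=$(getent group "$SLURM_JOB_GID" | cut -d ":" -f 1)
        echo "print Group:            $GROUP"
    fi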

    echo "print Job Name:         $SLURM_JOB_NAME"
    echo "print Resources List:   nodes=$SLURM_JOB_NUM_NODES:ppn=$SLURM_JOB_CPUS_PER_NODE:ntasks=$SLURM_NTASKS"
    echo "print Queue:            $SLURM_JOB_PARTITION"

    [ -z "$SLURM_JOB_ACCOUNT" ] || echo "print Account:          $SLURM_JOB_ACCOUNT"
    [ -z "$SLURM_JOB_NODELIST" ] || echo "print Nodes:            $SLURM_JOB_NODELIST"
    [ -z "$SLURM_GPUS" ] || echo "print GPUs:             $SLURM_GPUS"

    echo "print ==================================================="
    echo "print End TASK Prologue $(date)"
    echo "print ==================================================="
fi
exit 0
#EOF
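
Slurm interprets the standard output of a task prologue: lines starting with "print " are written to the task's stdout, while "export NAME=value" and "unset NAME" lines change the task environment. To preview what users would see without submitting a job, a minimal sketch is shown below. The function name preview_prolog_output is hypothetical and not part of Slurm, and the filter only imitates the "print " handling:

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm): read task prologue output on stdin
# and show only the text that would reach the task's stdout, that is, the
# lines starting with "print ", with that prefix removed.
preview_prolog_output() {
    sed -n 's/^print //p'
}
```

Assuming the script is saved at the path used in the configuration for this exercise, it could be tried by hand with something like `SLURM_PROCID=0 SLURM_JOB_NAME=demo bash /var/spool/slurm/task_prologue | preview_prolog_output`. With SLURM_JOB_ID unset, the scontrol lookup is skipped.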

These scripts are enabled by setting the following options in slurm.conf:

TaskEpilog=/var/spool/slurm/task_epilogue
TaskPlugin=task/affinity,task/cgroup
TaskProlog=/var/spool/slurm/task_prologue
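
Before restarting the daemons, it can help to confirm that the configured scripts exist and are executable on the compute node. The following is a minimal sketch, not an official Slurm tool: the function name check_task_scripts is hypothetical, the path of the configuration file is passed in by the caller, and option names are matched with the exact capitalization shown above.

```shell
#!/bin/bash
# Hypothetical helper: report whether the TaskProlog and TaskEpilog scripts
# named in a slurm.conf-style file exist and are executable.
# Usage: check_task_scripts /path/to/slurm.conf
check_task_scripts() {
    local conf=$1 key path status=0
    for key in TaskProlog TaskEpilog; do
        # Use the value of the last uncommented "Key=value" line.
        path=$(sed -n "s/^[[:space:]]*${key}=\([^[:space:]#]*\).*/\1/p" "$conf" | tail -n 1)
        if [[ -z $path ]]; then
            echo "$key: not set"
            status=1
        elif [[ -x $path ]]; then
            echo "$key: $path (ok)"
        else
            echo "$key: $path (missing or not executable)"
            status=1
        fi
    done
    return $status
}
```

The function returns a non-zero status if either option is unset or points to a file that cannot be executed, so it can be used as a guard in a deployment script.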

  2. (Merit) Block users from logging in to nodes unless they have an active job running on that node. Follow the instructions at https://slurm.schedmd.com/pam_slurm_adopt.html. TIP: The relevant PAM configuration, expressed as an Ansible task, is:

- name: Ensure PAM module is used
  ansible.builtin.blockinfile:
    path: /etc/pam.d/sshd
    insertafter: account    required     pam_nologin.so
    content: |
      account    sufficient   pam_slurm_adopt.so
      account    required     pam_access.so
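
Because PAM evaluates the account lines in order, it may be worth checking that the pam_slurm_adopt.so entry really ended up after pam_nologin.so once the task has run. A minimal sketch follows. The function name pam_adopt_after_nologin is hypothetical, and it only inspects the file passed as an argument:

```shell
#!/bin/bash
# Hypothetical helper: succeed only if an "account" line for pam_slurm_adopt.so
# appears after the "account" line for pam_nologin.so in the given PAM file.
# Usage: pam_adopt_after_nologin /etc/pam.d/sshd
pam_adopt_after_nologin() {
    awk '
        $1 == "account" && $3 == "pam_nologin.so"     { nologin = NR }
        $1 == "account" && $3 == "pam_slurm_adopt.so" { adopt = NR }
        END { exit !(nologin && adopt > nologin) }
    ' "$1"
}
```

Commented-out entries are ignored, since their first field is not exactly "account". A final check is still to try an SSH login to a node with and without a running job.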