Cluster, SLURM, and Snakemake

Mark Keller edited this page Jun 9, 2019 · 3 revisions

Notes on submitting compute jobs to the CBCB cluster using SLURM (via Snakemake):

Note that UMIACS has its own SLURM wiki page: https://wiki.umiacs.umd.edu/umiacs/index.php/SLURM

File system

Files are stored on the network file system so they persist across cluster nodes (and also anywhere else that file system can be accessed).

Downloading files

You can download from (and upload to) the cluster with SFTP using your regular credentials. Clients like Transmit make this easy (I think Finder on Mac can do it as well).

Alternatively you can use scp to download/upload.
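For example, a minimal sketch of scp usage; the hostname and paths below are placeholders, not the actual cluster address:

```shell
# Download a file from the cluster to the current directory.
# "cluster.example.umd.edu" and the remote path are placeholders --
# substitute your actual login node and file locations.
scp my_username@cluster.example.umd.edu:/path/to/results.txt .

# Upload a local file to the cluster.
scp ./data.csv my_username@cluster.example.umd.edu:/path/to/project/
```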

QOS

When you submit a job to the cluster you will need to choose a Quality of Service (QOS). The QOS that you choose will define limits like max runtime or max number of CPUs.

To list all available QOS options (pipe to less -S so that long lines don't wrap):

sacctmgr show qos | less -S

To eliminate empty columns from the output:

sacctmgr show qos format=Name,Priority,MaxWall,MaxJobsPU,MaxTRES
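Once you have picked a QOS, pass it at submit time. A minimal sketch; the QOS name here is a placeholder, so list the real names with sacctmgr first:

```shell
# Submit a job under a specific QOS with an explicit time limit.
# "high_priority" is a placeholder QOS name; check `sacctmgr show qos`
# for the names that actually exist on your cluster.
sbatch --qos=high_priority --time=1:00:00 my_job.sh
```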

Queue

To check the job queue and see currently running or pending jobs:

squeue

To check the queue with details about time limits:

squeue -l

To check the queue but only show your jobs:

squeue -u my_username

Tasks and CPUs

Good StackOverflow post about how and why to specify the number of tasks and number of CPUs: https://stackoverflow.com/questions/51139711/hpc-cluster-select-the-number-of-cpus-and-threads-in-slurm-sbatch
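In short, --ntasks is for separate processes (e.g. MPI ranks) and --cpus-per-task is for threads within one process. A sketch of a submit script for a single multithreaded program; the flag values and program name are illustrative:

```shell
#!/bin/bash
# One task (one process), several CPUs for its threads.
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8

# Many threaded programs can read their thread count from this SLURM
# variable. "my_threaded_program" is a placeholder.
my_threaded_program --threads "$SLURM_CPUS_PER_TASK"
```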

Time limits

Be careful when specifying time limits in .yml cluster config files for Snakemake. Write times as strings (surround them with quotes in the YAML) and also surround them with quotes in the shell command.
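A minimal sketch of what this looks like; the file name and time value are illustrative, not from this repo:

```shell
# cluster.yml -- note the quotes around the time value:
#
#   __default__:
#     time: "4:00:00"
#
# Invoke Snakemake, quoting the time again inside the sbatch template
# so the colons are not misinterpreted:
snakemake --cluster-config cluster.yml \
  --cluster "sbatch --time='{cluster.time}'" \
  --jobs 10
```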

Example submit scripts
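A minimal sketch of a generic submit script; the job name, QOS, resource values, and command are all placeholders:

```shell
#!/bin/bash
#SBATCH --job-name=example_job     # placeholder job name
#SBATCH --qos=high_priority        # placeholder; see `sacctmgr show qos`
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G
#SBATCH --time=2:00:00             # quote this if it comes from YAML

# Placeholder workload -- replace with the real command.
echo "Running on $(hostname) with $SLURM_CPUS_PER_TASK CPUs"
```

Submit it with `sbatch my_job.sh` and watch it with `squeue -u my_username`.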