checking up on running jobs
KasperSkytte committed Dec 20, 2024
1 parent e630a3d commit 2803d0d
Showing 3 changed files with 14 additions and 7 deletions.
14 changes: 9 additions & 5 deletions docs/slurm/accounting.md
@@ -86,12 +86,16 @@ $ sacctmgr show qos format="name,priority,usagefactor,mintres%20,maxtrespu,maxjo
highprio 1 2.000000 cpu=1,mem=512M cpu=512 2000
```

-See all account associations for your user and the QOS's you are allowed to use:
+See details about account associations, allowed QOS's, and more, for your user:
```
-$ sacctmgr list association user=$USER format=account%10s,user%20s,qos%20s
-Account User QOS
----------- -------------------- --------------------
-root [email protected] highprio,normal
+# your user
+$ sacctmgr show user withassoc where name=$USER
+User Def Acct Admin Cluster Account Partition Share Priority MaxJobs MaxNodes MaxCPUs MaxSubmit MaxWall MaxCPUMins QOS Def QOS
+---------- ---------- --------- ---------- ---------- ---------- --------- ---------- ------- -------- -------- --------- ----------- ----------- -------------------- ---------
[email protected]+ root None biocloud root 1 1 highprio,normal normal
+# all users
+$ sacctmgr show user withassoc | less
```
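To use such output in scripts, the QOS column can be extracted with standard text tools. A minimal sketch, assuming sacctmgr's `--noheader`/`--parsable2` pipe-delimited output keeps the column order shown above:

```
# Grab just the QOS field (second-to-last column) from an association line.
# The sample line below is adapted from the table above; on a real cluster you
# would instead pipe in:
#   sacctmgr show user withassoc where name=$USER --noheader --parsable2
line='[email protected]+|root|None|biocloud|root||1|1|||||||highprio,normal|normal'
qos=$(printf '%s' "$line" | awk -F'|' '{print $(NF-1)}')
echo "$qos"
```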

### Undergraduate students
2 changes: 1 addition & 1 deletion docs/slurm/jobcontrol.md
@@ -129,7 +129,7 @@ scontrol write batch_script <jobid>
## Modifying job attributes
Only a few job attributes can be changed after a job has been submitted but is **NOT** yet running. These attributes include:

-- wall clock limit
+- time limit
- job name
- job dependency
- partition or QOS
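Such attributes of a pending job can be adjusted with `scontrol update`, roughly along these lines (a sketch; `12345` is a hypothetical job ID and `general` a hypothetical partition name, and these only succeed while the job is still pending):

```
# extend the time limit of pending job 12345 to two days
scontrol update jobid=12345 TimeLimit=2-00:00:00

# rename the job and move it to another partition
scontrol update jobid=12345 JobName=mapping_run Partition=general
```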
5 changes: 4 additions & 1 deletion docs/slurm/jobsubmission.md
@@ -34,7 +34,10 @@ $ srun --cpus-per-task 8 --mem 16G --time 1-00:00:00 minimap2 <options>

The terminal will be blocked for the entire duration, hence for larger jobs it's ideal to submit a job through [`sbatch`](#non-interactive-jobs) instead, which will run in the background.
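As a rough equivalent of the `srun` one-liner above, a minimal `sbatch` job script might look like this (a sketch; `minimap2 <options>` is kept as a placeholder from the example above, and the script name is arbitrary):

```
#!/usr/bin/env bash
#SBATCH --job-name=minimap2
#SBATCH --cpus-per-task=8
#SBATCH --mem=16G
#SBATCH --time=1-00:00:00

minimap2 <options>
```

Submit it with `sbatch job.sh`; the job then runs in the background and the terminal is free immediately.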

-[`srun`](https://slurm.schedmd.com/archive/slurm-23.02.6/srun.html) is also used if multiple tasks (separate processes) must be run within the same resource allocation (job) already obtained through [`salloc`](https://slurm.schedmd.com/archive/slurm-23.02.6/salloc.html) or [`sbatch`](#non-interactive-jobs), see [example](#multi-node-multi-task-example) below. SLURM tasks can then span multiple compute nodes at once to distribute highly parallel work at any scale.
+[`srun`](https://slurm.schedmd.com/archive/slurm-23.02.6/srun.html) is also used to run multiple tasks/steps (parallel processes) within an already obtained resource allocation (job); see the [example](#multi-node-multi-task-example) below. SLURM tasks can then span multiple compute nodes at once to distribute highly parallel work at any scale.
+
+???+ tip "Checking up on running jobs using `srun`"
+    Since `srun` can run commands within any job allocation your user has been granted, it comes in handy for inspecting already running jobs. Obtain an interactive shell within the job allocation using `srun --jobid <jobid> --time=00:10:00 --pty /bin/bash`, then run for example `htop`, `top`, or `nvidia-smi` to inspect CPU and memory usage and verify that the job behaves as expected and fully utilizes all allocated CPUs. Note that only a single interactive terminal can be active within the same job allocation at any one time.
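In practice, the inspection workflow from the tip above could look like this (a sketch; `12345` is a hypothetical job ID):

```
# find the job ID of your running job
squeue -u $USER

# open a short-lived interactive shell inside its allocation
srun --jobid 12345 --time=00:10:00 --pty /bin/bash

# inside that shell: check CPU and memory usage of your processes
htop -u $USER
```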

### Graphical apps (GUI)
In order to run graphical programs simply append the [`--x11` option](https://slurm.schedmd.com/archive/slurm-23.02.6/srun.html#OPT_x11) to `salloc` or `srun` and run the program. The graphical app will then show up in a window on your own computer, while running inside a SLURM job on the cluster:
