A web-based resource monitor for SLURM systems, tailored specifically for the UVA Rivanna computing environment.
This project extends the original implementation developed for the Visual Geometry Group, Oxford, with enhancements and customizations to better serve the specific needs of our research group at the University of Virginia.
- Parses the output of the sinfo command every second to update CPU/GPU resource usage.
- Hosts statistics on an internally accessible webpage, providing a convenient overview of system status.
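As a rough illustration of the polling step, the sketch below parses sinfo output into a per-node dictionary. The sinfo format string (`%N %C %G`), the function names `parse_sinfo` and `poll`, and the dictionary layout are assumptions made for this example and are not necessarily what this project uses internally.

```python
import subprocess
import time

# Assumed format: node name, CPU counts (allocated/idle/other/total), GRES string.
SINFO_CMD = ["sinfo", "-N", "-h", "-o", "%N %C %G"]


def parse_sinfo(text):
    """Parse `sinfo -N -h -o "%N %C %G"` output into a dict keyed by node name."""
    nodes = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        name, cpus, gres = parts
        alloc, idle, other, total = (int(x) for x in cpus.split("/"))
        nodes[name] = {
            "cpu_alloc": alloc,
            "cpu_idle": idle,
            "cpu_other": other,
            "cpu_total": total,
            "gres": gres,
        }
    return nodes


def poll(interval=1.0):
    """Yield a fresh snapshot every `interval` seconds (requires sinfo on PATH)."""
    while True:
        out = subprocess.run(
            SINFO_CMD, capture_output=True, text=True, check=True
        ).stdout
        yield parse_sinfo(out)
        time.sleep(interval)
```

Keeping the parser separate from the subprocess call means it can be exercised on canned text on a machine without SLURM installed.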
- [10/31/2024]: Added allocations.
- [10/31/2024]: Launched customized version for UVA Rivanna.
To install necessary dependencies, run:
pip install -r requirements.txt
To launch the web monitor:
python app.py --host localhost --port 8080
Access the website at localhost:8080. Adjust the host and port as needed for your setup.
Modify index.html to customize the header, footer, and formatting to suit your group's preferences.
For command-line usage:
python slurm_web/slurm_gpustat.py
Alternatively, add this alias to your .bash_profile:
alias slurm_gpustat='python ~/slurm_web/slurm_gpustat.py'
To view the statistics of only the available resources, run:
python available_resources.py
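The internals of available_resources.py are not described here. As a hedged sketch of what filtering to available resources could look like, the snippet below keeps only nodes that still have idle CPUs. The function name `available_only` and the `cpu_idle`/`cpu_total` keys are hypothetical and chosen for illustration only.

```python
def available_only(nodes):
    """Keep nodes that still have at least one idle CPU."""
    return {name: info for name, info in nodes.items() if info["cpu_idle"] > 0}


# Hypothetical snapshot; real data would come from parsing sinfo output.
snapshot = {
    "node1": {"cpu_idle": 36, "cpu_total": 40},
    "node2": {"cpu_idle": 0, "cpu_total": 40},
}
free = available_only(snapshot)
print(sorted(free))
```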
This project is based on the original slurm_gpustat tool developed by Samuel Albanie and slurm_web developed by Tengda Han. It has been modified and maintained for the UVA Rivanna system by the UVA CV Lab, with the aim of providing enhanced monitoring tools for Rivanna.
Further documentation and updates can be found at the original slurm_gpustat repository.