Commit ab53664
Simple examples with documentation
richardreeve committed Apr 25, 2024
1 parent 6ef7ef0 commit ab53664
Showing 5 changed files with 52 additions and 20 deletions.
examples/HPC/README.md (59 changes: 47 additions & 12 deletions)
@@ -1,18 +1,53 @@
# HPC testing

All of the testing here can be run directly from the root directory of a
checked out version of the [EcoSISTEM package][ecosistem-git], with the
scripts and the project environment being found in the `examples` folder and
its `examples/HPC` subfolder.

> You will probably need to install the relevant packages on the login node or
> some other node with internet access, as by default the job scripts will
> attempt to install the packages on startup.
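
For example, from the root of the checkout on a node with internet access,
something like the following will install the dependencies in advance (a
minimal sketch, mirroring the instantiation step used in the job scripts
below):

```sh
# Pre-install the project dependencies before submitting any jobs
julia --project=examples -e 'using Pkg; Pkg.instantiate()'
```
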
## Multithreaded testing

The following runs a standard (multithreaded) job, in this case on a node
with 2 processors of 64 cores each. By default any run is multithreaded and
will run in parallel on all available threads, so here the job uses 128
threads.

```sh
sbatch examples/HPC/demo-threads.bash
```
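
If you want a different thread count, it can be set explicitly through the
`-t` flag or the `JULIA_NUM_THREADS` environment variable, as the MPI scripts
below do. A quick check of what a run will use (a minimal sketch, assuming the
2 x 64-core node above):

```sh
# Set the Julia thread count explicitly and confirm what a run will use
export JULIA_NUM_THREADS=128
julia --project=examples -e 'println(Threads.nthreads())'
```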

## MPI testing

To run the code using MPI, you may need to configure MPI correctly for your
system. Here we use Julia's MPI libraries (installed with the MPI package),
but on an HPC system the locally provided MPI libraries will usually be
(potentially much) faster, as they are configured to take advantage of the
exact topology and hardware of the machine.
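
If you do want to switch to the system's MPI, MPI.jl can be pointed at it via
the MPIPreferences package; a minimal sketch (the module name is an assumption
for your cluster, and the examples below stick with the Julia-provided MPI):

```sh
# Sketch: configure MPI.jl to use the cluster's own MPI libraries instead
module load mpi/openmpi
julia --project=examples -e 'using Pkg; Pkg.add("MPIPreferences"); using MPIPreferences; MPIPreferences.use_system_binary()'
```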

These tests use a simple, but relatively large, example of a 256 x 256 grid
containing 64k species. The MPI tests are run from the EcoSISTEM package
directory using Julia's built-in MPI libraries. Note that this uses the
MPIRun.jl code, and an output folder (SAVEDIR) is set in there, which may not
be appropriate for your system.
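
It is worth checking (and if necessary editing) that location before
submitting anything; a minimal sketch:

```sh
# Find where MPIRun.jl will write its output
grep -n "SAVEDIR" examples/HPC/MPIRun.jl
```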

### Comparison of different process vs thread counts on a single node with 2 processors x 32 cores

The first example runs one task on each processor, with 32 threads per task
(one thread per core). The second runs one task per core, with each process
running single-threaded.

```sh
sbatch examples/HPC/demo-MPI-threads.bash
sbatch examples/HPC/demo-MPI-processes.bash
```
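
The essential difference between the two scripts is how the node's 64 cores
are split between MPI ranks and Julia threads; schematically (the values for
the single-threaded variant are inferred from the description above rather
than shown here):

```sh
# demo-MPI-threads.bash:    2 MPI ranks x 32 Julia threads each
#   --ntasks=2  --cpus-per-task=32   mpiexecjl -n 2  julia -t 32 ...
#
# demo-MPI-processes.bash:  64 MPI ranks x 1 Julia thread each (inferred)
#   --ntasks=64 --cpus-per-task=1    mpiexecjl -n 64 julia -t 1  ...
```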

### Comparison with four nodes and a mixture of multi-threading and multi-process

```sh
sbatch examples/HPC/demo-MPI-nodes.bash
```
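
A sketch of the kind of Slurm layout this corresponds to, extending the
single-node split across four nodes (the per-node values here are assumptions;
the exact settings live in `examples/HPC/demo-MPI-nodes.bash`):

```sh
# Assumed four-node layout: 2 MPI ranks x 32 Julia threads on each of 4 nodes
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=2
#SBATCH --cpus-per-task=32

bin/mpiexecjl --project=examples -n 8 julia -t 32 --project=examples examples/HPC/MPIRun.jl
```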

[ecosistem-git]: https://github.com/EcoJulia/EcoSISTEM.jl.git

@@ -15,7 +15,6 @@

############# LOADING MODULES (optional) #############
module load apps/julia
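# Install the project dependencies, build MPI.jl against its bundled MPI
# libraries, and put the mpiexecjl launcher wrapper into ./bin for later use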
julia --project=examples -e 'using Pkg; Pkg.instantiate(); Pkg.build("MPI"); using MPI; MPI.install_mpiexecjl(destdir = "bin", force = true)'

############# ENVIRONMENT #############

@@ -15,7 +15,6 @@

############# LOADING MODULES (optional) #############
module load apps/julia
julia --project=examples -e 'using Pkg; Pkg.instantiate(); Pkg.build("MPI"); using MPI; MPI.install_mpiexecjl(destdir = "bin", force = true)'

############# ENVIRONMENT #############

@@ -8,22 +8,21 @@
#SBATCH --time=0-12:00:00 # time limit for the whole run, in the form of d-hh:mm:ss, also accepts mm, mm:ss, hh:mm:ss, d-hh, d-hh:mm
#SBATCH --mem=256G # memory required per node, in the form of [num][M|G|T]
#SBATCH --nodes=1 # number of nodes to allocate, default is 1
#SBATCH --ntasks=2 # number of Slurm tasks to be launched, increase for multi-process runs ex. MPI
#SBATCH --cpus-per-task=32 # number of processor cores to be assigned for each task, default is 1, increase for multi-threaded runs
#SBATCH --ntasks-per-node=2 # number of tasks to be launched on each allocated node
#SBATCH --threads-per-core=1 # Threads per core

############# LOADING MODULES (optional) #############
module load apps/julia
julia --project=examples -e 'using Pkg; Pkg.instantiate(); Pkg.build("MPI"); using MPI; MPI.install_mpiexecjl(destdir = "bin", force = true)'

############# ENVIRONMENT #############
# Set the number of OpenMP threads to 1 to prevent
# any threaded system libraries from automatically
# using threading. Then manually set Julia threads
export OMP_NUM_THREADS=1
export JULIA_NUM_THREADS=32

############# MY CODE #############
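# Launch 2 MPI ranks, each running Julia with 32 threads (64 cores in total),
# via the mpiexecjl wrapper installed into ./bin above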
bin/mpiexecjl --project=examples -n 2 julia -t 32 --project=examples examples/HPC/MPIRun.jl
File renamed without changes.
