Commit
Merge pull request #19 from UCL/DanGiles-docs-update
Update README-Scotch.md
DanGiles authored Aug 21, 2023
2 parents 0f5c901 + c33a912 commit 5053cb6
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README-Scotch.md
@@ -40,7 +40,7 @@ Follow the usual installation [instructions](https://gitlab.cosma.dur.ac.uk/swif
Running with Scotch
----------------

-Scotch decomposes the SWIFT spatial domain and maps it to the available compute resources, taking into account the communication cost between components of the architecture. To do this, the user needs to generate an appropriate architecture file that mirrors the setup of the cluster being used. Scotch provides optimised architecture files which capture most HPC setups. As we will be targeting NUMA regions on Cosma 8, we have modelled the architecture as a `tleaf` structure.
+Scotch carries out a _mapping_ of a _source_ (or process) graph onto a _target_ (or architecture) graph. The weighted _source_ graph is generated by SWIFT and captures the computation and communication cost across the computational domain. The _target_ graph defines the communication cost across the available computing architecture. Therefore, to make use of the Scotch _mapping_ algorithms, a target architecture file (_target.tgt_) must be generated, and it should mirror the setup of the cluster being used. Scotch provides optimised architecture files which capture most HPC setups. As we will be targeting NUMA regions on Cosma 8, we have modelled the architecture as a `tleaf` structure.

In the following examples it is assumed that one MPI rank is mapped to each Cosma 8 NUMA region. This requires `cpus-per-task=16` to be set in the SLURM submission script. The Cosma 8 nodes consist of 8 NUMA regions per node, with 4 NUMA regions per socket. Example `tleaf` files for various setups are given below, where the intrasocket communication cost between NUMA regions is set to _5_, the intranode cost across sockets is set to _10_, and the internode cost is set to _1000_. These weightings are estimates, but they have been shown to give satisfactory results in the test cases explored.
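
As a rough illustration of the cost hierarchy described above, the following Python sketch returns the communication cost between two MPI ranks. It is not part of SWIFT or Scotch. The function name `comm_cost`, the zero cost for a rank talking to itself, and the rank layout (consecutive ranks fill one socket, then the next socket, then the next node) are assumptions made for this example only; the cost values and NUMA counts are taken from the text.

```python
# Illustrative sketch only, not SWIFT or Scotch code.
# Values from the README text:
NUMA_PER_SOCKET = 4      # NUMA regions per socket on a Cosma 8 node
NUMA_PER_NODE = 8        # NUMA regions per node (2 sockets x 4 regions)
INTRASOCKET_COST = 5     # same socket, different NUMA region
INTRANODE_COST = 10      # same node, different socket
INTERNODE_COST = 1000    # different nodes


def comm_cost(rank_a: int, rank_b: int) -> int:
    """Return the modelled communication cost between two MPI ranks.

    Assumption: one rank per NUMA region, with ranks numbered
    consecutively within a socket and then within a node.
    """
    if rank_a == rank_b:
        return 0  # assumption: no cost for a rank communicating with itself
    if rank_a // NUMA_PER_NODE != rank_b // NUMA_PER_NODE:
        return INTERNODE_COST
    if rank_a // NUMA_PER_SOCKET != rank_b // NUMA_PER_SOCKET:
        return INTRANODE_COST
    return INTRASOCKET_COST


if __name__ == "__main__":
    print(comm_cost(0, 3))  # same socket -> 5
    print(comm_cost(0, 4))  # same node, other socket -> 10
    print(comm_cost(0, 8))  # different nodes -> 1000
```

Under this assumed layout, ranks 0 to 3 share a socket, ranks 0 to 7 share a node, and any pair of ranks on different nodes pays the internode cost, which is why the mapping strongly favours keeping heavily communicating cells on the same node.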
