Version: AMD Zen HPL 2022-11
- This binary was built with AVX512 support and will only run on systems that support AVX512 instructions.
- Specifically: AMD “Zen4”-based processors such as the AMD 4th Generation EPYC™ CPUs.
- The binary will NOT run on AMD “Zen3”-based or prior processors.
- The binary was built on Red Hat® Enterprise Linux® 8.6 and runs without issue on Red Hat® Enterprise Linux® 9 and Ubuntu® 22.04.
- OpenMPI 4: This binary was built against OpenMPI 4.1.4 and should run without issue as long as OpenMPI 4 is in the PATH.
- Boost : ON
- Transparent Hugepages : always
- SMT : OFF
- NPS : 4
- Determinism : Power
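Some of the items above can be checked from a running Linux system before launching the benchmark. The following is a minimal sketch, not part of the AMD Zen HPL package: it looks for the avx512f CPU flag and reports the Transparent Hugepages and SMT state. The helper names are hypothetical, and the /proc and /sys paths are the usual Linux locations, which may be absent on some kernels. NPS and Determinism are BIOS settings and are not checked here.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def selected_option(sysfs_text):
    """Return the bracketed choice, e.g. 'always [madvise] never' -> 'madvise'."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return sysfs_text.strip()


def read_text(path):
    """Read a file, returning None if it does not exist or is unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


def main():
    cpuinfo = read_text("/proc/cpuinfo")
    if cpuinfo is None:
        print("AVX512: unknown (/proc/cpuinfo not readable)")
    else:
        print("AVX512:", "yes" if "avx512f" in cpu_flags(cpuinfo) else "NO")

    # Assumed standard sysfs locations; either may be missing on some kernels.
    thp = read_text("/sys/kernel/mm/transparent_hugepage/enabled")
    print("Transparent Hugepages:", selected_option(thp) if thp else "unknown")

    smt = read_text("/sys/devices/system/cpu/smt/active")
    if smt is None:
        print("SMT: unknown")
    else:
        print("SMT:", "ON" if smt.strip() == "1" else "OFF")


if __name__ == "__main__":
    main()
```

If the AVX512 line reports NO, the binary is not expected to run on that machine.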
How to Run:
- Modify the supplied HPL.dat file according to the community tuning guide.
- By default, the supplied file runs a very small problem. For peak performance, choose a larger value for ‘N’ so that memory use is close to 90% of system memory. Ideally, N is a multiple of the NB value.
- Other than selection of ‘N’, the supplied file is a reasonable starting place for most Zen4 systems
- Peak single-node performance is typically found with 1 MPI rank per socket, and as many threads per rank as there are physical cores in the socket. On a dual-socket node, this corresponds to P = 1, Q = 2.
- AMD Zen HPL introduces a new hybrid panel broadcast mechanism, which can be enabled by setting BCAST = 7
- Check the ‘run.sh’ script.
- By default, it sets the number of threads per rank to be the number of cores per socket, and the number of MPI ranks to 2.
- If the HPL.dat file is changed to a different P and Q, the thread and rank counts in ‘run.sh’ will need to be adjusted accordingly, since the process grid requires P × Q MPI ranks.
- (Optional) Clean up the system, and set various system parameters.
- As root, invoke ‘reset-system.sh’, which will clean up system memory, tune for Transparent Hugepages, disable NUMA balancing, set the CPU governor and enable CPU boost.
- Invoke ‘./run.sh’
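The sizing guidance for ‘N’ in the first step can be worked out numerically. The sketch below is illustrative and not part of the supplied package. It assumes that memory use is dominated by the N × N matrix of 8-byte double-precision values, so N is roughly the square root of (0.90 × memory in bytes / 8), rounded down to a multiple of NB. The function names, the 768 GiB memory size, and NB = 384 are placeholders; substitute the memory of your node and the NB value from your HPL.dat.

```python
import math

BYTES_PER_DOUBLE = 8  # HPL factors an N x N matrix of double-precision values


def suggest_n(mem_bytes, nb, fraction=0.90):
    """Largest multiple of nb whose N x N matrix fits in fraction of mem_bytes."""
    if mem_bytes <= 0 or nb <= 0 or not 0 < fraction <= 1:
        raise ValueError("mem_bytes and nb must be positive, fraction in (0, 1]")
    max_elements = int(mem_bytes * fraction) // BYTES_PER_DOUBLE
    n = math.isqrt(max_elements)  # largest N with N*N <= max_elements
    return (n // nb) * nb         # round down to a multiple of NB


def ranks_for_grid(p, q):
    """Number of MPI ranks needed by a P x Q process grid."""
    return p * q


if __name__ == "__main__":
    mem = 768 * 1024**3  # hypothetical node with 768 GiB of memory
    nb = 384             # placeholder; use the NB value from your HPL.dat
    print("N =", suggest_n(mem, nb))
    print("MPI ranks =", ranks_for_grid(1, 2))
```

Rounding down rather than to the nearest multiple keeps the matrix within the chosen memory fraction. The second helper is a reminder that the rank count set in ‘run.sh’ has to match the P and Q values in HPL.dat.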