Introduction
The Weather Research and Forecasting (WRF) Model is a next-generation mesoscale numerical weather prediction system designed for both atmospheric research and operational forecasting applications. WRF features two dynamical cores, a data assimilation system, and a software architecture supporting parallel computation and system extensibility. The model serves a wide range of meteorological applications across scales from tens of meters to thousands of kilometers.
WRF official website: https://www.mmm.ucar.edu/weather-research-and-forecasting-model
Note: Spack's stdout and stderr are buffered in a Python™ string while a build runs and are lost when Spack exits (GitHub Link). The build might fail if the default /tmp is smaller than this buffered output; to avoid such failures, always set TMPDIR.
Example: export TMPDIR=$HOME/temp
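A minimal sketch (the path is only an example; any location with enough free space works):

# Create the temporary directory if needed, then point TMPDIR at it
$ mkdir -p $HOME/temp
$ export TMPDIR=$HOME/temp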
Build WRF using Spack
Reference for adding external packages to Spack: Build Customization (Adding external packages to Spack). A minimal example is sketched below.
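As a sketch of that reference (the package, version, and prefix below are placeholders, and the externals schema applies to recent Spack versions), a pre-installed Open MPI can be registered in ~/.spack/packages.yaml so Spack reuses it instead of rebuilding it:

$ cat >> ~/.spack/packages.yaml <<'EOF'
packages:
  openmpi:
    externals:
    - spec: openmpi@4.0.5
      prefix: /opt/openmpi-4.0.5
EOF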
# Format For Building WRF
$ spack -d install -v -j 16 wrf@<Version> %aocc@<Version> target=<zen2/zen3> build_type=dm+sm ^jemalloc ^hdf5@<Version>+fortran ^netcdf-c@<Version> ^netcdf-fortran@<Version> ^openmpi@<Version>+cxx fabrics=auto

# WRF 3.9.1.1
# Example: For Building WRF 3.9.1.1 with AOCC 3.1
$ spack -d install -v -j 16 wrf@3.9.1.1 %aocc@3.1.0 target=zen3 build_type=dm+sm ^jemalloc ^hdf5@1.12.0+fortran ^netcdf-c@4.7.0 ^netcdf-fortran@4.4.4 ^openmpi@4.0.5+cxx fabrics=auto

# Example: For Building WRF 3.9.1.1 with AOCC 3.0
$ spack -d install -v -j 16 wrf@3.9.1.1 %aocc@3.0.0 target=zen3 build_type=dm+sm ^jemalloc ^hdf5@1.8.21+fortran ^netcdf-c@4.7.0 ^netcdf-fortran@4.4.4 ^openmpi@4.0.3+cxx fabrics=auto

# Example: For Building WRF 3.9.1.1 with AOCC 2.3
$ spack -d install -v -j 16 wrf@3.9.1.1 %aocc@2.3.0 target=zen2 build_type=dm+sm ^jemalloc ^hdf5@1.8.21+fortran ^netcdf-c@4.7.0 ^netcdf-fortran@4.4.4 ^openmpi@4.0.3+cxx fabrics=auto

# Example: For Building WRF 3.9.1.1 with AOCC 2.2
$ spack -d install -v -j 16 wrf@3.9.1.1 %aocc@2.2.0 target=zen2 build_type=dm+sm ^jemalloc ^hdf5@1.8.21+fortran ^netcdf-c@4.7.0 ^netcdf-fortran@4.4.4 ^openmpi@4.0.3+cxx fabrics=auto

# WRF 4.2
# Example: For Building WRF 4.2 with AOCC 3.1.0
$ spack -d install -v -j 16 wrf@4.2 %aocc@3.1.0 target=zen3 build_type=dm+sm ^jemalloc ^hdf5@1.12.0+fortran ^netcdf-c@4.7.0 ^netcdf-fortran@4.4.4 ^openmpi@4.0.5+cxx fabrics=auto

# Example: For Building WRF 4.2 with AOCC 3.0
$ spack -d install -v -j 16 wrf@4.2 %aocc@3.0.0 target=zen3 build_type=dm+sm ^jemalloc ^hdf5@1.8.21+fortran ^netcdf-c@4.7.0 ^netcdf-fortran@4.4.4 ^openmpi@4.0.3+cxx fabrics=auto

# Example: For Building WRF 4.2 with AOCC 2.3
$ spack -d install -v -j 16 wrf@4.2 %aocc@2.3.0 target=zen2 build_type=dm+sm ^jemalloc ^hdf5@1.8.21+fortran ^netcdf-c@4.7.0 ^netcdf-fortran@4.4.4 ^openmpi@4.0.3+cxx fabrics=auto

# Example: For Building WRF 4.2 with AOCC 2.2
$ spack -d install -v -j 16 wrf@4.2 %aocc@2.2.0 target=zen2 build_type=dm+sm ^jemalloc ^hdf5@1.8.21+fortran ^netcdf-c@4.7.0 ^netcdf-fortran@4.4.4 ^openmpi@4.0.3+cxx fabrics=auto
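Once an install finishes, the resulting spec can be checked with standard Spack query commands:

# Show installed WRF specs with their variants, then the dependency tree with hashes
$ spack find -v wrf
$ spack find -dl wrf@3.9.1.1 %aocc@3.1.0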
Any combination of the below components/applications and their versions can be used.
Component/Application | Versions Applicable
WRF | 4.2, 3.9.1.1
AOCC | 3.1.0, 3.0.0, 2.3.0, 2.2.0
AOCL | 3.0, 2.3
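Before picking a combination, the WRF versions and compilers known to your Spack instance can be listed with standard Spack commands:

# List the versions and variants of the wrf package, and the registered compilers
$ spack info wrf
$ spack compilers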
Specifications and Dependencies
Symbol | Meaning
-d | Enable debug output
-v | Enable verbose output
@ | Specify the version number
% | Specify the compiler
-j 16 | Build in parallel with 16 jobs
build_type=dm+sm | Currently AOCC supports only this build type
^jemalloc | Build with the jemalloc dependency
^hdf5+fortran | Build with the hdf5 dependency, with Fortran support enabled
^netcdf-c | Build with the netcdf-c dependency
^netcdf-fortran | Build with the netcdf-fortran dependency
^openmpi+cxx | Use Open MPI for the build, with C++ bindings enabled
fabrics=auto | Use the fabrics=auto variant for Open MPI (the default is fabrics=none)
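Putting these symbols together, spack spec (a standard Spack command) previews the fully concretized dependency tree without installing anything:

# Dry-run the concretization to see exactly which versions, variants and dependencies will be built
$ spack spec wrf@3.9.1.1 %aocc@3.1.0 target=zen3 build_type=dm+sm ^jemalloc ^hdf5@1.12.0+fortran ^netcdf-c@4.7.0 ^netcdf-fortran@4.4.4 ^openmpi@4.0.5+cxx fabrics=auto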
Obtaining Benchmarks
WRF 3.9.1.1
There are two commonly used WRF data sets:
- Conus 12km benchmark – Single domain, medium size. 12 km CONUS, October 2001: a 48-hour, 12 km resolution case over the Continental U.S. (CONUS) domain starting October 24, 2001, with a time step of 72 seconds. The benchmark period is hours 25-27 (3 hours), starting from a restart file written at the end of hour 24 (provided).
- Conus 2.5km benchmark – Single domain, large size. 2.5 km CONUS, June 4, 2005: the latter 3 hours of a 9-hour, 2.5 km resolution case covering the Continental U.S. (CONUS) domain on June 4, 2005, with a 15-second time step. The benchmark period is hours 6-9 (3 hours), starting from a restart file written at the end of the initial 6-hour period.
The Conus 12km benchmark is a bit small for today's machines. The Conus 2.5km benchmark uses a 17 GB restart file and is preformatted to use Parallel NetCDF. However, the namelist.input file can be altered to use sequential I/O instead when no parallel file system such as Lustre, BeeGFS, or GPFS is available, as sketched below.
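A hedged sketch of that namelist change (in WRF namelists, io_form_* = 11 selects Parallel NetCDF and io_form_* = 2 selects serial NetCDF; verify the exact keys and current values in your namelist.input before editing):

# Switch the I/O format entries from Parallel NetCDF (11) to serial NetCDF (2)
$ sed -i 's/io_form_history *= *11/io_form_history = 2/' namelist.input
$ sed -i 's/io_form_restart *= *11/io_form_restart = 2/' namelist.input
$ sed -i 's/io_form_input *= *11/io_form_input = 2/' namelist.input
$ sed -i 's/io_form_boundary *= *11/io_form_boundary = 2/' namelist.input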
Running WRF on AMD 2nd Gen EPYC™ (Rome) Processors
WRF can be used for a variety of workloads but is commonly benchmarked with the Conus 2.5km and Conus 12km data sets.
The following steps are recommended for running the Conus 12km benchmark on the AMD EPYC 7742 processor, which has 128 cores per node (SMT off).
Setting Environment

# Format for loading the WRF module into the environment, built with AOCC
$ spack load wrf@<Version> %aocc@<Version>

# Example: load WRF 3.9.1.1 built with AOCC 2.3
$ spack load wrf@3.9.1.1 %aocc@2.3.0

# Locate and go to the WRF installation directory
$ spack cd -i wrf@3.9.1.1 %aocc@2.3.0
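A quick sanity check after loading (spack cd -i leaves you in the installation prefix, where the WRF executables live under main/):

# Confirm the MPI launcher is on PATH and the WRF binary is present
$ which mpirun
$ ls main/wrf.exe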
Runtime Environment Settings

# Go to the WRF installation directory and execute the following steps
$ cd test/em_real
$ rm namelist.input
$ ln -s /<path_to_conus_data>/conus_12km/* .
$ ulimit -s unlimited

$ export WRF_HOME=/<WRF installation directory>

# Common settings for AMD 2nd and 3rd Gen EPYC
$ export PBV=CLOSE
$ export OMP_NUM_THREADS=4
$ export OMP_PROC_BIND=TRUE
$ export OMP_STACKSIZE="16M"
$ export PE=4

$ rm -rf rsl.* 1node1tile wrfout*

# Try with 4, 8, 16, 32, 64, 128, 196
$ export WRF_NUM_TILES=128

# Open MPI binding used for AMD EPYC 7002 Series Processors
$ export RESOURCE=L3cache
$ export ITE=1

# Run command using Open MPI
$ mpirun -np 32 --bind-to core --map-by ppr:$ITE:$RESOURCE:pe=$PE numactl -l $WRF_HOME/main/wrf.exe
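The rank count follows from the core budget: with pe=4 cores per rank and OMP_NUM_THREADS=4, a 128-core EPYC 7742 node runs 128/4 = 32 MPI ranks. A minimal sketch that derives the rank count on any node size (the NP variable is illustrative; nproc equals the core count only with SMT off, as assumed here):

# Derive ranks per node from cores per node divided by threads per rank
$ export NP=$(( $(nproc) / OMP_NUM_THREADS ))
$ mpirun -np $NP --bind-to core --map-by ppr:$ITE:$RESOURCE:pe=$PE numactl -l $WRF_HOME/main/wrf.exe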
Running WRF on AMD 3rd Gen EPYC™ (Milan) Processors
Setting Environment

# Format for loading the WRF module into the environment, built with AOCC
$ spack load wrf@<Version> %aocc@<Version>

# Example: load WRF 3.9.1.1 built with AOCC 3.1.0
$ spack load wrf@3.9.1.1 %aocc@3.1.0

# Locate and go to the WRF installation directory
$ spack cd -i wrf@3.9.1.1 %aocc@3.1.0
Runtime Environment Settings

# Go to the WRF installation directory and execute the following steps
$ cd test/em_real
$ rm namelist.input
$ ln -s /<path_to_conus_data>/conus_12km/* .
$ ulimit -s unlimited

$ export WRF_HOME=/<WRF installation directory>

# Common settings for AMD 2nd and 3rd Gen EPYC
$ export PBV=CLOSE
$ export OMP_NUM_THREADS=4
$ export OMP_PROC_BIND=TRUE
$ export OMP_STACKSIZE="16M"
$ export PE=4

$ rm -rf rsl.* 1node1tile wrfout*

# Try with 4, 8, 16, 32, 64, 128, 196
$ export WRF_NUM_TILES=128

# Open MPI binding used for AMD EPYC 7003 Series Processors
$ export RESOURCE=numa
$ export ITE=4

# Run command using Open MPI
$ mpirun -np 32 --bind-to core --map-by ppr:$ITE:$RESOURCE:pe=$PE numactl -l $WRF_HOME/main/wrf.exe
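Before committing to a full run, the process placement can be sanity-checked with Open MPI's standard --report-bindings option, substituting a no-op command for wrf.exe:

# Print the core map Open MPI would apply; /bin/true exits immediately
$ mpirun -np 32 --bind-to core --map-by ppr:$ITE:$RESOURCE:pe=$PE --report-bindings /bin/true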
Calculating benchmark performance numbers
Once the benchmarks have run, use the bench.sh and getmean.sh scripts (getmean.sh in turn invokes stats.awk) to calculate the benchmark performance values. The contents of these files are listed below.
These commands are not target specific.
Get mean

# To get the statistics from rsl.out.*
# Create the bench.sh, getmean.sh and stats.awk scripts from the code blocks below
$ export SCRIPTS=/<scripts path>
$ cat rsl.out.* > 1node1tile
$ $SCRIPTS/bench.sh 1node1tile
$ $SCRIPTS/getmean.sh 1node1tile
bench.sh

#!/bin/bash
grep "Timing for main" $1 | awk 'BEGIN{t=0;at=0;i=0;}{t=t+$9;i=i+1;}END{at=t/i;print "\nAverage Time: " at " sec/step over " i " time steps\n"}'
getmean.sh

#!/bin/bash
grep "Timing for main" $1 | tail -149 | awk '{print $9}' | awk -f $SCRIPTS/stats.awk
stats.awk

# Invoked via "awk -f", so no shebang is needed
BEGIN { a = 0.0; i = 0; max = -999999999; min = 9999999999 }
{
    i++
    a += $1
    if ($1 > max) max = $1
    if ($1 < min) min = $1
}
END {
    printf("---\n%10s %8d\n%10s %15f\n%10s %15f\n%10s %15f\n%10s %15f\n%10s %15f\n", "items:", i, "max:", max, "min:", min, "sum:", a, "mean:", a/(i*1.0), "mean/max:", (a/(i*1.0))/max)
}