The EPYC™ System Management Software (E-SMS) stack comprises kernel modules, user space libraries, and tools that manage the power and performance of AMD's EPYC™ line of server CPUs through in-band and out-of-band interfaces.

E-SMS In-band stack

The E-SMS In-band stack is a Linux® software stack based on in-band interfaces such as MSRs and HSMP (Host System Management Port).

  • Kernel modules
    • amd_hsmp driver: Upstreamed Linux® kernel driver under pdx86/amd that provides an IOCTL interface to the in-band system management functionality.
    • amd_edac modules: Upstreamed Linux® kernel modules under the EDAC subsystem that provide error counts for memory devices.
    • amd_mce modules: Upstreamed Linux® kernel modules under the MCE framework that handle SMIs, decode errors, and log them to dmesg.
    • amd_energy driver: Open-sourced Linux® driver that reports per-core and per-socket energy consumption via hwmon attributes (readable by a privileged user).
  • User space libraries and tools
    • E-SMI In-band library: The EPYC™ System Management Interface In-band Library is a C library for Linux® that provides APIs for in-band user space software to monitor and control the CPU's power, energy, performance, and other system management functionality.
    • E-SMI tool: Command line tool with options for the features supported on the platform.
    • amd_smi_exporter: The AMD SMI Exporter provides AMD EPYC™ CPU and Datacenter GPU metrics to the Prometheus server.
    • Rasdaemon: Includes error decoding and logging support for AMD EPYC™ CPUs.
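
As a rough illustration of the hwmon path mentioned for the amd_energy driver, the Python sketch below scans sysfs for a hwmon device named "amd_energy" and converts its energy counters to joules. It assumes the driver is loaded, that it registers under that hwmon name, and that it follows the standard hwmon layout of energyN_input files (microjoules) with optional energyN_label files. The function name read_amd_energy is hypothetical and is not part of E-SMS.

```python
from pathlib import Path


def read_amd_energy(hwmon_root="/sys/class/hwmon"):
    """Return {label: joules} for each energy counter of the amd_energy device.

    Assumption: the driver registers a hwmon device whose 'name' file
    contains 'amd_energy'. Verify this on the target system.
    """
    readings = {}
    root = Path(hwmon_root)
    if not root.is_dir():
        return readings
    for dev in sorted(root.iterdir()):
        name_file = dev / "name"
        if not name_file.is_file():
            continue
        if name_file.read_text().strip() != "amd_energy":
            continue
        for input_file in sorted(dev.glob("energy*_input")):
            label_file = dev / input_file.name.replace("_input", "_label")
            if label_file.is_file():
                label = label_file.read_text().strip()
            else:
                label = input_file.name
            try:
                # hwmon reports energy in microjoules.
                microjoules = int(input_file.read_text())
            except (OSError, ValueError):
                # Unreadable counter, e.g. when run without privileges.
                continue
            readings[label] = microjoules / 1_000_000
    return readings


if __name__ == "__main__":
    for label, joules in read_amd_energy().items():
        print(f"{label}: {joules:.6f} J")
```

The counters are cumulative, so a power estimate would need two readings taken a known interval apart. On a system without the driver, the function simply returns an empty dictionary.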

E-SMS Out-of-band stack

The APML Suite (E-SMS Out-of-band stack) is a Linux® software stack based on AMD's out-of-band Advanced Platform Management Link (APML) interface and is targeted to run on the BMC. APML is an I3C or I2C slave interface.

  • Kernel modules: APML kernel modules are built and run on the BMC, which is connected to the AMD processors via the APML interface. These out-of-tree kernel modules are open-sourced (APML modules).
    • apml_sbtsi module: Based on the upstreamed Linux® driver sbtsi_temp.c under the hwmon subsystem; reports per-socket temperature and manages thresholds in the kernel.
    • apml_sbrmi module: Based on the upstreamed Linux® driver sbrmi.c under the hwmon subsystem; reports per-socket power consumption and limits, and controls the power limit.
    • Both the “apml_sbtsi” and “apml_sbrmi” modules register a misc_device that provides an IOCTL interface to user space, allowing user space software to run the custom APML protocols.
  • User space libraries and tools
    • APML Library (formerly E-SMI OOB Library): A C library for Linux® that provides APIs for the OOB (BMC) user space software to monitor and control the CPU's power, energy, performance, temperature, and other system management functionality.
    • APML tool: Command line tool with options for the features supported on the platform. The APML tool is released along with the library.
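
On the BMC side, the hwmon-based temperature and power reporting described above could be polled along the following lines. This is only a sketch under stated assumptions: it presumes the APML modules expose hwmon devices under the same names as the upstream drivers ("sbtsi" and "sbrmi") and use the standard hwmon attributes temp1_input (millidegrees Celsius) and power1_input (microwatts). The ATTRS table and the function name read_apml_sensors are hypothetical, so check the name files on the target BMC before relying on them.

```python
from pathlib import Path

# Assumed hwmon device names (taken from the upstream drivers) mapped to
# (attribute file, divisor to the base unit, unit string).
ATTRS = {
    "sbtsi": ("temp1_input", 1_000, "degC"),   # hwmon unit: millidegrees C
    "sbrmi": ("power1_input", 1_000_000, "W"),  # hwmon unit: microwatts
}


def read_apml_sensors(hwmon_root="/sys/class/hwmon"):
    """Return a list of (hwmon_dir, device_name, value, unit) tuples."""
    results = []
    root = Path(hwmon_root)
    if not root.is_dir():
        return results
    for dev in sorted(root.iterdir()):
        name_file = dev / "name"
        if not name_file.is_file():
            continue
        name = name_file.read_text().strip()
        if name not in ATTRS:
            continue
        attr, divisor, unit = ATTRS[name]
        try:
            raw = int((dev / attr).read_text())
        except (OSError, ValueError):
            # Attribute missing or unreadable; skip this device.
            continue
        results.append((dev.name, name, raw / divisor, unit))
    return results


if __name__ == "__main__":
    for hwmon_dir, name, value, unit in read_apml_sensors():
        print(f"{hwmon_dir} ({name}): {value:.3f} {unit}")
```

On a multi-socket platform each socket would typically appear as its own hwmon directory, which is why the directory name is kept in the result. Operations that go beyond these attributes go through the IOCTL interface and the APML Library instead.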