The EPYC™ System Management Software (E-SMS) stack comprises kernel modules, user space libraries, and tools for managing the power and performance of AMD EPYC server CPUs through In-Band and Out-of-Band interfaces.

E-SMS In-Band Stack

The E-SMS In-Band stack is a Linux® software stack based on In-Band interfaces, such as Model-Specific Registers (MSRs) and the Host System Management Port (HSMP).

  • Kernel modules
    • amd_hsmp driver: Upstreamed Linux kernel driver under pdx86/amd that provides an input/output control (IOCTL) interface to the In-Band system management functionality.
    • amd_edac modules: Upstreamed Linux kernel modules under the EDAC subsystem that provide error counts for memory devices.
    • amd_mce modules: Upstreamed Linux kernel modules under the Machine Check Exception (MCE) framework that handle SMIs, decode errors, and log them in dmesg.
    • amd_energy driver: Open-sourced Linux driver that reports per-core and per-socket energy consumption through hwmon attributes (requires a privileged user).
  • User space libraries and tools
    • E-SMI In-Band library: The EPYC System Management Interface (E-SMI) In-Band Library is a C library for Linux that provides APIs for In-Band user space software to monitor and control CPU power, energy, performance, and other system management functionality.
    • E-SMI tool: Command line tool with options for the features supported on the platform.
    • amd_smi_exporter: AMD SMI Exporter provides AMD EPYC CPU and datacenter GPU metrics to the Prometheus server.
    • Rasdaemon: Includes error decoding and logging support for AMD EPYC CPUs.
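As an illustration of the hwmon path mentioned for the amd_energy driver, the following minimal sketch reads energy counters from sysfs. It assumes the driver registers a hwmon device named "amd_energy" and exposes the standard hwmon `energyN_input` (microjoules) and `energyN_label` attributes; the function name `read_amd_energy` and the `hwmon_root` parameter are hypothetical and exist only for this example.

```python
from pathlib import Path


def read_amd_energy(hwmon_root="/sys/class/hwmon"):
    """Return {label: joules} for each energy counter of an amd_energy device."""
    readings = {}
    root = Path(hwmon_root)
    if not root.is_dir():
        return readings
    for dev in sorted(root.iterdir()):
        name_file = dev / "name"
        # Assumption: the hwmon device name is "amd_energy".
        if not name_file.is_file() or name_file.read_text().strip() != "amd_energy":
            continue
        for input_file in sorted(dev.glob("energy*_input")):
            label_file = dev / input_file.name.replace("_input", "_label")
            try:
                microjoules = int(input_file.read_text().strip())
                if label_file.is_file():
                    label = label_file.read_text().strip()
                else:
                    label = input_file.name  # fall back to the attribute name
            except (OSError, ValueError):
                # Usually a permission error when run as an unprivileged user.
                continue
            # The hwmon sysfs convention reports energy in microjoules.
            readings[label] = microjoules / 1_000_000
    return readings


if __name__ == "__main__":
    for label, joules in read_amd_energy().items():
        print(f"{label}: {joules:.3f} J")
```

On a system without the driver loaded, the function returns an empty dictionary. Sampling twice and dividing the energy difference by the elapsed time gives an average power figure.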

E-SMS Out-of-Band Stack

The APML Suite (E-SMS Out-of-Band stack) is a Linux software stack based on AMD’s Out-of-Band Advanced Platform Management Link (APML) interface and is intended to run on the Baseboard Management Controller (BMC). APML is an I3C or I2C slave interface.

  • Kernel modules: APML kernel modules are built and run on the BMC, which is connected to the AMD processors through the APML interface. These out-of-tree kernel modules are open-sourced (APML modules).
    • apml_sbtsi module: Based on the upstreamed Linux driver sbtsi_temp.c under the hwmon subsystem; reports per-socket temperature and manages temperature thresholds in the kernel.
    • apml_sbrmi module: Based on the upstreamed Linux driver sbrmi.c under the hwmon subsystem; reports per-socket power consumption and controls the power limit.
    • Both the “apml_sbtsi” and “apml_sbrmi” modules register a misc_device that provides an IOCTL interface to user space, allowing user space software to run the custom protocols.
  • User space libraries and tools
    • APML Library (formerly E-SMI OOB Library): C library for Linux that provides APIs for Out-of-Band (BMC) user space software to monitor and control CPU power, energy, performance, temperature, and other system management functionality.
    • APML tool: Command line tool with options for the features supported on the platform. This tool is released along with the library.
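Because the sbtsi and sbrmi modules build on hwmon drivers, their per-socket readings can in principle be read on the BMC through standard hwmon sysfs attributes. The sketch below assumes the hwmon device names are "sbtsi" and "sbrmi" (the names used by the upstream drivers; the out-of-tree modules may differ) and that `temp1_input` (millidegrees Celsius) and `power1_input` (microwatts) are present. The helper `read_hwmon` is hypothetical and is not part of the APML Library.

```python
from pathlib import Path


def read_hwmon(device_name, attribute, divisor, hwmon_root="/sys/class/hwmon"):
    """Return one scaled reading per hwmon device whose name matches."""
    values = []
    root = Path(hwmon_root)
    if not root.is_dir():
        return values
    for dev in sorted(root.iterdir()):
        name_file = dev / "name"
        attr_file = dev / attribute
        if not (name_file.is_file() and attr_file.is_file()):
            continue
        if name_file.read_text().strip() != device_name:
            continue
        # hwmon attributes hold integers in fixed sub-units; scale to base units.
        values.append(int(attr_file.read_text().strip()) / divisor)
    return values


if __name__ == "__main__":
    # Assumed device names; one list entry per socket found.
    temps_c = read_hwmon("sbtsi", "temp1_input", 1000)        # millidegrees C -> C
    power_w = read_hwmon("sbrmi", "power1_input", 1_000_000)  # microwatts -> W
    print("Socket temperatures (C):", temps_c)
    print("Socket power (W):", power_w)
```

This covers only the hwmon view. The custom protocols exposed through the misc_device IOCTL interface are better reached through the APML Library or the APML tool.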