New: AMD uProf 5.0 is now available (October 10, 2024)

AMD uProf (“MICRO-prof”) is a performance analysis tool suite for x86-based applications running on Windows, Linux, and FreeBSD operating systems. It provides performance metrics for AMD “Zen”-based processors and AMD Instinct™ MI Series accelerators. AMD uProf helps developers understand performance bottlenecks, identify the scope for optimization, and evaluate improvements.

AMD uProf Offers:

  • Performance Analysis – to identify runtime performance bottlenecks of the application.
  • System Analysis – to monitor system performance metrics.
  • Power Profiling – to monitor thermal and power characteristics of the system.
  • Remote Profiling – to connect to remote Linux systems (from a Windows host system), trigger collection/translation of data on the remote system, and report it in the local GUI.

AMD uProf can effectively be used to:

  • Characterize workload performance to understand memory/compute boundedness and pipeline utilization.
  • Analyze the performance of one or more processes or the entire system.
  • Characterize the performance bottlenecks (hotspots & micro-architecture) in the source code.
  • Identify ways to optimize the source code for better performance and power efficiency.
  • Examine the behavior of kernel, drivers, and system modules.
  • Analyze thread concurrency.
  • Analyze load and compute imbalance issues in HPC workloads using OpenMP and MPI tracing.
  • Observe frequency, thermal and power characteristics (Power profiling).
  • Observe system metrics, such as Instructions Per Clock (IPC), core effective frequency, and memory bandwidth.
  • Visualize the runtime behavior of heterogeneous applications running on MI systems.
  • Monitor GPU hardware components, kernels, dispatch information, and performance metrics of the kernels running on MI systems.
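
The system metrics listed above are ratios of raw counter values. As a minimal sketch, assuming hypothetical counter readings (the function names and sample numbers below are illustrative and are not part of any AMD uProf API), IPC and memory bandwidth can be derived like this:

```python
def instructions_per_clock(retired_instructions: int, core_cycles: int) -> float:
    """IPC = retired instructions / unhalted core clock cycles."""
    if core_cycles <= 0:
        raise ValueError("core_cycles must be positive")
    return retired_instructions / core_cycles


def memory_bandwidth_gb_per_s(bytes_transferred: int, seconds: float) -> float:
    """Average bandwidth in GB/s (1 GB = 1e9 bytes) over a sampling interval."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_transferred / seconds / 1e9


# Hypothetical readings for one sampling interval (not real measurements).
ipc = instructions_per_clock(retired_instructions=8_000_000_000,
                             core_cycles=4_000_000_000)
bandwidth = memory_bandwidth_gb_per_s(bytes_transferred=64_000_000_000,
                                      seconds=2.0)

print(f"IPC: {ipc:.2f}")                          # IPC: 2.00
print(f"Memory bandwidth: {bandwidth:.1f} GB/s")  # Memory bandwidth: 32.0 GB/s
```

A low IPC combined with high memory bandwidth is one common hint that a workload is memory bound rather than compute bound.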

What’s New in AMD uProf 5.0

  • Support for new Zen5 processors and MI300A in all AMDuProf tools. Refer to the uProf Release Notes for the complete feature list.
  • Virtualization support – AWS, Hyper-V
  • Fixes to enhance overall security on Windows platform.

AMDuProfPcm

  • HTML report to display the performance metrics in various charts and timeline graphs.
  • HTML report with interactive graph to visualize classic Roofline data for CPU applications.
  • Compare two profile sessions and generate an HTML report.
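
For readers unfamiliar with the classic Roofline model mentioned above: it bounds attainable performance by the lesser of peak compute throughput and arithmetic intensity multiplied by peak memory bandwidth. The sketch below illustrates only that bound. The function names and machine numbers are hypothetical and do not reflect how AMDuProfPcm collects or reports its data.

```python
def attainable_gflops(arithmetic_intensity: float,
                      peak_gflops: float,
                      peak_bandwidth_gb_s: float) -> float:
    """Classic Roofline bound: min(peak compute, intensity * peak bandwidth).

    arithmetic_intensity is in FLOP per byte moved to or from memory.
    """
    return min(peak_gflops, arithmetic_intensity * peak_bandwidth_gb_s)


def ridge_point(peak_gflops: float, peak_bandwidth_gb_s: float) -> float:
    """Arithmetic intensity at which a kernel stops being memory bound."""
    return peak_gflops / peak_bandwidth_gb_s


# Hypothetical machine: 1000 GFLOP/s peak compute, 200 GB/s peak bandwidth.
PEAK_GFLOPS = 1000.0
PEAK_BW = 200.0

ridge = ridge_point(PEAK_GFLOPS, PEAK_BW)
for intensity in (0.5, 5.0, 10.0):
    bound = attainable_gflops(intensity, PEAK_GFLOPS, PEAK_BW)
    regime = "memory bound" if intensity < ridge else "compute bound"
    print(f"AI={intensity:5.1f} FLOP/B -> {bound:7.1f} GFLOP/s ({regime})")
```

Kernels whose measured arithmetic intensity falls left of the ridge point benefit most from reducing memory traffic; those to the right benefit from better use of the compute units.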

AMDuProfSys

  • New metrics support – HSMP*.
  • Time series reporting of profile data.
  • Consolidated report for DF metrics.

AMDuProfCLI & AMDuProf

CPU Profiling:

  • Profiling of Java applications by attaching to Java process during runtime.
  • New IBS specific predefined views.
  • Reduced collection overhead, faster data processing, and report generation.
  • Simplified CLI options.
  • OS support – Windows 11 23H2.

User mode Sampling and Tracing:

  • Hotspots – a new profile type that identifies the functions with the highest inclusive and exclusive time in C, C++, Java, and Python applications on Linux.
    • Callstack stitching of OpenMP applications.
    • Python profiling for Python versions 3.10, 3.11, and 3.12.
    • Mixed mode callstack of Python applications.
  • Overview – a new profile type to visualize the runtime behavior of heterogeneous applications (on MI300A).
  • Threading Analysis – improved GUI rendering.
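
The Hotspots entry above distinguishes inclusive from exclusive time. Exclusive time is the time spent in a function's own code; inclusive time adds the time spent in everything it calls. The following is a minimal sketch of that distinction on a hand-built call tree. The `CallNode` type, the function names, and the timings are all hypothetical, and this is not how AMD uProf stores or processes samples.

```python
from dataclasses import dataclass, field


@dataclass
class CallNode:
    """One function in a (hypothetical) call tree."""
    name: str
    self_time_ms: float                  # exclusive time: the function's own code
    children: list = field(default_factory=list)


def inclusive_time(node: CallNode) -> float:
    """Exclusive time of the node plus inclusive time of all its callees."""
    return node.self_time_ms + sum(inclusive_time(c) for c in node.children)


def profile_rows(root: CallNode) -> list:
    """Return (name, exclusive_ms, inclusive_ms) for every node in the tree."""
    rows = []

    def visit(node: CallNode) -> None:
        rows.append((node.name, node.self_time_ms, inclusive_time(node)))
        for child in node.children:
            visit(child)

    visit(root)
    return rows


# Hypothetical call tree: main -> parse, main -> compute -> kernel
kernel = CallNode("kernel", 65.0)
compute = CallNode("compute", 10.0, [kernel])
parse = CallNode("parse", 20.0)
root = CallNode("main", 5.0, [parse, compute])

rows = profile_rows(root)
hottest_exclusive = max(rows, key=lambda r: r[1])[0]
hottest_inclusive = max(rows, key=lambda r: r[2])[0]

for name, excl, incl in sorted(rows, key=lambda r: r[1], reverse=True):
    print(f"{name:8s} exclusive={excl:6.1f} ms  inclusive={incl:6.1f} ms")
```

Here `kernel` is the hottest function by exclusive time, while `main` trivially tops the inclusive ranking because it contains the whole run. The sketch does not merge repeated or recursive calls to the same function, which a real profiler has to handle.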

HPC - OpenMP and MPI Tracing:

  • OpenMP tracing of GCC compiled applications.
  • Aggregated reporting of OpenMP parallel region instances.
  • MPI Fortran2008 support for Open MPI implementation.
  • Reduced collection overhead, faster data processing and report generation.
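
To illustrate what aggregating parallel region instances and measuring load imbalance can mean in practice, here is a minimal sketch. It assumes per-thread busy times are already available for each instance of a region. The metric used, (max - mean) / max, is one common definition chosen for illustration and is not confirmed to be the one AMD uProf reports; all names and numbers are hypothetical.

```python
from statistics import mean


def aggregate_instances(instances: list) -> list:
    """Sum per-thread busy time over all instances of one parallel region.

    Each instance is a list with one busy time (ms) per thread, in the
    same thread order.
    """
    if not instances:
        raise ValueError("need at least one region instance")
    return [sum(times) for times in zip(*instances)]


def load_imbalance(thread_times_ms: list) -> float:
    """Fraction of the slowest thread's time that the average thread idles.

    0.0 means perfectly balanced; values near 1.0 mean one thread does
    almost all the work.
    """
    if not thread_times_ms:
        raise ValueError("need at least one thread time")
    longest = max(thread_times_ms)
    if longest == 0:
        return 0.0
    return (longest - mean(thread_times_ms)) / longest


# Hypothetical busy times (ms) for 4 threads in two instances of one region.
instances = [
    [100.0, 80.0, 60.0, 80.0],   # instance 1: thread 0 is the straggler
    [50.0, 50.0, 50.0, 50.0],    # instance 2: perfectly balanced
]

first_instance_imbalance = load_imbalance(instances[0])
totals = aggregate_instances(instances)
aggregated_imbalance = load_imbalance(totals)

print(f"instance 1 imbalance: {first_instance_imbalance:.2f}")
print(f"aggregated per-thread totals: {totals}")
print(f"aggregated imbalance: {aggregated_imbalance:.2f}")
```

Aggregating first smooths out instances that happen to be balanced, so it is worth inspecting both the per-instance and the aggregated values.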

GPU Profiling and Tracing:

  • ROCm 6.2 support for GPU Tracing and Profiling.
  • Monitor GPU hardware components, kernels, dispatch information, and performance metrics of the kernels running on MI systems.

GUI

  • Hardware accelerated timeline views for large datasets.
  • Per-thread callstack tracing in timelines to visualize a thread’s execution control flow (Linux only).
  • GPU SMI* metrics like GPU power, temperature, VRAM usage, etc., in timeline.
  • GPU HIP / HSA summary tables for GPU tracing.
  • Re-ordering of source lines by execution order or line number in the source view. (This helps compare compiler-optimized code generation with the code as written by the programmer.)
  • Faster import of profile sessions with large datasets in GUI.
  • Fixes for GUI scaling issues and dark-mode compatibility on Linux.
  • Parallel debug symbol file downloads on Windows.
  • GUI keyboard shortcuts.

* Abbreviations:

  • HSMP – Host System Management Port; used to monitor power and thermal characteristics of various components.
  • DF – Data Fabric, the part of AMD Infinity Fabric that connects logic blocks on a chip/chiplet (example: between the graphics/CPU core and the memory controller), chiplets on a package, and CPU sockets on a 2-socket EPYC motherboard.
  • SMI – System Management Interface.

For a complete list of features added in this release, refer to the release notes.

Spack Support

AMD uProf supports Spack, an open-source utility for flexible package management.

Operating Systems

AMD uProf supports the 64-bit versions of the following operating systems:

  • Microsoft®
    • Windows® 10 (up to 22H2)
    • Windows 11 (up to 23H2)
    • Windows Server 2019 and 2022
  • Linux
    • Ubuntu® 22.04 and later
    • RHEL® 8.6 and later
    • SLES & openSUSE® Leap 15.5*, Debian 12
    • RHEL-based distros – Rocky Linux 9.3*, AlmaLinux 9.4
  • FreeBSD® 13

Virtualization

  • Linux KVM
  • Windows Hyper-V
  • VMware ESXi
  • Citrix Xen

Cloud Environments

  • AWS
  • Azure

Containers

  • Docker (on Linux)

For OS support on AMD EPYC™ processors, refer to the AMD website (https://www.amd.com/en/products/processors/server/epyc/minimum-operating-system.html).

* Sanity tested. Support is subject to commitment to compatibility with Red Hat Enterprise Linux.

Compilers and Application Environment

AMD uProf supports the following application environments:

  • Languages – C, C++, Fortran, Assembly, Java, Python and .NET
  • Programs compiled with standard x86-64 compilers
    • AMD AOCC
    • Microsoft and Intel compilers
    • GNU and LLVM compilers
  • Parallelism – OpenMP and MPI
  • Applications compiled with and without optimization and/or debug information

Features by OS

| Feature | Linux | Windows | FreeBSD |
|---|---|---|---|
| System Analysis | | | |
| AMDuProfPcm*# | Yes | Yes | Yes |
| AMDuProfSys*# | Yes | Yes | No |
| CPU Profiling | | | |
| Overview Analysis | Yes | No | No |
| Hotspots Analysis | Yes | No | No |
| Threading Analysis | Yes | No | No |
| Micro-architecture Analysis | Yes | Yes | Yes |
| Instruction Based Sampling (IBS) | Yes | Yes | No |
| Timer Based Profiling (TBP) | Yes | Yes | No |
| Cache Analysis | Yes | Yes | No |
| Java App Profiling | Yes | Yes | Yes |
| Python Profiling | Yes | No | No |
| Call Stack Sampling – Native (C, C++, and Fortran) | Yes | Yes | Yes |
| Call Stack Sampling – Java | Yes | No | No |
| MPI Code Profiling | Yes | No | No |
| OpenMP Tracing | Yes | No | No |
| MPI API Tracing | Yes | No | No |
| OS Tracing | Yes | No | No |
| GPU Analysis | | | |
| GPU Profiling# | Yes | No | No |
| GPU Tracing | Yes | No | No |
| Power Profiling | | | |
| Live Power Profile | Yes | Yes | No |
| User Interface | | | |
| Graphical Interface | Yes | Yes | No |
| Command Line | Yes | Yes | Yes |
| API | | | |
| Profile Control API | Yes | Yes | No |
| Power Profiler API | Yes | Yes | No |
| Instrument API | Yes | No | No |

  • * Feature available only on AMD EPYC™ processors
  • # Command line interface only

Resources and Technical Support

Documentation

Support

For support options, refer to Technical Support.

AMD Community

For moderated forums, refer to the AMD Community.

Download with End User License Agreement

| File Name | Version | Size | Launch Date | OS | Bitness | Description |
|---|---|---|---|---|---|---|
| AMDuProf-5.0.1174.exe | 5.0 | 118.72 MB | 10/10/2024 | Windows | 64-bit | MD5: 558da464a495a1876b8b2d524f8ec7dc |
| AMDuProf_Linux_x64_5.0.1479.tar.bz2 | 5.0 | 267.29 MB | 10/10/2024 | Linux | 64-bit | MD5: 90cb6ea91e65df34c4cf3913c1b301a3 |
| amduprof_5.0-1479_amd64.deb | 5.0 | 304.23 MB | 10/10/2024 | Linux | 64-bit | MD5: fddf51f2be2914fa29d6797f2c5cc2fd |
| amduprof-5.0-1479.x86_64.rpm | 5.0 | 301.70 MB | 10/10/2024 | Linux | 64-bit | MD5: 6cf749963b9eef4c31f5fa71164eb289 |
| bpftracer_source.tar.bz2 | 5.0 | 5.1 KB | 10/21/2024 | Linux | 64-bit | MD5: 35aeb0ebcd303834c60823ed07fe9352 |
| AMDuProf_FreeBSD_x64_5.0.1223.tar.bz2 | 5.0 | 117.28 MB | 12/05/2024 | FreeBSD | 64-bit | MD5: 4a06b302e5a85fc8f4f7a4e8e44e8a70 |
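
The MD5 values listed above can be used to check that a download arrived intact. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `verify_download` are illustrative, and the example assumes the Linux tarball was saved in the current directory under its listed file name.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5: str) -> bool:
    """Compare a file's MD5 digest with the published value (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()


if __name__ == "__main__":
    # File name and MD5 copied from the download list above; the local
    # path is an assumption about where the file was saved.
    package = Path("AMDuProf_Linux_x64_5.0.1479.tar.bz2")
    if package.exists():
        ok = verify_download(package, "90cb6ea91e65df34c4cf3913c1b301a3")
        print("checksum OK" if ok else "checksum MISMATCH, download again")
    else:
        print(f"{package} not found in the current directory")
```

An MD5 match guards against truncated or corrupted downloads. MD5 is not collision resistant, so it should not be treated as proof of authenticity on its own.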