NEW: AMD uProf 5.0 is now available (October 10, 2024)
AMD uProf (“MICRO-prof”) is a performance analysis tool suite for x86-based applications running on Windows, Linux, and FreeBSD operating systems. It provides performance metrics for AMD “Zen”-based processors and AMD Instinct™ MI Series accelerators. AMD uProf helps developers understand performance bottlenecks, identify the scope for optimization, and evaluate improvements.
AMD uProf Offers:
- Performance Analysis – to identify runtime performance bottlenecks of the application.
- System Analysis – to monitor system performance metrics.
- Power Profiling – to monitor thermal and power characteristics of the system.
- Remote Profiling – to connect to remote Linux systems (from a Windows host system), trigger collection/translation of data on the remote system, and report it in the local GUI.
AMD uProf can be used to:
- Characterize the performance of workloads to understand their memory/compute boundedness and pipeline utilization.
- Analyze the performance of one or more processes or the entire system.
- Characterize the performance bottlenecks (hotspots & micro-architecture) in the source code.
- Identify ways to optimize the source code for better performance and power efficiency.
- Examine the behavior of kernel, drivers, and system modules.
- Analyze thread concurrency.
- Analyze load and compute imbalance issues in HPC workloads using OpenMP and MPI tracing.
- Observe frequency, thermal and power characteristics (Power profiling).
- Observe system metrics, such as Instructions Per Clock (IPC), core effective frequency, and memory bandwidth.
- Visualize the runtime behavior of heterogeneous applications running on MI systems.
- Monitor GPU hardware components, kernels, dispatch information, and performance metrics of the kernels running on MI systems.
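The system metrics named in the list above, such as IPC and memory bandwidth, are simple ratios of raw counter values. As a minimal sketch, assuming hypothetical counter readings (the function names and numbers are illustrative and are not part of any AMD uProf API), they can be derived like this:

```python
def instructions_per_clock(retired_instructions: int, core_cycles: int) -> float:
    """IPC = retired instructions / unhalted core clock cycles."""
    if core_cycles <= 0:
        raise ValueError("core_cycles must be positive")
    return retired_instructions / core_cycles


def memory_bandwidth_gbs(bytes_transferred: int, elapsed_seconds: float) -> float:
    """Average memory bandwidth in GB/s, with 1 GB taken as 1e9 bytes."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return bytes_transferred / elapsed_seconds / 1e9


# Hypothetical counter readings for one sampling interval (assumed values).
ipc = instructions_per_clock(retired_instructions=8_000_000_000,
                             core_cycles=4_000_000_000)
bandwidth = memory_bandwidth_gbs(bytes_transferred=64_000_000_000,
                                 elapsed_seconds=2.0)
print(f"IPC = {ipc:.2f}, memory bandwidth = {bandwidth:.1f} GB/s")
```

A low IPC combined with high memory bandwidth usually points toward a memory-bound workload, which is the kind of characterization the list above describes.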
What’s New in AMD uProf 5.0
- Support for new Zen5 processors and MI300A in all AMDuProf tools. Refer to the uProf Release Notes for the complete feature list.
- Virtualization support – AWS, Hyper-V
- Fixes to enhance overall security on Windows platform.
AMDuProfPcm
- HTML report to display the performance metrics in various charts and timeline graphs.
- HTML report with interactive graph to visualize classic Roofline data for CPU applications.
- Compare two profile sessions and generate an HTML report.
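For readers unfamiliar with the classic Roofline model referenced above: it bounds attainable performance by the lower of peak compute throughput and peak memory bandwidth multiplied by arithmetic intensity. The sketch below illustrates that formula only; the peak values are assumed placeholders, not measurements from AMDuProfPcm, and the function names are hypothetical.

```python
def attainable_gflops(arithmetic_intensity: float,
                      peak_gflops: float,
                      peak_bandwidth_gbs: float) -> float:
    """Classic Roofline bound: min(peak compute, bandwidth * intensity).

    arithmetic_intensity is in FLOP per byte of memory traffic.
    """
    return min(peak_gflops, arithmetic_intensity * peak_bandwidth_gbs)


def ridge_point(peak_gflops: float, peak_bandwidth_gbs: float) -> float:
    """Arithmetic intensity at which a kernel stops being memory bound."""
    return peak_gflops / peak_bandwidth_gbs


# Assumed machine ceilings (placeholders for illustration only).
PEAK_GFLOPS = 1000.0
PEAK_BANDWIDTH_GBS = 200.0

ridge = ridge_point(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
for intensity in (0.5, 8.0):
    bound = attainable_gflops(intensity, PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
    kind = "memory bound" if intensity < ridge else "compute bound"
    print(f"AI = {intensity} FLOP/byte -> at most {bound} GFLOP/s ({kind})")
```

A kernel plotted left of the ridge point is limited by memory bandwidth; one to the right is limited by compute throughput.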
AMDuProfSys
- New metrics support – HSMP*.
- Time series reporting of profile data.
- Consolidated report for DF metrics.
AMDuProfCLI & AMDuProf
CPU Profiling:
- Profiling of Java applications by attaching to Java process during runtime.
- New IBS specific predefined views.
- Reduced collection overhead, faster data processing, and report generation.
- Simplified CLI options.
- OS support – Windows 11 23H2.
User mode Sampling and Tracing:
- Hotspots – a new profile type that identifies the functions with the highest inclusive and exclusive time, supporting C, C++, Java, and Python applications on Linux.
- Callstack stitching of OpenMP applications.
- Python profiling support for Python versions 3.10, 3.11, and 3.12.
- Mixed mode callstack of Python applications.
- Overview – a new profile type to visualize the runtime behavior of heterogeneous applications (on MI300A).
- Threading Analysis – improved GUI rendering.
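The Hotspots profile type above distinguishes inclusive time (a function plus everything it calls) from exclusive time (the function's own code only). The following is a minimal sketch of that distinction on a hypothetical call tree; the `Frame` type and the timings are illustrative assumptions and do not reflect AMD uProf's data model. For simplicity it assumes no recursion, since a recursive function would have its inclusive time counted more than once.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One node of a call tree; self_ms is time spent in the function itself."""
    name: str
    self_ms: float
    children: list["Frame"] = field(default_factory=list)


def inclusive_ms(frame: Frame) -> float:
    """Inclusive time = own time plus the inclusive time of all callees."""
    return frame.self_ms + sum(inclusive_ms(child) for child in frame.children)


def hotspots(root: Frame) -> dict[str, tuple[float, float]]:
    """Map each function name to (exclusive_ms, inclusive_ms)."""
    totals: dict[str, tuple[float, float]] = {}

    def visit(frame: Frame) -> None:
        excl, incl = totals.get(frame.name, (0.0, 0.0))
        totals[frame.name] = (excl + frame.self_ms, incl + inclusive_ms(frame))
        for child in frame.children:
            visit(child)

    visit(root)
    return totals


# Hypothetical call tree: main -> parse, main -> compute -> kernel.
tree = Frame("main", 5.0, [
    Frame("parse", 20.0),
    Frame("compute", 60.0, [Frame("kernel", 15.0)]),
])

report = hotspots(tree)
# Rank by exclusive time, highest first.
for name, (excl, incl) in sorted(report.items(), key=lambda kv: -kv[1][0]):
    print(f"{name:8s} exclusive = {excl:6.1f} ms  inclusive = {incl:6.1f} ms")
```

Here `main` has the largest inclusive time but very little exclusive time, so the function worth optimizing is `compute`.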
HPC - OpenMP and MPI Tracing:
- OpenMP tracing of GCC compiled applications.
- Aggregated reporting of OpenMP parallel region instances.
- MPI Fortran2008 support for Open MPI implementation.
- Reduced collection overhead, faster data processing and report generation.
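To illustrate what aggregating parallel region instances and measuring load imbalance can mean in practice, here is a minimal sketch. The input layout, the function name, and the imbalance definition, (max − mean) / max of per-thread busy time, are assumptions chosen for illustration; the metrics that AMD uProf actually reports may be defined differently.

```python
from collections import defaultdict
from statistics import mean


def aggregate_regions(instances):
    """Aggregate parallel region instances by region name.

    instances: iterable of (region_name, per_thread_busy_seconds) pairs,
    one pair per executed instance of a parallel region.
    """
    grouped = defaultdict(list)
    for region, thread_times in instances:
        grouped[region].append(thread_times)

    report = {}
    for region, runs in grouped.items():
        # An instance finishes when its slowest thread finishes.
        wall = sum(max(times) for times in runs)
        # Imbalance per instance: 0.0 means all threads were equally busy.
        imbalances = [
            (max(times) - mean(times)) / max(times) if max(times) > 0 else 0.0
            for times in runs
        ]
        report[region] = {
            "instances": len(runs),
            "wall_s": wall,
            "mean_imbalance": mean(imbalances),
        }
    return report


# Hypothetical trace: two instances of "solve", one of "init".
trace = [
    ("solve", [4.0, 4.0, 4.0, 4.0]),   # perfectly balanced
    ("solve", [8.0, 4.0, 2.0, 2.0]),   # one thread does most of the work
    ("init", [1.0, 1.0, 1.0, 1.0]),
]

summary = aggregate_regions(trace)
for region, stats in summary.items():
    print(region, stats)
```

A region with high mean imbalance spends much of its wall time waiting for its slowest thread, which makes it a candidate for a different work distribution or scheduling policy.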
GPU Profiling and Tracing:
- ROCm 6.2 support for GPU Tracing and Profiling.
- Monitor GPU hardware components, kernels, dispatch information, and performance metrics of the kernels running on MI systems.
GUI
- Hardware accelerated timeline views for large datasets.
- Callstack tracing on a per-thread basis in timelines to visualize a thread’s execution control flow (Linux only).
- GPU SMI* metrics like GPU power, temperature, VRAM usage, etc., in timeline.
- GPU HIP / HSA summary tables for GPU tracing.
- Re-ordering of source lines by execution order or line number in the source view. (This helps compare compiler-optimized generated code with the code as written by the programmer.)
- Faster import of profile sessions with large datasets in GUI.
- Fixes for GUI scaling issues and dark-mode compatibility on Linux.
- Parallel debug symbol file downloads on Windows.
- GUI keyboard shortcuts.
* Abbreviations:
- HSMP – Host System Management Port; used to monitor power and thermal characteristics of various components.
- DF – Data Fabric, the part of AMD Infinity Fabric that connects logic blocks on a chip/chiplet (for example, between the graphics/CPU core and the memory controller), chiplets on a package, and CPU sockets on a 2-socket EPYC motherboard.
- SMI – System Management Interface.
For a complete list of features added in this release, refer to the release notes.
Spack Support
AMD uProf supports Spack, an open-source utility for flexible package management.
Operating Systems
AMD uProf supports the 64-bit versions of the following operating systems:
- Microsoft®
  - Windows® 10 (up to 22H2)
  - Windows 11 (up to 23H2)
  - Windows Server 2019 and 2022
- Linux
  - Ubuntu® 22.04 and later
  - RHEL® 8.6 and later
  - SLES & openSUSE® Leap 15.5*, Debian 12
  - RHEL based distros - Rocky Linux 9.3*, Alma Linux 9.4
- FreeBSD® 13
Virtualization
- Linux KVM
- Windows Hyper-V
- VMware ESXi
- Citrix Xen
Cloud Environments
- AWS
- Azure
Containers
- Docker (on Linux)
For OS support on AMD EPYC™ processors, refer to the AMD website (https://www.amd.com/en/products/processors/server/epyc/minimum-operating-system.html).
* Sanity tested. Support subject to commitment to compatibility with Red Hat Enterprise Linux.
Compilers and Application Environment
AMD uProf supports the following application environments:
- Languages – C, C++, Fortran, Assembly, Java, Python, and .NET
- Programs compiled with standard x86-64 compilers
  - AMD AOCC
  - Microsoft and Intel compilers
  - GNU and LLVM compilers
- Parallelism – OpenMP and MPI
- Applications compiled with and without optimization and/or debug information
Features by OS
Feature | Linux | Windows | FreeBSD |
System Analysis | | | |
AMDuProfPcm*# | Yes | Yes | Yes |
AMDuProfSys*# | Yes | Yes | No |
CPU Profiling | | | |
Overview Analysis | Yes | No | No |
Hotspots Analysis | Yes | No | No |
Threading Analysis | Yes | No | No |
Micro-architecture Analysis | Yes | Yes | Yes |
Instruction Based Sampling (IBS) | Yes | Yes | No |
Timer Based Profiling (TBP) | Yes | Yes | No |
Cache Analysis | Yes | Yes | No |
Java App Profiling | Yes | Yes | Yes |
Python Profiling | Yes | No | No |
Call Stack Sampling – Native (C, C++, and Fortran) | Yes | Yes | Yes |
Call Stack Sampling – Java | Yes | No | No |
MPI Code Profiling | Yes | No | No |
OpenMP Tracing | Yes | No | No |
MPI API Tracing | Yes | No | No |
OS Tracing | Yes | No | No |
GPU Analysis | | | |
GPU Profiling# | Yes | No | No |
GPU Tracing | Yes | No | No |
Power Profiling | | | |
Live Power Profile | Yes | Yes | No |
User Interface | | | |
Graphical Interface | Yes | Yes | No |
Command Line | Yes | Yes | Yes |
API | | | |
Profile Control API | Yes | Yes | No |
Power Profiler API | Yes | Yes | No |
Instrument API | Yes | No | No |
- * Feature available only on AMD EPYC™ processors
- # Command line interface only
Resources and Technical Support
Documentation
- AMD uProf User Guide
- AMD uProf Release Notes
- Prior versions: AMD uProf Archive
Support
For support options, refer to Technical Support.
AMD Community
For moderated forums, refer to the AMD Community.
Download with End User License Agreement
File Name | Version | Size | Launch Date | OS | Bitness | Description |
AMDuProf-5.0.1174.exe | 5.0 | 118.72 MB | 10/10/2024 | Windows | 64-bit | MD5: 558da464a495a1876b8b2d524f8ec7dc |
AMDuProf_Linux_x64_5.0.1479.tar.bz2 | 5.0 | 267.29 MB | 10/10/2024 | Linux | 64-bit | MD5: 90cb6ea91e65df34c4cf3913c1b301a3 |
amduprof_5.0-1479_amd64.deb | 5.0 | 304.23 MB | 10/10/2024 | Linux | 64-bit | MD5: fddf51f2be2914fa29d6797f2c5cc2fd |
amduprof-5.0-1479.x86_64.rpm | 5.0 | 301.70 MB | 10/10/2024 | Linux | 64-bit | MD5: 6cf749963b9eef4c31f5fa71164eb289 |
bpftracer_source.tar.bz2 | 5.0 | 5.1 KB | 10/21/2024 | Linux | 64-bit | MD5: 35aeb0ebcd303834c60823ed07fe9352 |
AMDuProf_FreeBSD_x64_5.0.1223.tar.bz2 | 5.0 | 117.28 MB | 12/05/2024 | FreeBSD | 64-bit | MD5: 4a06b302e5a85fc8f4f7a4e8e44e8a70 |
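The MD5 values in the table above can be used to check that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names are illustrative, and the file path in the usage note assumes the package was saved to the current directory.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path: str, expected_md5: str) -> bool:
    """Compare a file's MD5 digest with the published value, ignoring case."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `verify_download("AMDuProf_Linux_x64_5.0.1479.tar.bz2", "90cb6ea91e65df34c4cf3913c1b301a3")` should return `True` for an intact copy of the Linux tarball. MD5 detects accidental corruption but is not a strong defense against deliberate tampering.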