AMD APP Profiler

Overview

The AMD APP Profiler is a performance analysis tool that gathers data from the OpenCL™ run-time and AMD Radeon™ GPUs during the execution of an OpenCL™ application. Developers can use this data to locate bottlenecks in an application and to optimize its performance on AMD platforms.

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos.

What's New

What's New in Version 2.4

  • Added support for AMD APP SDK v2.6
  • Added a kernel occupancy analyzer, which calculates and displays a kernel occupancy number estimating the number of in-flight wavefronts on a compute unit as a percentage of the theoretical maximum number of wavefronts that the compute unit can support
  • Added support for collecting symbol information when collecting an application trace, allowing navigation from the API Trace view to the source code that called an API
  • Improved OpenCL™ analysis module:
    • Added detection of non-optimized data transfer operations
    • Added detection of redundant synchronization operations
    • Improved detection of unnecessary blocking write operations
    • Improved analysis in multithreaded applications (fixed false positives)
  • Added support for specifying which OpenCL™ APIs will be traced
  • Added ability to rename sessions in the Session Explorer Window
  • Added ability to automatically delete profiler sessions when closing a Microsoft® Visual Studio® solution
  • Added support for modifying the parameters used to initiate a profiler session
  • Added support for multiple-GPU systems when collecting performance counters
  • Improved the CLPerfMarkerAMD library
  • Improved performance when using timeout mode
  • Renamed the "GPRs" column in the session window to "VGPRs" (vector GPRs)
  • Fixed a problem with loading saved counters from a file
  • Fixed a problem where the performance counter values for some kernel dispatch operations were reported as all zeros
  • Fixed a problem with missing GPU timestamps in an application trace when enabling the "Write trace data periodically during program execution" option
  • Removed Data Transfer data from the Session view for OpenCL™ applications. Use the Application Trace view instead to get information on data transfers
  • Preview: Support for profiling with AMD Radeon™ HD7000 series GPUs (requires AMD APP SDK v2.6 and an AMD Catalyst version that supports this hardware)
Features

    • Collect OpenCL™ Application Trace
      • View and debug the input parameters and output results for all OpenCL™ API calls
      • Search the API calls
      • Navigate to the source code that called an OpenCL™ API
      • Specify which OpenCL™ APIs will be traced
    • Collect GPU Performance Counters of AMD Radeon™ graphics cards
      • Show kernel resource usage
      • Show the number of instructions executed by the GPU
      • Show the GPU utilization
      • Show the GPU memory access characteristics
      • Measure kernel execution time
    • OpenCL™ Timeline visualization
      • Visualize the application's high-level structure
      • Visualize kernel execution and data transfer operations
      • Visualize host code execution
      • Annotate host code in the timeline with performance markers using the included CLPerfMarkerAMD library
    • OpenCL™ Application Summary pages
      • Find incorrect or inefficient usage of the OpenCL™ API using the OpenCL™ analysis module
      • Find the API hotspots
      • Find the bottleneck between kernel execution and data transfer operations
      • Find the top 10 data transfer and kernel execution operations
    • OpenCL™ Kernel Occupancy Viewer
      • Calculate and display a kernel occupancy number, which estimates the number of in-flight wavefronts on a compute unit as a percentage of the theoretical maximum number of wavefronts that the compute unit can support
      • Find out which kernel resource (GPR usage, LDS size, or work-group size) is currently limiting the number of in-flight wavefronts
      • Display graphs showing how kernel occupancy would be affected by changes in each kernel resource
    • Display the AMD IL and ISA (hardware disassembly) code of the kernel for OpenCL™ kernels and DXASM code for DirectCompute kernels
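
    The kernel occupancy number described above can be illustrated with a small calculation. The following Python sketch is not the profiler's actual algorithm: it is a simplified model that takes the three kernel resources named in the feature list (GPR usage, LDS size, and work-group size), computes how many wavefronts each one would allow on a compute unit, and reports the smallest as a percentage of the maximum. The `ComputeUnitLimits` values and all function and field names are hypothetical placeholders, and real limits depend on the GPU.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeUnitLimits:
    """Hypothetical per-compute-unit limits; real values depend on the GPU."""
    max_wavefronts: int = 32   # assumed theoretical maximum of in-flight wavefronts
    wavefront_size: int = 64   # work-items per wavefront
    vgprs_per_lane: int = 1024 # assumed vector registers available per lane
    lds_bytes: int = 32768     # assumed local data share size


def estimate_occupancy(vgprs_per_work_item, lds_bytes_per_group,
                       work_group_size, cu=ComputeUnitLimits()):
    """Return (occupancy percentage, name of the limiting resource or None)."""
    # A work-group occupies a whole number of wavefronts (round up).
    waves_per_group = -(-work_group_size // cu.wavefront_size)

    # Wavefronts that fit given the register usage of each work-item.
    if vgprs_per_work_item > 0:
        by_vgpr = cu.vgprs_per_lane // vgprs_per_work_item
    else:
        by_vgpr = cu.max_wavefronts

    # Wavefronts that fit given the LDS used by each work-group.
    if lds_bytes_per_group > 0:
        by_lds = (cu.lds_bytes // lds_bytes_per_group) * waves_per_group
    else:
        by_lds = cu.max_wavefronts

    # Only whole work-groups can be scheduled on the compute unit.
    by_group = (cu.max_wavefronts // waves_per_group) * waves_per_group

    limits = {"VGPRs": by_vgpr, "LDS": by_lds, "work-group size": by_group}
    limiter = min(limits, key=limits.get)
    waves = min(cu.max_wavefronts, limits[limiter])
    if waves == cu.max_wavefronts:
        limiter = None  # nothing reduces occupancy below the maximum
    return 100.0 * waves / cu.max_wavefronts, limiter
```

    Under these assumed limits, `estimate_occupancy(64, 0, 256)` returns `(50.0, "VGPRs")`: a kernel using 64 VGPRs per work-item would reach half of the maximum number of in-flight wavefronts, with register usage as the limiting resource. Varying one argument while holding the others fixed gives the kind of per-resource curve that the occupancy graphs display.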

    Getting Started

    Requirements

    • Microsoft Windows Vista or Windows 7 (32-bit/64-bit), or Linux (32-bit/64-bit)
    • [Optional] Microsoft Visual Studio 2008 or 2010 (Standard/Professional/Team System Edition)
    • To profile OpenCL™ applications:
      • AMD APP SDK v2.6 or later
      • [GPU device] AMD Catalyst with OpenCL™ GPU support (11.7 or newer)
      • [GPU device] AMD Radeon™ HD 4000 series or newer
    • To profile DirectCompute applications:
      • Microsoft DirectX run-time (June 2010 or later)
      • AMD Radeon™ HD 5000 series or newer

    Figure 1: AMD APP Profiler screenshot.

    Download


File Name | Launch Date | Bitness | Description
Linux®
AMDAPPProfiler-v2.4.1317-lnx.tgz (2.12MB) | 12/19/2011 | 32-bit/64-bit | AMD APP Profiler for OpenCL on Linux platforms
Windows®
AMDAPPProfiler-v2.4.1297.msi | 12/19/2011 | 32-bit/64-bit | AMD APP Profiler for OpenCL on Windows platforms