Hi everyone, there is a ton to announce this month!  In our ongoing effort to improve productivity and application performance, our development teams have been busy making improvements across our entire line-up of developer tools for compute acceleration.  We’re releasing new versions of AMD CodeXL, the Bolt C++ template library and the AMD APP SDK with quality, performance and feature improvements aimed at making your lives easier.

Here’s a glimpse into what’s new with each:

AMD CodeXL 1.2: AMD CodeXL is a unified developer tool suite that includes:

  • GPU Debugger: Debugging tool for OpenCL™/OpenGL API calls and OpenCL™ kernels
  • CPU Profiler: A profiling suite for tuning application performance on AMD CPUs
  • GPU Profiler: A GPU profiler for OpenCL and DirectCompute applications on AMD APUs/GPUs
  • Static Analyzer: Analyze OpenCL kernels statically to estimate performance

AMD CodeXL Version 1.2 adds a number of features that expand platform support and improve the developer experience:

  • The CPU Profiler has a number of improved capabilities including:
    • A new user interface design to ease navigation and use of key features
    • Support for profiling Java/.NET applications
    • Support for Time-Based Profiling on Intel-based Windows 8 platforms
    • Support for AMD’s recently announced Kabini APU public registers
  • The Kernel Analyzer now has a new analysis module for Southern Islands devices that performs emulation of kernel workloads
  • We’ve also updated the AMD CodeXL tutorial to help familiarize you with all of the above.

You can download AMD CodeXL 1.2 here. On the landing page you will find the quick start guide, release notes and other useful info on AMD CodeXL. Additionally, the AMD CodeXL forum is a place to exchange ideas, seek support and get updates on AMD CodeXL.

Bolt 1.0: To recap, Bolt represents a big step toward AMD’s ultimate vision of making heterogeneous computing a ubiquitous, easy-to-use component of all mainstream programming environments. Bolt provides C++ developers with an STL-compatible library of high-level constructs for accelerating data-parallel applications. Code written using STL or other C++ template libraries (for example, TBB) can be converted to Bolt in minutes. The Bolt library contains accelerated kernel code for many useful functions like Sort, Scan, Reduce and others, so you won’t need to learn OpenCL™ or C++ AMP APIs to get the benefits of heterogeneous acceleration.
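To give a feel for what "converted in minutes" can look like, here is a minimal sketch of an STL sort that switches to Bolt with a compile-time flag. The `USE_BOLT` macro, the `algo` alias and the `sorted_copy` helper are names made up for this illustration; the Bolt header and namespace (`bolt/cl/sort.h`, `bolt::cl`) follow the Bolt documentation as we understand it, so check them against the release you install. Without the flag, the code builds as plain STL.

```cpp
#include <algorithm>
#include <vector>

// Assumption: USE_BOLT is a flag invented for this sketch. Define it only
// on a machine where the Bolt headers and an OpenCL runtime are installed.
#ifdef USE_BOLT
#include <bolt/cl/sort.h>
namespace algo = bolt::cl;  // Bolt's OpenCL-backed algorithms
#else
namespace algo = std;       // plain STL build
#endif

// Hypothetical helper: returns a sorted copy of the input.
// The call site is identical for the STL and Bolt builds.
std::vector<int> sorted_copy(std::vector<int> values) {
    algo::sort(values.begin(), values.end());
    return values;
}
```

Calling `sorted_copy` on a vector holding 5, 1, 4, 2, 3 returns the values in ascending order in either build; only the include and the namespace change.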

Since the Bolt Beta was released in the spring, our team has been focused on improving both performance and quality to get Bolt ready for real-world applications.  The initial release of Bolt provides acceleration on AMD GPUs and APUs as well as DX11 based platforms, but to make it truly cross-platform, we’ve now added CPU fall-back paths for every Bolt function so Bolt applications will run correctly on any PC platform.   We understand the importance of cross-platform support to developers, and a key part of the Bolt vision is to provide the best performance path possible for future platforms – so keep the feedback coming.  And of course, Bolt is an open-source project.  Contributions are always welcome!

The Bolt source is available on GitHub: https://github.com/HSA-Libraries/Bolt/releases/v1.0GA. For further details on Bolt features and capabilities, along with architectural details and how-to information, visit the blogs from our technical staff. See also the webinar presented by a member of AMD’s technical staff, which introduces the powerful capabilities of Bolt, available here.

AMD APP SDK 2.8.1:  AMD’s APP SDK pulls together everything a developer needs to leverage the processing power of heterogeneous compute.  OpenCL™ is the primary mechanism for achieving this, but our goal is to also enable you to accelerate applications with the programming paradigm you are already using (the Bolt C++ template library mentioned above is our most recent step in this direction).

New to APP SDK 2.8.1:

Bolt: With the launch of Bolt 1.0, we’ve added several samples that demonstrate its features. These showcase valuable Bolt APIs such as scan, sort, reduce and transform. Other new samples highlight the ease of porting from STL and the performance benefits achieved over equivalent STL implementations. We’ve also included samples to demonstrate the different fallback options available in Bolt 1.0 when no GPU is available. These include fallback to multicore CPU if the TBB libraries are installed, or falling all the way back to serial CPU if needed to ensure your code runs correctly on any platform.
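The fallback order described above (GPU first, then multicore CPU when TBB is present, then serial CPU) can be sketched as a small decision function. This is only an illustration of the ordering, not Bolt's actual selection code: `RunPath`, `choose_run_path` and `describe` are hypothetical names, and the two boolean inputs stand in for whatever detection the library performs.

```cpp
#include <string>

// Hypothetical names for illustration only; this is not Bolt's API.
enum class RunPath { Gpu, MultiCoreCpu, SerialCpu };

// Pick the first available path in the order described in the post:
// GPU, then multicore CPU (needs TBB), then serial CPU.
RunPath choose_run_path(bool gpu_available, bool tbb_installed) {
    if (gpu_available) return RunPath::Gpu;
    if (tbb_installed) return RunPath::MultiCoreCpu;
    return RunPath::SerialCpu;  // always available, so code runs anywhere
}

// Human-readable label for logging which path was taken.
std::string describe(RunPath path) {
    switch (path) {
        case RunPath::Gpu:          return "GPU";
        case RunPath::MultiCoreCpu: return "multicore CPU (TBB)";
        case RunPath::SerialCpu:    return "serial CPU";
    }
    return "unknown";
}
```

The serial path is the unconditional last resort, which is what lets the same application run correctly on a machine with no GPU and no TBB.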

OpenCV: AMD has been working closely with the OpenCV open source community to add heterogeneous acceleration capability to the world’s most popular computer vision library. These changes are already integrated into OpenCV and are readily available for developers who want to improve performance and efficiency of their computer vision applications. We’ve included samples to illustrate these improvements and highlight how simple it is to include them in your app.  For more information on the latest OpenCV enhancements, please see Harris’ blog.

GCN: AMD recently launched its new Graphics Core Next (GCN) architecture on several AMD products. GCN is based on a scalar architecture, unlike the VLIW vector architecture of prior generations, so hand-tuned vectorization to optimize hardware utilization is no longer needed. We’ve modified several samples in AMD APP SDK 2.8.1 to show how much easier it is to write scalar code than vectorized code.
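To make the scalar-versus-vectorized contrast concrete, here is a rough sketch in plain C++ rather than OpenCL kernel code, so treat it as an analogy only. Each loop iteration stands in for one work-item, the function names are invented for this example, and both functions assume the three vectors have the same length.

```cpp
#include <cstddef>
#include <vector>

// Scalar style: one "work-item" (loop iteration) per element.
void add_scalar(const std::vector<float>& a, const std::vector<float>& b,
                std::vector<float>& c) {
    for (std::size_t i = 0; i < c.size(); ++i) {
        c[i] = a[i] + b[i];
    }
}

// Hand-vectorized style: each "work-item" handles four elements at once,
// which also forces a separate loop for the leftover tail.
void add_vec4(const std::vector<float>& a, const std::vector<float>& b,
              std::vector<float>& c) {
    const std::size_t n = c.size();
    const std::size_t n4 = n / 4 * 4;  // largest multiple of 4 not above n
    for (std::size_t i = 0; i < n4; i += 4) {
        c[i]     = a[i]     + b[i];
        c[i + 1] = a[i + 1] + b[i + 1];
        c[i + 2] = a[i + 2] + b[i + 2];
        c[i + 3] = a[i + 3] + b[i + 3];
    }
    for (std::size_t i = n4; i < n; ++i) {  // tail elements
        c[i] = a[i] + b[i];
    }
}
```

Both functions compute the same result; the second carries the extra bookkeeping that packing work into groups of four requires, which is the kind of effort the scalar style avoids.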

Download AMD’s APP SDK 2.8.1 here

Please visit us at AMD Developer Central, where you will find resources including community forums to help you create your heterogeneous computing solutions. As always, we love hearing from you to help us keep improving!

Thank you,

Marty Johnson is a Director of Product Management at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only.  Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.

1. For best results with APP SDK 2.8.1, we recommend you update to AMD Catalyst 13.6 beta2 drivers or newer.

2 Responses

  1. rahul garg

    Great :)
    Now a feature request: Can you update the OpenCL programming guide to list the features and optimization guidelines for new products that have come out in 2013? Also, I believe the description of fp64 capabilities of Trinity GPU is wrong in the guide. The guide lists no fp64 extensions while Trinity GPU definitely supports cl_amd_fp64 at least.

  2. fromChina

    I’m from China and the speed to download is about 20 KB/s very poor.
    And damn if the connection broken you had to download the whole file again.

    The nvidia cuda / Intel sdk is very very fast.

    please fix it.
