The Heterogeneous-compute Interface for Portability (HIP) is AMD’s GPU programming environment for designing high-performance kernels on GPU hardware. HIP is a C++ runtime API and kernel language that lets developers write portable GPU applications: the same source can run on multiple platforms with minimal or no changes. This module provides in-depth training on programming with HIP.


Download the Presentation
➤  Deep Dive into GPU and Performance Optimizations
Learn about the GPU Programming Model and other basics to help optimize code performance.
Watch Video


➤  Your First HIP Code: Vector Add
Use the HIP API to write a simple vector add application and compile it in two different ways.
Watch Video
Walk Through Lab 1
Download Lab 1
Walk Through Lab 2
Download Lab 2
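The labs build this application up step by step; as a minimal sketch of the shape such a program takes (array sizes and names here are illustrative, not taken from the labs):

```cpp
#include <hip/hip_runtime.h>
#include <cstdio>

// Each GPU thread adds one pair of elements.
__global__ void vector_add(const float* a, const float* b, float* c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

int main() {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    float *h_a = new float[n], *h_b = new float[n], *h_c = new float[n];
    for (int i = 0; i < n; ++i) { h_a[i] = 1.0f; h_b[i] = 2.0f; }

    // Allocate device buffers and copy the inputs over.
    float *d_a, *d_b, *d_c;
    hipMalloc(&d_a, bytes); hipMalloc(&d_b, bytes); hipMalloc(&d_c, bytes);
    hipMemcpy(d_a, h_a, bytes, hipMemcpyHostToDevice);
    hipMemcpy(d_b, h_b, bytes, hipMemcpyHostToDevice);

    // Launch enough 256-thread blocks to cover all n elements.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    vector_add<<<blocks, threads>>>(d_a, d_b, d_c, n);

    hipMemcpy(h_c, d_c, bytes, hipMemcpyDeviceToHost);
    printf("c[0] = %f\n", h_c[0]);  // 1.0 + 2.0, so 3.0

    hipFree(d_a); hipFree(d_b); hipFree(d_c);
    delete[] h_a; delete[] h_b; delete[] h_c;
}
```

With ROCm installed this builds with hipcc (for example `hipcc vector_add.cpp -o vector_add`); the same source also builds on an NVIDIA platform, where hipcc drives nvcc.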


➤  HIP using ROCm Profiler: Matrix Transpose
Learn how a profiler can pinpoint an application’s bottlenecks and other performance characteristics, and how to use it to guide optimizations.
Watch Video
See an example
Download the Lab
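The profiler is driven from the command line; inside the application, HIP events offer a complementary way to time individual kernels. A sketch (the `busy` kernel is a stand-in for the lab’s transpose kernels):

```cpp
#include <hip/hip_runtime.h>
#include <cstdio>

// A trivial kernel to time, standing in for a real workload.
__global__ void busy(float* x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] = x[i] * 2.0f + 1.0f;
}

int main() {
    const int n = 1 << 20;
    float* d_x;
    hipMalloc(&d_x, n * sizeof(float));

    // Events bracket the kernel on the GPU timeline, so the measurement
    // excludes host-side work that happens before the start event.
    hipEvent_t start, stop;
    hipEventCreate(&start);
    hipEventCreate(&stop);

    hipEventRecord(start, 0);
    busy<<<(n + 255) / 256, 256>>>(d_x, n);
    hipEventRecord(stop, 0);
    hipEventSynchronize(stop);  // wait until the stop event has happened

    float ms = 0.0f;
    hipEventElapsedTime(&ms, start, stop);
    printf("kernel time: %.3f ms\n", ms);

    hipEventDestroy(start);
    hipEventDestroy(stop);
    hipFree(d_x);
}
```

ROCm’s rocprof command-line tool can collect per-kernel timings like these without modifying the source at all, which is what the lab explores.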


➤  Matrix Transpose Part 2: Naïve Version
The naïve matrix transpose is a basic transpose kernel that achieves only a fraction of the copy kernel’s effective bandwidth because of the way it accesses memory.
Watch Video
See an example
Download the Lab


➤  Matrix Transpose Part 3: Optimized LDS Version
The Local Data Share (LDS) is a user-managed cache available on AMD GPUs that enables data sharing among threads within the same thread block. It offers roughly 100x faster reads and writes than global memory and can be used to optimize the throughput of the naïve matrix transpose.
Watch Video
See an example
Download the Lab
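A sketch of the standard LDS-tiled version (again assuming a square matrix; the tile size and names are illustrative):

```cpp
#include <hip/hip_runtime.h>

#define TILE 32

// Tiled transpose: each block stages a TILE x TILE tile in the LDS
// (declared __shared__), then writes it back out transposed. Both the
// global read and the global write are now coalesced; the actual
// transposition happens in fast on-chip memory. The +1 padding on the
// second dimension avoids LDS bank conflicts.
__global__ void transpose_lds(const float* in, float* out, int width) {
    __shared__ float tile[TILE][TILE + 1];

    int x = blockIdx.x * TILE + threadIdx.x;
    int y = blockIdx.y * TILE + threadIdx.y;
    if (x < width && y < width)
        tile[threadIdx.y][threadIdx.x] = in[y * width + x];

    __syncthreads();  // wait until the whole tile is staged in LDS

    // Swap the block indices so the write is coalesced too.
    x = blockIdx.y * TILE + threadIdx.x;
    y = blockIdx.x * TILE + threadIdx.y;
    if (x < width && y < width)
        out[y * width + x] = tile[threadIdx.x][threadIdx.y];
}
```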


➤  Matrix Transpose Part 4: Key Takeaways
Walk through the key summaries and takeaways of the module.
Watch Video


➤  Debugging Tips and Tricks
Learn expert tips and tricks that help when debugging program crashes.
Watch Video
See an example
Download the Lab
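One of the most common tips of this kind is to check every HIP API return code instead of ignoring it. A sketch of a typical error-checking macro (the `HIP_CHECK` name is our own, not from the lab):

```cpp
#include <hip/hip_runtime.h>
#include <cstdio>
#include <cstdlib>

// Wrap every HIP API call so a failure reports the file and line and
// aborts, instead of silently corrupting later results.
#define HIP_CHECK(call)                                              \
    do {                                                             \
        hipError_t err = (call);                                     \
        if (err != hipSuccess) {                                     \
            fprintf(stderr, "HIP error %s at %s:%d\n",               \
                    hipGetErrorString(err), __FILE__, __LINE__);     \
            exit(EXIT_FAILURE);                                      \
        }                                                            \
    } while (0)

int main() {
    float* d_buf = nullptr;
    HIP_CHECK(hipMalloc(&d_buf, 1024 * sizeof(float)));

    // Kernel launches do not return an error directly; query afterwards:
    // my_kernel<<<grid, block>>>(...);
    HIP_CHECK(hipGetLastError());       // catches launch-time errors
    HIP_CHECK(hipDeviceSynchronize());  // catches asynchronous errors

    HIP_CHECK(hipFree(d_buf));
}
```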


➤  Debugging Tips and Tricks: Wrap-up
Learn more debugging tips for writing and compiling GPU applications.
Watch Video


CUDA to HIP

HIP code can run on multiple platforms, providing much-needed code portability. HIP also allows for easy porting of CUDA code, so developers can run CUDA applications on ROCm with ease. This module walks through the porting process in detail, with examples.


Download the Presentation
➤  Converting a simple application
Learn how to port a simple vector add application written in CUDA and run it on a ROCm GPU.
Watch Video
Download the Lab
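The conversion is largely a mechanical renaming of `cuda*` calls to `hip*` calls, which the hipify-perl and hipify-clang tools automate. A sketch of what a few host-side lines look like before and after (the buffer and its size are illustrative):

```cpp
#include <hip/hip_runtime.h>   // was: #include <cuda_runtime.h>

int main() {
    float* d_x = nullptr;
    size_t bytes = 256 * sizeof(float);

    hipMalloc(&d_x, bytes);    // was: cudaMalloc(&d_x, bytes);
    hipMemset(d_x, 0, bytes);  // was: cudaMemset(d_x, 0, bytes);
    hipDeviceSynchronize();    // was: cudaDeviceSynchronize();
    hipFree(d_x);              // was: cudaFree(d_x);

    // Kernel-side syntax (__global__, threadIdx, the <<<grid, block>>>
    // launch) carries over unchanged.
}
```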


➤  Porting Deep Learning CUDA-CNN to HIP
Convolutional Neural Networks (CNNs) are one of the most popular classes of machine learning algorithms and have many important uses. Learn how to convert machine learning applications to HIP in this module.
Watch Video
See an example
Download the Lab


➤  Porting Machine Learning K-means to HIP
K-means is a popular unsupervised ML algorithm based on the idea of clustering data into k groups. This tutorial will port a CUDA K-means application to HIP.
Watch Video
See an example
Download the Lab


➤  Wrap-up: Porting from CUDA to HIP
Walk through the key takeaways and conclusions of the module.
Watch Video