ACML End of Life Notice: We have transitioned our math libraries from a proprietary, closed-source codebase (ACML) to open-source solutions, providing a set of open-source libraries for developers who want to accelerate computations on GPUs, APUs, and CPUs. Please visit to learn more.


Simple interface to take advantage of latest hardware innovations

ACML tunes for the latest hardware so you can easily tap into new processor features, including:

  • SSE, SSE2, SSE3, AVX, FMA4, FMA3
  • Multi-core processors
  • Discrete GPUs
  • Integrated GPUs

Blazing fast development of scientific and High Performance Computing projects

With tuned implementations of industry-standard math libraries and other frequently used scientific subroutines, ACML enables you to accelerate projects such as:

  • Weather modeling
  • Finite element analysis
  • Computational fluid dynamics
  • Financial analysis
  • Oil and gas applications
  • and many more…

Easy path to multi-threading

ACML’s aggressively tuned OpenMP versions mean that you don’t have to worry about managing sophisticated threading models or complex debugging. Whether you use dynamic or static linking, on 32- or 64-bit Windows® or Linux® operating systems, multi-threading just works. Multi-threaded routines are available for the Level 3 BLAS, many LAPACK routines, and the 2D and 3D FFTs.

Support for the Latest AMD Opteron™ Processors

AMD Core Math Library (ACML) is specifically designed to support multi-threading and other key features of AMD’s next-generation processors. ACML currently supports OpenMP, and future releases will expand its support for multi-platform, shared-memory multiprocessing. Beginning with the 5.0 release, ACML also features hand-tuned “Bulldozer” support for the *GEMM matrix multiplication routines, the CFFT complex-to-complex Fast Fourier Transforms, and more.
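For readers unfamiliar with the *GEMM family, the operation these routines compute is C = alpha·A·B + beta·C. The following is a minimal, untuned reference sketch of the double-precision case in C, assuming column-major storage and no transposition. The name ref_dgemm and its argument order are illustrative assumptions, not ACML's interface, and the hand-tuned ACML routines are implemented very differently.

```c
#include <assert.h>
#include <stddef.h>

/* Reference (untuned) double-precision GEMM: C = alpha * A * B + beta * C.
 * A is m x k, B is k x n, C is m x n, all stored column-major with
 * leading dimensions lda, ldb, ldc. Transposed operands are omitted
 * for brevity. ref_dgemm is a hypothetical name used for illustration. */
void ref_dgemm(int m, int n, int k, double alpha,
               const double *a, int lda,
               const double *b, int ldb,
               double beta, double *c, int ldc)
{
    /* Basic argument checks: leading dimensions must cover the rows. */
    assert(m >= 0 && n >= 0 && k >= 0);
    assert(lda >= (m > 0 ? m : 1));
    assert(ldb >= (k > 0 ? k : 1));
    assert(ldc >= (m > 0 ? m : 1));

    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            /* Dot product of row i of A with column j of B. */
            double sum = 0.0;
            for (int p = 0; p < k; ++p) {
                sum += a[i + (size_t)p * lda] * b[p + (size_t)j * ldb];
            }
            double *cij = &c[i + (size_t)j * ldc];
            /* Following BLAS convention, C is not read when beta is zero. */
            *cij = alpha * sum + (beta == 0.0 ? 0.0 : beta * *cij);
        }
    }
}
```

A tuned library computes the same result but blocks the loops for cache reuse and uses vector instructions such as AVX and FMA, which is where the performance difference comes from.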

Support for OpenCL™ devices

AMD Core Math Library (ACML) is designed to offload computation to OpenCL devices discovered at runtime. If an individual BLAS or FFT call is determined to be large enough to benefit from OpenCL acceleration, ACML transparently offloads the computation, with no user-level code changes. The logic that determines whether a problem should be offloaded to an OpenCL device is written in the newly introduced ACMLScript scripting language.
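The idea of a size-based offload decision can be sketched in a few lines of C. This is only an illustration of the general technique: ACML's actual decision logic is written in ACMLScript and is not reproduced here. The type offload_policy, the function should_offload_gemm, and the flop-count cutoff are all hypothetical assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical policy: whether a device was found at runtime, and the
 * minimum amount of work (in floating-point operations) at which
 * offloading is assumed to pay for the data-transfer overhead. */
typedef struct {
    bool   device_found;
    double min_flops;   /* illustrative cutoff, not an ACML value */
} offload_policy;

/* A matrix multiply of an m x k by a k x n matrix costs about 2*m*n*k flops. */
static double gemm_flops(int m, int n, int k)
{
    return 2.0 * m * n * k;   /* computed in double to avoid int overflow */
}

/* Returns true when the call is large enough to send to the device. */
bool should_offload_gemm(const offload_policy *p, int m, int n, int k)
{
    assert(p != NULL);
    if (!p->device_found) {
        return false;         /* no device: always run on the CPU */
    }
    return gemm_flops(m, n, k) >= p->min_flops;
}
```

With a cutoff of 1.0e9 flops, for example, a 64 x 64 x 64 multiply would stay on the CPU while a 1024 x 1024 x 1024 multiply would be offloaded. A real heuristic would likely also weigh transfer cost, data type, and device capabilities.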


A simple set of benchmarks for a few key routines is included with ACML. Download and install ACML, then look in the performance directory under examples to find these benchmarks.