AOCL-BLIS

AOCL-BLIS is a high-performance implementation of the Basic Linear Algebra Subprograms (BLAS). BLAS was designed to provide the essential kernels of matrix and vector computation, which are the most commonly used and computationally intensive operations in dense numerical linear algebra. Select kernels have been optimized by AMD and others for AMD “Zen”-based processors, for example, AMD EPYC™, AMD Ryzen™, and AMD Ryzen™ Threadripper™ processors.

AMD offers an optimized version of BLIS (AOCL-BLIS) that supports C, FORTRAN, and C++ template interfaces for the BLAS functionality.
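As a reminder of what a typical BLAS kernel computes, the general matrix multiply (the operation behind SGEMM, DGEMM, and ZGEMM) updates C with alpha*A*B + beta*C. The sketch below is a plain-Python reference for that operation only. The function name `gemm` and its argument order are chosen for illustration; this is not the AOCL-BLIS interface, and it ignores the transpose and leading-dimension parameters of the real routines.

```python
def gemm(alpha, a, b, beta, c):
    """Reference C <- alpha*A*B + beta*C on row-major lists of lists.

    Illustrative only: a real BLAS GEMM also takes transpose flags and
    leading dimensions, and runs far faster than this triple loop.
    """
    m, k, n = len(a), len(b), len(b[0])
    assert all(len(row) == k for row in a), "A must be m x k"
    assert len(c) == m and all(len(row) == n for row in c), "C must be m x n"
    return [
        [alpha * sum(a[i][p] * b[p][j] for p in range(k)) + beta * c[i][j]
         for j in range(n)]
        for i in range(m)
    ]


a = [[1.0, 2.0], [3.0, 4.0]]
b = [[5.0, 6.0], [7.0, 8.0]]
c = [[1.0, 1.0], [1.0, 1.0]]
result = gemm(2.0, a, b, 0.5, c)
print(result)
```

An optimized library implements the same arithmetic with cache blocking, packing, and vector instructions, which is where processor-specific kernels matter.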

Highlights of AOCL-BLIS 4.0

  • The following MatMul APIs for INT8 and Brain Floating Point (bfloat16) types are added and optimized, with post-ops support:
    • aocl_gemm_u8s8s32os32 and aocl_gemm_u8s8s32os8 using AVX-512-VNNI
    • aocl_gemm_u8s8s16os16 and aocl_gemm_u8s8s16os8 using AVX2
    • aocl_gemm_bf16bf16f32of32 and aocl_gemm_bf16bf16f32obf16 using AVX-512
  • SGEMM with packed/reorder buffer support (aocl_gemm_f32f32f32f32)
  • Dynamic dispatch supports AMD “Zen4” configuration
  • Optimizations and performance improvements for DGEMM, DGEMMT, SGEMM, ZGEMM, and DTRSM
  • Framework design changes
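The INT8 API names above appear to encode their data types: reading `u8s8s32os32` as unsigned 8-bit A, signed 8-bit B, 32-bit signed accumulation, and 32-bit signed output is an assumption based on the name, not something this page states. Under that reading, the arithmetic can be sketched in plain Python as follows. The helper names are hypothetical, and the simple saturating clamp used for the 8-bit output stands in for whatever downscaling the library actually applies.

```python
def matmul_u8s8_s32(a_u8, b_s8):
    """Multiply an unsigned 8-bit matrix by a signed 8-bit matrix,
    accumulating each dot product as a plain (wide) integer."""
    assert all(0 <= v <= 255 for row in a_u8 for v in row), "A must be u8"
    assert all(-128 <= v <= 127 for row in b_s8 for v in row), "B must be s8"
    k, n = len(b_s8), len(b_s8[0])
    return [
        [sum(row[p] * b_s8[p][j] for p in range(k)) for j in range(n)]
        for row in a_u8
    ]


def saturate_s8(x):
    """Clamp an integer into the signed 8-bit range (assumed output rule)."""
    return max(-128, min(127, x))


a = [[255, 1], [0, 2]]
b = [[127, -128], [-1, 3]]
acc = matmul_u8s8_s32(a, b)                             # 32-bit style output
out8 = [[saturate_s8(v) for v in row] for row in acc]   # 8-bit style output
print(acc, out8)
```

The example shows why the accumulator is wider than the inputs: a single product of 255 and 127 already exceeds the 8-bit range.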

The package containing the AOCL-BLIS library binaries, which include optimizations for AMD processors, along with examples and documentation, is available in the Download section below.

Source code for AOCL-BLIS will be available shortly on GitHub (https://github.com/amd/blis).

AOCL-libFLAME

AOCL-libFLAME is a high-performance implementation of the Linear Algebra PACKage (LAPACK). LAPACK provides routines for solving systems of linear equations, least-squares problems, eigenvalue problems, singular value problems, and the associated matrix factorizations. It is extensible, easy to use, and available under an open-source license. libFLAME is a C-only implementation. Applications relying on standard Netlib LAPACK interfaces can use libFLAME with virtually no changes to their source code.

From AOCL 4.0, the AMD optimized version of libFLAME (AOCL-libFLAME) is compatible with the LAPACK 3.10.1 specification. In combination with the AOCL-BLIS library, which includes optimizations for AMD “Zen”-based processors, libFLAME enables high-performance LAPACK functionality on AMD platforms. AOCL-libFLAME supports C, FORTRAN, and C++ template interfaces (for a subset of APIs) for the LAPACK APIs.

Highlights of AOCL-libFLAME 4.0

  • Upgrade to the LAPACK 3.10.1 specification, which includes several bug fixes from Netlib LAPACK
  • Improved performance of the following APIs:
    • Eigenvalue routine (ZGGEV)
    • SVD routines (DGESDD, CGESDD, and ZGESDD)
  • Logging feature supports timing for real double precision libFLAME APIs
  • AOCL-Progress feature, which provides progress updates on long-running API computations, is extended to more APIs: {S/C/Z}GETRF, {S/D}POTRF, {S/D}GEQRF, {S/C/D/Z}GBTRF
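The brace notation in the list above follows the LAPACK naming convention, where the first letter of a routine name gives its precision: S for real single, D for real double, C for complex single, and Z for complex double. A small sketch that expands the shorthand into individual routine names may help when checking whether a specific routine is covered. The helper `expand_routines` is hypothetical and is not part of AOCL-libFLAME.

```python
import re

# Standard LAPACK precision prefixes.
PRECISION = {
    "S": "real single",
    "D": "real double",
    "C": "complex single",
    "Z": "complex double",
}


def expand_routines(spec):
    """Expand shorthand such as '{S/D}POTRF' into ['SPOTRF', 'DPOTRF']."""
    names = []
    for prefixes, stem in re.findall(r"\{([A-Z/]+)\}\s*([A-Z0-9]+)", spec):
        names.extend(prefix + stem for prefix in prefixes.split("/"))
    return names


spec = "{S/C/Z}GETRF, {S/D}POTRF, {S/D}GEQRF, {S/C/D/Z}GBTRF"
routines = expand_routines(spec)
for name in routines:
    print(name, "->", PRECISION[name[0]])
```

Note that the expansion reflects the list exactly as written, so DGETRF, for example, does not appear in it.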

The packages containing AOCL-libFLAME binaries, examples and documentation are available in the Download section below.

Source code for AOCL-libFLAME will be available shortly on GitHub (https://github.com/amd/libflame).

For prior versions of AOCL-BLIS and AOCL-libFLAME, refer to BLAS Library Archive.

Download:

Binary packages compiled with AOCC 4.0

  • AOCC compiled AOCL-BLIS library binary package
    Version: 4.0; Size: 25 MB; Launch Date: 11/10/2022; OS: Ubuntu, SLES, CentOS, and RHEL; Bitness: 64-bit
    sha256 Checksum: d12b4dbb55598e7eb746d25cfc4e3417927619a4c522c5771208154dd21a4391

  • AOCC compiled AOCL-libFLAME library binary package
    Version: 4.0; Size: 34 MB; Launch Date: 11/10/2022; OS: Ubuntu, SLES, CentOS, and RHEL; Bitness: 64-bit
    sha256 Checksum: 094021a92a3fce5c10eebe09ead85df983df876beb44d1bbb6223fc3a70ee8d1

  • AOCC compiled HPL benchmark binary optimized for AMD EPYC™ and AMD Ryzen™ processors that uses the multi-threaded AOCL-BLIS library
    Version: 4.0; Size: 18.9 MB; Launch Date: 11/10/2022; OS: Ubuntu, SLES, CentOS, and RHEL; Bitness: 64-bit
    sha256 Checksum: 85b2a1cecf34376662f5b2826a15f5a378e520e839286e30a43d8c427c2367e5

  • AOCC compiled HPL benchmark binary optimized for AMD EPYC™ and AMD Ryzen™ processors that uses the multi-threaded AOCL-BLIS library
    Version: 4.0; Size: 19.9 MB; Launch Date: 11/10/2022; OS: Ubuntu, SLES, CentOS, and RHEL; Bitness: 64-bit
    sha256 Checksum: 6cf15b59101b99536354c45f543c94ba9e7373dbb70f917adeac81d6d48994e8

Binary packages compiled with GCC 11.2

  • GCC compiled AOCL-BLIS library binary package
    Version: 4.0; Size: 28 MB; Launch Date: 11/10/2022; OS: Ubuntu, SLES, CentOS, and RHEL; Bitness: 64-bit
    sha256 Checksum: 5a3e67bfa504c2a8cb2a6e1d2bed017e9487ceb22ca5b3f367f084d6f73d0137

  • GCC compiled AOCL-libFLAME library binary package
    Version: 4.0; Size: 36 MB; Launch Date: 11/10/2022; OS: Ubuntu, SLES, CentOS, and RHEL; Bitness: 64-bit
    sha256 Checksum: 01b587be9e8bea873a6f93b8150ec080d9fa11e3edcba8c33efbaf7f4d7ebae7

  • GCC compiled HPL benchmark binary optimized for AMD EPYC™ and AMD Ryzen™ processors that uses the multi-threaded AOCL-BLIS library
    Version: 4.0; Size: 32.5 MB; Launch Date: 11/10/2022; OS: Ubuntu, SLES, CentOS, and RHEL; Bitness: 64-bit
    sha256 Checksum: 297ab8eaa073826501eff07a3061ddc22ab7d7b50d55a6dc5b678554fe39772b
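
Each package in the Download section is published with a sha256 checksum, so a download can be checked before it is installed. A minimal sketch of that check using Python's standard `hashlib` module is shown below. The function names are illustrative, and the file name in the usage comment is a placeholder, since the actual package file names are not listed here.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex sha256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """True if the file's sha256 digest matches the published checksum."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (placeholder file name; substitute the package actually downloaded):
# ok = verify_download(
#     "downloaded-aocl-blis-package.tar.gz",
#     "d12b4dbb55598e7eb746d25cfc4e3417927619a4c522c5771208154dd21a4391",
# )
# print("checksum OK" if ok else "checksum MISMATCH, do not install")
```

Reading in chunks keeps memory use flat for packages of tens of megabytes. A mismatch usually means an incomplete or corrupted download, so the file should be fetched again rather than used.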