AMD BLIS

BLIS is a portable software framework for instantiating high-performance BLAS-like dense linear algebra libraries. The framework was designed to isolate essential kernels of computation that enable optimized implementations of most of its commonly used and computationally intensive operations. The optimizations are done for single and double precision routines. AMD has extensively optimized the implementation of BLIS for AMD processors.

Highlights of AMD BLIS 3.1

  • New Dynamic Dispatch – Single Binary supports AMD “Zen”, AMD “Zen2”, and AMD “Zen3” processors
  • New AOCL Dynamic – Dynamically modify threads at run-time for DGEMM, DGEMMT, DTRSM, and DSYRK
    • By default, the OpenMP runtime library sets the number of threads
  • Improvements in:
    • DGEMM for skinny matrix shapes
    • ZGEMM for AMD “Zen2” and AMD “Zen3”
    • DTRSM for small matrix sizes
    • DSYRK, xGEMV, and DOTV
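
For readers unfamiliar with the routine names above, GEMM denotes the general matrix-matrix update C := alpha*A*B + beta*C, and the D prefix means double precision. The following is a minimal pure-Python sketch of that definition only. The name `gemm_reference` is a hypothetical helper, the sketch ignores the transpose options and leading dimensions of the real BLAS interface, and it says nothing about how AMD BLIS implements or optimizes the operation.

```python
def gemm_reference(alpha, a, b, beta, c):
    """Return alpha * (a @ b) + beta * c for row-major lists of lists.

    Illustrative reference only (hypothetical helper, not a BLIS API).
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")
    if len(c) != m or any(len(row) != n for row in c):
        raise ValueError("c must be m x n")

    result = []
    for i in range(m):
        out_row = []
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            out_row.append(alpha * acc + beta * c[i][j])
        result.append(out_row)
    return result


# A tall, narrow operand: a is 4 x 1 and b is 1 x 2, so the inner dimension is 1.
a = [[1.0], [2.0], [3.0], [4.0]]
b = [[10.0, 20.0]]
c = [[1.0, 1.0] for _ in range(4)]
skinny = gemm_reference(2.0, a, b, 0.5, c)
# skinny[0] is [20.5, 40.5] and skinny[3] is [80.5, 160.5]
```

A triple loop like this is useful as a correctness oracle when comparing against an optimized library, since its result is easy to check by hand.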

Highlights of AMD BLIS 3.0

  • Support for the AMD “Zen3” architecture. The build can auto-detect whether it is running on AMD “Zen3” and enable the features and optimizations specific to that architecture
  • Improved performance of ?dotv, ?gemv, ?axpyv for complex and double complex datatypes
  • Includes support for copy transposition routines
  • New BLAS extension APIs added including cblas_?cabs1, cblas_i?amin, cblas_?axpby, cblas_?gemm_batch, and cblas_?gemm3m
  • Debug trace and input logging support added for more BLIS APIs
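
As a rough guide to what three of the extension routines named above compute, here is a pure-Python sketch of their usual mathematical definitions in BLAS extension interfaces: cabs1 returns |Re(z)| + |Im(z)|, i?amin returns the index of the first element of smallest magnitude, and axpby forms alpha*x + beta*y. The function names and list-based signatures are hypothetical simplifications (no strides, no in-place update) and are not the cblas_* prototypes shipped with AMD BLIS.

```python
def cabs1(z):
    """|Re(z)| + |Im(z)|, the magnitude measure used by several complex BLAS routines."""
    z = complex(z)
    return abs(z.real) + abs(z.imag)


def iamin(x):
    """0-based index of the first element with the smallest magnitude.

    For real input cabs1 reduces to the ordinary absolute value.
    """
    if not x:
        raise ValueError("x must not be empty")
    magnitudes = [cabs1(v) for v in x]
    return magnitudes.index(min(magnitudes))


def axpby(alpha, x, beta, y):
    """Return alpha * x + beta * y as a new list (the BLAS routine overwrites y)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [alpha * xi + beta * yi for xi, yi in zip(x, y)]


magnitude = cabs1(3 - 4j)                               # 7.0
smallest = iamin([3.0, -1.0, 2.0, 1.0])                 # 1 (first of the two ties)
scaled = axpby(2.0, [1.0, 2.0], 3.0, [10.0, 20.0])      # [32.0, 64.0]
```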

The package containing the BLIS library binaries optimized for AMD processors, along with examples and documentation, is available in the Downloads section below.

Source code for AMD BLIS will be available shortly on GitHub https://github.com/amd/blis.

AMD libFLAME

libFLAME is a portable library for dense matrix computations, providing much of the functionality present in LAPACK. It includes a compatibility layer, FLAPACK, which contains a complete LAPACK implementation. The library provides the scientific and numerical computing communities with a modern, high-performance dense linear algebra library that is extensible, easy to use, and available under an open source license. libFLAME is a C-only implementation and does not depend on any external FORTRAN libraries, including LAPACK. There is an optional backward compatibility layer, lapack2flame, that maps LAPACK routine invocations to their corresponding native C implementations in libFLAME. This allows legacy applications to start taking advantage of libFLAME with virtually no changes to their source code.

Combined with the AMD optimized BLIS library, libFLAME provides high-performance LAPACK functionality on AMD processors. Linking libFLAME with the AMD optimized BLIS library improves its performance.

Highlights of AMD libFLAME 3.1

  • Support for new APIs of LAPACK 3.10.0 specification
  • Optimized LU, LDLT, QR, and Cholesky factorization routines
  • Optimized ZGEEV routines
  • Increased coverage of tracing and logging support for libFLAME APIs
  • C++ interface support extended for more libFLAME APIs

Highlights of AMD libFLAME 3.0

  • New APIs to compute partial LDLT factorization of a symmetric matrix using packed storage: ?spffrt2 and ?spffrtx
  • New APIs to perform complete or incomplete LU factorization without pivoting of a general matrix: ?getrfnp and ?getrfnpi
  • Test suite now supports LAPACK API tests for LU, Cholesky and QR operations
  • Several bug fixes including handling denormal numbers in SVD functions
  • New API to get version number of the library, FLA_Get_AOCL_Version()
  • Library function tracing and input logging support added
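
To make "LU factorization without pivoting" concrete, the sketch below factors a square matrix as A = L*U with a unit lower-triangular L and no row exchanges, storing both factors in one array. It is a textbook elimination loop written for illustration, under the assumption of a square input with nonzero pivots. The names `lu_no_pivot` and `split_lu` are hypothetical and do not correspond to the ?getrfnp interface or to libFLAME's implementation.

```python
def lu_no_pivot(a):
    """Return the packed LU factors of a square matrix, without row pivoting.

    The strict lower triangle holds L (its unit diagonal is implied) and the
    upper triangle, diagonal included, holds U.
    """
    n = len(a)
    if any(len(row) != n for row in a):
        raise ValueError("this sketch only handles square matrices")
    lu = [[float(v) for v in row] for row in a]
    for k in range(n):
        pivot = lu[k][k]
        if pivot == 0.0:
            # Without pivoting there is no row to swap in, so the sketch stops.
            raise ZeroDivisionError("zero pivot encountered at step %d" % k)
        for i in range(k + 1, n):
            lu[i][k] /= pivot                      # multiplier, stored as L[i][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]    # update the trailing block
    return lu


def split_lu(lu):
    """Unpack the combined array into explicit L and U matrices."""
    n = len(lu)
    lower = [[lu[i][j] if j < i else (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    upper = [[lu[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return lower, upper


packed = lu_no_pivot([[4.0, 3.0], [6.0, 3.0]])   # [[4.0, 3.0], [1.5, -1.5]]
lower, upper = split_lu(packed)
# lower is [[1.0, 0.0], [1.5, 1.0]] and upper is [[4.0, 3.0], [0.0, -1.5]]
```

Skipping pivoting saves the row searches and swaps but is only numerically safe for matrices where the pivots stay well away from zero, for example diagonally dominant ones.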

The packages containing libFLAME binaries, examples and documentation are available in the Downloads section below.

Source code for AMD libFLAME will be available shortly on GitHub https://github.com/amd/libflame.

Refer here for prior versions of AMD BLIS and libFLAME documentation and downloads.

Download:

AOCC Compiled Binary Packages

Description | Version | Size | Launch Date | OS | Bitness | sha256 Checksum
AOCC compiled BLIS library binary package | 3.1 | 18.7 MB | 12/10/2021 | Ubuntu, SLES, CentOS, RHEL | 64-bit | 603a1cc819b57512c71148a48011a2da1a74d5327d692222d6815ab6541ff99f
AOCC compiled libFLAME library binary package | 3.1 | 33 MB | 12/10/2021 | Ubuntu, SLES, CentOS, RHEL | 64-bit | ca9930b54e8729830905fc5803cf03079292594323735f86befe01bc8ae75ec6
AOCC compiled HPL benchmark binary optimized for AMD EPYC™ and AMD Ryzen™ processors that uses the multi-threaded AMD BLIS library | 3.1 | 10.4 MB | 12/10/2021 | Ubuntu, SLES, CentOS, RHEL | 64-bit | 1cd082e8ad5dbcc36594a3df6df6a1b47e1eeb788e1c4305f8aa2f5067e57a77

Binary Packages Compiled with GCC 11.1

Description | Version | Size | Launch Date | OS | Bitness | sha256 Checksum
GCC compiled BLIS library binary package | 3.1 | 25.1 MB | 12/10/2021 | Ubuntu, SLES, CentOS, RHEL | 64-bit | 4b385987fc6508f002d6fe91983ab91469f3f4e588c2afd06d33d73055c27408
GCC compiled libFLAME library binary package | 3.1 | 33.2 MB | 12/10/2021 | Ubuntu, SLES, CentOS, RHEL | 64-bit | 8069145b6494a22d21ce80a51edaf40cc4ff03743cbe894b95528af0904442c5
GCC compiled HPL benchmark binary optimized for AMD EPYC™ and AMD Ryzen™ processors that uses the multi-threaded AMD BLIS library | 3.1 | 23.2 MB | 12/10/2021 | Ubuntu, SLES, CentOS, RHEL | 64-bit | ef718016d7ee2ad5243d646d4d7ac99ddf3c3e12d2ecbd602897f8abe46d4176
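
The sha256 checksums listed with each package can be compared against a downloaded file before it is installed. The following is a minimal sketch using only the Python standard library. The helper names `sha256_of_file` and `matches_published_checksum` are hypothetical, and the path passed in is whatever name the downloaded archive has locally.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex sha256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """True when the file's digest equals the published checksum.

    The published value is trimmed and lower-cased so that copy-and-paste
    whitespace or upper-case hex digits do not cause a false mismatch.
    """
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

For example, calling `matches_published_checksum(local_path, "603a1cc819b57512c71148a48011a2da1a74d5327d692222d6815ab6541ff99f")` on the downloaded AOCC compiled BLIS package should return True. A False result suggests an incomplete or altered download.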