BLIS is a portable software framework for instantiating high-performance BLAS-like dense linear algebra libraries. The framework was designed to isolate the essential kernels of computation, so that optimizing those kernels yields optimized implementations of most of its commonly used and computationally intensive operations. The optimizations cover single- and double-precision routines. AMD has extensively optimized the implementation of BLIS for AMD processors.
Highlights of AMD BLIS 3.1
- New Dynamic Dispatch – Single Binary supports AMD “Zen”, AMD “Zen2”, and AMD “Zen3” processors
- New AOCL Dynamic – Dynamically modify threads at run-time for DGEMM, DGEMMT, DTRSM, and DSYRK
- By default, the OpenMP runtime library sets the number of threads
- Improvements in:
  - DGEMM for skinny matrix shapes
  - ZGEMM for AMD “Zen2” and AMD “Zen3”
  - DTRSM for small matrix sizes
  - DSYRK, xGEMV, and DOTV
Highlights of AMD BLIS 3.0
- Includes support for the AMD “Zen3” architecture. The build can auto-detect whether it is running on AMD “Zen3” and enable features and optimizations specific to that architecture
- Improved performance of ?dotv, ?gemv, ?axpyv for complex and double complex datatypes
- Includes support for copy transposition routines
- New BLAS extension APIs added including cblas_?cabs1, cblas_i?amin, cblas_?axpby, cblas_?gemm_batch, and cblas_?gemm3m
- Debug trace and input logging support added for more BLIS APIs
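To make the semantics of three of the extension APIs listed above concrete, here is a minimal plain-C sketch of what the cabs1, axpby, and i?amin operations compute for double-precision data. The function names `ref_dcabs1`, `ref_daxpby`, and `ref_idamin` are hypothetical, are not part of BLIS, and assume unit-stride vectors; they illustrate the arithmetic only and say nothing about how AMD BLIS implements or optimizes these routines.

```c
#include <complex.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical reference helper (not a BLIS function):
   returns |Re(z)| + |Im(z)|, the quantity computed by the cabs1 routines. */
static double ref_dcabs1(double complex z) {
    return fabs(creal(z)) + fabs(cimag(z));
}

/* Hypothetical reference helper (not a BLIS function):
   y := alpha*x + beta*y for unit-stride vectors of length n. */
static void ref_daxpby(size_t n, double alpha, const double *x,
                       double beta, double *y) {
    for (size_t i = 0; i < n; ++i) {
        y[i] = alpha * x[i] + beta * y[i];
    }
}

/* Hypothetical reference helper (not a BLIS function):
   zero-based index of the first element with the smallest absolute value.
   Returns 0 when n is 0. */
static size_t ref_idamin(size_t n, const double *x) {
    size_t best = 0;
    for (size_t i = 1; i < n; ++i) {
        if (fabs(x[i]) < fabs(x[best])) {
            best = i;
        }
    }
    return best;
}
```

In an application, the corresponding cblas_ routines from the library would be called instead; these loops are only useful as a mental model or as a slow cross-check in a test.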
The package containing the BLIS library binaries (with optimizations for AMD processors), examples, and documentation is available in the Downloads section below.
Source code for AMD BLIS will be available shortly on GitHub https://github.com/amd/blis.
libFLAME is a portable library for dense matrix computations, providing much of the functionality present in LAPACK. It includes a compatibility layer, FLAPACK, which includes a complete LAPACK implementation. The library provides scientific and numerical computing communities with a modern, high-performance dense linear algebra library that is extensible, easy to use, and available under an open source license. libFLAME is a C-only implementation and does not depend on any external FORTRAN libraries, including LAPACK. There is an optional backward compatibility layer, lapack2flame, that maps LAPACK routine invocations to their corresponding native C implementations in libFLAME. This allows legacy applications to start taking advantage of libFLAME with virtually no changes to their source code.
In combination with the AMD-optimized BLIS library, libFLAME provides high-performance LAPACK functionality on AMD processors. Linking libFLAME with the AMD-optimized BLIS library improves its performance.
Highlights of AMD libFLAME 3.1
- Support for new APIs of LAPACK 3.10.0 specification
- Optimized LU, LDLT, QR, and Cholesky factorization routines
- Optimized ZGEEV routines
- Increased coverage of tracing and logging support for libFLAME APIs
- C++ interface support extended for more libFLAME APIs
Highlights of AMD libFLAME 3.0
- New APIs to compute partial LDLT factorization of a symmetric matrix using packed storage: ?spffrt2 and ?spffrtx
- New APIs to perform complete or incomplete LU factorization without pivoting of a general matrix: ?getrfnp and ?getrfnpi
- Test suite now supports LAPACK API tests for LU, Cholesky and QR operations
- Several bug fixes including handling denormal numbers in SVD functions
- New API to get version number of the library, FLA_Get_AOCL_Version()
- Library function tracing and input logging support added
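As an illustration of what "LU factorization without pivoting" means in the list above, the following is a minimal textbook sketch in plain C. The function name `lu_nopivot`, its row-major layout, and its return convention are assumptions made for this example; it does not reproduce the signature or the implementation of ?getrfnp or ?getrfnpi in libFLAME.

```c
#include <stddef.h>

/* Hypothetical helper (not a libFLAME function):
   in-place LU factorization, without pivoting, of a row-major n x n matrix.
   On return the strict lower triangle holds L (its unit diagonal is implied)
   and the upper triangle, including the diagonal, holds U.
   Returns 0 on success, or k+1 if the pivot at zero-based step k is exactly zero. */
static int lu_nopivot(size_t n, double *a) {
    for (size_t k = 0; k < n; ++k) {
        double pivot = a[k * n + k];
        if (pivot == 0.0) {
            return (int)(k + 1);   /* no row exchange is attempted */
        }
        for (size_t i = k + 1; i < n; ++i) {
            a[i * n + k] /= pivot;              /* multiplier L(i,k) */
            double l = a[i * n + k];
            for (size_t j = k + 1; j < n; ++j) {
                a[i * n + j] -= l * a[k * n + j];  /* update trailing block */
            }
        }
    }
    return 0;
}
```

Because no rows are exchanged, the factorization fails on a zero pivot and can be numerically unstable on a small one, so skipping pivoting is generally only appropriate for matrices known to be safe, such as diagonally dominant ones.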
The packages containing libFLAME binaries, examples and documentation are available in the Downloads section below.
Source code for AMD libFLAME will be available shortly on GitHub https://github.com/amd/libflame.
Refer here for prior versions of AMD BLIS and libFLAME documentation and downloads.