BLIS is a portable software framework for instantiating high-performance BLAS-like dense linear algebra libraries. The framework was designed to isolate essential kernels of computation that, when optimized, enable optimized implementations of most of its commonly used and computationally intensive operations. Select kernels have been optimized for the AMD EPYC™ processor family, covering both single-precision and double-precision routines.
Highlights of BLIS 3.0
- Includes support for AMD’s Zen3 architecture. The build can auto-detect whether it is running on Zen3 and enable features and optimizations specific to that architecture.
- Improved performance of ?dotv, ?gemv, ?axpyv for complex and double complex datatypes
- Includes support for copy transposition routines
- New BLAS extension APIs added, including cblas_?cabs1, cblas_i?amin, cblas_?axpby, cblas_?gemm_batch, and cblas_?gemm3m
- Debug trace and input logging support added for more BLIS APIs.
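As a rough illustration of what three of the extension routines listed above compute, here is a minimal pure-Python sketch of the conventional BLAS semantics. The function names `cabs1`, `axpby`, and `iamin` are hypothetical stand-ins chosen for this example; they are not the BLIS C API, they ignore strides and precision variants, and the tie-breaking rule (first minimum wins) is an assumption.

```python
# Reference-style sketch only: plain Python, not the BLIS implementation.

def cabs1(z):
    """Return |Re(z)| + |Im(z)|, the cheap complex magnitude used by BLAS."""
    z = complex(z)
    return abs(z.real) + abs(z.imag)


def axpby(alpha, x, beta, y):
    """Return alpha*x + beta*y element-wise for two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return [alpha * xi + beta * yi for xi, yi in zip(x, y)]


def iamin(x):
    """Return the 0-based index of the element with the smallest magnitude.

    Magnitude is measured with cabs1, so real and complex inputs both work.
    Assumption: on ties, the first such index is returned.
    """
    if not x:
        raise ValueError("x must not be empty")
    return min(range(len(x)), key=lambda i: cabs1(x[i]))
```

For example, `axpby(2.0, [1.0, 2.0], 3.0, [10.0, 20.0])` scales and adds the two vectors in one pass, which is the operation the `?axpby` extension adds on top of the standard `?axpy`.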
The package containing the BLIS library binaries, which include optimizations for the AMD EPYC™ processor family, together with examples and documentation, is available in the Downloads section below.
Source code is available on GitHub at https://github.com/amd/blis.
libFLAME is a portable library for dense matrix computations, compatible with the Netlib LAPACK specification. It includes a compatibility layer, FLAPACK, which provides a complete LAPACK implementation. The library provides scientific and numerical computing communities with a modern, high-performance dense linear algebra library that is extensible, easy to use, and available under an open source license. libFLAME is a C-only implementation and does not depend on any external FORTRAN libraries, including LAPACK. There is an optional backward compatibility layer, lapack2flame, that maps LAPACK routine invocations to their corresponding native C implementations in libFLAME. This allows legacy applications to start taking advantage of libFLAME with virtually no changes to their source code.
In combination with the AMD-optimized BLIS library, libFLAME provides high-performance LAPACK functionality on AMD platforms. Linking libFLAME against the AMD-optimized BLIS is enough to improve its performance on these platforms.
Highlights of libFLAME 3.0
- New APIs to compute partial LDLT factorization of a symmetric matrix using packed storage: ?spffrt2 and ?spffrtx
- New APIs to perform complete or incomplete LU factorization without pivoting of a general matrix: ?getrfnp and ?getrfnpi
- Test suite now supports LAPACK API tests for LU, Cholesky and QR operations
- Several bug fixes including handling denormal numbers in SVD functions
- New API to get version number of the library, FLA_Get_AOCL_Version()
- Library function tracing and input logging support added
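To make the idea of LU factorization without pivoting mentioned in the list above concrete, the following is a minimal pure-Python sketch of the textbook Doolittle elimination on a square matrix. It is not the libFLAME implementation, and `lu_nopivot` and its `nfact` parameter are hypothetical names for this example. In particular, treating "incomplete" as "stop after the first `nfact` elimination steps" is an assumption made here, not a statement about how ?getrfnpi behaves.

```python
# Illustrative sketch only: textbook algorithm, not the libFLAME code.

def lu_nopivot(a, nfact=None):
    """Factor a square matrix in the form L*U without row exchanges.

    Returns a new matrix holding U on and above the diagonal and the
    multipliers of the unit-lower-triangular L below it.

    nfact (hypothetical parameter): number of elimination steps to run.
    With nfact < n, the trailing block is left as the partially updated
    Schur complement. Default is a complete factorization.
    """
    n = len(a)
    if any(len(row) != n for row in a):
        raise ValueError("matrix must be square")
    steps = n if nfact is None else nfact
    if not 0 <= steps <= n:
        raise ValueError("nfact must be between 0 and n")

    lu = [[float(v) for v in row] for row in a]  # work on a copy
    for k in range(steps):
        pivot = lu[k][k]
        if pivot == 0.0:
            # Without pivoting there is no row to swap in, so we must stop.
            raise ZeroDivisionError(f"zero pivot at step {k}")
        for i in range(k + 1, n):
            lu[i][k] /= pivot                    # multiplier l_ik
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]  # update trailing block
    return lu
```

Skipping row exchanges avoids the data movement that pivoting requires, but the factorization is only safe when every pivot is nonzero and not too small, for example with diagonally dominant matrices.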
The package containing the libFLAME binaries, examples, and documentation is available in the Downloads section below.
Source code is available on GitHub at https://github.com/amd/libflame.
Refer here for older versions