In March 2008, AMD initiated SSEPlus, an open-source project to help developers write high-performing SSE code. The SSEPlus library simplifies SIMD development through optimized emulation of SSE instructions, CPUID wrappers, and fast versions of key SIMD algorithms. SSEPlus is available under the Apache v2.0 license.

Originally created as a core technology in the Framewave open-source library, SSEPlus greatly enhances developer productivity. It provides known-good versions of common SIMD operations with focused platform optimizations. By taking advantage of the optimized emulation, a developer can write algorithms once and compile for multiple target architectures. This feature also allows developers to use future SSE instructions before the actual target hardware is available.
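The write-once idea described above can be sketched in C as follows. This is only an illustration, not SSEPlus code: the names `demo_abs_epi32` and `demo_abs4` are hypothetical, and the sketch assumes an x86 compiler that defines `__SSSE3__` when SSSE3 is enabled (as GCC and Clang do). When SSSE3 is available the native `_mm_abs_epi32` intrinsic is used; otherwise the same result is emulated with SSE2 instructions.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */
#if defined(__SSSE3__)
#include <tmmintrin.h>   /* SSSE3 intrinsics, only when the target has them */
#endif
#include <stdint.h>

/* Hypothetical helper (not the SSEPlus API): absolute value of four
 * packed 32-bit integers, selected at compile time. */
static inline __m128i demo_abs_epi32(__m128i x)
{
#if defined(__SSSE3__)
    /* Native SSSE3 instruction (PABSD). */
    return _mm_abs_epi32(x);
#else
    /* SSE2 emulation: sign is 0 for non-negative lanes, -1 for negative
     * lanes, so (x ^ sign) - sign negates only the negative lanes. */
    __m128i sign = _mm_srai_epi32(x, 31);
    return _mm_sub_epi32(_mm_xor_si128(x, sign), sign);
#endif
}

/* Array wrapper so the helper can be called from ordinary scalar code. */
void demo_abs4(const int32_t in[4], int32_t out[4])
{
    __m128i v = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, demo_abs_epi32(v));
}
```

Calling code uses `demo_abs_epi32` the same way on every target; only the compiler flags (for example `-mssse3` on GCC or Clang) decide which path is built.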


SSEPlus Project - Features
C/C++ APIs similar to SSE compiler intrinsics

CPUID management functions

Optimized emulation of SSE3, SSSE3, SSE4A, and SSE4.1 instructions

Implementations optimized for multiple target architectures

Hundreds of additional high-performance SIMD functions

New SIMD operations include arithmetic and logical functions, fixed accuracy math, sophisticated packing and unpacking operations, trigonometry, and more

Macros and include files that improve developer productivity when managing multiple target architectures

Active development and community participation
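As a rough illustration of what a CPUID management function has to do, the sketch below queries the SSE feature bits directly. It is not the SSEPlus API: the type `demo_simd_features` and the function `demo_query_simd` are hypothetical names, and the code assumes GCC or Clang on x86, where `<cpuid.h>` provides `__get_cpuid`.

```c
#include <cpuid.h>     /* GCC/Clang helper for the CPUID instruction */
#include <stdbool.h>

/* Hypothetical result type (not the SSEPlus API). */
typedef struct {
    bool sse2;
    bool sse3;
    bool ssse3;
    bool sse41;
    bool sse4a;
} demo_simd_features;

/* Query the CPU once and report which SSE revisions it supports. */
demo_simd_features demo_query_simd(void)
{
    demo_simd_features f = { false, false, false, false, false };
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* Standard leaf 1: feature flags in ECX and EDX. */
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        f.sse2  = ((edx >> 26) & 1u) != 0;
        f.sse3  = ((ecx >> 0)  & 1u) != 0;
        f.ssse3 = ((ecx >> 9)  & 1u) != 0;
        f.sse41 = ((ecx >> 19) & 1u) != 0;
    }

    /* Extended leaf 0x80000001: SSE4A is reported in ECX bit 6. */
    if (__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx)) {
        f.sse4a = ((ecx >> 6) & 1u) != 0;
    }
    return f;
}
```

A program would typically call such a function once at startup and use the result to choose between code paths built for different SSE revisions.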


SSEPlus Project - Benefits
Developers no longer have to rewrite their algorithms for each SSE revision

Simplified CPUID checking

Simplified maintenance of code that targets different SSE instruction mixes

SSEPlus provides containers for operations that would be desirable as hardware instructions (e.g., 32-bit integer divide)

Helps developers use and implement instructions that match their own algorithms

Optimize code once for the target hardware while ensuring that the generated code uses only instructions the target supports
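The 32-bit integer divide mentioned in the list above is a useful example, because SSE has no packed integer division instruction. The following is a minimal sketch of one way to emulate it with SSE2 by dividing in double precision; it is not taken from SSEPlus, and the names `demo_div_epi32` and `demo_div4` are hypothetical. Division by zero and `INT32_MIN / -1` are not handled specially: those lanes come back as 0x80000000.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper (not the SSEPlus API): truncating signed division of
 * four packed 32-bit integers. A double represents every 32-bit integer
 * exactly, and truncating the double quotient yields the exact integer
 * quotient for all in-range results. */
static inline __m128i demo_div_epi32(__m128i a, __m128i b)
{
    /* Lanes 0 and 1: convert the low two integers to doubles. */
    __m128d a_lo = _mm_cvtepi32_pd(a);
    __m128d b_lo = _mm_cvtepi32_pd(b);

    /* Lanes 2 and 3: shift the upper 64 bits down, then convert. */
    __m128d a_hi = _mm_cvtepi32_pd(_mm_srli_si128(a, 8));
    __m128d b_hi = _mm_cvtepi32_pd(_mm_srli_si128(b, 8));

    /* Divide, then convert back with truncation toward zero. Each result
     * holds two integers in its low 64 bits and zeros above them. */
    __m128i q_lo = _mm_cvttpd_epi32(_mm_div_pd(a_lo, b_lo));
    __m128i q_hi = _mm_cvttpd_epi32(_mm_div_pd(a_hi, b_hi));

    /* Recombine the two halves into one vector of four quotients. */
    return _mm_unpacklo_epi64(q_lo, q_hi);
}

/* Array wrapper so the helper can be called from ordinary scalar code. */
void demo_div4(const int32_t a[4], const int32_t b[4], int32_t q[4])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)q, demo_div_epi32(va, vb));
}
```

Wrapping the emulation in a single function keeps the calling code unchanged if a faster implementation, or a real hardware instruction, becomes available later.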

Related Resources

SSEPlus Project Overview

SSEPlus project on

Framewave Project