In March 2008, AMD initiated SSEPlus, an open-source project to help developers write high-performance SSE code. The SSEPlus library simplifies SIMD development through optimized emulation of SSE instructions, CPUID wrappers, and fast versions of key SIMD algorithms. SSEPlus is available under the Apache License, Version 2.0.
SSEPlus was originally created as a core technology of the Framewave open-source library and is intended to improve developer productivity. It provides known-good versions of common SIMD operations with focused platform optimizations. Because instructions that a given target lacks are replaced by optimized emulation, a developer can write an algorithm once and compile it for multiple target architectures. The same emulation lets developers use newer SSE instructions before hardware that implements them is available.
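The write-once idea can be illustrated with a minimal sketch. It is not taken from the SSEPlus sources: the helper name abs4_i32 is hypothetical, and the code assumes a GCC- or Clang-style compiler that defines __SSSE3__ and __SSE2__ according to the selected target. When SSSE3 is enabled, the native _mm_abs_epi32 intrinsic is used; on an SSE2-only target the same result is emulated with a shift, XOR, and subtract; on any other target a scalar loop is used.

```c
#include <stdint.h>

#if defined(__SSSE3__)
#include <tmmintrin.h>   /* SSSE3 intrinsics, including _mm_abs_epi32 */
#elif defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics only */
#endif

/* Hypothetical helper (not an SSEPlus function): replaces four packed
   32-bit integers with their absolute values. */
static void abs4_i32(int32_t v[4])
{
#if defined(__SSSE3__)
    /* Target has the instruction: use it directly. */
    __m128i x = _mm_loadu_si128((const __m128i *)v);
    _mm_storeu_si128((__m128i *)v, _mm_abs_epi32(x));
#elif defined(__SSE2__)
    /* Emulation for SSE2-only targets: with s = x >> 31 (0 or -1 per lane),
       (x ^ s) - s yields |x|. */
    __m128i x = _mm_loadu_si128((const __m128i *)v);
    __m128i s = _mm_srai_epi32(x, 31);
    x = _mm_sub_epi32(_mm_xor_si128(x, s), s);
    _mm_storeu_si128((__m128i *)v, x);
#else
    /* Portable fallback; negation is done in unsigned arithmetic to avoid
       signed overflow. */
    for (int i = 0; i < 4; ++i) {
        if (v[i] < 0)
            v[i] = (int32_t)(0u - (uint32_t)v[i]);
    }
#endif
}
```

The caller sees one function regardless of target; only the compiler flags (for example -mssse3) decide which body is built.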
C/C++ APIs similar to SSE compiler intrinsics
CPUID management functions
Optimized emulation of SSE3, SSSE3, SSE4A, and SSE4.1 instructions
Implementations optimized for multiple target architectures
Hundreds of additional high-performance SIMD functions
New SIMD operations include arithmetic and logical functions, fixed accuracy math, sophisticated packing and unpacking operations, trigonometry, and more
Macros and include files that simplify managing multiple target architectures
Active development and community participation
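As an illustration of what CPUID management involves, the following sketch queries the SSE feature bits directly. It does not use the SSEPlus API: the sse_features type and detect_sse function are hypothetical names, and the code assumes GCC or Clang on x86, where <cpuid.h> provides __get_cpuid. On other compilers or architectures it reports every feature as absent.

```c
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>       /* __get_cpuid (GCC/Clang, x86 only) */
#define HAVE_CPUID_H 1
#endif

/* Hypothetical result type: one flag (0 or 1) per SSE revision. */
typedef struct {
    int sse2, sse3, ssse3, sse41, sse4a;
} sse_features;

/* Hypothetical helper: reads the CPUID feature bits at run time. */
static sse_features detect_sse(void)
{
    sse_features f = { 0, 0, 0, 0, 0 };
#ifdef HAVE_CPUID_H
    unsigned int eax, ebx, ecx, edx;

    /* Standard leaf 1: feature flags in ECX and EDX. */
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        f.sse2  = (int)((edx >> 26) & 1u);
        f.sse3  = (int)(ecx & 1u);
        f.ssse3 = (int)((ecx >> 9) & 1u);
        f.sse41 = (int)((ecx >> 19) & 1u);
    }

    /* Extended leaf 0x80000001: ECX bit 6 reports SSE4A (AMD). */
    if (__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        f.sse4a = (int)((ecx >> 6) & 1u);
#endif
    return f;
}
```

A program would typically call such a function once at startup and select a code path from the result.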
Developers no longer have to rewrite their algorithms to target multiple SSE revisions
Simplified CPUID checking
Simplified maintenance of code that targets different SSE instruction mixes
SSEPlus provides functions that stand in for instructions that would be desirable in hardware (e.g., 32-bit integer divide)
Helps developers use and implement instructions that match their own algorithms
Code is optimized once, while the generated code is kept within the instruction set the target hardware supports
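The 32-bit integer divide mentioned above is a useful example because SSE has no packed integer divide instruction. The sketch below shows one possible emulation and is not SSEPlus's implementation: the name div4_i32 is hypothetical, and the approach assumed here converts the lanes to double precision, divides, and truncates back to integers. A double represents every 32-bit integer exactly, and the rounding error of the quotient is too small to cross an integer boundary, so the truncated result matches C integer division for valid inputs.

```c
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Hypothetical helper (not an SSEPlus function): q[i] = a[i] / b[i] for
   four 32-bit lanes, truncating toward zero. The caller must ensure that
   b[i] != 0 and must avoid INT32_MIN / -1. */
static void div4_i32(const int32_t a[4], const int32_t b[4], int32_t q[4])
{
#if defined(__SSE2__)
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    /* Lanes 0 and 1: convert to double and divide. */
    __m128d lo = _mm_div_pd(_mm_cvtepi32_pd(va), _mm_cvtepi32_pd(vb));

    /* Lanes 2 and 3: shift the upper 64 bits down, then do the same. */
    __m128d hi = _mm_div_pd(_mm_cvtepi32_pd(_mm_srli_si128(va, 8)),
                            _mm_cvtepi32_pd(_mm_srli_si128(vb, 8)));

    /* Truncating conversion back to int32; each result occupies the low
       64 bits of its register. */
    __m128i qlo = _mm_cvttpd_epi32(lo);
    __m128i qhi = _mm_cvttpd_epi32(hi);

    /* Recombine the two halves into one vector of four quotients. */
    _mm_storeu_si128((__m128i *)q, _mm_unpacklo_epi64(qlo, qhi));
#else
    /* Portable fallback for targets without SSE2. */
    for (int i = 0; i < 4; ++i)
        q[i] = a[i] / b[i];
#endif
}
```

Wrapping the sequence in a single function gives the algorithm one call site that could later map to a hardware instruction, should one become available.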