OpenCL™ Optimization Case Study: Diagonal Sparse Matrix Vector Multiplication 
Bryan Catanzaro  5/10/2010 

Conclusion

Taking a close look at optimizing DIA sparse matrix vector multiply has illustrated several techniques for getting good performance with OpenCL™ C code:

  1. Pay attention to the interplay between SIMD execution and your data structure.
  2. Align and densify accesses as much as possible.
  3. Use local memory to eliminate off-chip memory accesses.
  4. Vectorize your code for greater efficiency.
  5. Use OpenCL™ images for intermediate-sized data structures with hard-to-predict access patterns, but lots of reuse, when targeting the GPU.
  6. When targeting the CPU, consider tailoring the amount of parallelism you express to the natural parallelism of the processor.
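
To make the first two points concrete, here is a minimal plain C reference sketch of a DIA sparse matrix vector multiply. It is not the tuned OpenCL™ kernel from this article. The function name `dia_spmv_ref`, the parameter names, and the padded `stride` argument are assumptions chosen for illustration. The sketch assumes diagonal-major storage, so that consecutive rows of one diagonal sit next to each other in memory.

```c
#include <assert.h>

/* Reference DIA sparse matrix-vector multiply, y = A * x, in plain C.
 * Assumed layout (illustrative): diagonal-major storage, so the element at
 * (row, row + offsets[d]) lives at data[d * stride + row], with
 * stride >= num_rows. Slots that fall outside the matrix are padding. */
void dia_spmv_ref(const float *data, const int *offsets,
                  int num_rows, int num_cols, int num_diags, int stride,
                  const float *x, float *y)
{
    assert(stride >= num_rows);

    /* On a GPU, each iteration of this outer loop would be one work-item. */
    for (int row = 0; row < num_rows; ++row) {
        float acc = 0.0f;
        for (int d = 0; d < num_diags; ++d) {
            int col = row + offsets[d];
            /* Skip the parts of a diagonal that run off the matrix edge. */
            if (col >= 0 && col < num_cols) {
                acc += data[d * stride + row] * x[col];
            }
        }
        y[row] = acc;
    }
}
```

With this layout, adjacent rows read adjacent elements of `data` for the same diagonal, which is the dense, aligned access pattern that SIMD hardware favors. A row-major layout would instead place the elements read by neighboring rows `num_diags` elements apart.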

With these techniques, we've been able to construct a high-performance DIA sparse matrix vector multiply routine that efficiently uses the resources of the ATI Radeon HD 5870 GPU.

The same code that worked well on the GPU also performs decently on the CPU, and slightly adjusting the parallelism of the computation to better fit the CPU improved CPU performance somewhat further.
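
One way to fit the parallelism to a CPU is to give each worker a contiguous block of rows rather than a single row. The plain C sketch below shows only the partitioning arithmetic. The `row_range` type and `rows_for_worker` helper are hypothetical names introduced here, not code from this article, and the best number of workers is something to measure on your own hardware.

```c
#include <assert.h>

/* Half-open range of rows [begin, end) assigned to one worker. */
typedef struct {
    int begin;
    int end;
} row_range;

/* Split num_rows into num_workers contiguous blocks whose sizes differ by
 * at most one row. Hypothetical helper, for illustration only. */
row_range rows_for_worker(int num_rows, int num_workers, int worker)
{
    assert(num_rows >= 0);
    assert(num_workers > 0 && worker >= 0 && worker < num_workers);

    int base = num_rows / num_workers;   /* rows every worker gets */
    int extra = num_rows % num_workers;  /* first `extra` workers get one more */

    row_range r;
    r.begin = worker * base + (worker < extra ? worker : extra);
    r.end = r.begin + base + (worker < extra ? 1 : 0);
    return r;
}
```

Each worker then loops over its own rows, so the number of workers can be set close to the number of hardware threads instead of the number of matrix rows.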

As you write your own code, careful attention to these principles will help you achieve high-performance results. Good luck!


OpenCL™ and the OpenCL™ logo are trademarks of Apple Inc. used by permission by Khronos.

© 2010 Advanced Micro Devices, Inc. AMD, the AMD Arrow logo, AMD Opteron, AMD Athlon, AMD Turion, AMD Sempron, AMD Phenom, ATI Radeon, Catalyst, AMD LIVE!, and combinations thereof, are trademarks of Advanced Micro Devices, Inc. Microsoft and Windows are registered trademarks of Microsoft Corporation in the United States and/or other jurisdictions. Linux is a registered trademark of Linus Torvalds. Other names are for informational purposes only and may be trademarks of their respective owners.
