The processor formerly codenamed AMD “Magny-Cours” (part of the Family 10h processor family) introduces key technology advancements that build on the foundation laid by its predecessors, formerly codenamed AMD “Barcelona,” “Shanghai,” and “Istanbul.” With “Barcelona,” we introduced an array of innovations in processor design and features, including a native quad-core architecture and a new L3 cache shared across the processor cores. The AMD “Shanghai” release brought additional enhancements, including improved scalability and availability and a larger L3 cache. The AMD “Istanbul” processor provided further enhancements for software developers: an even larger shared L3 cache, a total of six physical cores on die, a new probe filter called HT Assist to help increase bandwidth, several new power features, and I/O virtualization. “Magny-Cours” adds even more cores, for a total of up to 12 cores per processor, and enhances power management, virtualization, and Direct Connect Architecture.
A number of software-visible features can be leveraged to make your applications perform better and scale across multiple cores. Visit this page regularly for updated information and practical guidance on how to take advantage of the new features in the latest Family 10h processors.
The following software development tools and resources have been optimized for Family 10h processors:
AMD Core Math Library (ACML): ACML is specifically designed to support multi-threading and other key features of AMD’s next-generation processors. ACML currently supports OpenMP and features hand-tuned “Barcelona,” “Shanghai,” “Istanbul,” and “Magny-Cours” support for the BLAS matrix multiplication routines and the CFFT complex-complex Fast Fourier Transforms. The newly released ACML 4.4.0 includes further tuning of ZGEMM and the real-complex FFTs.
GNU Toolset: The GNU Toolset, including the GCC compiler, the glibc project, and the binutils, has been optimized for AMD Family 10h processors.
Microsoft Visual Studio® compilers: The Visual Studio 2008 tools feature improved instruction selection, optimized register allocation, and enhanced 128-bit floating-point performance when used with AMD Family 10h processors.
x86 Open64 Compiler Suite: The x86 Open64 compiler system is a high-performance, production-quality code generation tool designed for high-performance parallel computing workloads. The x86 Open64 environment gives the developer the essential choices when building and optimizing C, C++, and Fortran applications targeting 32-bit and 64-bit Linux platforms.
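For readers unfamiliar with the ZGEMM routine named in the ACML entry above, the following is a minimal reference sketch of the operation it computes in the standard BLAS definition: C ← αAB + βC over double-precision complex matrices. It is not ACML's interface and says nothing about how ACML tunes the routine. The function name `zgemm_reference` is hypothetical, and the sketch covers only the non-transposed case.

```python
from typing import List

Matrix = List[List[complex]]


def zgemm_reference(alpha: complex, a: Matrix, b: Matrix,
                    beta: complex, c: Matrix) -> Matrix:
    """Return alpha * (a @ b) + beta * c for complex matrices.

    Hypothetical helper for illustration only: a is m x k, b is k x n,
    c is m x n, and no transposition options are supported.
    """
    m, k, n = len(a), len(b), len(b[0])
    result: Matrix = []
    for i in range(m):
        row = []
        for j in range(n):
            # Dot product of row i of a with column j of b.
            acc = sum(a[i][p] * b[p][j] for p in range(k))
            row.append(alpha * acc + beta * c[i][j])
        result.append(row)
    return result


if __name__ == "__main__":
    a = [[1 + 1j, 2], [0, 1j]]
    identity = [[1, 0], [0, 1]]
    zeros = [[0, 0], [0, 0]]
    print(zgemm_reference(2, a, identity, 1, zeros))
```

A tuned library produces the same mathematical result while blocking the loops for cache reuse and spreading the work across cores, which is where processor-specific tuning pays off.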
Previous new feature flags for Family 10h functions:
Feature identification bits for new instructions
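As a rough illustration of how feature identification bits are typically consumed, the sketch below decodes raw CPUID register values that have already been read by other means. The features shown (POPCNT, ABM/LZCNT, SSE4A, misaligned SSE mode) are not named in the text above; they are an assumption about which instructions are meant, and the bit positions should be checked against the AMD CPUID specification before use. The function names and the sample EAX value are hypothetical.

```python
from typing import Dict


def decode_family(eax: int) -> int:
    """Return the display family from CPUID Fn0000_0001 EAX."""
    base_family = (eax >> 8) & 0xF      # EAX[11:8]
    ext_family = (eax >> 20) & 0xFF     # EAX[27:20]
    # The extended family is added only when the base family is 0xF.
    if base_family == 0xF:
        return base_family + ext_family
    return base_family


def decode_features(std_ecx: int, ext_ecx: int) -> Dict[str, bool]:
    """Decode selected feature bits (assumed positions, see lead-in).

    std_ecx: ECX from CPUID Fn0000_0001
    ext_ecx: ECX from CPUID Fn8000_0001
    """
    return {
        "POPCNT": bool(std_ecx & (1 << 23)),
        "ABM (LZCNT)": bool(ext_ecx & (1 << 5)),
        "SSE4A": bool(ext_ecx & (1 << 6)),
        "MisAlignSse": bool(ext_ecx & (1 << 7)),
    }


if __name__ == "__main__":
    sample_eax = 0x00100F91  # illustrative value: base family 0xF, extended family 0x01
    print(hex(decode_family(sample_eax)))  # prints 0x10
    print(decode_features(1 << 23, (1 << 5) | (1 << 6)))
```

Checking individual feature bits rather than the family number alone lets code keep working on later processors that carry the same instructions under a different family.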
The following documents contain the latest information on the formerly codenamed “Magny-Cours” Family 10h processors.
There are several new features in power and virtualization, but the most prominent new feature is the increase to 8 or 12 cores per processor, made possible by our Direct Connect Architecture. This technical article outlines what enhancements were made and how they will benefit your code.
New features in AMD’s upcoming Barcelona chip dramatically boost performance of floating-point arithmetic and greatly accelerate access to cache.
Take advantage of the many architectural innovations in the "Barcelona" processor through Orcas-based tools and AMD libraries.
AMD’s new chip architecture extends a long tradition of giving developers the features they need to execute their code blindingly fast. What's in it for you?
AMD (Family 10h) Processor Software Visible Features blog series
“Magny-Cours” blogs
Previous “Shanghai” blogs
Previous “Barcelona” blogs
Virtualization
Shanghai-based Dell Systems take top scores for VMmark 8 core and 16 core systems. http://www.vmware.com/products/vmmark/results.html
This VMware performance white paper evaluating RVI performance with the Shanghai processor concludes that "the current VMware VMM leverages these features quite well, resulting in performance gains of up to 42% for MMU-intensive benchmarks and up to 500% for MMU-intensive microbenchmarks." http://www.vmware.com/resources/techresources/1079
HP ProLiant DL585 G5 earns #1 virtualization performance record on VMmark benchmark. ftp://ftp.compaq.com/pub/products/servers/benchmarks/proliant_dl585_vmmark_080408.pdf
The very first independent Nested Paging Virtualization tests (2 socket servers running Xen with database and web serving workloads and featuring AMD-V (RVI)). http://www.anandtech.com/weblog/showpost.aspx?i=467
HPC
“Jaguar,” the AMD Opteron-based system by Cray at Oak Ridge National Labs, is the first entirely x86-based system to break the Petaflop barrier. http://www.marketwatch.com/news/story/Cray-Supercomputer-Oak-Ridge-Smashes/story.aspx?guid=%7B25D20E9B-D6BD-4CA5-B7F6-3484D9616D7C%7D
Web Serving
HP ProLiant DL585 G5 and DL385 G5 AMD Opteron servers lead with 4P, 2P world record performances on the SPECweb®2005 Benchmark. ftp://ftp.compaq.com/pub/products/servers/benchmarks/hp_proliant_dl585_385_specweb2006_073008.pdf (Please note that Dual-Core AMD Opteron processors also hold the SPECWeb2005 performance records for 2P and 4P servers.)
Database
An 8 socket Shanghai-based HP system achieves the top x86-based score with Oracle and a 2 socket Shanghai-based HP system achieves the top x86-based score with SQL Server 2005. http://www.sap.com/solutions/benchmark/sd2tier.epx
AnandTech is "quite surprised that Shanghai was able to meet and, in some cases, pass Harpertown at various workload levels in some of the benchmarks." http://www.anandtech.com/showdoc.aspx?i=3456&p=7
HP ProLiant DL585 G5 with Quad-Core AMD Opteron processors takes #1 4-socket worldwide price/performance record again on TPC-C benchmark. ftp://ftp.compaq.com/pub/products/servers/benchmarks/hp_proliant%20dl585_tpc_080208.pdf
HP ProLiant DL785 G5 achieves #1 8P non-clustered performance and price/performance on TPC-H@300GB benchmark. ftp://ftp.compaq.com/pub/products/servers/benchmarks/dl785g5-tpch300gb-0708.pdf
Business Applications
HP ProLiant BL465c G5 server blade posts HP’s first Quad-Core AMD Opteron™ blade result on Oracle Applications Standard Benchmark (small model, single DB instance). ftp://ftp.compaq.com/pub/products/servers/benchmarks/hp_proliant_bl460c%20_siebel_perf_brief_051408.pdf
HP ProLiant DL585 G5 achieves #1 4-processor Windows result on two-tier SAP® Sales and Distribution Standard Application Benchmark. ftp://ftp.compaq.com/pub/products/servers/benchmarks/dl585g5_2tsapsd_071408.pdf
HP ProLiant DL785 G5 takes #1 8-processor Windows result with new Quad-Core AMD Opteron™ processors on two-tier SAP® Sales and Distribution Standard Application Benchmark. ftp://ftp.compaq.com/pub/products/servers/benchmarks/dl785g5_2tsapsd_may08.pdf
HP ProLiant servers show excellent performance scalability with new Quad-Core AMD Opteron processors on two-tier SAP® Sales and Distribution (SD) Standard Application Benchmark (2 socket and 4 socket blades and servers). ftp://ftp.compaq.com/pub/products/servers/benchmarks/HP_ProLiant_DL385_BL685c_2tSAPSD_March2708.pdf
Java Application Serving
Quad-Core AMD Opteron processor-based Sun X4600 server sets x86 SPECjbb2005 world record (8 socket server). http://www.sun.com/aboutsun/pr/2008-08/sunflash.20080807.1.xml
Floating Point Performance
HP ProLiant DL585 G5 server with latest Quad-Core AMD Opteron™ processors takes overall x86_64 records on SPEC® CPU2006 benchmark. ftp://ftp.compaq.com/pub/products/servers/benchmarks/dl585_g5_speccpu2006_july08.pdf