AMD EPYC 9004 “Genoa Zen 4” Data Center CPUs Official: Zen 4 Powers The Fastest Server CPUs On The Planet With Up To 96 Cores & 192 Threads

The AMD EPYC 9004 Genoa family is the start of a new server ecosystem for the EPYC brand. It covers multiple segments and multiple families. The AMD Zen 4 lineup for data centers will be split into three families: the standard Zen 4 for EPYC Genoa, the compute density-optimized Zen 4C for EPYC Bergamo, and the cache-optimized Zen 4 V-Cache within the EPYC Genoa-X series. Furthermore, the lineup will feature a cost-optimized, entry-level server offering known as EPYC Siena, which will use the same Zen 4 cores but on an entirely new platform known as SP6, which will once again focus on optimizing TCO compared to SP5. The Siena lineup will be branded under the EPYC 8004 family.

AMD EPYC 9004 “Genoa Zen 4” Server CPU Lineup

The AMD EPYC 9004 Genoa “Zen 4” CPUs are based on a 5nm chiplet architecture, which we have already seen on the Ryzen 7000 and Radeon 7000 products. The CPU delivers a 14% increase in IPC, 1% more than the figure quoted for the consumer Zen 4 parts. The reason for the slight difference is that the geomean is taken across a larger set of workloads than the consumer-centric workloads used for the Ryzen chips. The 5nm process node makes use of 4th Gen FinFET technology and an enhanced metal stack, and has been optimized for high performance.

The standard Zen 4 lineup will feature up to 12 CCDs, 96 cores, and 192 threads. Each CCD will come with 32 MB of L3 cache and 1 MB of L2 cache per core. The EPYC 9004 CPUs will pack the latest instructions such as BFLOAT16, VNNI, and AVX-512 (256b data path), addressable memory of 57b/52b, and an updated IOD with an internal AMD Gen3 Infinity Fabric architecture with higher bandwidth (die-to-die interconnect). This provides up to four 32 Gb/s socket-to-socket fabric links for fast CPU interconnect, a 1.9x improvement versus the previous-gen Infinity Fabric design.

AMD compared its Zen 4 core plus its L2 cache to an Ice Lake-SP (Sunny Cove) core plus its L2 cache. As per the details, Genoa offers a 40% reduced area while delivering 48% higher energy efficiency vs the competition.

The AMD EPYC 9004 “Genoa” CPU lineup is split into six SKU segments which include:

Density Optimized, Cache Optimized, Frequency Optimized, Cost Optimized, Density + Frequency, Balanced
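As a quick sanity check, the per-CCD and per-core cache figures quoted above can be multiplied out for a given configuration. The sketch below is only illustrative arithmetic in Python; the helper name `genoa_totals` is hypothetical, and two threads per core is an assumption inferred from the 96-core/192-thread figure.

```python
# Illustrative arithmetic only; constants are the figures quoted in the article.
L3_PER_CCD_MB = 32      # L3 cache per CCD
L2_PER_CORE_MB = 1      # L2 cache per core
THREADS_PER_CORE = 2    # assumption inferred from 96 cores / 192 threads


def genoa_totals(ccds: int, cores: int) -> dict:
    """Return thread and cache totals for a hypothetical CCD/core configuration."""
    return {
        "threads": cores * THREADS_PER_CORE,
        "l3_mb": ccds * L3_PER_CCD_MB,
        "l2_mb": cores * L2_PER_CORE_MB,
    }


top = genoa_totals(ccds=12, cores=96)
print(top)
```

For the 12 CCD, 96-core configuration this gives 192 threads and 384 MB of L3 cache, matching the headline figures.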

The top parts are based on a 12 CCD SKU with up to 96 cores, 192 threads, and 384 MB of L3 cache. These SKUs will have TDPs ranging from 360W to 400W. Next up, we have the 8 CCD parts, a total of 16 SKUs that range from 16 cores up to 64 cores. These chips pack 256 MB of L3 cache and TDPs of 280W, 320W, or 360W. Lastly, we have the 4 CCD parts, which include 4 SKUs that offer 16-32 cores. These chips will offer 64 to 128 MB of L3 cache and TDPs in the range of 200-210 Watts. As for frequencies, the EPYC 9004 “F” SKUs will have a boost target range above 4.0 GHz and the rest of the chips are rated at around 4.0 GHz (boost).

Compared to EPYC Milan, the AMD Zen 4 CCD is 11% smaller than the Zen 3 CCD (72mm² vs 80mm²). The IOD is also 5% smaller (397mm² vs 416mm²). The package and socket size has increased a lot, mainly because EPYC Genoa chips incorporate 50% more CCDs than EPYC Milan chips (12 vs 8 CCDs). The Genoa package measures 5428mm² and the socket has a total area of 6080mm², while SP3 measures 4410mm². Do note how the number of pins comes close to the area of each respective socket.

The AMD EPYC 9004 “Genoa Zen 4” CPUs have already broken over 300 performance records across various workloads. The list of records includes:

Data Management

22 Structured Data and analytics
28 Unstructured Data and analytics

Engineering/Technical

72 High performance computing apps
58 Floating Point performance
12 Floating Point Energy Efficiency

Infrastructure

23 Cloud and Virtualization
15 Integer Performance
11 Cloud/VM/Integer Efficiency

Business applications

8 ERP Business Apps
48 Java-based Performance
18 Energy Efficiency
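The "over 300 records" claim can be checked by simply adding up the counts listed above. The following is a small Python tally, not anything published by AMD; the dictionary layout and variable names are just one hypothetical way to organize the numbers.

```python
# Record counts as listed in the article, grouped by category.
records = {
    "Data Management": {
        "Structured Data and analytics": 22,
        "Unstructured Data and analytics": 28,
    },
    "Engineering/Technical": {
        "High performance computing apps": 72,
        "Floating Point performance": 58,
        "Floating Point Energy Efficiency": 12,
    },
    "Infrastructure": {
        "Cloud and Virtualization": 23,
        "Integer Performance": 15,
        "Cloud/VM/Integer Efficiency": 11,
    },
    "Business applications": {
        "ERP Business Apps": 8,
        "Java-based Performance": 48,
        "Energy Efficiency": 18,
    },
}

# Sum each category, then sum the categories.
per_category = {name: sum(items.values()) for name, items in records.items()}
total = sum(per_category.values())

for name, count in per_category.items():
    print(f"{name}: {count}")
print(f"Total: {total}")
```

The listed categories add up to 315 records, which is consistent with the "over 300" figure.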

AMD SP5 “LGA 6096” Server CPU Platform

The LGA 6096 socket will feature 6096 pins arranged in the LGA (Land Grid Array) format. This will be by far the biggest socket that AMD has ever designed, with 2002 more pins than the existing LGA 4094 socket. We have already listed the size and dimensions of this socket above, so let’s talk about its power ratings. The peak power of the LGA 6096 SP5 socket will be rated at up to 700W, which will only last for 1ms; the peak power at 10ms is rated at 440W, while the peak power with PCC is rated at 600W. If the cTDP is exceeded, the EPYC chips on the SP5 socket will return to these limits within 30ms.

AMD’s EPYC Genoa CPUs will feature 128 PCIe Gen 5.0 lanes, of which 112 will be available since the remaining 16 are reserved, and 160 Gen 5 lanes in a 2P (dual-socket) configuration. The platform will also offer 12 PCIe Gen 3 lanes on 2P (8 lanes on 1P), 32 SATA lanes, and 64 IO lanes supporting CXL 1.1+ (with CXL 2.0 memory device support), with bifurcation down to x4, as well as SDCI (Smart Data Cache Injection).

The SP5 platform will also feature DDR5 memory support at up to 4800 Mbps, which is a huge improvement over the existing DDR4-3200 DIMMs. But that’s not all: it will support up to 12 DDR5 memory channels with 2 DIMMs per channel and interleaving options of 2, 4, 6, 8, 10, and 12 channels. Both RDIMM and 3DS RDIMM will be supported, for capacities of up to 6 TB per socket (using 256 GB 3DS RDIMMs), or up to 12 TB of system memory in a 2P configuration.
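The platform numbers above are easy to cross-check with a little arithmetic. The Python sketch below works through the pin delta, the per-socket memory capacity with 256 GB DIMMs, and a theoretical peak memory bandwidth at 4800 MT/s. The bandwidth figure is not from the article: it assumes 8 bytes of data per transfer per channel (a 64-bit data width, ECC excluded), and the function names are hypothetical.

```python
# Figures quoted in the article.
SP5_PINS = 6096
SP3_PINS = 4094
CHANNELS = 12
DIMMS_PER_CHANNEL = 2
DIMM_GB = 256           # 3DS RDIMM capacity used in the article's example
DDR5_MTS = 4800         # mega-transfers per second

# Assumption (not from the article): 64-bit data width per channel.
BYTES_PER_TRANSFER = 8


def capacity_tb(sockets: int = 1) -> float:
    """Maximum memory capacity in TB for the given number of sockets."""
    return sockets * CHANNELS * DIMMS_PER_CHANNEL * DIMM_GB / 1024


def peak_bandwidth_gbs() -> float:
    """Theoretical peak memory bandwidth per socket in GB/s."""
    return CHANNELS * DDR5_MTS * BYTES_PER_TRANSFER / 1000


extra_pins = SP5_PINS - SP3_PINS

print(f"Extra pins vs LGA 4094: {extra_pins}")
print(f"Capacity, 1P: {capacity_tb(1)} TB, 2P: {capacity_tb(2)} TB")
print(f"Theoretical peak bandwidth per socket: {peak_bandwidth_gbs()} GB/s")
```

Under these assumptions, 24 DIMMs of 256 GB give 6 TB per socket and 12 TB across two sockets. Real-world bandwidth will be lower than the theoretical peak.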

AMD EPYC Milan Zen 3 vs EPYC Genoa Zen 4 Size Comparisons:
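As a rough check of the size figures quoted earlier in the article, the percentage differences between Milan and Genoa can be computed directly. This Python sketch uses only the rounded areas and pin counts given in the text; the helper `pct_change` is a hypothetical name. With these rounded areas the CCD shrink works out to about 10%, so the quoted 11% presumably comes from unrounded die sizes.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new (negative means smaller)."""
    return (new - old) * 100 / old


# (Milan, Genoa) values as quoted in the article, areas in mm².
sizes_mm2 = {
    "CCD": (80, 72),
    "IOD": (416, 397),
    "Socket": (4410, 6080),   # SP3 vs SP5
}
pins = (4094, 6096)           # LGA 4094 vs LGA 6096

for name, (milan, genoa) in sizes_mm2.items():
    print(f"{name}: {milan} -> {genoa} mm² ({pct_change(milan, genoa):+.1f}%)")
print(f"Pins: {pins[0]} -> {pins[1]} ({pct_change(*pins):+.1f}%)")
```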

AMD EPYC 9004 “Genoa Zen 4” Server CPU Performance

In terms of performance, the charts showcase the SPEC2017 Integer (Base) benchmarks for 14 chips within the AMD EPYC Genoa lineup. At least five of the chips sit above 1000 points while the rest are positioned competitively in the mid-tier and entry-tier segments. All benchmarks were run on a 2P (dual-socket) platform, so two chips are used.

In floating point, AMD’s EPYC 9004 Genoa CPUs will deliver a 2.2x increase over the 3rd Gen EPYC Milan and a 2.52x improvement over the Intel 3rd Gen Xeon Platinum CPUs (Ice Lake). In Java server-side workloads, AMD’s EPYC 9004 Genoa CPUs will deliver a 2x increase over the 3rd Gen EPYC Milan and a 3x improvement over the Intel 3rd Gen Xeon Platinum CPUs (Ice Lake). In VMmark 3.1.1, AMD EPYC Genoa CPUs will offer around a 3x improvement while also offering 3x the virtual networking density versus Intel’s offerings.

In terms of energy efficiency, AMD has touted a 2.6x leap over Intel’s 3rd Gen Xeon Platinum (Ice Lake-SP) platform in the 2P integer performance-per-watt score on SPECrate 2017. With this higher performance per watt, new data centers and servers will see lower TCO and power draw while also requiring fewer servers to deliver much higher performance in HPC workloads. Comparing 15 Intel Xeon 8380 servers against 5 AMD EPYC 9654 servers, users will save 52% power with 67% fewer servers, and reduce CAPEX by 40% and OPEX by 61%.

In a far more interesting comparison, AMD showed that it would take 320,000 AMD EPYC 9654 servers to match 1 million Intel Xeon 8380 servers. At the same time, the AMD servers would save 4.3 billion worth of cash in kWh saved per year. This would reduce CO2 emissions by around 2.2M, the equivalent of 2.4M acres of forest needed to remove this CO2 from the atmosphere.

All leading AMD partners and ecosystem vendors are launching their full-fledged systems with the EPYC 9004 “Genoa Zen 4” CPUs starting today.
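The server-consolidation percentages follow directly from the server counts AMD quoted. A minimal Python sketch of that arithmetic is shown here; the function names are hypothetical and only the server counts come from the article (the power, CAPEX, and OPEX figures cannot be derived from them).

```python
def fewer_servers_pct(baseline: int, replacement: int) -> float:
    """Percentage reduction in server count when replacing baseline servers."""
    return (baseline - replacement) * 100 / baseline


def per_server_ratio(baseline: int, replacement: int) -> float:
    """How many baseline servers one replacement server stands in for."""
    return baseline / replacement


# Server counts quoted in the article: (Intel Xeon 8380, AMD EPYC 9654).
scenarios = {
    "Small deployment": (15, 5),
    "Large deployment": (1_000_000, 320_000),
}

for name, (intel, amd) in scenarios.items():
    print(
        f"{name}: {fewer_servers_pct(intel, amd):.0f}% fewer servers, "
        f"{per_server_ratio(intel, amd):.3g}x servers replaced per AMD server"
    )
```

The 15-to-5 comparison corresponds to the quoted 67% reduction, while the 1 million to 320,000 comparison works out to a similar 68%.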

AMD EPYC 9004 “Genoa Zen 4” Server CPU SKUs:
