Intel Xeon 6 Priority Cores as a Big NVIDIA GPU AI Server Feature

Today we have an announcement we found quite surprising. Intel is now touting its Speed Select Technology (SST) features as a selling point for NVIDIA GPU servers. We went into this technology in a lot more depth in our Intel Xeon 6 High-Priority and Low-Priority Cores Explained piece. Still, this is at least interesting.

In the upcoming NVIDIA DGX B300, NVIDIA is utilizing the Intel Xeon 6776P. This is a 64-core, 350W, 336MB L3 cache part. It is a big deal since winning a reference NVIDIA socket usually means a large number of customer systems on the HGX side will use the same CPUs.

There was a bit of marketing that we should call out in Intel’s release.

Looking more broadly, Intel Xeon 6 processors with P-cores deliver industry-leading features for any AI system, including:

High Core Counts and Exceptional Single-Threaded Performance: With up to 128 P-cores per CPU, these processors ensure balanced workload distribution for intensive AI tasks.
30% Faster Memory Speeds¹: When compared to the competition, Intel Xeon 6 offers superior memory performance at high-capacity configurations and supports leading-edge memory bandwidth with MRDIMMs and Compute Express Link…

¹ 2DPC memory configuration comparison, 2DPC on an Intel Xeon 6700P processor = 5,200 MT/s RDIMM speed; 2DPC on the latest AMD EPYC processor = 4,000 MT/s RDIMM speed (Source: Intel)

The 128 P-core parts are in the Intel Xeon 6900P series, which uses a different socket than the Xeon 6700P series used in the NVIDIA DGX B300.

The “memory speeds” claim is very carefully crafted because Intel is comparing the Xeon 6700P and the AMD EPYC 9005, both in 2DPC (two DIMMs per channel) mode, on transfer speed alone. One of the reasons the AMD EPYC 9005 drops memory speeds in 2DPC mode is that it has 50% more memory channels (12 versus 8). Those extra channels require longer traces, and thus there is a drop in speed.

If you run 12-channel 2DPC on the AMD EPYC 9005 against the Intel Xeon 6700P configuration that Intel is comparing, you end up with both more memory capacity and more memory bandwidth on the AMD side: 24 DIMMs per socket across 12 channels at 4000MT/s versus 16 DIMMs per socket across 8 channels at 5200MT/s. What is also notable is that if you only need the capacity of 12 DIMMs, you can run either the Xeon 6900P with MCR DIMMs/MRDIMMs or the EPYC 9005 at DDR5-6400 speeds, getting a mix of capacity and performance while still fitting two sockets side-by-side in a 19″ rack form factor.

On the topic of MRDIMMs, since they are mentioned: if you use them, you are going to be running the CPUs in 1DPC mode, not 2DPC mode. That gives you 8000MT/s on the Xeon 6700P with 8 channels and 8 DIMMs. The comparable AMD configuration is 6400MT/s with 12 channels and 12 DIMMs, giving you more capacity and bandwidth.
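To make the comparisons above concrete, here is a rough sketch of the theoretical peak bandwidth arithmetic. It assumes the usual 64-bit (8-byte) data path per DDR5 channel, and it assumes every DIMM has the same capacity so that DIMM count stands in for capacity. The function and configuration names are ours for illustration, and real-world sustained bandwidth will be lower than these peak figures.

```python
# Theoretical peak memory bandwidth per socket, in GB/s.
# Assumption: each DDR5 channel moves 8 bytes (64 bits) per transfer.
BYTES_PER_TRANSFER = 8


def peak_bandwidth_gbs(channels: int, mts: int) -> float:
    """channels * mega-transfers/s * bytes per transfer, converted MB/s -> GB/s."""
    return channels * mts * BYTES_PER_TRANSFER / 1000


def dimm_count(channels: int, dimms_per_channel: int) -> int:
    """DIMMs per socket; a proxy for capacity if all DIMMs are the same size."""
    return channels * dimms_per_channel


# (channels, DIMMs per channel, MT/s), using the figures quoted in the article.
CONFIGS = {
    "Xeon 6700P 2DPC RDIMM": (8, 2, 5200),
    "EPYC 9005 2DPC RDIMM": (12, 2, 4000),
    "Xeon 6700P 1DPC MRDIMM": (8, 1, 8000),
    "EPYC 9005 1DPC DDR5-6400": (12, 1, 6400),
}

for name, (channels, dpc, mts) in CONFIGS.items():
    bw = peak_bandwidth_gbs(channels, mts)
    print(f"{name}: {dimm_count(channels, dpc)} DIMMs, {bw:.1f} GB/s peak")
```

Under these assumptions, the 12-channel configuration comes out ahead on both DIMM count and peak bandwidth in the 2DPC and 1DPC cases alike, even though its per-DIMM transfer speed is lower.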

Final Words

Let us get to what is really happening here. We oftentimes do not get to show AMD EPYC-powered NVIDIA GPU servers because NVIDIA does not want its chips marketed alongside its chief GPU competitor. NVIDIA does not see Intel as a significant competitor, since Broadcom and AMD are much bigger threats in the AI market. Therefore, pairing Xeon with NVIDIA GPUs is preferred. Intel Xeon is in many ways the beneficiary of Rialto Bridge and Falcon Shores not coming to market.

Although Intel is playing it quite loose with the marketing here, being in the NVIDIA DGX reference design is a big deal. Many partner designs use the same processors that NVIDIA uses in the DGX. Looking ahead, NVIDIA has shown partners that it is looking at standardizing not just the HGX 8-GPU baseboard, but also the entire motherboard design, as it continues on its MGX journey. That will further increase the importance of winning the NVIDIA reference socket.
