Why do Nvidia, AWS, and Alibaba all favor Arm's server CPUs?

"The answer is very simple. By collaborating with Arm, they can build and optimize solutions for their own use cases and infrastructure," said Mohamed Awad, Senior Vice President and General Manager of Arm's Infrastructure Business Unit, at the 2023 Arm Tech Symposia annual technical conference.

Nvidia, one of the most important AI chip providers, values the customizability of Arm server CPUs just as the hyperscale cloud service providers do.

Nvidia's powerful GH200 superchip includes 72 Arm Neoverse cores. Paired with Nvidia's GPU, the GH200 delivers up to 10 times the AI performance of x86-based systems.

To meet more customers' needs for customized infrastructure, Arm has also introduced two important initiatives.

Why choose Arm Neoverse CPUs?

The GH200 Grace Hopper Superchip platform, which Nvidia announced in May 2023, is designed specifically to handle massive generative AI workloads. The Nvidia DGX GH200 supercomputer, equipped with 256 GH200 superchips, raises AI performance to an astonishing exaflop level.
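
To put these figures in perspective, here is a rough back-of-the-envelope calculation. It uses only the numbers quoted in this article (256 superchips, 72 Arm Neoverse cores each) and assumes, purely for illustration, that "exaflop level" means exactly 1 exaflop; the variable names are ours, not Nvidia's.

```python
# Back-of-the-envelope arithmetic using only figures quoted in the article.
SUPERCHIPS = 256          # GH200 superchips in one DGX GH200 (from the article)
CORES_PER_SUPERCHIP = 72  # Arm Neoverse cores per GH200 (from the article)
SYSTEM_FLOPS = 1e18       # "exaflop level", taken here as exactly 1 exaflop (assumption)

# Total Arm cores across the whole system.
total_cores = SUPERCHIPS * CORES_PER_SUPERCHIP

# Average AI throughput each superchip would need to contribute.
flops_per_superchip = SYSTEM_FLOPS / SUPERCHIPS

print(f"Arm cores in the system: {total_cores}")
print(f"AI throughput per superchip: {flops_per_superchip / 1e15:.2f} petaflops")
```

Under that assumption, the system contains 18,432 Arm cores, and each superchip would account for roughly 3.9 petaflops of AI throughput.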

The key to such powerful AI performance lies in the transformation of system architecture.

Traditional system architecture in the field of infrastructure

In the traditional server architecture, a general-purpose, off-the-shelf CPU (the host CPU) with its attached memory manages multiple accelerators connected over PCIe.

"This traditional architecture was the only one available on the market in the past," Mohamed Awad pointed out. "The problem with this architecture is that the general-purpose, off-the-shelf CPU and its interface to the accelerators directly limit the final performance of the product. Because all accelerators must go through this off-the-shelf CPU to access additional memory, memory coherence cannot be achieved and the accelerators cannot be fully utilized, so the architecture cannot support the requirements of generative AI well."

Facing new application requirements, modern system architectures have emerged in the field of infrastructure

The GH200 superchip changes the traditional architecture by using NVLink to connect each CPU directly to its own accelerator, achieving strong memory coherence. A key enabler is the customizable CPU: with this architecture, Nvidia can fully exploit the efficiency of its GPUs and maximize performance for real scenarios and use cases.

"Only by understanding the final use case and designing the CPU specifically for that scenario can we achieve better efficiency and get the best performance out of the product," Mohamed Awad further stated. "Nvidia has partnered with Arm to leverage the flexibility of Arm technology, design the chips it needs to further optimize the system, and take full advantage of Arm's powerful software ecosystem."

The next question is whether the architecture Nvidia has proposed will become mainstream in the era of generative AI.

"It is still too early to say whether pairing a CPU with a GPU as the accelerator will be the main or only trend in the future," Mohamed Awad told Leifeng.com. "We are in the era of accelerated computing, and in future architectures, no matter how they are coupled, there will always be an accelerator next to any general-purpose CPU. What makes Arm unique is that it can help partners build customized CPUs from scratch according to their needs and connect CPUs and accelerators well."

Because x86 vendors offer only standard, off-the-shelf CPU chips, an Arm CPU was the only good choice for the GH200 superchip platform, and this is also the key to the popularity of Arm Neoverse.

That is to say, standardized CPUs cannot meet the customization needs of infrastructure, and customizability has become Arm's trump card in the server market.

Customizability, Arm's "killer weapon" in the server market

In August 2023, Arm launched Arm Neoverse Compute Subsystems (CSS), which enable the Arm ecosystem to create specialized chips at lower cost, with less risk, and in less time.

The first-generation CSS product, Arm Neoverse CSS N2, integrates the Neoverse N2 platform in a validated configuration that is optimized for power, performance, and area (PPA).

"Through Neoverse CSS, we can help our partners further reduce investment, make our solutions more accessible to the entire ecosystem, and shorten the time to market of our partners' products," Mohamed Awad said.

Leifeng.com learned that some Arm customers have saved as much as 80 engineer-years of effort by using Neoverse CSS. Another customer using Neoverse CSS took only 13 months to go from concept to production.

Microsoft's recently released Cobalt 100 CPU is also based on Neoverse CSS.

"Arm Neoverse has many customers in the Chinese market, especially in the infrastructure sector, and has grown very strongly over the past three to four years," said Zou Ting, Global Vice President of Arm's China business. "Arm also actively participates in local ecosystems and open-source software communities in areas such as data centers and cloud computing, including the OpenAnolis community, to help these communities better integrate into Arm's global ecosystem."

Mohamed Awad also emphasized that China is one of Arm's most important markets. Chinese partners have shipped a cumulative total of 30 billion Arm-based chips, and Arm has nearly 400 technology licensees in China, a number that continues to rise every month.

The global ecosystem of Arm is also the key to meeting the differentiated needs of customers. Building on Neoverse CSS, Arm has launched Arm Total Design, which further combines the power of the ecosystem to simplify the development process of customized chips and make delivery easier and more convenient.

With the launch of Arm Total Design, ASIC design houses can start design projects quickly and offer their design solutions to customers at any time; IP suppliers can pre-integrate, pre-validate, and pre-optimize advanced IP for Neoverse CSS; EDA partners can seamlessly support the most advanced tools and flows, simplifying SoC design; and commercial firmware solutions can be developed before the chip is fabricated. In addition, Neoverse CSS designs will be specially optimized to take full advantage of leading-edge process nodes.

Clearly, at a time when infrastructure builders are pursuing differentiation, Arm Neoverse CSS and Arm Total Design are the best options available today for meeting those needs.

It should also be noted that Arm has transformed into a compute platform company. Its Arm Total Compute Solutions, Arm Neoverse platform, Arm Corstone, and SOAFEE computing platforms are now widely used in fields such as mobile, infrastructure, IoT, and automotive.

(Image and text excerpted from Leifeng.com)

Time: 2023-12-04
After years of cultivation, CPUs based on the Arm architecture have seen significant growth in the server market and have been adopted by many customers. Why do hyperscale cloud service providers that develop their own CPUs, such as Amazon Web Services (AWS), Alibaba, and Microsoft, choose to work with Arm?