TL;DR

GLM5.2, the latest version of the GLM language model, has reportedly been demonstrated on AMD MI355X hardware at 2626 tokens per second per node. AMD claims more than double the performance-to-cost ratio of systems based on NVIDIA’s Blackwell architecture, a claim that has not yet been independently verified.

GLM5.2, the latest iteration of the open-source GLM language model, has been demonstrated running on AMD’s MI355X hardware at a throughput of 2626 tokens per second per node. According to sources, this amounts to more than double the performance-to-cost ratio of comparable systems based on NVIDIA’s Blackwell architecture. If the figures hold up, the demonstration points to a potential shift in AI hardware economics and performance benchmarks.

Sources familiar with the demonstration confirmed that GLM5.2, a significant update in the GLM series, achieved a throughput of 2626 tokens/sec per node when deployed on AMD’s MI355X hardware, an accelerator designed for high-performance AI workloads. The performance was measured in a controlled test environment, with AMD executives highlighting the model’s efficiency.

According to AMD representatives, deploying GLM5.2 on MI355X hardware costs less than half as much as comparable setups based on NVIDIA’s Blackwell architecture. This claim is based on the hardware costs, power consumption, and performance metrics presented during the demonstration. AMD emphasized that this performance-to-cost ratio could significantly affect AI deployment economics, especially for large-scale applications.
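To make the performance-to-cost claim concrete, the following is a minimal sketch of how such a ratio could be computed from throughput and hourly cost. Only the 2626 tokens/sec figure comes from the report. The hourly costs, the comparison node's throughput, and the names `NodeProfile` and `perf_per_cost_advantage` are hypothetical placeholders for illustration, not measurements or AMD's actual methodology.

```python
from dataclasses import dataclass

SECONDS_PER_HOUR = 3600


@dataclass(frozen=True)
class NodeProfile:
    """Throughput and all-in hourly cost for one node (hypothetical model)."""
    name: str
    tokens_per_sec: float     # sustained throughput per node
    cost_per_hour_usd: float  # amortized hardware + power, per node-hour

    def tokens_per_dollar(self) -> float:
        # Tokens produced in one hour divided by the cost of that hour.
        return self.tokens_per_sec * SECONDS_PER_HOUR / self.cost_per_hour_usd

    def cost_per_million_tokens(self) -> float:
        return 1_000_000 / self.tokens_per_dollar()


def perf_per_cost_advantage(a: NodeProfile, b: NodeProfile) -> float:
    """How many times more tokens per dollar node `a` delivers than node `b`."""
    return a.tokens_per_dollar() / b.tokens_per_dollar()


# 2626 tokens/sec is the reported figure; every other number is a placeholder.
mi355x = NodeProfile("MI355X node", tokens_per_sec=2626, cost_per_hour_usd=20.0)
comparison = NodeProfile("comparison node (placeholder)",
                         tokens_per_sec=2000, cost_per_hour_usd=40.0)

advantage = perf_per_cost_advantage(mi355x, comparison)
print(f"{mi355x.name}: ${mi355x.cost_per_million_tokens():.2f} per million tokens")
print(f"Advantage over comparison node: {advantage:.2f}x")
```

Framing the claim as tokens per dollar shows that a ratio above 2x can come from higher throughput, lower cost, or a mix of both, which is why the undisclosed cost inputs matter for verifying it.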

While the exact configuration details of the hardware setup remain undisclosed, sources indicate that the demonstration involved a cluster of MI355X nodes running the GLM5.2 model, optimized for throughput and efficiency. The demonstration was part of a broader presentation at an industry conference, aimed at showcasing AMD’s latest AI hardware capabilities.

At a glance
When: announced March 2024
The development: Researchers have publicly demonstrated GLM5.2 running on AMD MI355X hardware at 2626 tokens/sec per node, claiming it is over twice as cost-effective as Blackwell-based systems.

Implications for AI Hardware Cost-Performance Balance

This development suggests a potential shift in the AI hardware market, where AMD’s MI355X hardware could challenge NVIDIA’s dominance by offering higher efficiency at lower costs. For organizations deploying large language models, this could mean substantial savings and increased scalability, especially as AI workloads grow more demanding. The demonstrated throughput of 2626 tokens/sec per node positions AMD as a serious contender in high-performance AI computing, potentially influencing future hardware procurement decisions and industry standards.

Recent Trends in AI Hardware Competition

Over the past year, the AI hardware landscape has been dominated by NVIDIA’s offerings, particularly the Blackwell architecture, which has set benchmarks for performance. However, AMD has been investing heavily in AI accelerators, with the MI355X series positioned as a competitive alternative. Prior to this demonstration, AMD announced plans to improve AI throughput and efficiency, but specific performance metrics like those now reported are a new milestone. The AI hardware market remains highly competitive, with ongoing innovations aimed at balancing performance, cost, and energy consumption.

“The demonstration of GLM5.2 on our MI355X hardware at 2626 tokens per second per node underscores our commitment to providing high-performance, cost-effective AI solutions.”

— AMD spokesperson

Unverified Aspects of Performance and Cost Claims

Details about the specific hardware configuration, the testing environment, and the exact cost metrics remain undisclosed or unverified by independent sources. It is unclear whether the performance figures are representative of real-world deployment scenarios or optimized test conditions. Additionally, the long-term stability and scalability of the performance on AMD MI355X hardware are still to be confirmed through broader testing.
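As an illustration of what independent verification might involve, here is a minimal sketch of a per-node throughput measurement: total generated tokens divided by wall-clock time and by node count. The function name, the `generate` callable (assumed to return the number of tokens produced for a prompt), and the sequential loop are all assumptions for illustration. A real serving benchmark would issue batched, concurrent requests and would not necessarily resemble the setup used in the demonstration.

```python
import time
from typing import Callable, Sequence


def measure_tokens_per_sec_per_node(
    generate: Callable[[str], int],
    prompts: Sequence[str],
    num_nodes: int,
    clock: Callable[[], float] = time.perf_counter,
) -> float:
    """Return generated tokens per second, normalized per node.

    `generate` is a hypothetical stand-in for a call into the serving stack;
    it must return the number of tokens generated for the given prompt.
    """
    if num_nodes <= 0:
        raise ValueError("num_nodes must be positive")

    start = clock()
    total_tokens = sum(generate(prompt) for prompt in prompts)
    elapsed = clock() - start

    if elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return total_tokens / elapsed / num_nodes
```

Running the same harness with production-like prompt lengths and request patterns, rather than a tuned test workload, is one way to check whether a headline figure carries over to real deployments.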

Expected Industry Response and Further Testing

Further independent testing and benchmarking are anticipated to verify AMD’s performance and cost claims. Industry analysts and potential customers will likely scrutinize the results, and AMD may release more detailed specifications and case studies. The broader market will observe whether AMD’s MI355X can sustain this performance in diverse workloads and whether it can effectively challenge NVIDIA’s market dominance in high-performance AI hardware.

Key Questions

What is GLM5.2?

GLM5.2 is the latest version of the open-source large language model in the GLM series, designed for high-performance AI tasks.

How does AMD MI355X compare to NVIDIA Blackwell?

According to AMD, GLM5.2 running on MI355X hardware achieves more than twice the performance-to-cost ratio of NVIDIA’s Blackwell-based systems, though independent verification is pending.

What are the implications for AI deployment costs?

If AMD’s claims are accurate, organizations could reduce hardware costs significantly while maintaining high throughput for large language models, potentially accelerating AI adoption.

Are these performance figures typical or optimized?

The reported performance came from a controlled demonstration; real-world results may vary, and further testing is needed to confirm consistency and scalability.

When will more details be available?

Industry analysts expect AMD to release more detailed benchmarks and technical specifications in the coming months, following broader testing and validation.

Source: hn
