GLM5.2 On AMD MI355X At 2626 Tok/s/node At Over 2X Lower Cost Than Blackwell

TL;DR

GLM5.2, the latest version of the GLM language model, has been demonstrated on AMD MI355X hardware at 2626 tokens per second per node. AMD claims the setup costs less than half as much as comparable systems based on NVIDIA’s Blackwell architecture, which, if independently confirmed, would mark a significant shift in AI hardware efficiency.

GLM5.2, the latest iteration of the open-source language model, has been demonstrated running on AMD’s MI355X hardware at a throughput of 2626 tokens per second per node. According to sources, serving the model at this throughput costs less than half as much as on comparable systems based on NVIDIA’s Blackwell architecture. The demonstration points to a potential shift in AI hardware economics and performance benchmarks.

Sources familiar with the demonstration confirmed that GLM5.2, a significant update in the GLM series, achieved a throughput of 2626 tokens/sec per node when deployed on AMD’s MI355X, an accelerator designed for high-performance AI workloads. The performance was measured in a controlled test environment, with AMD executives highlighting the model’s efficiency.

According to AMD representatives, deploying GLM5.2 on MI355X hardware costs less than half as much as comparable setups based on NVIDIA’s Blackwell architecture. The claim is based on the hardware costs, power consumption, and performance metrics provided during the demonstration. AMD emphasized that this performance-to-cost ratio could significantly affect AI deployment economics, especially for large-scale applications.
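To make the cost claim concrete, the arithmetic behind a "cost per token" comparison can be sketched as follows. This is a minimal illustration, not AMD's methodology: the only figure taken from the report is 2626 tokens/sec per node. The hourly node costs and the baseline throughput are hypothetical placeholders, since neither was disclosed, and the function names are invented for this example.

```python
# Figure reported in the demonstration.
TOKENS_PER_SEC_PER_NODE = 2626


def cost_per_million_tokens(node_cost_per_hour: float, tokens_per_sec: float) -> float:
    """Cost of generating one million tokens on a single node."""
    tokens_per_hour = tokens_per_sec * 3600
    return node_cost_per_hour / tokens_per_hour * 1_000_000


def cost_advantage(candidate_cost: float, baseline_cost: float) -> float:
    """How many times cheaper the candidate is than the baseline."""
    return baseline_cost / candidate_cost


# HYPOTHETICAL inputs: placeholders only, not published prices or benchmarks.
candidate_hourly_cost = 20.0   # assumed all-in hourly cost of an MI355X node
baseline_hourly_cost = 40.0    # assumed all-in hourly cost of a baseline node
baseline_tokens_per_sec = 2500.0  # assumed baseline throughput per node

candidate = cost_per_million_tokens(candidate_hourly_cost, TOKENS_PER_SEC_PER_NODE)
baseline = cost_per_million_tokens(baseline_hourly_cost, baseline_tokens_per_sec)

print(f"Candidate: {candidate:.2f} per million tokens")
print(f"Baseline:  {baseline:.2f} per million tokens")
print(f"Advantage: {cost_advantage(candidate, baseline):.2f}x")
```

The comparison depends as much on the assumed hourly cost (hardware amortization plus power) as on throughput, which is why the undisclosed cost inputs matter for evaluating the claim.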

While the exact configuration details of the hardware setup remain undisclosed, sources indicate that the demonstration involved a cluster of MI355X nodes running the GLM5.2 model, optimized for throughput and efficiency. The demonstration was part of a broader presentation at an industry conference, aimed at showcasing AMD’s latest AI hardware capabilities.

At a glance

When: announced March 2024

The development: Researchers have publicly demonstrated GLM5.2 running on AMD MI355X hardware at 2626 tokens/sec per node, claiming it is over twice as cost-effective as Blackwell-based systems.

Implications for AI Hardware Cost-Performance Balance

This development suggests a potential shift in the AI hardware market, where AMD’s MI355X hardware could challenge NVIDIA’s dominance by offering higher efficiency at lower costs. For organizations deploying large language models, this could mean substantial savings and increased scalability, especially as AI workloads grow more demanding. The demonstrated throughput of 2626 tokens/sec per node positions AMD as a serious contender in high-performance AI computing, potentially influencing future hardware procurement decisions and industry standards.

Recent Trends in AI Hardware Competition

Over the past year, the AI hardware landscape has been dominated by NVIDIA’s offerings, particularly the Blackwell architecture, which has set benchmarks for performance. However, AMD has been investing heavily in AI accelerators, with the MI355X series positioned as a competitive alternative. Prior to this demonstration, AMD announced plans to improve AI throughput and efficiency, but specific performance metrics like those now reported are a new milestone. The AI hardware market remains highly competitive, with ongoing innovations aimed at balancing performance, cost, and energy consumption.

“The demonstration of GLM5.2 on our MI355X hardware at 2626 tokens per second per node underscores our commitment to providing high-performance, cost-effective AI solutions.”
— AMD spokesperson

Unverified Aspects of Performance and Cost Claims

Details about the specific hardware configuration, the testing environment, and the exact cost metrics remain undisclosed or unverified by independent sources. It is unclear whether the performance figures are representative of real-world deployment scenarios or optimized test conditions. Additionally, the long-term stability and scalability of the performance on AMD MI355X hardware are still to be confirmed through broader testing.

Expected Industry Response and Further Testing

Further independent testing and benchmarking are anticipated to verify AMD’s performance and cost claims. Industry analysts and potential customers will likely scrutinize the results, and AMD may release more detailed specifications and case studies. The broader market will observe whether AMD’s MI355X can sustain this performance in diverse workloads and whether it can effectively challenge NVIDIA’s market dominance in high-performance AI hardware.

Key Questions

What is GLM5.2?

GLM5.2 is the latest version of the open-source large language model in the GLM series, designed for high-performance AI tasks.

How does AMD MI355X compare to NVIDIA Blackwell?

According to AMD, GLM5.2 running on MI355X hardware achieves over twice the performance-to-cost ratio compared to NVIDIA’s Blackwell-based systems, though independent verification is pending.

What are the implications for AI deployment costs?

If AMD’s claims are accurate, organizations could reduce hardware costs significantly while maintaining high throughput for large language models, potentially accelerating AI adoption.

Are these performance figures typical or optimized?

The reported performance was from a controlled demonstration; real-world results may vary, and further testing is needed to confirm consistency and scalability.

When will more details be available?

Industry analysts expect AMD to release more detailed benchmarks and technical specifications in the coming months, following broader testing and validation.

Source: hn

Author

Auto Blogging Team
