Huawei reveals its latest Nvidia H20 killer packing a frankly ridiculous 1.56 PFLOPS of FP4 compute and up to 112GB of HBM

Published: Saturday, March 28, 2026 at 6:35 pm

Huawei Launches Atlas 350 Accelerator Card, Challenging Nvidia in AI Compute

Huawei has unveiled its latest Atlas 350 accelerator card, powered by the new Ascend 950PR processor, at the Huawei China Partner Conference. The company is positioning the Atlas 350 as a competitor to Nvidia's H20, particularly for AI inference workloads and multimodal AI processing.

Huawei claims the Atlas 350 delivers 1.56 PFLOPS of FP4 compute, a figure that is reportedly 2.87 times higher than the H20's. The comparison is not like-for-like: the two chips use different architectures, and the Hopper-based H20 has no native FP4 support. The Atlas 350 is optimized for this low-precision format, which stores each value in 4 bits, a quarter of the space FP16 requires, so larger AI models can run within the same memory budget.
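For a rough sense of what these numbers imply, the arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope illustration, not a benchmark. It reads "2.87 times higher" as a simple ratio, and the 70-billion-parameter model is a hypothetical example chosen for illustration rather than a figure from Huawei. The footprint counts weights only and ignores KV cache, activations, and any per-block scaling metadata.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the article.
ATLAS_350_FP4_PFLOPS = 1.56   # Huawei's claimed FP4 throughput
CLAIMED_RATIO_VS_H20 = 2.87   # claimed advantage, read here as a plain ratio (assumption)
ATLAS_350_HBM_GB = 112        # on-card memory capacity


def implied_baseline_pflops(claimed_pflops: float, ratio: float) -> float:
    """Throughput the comparison chip would need for the claimed ratio to hold."""
    return claimed_pflops / ratio


def weight_footprint_gb(params_billions: float, bits_per_weight: int) -> float:
    """Memory for model weights alone, in decimal GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


if __name__ == "__main__":
    baseline = implied_baseline_pflops(ATLAS_350_FP4_PFLOPS, CLAIMED_RATIO_VS_H20)
    print(f"Implied H20 baseline: {baseline:.3f} PFLOPS")

    hypothetical_params_b = 70  # hypothetical model size, not from the article
    for bits in (16, 8, 4):
        gb = weight_footprint_gb(hypothetical_params_b, bits)
        verdict = "fits" if gb <= ATLAS_350_HBM_GB else "does not fit"
        print(f"{hypothetical_params_b}B weights at {bits}-bit: {gb:.0f} GB ({verdict} in {ATLAS_350_HBM_GB} GB)")
```

Under these assumptions, the hypothetical 70B model needs 140 GB for weights at 16-bit precision, more than a single card holds, but only 35 GB at 4-bit. The implied baseline of roughly 0.54 PFLOPS is simply what the quoted ratio works out to; the article does not say which H20 precision the ratio refers to.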

The Ascend 950PR chip incorporates several upgrades over its predecessor, the Ascend 910 series, including microarchitecture enhancements, faster memory access, and flexible programming modes. The Atlas 350 is equipped with 112GB of Huawei's proprietary HBM, named HiBL 1.0, with up to 1.4TB/s of bandwidth. This configuration is aimed at multimodal generation and inference, with a reported fourfold improvement in memory access efficiency for small operators compared with the previous generation. Interconnect bandwidth reaches 2TB/s using the LingQu protocol, a 2.5 times increase over the Ascend 910 series.
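The memory figures matter because single-stream LLM inference is often limited by how fast weights can be read from memory rather than by compute. A minimal sketch of that estimate follows, using the quoted 112GB and 1.4TB/s. The 35 GB weight size is a hypothetical example (roughly a 70-billion-parameter model at 4 bits per weight), and the result is an idealized upper bound that assumes every weight is read once per generated token at peak bandwidth, with no batching and no other memory traffic.

```python
# Idealized memory-bandwidth estimates from the quoted HBM figures.
HBM_CAPACITY_GB = 112        # quoted capacity
HBM_BANDWIDTH_GB_S = 1400    # quoted peak bandwidth, 1.4 TB/s expressed in GB/s


def full_sweep_seconds(capacity_gb: float, bandwidth_gb_s: float) -> float:
    """Time to read the entire memory once at peak bandwidth."""
    return capacity_gb / bandwidth_gb_s


def decode_ceiling_tokens_per_s(weight_gb: float, bandwidth_gb_s: float) -> float:
    """Upper bound on single-stream decode rate if each token reads all weights once."""
    return bandwidth_gb_s / weight_gb


if __name__ == "__main__":
    sweep_ms = full_sweep_seconds(HBM_CAPACITY_GB, HBM_BANDWIDTH_GB_S) * 1000
    print(f"Full memory sweep: {sweep_ms:.0f} ms")

    hypothetical_weight_gb = 35  # hypothetical model size, not from the article
    ceiling = decode_ceiling_tokens_per_s(hypothetical_weight_gb, HBM_BANDWIDTH_GB_S)
    print(f"Decode ceiling for {hypothetical_weight_gb} GB of weights: {ceiling:.0f} tokens/s")
```

Real throughput would differ: batching amortizes weight reads across requests, while KV-cache traffic and sub-peak bandwidth pull the number down. The sketch only shows why capacity and bandwidth are quoted alongside the compute figure.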

Huawei is targeting the Atlas 350 at recommendation inference, large language model (LLM) processing, and multimodal AI workloads. Several key partners have already built complete systems around the Atlas 350, offering customized high-performance inference solutions to enterprise customers. The accelerator is designed to integrate with existing AI ecosystems, letting partners tune performance while remaining compatible with Huawei's AI software stack.

The launch of the Atlas 350 reflects China's ongoing push for self-reliance in AI compute hardware, particularly in the face of U.S. export restrictions. Although Huawei cannot access certain technologies, such as TSMC's CoWoS packaging, the company has implemented alternative advanced packaging solutions for HBM and memory stacking. The Atlas 350 is reportedly priced at around $16,000, comparable to the Nvidia H20.

BNN's Perspective:

The Atlas 350 represents a significant step forward for Huawei in the AI hardware market. While the performance claims are impressive, the true impact will depend on real-world testing and adoption. The focus on low-precision compute and the development of a supporting ecosystem are key to Huawei's success. The ongoing competition between Huawei and Nvidia, driven by geopolitical factors, is likely to accelerate innovation in the AI hardware space, ultimately benefiting consumers and businesses.

