Date: 2024-05-29

Key Takeaways

Arm unveils its 5th Generation GPUs like the new Immortalis G925 and Mali G725 with big gains in AI and gaming performance.

Fragment Pre-pass technology changes the game, reducing CPU cycles by up to 43% and enhancing graphics processing efficiency significantly.

Arm's Immortalis G925 offers impressive real-world performance upgrades, including 36% faster inference, and up to 34.4 TOPS in a 24-core configuration for AI operations.

Alongside the launch of Arm's new Compute SubSystem for 2024 in the form of the Cortex-X925, A725, and A520 Refresh, we're also getting a look at Arm's GPUs for the next year. This includes a new Immortalis G925, a Mali G725, and a Mali G625. These are Arm's 5th Generation GPUs, and there are some big gains to be had here.

Each of these GPUs is a relatively straightforward increment over its predecessor, with the Immortalis G925 being the best GPU from Arm yet. As with Arm's CPU cores, the GPUs here are being positioned as AI champions in particular, though there are some pretty big gaming performance gains to be had as well.

With how good the Immortalis G720 was last year, Arm has been a real contender in the GPU space. Interestingly, at some points in the presentation Arm told me that more will be revealed later this year, suggesting that there are some big things still to come for Arm GPUs.

What's the difference between Immortalis and Mali?

Getting the confusion out of the way

Before diving into Arm's new GPUs, it's important to distinguish between Immortalis and Mali, particularly as the lines have been somewhat blurred given the shared part names. When I asked Arm last year about the difference, I was told that when OEMs equip their chipsets with Arm GPUs, GPUs marketed as Immortalis must have a ray-tracing unit, whereas Mali GPUs can have one but are not required to.

On top of that, the Mali G725 can have between six and nine cores, whereas the Immortalis G925 can have up to 24 cores. As for the Mali G625, it's limited to up to five cores, though it's also a lot more of a budget GPU aimed at devices like wearables or cheaper smartphones.

Arm's Immortalis G925 is a powerful contender to be one of the best mobile GPUs of 2025

With a nearly 40% improvement in graphics performance and up to 30% less power, it's a big deal

Arm typically brings a big new technology to its GPUs with each release, and with the G720 last year, that was Deferred Vertex Shading. This time around, Arm is bringing a new feature called Fragment Pre-pass. This new approach fundamentally changes how GPUs handle Hidden Surface Removal (HSR), a critical element in 3D graphics processing. Traditional methods predominantly rely on sorting objects by their depth (Z-order) to determine visibility and manage rendering workload efficiently. This depth sorting, usually orchestrated by the CPU within the GPU's driver, is a computationally expensive task.

Fragment Pre-pass changes this process by eliminating the need for sorting triangles by depth. The crux of this technology is that it allows the GPU to process visibility without the need for CPU-intensive list arrangements. As a result, this reduces the CPU cycles spent by up to a massive 43% in the rendering thread, where the GPU driver engages with graphical APIs like OpenGL or Vulkan. This reduction in CPU load not only enhances processing speed but also improves energy efficiency, making it a fairly big deal for both developers and end-users.

What's more, the entire Fragment Pre-pass operation is handled in hardware, which marks a significant stride toward a more self-reliant GPU architecture. For developers, this means less complexity in graphics programming and potentially shorter development cycles, as they won't need to manually invoke any kind of depth sorting. For gamers and professionals using graphics-intensive applications, it translates into smoother, faster, and more efficient rendering on the same class of hardware.
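
To make the ordering problem concrete, here is a minimal Python sketch of the general idea, not Arm's hardware design. It counts how many fragments get shaded when overlapping layers are submitted in different orders, and compares that with a depth-only pre-pass that resolves visibility before any shading happens. The Fragment class, the function names, and the four-pixel scene are all hypothetical, chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    pixel: int    # flattened pixel index (hypothetical layout)
    depth: float  # smaller value = closer to the camera


def shaded_single_pass(fragments):
    """Depth-test and shade in submission order.

    A fragment is shaded whenever it is closer than what is stored so far,
    so far-to-near submission wastes shading work on hidden surfaces.
    """
    depth_buffer = {}
    shaded = 0
    for f in fragments:
        if f.depth < depth_buffer.get(f.pixel, float("inf")):
            depth_buffer[f.pixel] = f.depth
            shaded += 1
    return shaded


def shaded_with_prepass(fragments):
    """Resolve the nearest depth per pixel first, then shade only the winners."""
    nearest = {}
    for f in fragments:  # pass 1: depth only, no shading
        if f.depth < nearest.get(f.pixel, float("inf")):
            nearest[f.pixel] = f.depth
    # pass 2: shade only fragments that survived the visibility pass
    return sum(1 for f in fragments if f.depth == nearest[f.pixel])


# Three full-screen layers over 4 pixels, submitted back to front (worst case).
layers = [0.9, 0.5, 0.1]
back_to_front = [Fragment(p, d) for d in layers for p in range(4)]
# What a CPU-side depth sort would produce before submission.
front_to_back = sorted(back_to_front, key=lambda f: f.depth)

print(shaded_single_pass(back_to_front))   # 12
print(shaded_single_pass(front_to_back))   # 4
print(shaded_with_prepass(back_to_front))  # 4
```

In this toy model, the pre-pass result matches the perfectly sorted case without anything having to sort the draw list, which is the kind of CPU work the article describes being removed.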

In real-world performance, there are some big improvements as a result. In ray tracing especially, users can get 27% more performance with 3% less memory traffic at maintained accuracy. This scales up considerably if you reduce accuracy, with 52% more performance and 57% less memory traffic on a G925 with 14 cores. When I asked Arm how these gains were made, I was told that the details would be announced later this year alongside a partner that Arm had worked with on them.

Finally, in the AI realm, which Arm is clearly big on, there are big numbers to share too. Arm claims 36% faster inference and a Unity framework that can convert FP32 calculations to INT8 on the fly, resulting in up to 44% better performance in these operations. In a 14-core config, measured at INT8 precision and a max frequency of 1.4GHz, the Immortalis G925 can reach 20 peak TOPS in AI. With a 24-core configuration, it scales to 34.4 TOPS.
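
As a rough sanity check on those TOPS figures, peak throughput can be modelled as cores times clock speed times operations per core per clock. The figures quoted here don't state the per-core number, so the 1024 INT8 operations per clock per core used below is an assumption, back-solved from the 14-core figure rather than taken from Arm.

```python
def peak_tops(cores, freq_ghz, ops_per_core_per_clock):
    """Peak tera-operations per second for a simple linear-scaling model."""
    ops_per_second = cores * freq_ghz * 1e9 * ops_per_core_per_clock
    return ops_per_second / 1e12


# Assumption: inferred from 20 TOPS / (14 cores * 1.4 GHz), rounded to a power of two.
ASSUMED_OPS = 1024

print(round(peak_tops(14, 1.4, ASSUMED_OPS), 1))  # 20.1
print(round(peak_tops(24, 1.4, ASSUMED_OPS), 1))  # 34.4
```

Under that assumption, both quoted figures fall out of straightforward scaling by core count at the same 1.4GHz clock. These are peak numbers, so sustained throughput in a real device will depend on thermals and memory bandwidth.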

All of this is to say that Arm's latest Immortalis GPU is the best the company has released yet, and Fragment Pre-pass in particular is a novel technology that could really improve rendering performance across the board. With the company showing figures of 49% improved FPS in Genshin Impact, 72% improved FPS in Call of Duty: Mobile, and big improvements in other titles, these are changes that users will notice.

Arm is making big gains in performance this year

Efficiency isn't being ignored either

Arm has made some big claims about the leaps that it has made this year, and we're excited to get to try them out in real-world devices. Immortalis was already catching up on what Qualcomm had been doing with its Adreno GPUs, and I expect that we'll get to see a MediaTek SoC with Arm's CSS for Client at some point. On top of that, the talk of "AI PC" has me excited, given that Arm also assured me that the new Fragment Pre-pass technology is compatible with Windows on Arm, too.

Just like with Arm's other cores, the shader cores in Immortalis/Mali this year are being provided as GDSII files to partners that want to use these cores in their products. This means that Arm has already done part of the implementation work for those partners, reducing time to market. The Mali GPUs this year scale to last year's models too.

We're excited to see what comes of Arm's GPUs later this year in the devices they launch in, and we'll be keen to hear more of what the company has to say when it comes to ray tracing and Windows on Arm.