Immortalis-G715, Mali-G715, G615: Hardware ray tracing and VRS for new Arm GPUs
In addition to the new CPU cores Cortex-X3, A715 and A510 Refresh, Arm today introduced three new GPUs: Immortalis-G715, Mali-G715 and Mali-G615. The flagship is equipped with a ray tracing unit that enables hardware ray tracing in devices such as smartphones, tablets and Arm notebooks. All three GPUs also support VRS.
After the CXT GPU from Imagination Technologies and the Xclipse 920 GPU based on AMD RDNA 2 in the Exynos 2200, the Immortalis-G715 is the third mobile GPU developed for Arm processors that supports hardware-accelerated ray tracing and thus has a clear advantage over software implementations. Compared with such solutions, the lead is, unsurprisingly, more than 300 percent, Arm says, citing its own internal benchmarks. The only competitor still missing in the Android ecosystem is Qualcomm, which has yet to offer an Adreno GPU with ray tracing support.
4th generation Valhall architecture
With the Immortalis-G715, Arm continues to rely on the Valhall architecture, which was first introduced in May 2019 with the Mali-G77 and was also used in 2020 in the Mali-G78 and G68 as well as in last year’s Mali-G710 and G610. According to a roadmap published by the company, the GPU architecture is likely to change next year.
Only Immortalis comes with a ray tracing unit
Immortalis-G715, Mali-G715 and Mali-G615 all use the fourth generation of the Valhall architecture and have a very similar structure; they differ only in the configuration and number of their shader cores. Only the Immortalis-G715 is equipped with a new ray tracing unit (RTU) for hardware-accelerated ray tracing, located within the inner core of each shader core; the other two new GPUs lack it. According to Arm, the new ray tracing acceleration accounts for less than 4 percent of the shader core area. The Mali-G715 has the same shader core structure, but fewer shader cores, and has to do without the RTU. The same applies to the Mali-G615, which is limited to even fewer shader cores. Arm describes the Immortalis-G715 as the new flagship and the Mali-G715 and Mali-G615 as the new premium GPUs.
Efficient ray tracing for smartphones
To make ray tracing as efficient as possible on a mobile GPU like the Immortalis-G715, the ray is not tested against every primitive in a scene, i.e. against every polygon that makes up the objects in it. Instead, as with other well-known GPU manufacturers, an acceleration technique is used that tests the ray against ever-shrinking three-dimensional boxes, each containing a complex three-dimensional object made of polygons. If the ray does not intersect a box, it cannot intersect the primitives contained in it either, so those primitives do not have to be tested. The RTU repeats this procedure until it reaches a leaf of the so-called “bounding volume hierarchy” (BVH), i.e. the tree data structure that holds the boxes, and the ray is then tested against the primitives in that leaf. To perform these calculations, each RTU in the inner core of each shader core of the Immortalis-G715 has an RBOX_UNIT (RT_RAY_BOX) for traversing the BVH and an RTRI_UNIT (RT_RAY_TRI) for the intersection test with the polygon. Shading and denoising are then handled again by the shaders in the shader core.
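The traversal described above can be illustrated with a minimal software sketch. It is not Arm's implementation: the `Node` layout, the function names, the slab test for ray/box checks and the Möller-Trumbore test for ray/triangle checks are all assumptions chosen for illustration, standing in for what the RBOX_UNIT and RTRI_UNIT do in hardware.

```python
from dataclasses import dataclass, field

EPS = 1e-9

@dataclass
class Node:
    """Hypothetical BVH node: an axis-aligned box with children or triangles."""
    lo: tuple                                        # box min corner (x, y, z)
    hi: tuple                                        # box max corner (x, y, z)
    children: list = field(default_factory=list)     # inner node: child nodes
    triangles: list = field(default_factory=list)    # leaf: (v0, v1, v2) tuples

def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ray_box(origin, direction, lo, hi):
    """Slab test: True if the ray (t >= 0) intersects the box."""
    t_near, t_far = 0.0, float("inf")
    for o, d, l, h in zip(origin, direction, lo, hi):
        if abs(d) < EPS:                 # ray parallel to this pair of planes
            if o < l or o > h:
                return False
            continue
        t0, t1 = (l - o) / d, (h - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True

def ray_triangle(origin, direction, tri):
    """Moeller-Trumbore test: hit distance t, or None if there is no hit."""
    v0, v1, v2 = tri
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < EPS:                   # ray parallel to the triangle plane
        return None
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv
    if u < 0 or u > 1:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv
    if v < 0 or u + v > 1:
        return None
    t = dot(e2, q) * inv
    return t if t > EPS else None

def trace(origin, direction, root):
    """Walk the BVH; return (nearest hit distance or None, test counters)."""
    stats = {"box": 0, "tri": 0}
    nearest = None
    stack = [root]
    while stack:
        node = stack.pop()
        stats["box"] += 1
        if not ray_box(origin, direction, node.lo, node.hi):
            continue                     # skip everything inside this box
        for tri in node.triangles:       # only leaves carry triangles
            stats["tri"] += 1
            t = ray_triangle(origin, direction, tri)
            if t is not None and (nearest is None or t < nearest):
                nearest = t
        stack.extend(node.children)
    return nearest, stats

# Tiny example scene: two leaves, only the left one lies on the ray's path.
left = Node((-1, -1, 4.9), (1, 1, 5.1),
            triangles=[((-1, -1, 5), (1, -1, 5), (0, 1, 5))])
right = Node((9, -1, 4.9), (11, 1, 5.1),
             triangles=[((9, -1, 5), (11, -1, 5), (10, 1, 5))])
root = Node((-1, -1, 4.9), (11, 1, 5.1), children=[left, right])
hit, stats = trace((0, 0, 0), (0, 0, 1), root)
```

In this small scene the ray is tested against three boxes but only one triangle, because the second leaf's box is rejected before its triangle is ever looked at. That pruning is the saving the article describes.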
Ray tracing on the Immortalis-G715 is supported only through the Vulkan API and is currently intended only for Android, not for Windows.
Number of ray tracing units scales with the number of shader cores
How many ray tracing units an Immortalis-G715 has depends on how many shader cores the SoC vendor configures it with. To maintain its flagship positioning in the portfolio, Arm recommends configurations of 10 to 16 shader cores and 2 or 4 L2 slices of up to 1 MB. A maximum of 16 RTUs are therefore used. The new Mali-G715, which does without an RTU, can be configured with 7 to 9 shader cores and, apart from ray tracing, has the same properties. The Mali-G615 is designed for 1 to 6 shader cores and can be equipped with just 1 instead of 2 or 4 L2 slices if required.
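The recommended ranges above can be summarized in a small lookup sketch. The function name `describe_config` and the returned dictionary are hypothetical, and the count of one RTU per shader core is an assumption read from the statement that 16 shader cores yield at most 16 RTUs.

```python
def describe_config(shader_cores: int) -> dict:
    """Map a shader core count to the GPU tier named in the article."""
    if 10 <= shader_cores <= 16:
        gpu, has_rtu = "Immortalis-G715", True
    elif 7 <= shader_cores <= 9:
        gpu, has_rtu = "Mali-G715", False
    elif 1 <= shader_cores <= 6:
        gpu, has_rtu = "Mali-G615", False
    else:
        raise ValueError("outside the recommended 1 to 16 shader cores")
    # Assumption: exactly one RTU per shader core on the Immortalis-G715.
    rtus = shader_cores if has_rtu else 0
    return {"gpu": gpu, "shader_cores": shader_cores, "rtus": rtus}
```

For example, `describe_config(16)` describes the largest Immortalis-G715 with 16 RTUs, while `describe_config(9)` describes a Mali-G715 without any.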
Arm doubles FMA units
Arm promises 15 percent more performance and 15 percent lower power consumption compared to the previous generation with the same number of shader cores, ray tracing aside. This is achieved, among other things, by revising the execution engines. In the Valhall architecture, the inner core of each shader core has two execution engines, each of which in turn contains, among other things, two processing units. Each processing unit contains a processing element, to which Arm has added a second FMA (“Fused Multiply-Add”) module with an additional block for matrix multiplication (MMUL) in the fourth Valhall generation. This doubling replaces the previous structure with only one FMA module and is intended to double performance, especially for FMA operations, although the shader core area increases by only 25 percent.
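A short back-of-the-envelope calculation shows what these two figures imply when taken together. It assumes that the doubling refers to peak FMA throughput and that the 25 percent refers to total shader core area, as stated above; the variable names are illustrative only.

```python
# Figures quoted in the article (Arm's own claims, not measurements).
fma_throughput_factor = 2.0    # second FMA module: twice the peak FMA rate
core_area_factor = 1.25        # shader core area grows by 25 percent

# Peak FMA throughput per unit of shader core area, relative to the
# previous generation.
fma_per_area_factor = fma_throughput_factor / core_area_factor
print(f"FMA throughput per area: {fma_per_area_factor:.2f}x")
```

Under these assumptions, peak FMA throughput per unit of shader core area rises by a factor of 1.6. This applies to FMA-bound work only and says nothing about overall performance, for which Arm quotes the 15 percent figure.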
Shader cores extensively revised
The two execution engines are part of each shader core, which receives further improvements in “Power, Performance and Area” (PPA) with the fourth Valhall generation. In general, the “Command Stream Front-End” (CSF) is said to work faster, the tiler achieves three times the peak polygon throughput, the FP16 blender throughput has been doubled, a new hardware block for FP16 MSAA has been integrated, texture mapping runs at double speed for certain LODs, “Arm Fixed Rate Compression” (AFRC) has been implemented, the throughput of the varying unit has been doubled and the load/store efficiency of the caches has been increased.
All new GPUs support VRS
With the exception of the RTU, these optimizations to the front end, tiler and shader core are incorporated into all of the products introduced today. This also applies to support for “Variable Rate Shading” (VRS), which allows a lower shading rate for certain image areas without the image quality visibly suffering as a result. In an example in which Arm uses VRS to decouple the rasterization rate from the shading rate, shading is performed only once per four pixels instead of once per pixel as without VRS.
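The effect of the shading rate on the amount of shading work can be estimated with a simplified sketch. It assumes a full-screen pass, a uniform rate across the whole image, and a 2x2 block shape for the "one per four pixels" case; the article gives only the ratio, so the block shape and the function name `shader_invocations` are assumptions.

```python
import math

def shader_invocations(width: int, height: int, rate_x: int, rate_y: int) -> int:
    """Shading invocations for one full-screen pass at a uniform shading rate.

    rate_x and rate_y give the size of the pixel block that shares one
    shading result, e.g. 1x1 without VRS or 2x2 for one result per four pixels.
    """
    return math.ceil(width / rate_x) * math.ceil(height / rate_y)

full_rate = shader_invocations(1920, 1080, 1, 1)   # one result per pixel
vrs_2x2 = shader_invocations(1920, 1080, 2, 2)     # one result per 2x2 block
print(full_rate, vrs_2x2, full_rate / vrs_2x2)
```

At 1920 × 1080 this gives 2,073,600 invocations at full rate and 518,400 at the 2x2 rate, a quarter of the shading work. In practice VRS is applied selectively to image areas where the lower rate is not visible, so the real saving is smaller than this upper bound.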
VRS is particularly relevant for mobile devices because it can do more than raise the frame rate, an increase that Arm puts at up to 40 percent. Alternatively, at the same frame rate as before, power consumption can be reduced, which improves the energy efficiency and battery life of the smartphone. VRS works on Immortalis-G715, Mali-G715 and Mali-G615.
Driver updates planned via Google Play
The three new GPUs are to receive updatable drivers via Google Play Services at a later date, as Arm explained in a question and answer session. Especially with the introduction of hardware ray tracing, there should still be a lot of room for optimization in the Android drivers.
ComputerBase received information about this article from Arm under NDA at a manufacturer’s event in Austin, Texas. The costs for travel and hotel accommodation were borne by the company. The manufacturer had no influence on the reporting, and there was no obligation to report. The only requirement was the earliest possible publication date.