[Test] Asus TUF Gaming RTX 3080 OC |Specs

Asus TUF Gaming RTX 3080 OC |Specs | Price – After the arrival of ray tracing and DLSS at Nvidia and its first RTX Turing released in 2018, the founder is back on the attack with Ampere, an architecture that promises a big increase in performance for a price that does not skyrocket compared to the previous generation.

Contents hide

1 Ampere

1.1 GA102 not yet complete

1.2 RT and Tensor hearts

1.3 Make way for GDDR6X

1.4 Increasing consumption

1.5 Related posts:

Indeed, while the MSRP of the RTX 2080 Super (and RTX 2080 before it) excluding FE was € 745, the new RTX 3080 are supposed to start at € 719. For our file, we recovered an Asus TUF Gaming RTX 3080 OC partner card, a model advertised at € 789. What does the beast have in its belly? This is what we will see!

Ampere

First of all and first change compared to Turing, the engraving fineness goes from 12 nm (TSMC) to 8 nm (Samsung). This allows the chips to be either smaller for an identical number of transistors, or on the contrary with more transistors of equal size, or even both at the same time! This is exactly the case for the GA102 which equips the RTX 3080 and 3090, the largest consumer DIE at present.

Nvidia GA102 RTX 3080 and 3090 diagram — *Nvidia GA102 diagram (RTX 3080 and 3090)*

With a DIE of 628 mm² for 28.3 billion transistors, it is a large model but which remains smaller than the 754 mm² (and 18.6 billion transistors) for the TU102 which equips the RTX 2080 Ti. As for the RTX 2080 Super that the RTX 3080 replaces, the TU104 peaked at 545 mm² for “only” 13.6 billion transistors. There is therefore a big improvement in the density of the components. As can be seen from the diagram above, the GA102 has a total of seven GPCs, compared to six for Turing and his TU102. A GPC (Graphics Processing Clusters) is a block which contains the SMs, RT cores, tensors, CUDA cores as well as quite recently the ROPs, we can roughly compare it to a core found in desktop CPUs.

Read This Now: According to a report, the NVIDIA RTX 4090 was delayed until the end of the year

GA102 not yet complete

	RTX 2080	RTX 2080 Super	RTX 2080 Ti	RTX 3080	RTX 3090
GPU	TU104	TU104	TU102	GA102	GA102
Engraving fineness	12 nm	12 nm	12 nm	8 nm	8 nm
Number of transistors	13.6 billion	13.6 billion	18.6 billion	28.3 billion	28.3 billion
DIE size	545 mm²	545 mm²	754 mm²	628 mm²	628 mm²
SMs	46	48	68	68	82
CUDA hearts	2944	3072	4352	8704	10496
Cœurs Tensor	368	384	544	272	328
RT hearts	46	48	68	68	82
Base frequency	1515 MHz	1650 MHz	1350 MHz	1440 MHz	1395 MHz
Boost frequency	1710 MHz	1815 MHz	1545 MHz	1710 MHz	1695 MHz
Memory quantity	8 Go GDDR6	8 Go GDDR6	11 Go GDDR6	10 Go GDDR6X	24 Go GDDR6X
Memory frequency	1750 MHz (14 Gbps)	1937 MHz (15,5 Gbps)	1750 MHz (14 Gbps)	1188 MHz (19 Gbps)	1219 MHz (19,5 Gbps)
Memory interface	256 bits	256 bits	352 bits	320 bits	384 bits
TGP	215 Watts	250 Watts	250 Watts	320 Watts	350 Watts

Theoretically, this allows the GA102 to have 84 SM for just as many RT cores (for ray tracing management), but none of the GPUs on the market is equipped with this fully unlocked GA102. Even the RTX 3090, which is nevertheless the flagship of green, only accommodates 82 SM. Likewise, the GA102 of the RTX 3080 is even more castrated because it only has 68 SM, which is the equivalent of the old RTX 2080 Ti. If Nvidia does with Ampere what it did with Turing, it is easy to see Super or Ti variants arriving between the RTX 3080 and 3090, or even a more powerful model than the latter using a full GA102. But only time will tell!

SM GA102 Nvidia RTX 3080 and 3090 diagram

But looking at the technical characteristics, we see that despite the identical number of SMs between the RTX 2080 Ti and RTX 3080, Nvidia announces twice as many CUDA cores. How is it possible ? The founder doubled the number of FP32 units in each SM to do this. Indeed, when Turing had two distinct blocks, one dealing with integers (INT32) and the other with floating points (FP32), Ampere also offers two blocks with one dedicated to FP32 but the other allows both to process INT32 as well as the FP32. Turing introduced the ability to compute both INT32s and FP32s simultaneously using separate data paths. In this way, it is no longer necessary to calculate either an FP32 or an INT32. For Ampere, Nvidia wanted to take advantage of the dedicated INT32 data path (which is not nearly as used as that of the FP32s) in order to increase the potential raw power. To do this, he added 64 FP32 units (in addition to the 64 already present) but which share the path of the INT32s. Theoretically, therefore, and if the applications used only require floating point calculations, the raw power is doubled. If calculations on integers have to be done, it will be to the detriment of the FP32s on this data path. The given value of the CUDA cores therefore represents the best case scenario, when the GPU must only process FP32s. If the INT32 and FP32 units are loaded equally (and this is not won), we end up with an identical total of 4352 FP32 on the RTX 2080 Ti and 3080, or 5376 for a full GA102 against 4608 for a TU102 (Titan RTX) full.

RT and Tensor hearts

As with Turing, Ampere also has cores dedicated to processing ray tracing, or RT cores. Major argument during the release of the first RTX, Nvidia worked on this point for Ampere and brought a big boost to the performance level of each of these cores. After analyzing how its RT cores work in games, Nvidia saw that the main bottleneck was dealing with the intersections between different spokes and triangles. To remedy the problem, the throughput was doubled at this level. This does not mean that the performance is doubled, but the computing time is greatly improved, up to 70% depending on the founder.

Read This Now: Fractal Design Core 2500: Reviewn| Specs| Pros & Cons| Hashrate| Set-up