Review of the ASUS TUF Gaming GeForce RTX 3080 OC video card
The RTX 3000 graphics cards have continued to thrill the public since their launch. There is a severe worldwide shortage of video cards: Samsung's factories seem unable to keep up, and cards barely reach the stores before selling out. And we are talking about flagships, whose price approaches 100,000 rubles, not affordable graphics like the RTX 2060. So what is new in NVIDIA's Ampere video cards? Let's find out using the ASUS TUF Gaming GeForce RTX 3080 OC as an example, and test its performance in modern games along the way.
- Graphics card name: ASUS TUF Gaming GeForce RTX 3080 OC (TUF-RTX3080-O10G-GAMING)
- Architecture: Ampere, 8 nm
- GPU: NVIDIA GA102, 28.3 billion transistors, 628.4 mm²
- CUDA cores: 8704
- TMU blocks: 272
- ROP blocks: 96
- Ray Tracing cores: 68
- Tensor cores: 272
- Clock speeds (as listed on the site): 1440 MHz / 1710 MHz (Gaming mode); 1785 MHz (Boost, OC mode)
- Max. clock speed (measured in games): 1980 MHz
- Video memory: GDDR6X, 10 GB, 19,000 MHz, 320-bit
- Interface: PCIe 3.0 x16
- Power connectors: 2x 8-pin
- Video outputs: 3x DisplayPort 1.4a, 2x HDMI 2.1
- GPU capacitors used: ceramic
- Dimensions: 30.5 x 13.06 x 4.89 cm
- Recommended PSU wattage: 850 W
- Approximate price: 85,000 – 100,000 rubles (varies greatly by store)
Before turning to the ASUS TUF RTX 3080 OC card itself, it is worth discussing the Ampere architecture, because this time NVIDIA has done serious work. In games, the RTX 3080 proved to be twice as fast as its predecessor, the RTX 2080. That can fairly be called a record: previously, the flagship of a new lineup was only one step ahead of the previous flagship. In short, performance growth used to be gradual, almost linear, while Ampere delivers a huge leap forward.
Let's start with the NVIDIA GA102 graphics chip, which is used in the RTX 3080 and RTX 3090. The new chip is built on an 8 nm process instead of 12 nm, and this time the dies are manufactured not by TSMC but by Samsung. NVIDIA has not commented on the reason for switching foundries, but the world is already seeing a shortage of video cards. Apparently, Samsung could not cope with the high demand, or the yield of usable dies turned out lower than planned. An unpleasant situation in every sense, but back to the GPU. The NVIDIA GA102 chip looks the same as its predecessors. However, thanks to the shrink from 12 nm to 8 nm and the simultaneous increase in die area from 545 mm² to 628 mm² (compared with Turing), it packs almost twice as many transistors: 28.3 billion, versus the 13.6 billion in the RTX 2080's chip.
We have seen transistor counts grow like this before; it fits Moore's law perfectly well. But performance never grew this sharply in the past. So what is the reason, since it is clearly not just the extra CUDA cores? Exactly: Ampere introduced several landmark optimizations that made the speedup possible.
The first and most important optimization is the doubling of FP32 throughput in the streaming multiprocessors. These are 32-bit floating-point calculations, the kind games use most often, and they largely determine the FPS figure. To understand why such calculations became faster, let's recall the structure of the streaming multiprocessor used in the previous Turing architecture.
If we set RT cores aside for a moment, the Turing streaming multiprocessor can be divided into four equal blocks. Inside each quarter are two tensor cores, an FP32 computation unit and an INT32 integer computation unit. The latter appeared in the Turing architecture and was supposed to boost performance in games, since integer calculations had come into wider use and NVIDIA wanted to capitalize on that. In practice, however, the main load most often fell on the FP32 units while the INT32 units sat idle, contributing nothing. To eliminate this shortcoming, NVIDIA reworked the streaming multiprocessors so that the INT32 units can also perform FP32 calculations. Since the FP32 and INT32 units are the same size, each SM quarter can now issue FP32 work down both datapaths, effectively doubling FP32 throughput.
Yes, we got a doubling of FP32 throughput, but how will this affect FPS in games: will it double too? That depends on the games themselves and their optimizations. If the developers lean heavily on FP32, a huge gain can be expected. And even if not, the RTX 3080 still has almost three times as many CUDA cores as the RTX 2080, so no one will be left disappointed.
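The doubling described above shows up directly in the peak-FLOPS arithmetic. Here is a rough sketch using the figures from the spec table; counting each CUDA core as one fused multiply-add (two floating-point operations) per clock is the usual convention, and the RTX 2080 numbers are the public ones:

```python
# Back-of-the-envelope peak FP32 throughput (a sketch, not a benchmark).

def peak_fp32_tflops(cuda_cores: int, boost_ghz: float) -> float:
    # One fused multiply-add = 2 FLOPs per core per clock.
    return cuda_cores * 2 * boost_ghz / 1000

rtx_3080 = peak_fp32_tflops(8704, 1.71)  # ~29.8 TFLOPS
rtx_2080 = peak_fp32_tflops(2944, 1.71)  # ~10.1 TFLOPS at the same boost clock

print(f"RTX 3080: {rtx_3080:.1f} TFLOPS, RTX 2080: {rtx_2080:.1f} TFLOPS")
```

This is exactly why the on-paper gap is near-threefold while real games gain less: only FP32-heavy workloads can feed both datapaths at once.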
Optimization of Tensor Cores and RT Cores
Attentive readers have probably already compared the diagrams of the Turing and Ampere streaming multiprocessors and noticed that Ampere is missing some tensor cores. That's right, it is not a mistake. The number of tensor cores has indeed dropped to four per streaming multiprocessor instead of eight. On the other hand, by increasing the size of the multiplied matrices from 4×4 to 4×8, NVIDIA doubled the efficiency of each core. This is probably why, and also to save die area and transistors, NVIDIA decided to drop every second tensor core.
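The trade-off is easy to check on paper: half as many cores, each chewing through a matrix with twice the elements, leaves per-SM throughput per clock unchanged. A toy model (treating throughput as simply proportional to matrix elements processed is an assumption for illustration):

```python
# Toy model: per-SM tensor throughput as cores x matrix elements per clock.

def sm_tensor_ops(cores_per_sm: int, rows: int, cols: int) -> int:
    # Assume work per clock scales with the elements of the multiplied matrix.
    return cores_per_sm * rows * cols

turing = sm_tensor_ops(8, 4, 4)  # 8 cores x 16 elements = 128
ampere = sm_tensor_ops(4, 4, 8)  # 4 cores x 32 elements = 128

print(turing, ampere)  # same per-clock figure, with half the cores
```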
The number of RT cores has not changed: there is still one per streaming multiprocessor. But they too received optimizations, and performance grew from Turing's 34 RT teraflops to 58 RT teraflops for Ampere.
GDDR6X Video Memory
Along with the new video cards came the new GDDR6X video memory, currently used in the RTX 3080 and RTX 3090. The peculiarity of this memory is that it was developed in secrecy by NVIDIA and Micron. The development was so secretive that the specifications of the new memory were not even submitted to the JEDEC consortium. As a result, there is only one memory manufacturer, and as you might guess, it is Micron.
Considering GDDR6X's complicated history, it is hard to say whether the new memory will become the standard of a new generation or remain exclusive to top-end NVIDIA video cards. For now, we can only say that GDDR6X turned out noticeably faster and more efficient than GDDR6. Thus, at the 19.5 GHz effective rate used on Ampere GA10x chips, the new memory delivers up to 936 GB/s of bandwidth, while the RTX 2080 Ti's memory bandwidth (616 GB/s) is about a third lower.
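Memory bandwidth follows directly from the effective data rate and the bus width. A quick sketch using the publicly known configurations (the RTX 3080's 320-bit bus runs at 19 Gbps, the RTX 3090's 384-bit bus at 19.5 Gbps):

```python
# Peak memory bandwidth = effective data rate (Gbps per pin) x bus width / 8.

def bandwidth_gbs(data_rate_gbps: float, bus_width_bits: int) -> float:
    return data_rate_gbps * bus_width_bits / 8

rtx_3080    = bandwidth_gbs(19.0, 320)  # 760.0 GB/s
rtx_3090    = bandwidth_gbs(19.5, 384)  # 936.0 GB/s
rtx_2080_ti = bandwidth_gbs(14.0, 352)  # 616.0 GB/s (GDDR6, for comparison)
```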
Another feature of the new memory is support for Error Detection and Replay. Its principle of operation is similar to that of the ECC memory used in servers: if corrupted data turns up in memory, the video card simply requests it again instead of halting. Of course, if this happens during a memory overclock, performance will drop, but no artifacts or BSODs will appear on the screen.
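The replay behavior described above can be modeled as a simple retry loop. This is purely an illustrative sketch; the real mechanism lives in the memory controller hardware, and the function names here are invented:

```python
# Toy model of Error Detection and Replay: retry a failed transfer instead
# of crashing; repeated failures are the signal to back off the overclock.

def transfer_with_replay(read, max_retries=3):
    """read() returns (data, crc_ok); replay until the CRC checks out."""
    errors = 0
    for _ in range(max_retries):
        data, crc_ok = read()
        if crc_ok:
            return data, errors
        errors += 1  # in hardware, repeated errors cost bandwidth, not stability
    raise RuntimeError("unstable memory overclock")
```

From the user's point of view, the only symptom of an over-aggressive memory overclock is lower FPS, not artifacts or a crash.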
NVIDIA DLSS 2.1
NVIDIA DLSS (Deep Learning Super Sampling) technology was introduced with the RTX 2000 series graphics cards. Its task is to improve performance in games through a new kind of anti-aliasing built on deep learning algorithms running on the tensor cores. Since its debut, DLSS has grown to version 2.1 and become much better. Thanks to optimized algorithms, the video card now needs less input data to build a frame, and the whole process runs much faster. For example, while demonstrating the new DLSS, NVIDIA showed that it easily copes with Wolfenstein: Youngblood running on an 8K monitor with RTX effects enabled.
This is a great leap forward for both DLSS and modern games. The only pity is that DLSS support depends on the game developers and can only be enabled in compatible games.
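The source of the performance gain is easy to quantify: the GPU shades far fewer pixels at the internal render resolution, and the tensor cores reconstruct the rest. A quick sketch (the half-resolution scale factor corresponds to the commonly cited DLSS Performance mode and is an assumption here):

```python
# Fraction of output pixels actually shaded when DLSS renders internally
# at a reduced resolution and upscales the result.

def pixel_fraction(out_w: int, out_h: int, scale: float = 0.5) -> float:
    # Performance mode renders at half the output width and height.
    rendered = (out_w * scale) * (out_h * scale)
    return rendered / (out_w * out_h)

print(pixel_fraction(7680, 4320))  # 0.25: only a quarter of 8K's pixels shaded
```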
NVIDIA RTX IO: faster texture loading
Games get bigger every year, demanding more video memory and more SSD space. The new Call of Duty: Modern Warfare, for example, no longer fits on a 256 GB SSD at all; that is how big the game has become.
The problem is that loading large textures into video card memory takes a long time by system standards: first the texture is placed in RAM, and only then moved to GPU memory. To remedy the situation, NVIDIA came up with NVIDIA RTX IO.
Thanks to NVIDIA RTX IO and the Microsoft DirectStorage API, textures can be loaded directly into the memory of the video card, bypassing the processor and RAM. This should speed up the loading of levels and reduce the number of freezes in games.
NVIDIA Reflex: reducing input lag
At some point, AMD thought about how to make gamers' lives easier and came up with Radeon Anti-Lag technology. It reduces the time that elapses from the moment a mouse button is pressed to the corresponding action on screen. Simply put, in shooters it became possible to shoot a little faster. With the release of Ampere, NVIDIA presented a symmetrical answer: NVIDIA Reflex.
Thanks to NVIDIA Reflex, system latency in games is reduced, games become more responsive, and the player gains those extra fractions of a second that are sometimes all that stands between losing and winning. The new technology does have its drawbacks, though. It must be supported by the game developers, and it works best in competitive games that hit several hundred FPS.
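A simplified way to see where the savings come from is the render queue: frames waiting to be displayed add latency equal to the queue depth divided by the frame rate. The two-frame queue depth below is an assumption for illustration; the real pipeline has more stages:

```python
# Latency added by queued frames (a simplified model of what Reflex trims).

def queue_latency_ms(queued_frames: int, fps: float) -> float:
    return queued_frames / fps * 1000

print(queue_latency_ms(2, 144))  # ~13.9 ms of avoidable lag at 144 FPS
```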
NVIDIA Broadcast: a streamer's joy
Thanks to NVIDIA, streaming without a significant load on the system became possible. At one point the company added a hardware video encoding unit, which handles the video stream and takes this task off the CPU. Now NVIDIA has decided to tackle other aspects of streaming, namely the sound and the webcam picture.
With NVIDIA Broadcast and smart noise cancellation, you can remove background noise from your stream: key presses, mouse clicks and other extraneous sounds. And if you have a webcam, NVIDIA Broadcast lets you set any video background and automatically crops the picture to follow your head movements.
NVIDIA Omniverse Machinima
The latest technology, NVIDIA Omniverse Machinima, comes in handy for those who like to create their own clips and videos on game engines. With its help, it will be possible to add your own effects to the game, enable lip-syncing of a character with spoken text (the so-called lip sync), add RTX rendering to effects, and much more. In short, with NVIDIA Omniverse Machinima, the game turns into a real film set, where the director is the player.
This concludes the description of the architecture; it is time for the ASUS TUF Gaming GeForce RTX 3080 OC video card itself and the long-awaited tests. Let's go!
ASUS TUF Gaming GeForce RTX 3080 OC – appearance and cooling system
ASUS TUF Gaming GeForce RTX 3080 OC is one of the first high-performance graphics cards released in the TUF Gaming lineup. Before it, the fastest "TUFs" were the RTX 2060 and the Radeon RX 5700. Now the line has been refreshed with RTX 3080 and RTX 3070 cards. There is no TUF RTX 3060 on the ASUS website yet, but it will most likely appear there after the official release.
The dimensions of the ASUS TUF Gaming GeForce RTX 3080 OC are similar to those of the ROG Strix versions of the RTX 2080. The new video card will feel comfortable in most cases, since its length is 300 millimeters. In height (or thickness, whichever you prefer) it takes up 2.7 slots; nothing new here either, as this has become something of a standard among gaming graphics cards.
While the dimensions of the ASUS TUF Gaming GeForce RTX 3080 OC are similar to other ASUS video cards, the cooling shroud is special: it is made not of the usual plastic but of aluminum. The shroud has LED lighting, albeit modest; a small strip above the logo glows. ASUS Aura Sync is supported, allowing you to synchronize the card's lighting with other Aura Sync-capable devices.
The cooling fans will be familiar to anyone who has read at least one review of ROG Strix graphics cards from the RTX 2000 series. These are Axial-tech fans with a special impeller geometry, a smaller hub (for increased airflow) and dual ball bearings. Three fans in total are mounted on the shroud, with the middle one rotating in the opposite direction to the other two. And if the GPU temperature is 55 degrees or lower, the fans switch off entirely; the heatsink alone is enough for cooling.
The reverse side of the video card is covered with a metal backplate. One could write that the ASUS TUF Gaming GeForce RTX 3080 OC has "everything like everyone else" here, except that the plate has a wide cutout for hot air exhaust, located opposite the third fan. On the back you can also see the ceramic GPU capacitors (opposite the GPU mounting area) and the BIOS mode switch: you can choose a performance-oriented gaming BIOS with higher frequencies and the usual fan noise, or a quiet BIOS with slightly reduced frequencies and quieter cooling.
For power, the ASUS TUF Gaming GeForce RTX 3080 OC uses two 8-pin connectors rather than the single 12-pin connector of the reference models. This is great news: the owner will not have to hunt for a 12-pin adapter and can get by with two standard 8-pin cables. The mounting bracket is made of stainless steel and carries five monitor ports: three DisplayPort 1.4a and two HDMI 2.1.
Testing ASUS TUF Gaming GeForce RTX 3080 OC
To test the video card, we used two test benches. The first, with an Intel Core i5-10600K processor and an ASUS ROG Maximus XII Formula motherboard, can be considered a good modern gaming computer. The second bench was built around a high-performance Intel Core i9-10920X processor overclocked to 4.7 GHz and a Prime X299 Edition 30 motherboard. The remaining components were identical: an ASUS ROG RYUJIN 360 CPU cooler, 2x 8 GB of Kingston HyperX Predator DDR4-3333 memory (HX433C16PB3K2/16), a WD Blue SN550 NVMe SSD, an ASUS ROG Thor 850W PSU, and an ASUS ROG Swift PG279Q gaming monitor. The operating system was Windows 10 Pro 64-bit (version 2004) with all the latest updates. All games were launched at maximum graphics settings and a resolution of 2560×1440.
Testing was done in the 3DMark benchmark and popular games. In games, we measured the maximum GPU temperature and frequency, as well as video memory consumption. The test results are presented below.
The results in 3DMark were as expected; the small difference in numbers comes down to the processors. After all, 3DMark reports an overall score that also factors in CPU performance, for which it has a separate test. FPS in games was predictably high; only the updated Crysis dropped below 60 frames, but what can you say: the king is back, long live the king! 🙂 The difference in performance between Call of Duty: Modern Warfare 2 Remastered and Call of Duty: Modern Warfare is also worth mentioning. It comes down to the Super Sampling x4 anti-aliasing method, which CoD MW2 Remastered has and CoD MW does not; that is why the first game scored fewer FPS than the second. Throughout the tests, the GPU temperature stayed at 62 degrees Celsius and the three fans remained quiet, and that is at maximum load with the GPU running at about 2 GHz. You could say the cooling system passed this exam with honors.
NVIDIA Ampere turned out to be a landmark architecture, and the ASUS TUF Gaming GeForce RTX 3080 OC a great graphics card. After studying all of Ampere's innovations, the excitement that flared up among users becomes understandable, as does why these video cards are instantly swept off store shelves. The new architecture brought an unprecedented performance increase along with a set of genuinely useful features for gamers, streamers and creators of videos on game engines.
The ASUS TUF Gaming GeForce RTX 3080 OC will be an excellent purchase for a gamer. It copes with 2560×1440 at maximum graphics settings, which means it will handle 1920×1080 effortlessly, while for 4K gaming the settings will sometimes need to be lowered.
The card itself is very well built, with a highly efficient yet quiet cooling system. There is literally nothing to find fault with.