Nvidia announces Tesla P100 - Pascal with 15.3 billion transistors and HBM2

At Nvidia’s annual GPU Technology Conference, CEO Jen-Hsun Huang takes the stage to present the company’s big news. This year, the conference is also where Nvidia sheds light on its upcoming Pascal architecture, and the first product to enter the spotlight is the Tesla P100, built on the GP100 GPU.

Block diagram for Nvidia “Pascal” GP100.

Specifications: Nvidia Tesla P100

	Tesla K40	Tesla M40	Tesla P100
Manufacturing process	TSMC 28 nm	TSMC 28 nm	TSMC 16 nm
GPU	GK110	GM200	GP100
Die size	551 mm²	601 mm²	610 mm²
Transistors	7.1 billion	8 billion	15.3 billion
Architecture	Kepler	Maxwell	Pascal
SMs	15	24	56
CUDA cores	2,880	3,072	3,584
Texture units	240	192	224
Base clock	745 MHz	948 MHz	1,328 MHz
Boost clock (GPU Boost)	875 MHz	1,114 MHz	1,480 MHz
Compute performance (SP, FP32)	5,040 GFLOPS	6,844 GFLOPS	10,608 GFLOPS
Compute performance (DP, FP64)	1,680 GFLOPS	214 GFLOPS	5,304 GFLOPS
Memory bus	384-bit	384-bit	4,096-bit
Memory type	GDDR5	GDDR5	HBM2
Memory capacity	Up to 12 GB	Up to 24 GB	16 GB
Memory frequency (effective)	6,000 MHz	6,000 MHz	~1,400 MHz
Memory bandwidth	288 GB/s	288 GB/s	720 GB/s
TDP	235 W	250 W	300 W

The Tesla P100 compute card is the first product based on the Pascal architecture. Thanks to TSMC's 16-nanometer process, the GP100 packs no fewer than 15.3 billion transistors, compared with 8 billion for the Maxwell-based GM200, while the die itself is only slightly larger at 610 mm² (versus 601 mm²).

All those transistors translate into plenty of computing power: the card is rated at 10.6 TFLOPS at single precision (FP32) and 5.3 TFLOPS at double precision (FP64). The former is roughly 55 percent more than the GM200 in the Tesla M40 (10,608 versus 6,844 GFLOPS), and could be seen as an indication of what to expect from the successor to the consumer-oriented GeForce GTX Titan X.
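The rated figures can be reproduced from the table with a few lines of arithmetic. The following is a minimal sketch, not Nvidia's own method: it assumes the usual convention that each CUDA core completes one fused multiply-add (counted as two floating-point operations) per clock, that the ratings are based on the boost clock, and that GP100 runs FP64 at half the FP32 rate, which is what the table's numbers imply. The function name `peak_gflops` is made up for this example.

```python
def peak_gflops(cuda_cores: int, clock_mhz: float, flops_per_clock: float = 2.0) -> float:
    """Theoretical peak throughput in GFLOPS.

    flops_per_clock=2.0 counts one fused multiply-add per core per clock
    as two floating-point operations (assumed convention).
    """
    return cuda_cores * flops_per_clock * clock_mhz / 1000.0


# Core counts and boost clocks taken from the specification table.
p100_fp32 = peak_gflops(3584, 1480)   # Tesla P100
m40_fp32 = peak_gflops(3072, 1114)    # Tesla M40 (GM200)

# Assumption: FP64 runs at half the FP32 rate on GP100,
# consistent with the 5,304 GFLOPS listed in the table.
p100_fp64 = p100_fp32 / 2

# Relative FP32 gain of the P100 over the M40, in percent.
increase_pct = (p100_fp32 / m40_fp32 - 1) * 100

print(f"P100 FP32: {p100_fp32 / 1000:.1f} TFLOPS")
print(f"P100 FP64: {p100_fp64 / 1000:.1f} TFLOPS")
print(f"FP32 gain over M40: {increase_pct:.1f} %")
```

With the table's boost clocks, the gain over the Tesla M40 comes out at close to 55 percent.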

A Streaming Multiprocessor (SM) with 64 CUDA cores.

As for further specifications, the full GP100 has 60 Streaming Multiprocessors (SMs), each with 64 CUDA cores and 4 texture units, for a total of 3,840 CUDA cores and 240 texture units. On the Tesla P100, however, four SMs are disabled, which reduces the totals to 3,584 and 224 respectively.
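The unit counts follow directly from the per-SM figures. As a small sketch (the constant and function names are invented for illustration):

```python
CORES_PER_SM = 64          # CUDA cores per SM, as stated in the article
TEXTURE_UNITS_PER_SM = 4   # texture units per SM, as stated in the article


def unit_counts(active_sms: int) -> tuple[int, int]:
    """Return (CUDA cores, texture units) for a given number of active SMs."""
    return active_sms * CORES_PER_SM, active_sms * TEXTURE_UNITS_PER_SM


full_gp100 = unit_counts(60)       # fully enabled GP100
tesla_p100 = unit_counts(60 - 4)   # Tesla P100 with four SMs disabled

print("Full GP100:", full_gp100)
print("Tesla P100:", tesla_p100)
```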

The GPU is flanked by four HBM2 memory stacks with a total capacity of 16 GB. The bandwidth is a record-high 720 GB/s, although that is still short of the 1 TB/s the memory technology makes possible. It is likely that Nvidia has lowered the memory clocks to keep power consumption down, and that other GP100-based graphics cards will be equipped with faster HBM2.
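The relationship between bus width, memory frequency and bandwidth can be checked with a short calculation. This sketch assumes that the memory frequency in the table is the effective per-pin data rate in millions of transfers per second, an interpretation that reproduces the 288 GB/s listed for the GDDR5 cards. The function names are hypothetical.

```python
def bandwidth_gb_s(bus_width_bits: int, data_rate_mt_s: float) -> float:
    """Peak memory bandwidth in GB/s.

    Assumes data_rate_mt_s is the effective per-pin transfer rate (MT/s);
    dividing by 8 converts bits to bytes, by 1000 converts MB/s to GB/s.
    """
    return bus_width_bits * data_rate_mt_s / 8 / 1000


def required_rate_mt_s(bus_width_bits: int, target_gb_s: float) -> float:
    """Effective per-pin rate (MT/s) needed to reach a target bandwidth."""
    return target_gb_s * 1000 * 8 / bus_width_bits


k40 = bandwidth_gb_s(384, 6000)     # GDDR5 on a 384-bit bus
p100 = bandwidth_gb_s(4096, 1400)   # HBM2 on a 4,096-bit bus at ~1,400 MT/s

# Rate the same 4,096-bit bus would need for 1 TB/s (taken here as 1000 GB/s).
rate_for_1tb = required_rate_mt_s(4096, 1000)

print(f"Tesla K40:  {k40:.1f} GB/s")
print(f"Tesla P100: {p100:.1f} GB/s")
print(f"Rate needed for 1 TB/s: {rate_for_1tb:.0f} MT/s")
```

Under that assumption the P100 lands at roughly 717 GB/s, in line with the quoted 720 GB/s given that the table's frequency is approximate, and reaching 1 TB/s on the same bus would take a per-pin rate of nearly 2,000 MT/s.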

Deliveries of the Nvidia Tesla P100 will begin shortly. It is not yet clear when the GP100 GPU will appear in consumer graphics cards.