GPU

NVIDIA GeForce RTX 4070 SUPER

Integrated Memory (VRAM)
Capacity

12 GB

(GDDR6X 192-bit)

Bandwidth

504 GB/s

72 Token/s
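
The memory figures above can be cross-checked with a little arithmetic. The sketch below derives the per-pin data rate implied by the listed bandwidth and bus width, and shows one way a token rate of this size could arise. The page does not say how the 72 Token/s figure was obtained; the bandwidth-bound formula and the 7 GB model size (`ASSUMED_MODEL_SIZE_GB`) are assumptions chosen only because they reproduce the listed number.

```python
# Figures taken from the listing above.
BUS_WIDTH_BITS = 192
BANDWIDTH_GB_S = 504.0

# Assumption (not stated on the page): a model whose weights occupy 7 GB.
ASSUMED_MODEL_SIZE_GB = 7.0


def implied_data_rate_gbps(bandwidth_gb_s: float, bus_width_bits: int) -> float:
    """Per-pin data rate in Gbit/s implied by a bandwidth and a bus width."""
    return bandwidth_gb_s * 8 / bus_width_bits


def bandwidth_bound_tokens_per_s(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough decode-speed ceiling if each token reads every weight once."""
    return bandwidth_gb_s / model_size_gb


data_rate = implied_data_rate_gbps(BANDWIDTH_GB_S, BUS_WIDTH_BITS)
tokens = bandwidth_bound_tokens_per_s(BANDWIDTH_GB_S, ASSUMED_MODEL_SIZE_GB)

print(f"Implied data rate: {data_rate:.1f} Gbit/s per pin")
print(f"Bandwidth-bound estimate: {tokens:.0f} tokens/s")
```

Real decode throughput depends on the model, quantization, and software stack, so this is best read as a ceiling, not a benchmark result.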

Vector Compute
Format    Peak throughput
FP64      554.40 GFLOPS
FP32      35.48 TFLOPS
FP16      35.48 TFLOPS
BF16      35.48 TFLOPS
INT32     17.74 TOPS
INT8      X

NVIDIA GeForce RTX 4070 SUPER general-purpose floating-point performance (Vector Performance / Scalar Performance)

FP64: 554.40 GFLOPS

FP32: 35.48 TFLOPS

FP16: 35.48 TFLOPS

BF16: 35.48 TFLOPS

INT32: 17.74 TOPS

Matrix Compute
Format    Dense            With sparsity
FP64      X
FP32      X
FP16      70.96 TFLOPS     141.93 TFLOPS
FP8       141.93 TFLOPS    283.85 TFLOPS
TF32      35.48 TFLOPS     70.96 TFLOPS
BF16      70.96 TFLOPS     141.93 TFLOPS
INT16     X
INT8      283.85 TOPS      567.71 TOPS
INT4      567.71 TOPS      1135.41 TOPS

NVIDIA GeForce RTX 4070 SUPER AI performance (Tensor Performance / Matrix Performance)

FP16: 70.96 TFLOPS, with sparsity: 141.93 TFLOPS

FP8: 141.93 TFLOPS, with sparsity: 283.85 TFLOPS

TF32: 35.48 TFLOPS, with sparsity: 70.96 TFLOPS

BF16: 70.96 TFLOPS, with sparsity: 141.93 TFLOPS

INT8: 283.85 TOPS, with sparsity: 567.71 TOPS

INT4: 567.71 TOPS, with sparsity: 1135.41 TOPS
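
The tensor figures above follow a regular pattern: each format is a whole-number multiple of the FP32 vector peak, and the sparsity figure is twice the dense one. The sketch below reproduces the listed values from that pattern. The multipliers in `DENSE_MULTIPLIER` are inferred from the numbers on this page, not taken from an NVIDIA specification, and the unrounded FP32 peak is an assumption computed as 7168 cores × 2 operations × 2.475 GHz.

```python
# Unrounded FP32 vector peak in TFLOPS (listed on the page as 35.48).
FP32_VECTOR_TFLOPS = 7168 * 2 * 2.475 / 1000

# Multipliers relative to the FP32 vector peak, inferred from the listing.
DENSE_MULTIPLIER = {
    "TF32": 1,
    "FP16": 2,
    "BF16": 2,
    "FP8": 4,
    "INT8": 8,
    "INT4": 16,
}

# The "with sparsity" column is double the dense column in every listed row.
SPARSITY_FACTOR = 2


def tensor_peaks(fp32_tflops: float) -> dict[str, tuple[float, float]]:
    """Return {format: (dense, with_sparsity)} in TFLOPS or TOPS."""
    return {
        fmt: (fp32_tflops * mult, fp32_tflops * mult * SPARSITY_FACTOR)
        for fmt, mult in DENSE_MULTIPLIER.items()
    }


peaks = tensor_peaks(FP32_VECTOR_TFLOPS)
for fmt, (dense, sparse) in peaks.items():
    print(f"{fmt}: {dense:.2f} dense, {sparse:.2f} with sparsity")
```

These are theoretical peaks at the maximum listed clock; sustained throughput in real workloads is typically lower.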

Hardware Specs
The NVIDIA GeForce RTX 4070 SUPER is built on a 5 nm process, has 35,800 million (35.8 billion) transistors, and was launched by NVIDIA in 2024. It has 12 GB of on-board memory with bandwidth of up to 504 GB/s, 7168 general-purpose ALUs (CUDA cores / shader cores), and 224 matrix cores (Tensor cores).
Process Node: 5 nm
Launch Year: 2024
Vector (CUDA) Cores: 7168
Matrix (Tensor) Cores: 224
Core Frequency: 1980 ~ 2475 MHz
Cache: 48 MB
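
The vector throughput figures on this page are consistent with the core count and the upper core frequency listed here. A minimal sketch of that arithmetic follows, assuming each CUDA core completes one fused multiply-add (counted as 2 operations) per clock at 2475 MHz. The 1:64 FP64 ratio and the 1:2 INT32 ratio are inferred from the listed numbers, not quoted from a specification.

```python
# Values from the hardware specs above.
CUDA_CORES = 7168
MAX_CLOCK_GHZ = 2.475

# Assumption: one fused multiply-add per core per clock, counted as 2 ops.
OPS_PER_CORE_PER_CLOCK = 2

# Ratios inferred from the listed FP64 and INT32 figures.
FP64_RATIO = 1 / 64
INT32_RATIO = 1 / 2

fp32_tflops = CUDA_CORES * OPS_PER_CORE_PER_CLOCK * MAX_CLOCK_GHZ / 1000
fp64_gflops = fp32_tflops * 1000 * FP64_RATIO
int32_tops = fp32_tflops * INT32_RATIO

print(f"FP32:  {fp32_tflops:.2f} TFLOPS")
print(f"FP64:  {fp64_gflops:.2f} GFLOPS")
print(f"INT32: {int32_tops:.2f} TOPS")
```

Using the lower listed frequency (1980 MHz) in place of `MAX_CLOCK_GHZ` gives the corresponding figures at the bottom of the clock range.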