GPU

NVIDIA RTX A4000


Integrated Memory (VRAM)
Capacity

16 GB

(GDDR6 256-bit)

Bandwidth

448 GB/s

64 Token/s
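The bandwidth figure can be sanity-checked from the bus width. The sketch below assumes a per-pin GDDR6 data rate of 14 Gbit/s, which is not stated on this page. It also shows one common rough estimate for memory-bound token generation: bandwidth divided by the bytes read per token. The 7 GB model footprint is a hypothetical value chosen for illustration; it happens to reproduce the 64 Token/s figure, but the page does not say how that number was derived.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak bandwidth in GB/s: bus pins times per-pin Gbit/s, over 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


def decode_tokens_per_s(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper-bound estimate, assuming every generated token reads all weights once."""
    return bandwidth_gb_s / model_size_gb


BUS_WIDTH_BITS = 256    # from the listing above
DATA_RATE_GBPS = 14.0   # assumption: per-pin GDDR6 data rate, not given on this page
MODEL_SIZE_GB = 7.0     # hypothetical model footprint, for illustration only

bandwidth = memory_bandwidth_gb_s(BUS_WIDTH_BITS, DATA_RATE_GBPS)
tokens = decode_tokens_per_s(bandwidth, MODEL_SIZE_GB)

print(f"Bandwidth: {bandwidth:.0f} GB/s")
print(f"Estimated decode rate: {tokens:.0f} tokens/s")
```

Real throughput is typically lower than this bound, since it ignores compute time, cache traffic, and any data other than the weights.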

Vector Compute
FP64
299.50 G
FP32
19.17 T
FP16
19.17 T
BF16
INT32
INT8
X

NVIDIA RTX A4000 General-Purpose Floating-Point performance (Vector Performance / Scalar Performance)

FP64: 299.50 GFLOPS

FP32: 19.17 TFLOPS

FP16: 19.17 TFLOPS
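These peak figures follow from the core count and the top of the core frequency range listed in the hardware specs. A minimal sketch, assuming each CUDA core retires one fused multiply-add (2 FLOPs) per clock and that FP64 runs at 1/64 of the FP32 rate; both ratios are assumptions that are consistent with the numbers above rather than values stated on this page.

```python
CUDA_CORES = 6144               # from the hardware specs
BOOST_CLOCK_MHZ = 1560          # upper end of the listed core frequency range
FLOPS_PER_CORE_PER_CLOCK = 2    # assumption: one FMA counts as 2 FLOPs
FP64_TO_FP32_RATIO = 1 / 64     # assumption: inferred from the listed figures


def peak_tflops(cores: int, clock_mhz: float, flops_per_clock: int) -> float:
    """Theoretical peak: cores * clock (Hz) * FLOPs per clock, expressed in TFLOPS."""
    return cores * clock_mhz * 1e6 * flops_per_clock / 1e12


fp32_tflops = peak_tflops(CUDA_CORES, BOOST_CLOCK_MHZ, FLOPS_PER_CORE_PER_CLOCK)
fp64_gflops = fp32_tflops * FP64_TO_FP32_RATIO * 1000

print(f"FP32: {fp32_tflops:.2f} TFLOPS")
print(f"FP64: {fp64_gflops:.1f} GFLOPS")
```

These are theoretical peaks at the maximum clock; sustained throughput depends on the workload and on how long the card holds that clock.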

Matrix Compute
FP64: X
FP32: X
FP16: 76.68 T (153.35 T with sparsity)
FP8: X
TF32: 38.34 T (76.68 T with sparsity)
BF16: 76.68 T (153.35 T with sparsity)
INT16: X
INT8: 153.35 T (306.71 T with sparsity)
INT4: 306.71 T (613.42 T with sparsity)

NVIDIA RTX A4000 AI performance (Tensor Performance / Matrix Performance)

FP16: 76.68 TFLOPS, with sparsity: 153.35 TFLOPS

TF32: 38.34 TFLOPS, with sparsity: 76.68 TFLOPS

BF16: 76.68 TFLOPS, with sparsity: 153.35 TFLOPS

INT8: 153.35 TOPS, with sparsity: 306.71 TOPS

INT4: 306.71 TOPS, with sparsity: 613.42 TOPS
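The tensor figures above are related by simple ratios, which the following sketch reproduces. It assumes 256 dense FP16 operations per Tensor core per clock (128 FMAs) and the relative rates shown in the table; these constants are assumptions chosen to match the listed numbers and are not stated on this page. The 2x factor for sparsity comes directly from the listed pairs.

```python
TENSOR_CORES = 192                   # from the hardware specs
BOOST_CLOCK_MHZ = 1560               # upper end of the listed core frequency range
FP16_OPS_PER_CORE_PER_CLOCK = 256    # assumption: 128 FMAs per Tensor core per clock
SPARSITY_SPEEDUP = 2                 # each listed sparse figure is 2x its dense figure

# Assumption: throughput of each format relative to dense FP16.
RELATIVE_RATE = {"FP16": 1.0, "BF16": 1.0, "TF32": 0.5, "INT8": 2.0, "INT4": 4.0}


def tensor_peak_tera_ops(fmt: str, sparse: bool = False) -> float:
    """Theoretical peak in tera-operations per second for the given format."""
    dense = (TENSOR_CORES * BOOST_CLOCK_MHZ * 1e6
             * FP16_OPS_PER_CORE_PER_CLOCK * RELATIVE_RATE[fmt] / 1e12)
    return dense * SPARSITY_SPEEDUP if sparse else dense


for fmt in RELATIVE_RATE:
    dense = tensor_peak_tera_ops(fmt)
    sparse = tensor_peak_tera_ops(fmt, sparse=True)
    print(f"{fmt}: {dense:.2f} T, with sparsity: {sparse:.2f} T")
```

The sparsity figures apply only to workloads that meet the hardware's structured-sparsity requirements; dense workloads see the lower number.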

Hardware Specs
The NVIDIA RTX A4000 is built on an 8 nm process, has 17,400 million (17.4 billion) transistors, and was launched by NVIDIA in 2021. It has 16 GB of on-board memory with bandwidth up to 448 GB/s, 6144 general-purpose ALUs (CUDA cores/shader cores), and 192 matrix cores (Tensor cores).
Process Node
8 nm
Launch Year
2021

Vector(CUDA) Cores
6144
Matrix(Tensor) Cores
192
Core Frequency
735 ~ 1560 MHz
Cache
4 MB