GPU VRAM Sizing Tool
Size video memory requirements for LLM and deep learning inference.
Model Parameters (Billions)
Precision (Bits)
FP16 (2 Bytes)
INT8 (1 Byte)
INT4 (0.5 Bytes)
Concurrent Users:
10
1
50
100
Context Length (Tokens):
2,048
512
16K
32K
Weights
--
KV Cache
--
Total Recommended
--
--
Export Technical Graphic