Find a GPU for Your Model

Running Kimi K2.6? DeepSeek V4 Pro? Mercury 2? Or a custom model? Pick the parameter count and precision, and we show every GPU in our catalog that has enough VRAM, ranked by FP16 compute performance.

We calculate VRAM requirements for FP16, INT8, and FP4/GGUF quantization across batch sizes. Each result row shows the maximum batch size that GPU can handle for your configuration.
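As a rough illustration of how precision drives the VRAM figure, here is a minimal sketch that estimates weight memory only. The function name `weight_vram_gb` and the bytes-per-parameter table are assumptions made for this example; the exact formula this tool uses is not stated on the page, and a real estimate would also include KV cache, activations, and framework overhead.

```python
# Bytes needed to store one parameter at each precision (weights only).
# Assumption: FP4 is treated as 4 bits = 0.5 bytes per parameter.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "fp4": 0.5}


def weight_vram_gb(params_billion: float, precision: str) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes).

    params_billion * 1e9 params * bytes_per_param / 1e9 bytes-per-GB
    simplifies to params_billion * bytes_per_param.
    """
    return params_billion * BYTES_PER_PARAM[precision]


# A hypothetical 8-billion-parameter model at FP16 needs about 16 GB for weights.
print(weight_vram_gb(8, "fp16"))
```

Halving the bytes per parameter halves the weight footprint, which is why a model that does not fit at FP16 may fit at INT8 or FP4.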

Choose model size

Set precision and batch size

Batch Size: 1

VRAM needed: 16 GB

27 GPUs can run this config

GPU	VRAM	FP16 TFLOPS	Architecture	Max Batch	Tags
AMD Instinct MI300X	192 GB	1307	CDNA 3	12x	datacenter, training, inference
NVIDIA B200	192 GB	1125	Blackwell	12x	datacenter, training, inference
NVIDIA GB200	192 GB	1125	Blackwell	12x	datacenter, training, inference
NVIDIA B300	288 GB	1100	Blackwell Ultra	18x	datacenter, training, inference
NVIDIA H100	80 GB	495	Hopper	5x	datacenter, training, inference
NVIDIA H200	141 GB	495	Hopper	8x	datacenter, training, inference
NVIDIA GH200	96 GB	495	Hopper	6x	datacenter, training, inference
NVIDIA RTX 5090	32 GB	419	Blackwell	2x	consumer, inference, training
NVIDIA H200 NVL	141 GB	418	Hopper	8x	datacenter, training, inference
AMD Instinct MI250X	128 GB	383	CDNA 2	8x	datacenter, training, inference
NVIDIA L40S	48 GB	366	Ada Lovelace	3x	datacenter, inference, graphics
NVIDIA RTX 6000 Ada	48 GB	365	Ada Lovelace	3x	workstation, inference, training
NVIDIA A100	80 GB	312	Ampere	5x	datacenter, training, inference
NVIDIA RTX 5080	16 GB	225	Blackwell	1x	consumer, inference, training
NVIDIA RTX 4090	24 GB	165	Ada Lovelace	1x	consumer, inference, training
NVIDIA RTX A6000	48 GB	155	Ampere	3x	workstation, inference, training
NVIDIA A40	48 GB	150	Ampere	3x	datacenter, inference, graphics
NVIDIA A10	24 GB	125	Ampere	1x	datacenter, inference
NVIDIA A10G	24 GB	125	Ampere	1x	datacenter, inference
NVIDIA V100	32 GB	125	Volta	2x	datacenter, training, inference
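
The filtering and ranking behind a table like this can be sketched as follows. This is an illustration, not the tool's actual implementation: the `Gpu` type and `find_gpus` function are hypothetical names, the catalog is a five-row excerpt of the table, and the max-batch rule (GPU VRAM divided by the VRAM needed at batch size 1, rounded down) is inferred from the figures shown, where 192 GB / 16 GB gives 12x and 141 GB / 16 GB gives 8x.

```python
from typing import NamedTuple


class Gpu(NamedTuple):
    name: str
    vram_gb: int
    fp16_tflops: int


# Excerpt of the catalog, copied from the table.
CATALOG = [
    Gpu("NVIDIA RTX 4090", 24, 165),
    Gpu("NVIDIA H100", 80, 495),
    Gpu("AMD Instinct MI300X", 192, 1307),
    Gpu("NVIDIA RTX 5080", 16, 225),
    Gpu("NVIDIA B300", 288, 1100),
]


def find_gpus(catalog, vram_needed_gb):
    """Return (name, max_batch) for GPUs that fit, fastest FP16 first."""
    # Keep only GPUs with enough VRAM for batch size 1.
    fits = [g for g in catalog if g.vram_gb >= vram_needed_gb]
    # Rank by FP16 throughput, highest first.
    fits.sort(key=lambda g: g.fp16_tflops, reverse=True)
    # Assumed rule: memory scales linearly with batch size.
    return [(g.name, int(g.vram_gb // vram_needed_gb)) for g in fits]


for name, max_batch in find_gpus(CATALOG, 16):
    print(f"{name}\t{max_batch}x")
```

With 16 GB needed, the excerpt reproduces the corresponding Max Batch values in the table. Linear scaling with batch size is a simplification, since model weights are loaded once and only the per-request memory grows with the batch.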