Inference Benchmarks

Performance metrics across hardware, software, and model configurations

Early Preview

We are working to certify more of our internal benchmarks for publication. If you would like to provide hardware or have questions, email benchmarks@haimaker.ai.

Active Filters: Tag: NVIDIA H20

Found 7 benchmark suites

Date	Suite Name	GPU	Model	Output TPS	Input TPS	Energy Cost (kWh/MT)
10/26/2025	NVIDIA H20 (8x) - deepseek-v3.1	NVIDIA H20 8x 760GB	deepseek-v3.1 deepseek-ai	865.08	4,142.63	0.16
10/25/2025	NVIDIA H20 (8x) - llama-3.3-70b-instruct (High Throughput)	NVIDIA H20 8x 760GB	llama-3.3-70b-instruct meta-llama	5,091.04	7,327.23	0.10
10/24/2025	NVIDIA H20 (8x) - llama-3.3-70b-instruct	NVIDIA H20 8x 760GB	llama-3.3-70b-instruct meta-llama	3,370.98	6,350.24	0.11
10/24/2025	NVIDIA H20 (8x) - qwen2.5-vl-72b-instruct	NVIDIA H20 8x 760GB	qwen2.5-vl-72b-instruct qwen	2,266.04	6,375.82	0.11
10/24/2025	NVIDIA H20 (8x) - qwen3-coder-30b-a3b-instruct	NVIDIA H20 8x 760GB	qwen3-coder-30b-a3b-instruct qwen	8,987.38	30,699.04	0.02
10/24/2025	NVIDIA H20 (8x) - mistral-nemo-instruct-2407	NVIDIA H20 8x 760GB	mistral-nemo-instruct-2407 mistralai	12,605.43	24,969.58	0.02
10/24/2025	NVIDIA H20 (8x) - gemma-3-27b-it	NVIDIA H20 8x 760GB	gemma-3-27b-it google	6,567.80	12,358.95	0.05
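The rows above are tab-separated and use thousands separators in the throughput columns, so they need a little cleanup before they can be compared numerically. The following is a minimal sketch, not part of the benchmark tooling: it copies three rows from the table, parses them, and ranks the suites by output TPS. The shortened column names (`suite`, `output_tps`, `input_tps`, `kwh_per_mt`) and the helper functions are assumptions made for illustration.

```python
import csv
import io

# Three rows copied from the table above. The column names are
# hypothetical short forms chosen for this sketch; the energy column
# keeps the unit as labeled in the source (kWh/MT).
RAW = """suite\toutput_tps\tinput_tps\tkwh_per_mt
deepseek-v3.1\t865.08\t4,142.63\t0.16
llama-3.3-70b-instruct (High Throughput)\t5,091.04\t7,327.23\t0.10
mistral-nemo-instruct-2407\t12,605.43\t24,969.58\t0.02
"""


def to_float(text):
    """Convert a number with thousands separators, e.g. '4,142.63', to float."""
    return float(text.replace(",", ""))


def load_rows(raw):
    """Parse tab-separated text into a list of dicts with numeric fields."""
    reader = csv.DictReader(io.StringIO(raw), delimiter="\t")
    return [
        {
            "suite": r["suite"],
            "output_tps": to_float(r["output_tps"]),
            "input_tps": to_float(r["input_tps"]),
            "kwh_per_mt": to_float(r["kwh_per_mt"]),
        }
        for r in reader
    ]


rows = load_rows(RAW)

# Rank suites from highest to lowest output throughput.
ranked = sorted(rows, key=lambda r: r["output_tps"], reverse=True)
for r in ranked:
    print(f'{r["suite"]}: {r["output_tps"]:.2f} output TPS, '
          f'{r["kwh_per_mt"]:.2f} kWh/MT')
```

The same loader would work on the full table if the remaining rows were pasted into `RAW` in the same column order.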