Inference Benchmarks

Performance metrics across hardware, software, and model configurations

Early Preview

We are working to certify more internal benchmarks for publication. If you're interested in providing hardware or have questions, email benchmarks@haimaker.ai.

Active filters: Tag: llama-2-70b-hf

Found 3 benchmark suites

| Date | Suite Name | GPU | Model | Output TPS | Input TPS | Energy Cost (kWh/MT) |
|---|---|---|---|---|---|---|
| 11/13/2025 | NVIDIA H100 80GB HBM3 (8x) - llama-2-70b-hf | NVIDIA H100 80GB HBM3, 8x 632GB | llama-2-70b-hf (meta-llama) | 668.64 | 855.76 | 0.79 |
| 11/6/2025 | NVIDIA H200 NVL (2x) - llama-2-70b-hf (50% Max Batch Token) | NVIDIA H200 NVL, 2x 280GB | llama-2-70b-hf (meta-llama) | 4,620.81 | 8,844.22 | 0.03 |
| 11/6/2025 | NVIDIA H200 NVL (2x) - llama-2-70b-hf | NVIDIA H200 NVL, 2x 280GB | llama-2-70b-hf (meta-llama) | 5,012.77 | 10,466.05 | 0.03 |
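
Because the three suites listed above run on different GPU counts (8x versus 2x), a per-GPU figure can make the throughput numbers easier to compare. The following is a minimal sketch, not part of the benchmark tooling: it assumes the reported TPS values are totals across all GPUs in a suite, which the page does not state. The `Suite` class and `per_gpu` helper are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Suite:
    """One row of the results listing (fields copied from the page)."""
    name: str
    gpu_count: int
    output_tps: float
    input_tps: float


# Values transcribed from the three listed benchmark suites.
SUITES = [
    Suite("NVIDIA H100 80GB HBM3 (8x)", 8, 668.64, 855.76),
    Suite("NVIDIA H200 NVL (2x), 50% Max Batch Token", 2, 4620.81, 8844.22),
    Suite("NVIDIA H200 NVL (2x)", 2, 5012.77, 10466.05),
]


def per_gpu(total_tps: float, gpu_count: int) -> float:
    """Divide a suite-level TPS figure by its GPU count.

    ASSUMPTION: the listed TPS is aggregated over every GPU in the suite.
    """
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return total_tps / gpu_count


if __name__ == "__main__":
    for s in SUITES:
        out = per_gpu(s.output_tps, s.gpu_count)
        inp = per_gpu(s.input_tps, s.gpu_count)
        print(f"{s.name}: {out:.2f} output TPS/GPU, {inp:.2f} input TPS/GPU")
```

If the TPS figures turn out to be reported per GPU already, the division step should be dropped and the raw values compared directly.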