Performance metrics across hardware, software, and model configurations
We are working to certify more internal benchmarks to be published. If you're interested in providing hardware or have questions, email [email protected].
Found 22 benchmark suites
| Date | Suite Name | GPU (count, total memory) | Model (organization) | Output TPS | Input TPS | Energy Cost (kWh/MT) |
|---|---|---|---|---|---|---|
| 12/19/2025 | NVIDIA A100-SXM4-80GB (2x) - gpt-oss-120b Reasoning | NVIDIA A100-SXM4-80GB 2x 160GB | gpt-oss-120b openai | 3,860.39 | 15,892.90 | 0.01 |
| 11/20/2025 | NVIDIA A100-PCIE-40GB (1x) - Mistral-Nemo-Instruct | NVIDIA A100-PCIE-40GB 1x 40GB | Mistral-Nemo-Instruct mistral | 3,541.62 | 6,567.89 | 0.01 |
| 11/13/2025 | NVIDIA H100 80GB HBM3 (8x) - gpt-oss-120b | NVIDIA H100 80GB HBM3 8x 632GB | gpt-oss-120b openai | 18,672.47 | 50,200.55 | 0.02 |
| 11/13/2025 | NVIDIA H100 80GB HBM3 (8x) - llama-2-70b-hf | NVIDIA H100 80GB HBM3 8x 632GB | llama-2-70b-hf meta-llama | 668.64 | 855.76 | 0.79 |
| 11/12/2025 | NVIDIA H100 80GB HBM3 (8x) - llama-3.3-70b-instruct | NVIDIA H100 80GB HBM3 8x 632GB | llama-3.3-70b-instruct meta-llama | 9,219.60 | 16,108.82 | 0.06 |
| 11/7/2025 | NVIDIA H200 NVL (2x) - mistral-nemo-instruct-2407 | NVIDIA H200 NVL 2x 280GB | mistral-nemo-instruct-2407 mistralai | 12,204.48 | 47,690.47 | 0.01 |
| 11/7/2025 | NVIDIA H200 NVL (2x) - qwen3-30b-a3b | NVIDIA H200 NVL 2x 280GB | qwen3-30b-a3b qwen | 6,124.38 | 51,413.77 | 0.00 |
| 11/6/2025 | NVIDIA H200 NVL (2x) - allam-7b-instruct-preview | NVIDIA H200 NVL 2x 280GB | allam-7b-instruct-preview humain-ai | 11,481.64 | 45,184.12 | 0.01 |
| 11/6/2025 | NVIDIA H200 NVL (2x) - llama-2-70b-hf (50% Max Batch Token) | NVIDIA H200 NVL 2x 280GB | llama-2-70b-hf meta-llama | 4,620.81 | 8,844.22 | 0.03 |
| 11/6/2025 | NVIDIA H200 NVL (2x) - llama-2-70b-hf | NVIDIA H200 NVL 2x 280GB | llama-2-70b-hf meta-llama | 5,012.77 | 10,466.05 | 0.03 |
| 11/5/2025 | NVIDIA H200 NVL (2x) - gpt-oss-120b | NVIDIA H200 NVL 2x 280GB | gpt-oss-120b openai | 3,166.06 | 11,929.37 | 0.01 |
| 11/5/2025 | NVIDIA H200 NVL (2x) - qwen3-coder-30b-a3b-instruct | NVIDIA H200 NVL 2x 280GB | qwen3-coder-30b-a3b-instruct qwen | 5,757.76 | 43,900.39 | 0.01 |
| 11/5/2025 | NVIDIA H200 NVL (2x) - llama-3.3-70b-instruct | NVIDIA H200 NVL 2x 280GB | llama-3.3-70b-instruct meta-llama | 5,005.29 | 11,042.39 | 0.03 |
| 11/2/2025 | NVIDIA A100 80GB PCIe (2x) - gpt-oss-120b | NVIDIA A100 80GB PCIe 2x 160GB | gpt-oss-120b openai | 1,673.99 | 5,556.37 | 0.02 |
| 11/2/2025 | NVIDIA A100 80GB PCIe (2x) - gemma-3-27b-it | NVIDIA A100 80GB PCIe 2x 160GB | gemma-3-27b-it google | 1,834.30 | 4,909.53 | 0.03 |
| 10/26/2025 | NVIDIA H20 (8x) - deepseek-v3.1 | NVIDIA H20 8x 760GB | deepseek-v3.1 deepseek-ai | 865.08 | 4,142.63 | 0.16 |
| 10/25/2025 | NVIDIA H20 (8x) - llama-3.3-70b-instruct (High Throughput) | NVIDIA H20 8x 760GB | llama-3.3-70b-instruct meta-llama | 5,091.04 | 7,327.23 | 0.10 |
| 10/24/2025 | NVIDIA H20 (8x) - llama-3.3-70b-instruct | NVIDIA H20 8x 760GB | llama-3.3-70b-instruct meta-llama | 3,370.98 | 6,350.24 | 0.11 |
| 10/24/2025 | NVIDIA H20 (8x) - qwen2.5-vl-72b-instruct | NVIDIA H20 8x 760GB | qwen2.5-vl-72b-instruct qwen | 2,266.04 | 6,375.82 | 0.11 |
| 10/24/2025 | NVIDIA H20 (8x) - qwen3-coder-30b-a3b-instruct | NVIDIA H20 8x 760GB | qwen3-coder-30b-a3b-instruct qwen | 8,987.38 | 30,699.04 | 0.02 |
| 10/24/2025 | NVIDIA H20 (8x) - mistral-nemo-instruct-2407 | NVIDIA H20 8x 760GB | mistral-nemo-instruct-2407 mistralai | 12,605.43 | 24,969.58 | 0.02 |
| 10/24/2025 | NVIDIA H20 (8x) - gemma-3-27b-it | NVIDIA H20 8x 760GB | gemma-3-27b-it google | 6,567.80 | 12,358.95 | 0.05 |
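
The suites above use different GPU counts (1x, 2x, and 8x), so the raw TPS columns are not directly comparable across rows. As a rough illustration, the sketch below divides the reported throughput by the GPU count for three of the gpt-oss-120b rows. The per-GPU figure is not a metric published in the table; it is an assumption made here for illustration, and the names `ROWS`, `per_gpu`, and `summarize` are hypothetical. It also ignores differences in batch settings and interconnect between suites, so treat the result as a coarse normalization rather than a like-for-like comparison.

```python
# Rows copied from the table above: (suite name, GPU count, output TPS, input TPS).
ROWS = [
    ("NVIDIA H100 80GB HBM3 (8x) - gpt-oss-120b", 8, 18672.47, 50200.55),
    ("NVIDIA H200 NVL (2x) - gpt-oss-120b", 2, 3166.06, 11929.37),
    ("NVIDIA A100 80GB PCIe (2x) - gpt-oss-120b", 2, 1673.99, 5556.37),
]


def per_gpu(total_tps: float, gpu_count: int) -> float:
    """Divide a suite-level throughput figure by the number of GPUs."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return round(total_tps / gpu_count, 2)


def summarize(rows):
    """Return (suite name, output TPS per GPU, input TPS per GPU) for each row."""
    return [(name, per_gpu(out, n), per_gpu(inp, n)) for name, n, out, inp in rows]


if __name__ == "__main__":
    for name, out_tps, in_tps in summarize(ROWS):
        print(f"{name}: {out_tps:,.2f} output TPS/GPU, {in_tps:,.2f} input TPS/GPU")
```

Extending `ROWS` with more entries from the table works the same way, as long as the GPU count is taken from the "(Nx)" part of the suite name.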
