NVIDIA H200 NVL (2x) - gpt-oss-120b

November 5, 2025 at 05:18 PM

Dataset: reference (v1.0)

Best Performance

Best Output TPS: 3,166.06 (peak generation speed)
Best Input TPS: 11,929.37 (peak prefill speed)
Best Energy Efficiency: 0.01 kWh/MT (energy cost per 1M tokens)
Best TTFT (P95): 357.63 ms (lowest latency)
Best E2E (P95): 716.58 ms (lowest latency)
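
The report does not say how the kWh/MT figure is derived. As a rough sketch, one plausible definition is average power draw multiplied by the time needed to process one million tokens. The function name, the choice of counting input plus output tokens, and the use of the per-GPU power limit as a stand-in for measured power are all assumptions made for illustration, not the benchmark's actual method.

```python
def kwh_per_million_tokens(avg_power_watts: float, tokens_per_second: float) -> float:
    """Energy in kWh to process 1M tokens at a steady power draw and throughput.

    Hypothetical helper: the report does not state its own formula.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    seconds_per_million = 1_000_000 / tokens_per_second
    joules = avg_power_watts * seconds_per_million
    return joules / 3_600_000  # 1 kWh = 3.6e6 J


# Illustrative inputs taken from the report. Treating both GPUs as running
# at their 600 W power limit is an assumption that gives an upper bound,
# not a measured power figure.
GPU_COUNT = 2
POWER_LIMIT_W = 600.0
BEST_RUN_OUTPUT_TPS = 3166.06  # output TPS of the best output-TPS run
BEST_RUN_INPUT_TPS = 231.42    # input TPS of the same run

upper_bound = kwh_per_million_tokens(
    GPU_COUNT * POWER_LIMIT_W,
    BEST_RUN_OUTPUT_TPS + BEST_RUN_INPUT_TPS,
)
print(f"Upper-bound energy cost: {upper_bound:.3f} kWh/MT")
```

Under these assumptions the result is a ceiling on GPU energy per million tokens; a value derived from measured power would normally be lower.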

Test Matrix Results

Performance across different input/output token combinations and concurrency levels

Input Tokens | Output Tokens | Concurrency | Output TPS | Input TPS | Energy Cost (kWh/MT) | TTFT Mean (ms) | TTFT P95 (ms) | E2E P95 (ms) | Success Rate
Best Run for Output TPS:
128 | 2,048 | 64x | 3,166.06 | 231.42 | 0.07 | 2,794.33 | 4,892.25 | 34,597.65 | 100.0%
128 | 512 | 1x | 139.50 | 46.25 | 0.57 | 767.40 | 767.40 | 2,637.14 | 100.0%
128 | 512 | 2x | 282.08 | 86.00 | 0.29 | 620.30 | 756.09 | 2,899.47 | 100.0%
128 | 512 | 4x | 452.54 | 145.68 | 0.22 | 825.13 | 958.61 | 3,477.27 | 100.0%
128 | 512 | 8x | 728.63 | 243.19 | 0.15 | 1,046.76 | 1,494.41 | 4,202.05 | 100.0%
128 | 512 | 16x | 1,056.09 | 390.94 | 0.12 | 1,688.32 | 2,952.67 | 5,214.76 | 100.0%
128 | 512 | 32x | 1,397.37 | 503.56 | 0.09 | 3,343.76 | 4,820.62 | 8,093.25 | 100.0%
128 | 512 | 64x | 2,251.33 | 874.39 | 0.07 | 3,291.33 | 6,009.23 | 9,095.05 | 98.4%
128 | 512 | 128x | 2,492.75 | 942.62 | 0.07 | 6,188.37 | 12,018.29 | 16,337.42 | 100.0%
128 | 512 | 256x | 2,286.84 | 848.73 | 0.07 | 15,823.03 | 28,945.04 | 34,137.99 | 99.6%
128 | 1,024 | 1x | 172.52 | 22.90 | 0.66 | 553.01 | 553.01 | 5,326.20 | 100.0%
128 | 1,024 | 2x | 317.55 | 43.19 | 0.35 | 639.32 | 713.64 | 5,785.12 | 100.0%
128 | 1,024 | 4x | 514.89 | 74.90 | 0.24 | 1,032.72 | 1,380.33 | 6,774.34 | 100.0%
128 | 1,024 | 8x | 802.35 | 124.30 | 0.18 | 1,206.45 | 2,009.11 | 8,242.68 | 100.0%
128 | 1,024 | 16x | 1,267.33 | 196.48 | 0.13 | 1,592.59 | 2,526.73 | 10,444.02 | 100.0%
128 | 1,024 | 32x | 1,898.29 | 312.30 | 0.10 | 1,847.92 | 3,358.79 | 13,031.04 | 100.0%
128 | 1,024 | 64x | 2,988.10 | 457.78 | 0.07 | 2,318.95 | 4,563.89 | 17,244.77 | 100.0%
128 | 1,024 | 128x | 3,035.63 | 474.81 | 0.07 | 10,231.29 | 20,242.73 | 32,968.62 | 100.0%
128 | 1,024 | 256x | 3,040.37 | 476.34 | 0.07 | 25,402.61 | 50,855.47 | 64,192.32 | 100.0%
128 | 1,024 | 512x | 2,948.89 | 469.45 | 0.07 | 56,805.81 | 114,072.15 | 126,681.55 | 100.0%
128 | 2,048 | 1x | 169.85 | 13.15 | 0.62 | 978.15 | 978.15 | 9,270.99 | 100.0%
128 | 2,048 | 2x | 246.91 | 22.90 | 0.45 | 661.51 | 903.18 | 10,610.19 | 100.0%
128 | 2,048 | 4x | 542.97 | 37.44 | 0.26 | 892.07 | 1,456.73 | 13,560.76 | 100.0%
128 | 2,048 | 8x | 839.08 | 61.60 | 0.19 | 1,116.59 | 2,041.58 | 16,643.81 | 100.0%
128 | 2,048 | 16x | 1,313.39 | 97.27 | 0.14 | 1,442.19 | 3,477.07 | 21,150.40 | 100.0%
128 | 2,048 | 32x | 2,021.59 | 153.77 | 0.10 | 1,797.66 | 2,948.48 | 26,551.93 | 100.0%
128 | 2,048 | 128x | 3,163.76 | 232.05 | 0.07 | 17,907.02 | 38,204.59 | 68,597.57 | 100.0%
512 | 128 | 1x | 99.02 | 691.77 | 0.17 | 357.63 | 357.63 | 716.58 | 100.0%
512 | 128 | 2x | 71.16 | 1,264.29 | 0.07 | 631.63 | 673.00 | 782.80 | 100.0%
512 | 512 | 1x | 165.01 | 182.29 | 0.33 | 359.44 | 359.44 | 2,720.90 | 100.0%
512 | 512 | 2x | 283.03 | 340.52 | 0.18 | 594.56 | 706.28 | 2,918.81 | 100.0%
512 | 512 | 4x | 499.71 | 572.34 | 0.13 | 597.56 | 714.99 | 3,449.06 | 100.0%
512 | 512 | 8x | 746.37 | 939.80 | 0.09 | 1,055.14 | 1,673.32 | 4,183.24 | 100.0%
512 | 512 | 16x | 1,179.50 | 1,509.03 | 0.07 | 1,331.72 | 2,255.26 | 5,221.01 | 100.0%
512 | 512 | 32x | 1,700.76 | 2,318.56 | 0.05 | 1,849.61 | 2,908.99 | 6,783.01 | 100.0%
512 | 512 | 64x | 1,973.21 | 2,772.51 | 0.04 | 5,224.63 | 7,396.21 | 11,309.15 | 100.0%
512 | 512 | 128x | 2,462.84 | 3,514.96 | 0.04 | 6,316.75 | 12,005.54 | 16,870.40 | 100.0%
512 | 512 | 256x | 2,566.01 | 3,527.03 | 0.04 | 13,555.72 | 26,755.81 | 32,251.16 | 100.0%
512 | 512 | 512x | 2,383.92 | 3,331.80 | 0.04 | 31,117.04 | 62,341.32 | 68,201.75 | 100.0%
512 | 1,024 | 1x | 176.77 | 92.19 | 0.40 | 421.03 | 421.03 | 5,379.04 | 100.0%
512 | 1,024 | 2x | 285.41 | 151.54 | 0.27 | 1,281.18 | 1,381.54 | 6,557.91 | 100.0%
512 | 1,024 | 4x | 464.46 | 247.50 | 0.19 | 1,736.84 | 1,901.19 | 7,982.94 | 100.0%
512 | 1,024 | 8x | 733.05 | 456.28 | 0.14 | 1,356.45 | 2,029.47 | 8,644.40 | 100.0%
512 | 1,024 | 16x | 1,172.79 | 760.96 | 0.10 | 1,295.31 | 2,387.09 | 10,380.17 | 100.0%
512 | 1,024 | 32x | 1,900.10 | 1,195.31 | 0.07 | 1,741.47 | 3,161.63 | 13,228.61 | 100.0%
512 | 1,024 | 64x | 2,957.82 | 1,804.23 | 0.05 | 2,154.10 | 4,199.27 | 17,148.13 | 100.0%
512 | 1,024 | 128x | 3,000.67 | 1,857.81 | 0.05 | 9,632.09 | 19,761.48 | 32,959.14 | 100.0%
512 | 1,024 | 256x | 2,954.27 | 1,773.83 | 0.05 | 25,528.73 | 53,707.94 | 66,922.13 | 100.0%
512 | 2,048 | 1x | 182.85 | 70.20 | 0.45 | 407.32 | 407.32 | 7,057.89 | 100.0%
512 | 2,048 | 2x | 332.59 | 85.64 | 0.31 | 670.65 | 802.25 | 11,606.98 | 100.0%
512 | 2,048 | 4x | 565.75 | 144.99 | 0.21 | 823.05 | 1,086.72 | 13,630.64 | 100.0%
512 | 2,048 | 8x | 763.77 | 238.28 | 0.17 | 1,039.03 | 1,818.08 | 16,554.54 | 100.0%
512 | 2,048 | 16x | 1,263.76 | 377.76 | 0.12 | 1,216.89 | 2,264.92 | 20,957.71 | 100.0%
512 | 2,048 | 32x | 2,026.90 | 587.32 | 0.09 | 1,744.28 | 3,304.92 | 27,023.75 | 100.0%
512 | 2,048 | 64x | 3,134.69 | 906.92 | 0.06 | 2,063.75 | 3,609.42 | 34,676.29 | 100.0%
512 | 2,048 | 128x | 3,003.04 | 886.40 | 0.06 | 19,262.61 | 40,598.16 | 70,035.44 | 100.0%
1,024 | 512 | 1x | 166.23 | 367.51 | 0.21 | 405.06 | 405.06 | 2,652.73 | 100.0%
1,024 | 512 | 2x | 197.96 | 663.50 | 0.13 | 1,300.94 | 1,454.04 | 2,938.66 | 100.0%
1,024 | 512 | 4x | 343.37 | 1,121.34 | 0.09 | 1,520.08 | 1,985.17 | 3,482.97 | 100.0%
1,024 | 512 | 8x | 664.23 | 1,790.40 | 0.06 | 1,356.84 | 2,139.20 | 4,354.32 | 100.0%
1,024 | 512 | 16x | 1,032.86 | 2,884.34 | 0.05 | 1,825.23 | 2,989.49 | 5,404.87 | 100.0%
1,024 | 512 | 32x | 1,311.85 | 3,547.07 | 0.04 | 4,061.34 | 5,308.46 | 8,823.33 | 100.0%
1,024 | 512 | 64x | 2,246.52 | 6,627.05 | 0.03 | 3,463.64 | 6,357.58 | 9,321.21 | 100.0%
1,024 | 512 | 128x | 2,298.11 | 6,477.09 | 0.03 | 7,050.46 | 13,584.02 | 18,158.37 | 100.0%
1,024 | 512 | 256x | 2,214.29 | 6,181.30 | 0.03 | 15,761.71 | 30,923.46 | 36,499.18 | 100.0%
1,024 | 1,024 | 1x | 144.68 | 183.44 | 0.40 | 1,351.32 | 1,351.32 | 5,314.42 | 100.0%
1,024 | 1,024 | 2x | 305.72 | 331.69 | 0.20 | 730.52 | 816.79 | 5,883.94 | 100.0%
1,024 | 1,024 | 4x | 445.82 | 536.88 | 0.14 | 1,899.53 | 2,621.89 | 7,270.81 | 100.0%
1,024 | 1,024 | 8x | 810.81 | 899.21 | 0.10 | 1,464.95 | 2,701.46 | 8,691.47 | 100.0%
1,024 | 1,024 | 16x | 1,306.11 | 1,473.41 | 0.07 | 1,363.74 | 2,651.95 | 10,617.69 | 100.0%
1,024 | 1,024 | 32x | 2,008.35 | 2,362.75 | 0.05 | 2,239.77 | 4,233.58 | 13,193.59 | 100.0%
1,024 | 1,024 | 64x | 2,955.19 | 3,568.21 | 0.04 | 2,477.71 | 4,171.07 | 17,296.41 | 100.0%
1,024 | 1,024 | 128x | 3,028.37 | 3,646.65 | 0.04 | 10,114.91 | 19,759.11 | 33,295.32 | 100.0%
1,024 | 1,024 | 256x | 3,011.90 | 3,591.26 | 0.04 | 25,548.28 | 51,316.16 | 64,949.26 | 100.0%
1,024 | 2,048 | 1x | 167.82 | 92.08 | 0.46 | 1,410.68 | 1,410.68 | 10,587.50 | 100.0%
1,024 | 2,048 | 2x | 337.20 | 169.21 | 0.25 | 625.29 | 741.48 | 11,537.62 | 100.0%
1,024 | 2,048 | 4x | 547.77 | 287.26 | 0.18 | 1,241.87 | 1,637.10 | 13,598.42 | 100.0%
1,024 | 2,048 | 8x | 889.23 | 471.04 | 0.13 | 1,466.98 | 2,561.77 | 16,602.01 | 100.0%
1,024 | 2,048 | 16x | 1,354.83 | 738.96 | 0.10 | 1,681.72 | 2,918.49 | 21,190.79 | 100.0%
1,024 | 2,048 | 32x | 2,120.64 | 1,162.63 | 0.07 | 2,138.18 | 4,072.99 | 26,896.69 | 100.0%
1,024 | 2,048 | 64x | 2,971.46 | 1,682.55 | 0.05 | 4,190.01 | 6,090.61 | 37,166.37 | 100.0%
1,024 | 2,048 | 128x | 3,058.12 | 1,747.26 | 0.05 | 18,814.37 | 39,916.83 | 70,691.92 | 100.0%
1,024 | 2,048 | 256x | 3,041.67 | 1,701.32 | 0.05 | 53,040.41 | 112,879.99 | 143,143.94 | 100.0%
2,048 | 128 | 1x | 22.27 | 1,327.94 | 0.06 | 1,315.19 | 1,315.19 | 1,480.55 | 100.0%
2,048 | 128 | 2x | 67.46 | 4,637.87 | 0.01 | 682.93 | 731.36 | 842.35 | 100.0%
2,048 | 128 | 4x | 46.08 | 3,852.94 | 0.03 | 859.10 | 908.50 | 1,008.71 | 50.0%
2,048 | 512 | 1x | 160.07 | 727.54 | 0.14 | 448.77 | 448.77 | 2,704.97 | 100.0%
2,048 | 512 | 2x | 167.17 | 1,307.64 | 0.08 | 1,594.84 | 2,179.02 | 2,989.95 | 100.0%
2,048 | 512 | 4x | 416.36 | 2,249.35 | 0.05 | 1,031.49 | 1,438.50 | 3,477.33 | 100.0%
2,048 | 512 | 8x | 505.06 | 3,598.16 | 0.04 | 1,441.67 | 2,355.75 | 4,330.46 | 100.0%
2,048 | 512 | 16x | 581.24 | 3,604.63 | 0.04 | 4,958.51 | 5,733.85 | 8,630.60 | 100.0%
2,048 | 512 | 32x | 1,203.31 | 7,439.31 | 0.02 | 3,653.32 | 5,218.25 | 8,301.14 | 100.0%
2,048 | 512 | 64x | 2,075.49 | 11,929.37 | 0.02 | 4,275.16 | 6,441.14 | 10,292.47 | 100.0%
2,048 | 512 | 128x | 2,035.00 | 11,873.45 | 0.02 | 7,956.69 | 15,248.35 | 19,847.94 | 99.2%
2,048 | 1,024 | 1x | 169.57 | 369.16 | 0.20 | 643.96 | 643.96 | 5,329.50 | 100.0%
2,048 | 1,024 | 2x | 299.84 | 677.68 | 0.13 | 933.11 | 1,248.12 | 5,778.19 | 100.0%
2,048 | 1,024 | 4x | 496.85 | 1,146.89 | 0.09 | 824.25 | 964.35 | 6,823.11 | 100.0%
2,048 | 1,024 | 8x | 635.69 | 1,879.44 | 0.07 | 1,130.86 | 1,665.57 | 8,314.42 | 100.0%
2,048 | 1,024 | 16x | 1,140.41 | 2,945.76 | 0.05 | 1,385.48 | 2,663.91 | 10,566.95 | 100.0%
2,048 | 1,024 | 32x | 1,703.38 | 4,575.98 | 0.03 | 2,118.62 | 4,059.98 | 13,594.79 | 100.0%
2,048 | 1,024 | 64x | 2,740.73 | 7,064.69 | 0.02 | 2,924.97 | 5,461.09 | 17,544.33 | 100.0%
2,048 | 1,024 | 128x | 2,880.05 | 7,116.62 | 0.02 | 10,296.59 | 21,638.17 | 34,077.40 | 100.0%
2,048 | 1,024 | 256x | 2,616.20 | 6,460.15 | 0.03 | 28,170.45 | 59,963.84 | 72,883.81 | 100.0%
2,048 | 2,048 | 1x | 176.15 | 185.49 | 0.32 | 962.39 | 962.39 | 10,607.08 | 100.0%
2,048 | 2,048 | 2x | 336.50 | 338.57 | 0.19 | 623.64 | 691.53 | 11,570.42 | 100.0%
2,048 | 2,048 | 4x | 443.50 | 593.68 | 0.14 | 985.30 | 1,477.49 | 13,198.39 | 100.0%
2,048 | 2,048 | 8x | 659.95 | 1,001.60 | 0.10 | 1,138.36 | 1,831.02 | 15,605.57 | 100.0%
2,048 | 2,048 | 16x | 959.61 | 1,457.37 | 0.08 | 1,324.57 | 2,263.83 | 21,396.93 | 100.0%
2,048 | 2,048 | 32x | 1,736.78 | 2,315.51 | 0.05 | 2,302.80 | 3,895.19 | 26,949.05 | 100.0%
2,048 | 2,048 | 64x | 2,713.97 | 3,373.62 | 0.04 | 3,129.56 | 7,626.58 | 36,870.71 | 100.0%
2,048 | 2,048 | 128x | 3,044.21 | 3,510.80 | 0.04 | 18,074.97 | 40,007.85 | 70,377.16 | 100.0%
2,048 | 2,048 | 256x | 3,039.96 | 3,569.92 | 0.04 | 49,029.01 | 107,612.06 | 135,564.76 | 100.0%
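
One way to read the matrix above is to look for the concurrency level at which output throughput stops improving. The sketch below does this for the 128-input / 1,024-output series, using the values from the table. The `saturation_point` helper and its 5% tolerance are illustrative choices, not part of the benchmark tooling.

```python
from typing import List, Tuple

# (concurrency, output tokens/s) pairs copied from the
# 128-input / 1,024-output rows of the results table.
SERIES_128_1024: List[Tuple[int, float]] = [
    (1, 172.52), (2, 317.55), (4, 514.89), (8, 802.35), (16, 1267.33),
    (32, 1898.29), (64, 2988.10), (128, 3035.63), (256, 3040.37), (512, 2948.89),
]


def saturation_point(series: List[Tuple[int, float]],
                     tolerance: float = 0.05) -> Tuple[int, float]:
    """Return the lowest concurrency whose throughput is within
    `tolerance` of the series peak (hypothetical helper)."""
    if not series:
        raise ValueError("series must not be empty")
    peak = max(tps for _, tps in series)
    threshold = peak * (1.0 - tolerance)
    for concurrency, tps in sorted(series):
        if tps >= threshold:
            return concurrency, tps
    # Unreachable: the peak entry always meets the threshold.
    raise RuntimeError("no entry met the threshold")


concurrency, tps = saturation_point(SERIES_128_1024)
print(f"Throughput saturates at {concurrency}x ({tps:.2f} output tokens/s)")
```

For this series the helper selects 64x. In the table, concurrency above that level adds little output throughput while TTFT and E2E P95 latencies grow several-fold.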

Hardware Configuration

GPU Manufacturer: NVIDIA
GPU Model: NVIDIA H200 NVL
GPU Count: 2
GPU Memory (Total): 280 GB
GPU Driver: 580.95.05
CUDA Version: Unknown
Compute Capability: 9.0
Power Limit (per GPU): 600 W
CPU Model: Intel(R) Xeon(R) 6960P
RAM: 2,267 GB

Software Configuration

Inference Framework: vLLM
Framework Version: 0.11.0
OS: Ubuntu
OS Version: 22.04.5 LTS (Jammy Jellyfish)
Kernel Version: 5.15.0-88-generic
Python Version: 3.10.12

Model Configuration

Provider: openai
Model Name: gpt-oss-120b
Quantization: MXFP4

Inference Configuration

Runtime parameters used across all benchmark runs

Max Model Length: Unknown
Tensor Parallel Size: 1
Pipeline Parallel Size: 1
GPU Memory Utilization: 0.80
Temperature: 0.70
Top-P: 1.00
Top-K: -1
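
For readers who want to approximate this setup, the parameters above map onto vLLM's offline Python API roughly as sketched below. This is not the benchmark harness itself. The model identifier `openai/gpt-oss-120b` is an assumed Hugging Face id built from the provider and model name, `max_tokens` is a hypothetical per-request cap standing in for the Output Tokens column, and the maximum model length is left unset because the report lists it as unknown.

```python
from typing import Any, Dict

# Assumed Hugging Face model id (report lists provider "openai", model "gpt-oss-120b").
MODEL_ID = "openai/gpt-oss-120b"


def engine_kwargs() -> Dict[str, Any]:
    """Engine arguments mirroring the report's Inference Configuration."""
    return {
        "model": MODEL_ID,
        "tensor_parallel_size": 1,
        "pipeline_parallel_size": 1,
        "gpu_memory_utilization": 0.80,
    }


def sampling_kwargs(max_tokens: int) -> Dict[str, Any]:
    """Sampling arguments from the report; max_tokens is a hypothetical cap."""
    return {
        "temperature": 0.70,
        "top_p": 1.00,
        "top_k": -1,  # -1 disables top-k filtering
        "max_tokens": max_tokens,
    }


def build_engine(max_tokens: int = 512):
    """Construct the engine and sampling params; needs vLLM and a suitable GPU."""
    from vllm import LLM, SamplingParams  # imported lazily so the helpers above work without vLLM

    llm = LLM(**engine_kwargs())
    params = SamplingParams(**sampling_kwargs(max_tokens))
    return llm, params
```

Calling `build_engine()` and then `llm.generate(prompts, params)` would run generation with these settings. Keeping the arguments in plain dictionaries makes it easy to compare them against the reported values before loading the model.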