NVIDIA H100 80GB HBM3 (8x) - gpt-oss-120b

November 13, 2025 at 08:07 AM

Dataset: reference (v1.0)

Best Performance

Best Output TPS: 18,672.47 (peak generation speed)
Best Input TPS: 50,200.55 (peak prefill speed)
Best Energy Efficiency: 0.02 kWh/MT (energy cost per 1M tokens)
Best TTFT (P95): 1,364.94 ms (lowest latency)
Best E2E (P95): 3,219.48 ms (lowest latency)
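
To put the energy figure in more familiar units, the sketch below converts a kWh/MT value into joules per token and into total energy for a given token count. It assumes that "MT" means one million tokens, as the card's description suggests; the function names are illustrative and not part of the benchmark tooling.

```python
JOULES_PER_KWH = 3.6e6        # 1 kWh = 3.6 MJ
TOKENS_PER_MT = 1_000_000     # assumption: "MT" = one million tokens


def joules_per_token(kwh_per_mt: float) -> float:
    """Convert an energy cost in kWh per million tokens to joules per token."""
    return kwh_per_mt * JOULES_PER_KWH / TOKENS_PER_MT


def energy_kwh(kwh_per_mt: float, tokens: int) -> float:
    """Energy in kWh needed to process `tokens` tokens at the given cost."""
    return kwh_per_mt * tokens / TOKENS_PER_MT


best_cost = 0.02  # Best Energy Efficiency reported above, in kWh/MT
print(f"{joules_per_token(best_cost):.3f} J per token")
print(f"{energy_kwh(best_cost, 250_000_000):.1f} kWh for 250M tokens")
```

At the best reported efficiency this works out to roughly 0.072 J per token. Whether the benchmark counts input tokens, output tokens, or both in the denominator is not stated in the report, so the result is only as precise as that definition.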

Test Matrix Results

Performance across different input/output token combinations and concurrency levels

Input Tokens | Output Tokens | Concurrency | Output TPS | Input TPS | Energy Cost (kWh/MT) | TTFT Mean (ms) | TTFT P95 (ms) | E2E P95 (ms) | Success Rate
Best Run for Output TPS
2,048 | 2,048 | 1024x | 18,672.47 | 21,871.30 | 0.03 | 14,591.20 | 21,056.79 | 82,870.74 | 100.0%
128 | 512 | 1x | 31.22 | 9.43 | 7.11 | 3,077.02 | 3,077.02 | 12,940.58 | 100.0%
128 | 512 | 2x | 53.01 | 18.74 | 4.11 | 4,068.08 | 4,750.31 | 13,319.07 | 100.0%
128 | 512 | 4x | 112.84 | 38.63 | 2.29 | 3,718.67 | 6,181.44 | 13,144.89 | 100.0%
128 | 512 | 8x | 242.34 | 78.24 | 1.22 | 2,956.45 | 5,256.42 | 13,112.60 | 100.0%
128 | 512 | 16x | 439.37 | 155.77 | 0.72 | 3,818.20 | 8,686.77 | 13,221.55 | 100.0%
128 | 512 | 32x | 808.63 | 307.13 | 0.42 | 3,928.24 | 7,642.88 | 13,322.80 | 100.0%
128 | 512 | 64x | 1,707.19 | 601.86 | 0.24 | 3,750.13 | 7,458.30 | 13,479.67 | 100.0%
128 | 512 | 128x | 3,187.28 | 1,191.72 | 0.15 | 4,158.98 | 8,525.86 | 13,579.68 | 100.0%
128 | 512 | 256x | 6,142.45 | 2,267.84 | 0.09 | 4,280.16 | 8,383.73 | 14,241.42 | 100.0%
128 | 512 | 512x | 10,677.49 | 4,090.53 | 0.06 | 4,829.05 | 9,994.34 | 15,223.58 | 99.6%
128 | 512 | 1024x | 10,809.45 | 4,060.51 | 0.06 | 5,443.74 | 11,304.59 | 17,904.88 | 99.8%
128 | 1,024 | 1x | 36.35 | 4.75 | 7.18 | 2,201.79 | 2,201.79 | 25,695.08 | 100.0%
128 | 1,024 | 2x | 73.26 | 9.75 | 3.83 | 2,136.36 | 2,293.36 | 25,622.29 | 100.0%
128 | 1,024 | 4x | 123.22 | 19.25 | 2.30 | 5,443.93 | 9,159.02 | 26,376.03 | 100.0%
128 | 1,024 | 8x | 237.82 | 38.73 | 1.26 | 3,619.85 | 6,536.87 | 26,499.87 | 100.0%
128 | 1,024 | 16x | 534.53 | 77.73 | 0.70 | 3,125.70 | 6,381.25 | 26,521.72 | 100.0%
128 | 1,024 | 32x | 983.72 | 153.51 | 0.42 | 3,856.45 | 7,724.00 | 26,684.32 | 100.0%
128 | 1,024 | 64x | 1,952.67 | 301.73 | 0.25 | 4,120.38 | 8,139.09 | 26,952.46 | 100.0%
128 | 1,024 | 128x | 3,870.79 | 601.06 | 0.15 | 4,139.06 | 8,618.73 | 27,093.94 | 100.0%
128 | 1,024 | 256x | 7,393.02 | 1,152.03 | 0.09 | 4,144.45 | 8,731.05 | 28,085.83 | 100.0%
128 | 1,024 | 512x | 13,109.68 | 2,088.96 | 0.06 | 4,616.80 | 10,064.19 | 29,988.59 | 100.0%
128 | 1,024 | 1024x | 13,387.14 | 2,113.53 | 0.05 | 5,436.52 | 12,043.20 | 34,013.07 | 100.0%
128 | 2,048 | 1x | 37.28 | 2.41 | 7.37 | 4,010.73 | 4,010.73 | 50,586.54 | 100.0%
128 | 2,048 | 2x | 68.77 | 4.80 | 4.03 | 2,748.56 | 3,062.12 | 51,777.39 | 100.0%
128 | 2,048 | 4x | 134.44 | 9.87 | 2.27 | 4,301.92 | 7,133.85 | 51,483.21 | 100.0%
128 | 2,048 | 8x | 257.55 | 19.33 | 1.29 | 3,158.81 | 4,844.99 | 53,120.02 | 100.0%
128 | 2,048 | 16x | 551.39 | 38.66 | 0.72 | 3,067.25 | 4,870.34 | 53,325.36 | 100.0%
128 | 2,048 | 32x | 973.88 | 75.73 | 0.45 | 3,704.26 | 8,225.97 | 54,121.08 | 100.0%
128 | 2,048 | 64x | 2,070.60 | 150.22 | 0.26 | 4,172.48 | 9,610.59 | 54,204.33 | 100.0%
128 | 2,048 | 128x | 4,070.38 | 298.51 | 0.16 | 3,830.38 | 7,843.86 | 54,691.97 | 100.0%
128 | 2,048 | 256x | 6,340.65 | 475.44 | 0.11 | 4,549.79 | 10,167.82 | 68,610.58 | 100.0%
128 | 2,048 | 512x | 11,752.38 | 891.82 | 0.07 | 4,908.31 | 10,435.49 | 72,379.55 | 100.0%
128 | 2,048 | 1024x | 17,803.15 | 1,334.70 | 0.05 | 5,347.72 | 11,446.77 | 71,066.57 | 100.0%
512 | 128 | 1x | 16.19 | 131.67 | 1.96 | 2,005.77 | 2,005.77 | 3,767.62 | 100.0%
512 | 128 | 2x | 32.03 | 303.54 | 0.88 | 1,967.31 | 2,170.30 | 3,274.48 | 100.0%
512 | 128 | 4x | 50.70 | 453.88 | 0.64 | 1,895.56 | 2,774.46 | 3,272.19 | 75.0%
512 | 512 | 1x | 34.61 | 38.75 | 4.03 | 1,766.49 | 1,766.49 | 12,800.64 | 100.0%
512 | 512 | 2x | 61.54 | 78.10 | 2.29 | 2,975.75 | 3,133.87 | 12,731.75 | 100.0%
512 | 512 | 4x | 109.72 | 131.13 | 1.34 | 2,588.61 | 3,403.07 | 14,794.88 | 100.0%
512 | 512 | 8x | 192.46 | 254.00 | 0.80 | 3,747.11 | 5,588.54 | 15,533.13 | 100.0%
512 | 512 | 16x | 372.09 | 502.63 | 0.46 | 3,367.26 | 6,258.86 | 15,778.67 | 100.0%
512 | 512 | 32x | 737.70 | 994.89 | 0.25 | 3,625.61 | 7,696.62 | 16,020.80 | 100.0%
512 | 512 | 64x | 1,368.49 | 1,957.62 | 0.15 | 4,236.01 | 7,802.63 | 16,233.07 | 100.0%
512 | 512 | 128x | 2,681.41 | 3,878.39 | 0.09 | 4,337.84 | 7,787.74 | 16,329.10 | 100.0%
512 | 512 | 256x | 5,378.37 | 7,453.19 | 0.06 | 4,332.63 | 7,673.25 | 16,932.95 | 100.0%
512 | 512 | 512x | 9,927.11 | 13,726.87 | 0.04 | 5,187.63 | 9,023.13 | 18,224.20 | 100.0%
512 | 512 | 1024x | 12,940.84 | 17,877.99 | 0.03 | 6,822.27 | 11,975.33 | 20,922.65 | 100.0%
512 | 1,024 | 1x | 37.26 | 19.31 | 5.22 | 1,702.51 | 1,702.51 | 25,680.21 | 100.0%
512 | 1,024 | 2x | 69.56 | 38.07 | 2.96 | 2,940.78 | 3,953.92 | 26,110.94 | 100.0%
512 | 1,024 | 4x | 142.88 | 75.11 | 1.60 | 2,079.75 | 3,173.83 | 26,324.82 | 100.0%
512 | 1,024 | 8x | 266.26 | 148.87 | 0.89 | 3,586.80 | 4,818.39 | 26,529.32 | 100.0%
512 | 1,024 | 16x | 498.32 | 296.25 | 0.53 | 2,641.61 | 3,766.36 | 26,768.24 | 100.0%
512 | 1,024 | 32x | 957.37 | 590.10 | 0.31 | 3,795.71 | 7,229.33 | 27,022.98 | 100.0%
512 | 1,024 | 64x | 1,887.02 | 1,161.15 | 0.18 | 4,183.92 | 7,757.14 | 27,406.28 | 100.0%
512 | 1,024 | 128x | 3,714.66 | 2,287.07 | 0.11 | 3,705.88 | 6,989.96 | 27,717.40 | 100.0%
512 | 1,024 | 256x | 7,292.38 | 4,391.45 | 0.07 | 4,088.79 | 7,593.77 | 28,818.02 | 100.0%
512 | 1,024 | 512x | 13,667.35 | 8,185.37 | 0.05 | 4,648.08 | 8,559.81 | 30,679.13 | 100.0%
512 | 1,024 | 1024x | 18,422.95 | 11,014.94 | 0.03 | 6,126.06 | 11,300.22 | 36,522.04 | 100.0%
512 | 2,048 | 1x | 38.72 | 9.71 | 6.07 | 1,709.28 | 1,709.28 | 51,105.28 | 100.0%
512 | 2,048 | 2x | 77.82 | 19.73 | 3.27 | 1,962.13 | 2,203.40 | 50,435.72 | 100.0%
512 | 2,048 | 4x | 125.69 | 37.14 | 2.06 | 2,457.71 | 4,063.75 | 53,245.15 | 100.0%
512 | 2,048 | 8x | 260.50 | 73.64 | 1.10 | 3,561.26 | 4,693.07 | 53,634.29 | 100.0%
512 | 2,048 | 16x | 473.73 | 148.82 | 0.67 | 2,710.07 | 5,243.89 | 53,316.63 | 100.0%
512 | 2,048 | 32x | 973.01 | 296.08 | 0.37 | 3,531.13 | 6,142.36 | 53,870.40 | 100.0%
512 | 2,048 | 64x | 1,865.24 | 540.15 | 0.23 | 3,727.68 | 7,427.90 | 58,924.52 | 100.0%
512 | 2,048 | 128x | 3,387.61 | 986.35 | 0.14 | 3,834.05 | 7,080.87 | 64,393.68 | 100.0%
512 | 2,048 | 256x | 6,539.64 | 1,906.24 | 0.09 | 4,132.68 | 7,975.19 | 66,452.59 | 100.0%
512 | 2,048 | 512x | 12,142.72 | 3,500.36 | 0.06 | 4,849.86 | 9,100.78 | 72,115.89 | 100.0%
512 | 2,048 | 1024x | 16,881.70 | 4,859.38 | 0.04 | 7,134.87 | 13,065.30 | 74,691.45 | 100.0%
1,024 | 512 | 1x | 31.90 | 74.40 | 2.74 | 2,459.80 | 2,459.80 | 13,103.45 | 100.0%
1,024 | 512 | 2x | 62.83 | 148.09 | 1.50 | 2,524.26 | 2,982.04 | 13,182.09 | 100.0%
1,024 | 512 | 4x | 106.22 | 289.96 | 0.83 | 4,069.58 | 5,583.48 | 13,475.73 | 100.0%
1,024 | 512 | 8x | 211.45 | 571.13 | 0.46 | 4,021.75 | 6,156.26 | 13,710.42 | 100.0%
1,024 | 512 | 16x | 424.65 | 1,144.37 | 0.27 | 3,984.15 | 5,791.10 | 13,719.03 | 100.0%
1,024 | 512 | 32x | 807.81 | 2,267.30 | 0.15 | 4,156.02 | 7,436.66 | 13,828.89 | 100.0%
1,024 | 512 | 64x | 1,591.82 | 4,488.01 | 0.09 | 4,610.53 | 8,769.62 | 13,975.35 | 100.0%
1,024 | 512 | 128x | 3,044.34 | 8,569.54 | 0.06 | 4,553.99 | 7,904.51 | 14,498.94 | 99.2%
1,024 | 512 | 256x | 5,530.89 | 15,918.78 | 0.04 | 5,492.89 | 9,795.60 | 15,625.76 | 99.6%
1,024 | 512 | 512x | 10,024.74 | 28,435.84 | 0.03 | 6,587.42 | 10,515.66 | 17,514.71 | 100.0%
1,024 | 512 | 1024x | 11,553.80 | 32,773.70 | 0.02 | 9,307.16 | 15,200.77 | 22,178.16 | 99.7%
1,024 | 1,024 | 1x | 32.10 | 37.62 | 4.22 | 4,948.67 | 4,948.67 | 25,918.23 | 100.0%
1,024 | 1,024 | 2x | 71.87 | 74.90 | 2.17 | 2,220.01 | 2,276.56 | 26,045.60 | 100.0%
1,024 | 1,024 | 4x | 134.92 | 149.75 | 1.23 | 3,701.63 | 5,062.49 | 26,097.03 | 100.0%
1,024 | 1,024 | 8x | 256.89 | 291.04 | 0.71 | 4,154.29 | 5,688.76 | 26,909.33 | 100.0%
1,024 | 1,024 | 16x | 507.06 | 583.44 | 0.39 | 4,161.71 | 6,876.91 | 26,913.72 | 100.0%
1,024 | 1,024 | 32x | 990.09 | 1,157.32 | 0.22 | 4,269.72 | 8,008.59 | 27,118.42 | 100.0%
1,024 | 1,024 | 64x | 1,855.76 | 2,283.68 | 0.13 | 4,419.38 | 8,968.40 | 27,479.85 | 100.0%
1,024 | 1,024 | 128x | 3,747.41 | 4,489.55 | 0.08 | 4,306.07 | 7,584.16 | 27,911.64 | 100.0%
1,024 | 1,024 | 256x | 7,168.29 | 8,683.47 | 0.05 | 4,729.05 | 8,829.91 | 28,777.64 | 100.0%
1,024 | 1,024 | 512x | 13,469.81 | 15,866.79 | 0.04 | 5,446.98 | 9,339.39 | 31,434.87 | 100.0%
1,024 | 1,024 | 1024x | 14,863.28 | 17,687.89 | 0.03 | 8,036.43 | 13,298.05 | 38,284.75 | 100.0%
1,024 | 2,048 | 1x | 35.34 | 18.71 | 5.39 | 5,322.01 | 5,322.01 | 52,099.10 | 100.0%
1,024 | 2,048 | 2x | 72.16 | 37.76 | 2.90 | 4,578.91 | 7,086.77 | 51,699.53 | 100.0%
1,024 | 2,048 | 4x | 138.51 | 72.72 | 1.66 | 4,777.69 | 6,566.19 | 53,743.72 | 100.0%
1,024 | 2,048 | 8x | 286.94 | 147.61 | 0.86 | 3,355.72 | 4,878.05 | 53,054.09 | 100.0%
1,024 | 2,048 | 16x | 525.35 | 290.55 | 0.52 | 4,461.39 | 9,856.01 | 54,063.35 | 100.0%
1,024 | 2,048 | 32x | 1,082.02 | 581.19 | 0.29 | 4,019.11 | 6,924.52 | 54,034.35 | 100.0%
1,024 | 2,048 | 64x | 2,018.03 | 1,143.42 | 0.18 | 3,951.20 | 7,211.16 | 54,968.16 | 100.0%
1,024 | 2,048 | 128x | 3,970.46 | 2,266.20 | 0.11 | 4,459.57 | 8,807.74 | 55,423.77 | 100.0%
1,024 | 2,048 | 256x | 7,699.61 | 4,402.35 | 0.07 | 4,934.34 | 9,009.11 | 56,988.94 | 100.0%
1,024 | 2,048 | 512x | 14,166.95 | 8,019.96 | 0.05 | 6,108.63 | 10,214.85 | 62,440.23 | 100.0%
1,024 | 2,048 | 1024x | 17,979.88 | 10,267.40 | 0.03 | 9,691.90 | 16,959.27 | 76,422.89 | 100.0%
2,048 | 128 | 1x | 23.61 | 611.37 | 0.46 | 1,364.94 | 1,364.94 | 3,219.48 | 100.0%
2,048 | 128 | 2x | 14.23 | 595.82 | 0.52 | 2,099.58 | 2,099.58 | 3,246.21 | 50.0%
2,048 | 512 | 1x | 32.72 | 154.79 | 1.57 | 2,421.60 | 2,421.60 | 12,714.26 | 100.0%
2,048 | 512 | 2x | 60.12 | 290.86 | 0.85 | 2,938.37 | 3,467.49 | 13,470.95 | 100.0%
2,048 | 512 | 4x | 117.28 | 582.26 | 0.50 | 3,099.03 | 3,863.92 | 13,455.93 | 100.0%
2,048 | 512 | 8x | 180.91 | 1,175.43 | 0.29 | 4,562.60 | 7,816.41 | 13,310.00 | 100.0%
2,048 | 512 | 16x | 368.04 | 2,266.74 | 0.16 | 4,433.41 | 7,789.38 | 13,796.61 | 100.0%
2,048 | 512 | 32x | 755.76 | 4,574.31 | 0.09 | 4,516.34 | 7,800.66 | 13,647.61 | 100.0%
2,048 | 512 | 64x | 1,457.47 | 8,613.28 | 0.06 | 5,041.78 | 8,266.29 | 14,480.79 | 100.0%
2,048 | 512 | 128x | 2,769.19 | 16,120.56 | 0.04 | 5,645.28 | 9,260.51 | 15,452.64 | 100.0%
2,048 | 512 | 256x | 5,021.59 | 29,872.90 | 0.03 | 6,876.39 | 11,113.59 | 16,646.74 | 100.0%
2,048 | 512 | 512x | 8,418.47 | 50,200.55 | 0.02 | 9,143.15 | 14,351.75 | 19,613.36 | 99.6%
2,048 | 512 | 1024x | 8,273.31 | 48,936.11 | 0.02 | 14,204.02 | 21,115.78 | 27,235.20 | 99.8%
2,048 | 1,024 | 1x | 32.98 | 75.81 | 2.70 | 4,212.75 | 4,212.75 | 25,958.98 | 100.0%
2,048 | 1,024 | 2x | 65.15 | 149.84 | 1.48 | 4,425.84 | 4,497.76 | 26,137.27 | 100.0%
2,048 | 1,024 | 4x | 139.40 | 297.99 | 0.75 | 2,894.42 | 3,410.41 | 26,302.99 | 100.0%
2,048 | 1,024 | 8x | 200.65 | 582.33 | 0.49 | 3,372.38 | 5,789.74 | 26,862.81 | 100.0%
2,048 | 1,024 | 16x | 416.84 | 1,157.91 | 0.27 | 4,355.46 | 8,889.66 | 26,998.09 | 100.0%
2,048 | 1,024 | 32x | 864.25 | 2,305.10 | 0.15 | 4,424.43 | 7,512.36 | 27,104.86 | 100.0%
2,048 | 1,024 | 64x | 1,772.18 | 4,513.52 | 0.09 | 4,540.30 | 8,174.91 | 27,677.38 | 100.0%
2,048 | 1,024 | 128x | 3,579.53 | 8,716.48 | 0.06 | 4,997.28 | 9,111.13 | 28,644.04 | 100.0%
2,048 | 1,024 | 256x | 6,737.22 | 16,657.76 | 0.04 | 6,006.98 | 10,286.20 | 29,925.44 | 100.0%
2,048 | 1,024 | 512x | 12,109.00 | 29,635.29 | 0.03 | 8,028.69 | 12,449.47 | 33,561.80 | 100.0%
2,048 | 1,024 | 1024x | 15,343.33 | 37,646.45 | 0.02 | 14,487.42 | 20,218.99 | 45,000.76 | 100.0%
2,048 | 2,048 | 1x | 38.49 | 38.86 | 3.80 | 2,481.14 | 2,481.14 | 50,636.77 | 100.0%
2,048 | 2,048 | 2x | 74.67 | 75.40 | 1.97 | 2,784.15 | 2,795.01 | 51,974.62 | 100.0%
2,048 | 2,048 | 4x | 104.48 | 151.64 | 1.26 | 3,656.24 | 4,795.75 | 51,600.04 | 100.0%
2,048 | 2,048 | 8x | 192.12 | 292.94 | 0.76 | 3,973.53 | 6,003.97 | 53,418.88 | 100.0%
2,048 | 2,048 | 16x | 372.27 | 583.07 | 0.43 | 4,096.18 | 7,922.12 | 53,626.70 | 100.0%
2,048 | 2,048 | 32x | 874.58 | 1,157.18 | 0.23 | 4,369.15 | 7,449.95 | 54,042.98 | 100.0%
2,048 | 2,048 | 64x | 1,825.17 | 2,263.66 | 0.14 | 5,223.31 | 9,970.30 | 55,233.80 | 100.0%
2,048 | 2,048 | 128x | 3,846.41 | 4,423.24 | 0.08 | 5,591.07 | 9,451.80 | 56,494.14 | 100.0%
2,048 | 2,048 | 256x | 7,204.48 | 8,512.34 | 0.05 | 6,826.63 | 11,870.12 | 58,669.83 | 100.0%
2,048 | 2,048 | 512x | 13,178.61 | 15,408.00 | 0.04 | 9,057.79 | 12,811.69 | 64,681.28 | 100.0%
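
One way to read the concurrency sweep is to divide throughput by the number of concurrent requests. The sketch below does this for the 2,048-input / 2,048-output configuration at 1x and 1024x concurrency, using the Output TPS values reported in the table. It assumes Output TPS is an aggregate over all concurrent requests, which the report does not state explicitly; the helper names are illustrative.

```python
def per_stream_tps(output_tps: float, concurrency: int) -> float:
    """Average generation speed per request, assuming output_tps is an aggregate."""
    return output_tps / concurrency


def scaling_efficiency(tps_at_1x: float, tps_at_nx: float, n: int) -> float:
    """Per-request speed at concurrency n relative to the single-request speed."""
    return per_stream_tps(tps_at_nx, n) / tps_at_1x


# Output TPS for 2,048 input / 2,048 output tokens, copied from the results table.
TPS_AT_1X = 38.49
TPS_AT_1024X = 18_672.47

per_stream = per_stream_tps(TPS_AT_1024X, 1024)
speedup = TPS_AT_1024X / TPS_AT_1X
efficiency = scaling_efficiency(TPS_AT_1X, TPS_AT_1024X, 1024)

print(f"per-request speed at 1024x: {per_stream:.2f} tokens/s")
print(f"aggregate speedup over 1x:  {speedup:.0f}x")
print(f"scaling efficiency:         {efficiency:.0%}")
```

Under that assumption, each request at 1024x sees roughly 18 tokens/s against 38.49 tokens/s when running alone, so the aggregate gain of about 485x is bought with a per-request slowdown of a little over 2x. This matches the higher TTFT and E2E latencies in the same row.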

Hardware Configuration

GPU Manufacturer: NVIDIA
GPU Model: NVIDIA H100 80GB HBM3
GPU Count: 8
GPU Memory (Total): 632 GB
GPU Driver: 570.195.03
CUDA Version: Unknown
Compute Capability: 9.0
Power Limit (per GPU): 700 W
CPU Model: Intel(R) Xeon(R) Platinum 8480+
RAM: 1,772 GB

Software Configuration

Inference Framework: vLLM
Framework Version: 0.11.0
OS: Ubuntu
OS Version: 22.04.5 LTS (Jammy Jellyfish)
Kernel Version: 5.15.0-88-generic
Python Version: 3.10.12

Model Configuration

Provider: openai
Model Name: gpt-oss-120b
Quantization: MXFP4

Inference Configuration

Runtime parameters used across all benchmark runs

Max Model Length: 8192
Tensor Parallel Size: 1
Pipeline Parallel Size: 1
GPU Memory Utilization: 0.95
Temperature: 0.70
Top-P: 1.00
Top-K: -1
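
For readers who want to approximate this setup, the sketch below maps the parameters above onto a `vllm serve` command line and a request body for vLLM's OpenAI-compatible completions endpoint. The report does not show the actual launch command, so treat this as a plausible reconstruction: the model identifier `openai/gpt-oss-120b` is assumed from the provider and model name, and the helper functions are illustrative only.

```python
import json
import shlex

# Assumed model identifier, built from the provider and model name above.
MODEL_ID = "openai/gpt-oss-120b"

# Engine-level settings from the report, keyed by vLLM CLI flag name.
ENGINE_ARGS = {
    "max-model-len": 8192,
    "tensor-parallel-size": 1,
    "pipeline-parallel-size": 1,
    "gpu-memory-utilization": 0.95,
}

# Sampling settings from the report; these are sent per request, not at launch.
SAMPLING = {"temperature": 0.70, "top_p": 1.00, "top_k": -1}


def serve_command(model: str, engine_args: dict) -> list:
    """Build an argv list for launching the server with the given engine settings."""
    cmd = ["vllm", "serve", model]
    for flag, value in engine_args.items():
        cmd += [f"--{flag}", str(value)]
    return cmd


def completion_payload(model: str, prompt: str, max_tokens: int) -> dict:
    """Build a completions request body that carries the sampling settings."""
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens, **SAMPLING}


print(shlex.join(serve_command(MODEL_ID, ENGINE_ARGS)))
print(json.dumps(completion_payload(MODEL_ID, "Hello", 512), indent=2))
```

The split between the two dictionaries reflects how vLLM separates engine configuration from sampling: the first group is fixed when the server starts, while temperature, top-p, and top-k travel with each request (a top-k of -1 disables top-k filtering).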