NVIDIA A100-SXM4-80GB (2x) - gpt-oss-120b Reasoning

December 19, 2025 at 07:06 AM

Dataset: reference (v1.0)

Best Performance

Best Output TPS: 3,860.39 (peak generation speed)
Best Input TPS: 15,892.90 (peak prefill speed)
Best Energy Efficiency: 0.01 kWh/MT (energy cost per 1M tokens)
Best TTFT (P95): 46.30 ms (lowest latency)
Best E2E (P95): 938.60 ms (lowest latency)
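
The energy figure is expressed in kWh per million tokens (kWh/MT). The report does not say how power was sampled or which token count sits in the denominator, so the following is only a sketch of the unit conversion. The function name and the 400 W input are illustrative assumptions (400 W is the per-GPU power limit listed under Hardware Configuration, not a measured draw).

```python
def energy_kwh_per_million_tokens(avg_power_watts: float, tokens_per_second: float) -> float:
    """Energy in kWh needed to process one million tokens at a steady rate.

    Hypothetical helper: this is plain unit arithmetic, not the benchmark's
    own measurement code.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    seconds_per_million = 1_000_000 / tokens_per_second
    joules = avg_power_watts * seconds_per_million  # W * s = J
    return joules / 3_600_000                       # 1 kWh = 3.6e6 J


# Assumption: one GPU drawing its full 400 W limit while sustaining the
# best reported output rate of 3,860.39 tokens per second.
estimate = energy_kwh_per_million_tokens(400.0, 3860.39)
print(f"{estimate:.4f} kWh/MT")
```

Under that assumption the estimate comes out at roughly 0.029 kWh/MT. A different power draw or token accounting would shift the number proportionally.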

Test Matrix Results

Performance across different input/output token combinations and concurrency levels

Input Tokens | Output Tokens | Concurrency | Output TPS | Input TPS | Energy Cost (kWh/MT) | TTFT Mean (ms) | TTFT P95 (ms) | E2E P95 (ms) | Success Rate
Best Run for Output TPS:
128 | 512 | 256x | 3,860.39 | 1,005.73 | 0.03 | 441.22 | 622.23 | 32,387.74 | 100.0%

128 | 128 | 1x | 126.95 | 126.95 | 0.30 | 75.55 | 75.55 | 938.60 | 100.0%
128 | 128 | 2x | 222.83 | 228.31 | 0.15 | 78.00 | 81.07 | 1,093.08 | 100.0%
128 | 128 | 4x | 388.98 | 400.00 | 0.10 | 95.52 | 97.84 | 1,268.98 | 100.0%
128 | 128 | 8x | 613.64 | 648.36 | 0.07 | 127.75 | 131.18 | 1,580.34 | 100.0%
128 | 128 | 16x | 944.20 | 1,000.97 | 0.05 | 202.13 | 206.94 | 2,049.25 | 100.0%
128 | 128 | 32x | 1,365.33 | 1,442.33 | 0.04 | 325.17 | 333.22 | 2,822.87 | 100.0%
128 | 128 | 64x | 1,874.94 | 1,961.04 | 0.03 | 500.98 | 579.22 | 4,091.66 | 100.0%
128 | 128 | 128x | 2,524.84 | 2,650.75 | 0.02 | 847.58 | 1,012.87 | 6,068.41 | 100.0%
128 | 128 | 256x | 3,244.61 | 3,416.99 | 0.02 | 1,370.87 | 1,853.33 | 9,403.47 | 100.0%
128 | 128 | 512x | 3,265.41 | 3,425.09 | 0.02 | 5,340.76 | 11,095.74 | 18,549.51 | 100.0%
128 | 128 | 1024x | 3,305.24 | 3,472.76 | 0.02 | 13,689.25 | 28,900.37 | 36,415.39 | 100.0%
128 | 512 | 1x | 136.20 | 33.04 | 0.52 | 46.83 | 46.83 | 3,692.92 | 100.0%
128 | 512 | 2x | 237.53 | 59.09 | 0.31 | 56.33 | 59.39 | 4,229.62 | 100.0%
128 | 512 | 4x | 410.28 | 103.59 | 0.21 | 63.76 | 66.65 | 4,902.57 | 100.0%
128 | 512 | 8x | 655.16 | 167.21 | 0.15 | 76.97 | 80.51 | 6,139.23 | 100.0%
128 | 512 | 16x | 973.23 | 254.50 | 0.11 | 96.37 | 100.78 | 8,093.40 | 100.0%
128 | 512 | 32x | 1,393.09 | 372.13 | 0.09 | 120.97 | 130.52 | 10,997.03 | 100.0%
128 | 512 | 64x | 1,987.21 | 526.54 | 0.06 | 186.05 | 210.02 | 15,426.25 | 100.0%
128 | 512 | 128x | 2,864.86 | 743.01 | 0.04 | 280.19 | 341.26 | 21,916.04 | 100.0%
128 | 512 | 512x | 3,809.05 | 1,003.67 | 0.03 | 16,213.08 | 33,000.36 | 64,673.26 | 100.0%
128 | 1,024 | 1x | 135.71 | 16.31 | 0.59 | 48.38 | 48.38 | 7,478.96 | 100.0%
128 | 1,024 | 2x | 237.89 | 29.40 | 0.37 | 56.80 | 59.84 | 8,502.18 | 100.0%
128 | 1,024 | 4x | 390.23 | 51.83 | 0.26 | 64.48 | 67.60 | 9,800.21 | 100.0%
128 | 1,024 | 8x | 613.17 | 83.20 | 0.19 | 64.46 | 67.63 | 12,340.01 | 100.0%
128 | 1,024 | 16x | 970.77 | 123.32 | 0.14 | 94.11 | 98.38 | 16,716.69 | 100.0%
128 | 1,024 | 32x | 1,358.82 | 179.85 | 0.10 | 124.71 | 134.22 | 22,781.08 | 100.0%
128 | 1,024 | 64x | 1,941.33 | 255.29 | 0.08 | 211.97 | 231.90 | 31,882.30 | 100.0%
128 | 1,024 | 128x | 2,811.10 | 371.88 | 0.05 | 300.78 | 365.74 | 43,916.35 | 100.0%
128 | 1,024 | 256x | 3,797.16 | 502.51 | 0.04 | 402.15 | 532.60 | 64,988.36 | 100.0%
128 | 1,024 | 512x | 3,641.12 | 486.61 | 0.04 | 32,810.80 | 69,321.69 | 134,028.57 | 100.0%
128 | 2,048 | 1x | 134.95 | 8.08 | 0.66 | 48.00 | 48.00 | 15,094.23 | 100.0%
128 | 2,048 | 2x | 235.81 | 16.00 | 0.41 | 58.74 | 61.78 | 15,611.64 | 100.0%
128 | 2,048 | 4x | 393.69 | 25.38 | 0.29 | 63.11 | 66.35 | 20,010.37 | 100.0%
128 | 2,048 | 8x | 582.56 | 42.00 | 0.22 | 72.24 | 75.40 | 24,449.30 | 100.0%
128 | 2,048 | 16x | 908.57 | 60.71 | 0.16 | 96.43 | 100.46 | 33,966.52 | 100.0%
128 | 2,048 | 32x | 1,215.12 | 86.30 | 0.13 | 131.11 | 141.44 | 47,498.74 | 100.0%
128 | 2,048 | 64x | 1,838.11 | 125.29 | 0.09 | 201.31 | 229.30 | 65,020.27 | 100.0%
128 | 2,048 | 128x | 2,709.69 | 183.94 | 0.06 | 290.57 | 351.93 | 88,907.96 | 100.0%
128 | 2,048 | 256x | 3,555.36 | 245.52 | 0.04 | 1,360.96 | 1,832.19 | 133,441.07 | 100.0%
128 | 2,048 | 512x | 3,174.31 | 221.06 | 0.05 | 11,091.59 | 101,508.68 | 168,360.82 | 59.2%
512 | 128 | 1x | 118.29 | 493.04 | 0.09 | 107.41 | 107.41 | 1,005.84 | 100.0%
512 | 128 | 2x | 209.69 | 876.65 | 0.08 | 112.77 | 116.65 | 1,134.37 | 100.0%
512 | 128 | 4x | 352.85 | 1,466.27 | 0.06 | 178.39 | 184.36 | 1,346.42 | 100.0%
512 | 128 | 8x | 573.93 | 2,308.59 | 0.04 | 272.38 | 275.46 | 1,708.00 | 100.0%
512 | 128 | 16x | 818.84 | 3,366.99 | 0.03 | 479.18 | 483.90 | 2,345.62 | 100.0%
512 | 128 | 32x | 1,147.25 | 4,719.40 | 0.02 | 878.85 | 887.32 | 3,357.57 | 100.0%
512 | 128 | 64x | 1,523.08 | 6,206.43 | 0.02 | 1,524.55 | 1,637.10 | 5,078.37 | 100.0%
512 | 128 | 128x | 1,896.68 | 7,737.45 | 0.01 | 2,734.73 | 3,074.44 | 8,125.10 | 100.0%
512 | 128 | 256x | 2,266.86 | 9,279.69 | 0.01 | 4,302.84 | 5,959.66 | 13,523.68 | 100.0%
512 | 128 | 512x | 2,265.53 | 9,256.04 | 0.01 | 9,543.48 | 19,376.02 | 27,028.40 | 100.0%
512 | 512 | 1x | 133.99 | 132.66 | 0.34 | 46.30 | 46.30 | 3,738.99 | 100.0%
512 | 512 | 2x | 234.99 | 232.42 | 0.22 | 59.19 | 62.30 | 4,280.02 | 100.0%
512 | 512 | 4x | 403.01 | 397.19 | 0.14 | 72.13 | 75.40 | 4,979.53 | 100.0%
512 | 512 | 8x | 645.71 | 635.25 | 0.10 | 85.53 | 88.43 | 6,214.49 | 100.0%
512 | 512 | 16x | 976.38 | 986.45 | 0.07 | 113.57 | 118.96 | 8,033.97 | 100.0%
512 | 512 | 32x | 1,295.84 | 1,398.25 | 0.06 | 154.94 | 165.29 | 11,391.74 | 100.0%
512 | 512 | 64x | 2,012.84 | 2,045.57 | 0.04 | 231.47 | 256.83 | 15,525.06 | 100.0%
512 | 512 | 128x | 2,799.03 | 2,909.46 | 0.03 | 344.98 | 414.20 | 21,757.07 | 100.0%
512 | 512 | 256x | 3,762.10 | 3,870.49 | 0.02 | 553.10 | 782.26 | 32,636.38 | 100.0%
512 | 512 | 512x | 3,787.85 | 3,851.07 | 0.02 | 16,408.46 | 33,594.68 | 65,467.73 | 100.0%
512 | 512 | 1024x | 3,153.80 | 3,206.28 | 0.02 | 53,583.85 | 122,437.14 | 152,428.43 | 96.9%
512 | 1,024 | 1x | 134.14 | 65.55 | 0.47 | 49.66 | 49.66 | 7,566.84 | 100.0%
512 | 1,024 | 2x | 234.58 | 115.15 | 0.29 | 61.36 | 64.40 | 8,638.41 | 100.0%
512 | 1,024 | 4x | 402.78 | 196.23 | 0.20 | 74.36 | 76.55 | 10,076.93 | 100.0%
512 | 1,024 | 8x | 642.12 | 312.75 | 0.14 | 83.29 | 88.28 | 12,622.62 | 100.0%
512 | 1,024 | 16x | 924.83 | 481.88 | 0.10 | 111.79 | 116.50 | 16,457.77 | 100.0%
512 | 1,024 | 32x | 1,239.05 | 655.91 | 0.08 | 881.97 | 890.59 | 24,305.89 | 100.0%
512 | 1,024 | 64x | 1,801.30 | 956.45 | 0.06 | 1,508.71 | 1,639.24 | 33,263.46 | 100.0%
512 | 1,024 | 128x | 2,581.83 | 1,355.97 | 0.04 | 2,571.93 | 3,090.56 | 46,829.92 | 100.0%
512 | 1,024 | 256x | 3,454.12 | 1,786.03 | 0.03 | 4,273.08 | 5,936.63 | 71,056.83 | 100.0%
512 | 1,024 | 512x | 3,382.72 | 1,750.81 | 0.03 | 36,225.02 | 79,182.97 | 144,822.96 | 100.0%
512 | 2,048 | 1x | 133.26 | 32.43 | 0.57 | 49.50 | 49.50 | 15,292.34 | 100.0%
512 | 2,048 | 2x | 233.01 | 56.92 | 0.36 | 62.05 | 65.28 | 17,477.25 | 100.0%
512 | 2,048 | 4x | 374.57 | 97.77 | 0.25 | 80.59 | 82.84 | 20,228.17 | 100.0%
512 | 2,048 | 8x | 627.04 | 153.11 | 0.18 | 83.78 | 87.33 | 25,793.31 | 100.0%
512 | 2,048 | 16x | 819.91 | 248.33 | 0.14 | 113.37 | 119.04 | 31,942.70 | 100.0%
512 | 2,048 | 32x | 1,155.83 | 327.26 | 0.11 | 150.94 | 159.34 | 48,735.44 | 100.0%
512 | 2,048 | 64x | 1,832.96 | 491.97 | 0.07 | 231.07 | 252.49 | 64,712.24 | 100.0%
512 | 2,048 | 128x | 2,565.44 | 689.53 | 0.05 | 2,751.73 | 3,290.08 | 92,186.19 | 100.0%
512 | 2,048 | 256x | 3,385.66 | 917.17 | 0.04 | 4,271.64 | 6,021.46 | 138,534.79 | 100.0%
512 | 2,048 | 512x | 3,151.04 | 848.17 | 0.04 | 7,737.66 | 60,367.90 | 160,247.65 | 58.0%
1,024 | 128 | 1x | 111.95 | 917.22 | 0.07 | 152.81 | 152.81 | 1,062.72 | 100.0%
1,024 | 128 | 2x | 208.16 | 1,626.98 | 0.05 | 170.85 | 173.81 | 1,198.53 | 100.0%
1,024 | 128 | 4x | 340.69 | 2,695.86 | 0.03 | 272.70 | 277.31 | 1,448.56 | 100.0%
1,024 | 128 | 8x | 497.68 | 4,043.88 | 0.02 | 463.12 | 468.53 | 1,932.96 | 100.0%
1,024 | 128 | 16x | 713.71 | 5,729.76 | 0.02 | 832.09 | 837.78 | 2,733.49 | 100.0%
1,024 | 128 | 32x | 947.88 | 7,583.74 | 0.01 | 1,570.87 | 1,611.71 | 4,118.65 | 100.0%
1,024 | 128 | 64x | 1,186.53 | 9,539.58 | 0.01 | 2,841.82 | 3,046.28 | 6,543.73 | 100.0%
1,024 | 128 | 128x | 1,394.93 | 11,260.43 | 0.01 | 4,803.67 | 5,882.94 | 11,077.14 | 100.0%
1,024 | 128 | 256x | 1,593.14 | 12,825.39 | 0.01 | 7,870.32 | 11,593.64 | 19,426.03 | 100.0%
1,024 | 128 | 512x | 1,587.85 | 12,791.14 | 0.01 | 12,816.01 | 31,147.97 | 38,792.68 | 100.0%
1,024 | 512 | 1x | 132.47 | 256.78 | 0.22 | 49.21 | 49.21 | 3,797.09 | 100.0%
1,024 | 512 | 2x | 232.71 | 452.00 | 0.14 | 54.97 | 58.32 | 4,321.84 | 100.0%
1,024 | 512 | 4x | 404.26 | 785.41 | 0.09 | 75.52 | 77.99 | 4,975.07 | 100.0%
1,024 | 512 | 8x | 644.56 | 1,254.69 | 0.07 | 89.59 | 92.55 | 6,240.47 | 100.0%
1,024 | 512 | 16x | 976.22 | 1,906.21 | 0.05 | 123.19 | 128.70 | 8,231.55 | 100.0%
1,024 | 512 | 32x | 1,391.89 | 2,747.84 | 0.04 | 171.12 | 185.46 | 11,410.57 | 100.0%
1,024 | 512 | 64x | 1,979.73 | 3,983.79 | 0.03 | 253.68 | 285.72 | 15,734.10 | 100.0%
1,024 | 512 | 128x | 2,760.09 | 5,576.56 | 0.02 | 373.01 | 448.28 | 22,422.90 | 100.0%
1,024 | 512 | 256x | 3,547.81 | 7,093.44 | 0.01 | 656.88 | 930.14 | 35,223.56 | 100.0%
1,024 | 512 | 512x | 3,680.26 | 7,379.23 | 0.01 | 17,127.16 | 34,862.72 | 67,675.41 | 100.0%
1,024 | 512 | 1024x | 2,775.23 | 5,555.76 | 0.02 | 43,380.53 | 94,320.88 | 133,188.90 | 78.2%
1,024 | 1,024 | 1x | 126.20 | 121.22 | 0.36 | 157.97 | 157.97 | 8,042.90 | 100.0%
1,024 | 1,024 | 2x | 222.34 | 214.02 | 0.22 | 178.60 | 181.83 | 9,127.52 | 100.0%
1,024 | 1,024 | 4x | 380.29 | 366.15 | 0.15 | 279.72 | 286.59 | 10,672.96 | 100.0%
1,024 | 1,024 | 8x | 615.80 | 594.04 | 0.11 | 486.84 | 489.85 | 13,181.18 | 100.0%
1,024 | 1,024 | 16x | 901.04 | 871.96 | 0.08 | 879.57 | 889.42 | 18,003.00 | 100.0%
1,024 | 1,024 | 32x | 1,273.94 | 1,250.58 | 0.06 | 1,665.53 | 1,720.53 | 25,100.56 | 100.0%
1,024 | 1,024 | 64x | 1,758.02 | 1,778.61 | 0.04 | 2,558.08 | 3,048.85 | 35,313.91 | 100.0%
1,024 | 1,024 | 128x | 2,415.10 | 2,430.40 | 0.03 | 3,908.41 | 5,962.16 | 51,664.96 | 100.0%
1,024 | 1,024 | 256x | 3,128.63 | 3,193.61 | 0.02 | 8,152.52 | 11,472.20 | 78,511.58 | 100.0%
1,024 | 1,024 | 512x | 3,148.33 | 3,171.18 | 0.02 | 41,026.76 | 92,169.96 | 158,046.71 | 100.0%
1,024 | 1,024 | 1024x | 3,239.12 | 3,270.81 | 0.02 | 41,050.11 | 81,381.71 | 152,556.53 | 52.5%
1,024 | 2,048 | 1x | 128.59 | 62.25 | 0.48 | 47.87 | 47.87 | 15,660.94 | 100.0%
1,024 | 2,048 | 2x | 231.84 | 111.09 | 0.30 | 58.24 | 61.33 | 17,588.05 | 100.0%
1,024 | 2,048 | 4x | 397.76 | 190.64 | 0.20 | 111.45 | 115.69 | 20,500.76 | 100.0%
1,024 | 2,048 | 8x | 618.11 | 303.48 | 0.15 | 89.86 | 92.69 | 25,802.00 | 100.0%
1,024 | 2,048 | 16x | 885.30 | 441.23 | 0.11 | 834.90 | 840.40 | 35,592.21 | 100.0%
1,024 | 2,048 | 32x | 1,256.77 | 626.95 | 0.08 | 1,590.44 | 1,615.03 | 50,095.20 | 100.0%
1,024 | 2,048 | 64x | 1,724.16 | 911.69 | 0.06 | 2,832.80 | 3,046.43 | 68,933.17 | 100.0%
1,024 | 2,048 | 128x | 2,401.68 | 1,248.59 | 0.05 | 4,871.65 | 5,849.94 | 100,548.36 | 100.0%
1,024 | 2,048 | 256x | 3,123.03 | 1,633.16 | 0.03 | 7,469.03 | 11,734.18 | 153,824.65 | 100.0%
1,024 | 2,048 | 512x | 3,021.59 | 1,588.05 | 0.03 | 7,295.97 | 68,999.29 | 161,981.96 | 56.4%
2,048 | 128 | 1x | 101.46 | 1,597.40 | 0.04 | 260.94 | 260.94 | 1,232.34 | 100.0%
2,048 | 128 | 2x | 178.10 | 2,860.58 | 0.03 | 272.53 | 275.94 | 1,369.10 | 100.0%
2,048 | 128 | 4x | 278.94 | 4,536.46 | 0.02 | 477.08 | 480.42 | 1,726.96 | 100.0%
2,048 | 128 | 8x | 403.42 | 6,529.83 | 0.02 | 860.60 | 868.12 | 2,389.58 | 100.0%
2,048 | 128 | 16x | 542.23 | 8,662.70 | 0.01 | 1,603.82 | 1,613.50 | 3,597.57 | 100.0%
2,048 | 128 | 32x | 689.57 | 10,967.92 | 0.01 | 2,706.46 | 3,047.30 | 5,683.03 | 100.0%
2,048 | 128 | 64x | 810.30 | 12,994.60 | 0.01 | 4,830.59 | 5,923.32 | 9,579.24 | 100.0%
2,048 | 128 | 128x | 914.95 | 14,622.35 | 0.01 | 7,999.47 | 11,670.53 | 17,014.43 | 100.0%
2,048 | 128 | 256x | 995.56 | 15,892.90 | 0.01 | 18,581.52 | 23,173.98 | 31,306.08 | 100.0%
2,048 | 128 | 512x | 995.39 | 15,873.14 | 0.01 | 20,258.12 | 54,558.71 | 62,534.32 | 100.0%
2,048 | 512 | 1x | 129.17 | 505.39 | 0.14 | 52.90 | 52.90 | 3,894.15 | 100.0%
2,048 | 512 | 2x | 227.86 | 887.66 | 0.09 | 63.49 | 66.81 | 4,413.78 | 100.0%
2,048 | 512 | 4x | 355.74 | 1,550.13 | 0.06 | 80.68 | 83.08 | 5,055.29 | 100.0%
2,048 | 512 | 8x | 555.27 | 2,544.21 | 0.04 | 101.80 | 108.35 | 6,145.96 | 100.0%
2,048 | 512 | 16x | 875.29 | 3,780.18 | 0.03 | 136.62 | 145.56 | 8,261.93 | 100.0%
2,048 | 512 | 32x | 1,265.57 | 5,210.86 | 0.02 | 191.77 | 208.94 | 11,980.82 | 100.0%
2,048 | 512 | 64x | 1,878.81 | 7,591.35 | 0.02 | 282.81 | 344.75 | 16,420.94 | 100.0%
2,048 | 512 | 128x | 2,655.86 | 10,661.44 | 0.01 | 453.82 | 625.46 | 23,349.48 | 100.0%
2,048 | 512 | 256x | 2,170.40 | 8,671.51 | 0.01 | 13,507.73 | 23,245.64 | 57,511.66 | 100.0%
2,048 | 1,024 | 1x | 129.75 | 251.57 | 0.23 | 53.81 | 53.81 | 7,822.65 | 100.0%
2,048 | 1,024 | 2x | 228.63 | 441.38 | 0.15 | 63.82 | 67.96 | 8,878.68 | 100.0%
2,048 | 1,024 | 4x | 380.86 | 760.85 | 0.10 | 74.82 | 77.30 | 10,300.84 | 100.0%
2,048 | 1,024 | 8x | 598.01 | 1,233.22 | 0.07 | 96.32 | 101.79 | 12,685.47 | 100.0%
2,048 | 1,024 | 16x | 830.71 | 1,879.87 | 0.05 | 141.55 | 151.66 | 16,624.68 | 100.0%
2,048 | 1,024 | 32x | 1,141.67 | 2,668.43 | 0.04 | 190.39 | 213.00 | 23,418.34 | 100.0%
2,048 | 1,024 | 64x | 1,796.71 | 3,721.74 | 0.03 | 303.59 | 341.21 | 33,563.41 | 100.0%
2,048 | 1,024 | 128x | 2,598.53 | 5,267.68 | 0.02 | 464.71 | 616.50 | 47,386.31 | 100.0%
2,048 | 1,024 | 256x | 3,455.91 | 7,124.05 | 0.02 | 754.83 | 1,184.42 | 70,031.23 | 100.0%
2,048 | 1,024 | 512x | 2,595.22 | 5,298.10 | 0.02 | 48,676.60 | 118,887.81 | 188,455.43 | 100.0%
2,048 | 2,048 | 1x | 126.87 | 122.46 | 0.37 | 256.94 | 256.94 | 16,070.95 | 100.0%
2,048 | 2,048 | 2x | 223.94 | 215.21 | 0.24 | 269.97 | 273.31 | 18,207.72 | 100.0%
2,048 | 2,048 | 4x | 311.11 | 385.21 | 0.16 | 468.11 | 471.97 | 20,347.37 | 100.0%
2,048 | 2,048 | 8x | 455.10 | 639.43 | 0.11 | 840.36 | 844.00 | 24,472.92 | 100.0%
2,048 | 2,048 | 16x | 689.12 | 934.51 | 0.09 | 1,610.05 | 1,619.69 | 33,456.39 | 100.0%
2,048 | 2,048 | 32x | 1,024.50 | 1,242.10 | 0.07 | 2,950.76 | 3,037.71 | 50,345.42 | 100.0%
2,048 | 2,048 | 64x | 1,533.86 | 1,718.28 | 0.05 | 5,101.76 | 5,899.65 | 72,777.67 | 100.0%
2,048 | 2,048 | 128x | 2,256.87 | 2,371.15 | 0.04 | 7,928.55 | 11,668.50 | 105,430.51 | 100.0%
2,048 | 2,048 | 256x | 2,829.05 | 3,020.78 | 0.03 | 13,717.40 | 22,985.22 | 165,359.14 | 100.0%
2,048 | 2,048 | 512x | 2,916.56 | 3,129.20 | 0.03 | 8,843.61 | 63,846.80 | 175,277.12 | 58.6%
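
One way to read the matrix is to look for the concurrency level at which output throughput stops growing. The sketch below does this for the 128-input / 128-output series, with values copied from the rows above. The function name and the 5% gain threshold are illustrative assumptions and are not part of the benchmark tooling.

```python
# Output TPS by concurrency for the 128-input / 128-output rows of the table.
OUTPUT_TPS_128_128 = {
    1: 126.95, 2: 222.83, 4: 388.98, 8: 613.64, 16: 944.20, 32: 1365.33,
    64: 1874.94, 128: 2524.84, 256: 3244.61, 512: 3265.41, 1024: 3305.24,
}


def saturation_point(series: dict[int, float], min_gain: float = 0.05) -> int:
    """Return the first concurrency whose next step adds less than `min_gain`
    relative throughput (hypothetical helper; the threshold is arbitrary)."""
    points = sorted(series.items())
    for (c0, t0), (_c1, t1) in zip(points, points[1:]):
        if (t1 - t0) / t0 < min_gain:
            return c0
    return points[-1][0]  # throughput never flattened within the tested range


knee = saturation_point(OUTPUT_TPS_128_128)
per_request_tps = OUTPUT_TPS_128_128[knee] / knee
print(f"throughput flattens at {knee}x; about {per_request_tps:.2f} tokens/s per request")
```

For this series the helper returns 256: moving from 256x to 512x adds under 1% output throughput while TTFT P95 rises from 1,853.33 ms to 11,095.74 ms.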

Hardware Configuration

GPU Manufacturer: NVIDIA
GPU Model: NVIDIA A100-SXM4-80GB
GPU Count: 2
GPU Memory (Total): 160 GB
GPU Driver: 570.195.03
CUDA Version: Unknown
Compute Capability: 8.0
Power Limit (per GPU): 400 W
CPU Model: AMD EPYC-Milan Processor
RAM: 174 GB

Software Configuration

Inference Framework: vLLM
Framework Version: 0.11.0
OS: Ubuntu
OS Version: 24.04.3 LTS (Noble Numbat)
Kernel Version: 6.14.0-29-generic
Python Version: 3.12.3

Model Configuration

Provider: openai
Model Name: gpt-oss-120b
Quantization: MXFP4

Inference Configuration

Runtime parameters used across all benchmark runs

Max Model Length: 32768
Tensor Parallel Size: 1
Pipeline Parallel Size: 1
GPU Memory Utilization: 0.90
Temperature: 0.70
Top-P: 1.00
Top-K: -1
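
The report does not show how the server was launched, so the following is only a sketch of how the parameters above could map onto a `vllm serve` command line and a per-request sampling payload. The model ID `openai/gpt-oss-120b`, the helper names, and the choice of the OpenAI-compatible completions payload are assumptions. The CLI flags are standard vLLM engine arguments, and `top_k` is passed as one of vLLM's extra sampling parameters.

```python
import shlex

# Values copied from the Inference Configuration section of this report.
INFERENCE_CONFIG = {
    "max_model_len": 32768,
    "tensor_parallel_size": 1,
    "pipeline_parallel_size": 1,
    "gpu_memory_utilization": 0.90,
}

# Sampling settings apply per request rather than at server start.
SAMPLING = {"temperature": 0.70, "top_p": 1.00, "top_k": -1}

# Assumption: the report's provider and model name correspond to this model ID.
MODEL_ID = "openai/gpt-oss-120b"


def build_serve_command(model_id: str, cfg: dict) -> list[str]:
    """Assemble a `vllm serve` argument list (hypothetical helper)."""
    return [
        "vllm", "serve", model_id,
        "--max-model-len", str(cfg["max_model_len"]),
        "--tensor-parallel-size", str(cfg["tensor_parallel_size"]),
        "--pipeline-parallel-size", str(cfg["pipeline_parallel_size"]),
        "--gpu-memory-utilization", str(cfg["gpu_memory_utilization"]),
    ]


def build_completion_request(model_id: str, prompt: str, max_tokens: int, sampling: dict) -> dict:
    """Assemble a JSON body for a completions request (hypothetical helper)."""
    return {"model": model_id, "prompt": prompt, "max_tokens": max_tokens, **sampling}


command = build_serve_command(MODEL_ID, INFERENCE_CONFIG)
request_body = build_completion_request(MODEL_ID, "Hello", 128, SAMPLING)
print(shlex.join(command))
print(request_body)
```

Keeping the engine arguments and the sampling settings in separate dictionaries mirrors how they are applied: the former once at startup, the latter on every request.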