NVIDIA A100 80GB PCIe (2x) - gpt-oss-120b

November 2, 2025 at 09:38 PM

Dataset: reference (v1.0)

Best Performance


Best Output TPS: 1,673.99 (peak generation speed)
Best Input TPS: 5,556.37 (peak prefill speed)
Best Energy Efficiency: 0.02 kWh/MT (energy cost per 1M tokens)
Best TTFT (P95): 629.78 ms (lowest latency)
Best E2E (P95): 1,120.10 ms (lowest latency)
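
The energy figure above is given in kWh per 1M tokens (kWh/MT). The report does not say how power was sampled or whether the token count covers input, output, or both, so the following is only a sketch of the unit conversion under assumed inputs: `avg_power_watts` and `tokens_per_second` are hypothetical parameter names, not fields from the benchmark tool.

```python
def energy_kwh_per_million_tokens(avg_power_watts: float,
                                  tokens_per_second: float) -> float:
    """Convert average power draw and throughput into kWh per 1M tokens.

    Assumption: power is constant at avg_power_watts for the whole run.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    seconds_per_million = 1_000_000 / tokens_per_second  # time to emit 1M tokens
    joules = avg_power_watts * seconds_per_million        # W * s = J
    return joules / 3.6e6                                 # 1 kWh = 3.6e6 J


if __name__ == "__main__":
    # Illustrative inputs only: the 600 W figure assumes both GPUs run at
    # their 300 W power limit, which the report does not confirm.
    print(round(energy_kwh_per_million_tokens(600.0, 1673.99), 3))
```

Because the cost is inversely proportional to throughput at a fixed power draw, runs with higher concurrency tend to show a lower kWh/MT figure.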

Test Matrix Results

Performance across different input/output token combinations and concurrency levels

Input Tokens | Output Tokens | Concurrency | Output TPS | Input TPS | Energy Cost (kWh/MT) | TTFT Mean (ms) | TTFT P95 (ms) | E2E P95 (ms) | Success Rate
Best Run for Output TPS:
128 | 2,048 | 128x | 1,673.99 | 122.73 | 0.09 | 35,501.89 | 77,242.15 | 133,102.89 | 100.0%
All runs:
128 | 512 | 1x | 97.01 | 29.44 | 0.63 | 890.51 | 890.51 | 4,144.33 | 100.0%
128 | 512 | 2x | 145.58 | 52.29 | 0.43 | 1,588.76 | 1,863.27 | 4,778.02 | 100.0%
128 | 512 | 4x | 259.36 | 91.88 | 0.29 | 1,705.54 | 1,860.53 | 5,525.68 | 100.0%
128 | 512 | 8x | 466.41 | 150.96 | 0.21 | 1,538.85 | 2,377.28 | 6,797.46 | 100.0%
128 | 512 | 16x | 682.35 | 223.58 | 0.15 | 2,160.11 | 3,835.78 | 9,205.48 | 100.0%
128 | 512 | 32x | 818.94 | 332.31 | 0.13 | 4,279.09 | 8,698.73 | 12,258.35 | 100.0%
128 | 512 | 64x | 1,402.02 | 499.33 | 0.08 | 4,558.62 | 9,242.35 | 16,208.27 | 100.0%
128 | 512 | 128x | 1,361.56 | 491.66 | 0.08 | 12,860.76 | 24,818.40 | 33,048.32 | 100.0%
128 | 1,024 | 1x | 104.16 | 14.41 | 0.72 | 1,158.86 | 1,158.86 | 8,467.29 | 100.0%
128 | 1,024 | 2x | 189.67 | 27.02 | 0.44 | 1,352.30 | 1,552.51 | 9,250.17 | 100.0%
128 | 1,024 | 4x | 332.00 | 47.11 | 0.30 | 1,343.30 | 2,179.85 | 10,780.49 | 100.0%
128 | 1,024 | 8x | 469.07 | 75.64 | 0.23 | 1,772.41 | 2,474.91 | 13,567.52 | 100.0%
128 | 1,024 | 16x | 736.41 | 110.14 | 0.17 | 2,775.41 | 5,215.04 | 18,709.24 | 100.0%
128 | 1,024 | 32x | 1,081.49 | 168.31 | 0.12 | 3,044.72 | 6,338.49 | 24,311.39 | 100.0%
128 | 1,024 | 64x | 1,598.74 | 245.27 | 0.08 | 4,118.26 | 8,414.01 | 33,136.13 | 100.0%
128 | 1,024 | 128x | 1,603.33 | 250.21 | 0.08 | 20,361.41 | 41,048.40 | 65,168.03 | 100.0%
128 | 1,024 | 256x | 1,595.86 | 245.60 | 0.08 | 52,678.93 | 107,115.77 | 132,996.02 | 100.0%
128 | 2,048 | 1x | 107.36 | 7.11 | 0.76 | 1,700.99 | 1,700.99 | 17,156.63 | 100.0%
128 | 2,048 | 2x | 198.91 | 13.26 | 0.46 | 1,138.43 | 1,452.81 | 18,807.09 | 100.0%
128 | 2,048 | 4x | 336.00 | 23.21 | 0.32 | 1,294.08 | 1,759.94 | 21,882.58 | 100.0%
128 | 2,048 | 8x | 482.89 | 38.24 | 0.25 | 1,559.12 | 3,177.81 | 26,848.71 | 100.0%
128 | 2,048 | 16x | 756.35 | 56.89 | 0.18 | 2,363.52 | 5,031.14 | 36,235.77 | 100.0%
128 | 2,048 | 32x | 1,053.26 | 81.85 | 0.13 | 3,157.07 | 5,647.83 | 50,054.92 | 100.0%
128 | 2,048 | 64x | 1,636.19 | 120.90 | 0.09 | 4,423.45 | 8,237.24 | 67,310.04 | 100.0%
128 | 2,048 | 256x | 1,581.40 | 116.21 | 0.09 | 42,269.59 | 94,876.62 | 144,249.75 | 55.5%
512 | 128 | 1x | 52.68 | 442.86 | 0.14 | 652.22 | 652.22 | 1,120.10 | 100.0%
512 | 128 | 2x | 65.09 | 809.60 | 0.10 | 875.00 | 910.40 | 1,226.34 | 100.0%
512 | 128 | 4x | 106.32 | 1,027.80 | 0.09 | 914.56 | 1,164.08 | 1,435.31 | 75.0%
512 | 512 | 1x | 104.24 | 119.40 | 0.35 | 655.39 | 655.39 | 4,153.25 | 100.0%
512 | 512 | 2x | 185.79 | 209.12 | 0.23 | 677.91 | 807.30 | 4,755.29 | 100.0%
512 | 512 | 4x | 296.83 | 358.66 | 0.16 | 1,093.10 | 1,288.95 | 5,513.48 | 100.0%
512 | 512 | 8x | 421.52 | 575.13 | 0.12 | 1,606.02 | 2,974.99 | 6,858.98 | 100.0%
512 | 512 | 16x | 625.25 | 868.84 | 0.09 | 2,293.25 | 4,656.01 | 9,101.71 | 100.0%
512 | 512 | 32x | 939.59 | 1,236.23 | 0.07 | 3,649.26 | 7,060.64 | 12,848.42 | 100.0%
512 | 512 | 64x | 1,300.87 | 1,810.59 | 0.05 | 5,279.44 | 9,866.53 | 17,478.52 | 100.0%
512 | 512 | 128x | 1,299.85 | 1,817.13 | 0.05 | 13,255.72 | 25,409.61 | 34,732.20 | 100.0%
512 | 1,024 | 1x | 113.76 | 59.52 | 0.50 | 629.78 | 629.78 | 8,332.54 | 100.0%
512 | 1,024 | 2x | 193.53 | 106.86 | 0.32 | 1,060.30 | 1,463.10 | 9,308.41 | 100.0%
512 | 1,024 | 4x | 333.79 | 180.54 | 0.22 | 1,166.12 | 1,597.00 | 10,952.96 | 100.0%
512 | 1,024 | 8x | 468.81 | 289.89 | 0.17 | 1,728.46 | 2,403.62 | 13,619.81 | 100.0%
512 | 1,024 | 16x | 717.76 | 450.86 | 0.12 | 1,889.79 | 3,016.95 | 17,591.74 | 100.0%
512 | 1,024 | 32x | 1,055.95 | 631.54 | 0.09 | 2,793.86 | 5,540.92 | 25,233.26 | 100.0%
512 | 1,024 | 64x | 1,575.62 | 945.67 | 0.06 | 3,967.58 | 8,556.05 | 33,558.36 | 100.0%
512 | 1,024 | 128x | 1,605.17 | 982.60 | 0.06 | 19,423.68 | 39,341.79 | 64,531.32 | 100.0%
512 | 1,024 | 256x | 1,536.00 | 928.20 | 0.06 | 52,437.03 | 110,998.04 | 136,274.74 | 100.0%
512 | 2,048 | 1x | 112.16 | 28.31 | 0.61 | 762.05 | 762.05 | 17,518.00 | 100.0%
512 | 2,048 | 2x | 204.44 | 51.87 | 0.38 | 814.92 | 944.45 | 19,180.16 | 100.0%
512 | 2,048 | 4x | 330.21 | 89.11 | 0.27 | 846.52 | 968.07 | 22,192.69 | 100.0%
512 | 2,048 | 8x | 527.67 | 140.30 | 0.20 | 1,297.82 | 2,173.65 | 28,148.47 | 100.0%
512 | 2,048 | 16x | 741.16 | 221.56 | 0.15 | 2,038.45 | 3,464.48 | 35,797.81 | 100.0%
512 | 2,048 | 32x | 1,000.31 | 304.76 | 0.12 | 2,914.05 | 5,269.14 | 52,307.98 | 100.0%
512 | 2,048 | 64x | 1,610.78 | 470.20 | 0.07 | 4,275.01 | 7,930.45 | 67,644.59 | 100.0%
512 | 2,048 | 128x | 1,611.86 | 477.12 | 0.07 | 35,372.80 | 75,858.06 | 133,131.39 | 100.0%
512 | 2,048 | 256x | 1,573.60 | 458.87 | 0.07 | 42,653.18 | 86,277.22 | 144,454.15 | 57.0%
1,024 | 512 | 1x | 72.79 | 223.16 | 0.28 | 1,722.88 | 1,722.88 | 4,365.84 | 100.0%
1,024 | 512 | 2x | 170.06 | 397.48 | 0.16 | 1,004.43 | 1,365.95 | 4,910.05 | 100.0%
1,024 | 512 | 4x | 248.77 | 685.31 | 0.11 | 1,898.44 | 3,025.04 | 5,701.40 | 100.0%
1,024 | 512 | 8x | 386.37 | 1,071.69 | 0.08 | 2,455.14 | 3,469.72 | 7,297.65 | 100.0%
1,024 | 512 | 16x | 606.63 | 1,563.44 | 0.06 | 3,045.61 | 4,288.09 | 10,025.92 | 100.0%
1,024 | 512 | 32x | 872.38 | 2,301.00 | 0.05 | 4,537.71 | 7,029.69 | 13,589.60 | 100.0%
1,024 | 512 | 64x | 1,187.44 | 3,313.88 | 0.03 | 7,120.09 | 10,995.87 | 18,830.60 | 100.0%
1,024 | 512 | 128x | 1,161.49 | 3,245.05 | 0.03 | 15,093.13 | 29,057.96 | 38,186.31 | 99.2%
1,024 | 1,024 | 1x | 99.33 | 114.75 | 0.41 | 1,510.34 | 1,510.34 | 8,496.57 | 100.0%
1,024 | 1,024 | 2x | 184.95 | 204.29 | 0.25 | 1,320.20 | 1,880.54 | 9,558.84 | 100.0%
1,024 | 1,024 | 4x | 308.66 | 347.81 | 0.17 | 1,702.10 | 2,605.43 | 11,232.92 | 100.0%
1,024 | 1,024 | 8x | 501.11 | 561.95 | 0.13 | 1,997.09 | 3,233.83 | 13,918.80 | 100.0%
1,024 | 1,024 | 16x | 746.12 | 837.43 | 0.09 | 2,273.10 | 3,155.25 | 18,744.13 | 100.0%
1,024 | 1,024 | 32x | 1,068.34 | 1,219.61 | 0.06 | 3,374.33 | 6,317.29 | 25,714.48 | 100.0%
1,024 | 1,024 | 64x | 1,546.70 | 1,848.22 | 0.04 | 4,379.29 | 8,036.14 | 33,899.75 | 100.0%
1,024 | 1,024 | 128x | 1,584.08 | 1,867.63 | 0.04 | 20,516.92 | 40,805.89 | 67,035.27 | 100.0%
1,024 | 1,024 | 256x | 1,450.86 | 1,726.10 | 0.05 | 56,948.68 | 118,993.23 | 145,167.72 | 100.0%
1,024 | 2,048 | 1x | 116.27 | 57.63 | 0.50 | 652.26 | 652.26 | 16,916.44 | 100.0%
1,024 | 2,048 | 2x | 185.88 | 100.81 | 0.33 | 2,280.96 | 3,450.40 | 19,377.51 | 100.0%
1,024 | 2,048 | 4x | 333.63 | 171.58 | 0.22 | 1,615.29 | 2,823.57 | 22,772.63 | 100.0%
1,024 | 2,048 | 8x | 526.30 | 275.09 | 0.17 | 1,792.32 | 2,983.94 | 28,456.98 | 100.0%
1,024 | 2,048 | 16x | 774.60 | 413.31 | 0.12 | 2,590.49 | 4,551.81 | 37,976.54 | 100.0%
1,024 | 2,048 | 32x | 1,119.13 | 598.35 | 0.09 | 3,540.20 | 5,843.81 | 52,470.92 | 100.0%
1,024 | 2,048 | 64x | 1,598.87 | 899.25 | 0.06 | 4,588.26 | 8,476.35 | 69,837.39 | 100.0%
1,024 | 2,048 | 128x | 1,605.22 | 915.74 | 0.06 | 36,902.18 | 77,829.31 | 137,125.22 | 100.0%
1,024 | 2,048 | 256x | 1,538.65 | 873.76 | 0.06 | 43,065.29 | 94,225.70 | 149,818.97 | 55.9%
2,048 | 128 | 1x | 6.28 | 1,544.74 | 0.05 | 1,215.77 | 1,215.77 | 1,274.07 | 100.0%
2,048 | 128 | 2x | 75.58 | 2,848.11 | 0.03 | 898.58 | 901.89 | 1,374.73 | 100.0%
2,048 | 128 | 4x | 38.03 | 3,293.06 | 0.02 | 1,530.52 | 1,602.30 | 1,782.85 | 75.0%
2,048 | 512 | 1x | 94.92 | 452.31 | 0.14 | 870.54 | 870.54 | 4,351.26 | 100.0%
2,048 | 512 | 2x | 141.65 | 805.72 | 0.10 | 1,606.09 | 1,801.53 | 4,860.97 | 100.0%
2,048 | 512 | 4x | 239.20 | 1,429.17 | 0.06 | 1,292.40 | 1,591.66 | 5,481.96 | 100.0%
2,048 | 512 | 8x | 387.15 | 2,056.23 | 0.05 | 2,244.82 | 2,621.12 | 7,591.94 | 100.0%
2,048 | 512 | 16x | 512.45 | 2,916.10 | 0.04 | 4,088.52 | 6,929.19 | 10,666.54 | 100.0%
2,048 | 512 | 32x | 656.17 | 4,004.86 | 0.03 | 6,476.64 | 11,120.40 | 15,537.00 | 100.0%
2,048 | 512 | 64x | 936.36 | 5,436.38 | 0.02 | 10,313.28 | 13,270.33 | 22,873.54 | 100.0%
2,048 | 512 | 128x | 994.14 | 5,556.37 | 0.02 | 18,248.34 | 34,828.80 | 44,698.53 | 100.0%
2,048 | 512 | 256x | 951.47 | 5,536.22 | 0.02 | 37,363.94 | 79,310.16 | 88,972.19 | 99.2%
2,048 | 1,024 | 1x | 104.20 | 219.08 | 0.26 | 836.52 | 836.52 | 8,982.26 | 100.0%
2,048 | 1,024 | 2x | 185.96 | 406.92 | 0.16 | 1,245.21 | 1,545.08 | 9,627.31 | 100.0%
2,048 | 1,024 | 4x | 256.75 | 719.57 | 0.11 | 1,632.96 | 2,424.08 | 10,891.79 | 100.0%
2,048 | 1,024 | 8x | 406.43 | 1,178.70 | 0.08 | 1,609.46 | 2,156.68 | 13,268.91 | 100.0%
2,048 | 1,024 | 16x | 592.43 | 1,741.70 | 0.06 | 2,449.36 | 4,231.93 | 17,932.50 | 100.0%
2,048 | 1,024 | 32x | 936.84 | 2,334.51 | 0.05 | 3,274.40 | 5,885.41 | 26,739.32 | 100.0%
2,048 | 1,024 | 64x | 1,454.50 | 3,607.18 | 0.03 | 5,151.41 | 9,245.42 | 34,559.52 | 100.0%
2,048 | 1,024 | 128x | 1,492.54 | 3,634.94 | 0.03 | 20,821.46 | 43,406.09 | 68,569.22 | 100.0%
2,048 | 1,024 | 256x | 1,469.22 | 3,603.67 | 0.03 | 54,776.31 | 112,557.72 | 138,402.81 | 100.0%
2,048 | 2,048 | 1x | 104.11 | 106.93 | 0.41 | 1,171.99 | 1,171.99 | 18,403.09 | 100.0%
2,048 | 2,048 | 2x | 196.42 | 199.47 | 0.25 | 1,167.62 | 1,523.51 | 19,641.87 | 100.0%
2,048 | 2,048 | 4x | 278.52 | 358.68 | 0.17 | 1,147.96 | 1,447.05 | 21,851.23 | 100.0%
2,048 | 2,048 | 8x | 393.15 | 611.38 | 0.12 | 1,621.84 | 2,252.75 | 25,595.87 | 100.0%
2,048 | 2,048 | 16x | 614.50 | 878.48 | 0.09 | 2,547.34 | 4,868.79 | 35,579.74 | 100.0%
2,048 | 2,048 | 32x | 927.70 | 1,205.77 | 0.07 | 4,220.84 | 8,074.34 | 51,777.13 | 100.0%
2,048 | 2,048 | 64x | 1,490.77 | 1,790.94 | 0.05 | 4,970.44 | 10,079.02 | 69,699.68 | 100.0%
2,048 | 2,048 | 128x | 1,526.69 | 1,772.69 | 0.05 | 36,320.06 | 80,568.87 | 140,875.36 | 100.0%
2,048 | 2,048 | 256x | 1,495.35 | 1,775.60 | 0.05 | 43,596.65 | 93,080.80 | 151,209.83 | 57.4%
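
To compare runs in the matrix programmatically, one option is to load rows into a small record type and pick the best run per metric. The sketch below uses four rows copied from the results above; the `Run` class, the `best_run` helper, and the choice to ignore runs below a 100% success rate are assumptions made for illustration, not the dashboard's documented selection rule.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Run:
    input_tokens: int
    output_tokens: int
    concurrency: int
    output_tps: float
    ttft_p95_ms: float
    success_rate: float  # percent


# Four rows taken from the test matrix (remaining columns omitted).
RUNS = [
    Run(128, 2048, 64, 1636.19, 8237.24, 100.0),
    Run(128, 2048, 128, 1673.99, 77242.15, 100.0),
    Run(128, 2048, 256, 1581.40, 94876.62, 55.5),
    Run(512, 1024, 1, 113.76, 629.78, 100.0),
]


def best_run(runs: Iterable[Run], key: Callable[[Run], float],
             highest: bool = True, min_success: float = 100.0) -> Run:
    """Return the run with the best value of `key`.

    Assumption: runs with a success rate below `min_success` are excluded.
    """
    eligible = [r for r in runs if r.success_rate >= min_success]
    if not eligible:
        raise ValueError("no run meets the success-rate threshold")
    pick = max if highest else min
    return pick(eligible, key=key)


if __name__ == "__main__":
    print(best_run(RUNS, key=lambda r: r.output_tps))
    print(best_run(RUNS, key=lambda r: r.ttft_p95_ms, highest=False))
```

On these rows, raising concurrency from 64x to 128x adds little output throughput while TTFT P95 grows roughly ninefold, which is the kind of trade-off a per-metric "best" value hides.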

Hardware Configuration

GPU Manufacturer: NVIDIA
GPU Model: NVIDIA A100 80GB PCIe
GPU Count: 2
GPU Memory (Total): 160 GB
GPU Driver: 570.195.03
CUDA Version: Unknown
Compute Capability: 8.0
Power Limit (per GPU): 300 W
CPU Model: Intel Xeon Processor (Icelake)
RAM: 31 GB

Software Configuration

Inference Framework: vLLM
Framework Version: 0.11.0
OS: Ubuntu
OS Version: 22.04.5 LTS (Jammy Jellyfish)
Kernel Version: 5.15.0-88-generic
Python Version: 3.10.12

Model Configuration

Provider: openai
Model Name: gpt-oss-120b
Quantization: MXFP4

Inference Configuration

Runtime parameters used across all benchmark runs

Max Model Length: 32768
Tensor Parallel Size: 1
Pipeline Parallel Size: 1
GPU Memory Utilization: 0.90
Temperature: 0.70
Top-P: 1.00
Top-K: -1
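
For readers who want to approximate this setup, the runtime parameters above map onto `vllm serve` options. The report does not show the actual launch command, so this is a sketch under assumptions: the model id `openai/gpt-oss-120b` is inferred from the provider and model name, and `build_vllm_serve_command` and `SAMPLING_PARAMS` are hypothetical names introduced here.

```python
import shlex


def build_vllm_serve_command(
    model: str = "openai/gpt-oss-120b",  # assumed Hugging Face model id
    max_model_len: int = 32768,
    tensor_parallel_size: int = 1,
    pipeline_parallel_size: int = 1,
    gpu_memory_utilization: float = 0.90,
) -> list[str]:
    """Assemble a `vllm serve` argument list from the reported settings."""
    return [
        "vllm", "serve", model,
        "--max-model-len", str(max_model_len),
        "--tensor-parallel-size", str(tensor_parallel_size),
        "--pipeline-parallel-size", str(pipeline_parallel_size),
        "--gpu-memory-utilization", str(gpu_memory_utilization),
    ]


# Sampling settings are sent per request rather than at server start;
# this dict only mirrors the values listed in the report.
SAMPLING_PARAMS = {"temperature": 0.70, "top_p": 1.00, "top_k": -1}

if __name__ == "__main__":
    print(shlex.join(build_vllm_serve_command()))
```

A `top_k` of -1 disables top-k filtering in vLLM, so together with a top-p of 1.00 only the temperature shapes the sampling distribution.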