NVIDIA A100-SXM4-80GB (1x) - qwen3-8b

March 9, 2026 at 01:06 AM

Dataset: reference (v1.0)

Best Performance

Best Output TPS: 5,547.95 (peak generation speed)
Best Input TPS: 42,519.57 (peak prefill speed)
Best Energy Efficiency: 0.00 kWh/MT (energy cost per 1M tokens)
Best TTFT (P95): 19.63 ms (lowest latency)
Best E2E (P95): 1,474.61 ms (lowest latency)
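
Energy cost is shown to two decimals, so the most efficient runs display as 0.00 kWh/MT. The sketch below shows the unit conversion only. It assumes the average power draw is known and that "MT" counts all processed tokens (input plus output); the report states neither how power was measured nor which tokens are counted, so `kwh_per_million_tokens` and its inputs are illustrative. The 400 W value is the configured power limit, used here as an upper bound rather than a measured draw.

```python
def kwh_per_million_tokens(avg_power_watts: float, tokens_per_second: float) -> float:
    """Energy in kWh needed to process one million tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    seconds_per_million = 1_000_000 / tokens_per_second
    hours_per_million = seconds_per_million / 3600.0
    return (avg_power_watts / 1000.0) * hours_per_million


# Assumption: power pinned at the 400 W limit (an upper bound, not a measurement).
# Throughput: output + input TPS of the 2,048-input / 128-output run at 512x.
total_tps = 2782.61 + 42519.57
cost = kwh_per_million_tokens(400.0, total_tps)
print(f"{cost:.4f} kWh/MT, displayed as {cost:.2f}")
```

Under these assumptions the result is roughly 0.0025 kWh per million tokens, which rounds to the 0.00 shown above.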

Test Matrix Results

Performance across different input/output token combinations and concurrency levels

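Best run for Output TPS, listed first: 128 input tokens, 512 output tokens, 512x concurrency.

| Input Tokens | Output Tokens | Concurrency | Output TPS | Input TPS | Energy Cost (kWh/MT) | TTFT Mean (ms) | TTFT P95 (ms) | E2E P95 (ms) | Success Rate |
|---|---|---|---|---|---|---|---|---|---|
| 128 | 512 | 512x | 5,547.95 | 1,388.05 | 0.01 | 84.01 | 125.76 | 9,123.15 | 100.0% |
| 128 | 128 | 1x | 86.84 | 86.84 | 0.29 | 37.70 | 37.70 | 1,474.61 | 100.0% |
| 128 | 128 | 2x | 171.01 | 169.00 | 0.15 | 22.14 | 22.63 | 1,496.49 | 100.0% |
| 128 | 128 | 4x | 333.99 | 332.03 | 0.08 | 40.76 | 45.98 | 1,531.33 | 100.0% |
| 128 | 128 | 8x | 121.90 | 120.36 | 0.13 | 6,912.73 | 6,914.80 | 8,397.22 | 100.0% |
| 128 | 512 | 1x | 88.17 | 22.04 | 0.65 | 19.63 | 19.63 | 5,807.12 | 100.0% |
| 128 | 512 | 2x | 171.70 | 42.42 | 0.34 | 22.80 | 23.17 | 5,963.30 | 100.0% |
| 128 | 512 | 4x | 338.68 | 84.17 | 0.17 | 31.41 | 35.67 | 6,044.00 | 100.0% |
| 128 | 512 | 8x | 673.67 | 167.30 | 0.09 | 33.81 | 36.61 | 6,039.33 | 100.0% |
| 128 | 512 | 16x | 1,267.78 | 320.28 | 0.05 | 77.98 | 100.22 | 6,372.76 | 100.0% |
| 128 | 512 | 32x | 2,302.81 | 582.48 | 0.03 | 103.76 | 176.08 | 7,017.05 | 100.0% |
| 128 | 512 | 64x | 4,043.79 | 1,030.61 | 0.02 | 149.77 | 316.77 | 7,913.26 | 100.0% |
| 128 | 512 | 128x | 5,353.03 | 1,354.83 | 0.02 | 184.38 | 372.09 | 9,364.86 | 100.0% |
| 128 | 512 | 256x | 5,507.07 | 1,391.06 | 0.02 | 89.86 | 118.18 | 9,103.71 | 100.0% |
| 128 | 512 | 1024x | 5,501.52 | 1,388.95 | 0.01 | 98.46 | 138.79 | 9,118.20 | 100.0% |
| 128 | 1,024 | 1x | 88.11 | 11.01 | 0.82 | 20.42 | 20.42 | 11,621.57 | 100.0% |
| 128 | 1,024 | 2x | 171.12 | 21.14 | 0.40 | 29.51 | 32.01 | 11,965.66 | 100.0% |
| 128 | 1,024 | 4x | 299.27 | 41.94 | 0.23 | 32.27 | 34.98 | 12,131.72 | 100.0% |
| 128 | 1,024 | 8x | 628.85 | 81.88 | 0.11 | 29.43 | 31.93 | 12,342.53 | 100.0% |
| 128 | 1,024 | 16x | 1,222.12 | 157.53 | 0.06 | 38.81 | 43.12 | 12,962.14 | 100.0% |
| 128 | 1,024 | 32x | 2,171.25 | 279.82 | 0.04 | 51.59 | 61.19 | 14,637.15 | 100.0% |
| 128 | 1,024 | 64x | 3,802.73 | 485.86 | 0.02 | 68.76 | 90.27 | 16,842.71 | 100.0% |
| 128 | 1,024 | 128x | 4,836.75 | 628.01 | 0.02 | 86.05 | 124.59 | 20,263.86 | 100.0% |
| 128 | 1,024 | 256x | 4,858.06 | 626.68 | 0.02 | 92.19 | 124.82 | 20,319.78 | 100.0% |
| 128 | 1,024 | 512x | 4,859.92 | 625.83 | 0.02 | 95.09 | 136.18 | 20,349.06 | 100.0% |
| 128 | 1,024 | 1024x | 4,857.65 | 626.35 | 0.02 | 94.59 | 138.94 | 20,328.76 | 100.0% |
| 128 | 2,048 | 1x | 87.53 | 5.47 | 0.91 | 20.02 | 20.02 | 23,397.14 | 100.0% |
| 128 | 2,048 | 2x | 169.63 | 10.48 | 0.45 | 22.27 | 22.74 | 24,145.74 | 100.0% |
| 128 | 2,048 | 4x | 273.06 | 20.71 | 0.28 | 30.05 | 32.07 | 24,569.88 | 100.0% |
| 128 | 2,048 | 8x | 570.18 | 39.55 | 0.14 | 31.14 | 33.67 | 25,556.56 | 100.0% |
| 128 | 2,048 | 16x | 1,123.76 | 74.82 | 0.07 | 41.14 | 46.85 | 27,306.58 | 100.0% |
| 128 | 2,048 | 32x | 1,861.67 | 128.76 | 0.05 | 54.65 | 65.28 | 31,844.37 | 100.0% |
| 128 | 2,048 | 64x | 3,099.51 | 216.35 | 0.03 | 66.87 | 91.35 | 37,901.00 | 100.0% |
| 128 | 2,048 | 128x | 3,854.18 | 268.41 | 0.03 | 89.22 | 122.90 | 47,560.64 | 100.0% |
| 128 | 2,048 | 256x | 3,876.65 | 266.25 | 0.03 | 132.84 | 209.02 | 47,880.19 | 100.0% |
| 128 | 2,048 | 512x | 3,876.41 | 267.85 | 0.03 | 105.25 | 135.43 | 47,562.84 | 100.0% |
| 512 | 128 | 1x | 86.20 | 333.33 | 0.13 | 45.16 | 45.16 | 1,484.97 | 100.0% |
| 512 | 128 | 2x | 167.76 | 649.41 | 0.07 | 45.85 | 46.26 | 1,525.97 | 100.0% |
| 512 | 128 | 4x | 321.20 | 1,248.43 | 0.03 | 88.77 | 89.77 | 1,593.06 | 100.0% |
| 512 | 128 | 8x | 609.89 | 2,360.33 | 0.02 | 115.87 | 163.49 | 1,675.66 | 100.0% |
| 512 | 128 | 16x | 1,084.75 | 4,192.80 | 0.01 | 178.45 | 287.92 | 1,878.68 | 100.0% |
| 512 | 128 | 32x | 1,740.02 | 6,734.49 | 0.01 | 294.88 | 551.35 | 2,328.99 | 100.0% |
| 512 | 128 | 64x | 2,618.93 | 10,144.82 | 0.01 | 541.83 | 1,067.42 | 3,086.36 | 100.0% |
| 512 | 128 | 128x | 3,428.88 | 13,300.03 | 0.01 | 493.18 | 1,228.15 | 3,659.14 | 100.0% |
| 512 | 128 | 256x | 4,760.13 | 18,463.74 | 0.00 | 140.78 | 187.51 | 2,608.37 | 100.0% |
| 512 | 128 | 512x | 4,760.13 | 18,463.74 | 0.00 | 147.20 | 197.67 | 2,610.96 | 100.0% |
| 512 | 128 | 1024x | 4,746.01 | 18,408.97 | 0.00 | 145.31 | 200.10 | 2,612.14 | 100.0% |
| 512 | 512 | 1x | 87.90 | 84.98 | 0.44 | 22.45 | 22.45 | 5,825.11 | 100.0% |
| 512 | 512 | 2x | 170.61 | 165.11 | 0.22 | 24.49 | 24.89 | 6,000.56 | 100.0% |
| 512 | 512 | 4x | 333.66 | 324.21 | 0.11 | 38.96 | 42.10 | 6,136.61 | 100.0% |
| 512 | 512 | 8x | 660.43 | 638.99 | 0.06 | 38.65 | 42.53 | 6,198.69 | 100.0% |
| 512 | 512 | 16x | 1,235.41 | 1,193.79 | 0.03 | 51.67 | 58.59 | 6,619.37 | 100.0% |
| 512 | 512 | 32x | 2,145.18 | 2,079.08 | 0.02 | 79.91 | 94.20 | 7,605.25 | 100.0% |
| 512 | 512 | 64x | 3,654.56 | 3,557.91 | 0.01 | 114.39 | 149.43 | 8,862.51 | 100.0% |
| 512 | 512 | 128x | 4,683.93 | 4,542.04 | 0.01 | 136.56 | 167.48 | 10,838.07 | 100.0% |
| 512 | 512 | 256x | 4,670.51 | 4,542.87 | 0.01 | 153.89 | 209.18 | 10,843.29 | 100.0% |
| 512 | 1,024 | 1x | 87.77 | 42.43 | 0.63 | 23.04 | 23.04 | 11,667.05 | 100.0% |
| 512 | 1,024 | 2x | 169.89 | 82.21 | 0.31 | 31.81 | 35.30 | 12,054.61 | 100.0% |
| 512 | 1,024 | 4x | 298.00 | 161.89 | 0.17 | 39.00 | 41.28 | 12,290.08 | 100.0% |
| 512 | 1,024 | 8x | 643.12 | 314.85 | 0.09 | 41.95 | 46.87 | 12,581.71 | 100.0% |
| 512 | 1,024 | 16x | 1,185.00 | 583.22 | 0.05 | 49.18 | 53.84 | 13,557.86 | 100.0% |
| 512 | 1,024 | 32x | 2,014.80 | 985.82 | 0.03 | 81.11 | 102.83 | 16,059.95 | 100.0% |
| 512 | 1,024 | 64x | 3,302.37 | 1,658.03 | 0.02 | 122.66 | 150.53 | 19,084.18 | 100.0% |
| 512 | 1,024 | 128x | 4,130.41 | 2,101.54 | 0.02 | 148.45 | 209.97 | 23,527.05 | 100.0% |
| 512 | 1,024 | 256x | 4,162.51 | 2,094.10 | 0.02 | 164.54 | 221.34 | 23,617.98 | 100.0% |
| 512 | 1,024 | 512x | 4,155.13 | 2,096.31 | 0.02 | 149.72 | 207.66 | 23,595.44 | 100.0% |
| 512 | 2,048 | 1x | 87.16 | 21.07 | 0.78 | 22.45 | 22.45 | 23,496.22 | 100.0% |
| 512 | 2,048 | 2x | 168.49 | 40.77 | 0.39 | 31.02 | 34.06 | 24,307.59 | 100.0% |
| 512 | 2,048 | 4x | 268.97 | 80.08 | 0.23 | 33.07 | 37.55 | 24,846.55 | 100.0% |
| 512 | 2,048 | 8x | 560.33 | 151.21 | 0.12 | 37.67 | 41.13 | 26,202.42 | 100.0% |
| 512 | 2,048 | 16x | 1,078.90 | 278.57 | 0.07 | 49.35 | 57.79 | 28,400.97 | 100.0% |
| 512 | 2,048 | 32x | 1,757.80 | 463.44 | 0.04 | 81.75 | 101.90 | 34,184.05 | 100.0% |
| 512 | 2,048 | 64x | 2,735.04 | 760.29 | 0.03 | 125.49 | 163.11 | 41,677.79 | 100.0% |
| 512 | 2,048 | 128x | 3,389.80 | 936.42 | 0.03 | 197.07 | 315.65 | 52,928.36 | 100.0% |
| 512 | 2,048 | 256x | 3,404.10 | 933.48 | 0.03 | 159.83 | 212.99 | 53,096.28 | 100.0% |
| 512 | 2,048 | 512x | 3,397.89 | 937.94 | 0.03 | 141.62 | 195.08 | 52,832.96 | 100.0% |
| 1,024 | 128 | 1x | 83.82 | 636.54 | 0.08 | 81.75 | 81.75 | 1,527.65 | 100.0% |
| 1,024 | 128 | 2x | 162.54 | 1,235.56 | 0.04 | 82.79 | 83.17 | 1,574.47 | 100.0% |
| 1,024 | 128 | 4x | 304.40 | 2,332.34 | 0.02 | 151.62 | 152.94 | 1,681.81 | 100.0% |
| 1,024 | 128 | 8x | 554.41 | 4,254.47 | 0.01 | 192.91 | 289.15 | 1,843.60 | 100.0% |
| 1,024 | 128 | 16x | 911.44 | 7,010.24 | 0.01 | 302.77 | 551.20 | 2,235.93 | 100.0% |
| 1,024 | 128 | 32x | 1,336.81 | 10,260.77 | 0.01 | 559.95 | 1,053.50 | 3,041.93 | 100.0% |
| 1,024 | 128 | 64x | 1,828.57 | 14,042.86 | 0.01 | 1,008.83 | 2,068.59 | 4,434.61 | 100.0% |
| 1,024 | 128 | 128x | 2,369.49 | 18,191.23 | 0.00 | 909.39 | 2,392.50 | 5,320.06 | 100.0% |
| 1,024 | 128 | 256x | 3,834.63 | 29,439.48 | 0.00 | 223.51 | 327.73 | 3,253.03 | 100.0% |
| 1,024 | 128 | 512x | 3,855.42 | 29,599.10 | 0.00 | 218.96 | 306.82 | 3,239.88 | 100.0% |
| 1,024 | 128 | 1024x | 3,820.90 | 29,334.03 | 0.00 | 220.70 | 327.86 | 3,271.18 | 100.0% |
| 1,024 | 512 | 1x | 87.48 | 166.07 | 0.30 | 23.93 | 23.93 | 5,853.22 | 100.0% |
| 1,024 | 512 | 2x | 169.20 | 321.55 | 0.15 | 28.12 | 28.44 | 6,051.57 | 100.0% |
| 1,024 | 512 | 4x | 328.31 | 628.89 | 0.08 | 42.33 | 46.24 | 6,235.62 | 100.0% |
| 1,024 | 512 | 8x | 625.82 | 1,225.13 | 0.04 | 48.48 | 53.55 | 6,409.45 | 100.0% |
| 1,024 | 512 | 16x | 1,153.17 | 2,242.28 | 0.02 | 67.62 | 77.92 | 7,010.57 | 100.0% |
| 1,024 | 512 | 32x | 1,897.10 | 3,688.72 | 0.02 | 94.52 | 123.11 | 8,495.69 | 100.0% |
| 1,024 | 512 | 64x | 3,079.26 | 6,036.46 | 0.01 | 163.16 | 215.21 | 10,373.35 | 100.0% |
| 1,024 | 512 | 128x | 3,852.27 | 7,437.30 | 0.01 | 216.74 | 310.70 | 13,125.36 | 100.0% |
| 1,024 | 512 | 256x | 3,844.56 | 7,447.44 | 0.01 | 219.71 | 308.36 | 13,107.70 | 100.0% |
| 1,024 | 1,024 | 1x | 87.19 | 108.10 | 0.41 | 24.58 | 24.58 | 8,980.59 | 100.0% |
| 1,024 | 1,024 | 2x | 136.53 | 162.30 | 0.27 | 34.76 | 38.60 | 11,752.48 | 100.0% |
| 1,024 | 1,024 | 4x | 295.15 | 314.29 | 0.13 | 34.36 | 35.66 | 12,479.66 | 100.0% |
| 1,024 | 1,024 | 8x | 586.51 | 600.81 | 0.07 | 56.49 | 70.81 | 13,074.14 | 100.0% |
| 1,024 | 1,024 | 16x | 1,052.48 | 1,109.61 | 0.04 | 66.17 | 81.05 | 14,184.65 | 100.0% |
| 1,024 | 1,024 | 32x | 1,762.37 | 1,769.52 | 0.03 | 106.04 | 132.40 | 17,741.33 | 100.0% |
| 1,024 | 1,024 | 64x | 2,830.12 | 2,849.28 | 0.02 | 164.03 | 225.42 | 22,024.95 | 100.0% |
| 1,024 | 1,024 | 128x | 3,471.25 | 3,511.74 | 0.02 | 222.90 | 295.82 | 27,896.97 | 100.0% |
| 1,024 | 1,024 | 256x | 3,487.41 | 3,501.23 | 0.02 | 215.45 | 309.50 | 27,974.78 | 100.0% |
| 1,024 | 1,024 | 512x | 3,473.12 | 3,505.48 | 0.02 | 238.31 | 320.04 | 27,950.20 | 100.0% |
| 1,024 | 2,048 | 1x | 87.25 | 185.57 | 0.28 | 25.97 | 25.97 | 5,226.10 | 100.0% |
| 1,024 | 2,048 | 2x | 114.60 | 100.55 | 0.39 | 28.31 | 28.71 | 18,702.80 | 100.0% |
| 1,024 | 2,048 | 4x | 246.01 | 157.05 | 0.20 | 35.26 | 38.37 | 24,965.16 | 100.0% |
| 1,024 | 2,048 | 8x | 496.94 | 297.29 | 0.11 | 48.67 | 58.66 | 26,105.49 | 100.0% |
| 1,024 | 2,048 | 16x | 918.44 | 544.35 | 0.06 | 67.15 | 83.83 | 28,914.26 | 100.0% |
| 1,024 | 2,048 | 32x | 1,529.16 | 866.92 | 0.04 | 108.04 | 135.93 | 36,241.00 | 100.0% |
| 1,024 | 2,048 | 64x | 2,403.05 | 1,336.11 | 0.03 | 157.27 | 205.22 | 47,028.94 | 100.0% |
| 1,024 | 2,048 | 128x | 2,909.65 | 1,612.32 | 0.02 | 313.47 | 613.40 | 60,857.59 | 100.0% |
| 1,024 | 2,048 | 256x | 2,910.68 | 1,623.26 | 0.02 | 226.45 | 325.64 | 60,448.14 | 100.0% |
| 1,024 | 2,048 | 512x | 2,938.13 | 1,643.18 | 0.02 | 219.71 | 302.43 | 59,714.41 | 100.0% |
| 1,024 | 2,048 | 1024x | 2,909.97 | 1,629.34 | 0.02 | 229.72 | 322.48 | 60,217.70 | 100.0% |
| 2,048 | 128 | 1x | 79.31 | 1,209.42 | 0.04 | 148.05 | 148.05 | 1,614.24 | 100.0% |
| 2,048 | 128 | 2x | 152.11 | 2,317.88 | 0.02 | 107.29 | 158.34 | 1,681.81 | 100.0% |
| 2,048 | 128 | 4x | 274.09 | 4,179.34 | 0.01 | 233.72 | 292.55 | 1,865.93 | 100.0% |
| 2,048 | 128 | 8x | 461.68 | 7,046.44 | 0.01 | 496.99 | 565.72 | 2,215.87 | 100.0% |
| 2,048 | 128 | 16x | 693.30 | 10,585.99 | 0.01 | 948.97 | 1,084.16 | 2,949.78 | 100.0% |
| 2,048 | 128 | 32x | 903.40 | 13,798.19 | 0.01 | 1,286.29 | 2,104.54 | 4,515.01 | 100.0% |
| 2,048 | 128 | 64x | 1,120.81 | 17,122.45 | 0.01 | 1,828.83 | 4,027.67 | 7,249.50 | 100.0% |
| 2,048 | 128 | 128x | 1,440.79 | 22,015.98 | 0.00 | 1,681.78 | 4,856.82 | 8,796.40 | 100.0% |
| 2,048 | 128 | 256x | 2,758.03 | 42,143.93 | 0.00 | 387.88 | 600.76 | 4,555.05 | 100.0% |
| 2,048 | 128 | 512x | 2,782.61 | 42,519.57 | 0.00 | 368.29 | 563.23 | 4,511.69 | 100.0% |
| 2,048 | 128 | 1024x | 2,781.40 | 42,501.09 | 0.00 | 385.32 | 570.98 | 4,515.25 | 100.0% |
| 2,048 | 512 | 1x | 86.15 | 328.45 | 0.19 | 33.57 | 33.57 | 5,943.47 | 100.0% |
| 2,048 | 512 | 2x | 166.15 | 632.97 | 0.09 | 33.90 | 34.29 | 6,162.27 | 100.0% |
| 2,048 | 512 | 4x | 318.95 | 1,215.85 | 0.05 | 48.39 | 49.49 | 6,418.79 | 100.0% |
| 2,048 | 512 | 8x | 602.18 | 2,297.71 | 0.03 | 74.14 | 83.00 | 6,798.35 | 100.0% |
| 2,048 | 512 | 16x | 1,050.26 | 4,009.10 | 0.02 | 111.38 | 129.99 | 7,792.68 | 100.0% |
| 2,048 | 512 | 32x | 1,574.94 | 6,034.05 | 0.01 | 160.60 | 205.90 | 10,344.69 | 100.0% |
| 2,048 | 512 | 64x | 2,418.48 | 9,236.70 | 0.01 | 262.27 | 367.51 | 13,495.27 | 100.0% |
| 2,048 | 512 | 128x | 2,864.12 | 11,014.19 | 0.01 | 387.74 | 580.76 | 17,669.75 | 100.0% |
| 2,048 | 512 | 256x | 2,859.07 | 11,016.67 | 0.01 | 362.64 | 537.01 | 17,638.09 | 100.0% |
| 2,048 | 1,024 | 1x | 86.10 | 164.13 | 0.33 | 33.33 | 33.33 | 11,892.45 | 100.0% |
| 2,048 | 1,024 | 2x | 165.84 | 315.90 | 0.16 | 33.61 | 34.00 | 12,347.96 | 100.0% |
| 2,048 | 1,024 | 4x | 317.45 | 605.05 | 0.09 | 46.32 | 47.57 | 12,901.18 | 100.0% |
| 2,048 | 1,024 | 8x | 597.08 | 1,139.14 | 0.05 | 65.37 | 81.64 | 13,714.43 | 100.0% |
| 2,048 | 1,024 | 16x | 1,033.50 | 1,972.56 | 0.03 | 110.71 | 126.32 | 15,843.39 | 100.0% |
| 2,048 | 1,024 | 32x | 1,494.97 | 2,916.32 | 0.02 | 175.14 | 219.40 | 21,427.49 | 100.0% |
| 2,048 | 1,024 | 64x | 2,273.10 | 4,438.19 | 0.02 | 276.12 | 375.71 | 28,142.55 | 100.0% |
| 2,048 | 1,024 | 128x | 2,672.86 | 5,257.23 | 0.01 | 453.21 | 826.74 | 37,109.87 | 100.0% |
| 2,048 | 1,024 | 256x | 2,663.27 | 5,322.47 | 0.01 | 392.23 | 570.57 | 36,660.50 | 100.0% |
| 2,048 | 2,048 | 1x | 85.69 | 81.67 | 0.51 | 30.85 | 30.85 | 23,901.04 | 100.0% |
| 2,048 | 2,048 | 2x | 160.28 | 169.72 | 0.25 | 34.20 | 34.68 | 22,906.78 | 100.0% |
| 2,048 | 2,048 | 4x | 276.07 | 303.56 | 0.14 | 48.03 | 54.97 | 25,318.85 | 100.0% |
| 2,048 | 2,048 | 8x | 553.66 | 558.16 | 0.08 | 72.75 | 81.82 | 27,993.72 | 100.0% |
| 2,048 | 2,048 | 16x | 947.44 | 964.56 | 0.05 | 93.34 | 113.13 | 32,405.61 | 100.0% |
| 2,048 | 2,048 | 32x | 1,339.21 | 1,444.73 | 0.03 | 175.49 | 216.30 | 43,276.64 | 100.0% |
| 2,048 | 2,048 | 64x | 1,967.34 | 2,105.81 | 0.03 | 662.56 | 1,134.26 | 59,367.86 | 100.0% |
| 2,048 | 2,048 | 128x | 2,216.63 | 2,424.66 | 0.02 | 1,768.51 | 4,713.15 | 80,575.73 | 100.0% |
| 2,048 | 2,048 | 256x | 2,306.96 | 2,542.38 | 0.02 | 884.50 | 1,638.60 | 76,841.94 | 100.0% |
| 2,048 | 2,048 | 512x | 2,304.74 | 2,553.33 | 0.02 | 834.35 | 1,481.76 | 76,516.07 | 100.0% |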
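
One way to read the concurrency sweep is to compare each run's Output TPS with ideal linear scaling from the single-stream run. The sketch below does that for the 128-input, 512-output rows, and also estimates single-stream end-to-end latency as output tokens divided by Output TPS. Both helper functions are illustrative and not part of the benchmark tooling, and the latency estimate assumes that decode time dominates a single-stream request, which the report does not state.

```python
def scaling_efficiency(tps_at_n: float, concurrency: int, tps_single: float) -> float:
    """Fraction of ideal linear scaling achieved at a given concurrency."""
    return tps_at_n / (concurrency * tps_single)


def decode_time_ms(output_tokens: int, per_stream_tps: float) -> float:
    """Time in ms to emit output_tokens at a steady per-stream rate."""
    return output_tokens / per_stream_tps * 1000.0


# Output TPS by concurrency, copied from the 128-input / 512-output rows.
output_tps = {1: 88.17, 8: 673.67, 64: 4043.79, 512: 5547.95}

for n, tps in output_tps.items():
    eff = scaling_efficiency(tps, n, output_tps[1])
    print(f"{n:>4}x  output TPS={tps:>8.2f}  efficiency={eff:.2f}")

print(f"estimated 1x E2E: {decode_time_ms(512, output_tps[1]):.0f} ms")
```

For these rows the efficiency falls from 1.00 at 1x to about 0.72 at 64x and 0.12 at 512x, consistent with Output TPS levelling off near 5,500 beyond 128x. The single-stream estimate of about 5,807 ms is close to the reported E2E P95 of 5,807.12 ms for that run.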

Hardware Configuration

GPU Manufacturer: NVIDIA
GPU Model: NVIDIA A100-SXM4-80GB
GPU Count: 1
GPU Memory (Total): 80 GB
GPU Driver: 570.211.01
CUDA Version: Unknown
Compute Capability: 8.0
Power Limit (per GPU): 400 W
CPU Model: AMD EPYC-Milan Processor
RAM: 92 GB

Software Configuration

Inference Framework: vLLM
Framework Version: 0.17.0
OS: Ubuntu
OS Version: 24.04.4 LTS (Noble Numbat)
Kernel Version: 6.14.0-29-generic
Python Version: 3.12.3

Model Configuration

Provider: qwen
Model Name: qwen3-8b
Quantization: FP16

Inference Configuration

Runtime parameters used across all benchmark runs

Max Model Length: 32768
Tensor Parallel Size: 1
Pipeline Parallel Size: 1
GPU Memory Utilization: 0.85
Temperature: 0.70
Top-P: 1.00
Top-K: -1
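
As a rough illustration of how these parameters might map onto a vLLM deployment, the sketch below builds a `vllm serve` command line and a completion request body. It is not the benchmark's actual launch script, which the report does not show. The Hugging Face model ID `Qwen/Qwen3-8B`, the mapping of "FP16" to `--dtype float16`, and the helper names are assumptions. The engine flags go to the server, while temperature, top-p, and top-k are per-request sampling parameters.

```python
import json
import shlex

MODEL_ID = "Qwen/Qwen3-8B"  # assumption: the report only says "qwen3-8b"


def build_serve_command(model_id: str) -> list[str]:
    """Engine-level settings, passed once when the server starts."""
    return [
        "vllm", "serve", model_id,
        "--max-model-len", "32768",
        "--tensor-parallel-size", "1",
        "--pipeline-parallel-size", "1",
        "--gpu-memory-utilization", "0.85",
        "--dtype", "float16",  # assumption: "FP16" means unquantized half precision
    ]


def build_completion_payload(model_id: str, prompt: str, max_tokens: int) -> dict:
    """Per-request sampling settings for an OpenAI-compatible completions call."""
    return {
        "model": model_id,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.70,
        "top_p": 1.00,
        "top_k": -1,  # -1 disables top-k filtering in vLLM
    }


print(shlex.join(build_serve_command(MODEL_ID)))
print(json.dumps(build_completion_payload(MODEL_ID, "Hello", 512), indent=2))
```

Nothing here starts a server or sends a request; the script only prints the command and the JSON body so they can be checked against the values above.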