NVIDIA H20 (8x) - qwen3-coder-30b-a3b-instruct

October 24, 2025 at 08:18 PM

Dataset: reference (v1.0)

Best Performance


Best Output TPS: 8,987.38 (peak generation speed)
Best Input TPS: 30,699.04 (peak prefill speed)
Best Energy Efficiency: 0.02 kWh/MT (energy cost per 1M tokens)
Best TTFT (P95): 58.88 ms (lowest latency)
Best E2E (P95): 988.69 ms (lowest latency)
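
The headline figures above appear to be simple extremes over the test matrix: the highest throughput values and the lowest energy and latency values. The following is a minimal sketch of that selection, assuming it is a plain max/min per column; the `Run` class and `best_runs` function are hypothetical names, and the four rows are copied from the results table rather than the full data set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One row of the test matrix (hypothetical container, not from the report)."""
    input_tokens: int
    output_tokens: int
    concurrency: int
    output_tps: float
    input_tps: float
    energy_kwh_per_mt: float
    ttft_p95_ms: float
    e2e_p95_ms: float
    success_rate: float


# A few rows copied from the results table; not the complete matrix.
RUNS = [
    Run(1024, 1024, 1024, 8987.38, 9678.22, 0.04, 21404.54, 58999.03, 100.0),
    Run(2048, 128, 128, 1990.43, 30699.04, 0.02, 5568.20, 8010.61, 100.0),
    Run(512, 512, 1, 143.18, 139.52, 1.13, 58.88, 3554.71, 100.0),
    Run(512, 128, 2, 258.85, 1006.07, 0.24, 122.19, 988.69, 100.0),
]


def best_runs(runs):
    """Pick the best run per metric: max for throughput, min for cost and latency."""
    return {
        "output_tps": max(runs, key=lambda r: r.output_tps),
        "input_tps": max(runs, key=lambda r: r.input_tps),
        "energy": min(runs, key=lambda r: r.energy_kwh_per_mt),
        "ttft_p95": min(runs, key=lambda r: r.ttft_p95_ms),
        "e2e_p95": min(runs, key=lambda r: r.e2e_p95_ms),
    }


if __name__ == "__main__":
    for metric, run in best_runs(RUNS).items():
        print(metric, run.input_tokens, run.output_tokens, f"{run.concurrency}x")
```

Note that the best values come from different runs: peak output throughput occurs at 1024x concurrency, while the lowest latencies occur at 1x and 2x.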

Test Matrix Results

Performance across different input/output token combinations and concurrency levels

Input Tokens | Output Tokens | Concurrency | Output TPS | Input TPS | Energy Cost (kWh/MT) | TTFT Mean (ms) | TTFT P95 (ms) | E2E P95 (ms) | Success Rate
Best Run for Output TPS
1,024 | 1,024 | 1024x | 8,987.38 | 9,678.22 | 0.04 | 13,549.26 | 21,404.54 | 58,999.03 | 100.0%
128 | 128 | 1x | 100.95 | 96.21 | 1.41 | 357.83 | 357.83 | 1,267.42 | 100.0%
128 | 128 | 2x | 196.77 | 192.16 | 0.75 | 265.99 | 378.74 | 1,287.84 | 100.0%
128 | 128 | 4x | 245.68 | 243.76 | 0.60 | 577.58 | 968.70 | 2,080.17 | 100.0%
128 | 128 | 8x | 416.77 | 417.99 | 0.39 | 890.12 | 1,277.39 | 2,451.56 | 100.0%
128 | 128 | 16x | 638.24 | 644.49 | 0.28 | 963.23 | 1,978.12 | 3,188.94 | 100.0%
128 | 128 | 32x | 1,254.47 | 1,265.27 | 0.16 | 1,494.13 | 1,828.26 | 3,224.97 | 100.0%
128 | 128 | 64x | 1,471.04 | 1,471.04 | 0.16 | 2,970.58 | 3,643.49 | 5,519.00 | 100.0%
128 | 128 | 128x | 1,776.90 | 1,803.66 | 0.14 | 5,325.22 | 6,479.79 | 8,721.50 | 100.0%
128 | 128 | 256x | 1,512.99 | 1,556.08 | 0.13 | 16,522.15 | 18,214.51 | 20,671.62 | 100.0%
128 | 512 | 1x | 144.80 | 34.50 | 1.68 | 64.10 | 64.10 | 3,534.83 | 100.0%
128 | 512 | 2x | 227.75 | 57.80 | 1.03 | 122.35 | 126.98 | 4,310.54 | 100.0%
128 | 512 | 4x | 452.48 | 116.62 | 0.59 | 215.59 | 245.17 | 4,337.80 | 100.0%
128 | 512 | 8x | 889.14 | 233.78 | 0.39 | 282.35 | 303.86 | 4,386.70 | 100.0%
128 | 512 | 16x | 1,431.43 | 375.71 | 0.29 | 449.31 | 499.32 | 5,484.00 | 100.0%
128 | 512 | 32x | 2,121.82 | 554.62 | 0.22 | 860.71 | 918.97 | 7,371.21 | 100.0%
128 | 512 | 64x | 3,241.76 | 829.00 | 0.16 | 1,744.06 | 2,090.91 | 9,756.43 | 100.0%
128 | 512 | 128x | 4,358.01 | 1,119.70 | 0.13 | 4,921.43 | 5,829.60 | 14,402.60 | 100.0%
128 | 512 | 256x | 4,419.11 | 1,158.49 | 0.12 | 15,761.99 | 17,502.16 | 27,582.93 | 100.0%
128 | 512 | 512x | 5,225.33 | 1,362.49 | 0.09 | 25,257.10 | 29,517.44 | 41,742.47 | 100.0%
128 | 512 | 1024x | 6,777.19 | 1,785.32 | 0.09 | 30,828.39 | 43,830.41 | 56,140.29 | 100.0%
128 | 1,024 | 1x | 145.08 | 17.29 | 1.89 | 78.44 | 78.44 | 7,056.73 | 100.0%
128 | 1,024 | 2x | 205.82 | 35.29 | 1.32 | 123.61 | 125.03 | 6,889.54 | 100.0%
128 | 1,024 | 4x | 435.09 | 63.84 | 0.72 | 163.54 | 175.72 | 7,828.76 | 100.0%
128 | 1,024 | 8x | 759.73 | 121.50 | 0.48 | 237.82 | 248.38 | 8,446.06 | 100.0%
128 | 1,024 | 16x | 1,357.39 | 202.83 | 0.33 | 457.93 | 496.69 | 10,079.49 | 100.0%
128 | 1,024 | 32x | 2,167.79 | 320.87 | 0.24 | 837.11 | 862.86 | 12,765.08 | 100.0%
128 | 1,024 | 64x | 3,507.73 | 492.15 | 0.18 | 1,525.10 | 1,883.46 | 16,497.78 | 100.0%
128 | 1,024 | 128x | 4,997.05 | 719.99 | 0.13 | 3,941.79 | 4,952.57 | 22,283.27 | 100.0%
128 | 1,024 | 256x | 6,294.38 | 918.60 | 0.11 | 12,991.51 | 14,580.03 | 34,715.29 | 100.0%
128 | 1,024 | 512x | 6,466.53 | 957.67 | 0.09 | 25,426.55 | 29,760.97 | 59,956.53 | 100.0%
128 | 1,024 | 1024x | 7,321.76 | 1,106.56 | 0.08 | 49,795.40 | 75,677.92 | 102,843.78 | 100.0%
128 | 2,048 | 1x | 144.55 | 9.63 | 2.03 | 66.90 | 66.90 | 12,665.91 | 100.0%
128 | 2,048 | 2x | 175.01 | 17.69 | 1.63 | 93.02 | 93.57 | 13,584.86 | 100.0%
128 | 2,048 | 4x | 355.18 | 35.60 | 0.91 | 224.75 | 239.69 | 13,707.01 | 100.0%
128 | 2,048 | 8x | 468.22 | 55.32 | 0.71 | 286.34 | 329.08 | 17,624.29 | 100.0%
128 | 2,048 | 16x | 1,026.32 | 127.18 | 0.40 | 469.87 | 491.90 | 13,703.95 | 100.0%
128 | 2,048 | 32x | 1,623.51 | 195.39 | 0.29 | 837.13 | 897.06 | 18,941.65 | 100.0%
128 | 2,048 | 64x | 2,604.93 | 283.25 | 0.23 | 1,650.04 | 1,973.31 | 26,774.97 | 100.0%
128 | 2,048 | 128x | 3,968.42 | 435.81 | 0.16 | 4,026.85 | 4,953.73 | 33,833.84 | 100.0%
128 | 2,048 | 256x | 5,769.65 | 649.68 | 0.12 | 12,932.52 | 14,410.78 | 47,957.70 | 100.0%
128 | 2,048 | 512x | 6,135.59 | 690.66 | 0.10 | 27,069.57 | 29,897.41 | 82,594.73 | 100.0%
128 | 2,048 | 1024x | 8,870.32 | 1,014.91 | 0.07 | 12,239.45 | 24,975.96 | 79,239.08 | 100.0%
512 | 128 | 1x | 126.48 | 490.12 | 0.46 | 96.28 | 96.28 | 1,011.83 | 100.0%
512 | 128 | 2x | 258.85 | 1,006.07 | 0.24 | 120.15 | 122.19 | 988.69 | 100.0%
512 | 128 | 4x | 400.00 | 1,545.31 | 0.18 | 202.81 | 240.45 | 1,276.81 | 100.0%
512 | 128 | 8x | 619.11 | 2,388.15 | 0.12 | 312.14 | 371.37 | 1,650.91 | 100.0%
512 | 128 | 16x | 902.60 | 3,497.58 | 0.09 | 603.97 | 1,088.79 | 2,260.57 | 100.0%
512 | 128 | 32x | 1,282.58 | 5,000.31 | 0.07 | 1,035.78 | 1,586.03 | 3,177.34 | 100.0%
512 | 128 | 64x | 1,981.26 | 7,758.03 | 0.06 | 1,747.70 | 2,283.07 | 4,079.58 | 100.0%
512 | 128 | 128x | 2,037.23 | 7,953.02 | 0.05 | 4,173.56 | 5,677.89 | 7,891.70 | 100.0%
512 | 128 | 256x | 1,683.61 | 6,690.91 | 0.05 | 14,010.02 | 15,425.45 | 18,471.51 | 100.0%
512 | 512 | 1x | 143.18 | 139.52 | 1.13 | 58.88 | 58.88 | 3,554.71 | 100.0%
512 | 512 | 2x | 267.15 | 259.59 | 0.62 | 110.25 | 129.39 | 3,818.87 | 100.0%
512 | 512 | 4x | 531.48 | 514.57 | 0.36 | 181.43 | 191.96 | 3,841.51 | 100.0%
512 | 512 | 8x | 897.19 | 867.75 | 0.24 | 285.21 | 326.91 | 4,545.70 | 100.0%
512 | 512 | 16x | 1,490.60 | 1,462.59 | 0.18 | 456.84 | 510.38 | 5,416.54 | 100.0%
512 | 512 | 32x | 1,801.03 | 1,787.15 | 0.15 | 917.87 | 1,030.92 | 8,910.64 | 100.0%
512 | 512 | 64x | 3,224.56 | 3,182.23 | 0.11 | 1,779.75 | 2,147.56 | 9,902.88 | 100.0%
512 | 512 | 128x | 4,536.84 | 4,454.27 | 0.08 | 4,009.99 | 4,927.05 | 14,207.16 | 100.0%
512 | 512 | 256x | 4,793.07 | 4,759.53 | 0.07 | 13,710.91 | 15,211.68 | 26,199.92 | 100.0%
512 | 512 | 512x | 4,950.74 | 4,918.40 | 0.06 | 23,814.58 | 29,779.46 | 43,340.15 | 100.0%
512 | 512 | 1024x | 3,551.45 | 3,599.55 | 0.08 | 51,912.71 | 61,944.13 | 76,988.12 | 61.2%
512 | 1,024 | 1x | 136.35 | 83.39 | 1.39 | 61.62 | 61.62 | 5,939.74 | 100.0%
512 | 1,024 | 2x | 266.85 | 133.63 | 0.83 | 78.88 | 80.99 | 7,429.38 | 100.0%
512 | 1,024 | 4x | 443.11 | 263.21 | 0.51 | 158.60 | 178.60 | 7,460.75 | 100.0%
512 | 1,024 | 8x | 668.63 | 400.28 | 0.37 | 264.82 | 281.85 | 9,746.30 | 100.0%
512 | 1,024 | 16x | 1,095.53 | 671.52 | 0.27 | 454.82 | 505.60 | 11,807.22 | 100.0%
512 | 1,024 | 32x | 2,084.31 | 1,150.09 | 0.18 | 977.65 | 1,060.97 | 13,851.74 | 100.0%
512 | 1,024 | 64x | 3,363.23 | 1,815.02 | 0.14 | 1,556.76 | 1,939.92 | 17,271.95 | 100.0%
512 | 1,024 | 128x | 5,133.17 | 2,834.19 | 0.10 | 4,150.30 | 5,093.80 | 21,847.95 | 100.0%
512 | 1,024 | 256x | 5,942.20 | 3,264.39 | 0.08 | 14,806.97 | 16,445.96 | 37,510.74 | 100.0%
512 | 1,024 | 512x | 7,881.26 | 4,348.08 | 0.06 | 6,894.79 | 14,102.15 | 47,299.46 | 100.0%
512 | 1,024 | 1024x | 6,752.88 | 3,792.25 | 0.06 | 62,981.83 | 88,879.02 | 116,021.97 | 97.3%
512 | 2,048 | 1x | 136.45 | 84.28 | 1.41 | 97.28 | 97.28 | 5,877.02 | 100.0%
512 | 2,048 | 2x | 230.21 | 90.86 | 1.04 | 114.41 | 127.91 | 10,761.82 | 100.0%
512 | 2,048 | 4x | 276.05 | 125.47 | 0.82 | 194.61 | 234.01 | 14,850.37 | 100.0%
512 | 2,048 | 8x | 480.60 | 258.46 | 0.51 | 303.32 | 347.75 | 14,007.55 | 100.0%
512 | 2,048 | 16x | 1,026.05 | 535.49 | 0.30 | 485.76 | 530.63 | 12,998.07 | 100.0%
512 | 2,048 | 32x | 1,527.40 | 663.55 | 0.24 | 894.79 | 968.14 | 20,960.94 | 100.0%
512 | 2,048 | 64x | 2,646.11 | 1,119.46 | 0.17 | 1,595.68 | 1,982.21 | 25,318.38 | 100.0%
512 | 2,048 | 128x | 3,761.00 | 1,531.67 | 0.13 | 4,073.16 | 5,157.92 | 39,091.19 | 100.0%
512 | 2,048 | 256x | 5,495.34 | 2,349.42 | 0.10 | 15,633.74 | 17,043.38 | 51,384.96 | 100.0%
512 | 2,048 | 512x | 6,657.92 | 2,727.58 | 0.08 | 28,838.84 | 33,077.74 | 88,549.94 | 100.0%
512 | 2,048 | 1024x | 5,744.55 | 2,435.67 | 0.08 | 56,422.53 | 76,924.96 | 121,377.86 | 67.6%
1,024 | 128 | 1x | 120.41 | 917.22 | 0.29 | 147.73 | 147.73 | 1,062.05 | 100.0%
1,024 | 128 | 2x | 235.51 | 1,797.61 | 0.15 | 171.42 | 173.82 | 1,083.88 | 100.0%
1,024 | 128 | 4x | 440.62 | 3,364.03 | 0.10 | 241.87 | 247.35 | 1,158.38 | 100.0%
1,024 | 128 | 8x | 620.15 | 4,753.03 | 0.07 | 352.11 | 415.17 | 1,644.04 | 100.0%
1,024 | 128 | 16x | 1,064.45 | 8,165.80 | 0.05 | 596.16 | 667.06 | 1,919.21 | 100.0%
1,024 | 128 | 32x | 1,251.00 | 9,631.32 | 0.05 | 1,025.03 | 1,378.31 | 3,247.01 | 100.0%
1,024 | 128 | 64x | 1,966.62 | 15,216.26 | 0.03 | 1,636.98 | 2,047.48 | 3,960.91 | 100.0%
1,024 | 128 | 128x | 2,125.74 | 16,497.18 | 0.03 | 4,288.40 | 5,295.08 | 7,370.80 | 100.0%
1,024 | 128 | 256x | 1,558.91 | 12,227.15 | 0.03 | 15,708.79 | 17,452.30 | 20,224.54 | 100.0%
1,024 | 512 | 1x | 134.07 | 255.30 | 0.84 | 137.40 | 137.40 | 3,818.64 | 100.0%
1,024 | 512 | 2x | 270.97 | 517.07 | 0.41 | 103.91 | 106.74 | 3,768.28 | 100.0%
1,024 | 512 | 4x | 529.88 | 1,011.38 | 0.25 | 178.04 | 179.64 | 3,860.35 | 100.0%
1,024 | 512 | 8x | 720.45 | 1,379.78 | 0.19 | 371.64 | 420.71 | 5,670.60 | 100.0%
1,024 | 512 | 16x | 1,253.44 | 2,426.41 | 0.14 | 529.45 | 596.97 | 6,461.38 | 100.0%
1,024 | 512 | 32x | 2,169.00 | 4,195.30 | 0.09 | 925.21 | 1,123.77 | 7,470.39 | 100.0%
1,024 | 512 | 64x | 3,157.12 | 6,220.11 | 0.07 | 1,589.38 | 1,962.11 | 10,021.42 | 100.0%
1,024 | 512 | 128x | 4,171.09 | 8,185.27 | 0.06 | 4,384.64 | 5,389.68 | 15,171.94 | 100.0%
1,024 | 512 | 256x | 4,530.67 | 8,888.04 | 0.05 | 14,517.09 | 16,391.26 | 27,651.47 | 100.0%
1,024 | 512 | 512x | 4,798.65 | 9,451.39 | 0.04 | 27,802.10 | 31,532.03 | 46,435.57 | 100.0%
1,024 | 512 | 1024x | 4,165.11 | 8,326.96 | 0.05 | 55,657.95 | 65,893.23 | 80,264.89 | 78.7%
1,024 | 1,024 | 1x | 142.88 | 136.04 | 1.11 | 135.54 | 135.54 | 7,166.83 | 100.0%
1,024 | 1,024 | 2x | 205.29 | 258.29 | 0.69 | 162.82 | 164.78 | 7,383.72 | 100.0%
1,024 | 1,024 | 4x | 524.27 | 511.45 | 0.38 | 235.81 | 238.77 | 7,581.12 | 100.0%
1,024 | 1,024 | 8x | 809.08 | 928.85 | 0.25 | 344.19 | 405.21 | 8,389.74 | 100.0%
1,024 | 1,024 | 16x | 1,052.96 | 1,097.75 | 0.22 | 567.71 | 719.25 | 14,301.49 | 100.0%
1,024 | 1,024 | 32x | 2,051.10 | 2,149.74 | 0.14 | 928.06 | 1,092.74 | 14,596.95 | 100.0%
1,024 | 1,024 | 64x | 3,329.70 | 3,502.06 | 0.11 | 1,671.52 | 2,112.69 | 17,802.89 | 100.0%
1,024 | 1,024 | 128x | 4,659.12 | 4,919.23 | 0.08 | 4,096.33 | 5,201.22 | 25,327.47 | 100.0%
1,024 | 1,024 | 256x | 5,915.93 | 6,140.88 | 0.06 | 13,927.69 | 15,846.02 | 40,285.21 | 100.0%
1,024 | 1,024 | 512x | 6,705.48 | 6,983.05 | 0.05 | 25,274.97 | 30,634.48 | 65,183.42 | 100.0%
1,024 | 2,048 | 1x | 143.21 | 103.43 | 1.29 | 114.30 | 114.30 | 9,418.16 | 100.0%
1,024 | 2,048 | 2x | 277.42 | 210.36 | 0.70 | 152.25 | 152.90 | 9,277.09 | 100.0%
1,024 | 2,048 | 4x | 451.01 | 274.95 | 0.53 | 204.23 | 206.09 | 14,164.14 | 100.0%
1,024 | 2,048 | 8x | 825.70 | 638.28 | 0.31 | 337.98 | 409.19 | 12,180.72 | 100.0%
1,024 | 2,048 | 16x | 1,072.31 | 761.93 | 0.26 | 562.80 | 661.79 | 20,606.89 | 100.0%
1,024 | 2,048 | 32x | 1,615.04 | 1,196.17 | 0.19 | 1,039.98 | 1,293.23 | 26,203.05 | 100.0%
1,024 | 2,048 | 64x | 2,862.24 | 2,159.94 | 0.14 | 1,618.26 | 2,024.20 | 27,331.83 | 100.0%
1,024 | 2,048 | 128x | 4,342.44 | 3,372.21 | 0.10 | 4,378.97 | 5,483.98 | 34,930.26 | 100.0%
1,024 | 2,048 | 256x | 5,533.53 | 4,302.69 | 0.08 | 15,978.01 | 17,706.67 | 54,805.90 | 100.0%
1,024 | 2,048 | 512x | 6,554.26 | 5,138.70 | 0.06 | 30,800.51 | 36,045.82 | 90,565.54 | 100.0%
1,024 | 2,048 | 1024x | 7,038.70 | 5,563.11 | 0.06 | 68,518.77 | 104,970.44 | 149,591.78 | 92.8%
2,048 | 128 | 1x | 116.15 | 1,785.84 | 0.16 | 176.53 | 176.53 | 1,101.46 | 100.0%
2,048 | 128 | 2x | 227.35 | 3,480.46 | 0.09 | 202.69 | 202.72 | 1,121.77 | 100.0%
2,048 | 128 | 4x | 428.81 | 6,565.33 | 0.05 | 262.66 | 264.68 | 1,190.11 | 100.0%
2,048 | 128 | 8x | 677.25 | 10,351.85 | 0.04 | 385.84 | 467.03 | 1,504.84 | 100.0%
2,048 | 128 | 16x | 986.96 | 15,111.59 | 0.03 | 603.06 | 768.13 | 2,058.60 | 100.0%
2,048 | 128 | 32x | 1,171.05 | 17,898.17 | 0.03 | 1,253.93 | 1,614.51 | 3,485.03 | 100.0%
2,048 | 128 | 64x | 1,897.07 | 29,077.83 | 0.02 | 1,676.88 | 2,118.52 | 4,227.04 | 100.0%
2,048 | 128 | 128x | 1,990.43 | 30,699.04 | 0.02 | 4,278.82 | 5,568.20 | 8,010.61 | 100.0%
2,048 | 128 | 256x | 1,478.22 | 23,130.70 | 0.02 | 14,712.43 | 17,271.20 | 20,999.90 | 100.0%
2,048 | 512 | 1x | 137.41 | 528.18 | 0.46 | 170.47 | 170.47 | 3,724.57 | 100.0%
2,048 | 512 | 2x | 268.48 | 1,027.53 | 0.25 | 149.77 | 200.03 | 3,808.80 | 100.0%
2,048 | 512 | 4x | 504.44 | 1,934.60 | 0.16 | 240.78 | 270.28 | 4,040.70 | 100.0%
2,048 | 512 | 8x | 875.14 | 3,410.76 | 0.10 | 276.51 | 289.60 | 4,582.74 | 100.0%
2,048 | 512 | 16x | 1,341.71 | 5,538.42 | 0.07 | 483.74 | 575.13 | 5,636.16 | 100.0%
2,048 | 512 | 32x | 2,031.57 | 8,062.36 | 0.06 | 996.23 | 1,166.91 | 7,742.34 | 100.0%
2,048 | 512 | 64x | 2,973.94 | 11,648.45 | 0.05 | 1,724.61 | 2,279.53 | 10,664.51 | 100.0%
2,048 | 512 | 128x | 3,773.37 | 14,697.04 | 0.04 | 5,125.66 | 6,118.25 | 16,809.62 | 100.0%
2,048 | 512 | 256x | 4,269.79 | 16,892.61 | 0.03 | 14,551.38 | 16,682.42 | 29,144.83 | 100.0%
2,048 | 512 | 512x | 4,284.82 | 16,885.13 | 0.03 | 28,654.95 | 33,831.52 | 50,832.77 | 100.0%
2,048 | 512 | 1024x | 5,010.86 | 19,754.35 | 0.03 | 51,550.45 | 66,020.76 | 83,404.04 | 99.3%
2,048 | 1,024 | 1x | 132.36 | 350.12 | 0.64 | 171.29 | 171.29 | 5,612.67 | 100.0%
2,048 | 1,024 | 2x | 201.29 | 485.14 | 0.46 | 286.93 | 339.32 | 7,929.48 | 100.0%
2,048 | 1,024 | 4x | 365.07 | 993.66 | 0.26 | 227.12 | 271.46 | 7,870.78 | 100.0%
2,048 | 1,024 | 8x | 607.91 | 1,854.06 | 0.17 | 410.84 | 507.62 | 7,997.20 | 100.0%
2,048 | 1,024 | 16x | 969.53 | 2,655.21 | 0.13 | 684.48 | 887.78 | 11,574.09 | 100.0%
2,048 | 1,024 | 32x | 1,838.98 | 4,203.41 | 0.09 | 1,235.60 | 1,569.60 | 14,859.29 | 100.0%
2,048 | 1,024 | 64x | 2,635.55 | 5,895.28 | 0.07 | 1,805.72 | 2,676.20 | 21,160.98 | 100.0%
2,048 | 1,024 | 128x | 4,317.60 | 9,196.18 | 0.06 | 4,588.54 | 5,753.50 | 26,931.35 | 100.0%
2,048 | 1,024 | 256x | 5,412.73 | 11,663.65 | 0.04 | 14,586.03 | 16,318.73 | 41,571.09 | 100.0%
2,048 | 1,024 | 512x | 6,106.24 | 13,150.44 | 0.04 | 28,453.73 | 32,802.86 | 69,524.37 | 100.0%
2,048 | 1,024 | 1024x | 6,979.14 | 15,088.30 | 0.04 | 51,309.95 | 77,584.28 | 111,507.41 | 100.0%
2,048 | 2,048 | 1x | 131.34 | 391.64 | 0.60 | 174.16 | 174.16 | 5,016.19 | 100.0%
2,048 | 2,048 | 2x | 249.50 | 878.50 | 0.30 | 208.02 | 209.28 | 4,422.84 | 100.0%
2,048 | 2,048 | 4x | 412.35 | 1,071.05 | 0.26 | 294.36 | 368.50 | 7,300.46 | 100.0%
2,048 | 2,048 | 8x | 616.33 | 1,753.33 | 0.18 | 436.72 | 592.46 | 8,534.30 | 100.0%
2,048 | 2,048 | 16x | 1,044.74 | 2,451.10 | 0.14 | 620.01 | 771.81 | 12,028.82 | 100.0%
2,048 | 2,048 | 32x | 1,631.37 | 3,033.79 | 0.12 | 1,201.28 | 1,395.16 | 18,825.83 | 100.0%
2,048 | 2,048 | 64x | 2,371.71 | 4,126.99 | 0.10 | 1,797.57 | 2,371.49 | 26,999.84 | 100.0%
2,048 | 2,048 | 128x | 3,694.75 | 6,039.81 | 0.07 | 4,382.02 | 5,457.36 | 36,439.70 | 100.0%
2,048 | 2,048 | 256x | 5,160.21 | 8,538.17 | 0.06 | 14,468.52 | 16,590.19 | 56,393.14 | 100.0%
2,048 | 2,048 | 512x | 6,284.51 | 10,354.29 | 0.05 | 28,222.37 | 32,373.63 | 90,943.42 | 100.0%
2,048 | 2,048 | 1024x | 6,549.34 | 10,651.70 | 0.04 | 67,147.69 | 103,330.82 | 146,263.04 | 89.2%
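
One way to read the concurrency sweep is to compare each run's output throughput with ideal linear scaling from the 1x run. The sketch below does this for the 128-input / 128-output rows of the table; `scaling_efficiency` and `peak_concurrency` are hypothetical helper names, and "efficiency" here is simply Output TPS divided by (concurrency × 1x Output TPS), which is one possible definition rather than a metric the report itself defines.

```python
# (concurrency, Output TPS) pairs for the 128-in / 128-out rows of the table.
ROWS_128_128 = [
    (1, 100.95),
    (2, 196.77),
    (4, 245.68),
    (8, 416.77),
    (16, 638.24),
    (32, 1254.47),
    (64, 1471.04),
    (128, 1776.90),
    (256, 1512.99),
]


def scaling_efficiency(rows):
    """Return {concurrency: fraction of ideal linear scaling from the 1x run}."""
    baseline = dict(rows)[1]  # Output TPS at concurrency 1
    return {c: tps / (c * baseline) for c, tps in rows}


def peak_concurrency(rows):
    """Return the concurrency level with the highest Output TPS."""
    return max(rows, key=lambda row: row[1])[0]


if __name__ == "__main__":
    for c, eff in scaling_efficiency(ROWS_128_128).items():
        print(f"{c:>4}x  efficiency {eff:.3f}")
    print("peak at", f"{peak_concurrency(ROWS_128_128)}x")
```

For this workload, throughput peaks at 128x and drops at 256x, while TTFT and E2E latency keep rising, so the highest concurrency is not the most useful operating point.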

Hardware Configuration

GPU Manufacturer: NVIDIA
GPU Model: NVIDIA H20
GPU Count: 8
GPU Memory (Total): 760 GB
GPU Driver: 570.195.03
CUDA Version: Unknown
Compute Capability: 9.0
Power Limit (per GPU): 500 W
CPU Model: Intel(R) Xeon(R) Platinum 8469C
RAM: 1,007 GB

Software Configuration

Inference Framework: vLLM
Framework Version: v0.11.0
OS: Ubuntu
OS Version: 24.04.3 LTS (Noble Numbat)
Kernel Version: 6.8.0-79-generic
Python Version: 3.12.3

Model Configuration

Provider: qwen
Model Name: qwen3-coder-30b-a3b-instruct
Quantization: BF16

Inference Configuration

Runtime parameters used across all benchmark runs

Max Model Length: 253600
Tensor Parallel Size: 1
Pipeline Parallel Size: 1
GPU Memory Utilization: 85.00
Temperature: 0.70
Top-P: 1.00
Top-K: -1
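
The report does not show the launch command, but the parameters above correspond to standard vLLM engine and sampling options. As a rough sketch only, the snippet below builds a `vllm serve` argument list and a set of sampling fields from those values. Several things here are assumptions: the Hugging Face model ID `Qwen/Qwen3-Coder-30B-A3B-Instruct`, the reading of "85.00" as a percentage (vLLM expects a fraction such as 0.85), and the helper names `serve_command` and `sampling_fields`.

```python
# Values copied from the "Inference Configuration" section of the report.
CONFIG = {
    "max_model_len": 253600,
    "tensor_parallel_size": 1,
    "pipeline_parallel_size": 1,
    "gpu_memory_utilization": 85.00,  # as reported; assumed to be a percentage
    "temperature": 0.70,
    "top_p": 1.00,
    "top_k": -1,  # in vLLM, -1 disables top-k filtering
}

# Assumption: the report's model name maps to this Hugging Face repository.
MODEL_ID = "Qwen/Qwen3-Coder-30B-A3B-Instruct"


def serve_command(model_id, cfg):
    """Build a `vllm serve` argument list from the reported engine parameters."""
    utilization = cfg["gpu_memory_utilization"]
    if utilization > 1:  # convert a percentage to the fraction vLLM expects
        utilization /= 100
    return [
        "vllm", "serve", model_id,
        "--max-model-len", str(cfg["max_model_len"]),
        "--tensor-parallel-size", str(cfg["tensor_parallel_size"]),
        "--pipeline-parallel-size", str(cfg["pipeline_parallel_size"]),
        "--gpu-memory-utilization", f"{utilization:.2f}",
    ]


def sampling_fields(cfg):
    """Sampling settings to merge into each request body sent to the server."""
    return {key: cfg[key] for key in ("temperature", "top_p", "top_k")}


if __name__ == "__main__":
    print(" ".join(serve_command(MODEL_ID, CONFIG)))
    print(sampling_fields(CONFIG))
```

With tensor and pipeline parallel size both set to 1, a single engine instance uses one GPU; how the benchmark distributed load across the eight GPUs is not stated in the report.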