NVIDIA H200 NVL (2x) - qwen3-coder-30b-a3b-instruct

November 5, 2025 at 03:39 AM

Dataset: reference (v1.0)

Best Performance

Best Output TPS: 5,757.76 (peak generation speed)
Best Input TPS: 43,900.39 (peak prefill speed)
Best Energy Efficiency: 0.01 kWh/MT (energy cost per 1M tokens)
Best TTFT (P95): 82.49 ms (lowest latency)
Best E2E (P95): 954.88 ms (lowest latency)
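
The energy figure is given in kWh per million tokens (kWh/MT). The report does not state how it is measured, or whether input tokens are counted alongside output tokens. As a rough sketch only, assuming the metric is average power draw divided by token throughput, the unit conversion looks like this. The function name and the 600 W example input are hypothetical and chosen for illustration; they are not measured values from this benchmark.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 million joules


def kwh_per_million_tokens(avg_power_watts: float, tokens_per_second: float) -> float:
    """Energy per 1M tokens, assuming constant average power and throughput."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    joules_per_token = avg_power_watts / tokens_per_second
    return joules_per_token * 1_000_000 / JOULES_PER_KWH


# Hypothetical input: 600 W average draw at the best reported output throughput.
example = kwh_per_million_tokens(600.0, 5757.76)
print(f"{example:.2f} kWh/MT")
```

Because the reported values are rounded to two decimals, a wide range of power draws maps to the same kWh/MT figure, so this relation cannot be used to recover the actual power readings.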

Test Matrix Results

Performance across different input/output token combinations and concurrency levels

Input Tokens | Output Tokens | Concurrency | Output TPS | Input TPS | Energy Cost (kWh/MT) | TTFT Mean (ms) | TTFT P95 (ms) | E2E P95 (ms) | Success Rate
Best Run for Output TPS
128 | 512 | 128x | 5,757.76 | 1,486.57 | 0.03 | 493.77 | 629.47 | 10,701.96 | 100.0%
128 | 128 | 1x | 134.03 | 127.75 | 0.21 | 128.81 | 128.81 | 954.88 | 100.0%
128 | 128 | 2x | 208.13 | 203.25 | 0.20 | 253.90 | 255.45 | 1,225.81 | 100.0%
128 | 128 | 4x | 192.99 | 191.48 | 0.19 | 946.99 | 1,479.46 | 2,643.11 | 100.0%
128 | 512 | 1x | 143.66 | 34.23 | 0.43 | 82.49 | 82.49 | 3,564.10 | 100.0%
128 | 512 | 2x | 164.95 | 40.27 | 0.39 | 1,471.82 | 2,713.41 | 6,070.86 | 100.0%
128 | 512 | 4x | 326.22 | 82.70 | 0.25 | 831.36 | 1,521.99 | 6,139.30 | 100.0%
128 | 512 | 8x | 750.30 | 206.14 | 0.14 | 279.12 | 317.20 | 4,966.96 | 100.0%
128 | 512 | 16x | 1,166.52 | 297.95 | 0.11 | 326.97 | 390.07 | 6,892.98 | 100.0%
128 | 512 | 32x | 1,928.43 | 490.14 | 0.07 | 383.48 | 434.50 | 8,329.05 | 100.0%
128 | 512 | 64x | 3,335.62 | 858.95 | 0.04 | 449.03 | 581.70 | 9,305.04 | 100.0%
128 | 512 | 256x | 5,202.77 | 1,441.04 | 0.03 | 4,105.99 | 8,659.65 | 18,949.19 | 100.0%
128 | 1,024 | 1x | 142.64 | 16.99 | 0.54 | 102.17 | 102.17 | 7,175.82 | 100.0%
128 | 1,024 | 2x | 190.19 | 34.06 | 0.41 | 98.81 | 102.14 | 7,097.67 | 100.0%
128 | 1,024 | 4x | 434.45 | 61.27 | 0.24 | 143.88 | 149.21 | 8,281.27 | 100.0%
128 | 1,024 | 8x | 695.95 | 111.68 | 0.17 | 198.39 | 214.49 | 9,187.55 | 100.0%
128 | 1,024 | 16x | 982.33 | 148.77 | 0.15 | 260.38 | 301.03 | 13,825.84 | 100.0%
128 | 1,024 | 32x | 1,884.21 | 276.96 | 0.09 | 351.29 | 383.78 | 14,707.82 | 100.0%
128 | 1,024 | 64x | 3,198.38 | 454.90 | 0.06 | 404.84 | 478.11 | 17,704.76 | 100.0%
128 | 1,024 | 128x | 4,061.97 | 594.61 | 0.04 | 625.57 | 578.05 | 23,212.89 | 100.0%
128 | 1,024 | 256x | 4,502.16 | 679.27 | 0.04 | 8,803.65 | 20,684.63 | 44,669.17 | 100.0%
128 | 1,024 | 512x | 4,241.25 | 639.38 | 0.04 | 28,303.67 | 66,828.25 | 88,046.56 | 100.0%
128 | 2,048 | 1x | 148.89 | 8.87 | 0.55 | 105.09 | 105.09 | 13,752.45 | 100.0%
128 | 2,048 | 2x | 169.85 | 16.74 | 0.46 | 157.95 | 164.67 | 14,380.97 | 100.0%
128 | 2,048 | 4x | 328.57 | 36.70 | 0.30 | 159.86 | 185.03 | 13,478.03 | 100.0%
128 | 2,048 | 8x | 428.94 | 52.39 | 0.24 | 213.36 | 254.03 | 18,518.08 | 100.0%
128 | 2,048 | 16x | 759.77 | 92.93 | 0.17 | 245.32 | 272.23 | 19,600.19 | 100.0%
128 | 2,048 | 32x | 1,328.57 | 157.19 | 0.11 | 324.03 | 380.37 | 24,952.58 | 100.0%
128 | 2,048 | 64x | 2,321.36 | 252.98 | 0.08 | 371.72 | 428.51 | 31,967.57 | 100.0%
128 | 2,048 | 128x | 3,879.42 | 446.61 | 0.05 | 358.25 | 517.65 | 34,083.30 | 100.0%
128 | 2,048 | 256x | 4,467.94 | 512.55 | 0.04 | 9,849.28 | 28,187.11 | 54,516.44 | 100.0%
128 | 2,048 | 512x | 4,581.34 | 530.48 | 0.04 | 31,182.19 | 72,187.27 | 96,798.13 | 100.0%
128 | 2,048 | 1024x | 5,149.59 | 596.71 | 0.04 | 53,646.79 | 115,093.20 | 142,290.16 | 76.9%
512 | 128 | 1x | 117.86 | 456.72 | 0.11 | 218.04 | 218.04 | 1,086.25 | 100.0%
512 | 128 | 2x | 195.57 | 760.12 | 0.09 | 286.50 | 312.26 | 1,302.18 | 100.0%
512 | 128 | 4x | 354.08 | 1,367.91 | 0.06 | 227.99 | 278.39 | 1,437.97 | 100.0%
512 | 128 | 8x | 552.02 | 2,129.38 | 0.04 | 281.17 | 343.13 | 1,847.20 | 100.0%
512 | 128 | 16x | 941.18 | 3,647.06 | 0.03 | 390.21 | 452.74 | 2,153.83 | 100.0%
512 | 128 | 32x | 1,682.30 | 6,568.31 | 0.02 | 405.57 | 497.65 | 2,377.00 | 100.0%
512 | 128 | 64x | 2,762.20 | 10,796.07 | 0.01 | 442.43 | 561.71 | 2,839.28 | 100.0%
512 | 128 | 128x | 3,294.45 | 12,891.05 | 0.01 | 654.45 | 2,023.02 | 3,987.95 | 100.0%
512 | 128 | 256x | 3,888.82 | 16,613.34 | 0.01 | 1,300.91 | 2,578.64 | 5,434.03 | 100.0%
512 | 128 | 512x | 4,367.02 | 18,962.24 | 0.01 | 2,150.82 | 3,738.28 | 6,712.89 | 100.0%
512 | 128 | 1024x | 4,088.09 | 17,959.38 | 0.01 | 4,767.59 | 11,615.30 | 14,530.24 | 100.0%
512 | 512 | 1x | 143.06 | 139.40 | 0.29 | 91.22 | 91.22 | 3,557.98 | 100.0%
512 | 512 | 2x | 288.53 | 280.36 | 0.18 | 101.45 | 106.53 | 3,542.75 | 100.0%
512 | 512 | 4x | 501.72 | 485.28 | 0.12 | 141.98 | 152.62 | 4,066.35 | 100.0%
512 | 512 | 8x | 646.54 | 624.41 | 0.12 | 193.70 | 223.58 | 6,312.98 | 100.0%
512 | 512 | 16x | 1,313.16 | 1,301.84 | 0.06 | 246.36 | 262.91 | 6,079.61 | 100.0%
512 | 512 | 32x | 1,827.98 | 1,819.54 | 0.05 | 306.18 | 336.13 | 8,696.65 | 100.0%
512 | 512 | 64x | 3,344.78 | 3,286.92 | 0.03 | 390.92 | 474.77 | 9,467.44 | 100.0%
512 | 512 | 128x | 3,882.78 | 3,858.17 | 0.02 | 843.13 | 687.85 | 10,671.52 | 100.0%
512 | 512 | 256x | 4,807.37 | 4,894.96 | 0.02 | 4,561.61 | 9,356.02 | 19,674.74 | 100.0%
512 | 512 | 512x | 5,171.42 | 5,397.51 | 0.02 | 12,466.39 | 25,082.78 | 34,806.35 | 100.0%
512 | 512 | 1024x | 5,633.48 | 5,767.31 | 0.02 | 28,939.52 | 59,242.97 | 69,382.11 | 100.0%
512 | 1,024 | 1x | 134.46 | 73.05 | 0.41 | 201.43 | 201.43 | 6,780.82 | 100.0%
512 | 1,024 | 2x | 255.59 | 127.22 | 0.25 | 243.09 | 250.56 | 7,804.94 | 100.0%
512 | 1,024 | 4x | 416.57 | 258.87 | 0.17 | 206.64 | 285.68 | 7,563.01 | 100.0%
512 | 1,024 | 8x | 599.56 | 323.27 | 0.12 | 329.48 | 359.08 | 12,197.99 | 100.0%
512 | 1,024 | 16x | 1,071.83 | 632.00 | 0.10 | 313.57 | 387.26 | 12,521.67 | 100.0%
512 | 1,024 | 32x | 1,725.19 | 966.98 | 0.07 | 440.19 | 556.25 | 16,447.79 | 100.0%
512 | 1,024 | 64x | 3,225.17 | 1,725.87 | 0.04 | 414.15 | 476.47 | 18,298.14 | 100.0%
512 | 1,024 | 128x | 3,260.34 | 1,760.74 | 0.04 | 2,375.87 | 14,069.77 | 33,771.17 | 100.0%
512 | 1,024 | 256x | 4,120.49 | 2,280.24 | 0.03 | 10,776.55 | 23,791.45 | 49,423.24 | 100.0%
512 | 1,024 | 512x | 4,397.00 | 2,422.95 | 0.03 | 30,739.23 | 70,046.52 | 93,392.58 | 100.0%
512 | 1,024 | 1024x | 4,734.18 | 2,632.87 | 0.03 | 56,191.92 | 117,581.96 | 137,453.52 | 81.7%
512 | 2,048 | 1x | 132.52 | 32.30 | 0.51 | 221.85 | 221.85 | 15,354.89 | 100.0%
512 | 2,048 | 2x | 239.46 | 58.53 | 0.32 | 206.10 | 255.97 | 16,994.49 | 100.0%
512 | 2,048 | 4x | 286.12 | 135.68 | 0.24 | 178.19 | 239.32 | 13,353.54 | 100.0%
512 | 2,048 | 8x | 526.66 | 266.64 | 0.16 | 300.76 | 364.68 | 14,611.83 | 100.0%
512 | 2,048 | 16x | 719.68 | 364.22 | 0.13 | 316.35 | 377.35 | 18,471.23 | 100.0%
512 | 2,048 | 32x | 1,354.75 | 619.97 | 0.09 | 362.22 | 434.45 | 25,618.40 | 100.0%
512 | 2,048 | 64x | 2,243.39 | 947.10 | 0.06 | 417.80 | 550.07 | 33,370.05 | 100.0%
512 | 2,048 | 128x | 3,951.98 | 1,639.73 | 0.04 | 669.49 | 695.14 | 38,446.76 | 100.0%
512 | 2,048 | 256x | 4,494.84 | 1,897.63 | 0.04 | 11,854.45 | 32,588.18 | 59,365.29 | 100.0%
512 | 2,048 | 512x | 4,909.46 | 2,028.85 | 0.03 | 34,662.41 | 78,679.71 | 105,345.13 | 100.0%
512 | 2,048 | 1024x | 5,160.96 | 2,172.26 | 0.03 | 53,636.04 | 113,801.72 | 140,583.88 | 69.9%
1,024 | 128 | 1x | 115.52 | 879.96 | 0.07 | 209.67 | 209.67 | 1,107.75 | 100.0%
1,024 | 128 | 2x | 226.35 | 1,727.67 | 0.05 | 161.52 | 219.62 | 1,114.26 | 100.0%
1,024 | 128 | 4x | 366.76 | 2,800.14 | 0.03 | 263.01 | 308.72 | 1,391.45 | 100.0%
1,024 | 128 | 8x | 532.54 | 4,077.56 | 0.02 | 277.53 | 348.78 | 1,909.30 | 100.0%
1,024 | 128 | 16x | 922.52 | 7,077.03 | 0.02 | 372.12 | 462.47 | 2,178.61 | 100.0%
1,024 | 128 | 32x | 1,540.46 | 11,828.00 | 0.01 | 495.76 | 577.76 | 2,595.46 | 100.0%
1,024 | 128 | 64x | 2,428.78 | 18,704.73 | 0.01 | 584.78 | 875.54 | 3,235.33 | 100.0%
1,024 | 128 | 128x | 3,099.77 | 23,911.82 | 0.01 | 731.42 | 1,015.32 | 3,829.35 | 100.0%
1,024 | 128 | 256x | 3,577.14 | 29,234.08 | 0.01 | 1,852.57 | 3,353.46 | 6,423.26 | 100.0%
1,024 | 128 | 512x | 3,557.06 | 30,096.54 | 0.01 | 3,884.77 | 7,795.67 | 10,874.68 | 100.0%
1,024 | 512 | 1x | 146.12 | 278.25 | 0.20 | 107.98 | 107.98 | 3,503.16 | 100.0%
1,024 | 512 | 2x | 256.26 | 488.99 | 0.12 | 158.32 | 160.06 | 3,992.28 | 100.0%
1,024 | 512 | 4x | 426.49 | 814.04 | 0.10 | 155.27 | 178.93 | 4,794.60 | 100.0%
1,024 | 512 | 8x | 833.93 | 1,644.55 | 0.06 | 211.06 | 218.86 | 4,745.31 | 100.0%
1,024 | 512 | 16x | 1,141.26 | 2,253.12 | 0.05 | 252.99 | 283.94 | 6,960.24 | 100.0%
1,024 | 512 | 32x | 1,880.75 | 3,634.86 | 0.03 | 316.00 | 372.47 | 8,574.51 | 100.0%
1,024 | 512 | 64x | 3,198.99 | 6,224.42 | 0.02 | 400.03 | 507.93 | 9,908.29 | 100.0%
1,024 | 512 | 128x | 4,101.37 | 7,986.80 | 0.02 | 718.23 | 666.41 | 11,139.81 | 100.0%
1,024 | 512 | 256x | 4,462.08 | 8,993.17 | 0.02 | 4,993.42 | 10,084.27 | 21,066.47 | 100.0%
1,024 | 512 | 512x | 5,487.26 | 11,172.17 | 0.01 | 12,874.99 | 25,033.51 | 36,201.67 | 100.0%
1,024 | 512 | 1024x | 4,756.82 | 9,676.73 | 0.02 | 34,935.52 | 71,192.37 | 82,059.91 | 100.0%
1,024 | 1,024 | 1x | 129.49 | 123.65 | 0.34 | 1,200.15 | 1,200.15 | 7,876.16 | 100.0%
1,024 | 1,024 | 2x | 240.92 | 258.95 | 0.18 | 202.15 | 249.25 | 7,468.33 | 100.0%
1,024 | 1,024 | 4x | 443.48 | 423.24 | 0.14 | 274.50 | 329.10 | 9,226.33 | 100.0%
1,024 | 1,024 | 8x | 641.59 | 659.79 | 0.12 | 328.28 | 407.05 | 11,850.99 | 100.0%
1,024 | 1,024 | 16x | 1,057.41 | 1,056.20 | 0.08 | 434.12 | 542.94 | 14,844.82 | 100.0%
1,024 | 1,024 | 32x | 1,928.26 | 1,989.81 | 0.05 | 459.25 | 554.10 | 15,723.36 | 100.0%
1,024 | 1,024 | 64x | 2,925.82 | 3,008.47 | 0.04 | 541.50 | 828.76 | 20,751.47 | 100.0%
1,024 | 1,024 | 128x | 3,947.76 | 4,191.90 | 0.03 | 805.59 | 1,023.92 | 26,830.26 | 100.0%
1,024 | 1,024 | 256x | 4,056.08 | 4,219.29 | 0.03 | 12,278.30 | 31,908.89 | 51,847.36 | 100.0%
1,024 | 1,024 | 512x | 4,166.94 | 4,427.80 | 0.02 | 33,657.27 | 79,396.81 | 104,335.86 | 100.0%
1,024 | 1,024 | 1024x | 4,499.67 | 4,825.65 | 0.02 | 54,729.56 | 116,126.51 | 138,196.75 | 75.4%
1,024 | 2,048 | 1x | 147.00 | 101.80 | 0.35 | 219.57 | 219.57 | 9,568.67 | 100.0%
1,024 | 2,048 | 2x | 235.69 | 188.66 | 0.22 | 203.13 | 251.07 | 10,220.65 | 100.0%
1,024 | 2,048 | 4x | 370.87 | 263.25 | 0.18 | 245.16 | 302.54 | 14,109.17 | 100.0%
1,024 | 2,048 | 8x | 627.12 | 467.47 | 0.13 | 346.81 | 381.67 | 16,367.48 | 100.0%
1,024 | 2,048 | 16x | 894.99 | 639.70 | 0.12 | 333.27 | 414.13 | 24,506.67 | 100.0%
1,024 | 2,048 | 32x | 1,572.60 | 1,151.17 | 0.07 | 516.14 | 575.65 | 26,233.92 | 100.0%
1,024 | 2,048 | 64x | 2,356.83 | 1,821.46 | 0.05 | 574.82 | 817.59 | 34,204.67 | 100.0%
1,024 | 2,048 | 128x | 3,560.93 | 2,785.26 | 0.03 | 823.96 | 936.64 | 44,479.34 | 100.0%
1,024 | 2,048 | 256x | 3,945.15 | 3,058.42 | 0.03 | 15,213.74 | 41,207.17 | 76,537.63 | 100.0%
1,024 | 2,048 | 512x | 4,226.18 | 3,320.79 | 0.03 | 45,944.78 | 102,654.18 | 135,304.97 | 100.0%
1,024 | 2,048 | 1024x | 4,711.35 | 3,741.15 | 0.03 | 59,438.33 | 120,170.37 | 148,630.57 | 64.5%
2,048 | 128 | 1x | 121.90 | 1,874.29 | 0.03 | 227.34 | 227.34 | 1,049.48 | 100.0%
2,048 | 128 | 2x | 219.93 | 3,366.84 | 0.02 | 199.02 | 242.67 | 1,159.23 | 100.0%
2,048 | 128 | 4x | 384.67 | 5,889.56 | 0.02 | 246.51 | 320.70 | 1,323.11 | 100.0%
2,048 | 128 | 8x | 538.38 | 8,229.23 | 0.01 | 352.99 | 402.96 | 1,891.24 | 100.0%
2,048 | 128 | 16x | 853.51 | 13,055.51 | 0.01 | 435.60 | 589.71 | 2,375.97 | 100.0%
2,048 | 128 | 32x | 1,412.01 | 21,591.44 | 0.01 | 609.36 | 833.07 | 2,855.50 | 100.0%
2,048 | 128 | 64x | 1,971.30 | 30,185.96 | 0.01 | 781.25 | 1,455.89 | 3,978.74 | 100.0%
2,048 | 128 | 128x | 2,845.14 | 43,900.39 | 0.01 | 1,049.35 | 1,939.87 | 4,932.09 | 100.0%
2,048 | 128 | 256x | 2,776.98 | 43,837.99 | 0.01 | 2,797.10 | 7,584.13 | 9,969.20 | 100.0%
2,048 | 512 | 1x | 140.81 | 541.25 | 0.13 | 101.89 | 101.89 | 3,635.23 | 100.0%
2,048 | 512 | 2x | 240.60 | 920.82 | 0.08 | 166.90 | 172.95 | 4,249.51 | 100.0%
2,048 | 512 | 4x | 391.56 | 1,778.36 | 0.06 | 154.58 | 175.39 | 4,398.58 | 100.0%
2,048 | 512 | 8x | 580.35 | 2,377.28 | 0.05 | 223.80 | 250.48 | 6,558.00 | 100.0%
2,048 | 512 | 16x | 1,111.55 | 4,715.25 | 0.03 | 281.96 | 348.12 | 6,608.77 | 100.0%
2,048 | 512 | 32x | 1,834.33 | 7,075.88 | 0.02 | 329.85 | 364.80 | 8,783.25 | 100.0%
2,048 | 512 | 64x | 3,109.99 | 12,214.62 | 0.01 | 467.53 | 532.62 | 10,095.06 | 100.0%
2,048 | 512 | 128x | 3,667.43 | 14,417.69 | 0.01 | 984.19 | 846.39 | 12,271.93 | 100.0%
2,048 | 512 | 256x | 4,534.58 | 18,067.49 | 0.01 | 5,885.59 | 11,776.62 | 23,670.08 | 100.0%
2,048 | 512 | 512x | 4,150.05 | 17,041.99 | 0.01 | 16,858.51 | 35,139.45 | 47,429.90 | 100.0%
2,048 | 1,024 | 1x | 145.81 | 534.35 | 0.12 | 128.40 | 128.40 | 3,674.99 | 100.0%
2,048 | 1,024 | 2x | 225.07 | 557.55 | 0.13 | 207.25 | 209.18 | 6,872.23 | 100.0%
2,048 | 1,024 | 4x | 396.93 | 938.91 | 0.09 | 187.26 | 195.84 | 8,143.50 | 100.0%
2,048 | 1,024 | 8x | 567.84 | 1,498.80 | 0.07 | 270.36 | 325.08 | 10,000.47 | 100.0%
2,048 | 1,024 | 16x | 930.16 | 2,398.11 | 0.05 | 314.82 | 370.90 | 13,017.94 | 100.0%
2,048 | 1,024 | 32x | 1,558.39 | 3,697.67 | 0.04 | 533.54 | 756.79 | 16,838.96 | 100.0%
2,048 | 1,024 | 64x | 2,925.95 | 6,485.18 | 0.02 | 601.40 | 856.13 | 19,158.24 | 100.0%
2,048 | 1,024 | 128x | 4,031.90 | 8,539.85 | 0.02 | 1,202.36 | 1,986.87 | 25,752.55 | 100.0%
2,048 | 1,024 | 256x | 4,497.71 | 9,635.05 | 0.02 | 11,806.88 | 26,556.99 | 50,803.71 | 100.0%
2,048 | 1,024 | 512x | 4,474.32 | 9,696.82 | 0.02 | 31,981.77 | 71,061.69 | 92,574.29 | 100.0%
2,048 | 2,048 | 1x | 145.00 | 263.49 | 0.21 | 196.13 | 196.13 | 7,458.81 | 100.0%
2,048 | 2,048 | 2x | 239.08 | 679.44 | 0.11 | 161.35 | 202.77 | 5,672.02 | 100.0%
2,048 | 2,048 | 4x | 399.09 | 831.81 | 0.10 | 270.05 | 298.14 | 9,236.76 | 100.0%
2,048 | 2,048 | 8x | 467.93 | 1,131.99 | 0.08 | 332.57 | 434.22 | 12,925.77 | 100.0%
2,048 | 2,048 | 16x | 688.34 | 1,478.03 | 0.07 | 429.83 | 562.90 | 17,377.08 | 100.0%
2,048 | 2,048 | 32x | 1,306.11 | 2,354.81 | 0.05 | 601.21 | 886.41 | 25,161.69 | 100.0%
2,048 | 2,048 | 64x | 2,235.08 | 3,862.45 | 0.04 | 684.83 | 1,176.03 | 31,183.75 | 100.0%
2,048 | 2,048 | 128x | 3,457.25 | 5,554.89 | 0.03 | 1,112.83 | 1,760.95 | 41,830.10 | 100.0%
2,048 | 2,048 | 256x | 3,886.37 | 6,398.25 | 0.02 | 13,867.01 | 37,696.61 | 68,750.12 | 100.0%
2,048 | 2,048 | 512x | 3,851.55 | 6,393.74 | 0.02 | 41,952.86 | 100,011.70 | 129,410.34 | 100.0%
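
The Best Performance figures can be cross-checked against the matrix by taking the maximum or minimum of each column. The sketch below does this for a handful of rows copied from the table. The `Run` class, the `best_run` helper, and the choice to consider only runs with a 100% success rate are illustrative assumptions, not part of the benchmark tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    input_tokens: int
    output_tokens: int
    concurrency: int
    output_tps: float
    input_tps: float
    energy_kwh_per_mt: float
    ttft_mean_ms: float
    ttft_p95_ms: float
    e2e_p95_ms: float
    success_rate: float


# A small subset of rows from the results table (not the full matrix).
RUNS = [
    Run(128, 512, 128, 5757.76, 1486.57, 0.03, 493.77, 629.47, 10701.96, 100.0),
    Run(128, 128, 1, 134.03, 127.75, 0.21, 128.81, 128.81, 954.88, 100.0),
    Run(128, 512, 1, 143.66, 34.23, 0.43, 82.49, 82.49, 3564.10, 100.0),
    Run(128, 2048, 1024, 5149.59, 596.71, 0.04, 53646.79, 115093.20, 142290.16, 76.9),
    Run(512, 512, 1024, 5633.48, 5767.31, 0.02, 28939.52, 59242.97, 69382.11, 100.0),
    Run(2048, 128, 128, 2845.14, 43900.39, 0.01, 1049.35, 1939.87, 4932.09, 100.0),
]


def best_run(runs, metric, lowest=False, min_success=100.0):
    """Return the run with the best value of `metric`.

    Assumption: runs below `min_success` percent are excluded from the ranking.
    """
    eligible = [r for r in runs if r.success_rate >= min_success]
    pick = min if lowest else max
    return pick(eligible, key=lambda r: getattr(r, metric))


for metric, lowest in [
    ("output_tps", False),
    ("input_tps", False),
    ("ttft_p95_ms", True),
    ("e2e_p95_ms", True),
]:
    r = best_run(RUNS, metric, lowest=lowest)
    print(metric, getattr(r, metric), f"({r.input_tokens}/{r.output_tokens} @ {r.concurrency}x)")
```

On this subset the selected runs match the headline numbers: the highest output throughput comes from 128 input / 512 output tokens at 128x concurrency, while the lowest latencies come from single-request runs.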

Hardware Configuration

GPU Manufacturer: NVIDIA
GPU Model: NVIDIA H200 NVL
GPU Count: 2
GPU Memory (Total): 280 GB
GPU Driver: 580.95.05
CUDA Version: Unknown
Compute Capability: 9.0
Power Limit (per GPU): 600 W
CPU Model: Intel(R) Xeon(R) 6960P
RAM: 2,267 GB

Software Configuration

Inference Framework: vLLM
Framework Version: 0.11.0
OS: Ubuntu
OS Version: 22.04.5 LTS (Jammy Jellyfish)
Kernel Version: 5.15.0-88-generic
Python Version: 3.10.12

Model Configuration

Provider: qwen
Model Name: qwen3-coder-30b-a3b-instruct
Quantization: FP16

Inference Configuration

Runtime parameters used across all benchmark runs

Max Model Length: 8192
Tensor Parallel Size: 1
Pipeline Parallel Size: 1
GPU Memory Utilization: 0.90
Temperature: 0.70
Top-P: 1.00
Top-K: -1
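
For readers who want to approximate this setup, the sketch below maps the listed runtime parameters onto a `vllm serve` command line and a completion request body. It is not the harness used for these runs. The Hugging Face repo id, the prompt, and the `max_tokens` value are assumptions. The report also does not say how both GPUs were used with a tensor parallel size of 1, so no multi-GPU option is set here.

```python
import shlex

# Assumption: Hugging Face repo id; the report only gives the short model name.
MODEL_ID = "Qwen/Qwen3-Coder-30B-A3B-Instruct"

# Engine settings taken from the Inference Configuration list.
ENGINE_ARGS = {
    "--max-model-len": 8192,
    "--tensor-parallel-size": 1,
    "--pipeline-parallel-size": 1,
    "--gpu-memory-utilization": 0.90,
}

# Sampling settings from the same list; top_k = -1 disables top-k filtering in vLLM.
SAMPLING = {"temperature": 0.70, "top_p": 1.00, "top_k": -1}


def build_serve_command(model_id, engine_args):
    """Build the argument list for launching vLLM's OpenAI-compatible server."""
    cmd = ["vllm", "serve", model_id]
    for flag, value in engine_args.items():
        cmd += [flag, str(value)]
    return cmd


def build_completion_request(model_id, prompt, max_tokens, sampling):
    """Build a JSON-serializable body for the server's /v1/completions endpoint."""
    return {"model": model_id, "prompt": prompt, "max_tokens": max_tokens, **sampling}


cmd = build_serve_command(MODEL_ID, ENGINE_ARGS)
print(shlex.join(cmd))

# Hypothetical request: the prompt and the 128-token output budget are placeholders.
payload = build_completion_request(MODEL_ID, "def fib(n):", 128, SAMPLING)
print(payload)
```

Engine-level settings go on the server command line, while temperature, top-p, and top-k travel with each request, which is why the sketch keeps them in separate structures.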