NVIDIA H20 (8x) - gemma-3-27b-it

October 24, 2025 at 08:17 PM

Dataset: reference (v1.0)

Best Performance

Best Output TPS: 6,567.80 (peak generation speed)
Best Input TPS: 12,358.95 (peak prefill speed)
Best Energy Efficiency: 0.05 kWh/MT (energy cost per 1M tokens)
Best TTFT (P95): 57.78 ms (lowest time-to-first-token latency)
Best E2E (P95): 3,225.24 ms (lowest end-to-end latency)
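
The energy figures in this report are given in kWh per million tokens (kWh/MT). As a rough reading aid, the Python sketch below converts that unit to joules per token and, given an electricity price, to a cost per million tokens. The function names and the price used in the example are illustrative assumptions and are not part of the benchmark tooling; the report also does not state which tokens (input, output, or both) are counted in the denominator.

```python
JOULES_PER_KWH = 3_600_000  # 1 kWh = 3.6 MJ
TOKENS_PER_MT = 1_000_000   # "MT" is read here as one million tokens


def kwh_per_mt_to_joules_per_token(kwh_per_mt: float) -> float:
    """Convert an energy figure in kWh/MT to joules per token."""
    return kwh_per_mt * JOULES_PER_KWH / TOKENS_PER_MT


def energy_cost_per_mt(kwh_per_mt: float, price_per_kwh: float) -> float:
    """Electricity cost per million tokens for a given price per kWh.

    price_per_kwh is a hypothetical input; the report lists no price.
    """
    return kwh_per_mt * price_per_kwh


if __name__ == "__main__":
    best = 0.05  # best energy efficiency reported above, in kWh/MT
    print(f"{kwh_per_mt_to_joules_per_token(best):.2f} J/token")
    print(f"{energy_cost_per_mt(best, price_per_kwh=0.10):.4f} per 1M tokens")
```

Under this reading, the best reported figure of 0.05 kWh/MT corresponds to about 0.18 J per token.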

Test Matrix Results

Performance across different input/output token combinations and concurrency levels

Input Tokens | Output Tokens | Concurrency | Output TPS | Input TPS | Energy Cost (kWh/MT) | TTFT Mean (ms) | TTFT P95 (ms) | E2E P95 (ms) | Success Rate
128 | 1,024 | 1024x | 6,567.80 | 839.93 | 0.12 | 33,343.71 | 52,848.89 | 99,852.68 | 100.0% (best run for Output TPS)
128 | 128 | 1x | 22.87 | 21.80 | 6.51 | 2,648.81 | 2,648.81 | 5,595.31 | 100.0%
128 | 128 | 2x | 44.26 | 43.22 | 3.94 | 2,727.59 | 2,809.68 | 5,771.60 | 100.0%
128 | 128 | 4x | 87.51 | 86.82 | 2.34 | 1,543.84 | 2,874.55 | 5,816.02 | 100.0%
128 | 128 | 8x | 168.39 | 168.89 | 1.42 | 1,698.38 | 3,092.40 | 6,066.69 | 100.0%
128 | 128 | 16x | 514.70 | 518.47 | 0.60 | 870.24 | 986.74 | 3,970.64 | 100.0%
128 | 128 | 32x | 824.48 | 825.68 | 0.38 | 1,438.64 | 1,766.58 | 4,950.66 | 100.0%
128 | 128 | 64x | 1,402.74 | 1,396.23 | 0.23 | 2,204.80 | 2,612.27 | 5,783.81 | 100.0%
128 | 128 | 128x | 1,880.39 | 1,880.05 | 0.17 | 4,144.27 | 5,176.60 | 8,493.23 | 100.0%
128 | 128 | 256x | 1,690.99 | 1,694.03 | 0.17 | 12,862.70 | 14,493.81 | 18,760.80 | 100.0%
128 | 512 | 1x | 42.65 | 10.16 | 6.44 | 89.49 | 89.49 | 12,004.81 | 100.0%
128 | 512 | 2x | 85.22 | 20.81 | 3.93 | 82.16 | 82.44 | 12,012.27 | 100.0%
128 | 512 | 4x | 168.70 | 41.85 | 2.36 | 153.16 | 167.96 | 12,132.94 | 100.0%
128 | 512 | 8x | 328.08 | 82.50 | 1.57 | 295.53 | 359.77 | 12,434.13 | 100.0%
128 | 512 | 16x | 628.42 | 160.37 | 1.10 | 549.11 | 594.92 | 12,857.50 | 100.0%
128 | 512 | 32x | 1,195.01 | 300.16 | 0.59 | 934.65 | 1,056.28 | 13,640.72 | 100.0%
128 | 512 | 64x | 2,078.71 | 518.01 | 0.32 | 1,708.86 | 2,098.44 | 15,662.52 | 100.0%
128 | 512 | 128x | 3,277.04 | 823.07 | 0.21 | 4,572.59 | 5,459.99 | 19,526.46 | 100.0%
128 | 512 | 256x | 4,185.95 | 1,049.52 | 0.15 | 11,763.77 | 13,888.30 | 30,396.39 | 100.0%
128 | 512 | 512x | 4,044.33 | 1,016.71 | 0.14 | 25,749.26 | 30,569.31 | 52,883.80 | 100.0%
128 | 1,024 | 1x | 42.74 | 5.09 | 7.40 | 58.05 | 58.05 | 23,959.06 | 100.0%
128 | 1,024 | 2x | 83.88 | 10.24 | 3.70 | 142.91 | 153.29 | 24,412.53 | 100.0%
128 | 1,024 | 4x | 168.32 | 20.88 | 3.09 | 191.87 | 195.48 | 24,324.08 | 100.0%
128 | 1,024 | 8x | 311.96 | 41.91 | 1.88 | 293.95 | 324.46 | 24,502.14 | 100.0%
128 | 1,024 | 16x | 615.00 | 80.53 | 1.22 | 501.53 | 577.80 | 25,606.21 | 100.0%
128 | 1,024 | 32x | 1,174.96 | 150.93 | 0.66 | 975.49 | 1,161.92 | 27,162.73 | 100.0%
128 | 1,024 | 64x | 2,135.83 | 272.54 | 0.37 | 1,775.62 | 2,169.48 | 29,830.49 | 100.0%
128 | 1,024 | 128x | 3,623.01 | 463.05 | 0.22 | 4,372.01 | 5,377.03 | 34,902.70 | 100.0%
128 | 1,024 | 256x | 4,723.77 | 603.67 | 0.16 | 13,792.27 | 15,386.17 | 53,457.94 | 100.0%
128 | 1,024 | 512x | 4,742.70 | 608.10 | 0.14 | 29,754.08 | 32,810.70 | 82,127.94 | 100.0%
128 | 2,048 | 1x | 42.60 | 4.31 | 7.59 | 93.79 | 93.79 | 28,283.46 | 100.0%
128 | 2,048 | 2x | 75.21 | 6.57 | 5.08 | 134.99 | 136.72 | 37,616.21 | 100.0%
128 | 2,048 | 4x | 154.13 | 12.95 | 3.03 | 206.35 | 248.73 | 39,032.55 | 100.0%
128 | 2,048 | 8x | 269.35 | 24.83 | 1.79 | 319.93 | 391.95 | 41,109.30 | 100.0%
128 | 2,048 | 16x | 471.11 | 42.51 | 1.52 | 504.88 | 589.91 | 44,439.88 | 100.0%
128 | 2,048 | 32x | 992.31 | 86.62 | 0.81 | 955.00 | 1,050.79 | 46,643.20 | 100.0%
128 | 2,048 | 64x | 1,789.52 | 146.96 | 0.45 | 1,728.65 | 2,156.05 | 53,583.57 | 100.0%
128 | 2,048 | 128x | 3,004.04 | 253.23 | 0.27 | 4,342.28 | 5,493.84 | 58,612.63 | 100.0%
128 | 2,048 | 256x | 4,518.29 | 381.83 | 0.18 | 14,575.17 | 16,468.38 | 79,820.33 | 100.0%
128 | 2,048 | 512x | 4,576.16 | 383.61 | 0.16 | 31,121.42 | 33,142.80 | 134,301.05 | 100.0%
128 | 2,048 | 1024x | 5,701.51 | 476.17 | 0.14 | 60,416.02 | 105,778.57 | 198,057.64 | 98.6%
512 | 128 | 1x | 39.68 | 153.75 | 1.76 | 238.70 | 238.70 | 3,225.24 | 100.0%
512 | 128 | 2x | 78.14 | 303.72 | 1.01 | 276.63 | 276.66 | 3,273.74 | 100.0%
512 | 128 | 4x | 154.12 | 595.42 | 0.70 | 320.94 | 326.95 | 3,317.83 | 100.0%
512 | 128 | 8x | 283.26 | 1,092.67 | 0.53 | 421.28 | 544.96 | 3,602.21 | 100.0%
512 | 128 | 16x | 501.59 | 1,943.67 | 0.27 | 751.71 | 1,008.92 | 4,070.35 | 100.0%
512 | 128 | 32x | 791.19 | 3,083.06 | 0.18 | 1,460.60 | 1,912.91 | 5,153.64 | 100.0%
512 | 128 | 64x | 1,333.55 | 5,188.02 | 0.10 | 2,143.57 | 2,781.45 | 6,077.80 | 100.0%
512 | 128 | 128x | 1,467.57 | 5,701.90 | 0.09 | 5,371.19 | 7,256.77 | 10,869.68 | 100.0%
512 | 128 | 256x | 1,273.78 | 4,948.80 | 0.09 | 16,336.09 | 19,570.88 | 25,220.39 | 100.0%
512 | 512 | 1x | 42.25 | 40.93 | 4.20 | 57.78 | 57.78 | 12,116.99 | 100.0%
512 | 512 | 2x | 83.49 | 81.13 | 2.11 | 115.42 | 126.47 | 12,262.41 | 100.0%
512 | 512 | 4x | 166.82 | 161.11 | 1.74 | 199.43 | 305.91 | 12,270.42 | 100.0%
512 | 512 | 8x | 329.79 | 318.04 | 1.25 | 284.59 | 392.60 | 12,395.44 | 100.0%
512 | 512 | 16x | 591.65 | 583.06 | 0.73 | 653.18 | 961.14 | 13,597.75 | 100.0%
512 | 512 | 32x | 1,107.80 | 1,090.38 | 0.38 | 1,130.79 | 1,480.66 | 14,614.22 | 100.0%
512 | 512 | 64x | 1,894.60 | 1,857.88 | 0.22 | 2,194.09 | 3,105.07 | 17,111.08 | 100.0%
512 | 512 | 128x | 3,074.71 | 3,015.73 | 0.14 | 4,467.95 | 5,909.52 | 20,903.14 | 100.0%
512 | 512 | 256x | 3,355.91 | 3,270.09 | 0.12 | 16,828.43 | 19,469.49 | 38,678.23 | 100.0%
512 | 512 | 512x | 3,349.95 | 3,272.61 | 0.11 | 31,034.74 | 37,403.72 | 65,445.06 | 100.0%
512 | 1,024 | 1x | 42.17 | 20.43 | 5.65 | 238.85 | 238.85 | 24,280.52 | 100.0%
512 | 1,024 | 2x | 84.07 | 40.84 | 3.44 | 274.16 | 275.60 | 24,354.20 | 100.0%
512 | 1,024 | 4x | 166.58 | 80.44 | 2.37 | 338.81 | 344.28 | 24,557.73 | 100.0%
512 | 1,024 | 8x | 313.51 | 151.17 | 1.55 | 528.80 | 851.14 | 26,124.44 | 100.0%
512 | 1,024 | 16x | 564.40 | 283.38 | 0.91 | 850.29 | 1,186.62 | 27,993.74 | 100.0%
512 | 1,024 | 32x | 1,080.48 | 551.75 | 0.53 | 1,460.20 | 1,952.58 | 28,905.63 | 100.0%
512 | 1,024 | 64x | 1,942.19 | 969.16 | 0.29 | 2,123.36 | 3,466.04 | 32,840.45 | 100.0%
512 | 1,024 | 128x | 3,292.26 | 1,645.37 | 0.18 | 4,606.43 | 6,146.74 | 38,441.58 | 100.0%
512 | 1,024 | 256x | 4,309.82 | 2,163.74 | 0.14 | 16,642.54 | 19,766.56 | 58,175.09 | 100.0%
512 | 1,024 | 512x | 3,931.21 | 1,958.16 | 0.14 | 33,226.52 | 39,045.67 | 110,309.27 | 100.0%
512 | 2,048 | 1x | 42.15 | 11.50 | 6.64 | 238.56 | 238.56 | 43,086.70 | 100.0%
512 | 2,048 | 2x | 79.75 | 22.10 | 4.14 | 276.81 | 276.99 | 44,749.07 | 100.0%
512 | 2,048 | 4x | 147.32 | 42.21 | 2.95 | 298.48 | 345.74 | 46,186.31 | 100.0%
512 | 2,048 | 8x | 270.26 | 83.48 | 1.54 | 544.27 | 814.21 | 45,403.14 | 100.0%
512 | 2,048 | 16x | 498.53 | 188.41 | 1.16 | 780.64 | 1,032.60 | 39,906.09 | 100.0%
512 | 2,048 | 32x | 916.70 | 308.14 | 0.67 | 1,341.47 | 1,945.74 | 47,632.06 | 100.0%
512 | 2,048 | 64x | 1,736.41 | 577.91 | 0.38 | 2,308.16 | 3,255.62 | 53,786.66 | 100.0%
512 | 2,048 | 128x | 2,654.82 | 888.39 | 0.24 | 5,010.63 | 7,504.65 | 67,637.97 | 100.0%
512 | 2,048 | 256x | 3,471.19 | 1,172.10 | 0.17 | 13,569.51 | 17,655.84 | 90,089.08 | 100.0%
512 | 2,048 | 512x | 4,316.65 | 1,440.92 | 0.15 | 14,736.96 | 18,851.86 | 132,110.15 | 100.0%
512 | 2,048 | 1024x | 4,298.00 | 1,442.70 | 0.15 | 92,247.54 | 162,998.75 | 250,955.63 | 83.4%
1,024 | 128 | 1x | 37.44 | 285.17 | 1.04 | 435.06 | 435.06 | 3,417.08 | 100.0%
1,024 | 128 | 2x | 74.14 | 565.88 | 0.61 | 460.72 | 466.63 | 3,449.63 | 100.0%
1,024 | 128 | 4x | 127.74 | 975.30 | 0.40 | 540.87 | 873.08 | 4,002.81 | 100.0%
1,024 | 128 | 8x | 249.51 | 1,908.63 | 0.29 | 571.81 | 878.60 | 4,090.86 | 100.0%
1,024 | 128 | 16x | 354.57 | 2,720.05 | 0.18 | 1,466.08 | 2,383.67 | 5,765.75 | 100.0%
1,024 | 128 | 32x | 632.68 | 4,854.34 | 0.12 | 2,080.47 | 3,051.79 | 6,461.02 | 100.0%
1,024 | 128 | 64x | 861.68 | 6,616.60 | 0.09 | 3,486.36 | 5,794.65 | 9,460.51 | 100.0%
1,024 | 128 | 128x | 1,099.52 | 8,444.00 | 0.06 | 6,969.19 | 9,853.25 | 14,846.10 | 100.0%
1,024 | 128 | 256x | 1,059.90 | 8,139.70 | 0.07 | 17,903.82 | 23,073.08 | 30,396.23 | 100.0%
1,024 | 512 | 1x | 42.41 | 80.75 | 2.86 | 58.46 | 58.46 | 12,072.56 | 100.0%
1,024 | 512 | 2x | 80.93 | 154.43 | 1.78 | 285.30 | 454.10 | 12,627.19 | 100.0%
1,024 | 512 | 4x | 161.26 | 307.80 | 1.19 | 337.83 | 524.01 | 12,691.77 | 100.0%
1,024 | 512 | 8x | 304.76 | 582.81 | 0.79 | 531.67 | 871.83 | 13,426.24 | 100.0%
1,024 | 512 | 16x | 571.55 | 1,096.14 | 0.46 | 979.04 | 1,563.12 | 14,320.97 | 100.0%
1,024 | 512 | 32x | 1,003.30 | 1,956.00 | 0.28 | 1,621.79 | 2,222.90 | 16,034.20 | 100.0%
1,024 | 512 | 64x | 1,623.58 | 3,140.80 | 0.16 | 2,802.68 | 5,186.00 | 19,975.43 | 100.0%
1,024 | 512 | 128x | 2,383.55 | 4,622.99 | 0.11 | 6,660.24 | 10,008.31 | 27,054.97 | 100.0%
1,024 | 512 | 256x | 2,715.60 | 5,241.88 | 0.09 | 17,223.41 | 22,821.04 | 47,320.97 | 100.0%
1,024 | 512 | 512x | 2,580.91 | 5,000.55 | 0.10 | 31,681.69 | 52,429.10 | 91,006.32 | 100.0%
1,024 | 1,024 | 1x | 41.78 | 39.78 | 4.35 | 412.68 | 412.68 | 24,505.20 | 100.0%
1,024 | 1,024 | 2x | 82.63 | 78.84 | 2.65 | 452.75 | 453.97 | 24,771.54 | 100.0%
1,024 | 1,024 | 4x | 166.00 | 158.42 | 1.82 | 436.99 | 531.29 | 24,671.98 | 100.0%
1,024 | 1,024 | 8x | 303.79 | 290.48 | 1.08 | 787.24 | 1,399.16 | 26,958.83 | 100.0%
1,024 | 1,024 | 16x | 564.52 | 556.93 | 0.75 | 1,089.52 | 1,599.71 | 28,201.20 | 100.0%
1,024 | 1,024 | 32x | 1,055.02 | 1,045.41 | 0.41 | 1,790.85 | 2,629.61 | 30,026.79 | 100.0%
1,024 | 1,024 | 64x | 1,817.77 | 1,796.38 | 0.24 | 3,340.55 | 5,196.15 | 34,954.60 | 100.0%
1,024 | 1,024 | 128x | 2,909.76 | 2,895.77 | 0.15 | 6,651.46 | 10,074.43 | 43,388.95 | 100.0%
1,024 | 1,024 | 256x | 2,791.06 | 2,747.54 | 0.13 | 13,834.30 | 20,144.40 | 75,059.32 | 100.0%
1,024 | 2,048 | 1x | 41.87 | 25.87 | 5.26 | 430.41 | 430.41 | 37,661.30 | 100.0%
1,024 | 2,048 | 2x | 74.91 | 44.67 | 3.48 | 459.98 | 468.32 | 43,275.88 | 100.0%
1,024 | 2,048 | 4x | 142.35 | 87.07 | 2.13 | 532.54 | 540.72 | 44,152.84 | 100.0%
1,024 | 2,048 | 8x | 261.61 | 158.05 | 1.34 | 639.67 | 1,010.43 | 49,345.97 | 100.0%
1,024 | 2,048 | 16x | 474.16 | 294.88 | 0.89 | 939.09 | 1,584.01 | 51,871.79 | 100.0%
1,024 | 2,048 | 32x | 894.39 | 591.00 | 0.56 | 2,263.55 | 3,784.22 | 52,008.95 | 100.0%
1,024 | 2,048 | 64x | 1,569.62 | 1,054.48 | 0.33 | 2,971.52 | 4,169.52 | 58,514.26 | 100.0%
1,024 | 2,048 | 128x | 2,652.53 | 1,817.66 | 0.20 | 6,881.84 | 9,403.66 | 65,816.72 | 100.0%
1,024 | 2,048 | 256x | 3,365.94 | 2,246.89 | 0.15 | 16,835.62 | 21,948.63 | 99,797.55 | 100.0%
1,024 | 2,048 | 512x | 3,544.39 | 2,401.40 | 0.14 | 37,818.74 | 45,696.57 | 186,457.84 | 100.0%
1,024 | 2,048 | 1024x | 3,534.42 | 2,400.62 | 0.15 | 75,525.53 | 160,782.37 | 232,170.64 | 62.7%
2,048 | 128 | 1x | 33.48 | 514.78 | 0.63 | 797.47 | 797.47 | 3,821.53 | 100.0%
2,048 | 128 | 2x | 66.56 | 1,018.98 | 0.36 | 831.71 | 835.77 | 3,843.62 | 100.0%
2,048 | 128 | 4x | 129.26 | 1,979.05 | 0.24 | 925.67 | 931.02 | 3,952.86 | 100.0%
2,048 | 128 | 8x | 211.88 | 3,238.57 | 0.17 | 880.28 | 1,477.00 | 4,818.83 | 100.0%
2,048 | 128 | 16x | 347.77 | 5,311.77 | 0.12 | 1,873.72 | 2,664.89 | 5,876.89 | 100.0%
2,048 | 128 | 32x | 472.82 | 7,222.90 | 0.08 | 3,073.86 | 4,613.45 | 8,631.78 | 100.0%
2,048 | 128 | 64x | 577.72 | 8,825.88 | 0.06 | 5,379.49 | 10,221.38 | 14,116.01 | 100.0%
2,048 | 128 | 128x | 674.85 | 10,310.57 | 0.05 | 10,197.49 | 16,898.50 | 24,154.43 | 100.0%
2,048 | 128 | 256x | 728.03 | 11,127.75 | 0.05 | 22,851.51 | 33,721.90 | 41,525.89 | 100.0%
2,048 | 128 | 512x | 765.23 | 11,695.70 | 0.05 | 43,008.96 | 64,331.38 | 76,361.24 | 100.0%
2,048 | 128 | 1024x | 806.99 | 12,358.95 | 0.05 | 79,075.22 | 125,243.56 | 138,040.48 | 98.5%
2,048 | 512 | 1x | 39.83 | 153.10 | 1.82 | 803.76 | 803.76 | 12,853.38 | 100.0%
2,048 | 512 | 2x | 79.09 | 302.70 | 1.10 | 844.18 | 849.09 | 12,944.28 | 100.0%
2,048 | 512 | 4x | 156.50 | 599.04 | 0.75 | 705.60 | 892.15 | 13,058.63 | 100.0%
2,048 | 512 | 8x | 251.69 | 1,024.28 | 0.51 | 1,388.78 | 2,526.02 | 15,276.20 | 100.0%
2,048 | 512 | 16x | 478.31 | 1,913.56 | 0.32 | 1,943.74 | 3,396.32 | 16,337.89 | 100.0%
2,048 | 512 | 32x | 838.11 | 3,282.55 | 0.16 | 3,676.29 | 4,558.24 | 19,044.63 | 100.0%
2,048 | 512 | 64x | 1,390.23 | 5,407.73 | 0.12 | 5,144.70 | 7,739.11 | 23,107.46 | 100.0%
2,048 | 512 | 128x | 2,016.44 | 7,810.54 | 0.08 | 9,573.85 | 14,924.83 | 31,891.25 | 100.0%
2,048 | 512 | 256x | 1,739.11 | 6,781.52 | 0.08 | 22,932.77 | 55,794.81 | 71,496.03 | 100.0%
2,048 | 1,024 | 1x | 40.96 | 78.73 | 2.96 | 801.50 | 801.50 | 24,996.36 | 100.0%
2,048 | 1,024 | 2x | 81.14 | 155.28 | 1.81 | 833.06 | 839.30 | 25,225.00 | 100.0%
2,048 | 1,024 | 4x | 141.82 | 312.55 | 1.20 | 717.09 | 904.21 | 25,006.37 | 100.0%
2,048 | 1,024 | 8x | 255.49 | 588.60 | 0.79 | 1,013.79 | 1,521.83 | 26,252.13 | 100.0%
2,048 | 1,024 | 16x | 420.95 | 1,011.74 | 0.46 | 2,555.48 | 4,100.70 | 30,908.40 | 100.0%
2,048 | 1,024 | 32x | 863.63 | 1,890.34 | 0.29 | 3,278.36 | 4,859.40 | 33,077.15 | 100.0%
2,048 | 1,024 | 64x | 1,546.72 | 3,222.30 | 0.17 | 5,015.83 | 8,041.09 | 38,779.76 | 100.0%
2,048 | 1,024 | 128x | 2,353.14 | 4,762.83 | 0.11 | 10,088.50 | 16,531.08 | 52,445.33 | 100.0%
2,048 | 1,024 | 256x | 2,142.53 | 4,360.37 | 0.11 | 24,260.11 | 75,321.79 | 106,380.18 | 100.0%
2,048 | 2,048 | 1x | 41.05 | 67.16 | 3.28 | 802.55 | 802.55 | 29,277.27 | 100.0%
2,048 | 2,048 | 2x | 81.78 | 129.60 | 2.03 | 836.11 | 839.78 | 30,206.30 | 100.0%
2,048 | 2,048 | 4x | 122.53 | 224.57 | 1.30 | 902.86 | 921.12 | 34,151.92 | 100.0%
2,048 | 2,048 | 8x | 204.46 | 440.26 | 0.85 | 1,186.78 | 1,759.37 | 34,895.57 | 100.0%
2,048 | 2,048 | 16x | 342.69 | 623.09 | 0.69 | 2,464.84 | 4,437.35 | 47,483.55 | 100.0%
2,048 | 2,048 | 32x | 690.95 | 1,100.61 | 0.42 | 3,250.73 | 5,549.96 | 52,235.86 | 100.0%
2,048 | 2,048 | 64x | 1,371.40 | 2,084.53 | 0.24 | 5,203.41 | 7,694.88 | 56,055.43 | 100.0%
2,048 | 2,048 | 128x | 2,356.93 | 3,345.18 | 0.15 | 9,401.44 | 14,961.40 | 69,836.73 | 100.0%
2,048 | 2,048 | 256x | 2,238.10 | 3,255.86 | 0.14 | 24,786.64 | 87,727.47 | 134,611.01 | 100.0%
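
Aggregate Output TPS rises with concurrency while each individual request gets slower. A small Python sketch, using four rows copied from the 128-input / 1,024-output configuration above, makes that trade-off explicit. The `Run` type and both helper functions are hypothetical names introduced for illustration, and "scaling efficiency" here is simply aggregate throughput divided by ideal linear scaling from the 1x run, which is an assumption about how one might summarize the data rather than a metric the report defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    concurrency: int
    output_tps: float  # aggregate output tokens per second


# Rows taken from the 128-input / 1,024-output configuration in the table.
RUNS = [
    Run(1, 42.74),
    Run(64, 2135.83),
    Run(512, 4742.70),
    Run(1024, 6567.80),
]


def per_stream_tps(run: Run) -> float:
    """Average output tokens per second seen by one concurrent request."""
    return run.output_tps / run.concurrency


def scaling_efficiency(run: Run, baseline: Run) -> float:
    """Aggregate throughput relative to perfect linear scaling of the baseline."""
    ideal = baseline.output_tps / baseline.concurrency * run.concurrency
    return run.output_tps / ideal


if __name__ == "__main__":
    baseline = RUNS[0]
    for run in RUNS:
        print(
            f"{run.concurrency:>5}x  "
            f"per-stream {per_stream_tps(run):6.2f} tok/s  "
            f"efficiency {scaling_efficiency(run, baseline):.2f}"
        )
```

For the best Output TPS run (1024x), this works out to roughly 6.4 tokens per second per request, about 15% of what linear scaling from the single-request run would give.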

Hardware Configuration

GPU Manufacturer: NVIDIA
GPU Model: NVIDIA H20
GPU Count: 8
GPU Memory (Total): 760 GB
GPU Driver: 570.195.03
CUDA Version: Unknown
Compute Capability: 9.0
Power Limit (per GPU): 500 W
CPU Model: Intel(R) Xeon(R) Platinum 8469C
RAM: 1,007 GB

Software Configuration

Inference Framework: vLLM
Framework Version: v0.9.0
OS: Ubuntu
OS Version: 24.04.3 LTS (Noble Numbat)
Kernel Version: 6.8.0-79-generic
Python Version: 3.12.3

Model Configuration

Provider: google
Model Name: gemma-3-27b-it
Quantization: BF16

Inference Configuration

Runtime parameters used across all benchmark runs

Max Model Length: 8192
Tensor Parallel Size: 1
Pipeline Parallel Size: 1
GPU Memory Utilization: 90.00
Temperature: 0.70
Top-P: 1.00
Top-K: -1
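
For readers who want to approximate this setup, the Python sketch below assembles a `vllm serve` command line from the runtime parameters listed above. It is an assumption-laden illustration, not the command the benchmark actually ran: the model id `google/gemma-3-27b-it` is inferred from the provider and model name, the reported GPU memory utilization of 90.00 is read as a percentage and passed as the fraction 0.90, and BF16 is mapped to `--dtype bfloat16`. The sampling values are per-request settings, so they are kept in a separate dictionary that a client would send in the request body (`top_k` being a vLLM extension to the OpenAI-compatible API). `build_serve_command` is a hypothetical helper.

```python
import shlex

# Assumed Hugging Face model id (provider "google" + model name).
MODEL_ID = "google/gemma-3-27b-it"

# Engine-level settings from the report, expressed as vLLM CLI flags.
SERVER_ARGS = {
    "dtype": "bfloat16",             # report lists BF16
    "max-model-len": 8192,
    "tensor-parallel-size": 1,
    "pipeline-parallel-size": 1,
    "gpu-memory-utilization": 0.90,  # report shows 90.00, assumed to mean 90%
}

# Per-request sampling settings from the report (sent by the client).
SAMPLING = {"temperature": 0.70, "top_p": 1.00, "top_k": -1}


def build_serve_command(model_id: str, server_args: dict) -> list[str]:
    """Return the argv list for a `vllm serve` invocation."""
    argv = ["vllm", "serve", model_id]
    for flag, value in server_args.items():
        argv += [f"--{flag}", str(value)]
    return argv


if __name__ == "__main__":
    print(shlex.join(build_serve_command(MODEL_ID, SERVER_ARGS)))
    print("per-request sampling:", SAMPLING)
```

Keeping engine flags and sampling parameters apart mirrors how vLLM separates server configuration from request options.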