NVIDIA H20 (8x) - llama-3.3-70b-instruct

October 24, 2025 at 10:44 PM

Dataset: reference (v1.0)

Best Performance

Best Output TPS: 3,370.98 (peak generation speed)
Best Input TPS: 6,350.24 (peak prefill speed)
Best Energy Efficiency: 0.11 kWh/MT (energy cost per 1M tokens)
Best TTFT (P95): 40.07 ms (lowest latency)
Best E2E (P95): 3,620.39 ms (lowest latency)
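
To make the kWh/MT unit easier to compare with other reports, the sketch below converts it to joules per token and estimates the average power draw it implies. The function names are hypothetical, and the power estimate rests on an assumption the report does not confirm: that the energy figure is normalized by total (input plus output) tokens. The throughput values in the example are the Output TPS (413.43) and Input TPS (6,350.24) of the 2,048-input, 128-output run at 128x concurrency.

```python
KWH_TO_JOULES = 3.6e6  # 1 kWh = 3.6 MJ


def joules_per_token(kwh_per_mt: float) -> float:
    """Convert an energy cost in kWh per 1M tokens to joules per token."""
    return kwh_per_mt * KWH_TO_JOULES / 1_000_000


def implied_power_kw(kwh_per_mt: float, tokens_per_second: float) -> float:
    """Average power (kW) implied by an energy cost and a token rate.

    Assumption: the energy metric counts the same tokens that
    tokens_per_second counts (here, input + output tokens).
    """
    hours_per_mt = (1_000_000 / tokens_per_second) / 3600
    return kwh_per_mt / hours_per_mt


if __name__ == "__main__":
    best = 0.11  # best energy efficiency from the report, kWh/MT
    print(f"{joules_per_token(best):.3f} J/token")
    total_tps = 413.43 + 6350.24  # output + input TPS of that run
    print(f"{implied_power_kw(best, total_tps):.2f} kW implied average draw")
```

Under that assumption the implied draw stays below the 8 x 500 W power limit listed in the hardware configuration, which is a quick plausibility check rather than a measurement.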

Test Matrix Results

Performance across different input/output token combinations and concurrency levels

Input Tokens | Output Tokens | Concurrency | Output TPS | Input TPS | Energy Cost (kWh/MT) | TTFT Mean (ms) | TTFT P95 (ms) | E2E P95 (ms) | Success Rate
128 | 512 | 1024x | 3,370.98 | 888.16 | 0.25 | 55,518.94 | 109,926.10 | 126,853.82 | 92.7% (best run for Output TPS)
128 | 128 | 1x | 32.04 | 30.54 | 6.29 | 645.39 | 645.39 | 3,995.19 | 100.0%
128 | 128 | 2x | 63.78 | 62.28 | 4.29 | 655.98 | 662.87 | 4,012.42 | 100.0%
128 | 128 | 4x | 127.17 | 126.18 | 2.26 | 292.21 | 609.61 | 3,963.44 | 100.0%
128 | 128 | 8x | 277.13 | 277.94 | 1.33 | 194.05 | 297.19 | 3,692.72 | 100.0%
128 | 128 | 16x | 510.98 | 514.72 | 0.85 | 353.83 | 578.61 | 4,001.40 | 100.0%
128 | 128 | 32x | 968.78 | 970.20 | 0.46 | 455.21 | 652.65 | 4,215.20 | 100.0%
128 | 128 | 64x | 1,588.83 | 1,581.46 | 0.28 | 747.25 | 1,345.91 | 5,123.60 | 100.0%
128 | 128 | 128x | 1,701.64 | 1,706.32 | 0.22 | 1,502.80 | 3,806.45 | 7,865.77 | 100.0%
128 | 128 | 256x | 2,125.27 | 2,132.87 | 0.20 | 4,018.02 | 11,041.55 | 14,655.89 | 100.0%
128 | 128 | 512x | 2,503.42 | 2,525.59 | 0.18 | 8,716.23 | 17,930.88 | 22,190.62 | 100.0%
128 | 128 | 1024x | 2,457.56 | 2,470.81 | 0.18 | 19,636.55 | 39,741.03 | 44,767.40 | 100.0%
128 | 512 | 1x | 37.75 | 9.00 | 10.12 | 77.51 | 77.51 | 13,563.35 | 100.0%
128 | 512 | 2x | 75.41 | 18.41 | 7.07 | 90.70 | 95.41 | 13,577.77 | 100.0%
128 | 512 | 4x | 150.09 | 37.23 | 4.59 | 47.39 | 64.30 | 13,638.21 | 100.0%
128 | 512 | 8x | 277.02 | 74.15 | 2.99 | 105.77 | 185.36 | 13,848.25 | 100.0%
128 | 512 | 16x | 562.64 | 146.35 | 1.50 | 201.83 | 309.69 | 14,083.53 | 100.0%
128 | 512 | 32x | 1,002.92 | 265.81 | 0.81 | 394.96 | 808.46 | 15,413.59 | 100.0%
128 | 512 | 64x | 1,904.88 | 500.09 | 0.43 | 546.62 | 877.37 | 16,271.36 | 100.0%
128 | 512 | 128x | 2,020.72 | 523.61 | 0.36 | 1,510.16 | 3,492.45 | 19,498.75 | 100.0%
128 | 512 | 256x | 2,937.99 | 778.13 | 0.26 | 9,230.70 | 19,247.97 | 35,569.16 | 100.0%
128 | 512 | 512x | 3,177.72 | 840.98 | 0.26 | 26,149.58 | 55,326.21 | 71,462.93 | 100.0%
128 | 1,024 | 1x | 37.60 | 6.27 | 10.98 | 85.33 | 85.33 | 19,417.22 | 100.0%
128 | 1,024 | 2x | 73.20 | 16.14 | 7.38 | 81.87 | 90.31 | 15,424.91 | 100.0%
128 | 1,024 | 4x | 115.98 | 21.45 | 5.55 | 115.00 | 164.12 | 22,942.75 | 100.0%
128 | 1,024 | 8x | 234.32 | 42.60 | 3.27 | 116.86 | 188.03 | 23,883.09 | 100.0%
128 | 1,024 | 16x | 428.89 | 76.47 | 1.94 | 247.01 | 390.46 | 25,269.06 | 100.0%
128 | 1,024 | 32x | 830.87 | 140.72 | 1.06 | 429.32 | 531.03 | 27,791.71 | 100.0%
128 | 1,024 | 64x | 1,524.04 | 264.17 | 0.58 | 670.69 | 1,051.03 | 29,793.40 | 100.0%
128 | 1,024 | 128x | 1,846.67 | 313.18 | 0.41 | 3,061.64 | 20,348.42 | 46,110.33 | 100.0%
128 | 1,024 | 256x | 2,529.34 | 439.68 | 0.33 | 13,551.11 | 34,474.37 | 61,043.95 | 100.0%
128 | 1,024 | 512x | 2,953.22 | 513.99 | 0.29 | 38,562.13 | 87,480.11 | 111,869.22 | 100.0%
128 | 1,024 | 1024x | 3,216.00 | 559.02 | 0.28 | 54,127.00 | 114,775.21 | 139,005.68 | 64.4%
128 | 2,048 | 1x | 37.68 | 5.98 | 10.83 | 78.78 | 78.78 | 20,383.18 | 100.0%
128 | 2,048 | 2x | 63.61 | 11.67 | 8.10 | 89.51 | 90.51 | 21,068.54 | 100.0%
128 | 2,048 | 4x | 124.71 | 21.05 | 5.59 | 105.07 | 172.29 | 23,812.41 | 100.0%
128 | 2,048 | 8x | 213.92 | 38.14 | 2.53 | 259.53 | 311.33 | 25,352.71 | 100.0%
128 | 2,048 | 16x | 422.85 | 71.89 | 1.90 | 278.65 | 470.15 | 27,057.06 | 100.0%
128 | 2,048 | 32x | 821.18 | 146.91 | 1.08 | 429.62 | 652.26 | 26,892.78 | 100.0%
128 | 2,048 | 64x | 1,161.83 | 194.94 | 0.73 | 702.32 | 1,149.91 | 33,834.57 | 100.0%
128 | 2,048 | 128x | 1,978.32 | 333.10 | 0.41 | 1,809.11 | 4,480.83 | 37,667.60 | 100.0%
128 | 2,048 | 256x | 2,643.18 | 461.12 | 0.32 | 13,835.22 | 36,729.25 | 60,784.45 | 100.0%
128 | 2,048 | 512x | 2,725.94 | 470.80 | 0.30 | 39,119.53 | 87,719.97 | 114,427.63 | 100.0%
128 | 2,048 | 1024x | 3,024.26 | 522.37 | 0.30 | 53,516.26 | 112,798.49 | 137,881.32 | 64.1%
512 | 128 | 1x | 35.36 | 137.02 | 2.67 | 288.79 | 288.79 | 3,620.39 | 100.0%
512 | 128 | 2x | 70.56 | 274.26 | 1.70 | 288.62 | 289.07 | 3,627.23 | 100.0%
512 | 128 | 4x | 138.94 | 536.77 | 1.08 | 174.92 | 318.18 | 3,679.12 | 100.0%
512 | 128 | 8x | 243.52 | 939.36 | 0.74 | 462.98 | 822.79 | 4,202.03 | 100.0%
512 | 128 | 16x | 452.10 | 1,751.88 | 0.40 | 783.38 | 1,084.53 | 4,523.46 | 100.0%
512 | 128 | 32x | 709.02 | 2,762.85 | 0.27 | 1,357.63 | 2,072.85 | 5,760.35 | 100.0%
512 | 128 | 64x | 991.60 | 3,878.54 | 0.17 | 2,828.17 | 4,292.61 | 8,192.03 | 100.0%
512 | 128 | 128x | 1,006.60 | 3,925.99 | 0.15 | 5,195.98 | 11,902.95 | 15,380.16 | 100.0%
512 | 128 | 256x | 1,270.39 | 4,937.29 | 0.13 | 9,025.75 | 16,817.62 | 22,159.26 | 100.0%
512 | 128 | 512x | 1,315.02 | 5,131.22 | 0.13 | 18,755.74 | 36,738.78 | 44,592.51 | 100.0%
512 | 128 | 1024x | 1,146.79 | 4,467.06 | 0.14 | 45,758.12 | 91,587.96 | 97,997.33 | 100.0%
512 | 512 | 1x | 37.15 | 35.99 | 6.49 | 285.58 | 285.58 | 13,781.07 | 100.0%
512 | 512 | 2x | 72.20 | 71.69 | 3.21 | 172.81 | 289.43 | 13,847.60 | 100.0%
512 | 512 | 4x | 145.02 | 140.07 | 2.90 | 355.85 | 528.36 | 14,116.36 | 100.0%
512 | 512 | 8x | 275.20 | 277.10 | 1.83 | 368.16 | 578.67 | 14,239.66 | 100.0%
512 | 512 | 16x | 474.68 | 523.86 | 1.02 | 752.62 | 1,336.56 | 15,141.44 | 100.0%
512 | 512 | 32x | 898.03 | 962.43 | 0.54 | 1,531.25 | 2,384.94 | 16,562.79 | 100.0%
512 | 512 | 64x | 1,527.11 | 1,552.74 | 0.31 | 2,967.93 | 4,315.29 | 20,502.20 | 100.0%
512 | 512 | 128x | 1,559.60 | 1,629.41 | 0.26 | 5,376.47 | 16,065.15 | 31,836.52 | 100.0%
512 | 512 | 256x | 1,930.72 | 1,994.11 | 0.22 | 15,425.30 | 29,546.32 | 49,162.70 | 100.0%
512 | 512 | 512x | 2,442.70 | 2,476.93 | 0.20 | 38,251.58 | 77,935.80 | 95,352.75 | 100.0%
512 | 512 | 1024x | 2,424.45 | 2,468.27 | 0.20 | 56,625.59 | 108,305.63 | 127,284.62 | 66.1%
512 | 1,024 | 1x | 37.08 | 35.30 | 6.52 | 286.49 | 286.49 | 14,024.59 | 100.0%
512 | 1,024 | 2x | 59.05 | 52.88 | 5.20 | 164.54 | 277.26 | 18,388.59 | 100.0%
512 | 1,024 | 4x | 128.95 | 87.47 | 3.70 | 358.72 | 524.81 | 22,202.23 | 100.0%
512 | 1,024 | 8x | 216.65 | 157.14 | 1.78 | 990.19 | 1,352.48 | 24,914.45 | 100.0%
512 | 1,024 | 16x | 397.52 | 324.46 | 1.35 | 876.06 | 1,571.08 | 22,085.27 | 100.0%
512 | 1,024 | 32x | 745.73 | 561.95 | 0.77 | 1,496.25 | 2,093.06 | 27,128.41 | 100.0%
512 | 1,024 | 64x | 1,363.96 | 942.65 | 0.45 | 2,846.65 | 4,082.57 | 31,783.78 | 100.0%
512 | 1,024 | 128x | 2,231.74 | 1,539.18 | 0.27 | 4,771.45 | 6,796.26 | 39,179.48 | 100.0%
512 | 1,024 | 256x | 2,346.57 | 1,602.71 | 0.25 | 18,835.69 | 43,445.02 | 72,241.25 | 100.0%
512 | 1,024 | 512x | 2,480.89 | 1,690.25 | 0.24 | 50,370.47 | 109,282.33 | 134,873.42 | 98.2%
512 | 1,024 | 1024x | 2,531.13 | 1,726.47 | 0.24 | 53,992.26 | 112,515.55 | 140,087.07 | 50.8%
512 | 2,048 | 1x | 37.03 | 36.59 | 6.53 | 292.89 | 292.89 | 13,529.64 | 100.0%
512 | 2,048 | 2x | 58.99 | 52.83 | 4.25 | 441.43 | 564.62 | 18,455.10 | 100.0%
512 | 2,048 | 4x | 122.06 | 83.08 | 3.93 | 171.14 | 286.25 | 23,268.02 | 100.0%
512 | 2,048 | 8x | 212.44 | 171.14 | 2.55 | 503.58 | 850.64 | 23,040.88 | 100.0%
512 | 2,048 | 16x | 400.94 | 309.40 | 1.41 | 872.75 | 1,606.08 | 24,883.00 | 100.0%
512 | 2,048 | 32x | 661.64 | 466.45 | 0.86 | 1,700.50 | 2,618.60 | 32,235.41 | 100.0%
512 | 2,048 | 64x | 909.07 | 603.47 | 0.57 | 2,851.08 | 4,353.57 | 35,803.04 | 100.0%
512 | 2,048 | 128x | 1,423.92 | 964.62 | 0.37 | 5,284.05 | 7,081.05 | 43,020.78 | 100.0%
512 | 2,048 | 256x | 2,220.19 | 1,454.70 | 0.27 | 19,007.66 | 48,226.96 | 76,763.07 | 100.0%
512 | 2,048 | 512x | 2,507.55 | 1,685.17 | 0.24 | 51,802.48 | 110,453.56 | 137,639.18 | 100.0%
512 | 2,048 | 1024x | 2,515.23 | 1,695.44 | 0.24 | 54,393.26 | 114,036.42 | 141,199.01 | 50.5%
1,024 | 128 | 1x | 32.77 | 249.62 | 1.54 | 541.84 | 541.84 | 3,906.02 | 100.0%
1,024 | 128 | 2x | 65.34 | 498.72 | 1.04 | 547.14 | 550.50 | 3,916.87 | 100.0%
1,024 | 128 | 4x | 114.80 | 876.46 | 0.74 | 677.74 | 999.27 | 4,455.22 | 100.0%
1,024 | 128 | 8x | 204.60 | 1,565.03 | 0.41 | 945.52 | 1,587.34 | 4,999.89 | 100.0%
1,024 | 128 | 16x | 336.29 | 2,579.80 | 0.27 | 1,420.30 | 2,611.14 | 6,082.75 | 100.0%
1,024 | 128 | 32x | 516.00 | 3,991.24 | 0.18 | 2,520.95 | 4,079.25 | 7,854.14 | 100.0%
1,024 | 128 | 64x | 709.45 | 5,447.65 | 0.13 | 4,268.29 | 7,505.46 | 11,500.44 | 100.0%
1,024 | 128 | 128x | 685.26 | 5,288.06 | 0.12 | 7,698.25 | 16,823.76 | 20,309.04 | 100.0%
1,024 | 512 | 1x | 37.67 | 71.73 | 4.33 | 40.07 | 40.07 | 13,591.73 | 100.0%
1,024 | 512 | 2x | 75.27 | 143.63 | 3.03 | 40.21 | 41.15 | 13,601.13 | 100.0%
1,024 | 512 | 4x | 149.21 | 284.79 | 1.98 | 50.78 | 68.27 | 13,719.78 | 100.0%
1,024 | 512 | 8x | 287.07 | 551.54 | 1.02 | 255.31 | 575.68 | 14,197.58 | 100.0%
1,024 | 512 | 16x | 541.61 | 1,045.10 | 0.64 | 496.37 | 1,065.29 | 15,026.26 | 100.0%
1,024 | 512 | 32x | 950.57 | 1,823.33 | 0.36 | 1,156.79 | 2,071.11 | 17,213.68 | 100.0%
1,024 | 512 | 64x | 1,561.08 | 3,104.99 | 0.21 | 2,456.02 | 4,138.93 | 20,233.15 | 100.0%
1,024 | 512 | 128x | 1,391.03 | 2,785.50 | 0.19 | 7,093.19 | 24,973.30 | 42,755.56 | 100.0%
1,024 | 1,024 | 1x | 36.93 | 39.52 | 6.15 | 547.07 | 547.07 | 24,642.86 | 100.0%
1,024 | 1,024 | 2x | 72.90 | 77.37 | 4.43 | 475.92 | 537.25 | 25,188.60 | 100.0%
1,024 | 1,024 | 4x | 124.18 | 143.79 | 3.47 | 268.22 | 521.23 | 26,514.78 | 100.0%
1,024 | 1,024 | 8x | 248.82 | 317.06 | 1.79 | 374.12 | 576.87 | 24,451.64 | 100.0%
1,024 | 1,024 | 16x | 413.06 | 524.33 | 1.07 | 893.03 | 2,110.64 | 29,953.88 | 100.0%
1,024 | 1,024 | 32x | 736.70 | 992.77 | 0.58 | 1,759.96 | 2,542.07 | 29,453.49 | 100.0%
1,024 | 1,024 | 64x | 1,253.59 | 1,708.28 | 0.34 | 3,951.41 | 5,575.11 | 33,857.86 | 100.0%
1,024 | 1,024 | 128x | 1,504.16 | 2,075.41 | 0.24 | 7,608.60 | 23,808.11 | 47,487.74 | 100.0%
1,024 | 1,024 | 256x | 1,739.98 | 2,381.37 | 0.22 | 25,477.41 | 57,432.75 | 89,605.42 | 100.0%
1,024 | 1,024 | 512x | 1,982.52 | 2,716.27 | 0.21 | 53,847.21 | 113,239.73 | 141,004.24 | 81.6%
1,024 | 2,048 | 1x | 36.81 | 45.55 | 5.73 | 542.35 | 542.35 | 21,380.27 | 100.0%
1,024 | 2,048 | 2x | 67.73 | 86.79 | 4.12 | 543.94 | 546.90 | 22,308.79 | 100.0%
1,024 | 2,048 | 4x | 137.17 | 164.68 | 2.13 | 441.14 | 582.16 | 23,536.73 | 100.0%
1,024 | 2,048 | 8x | 212.44 | 251.86 | 1.77 | 1,191.36 | 1,586.00 | 30,252.11 | 100.0%
1,024 | 2,048 | 16x | 363.52 | 465.62 | 1.11 | 1,391.61 | 2,089.17 | 29,055.18 | 100.0%
1,024 | 2,048 | 32x | 441.89 | 536.15 | 0.82 | 2,799.92 | 4,051.69 | 33,131.56 | 100.0%
1,024 | 2,048 | 64x | 988.63 | 1,331.81 | 0.39 | 4,807.29 | 8,086.91 | 36,174.92 | 100.0%
1,024 | 2,048 | 128x | 1,608.33 | 2,211.24 | 0.24 | 7,620.72 | 13,204.27 | 47,811.40 | 100.0%
1,024 | 2,048 | 256x | 1,761.72 | 2,380.54 | 0.22 | 26,324.93 | 59,716.14 | 91,265.53 | 100.0%
1,024 | 2,048 | 512x | 1,984.75 | 2,643.29 | 0.21 | 52,953.44 | 112,229.76 | 140,618.72 | 78.9%
2,048 | 128 | 1x | 28.96 | 445.25 | 0.90 | 1,053.24 | 1,053.24 | 4,420.01 | 100.0%
2,048 | 128 | 2x | 57.81 | 885.05 | 0.63 | 1,057.48 | 1,060.70 | 4,426.99 | 100.0%
2,048 | 128 | 4x | 115.52 | 1,768.73 | 0.48 | 1,051.24 | 1,059.38 | 4,431.03 | 100.0%
2,048 | 128 | 8x | 155.88 | 2,382.71 | 0.30 | 1,584.27 | 3,119.11 | 6,564.51 | 100.0%
2,048 | 128 | 16x | 236.95 | 3,619.23 | 0.20 | 2,682.75 | 4,478.12 | 8,616.31 | 100.0%
2,048 | 128 | 32x | 339.33 | 5,183.66 | 0.14 | 4,515.22 | 8,075.24 | 12,046.74 | 100.0%
2,048 | 128 | 64x | 375.12 | 5,778.78 | 0.11 | 7,226.10 | 16,586.76 | 21,590.49 | 100.0%
2,048 | 128 | 128x | 413.43 | 6,350.24 | 0.11 | 13,914.48 | 26,554.80 | 36,025.11 | 100.0%
2,048 | 128 | 256x | 357.21 | 5,461.05 | 0.12 | 32,192.70 | 74,805.70 | 91,424.87 | 100.0%
2,048 | 512 | 1x | 35.01 | 134.57 | 2.70 | 1,060.26 | 1,060.26 | 14,624.52 | 100.0%
2,048 | 512 | 2x | 70.00 | 267.89 | 1.92 | 1,060.71 | 1,061.25 | 14,626.30 | 100.0%
2,048 | 512 | 4x | 105.15 | 464.64 | 1.10 | 2,072.49 | 3,084.30 | 16,865.73 | 100.0%
2,048 | 512 | 8x | 220.39 | 915.00 | 0.79 | 1,687.26 | 3,061.77 | 17,103.29 | 100.0%
2,048 | 512 | 16x | 374.78 | 1,612.59 | 0.47 | 3,316.78 | 5,124.15 | 19,367.17 | 100.0%
2,048 | 512 | 32x | 518.80 | 2,136.95 | 0.30 | 5,949.58 | 13,095.16 | 29,256.39 | 100.0%
2,048 | 512 | 64x | 825.93 | 3,298.83 | 0.19 | 6,999.60 | 18,636.21 | 37,866.38 | 100.0%
2,048 | 512 | 128x | 918.24 | 3,622.52 | 0.16 | 14,947.53 | 38,036.37 | 59,474.00 | 100.0%
2,048 | 512 | 256x | 1,086.20 | 4,342.90 | 0.15 | 39,813.97 | 78,340.69 | 107,484.92 | 100.0%
2,048 | 512 | 512x | 1,113.80 | 4,420.97 | 0.16 | 55,641.09 | 114,750.46 | 137,237.39 | 61.9%
2,048 | 1,024 | 1x | 35.21 | 118.04 | 3.06 | 1,050.64 | 1,050.64 | 16,645.82 | 100.0%
2,048 | 1,024 | 2x | 64.61 | 175.35 | 2.61 | 1,065.56 | 1,070.87 | 22,108.15 | 100.0%
2,048 | 1,024 | 4x | 107.64 | 275.57 | 1.90 | 553.69 | 1,072.22 | 27,538.80 | 100.0%
2,048 | 1,024 | 8x | 191.86 | 574.30 | 1.20 | 1,824.01 | 3,076.97 | 26,096.41 | 100.0%
2,048 | 1,024 | 16x | 283.79 | 958.04 | 0.70 | 3,466.94 | 5,531.31 | 29,224.01 | 100.0%
2,048 | 1,024 | 32x | 527.58 | 1,540.39 | 0.44 | 5,154.62 | 10,145.51 | 38,741.41 | 100.0%
2,048 | 1,024 | 64x | 895.01 | 2,723.87 | 0.27 | 7,900.85 | 13,733.76 | 44,357.33 | 100.0%
2,048 | 1,024 | 128x | 1,059.25 | 2,953.35 | 0.20 | 13,763.94 | 44,792.62 | 76,494.56 | 100.0%
2,048 | 1,024 | 256x | 1,354.32 | 3,787.48 | 0.18 | 42,272.01 | 91,134.05 | 122,340.13 | 100.0%
2,048 | 1,024 | 512x | 1,322.18 | 3,726.25 | 0.18 | 54,536.92 | 112,584.20 | 143,982.56 | 56.8%
2,048 | 2,048 | 1x | 35.76 | 93.21 | 3.70 | 1,053.66 | 1,053.66 | 21,086.37 | 100.0%
2,048 | 2,048 | 2x | 65.67 | 219.40 | 2.20 | 1,065.52 | 1,068.92 | 17,712.98 | 100.0%
2,048 | 2,048 | 4x | 70.41 | 141.63 | 3.10 | 1,049.65 | 1,066.88 | 49,831.98 | 100.0%
2,048 | 2,048 | 8x | 181.62 | 658.95 | 1.02 | 1,703.60 | 3,096.60 | 23,285.41 | 100.0%
2,048 | 2,048 | 16x | 299.32 | 877.74 | 0.70 | 3,820.95 | 6,093.39 | 33,485.99 | 100.0%
2,048 | 2,048 | 32x | 335.91 | 949.90 | 0.59 | 5,316.71 | 9,092.17 | 35,963.94 | 100.0%
2,048 | 2,048 | 64x | 921.79 | 2,677.89 | 0.27 | 7,496.30 | 14,154.75 | 44,086.72 | 100.0%
2,048 | 2,048 | 128x | 1,099.00 | 3,109.87 | 0.22 | 14,283.69 | 39,067.24 | 70,100.14 | 100.0%
2,048 | 2,048 | 256x | 1,227.23 | 3,400.06 | 0.19 | 44,171.46 | 98,211.26 | 133,375.05 | 100.0%
2,048 | 2,048 | 512x | 1,323.67 | 3,659.88 | 0.18 | 53,419.33 | 111,541.69 | 142,419.14 | 55.9%
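
One way to sanity-check how concurrency, throughput, and latency relate in these results is Little's law (requests in flight = request rate x time per request). The sketch below uses it to estimate mean end-to-end latency from a run's concurrency and Output TPS. The function name is hypothetical, and the estimate assumes steady state and that every request generates exactly the configured number of output tokens; the report confirms neither, so treat the result as a rough cross-check. The example inputs are the 128-input, 128-output runs at 1x and 64x concurrency.

```python
def implied_latency_ms(concurrency: int, output_tps: float, output_tokens: int) -> float:
    """Estimate mean end-to-end latency via Little's law.

    Assumptions: steady state, and each request produces exactly
    output_tokens tokens (so request rate = output_tps / output_tokens).
    """
    requests_per_second = output_tps / output_tokens
    return concurrency / requests_per_second * 1000.0


if __name__ == "__main__":
    # (concurrency, Output TPS, reported E2E P95 in ms) for the 128/128 runs
    for conc, tps, e2e_p95 in [(1, 32.04, 3995.19), (64, 1588.83, 5123.60)]:
        est = implied_latency_ms(conc, tps, 128)
        print(f"{conc}x: estimated {est:,.0f} ms vs reported E2E P95 {e2e_p95:,.2f} ms")
```

For these two runs the estimate lands within about 1% of the reported E2E P95. A large gap on other rows would suggest that one of the assumptions does not hold there, for example requests finishing before the configured output length.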

Hardware Configuration

GPU Manufacturer: NVIDIA
GPU Model: NVIDIA H20
GPU Count: 8
GPU Memory (Total): 760 GB
GPU Driver: 570.195.03
CUDA Version: Unknown
Compute Capability: 9.0
Power Limit (per GPU): 500 W
CPU Model: Intel(R) Xeon(R) Platinum 8469C
RAM: 1,007 GB

Software Configuration

Inference Framework: vLLM
Framework Version: v0.9.0
OS: Ubuntu
OS Version: 24.04.3 LTS (Noble Numbat)
Kernel Version: 6.8.0-79-generic
Python Version: 3.12.3

Model Configuration

Provider: meta-llama
Model Name: llama-3.3-70b-instruct
Quantization: BF16

Inference Configuration

Runtime parameters used across all benchmark runs

Max Model Length: Unknown
Tensor Parallel Size: 1
Pipeline Parallel Size: 1
GPU Memory Utilization: Unknown
Temperature: 0.70
Top-P: 1.00
Top-K: -1
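
As an illustration of how the sampling parameters above might be passed to a server, the sketch below builds a JSON request body for an OpenAI-compatible completions endpoint. It is not the harness used for this report. The helper name, the prompt, the `max_tokens` value, and the model identifier (provider and model name joined with a slash) are assumptions; `top_k` is not a standard OpenAI field, and the sketch assumes the server accepts it as an extra sampling parameter, with -1 meaning top-k filtering is disabled.

```python
import json

# Sampling values copied from the report's inference configuration.
SAMPLING = {"temperature": 0.70, "top_p": 1.00, "top_k": -1}


def build_completion_request(prompt: str, max_tokens: int) -> str:
    """Return a JSON body for an OpenAI-compatible /v1/completions call."""
    body = {
        # Assumption: the served model id is "<provider>/<model name>".
        "model": "meta-llama/llama-3.3-70b-instruct",
        "prompt": prompt,
        "max_tokens": max_tokens,  # hypothetical; matches one output-length setting
        **SAMPLING,
    }
    return json.dumps(body)


if __name__ == "__main__":
    print(build_completion_request("Hello", max_tokens=512))
```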