NVIDIA H200 NVL (2x) - llama-3.3-70b-instruct

November 5, 2025 at 01:17 AM

Dataset: reference (v1.0)

Best Performance

Best Output TPS: 5,005.29 (peak generation speed)
Best Input TPS: 11,042.39 (peak prefill speed)
Best Energy Efficiency: 0.03 kWh/MT (energy cost per 1M tokens)
Best TTFT (P95): 89.35 ms (lowest time to first token)
Best E2E (P95): 3,415.33 ms (lowest end-to-end latency)
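
The energy figure can be converted to more familiar units. The sketch below is a minimal illustration, not the benchmark's own method: it converts kWh per million tokens (kWh/MT) to joules per token and, assuming the metric counts output tokens only (the report does not say which tokens are counted), estimates the average power draw implied by a given throughput. The function names are hypothetical.

```python
JOULES_PER_KWH = 3_600_000.0  # 1 kWh = 3.6 MJ


def joules_per_token(kwh_per_million_tokens: float) -> float:
    """Convert an energy cost in kWh/MT to joules per token."""
    return kwh_per_million_tokens * JOULES_PER_KWH / 1_000_000.0


def implied_average_power_watts(kwh_per_million_tokens: float,
                                tokens_per_second: float) -> float:
    """Estimate average power as (joules per token) * (tokens per second).

    Assumption: the energy metric and the throughput count the same tokens.
    """
    return joules_per_token(kwh_per_million_tokens) * tokens_per_second


if __name__ == "__main__":
    # Best reported energy efficiency: 0.03 kWh/MT, about 0.108 J per token.
    print(f"{joules_per_token(0.03):.3f} J/token")
    # Best Output TPS run: 0.05 kWh/MT at 5,005.29 output tokens/s.
    print(f"{implied_average_power_watts(0.05, 5005.29):.0f} W (estimate)")
```

Because the reported energy cost is rounded to two decimals, any power estimate derived this way carries an uncertainty of roughly 10% at 0.05 kWh/MT and should be read as an order-of-magnitude check only.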

Test Matrix Results

Performance across different input/output token combinations and concurrency levels

Input Tokens | Output Tokens | Concurrency | Output TPS | Input TPS | Energy Cost (kWh/MT) | TTFT Mean (ms) | TTFT P95 (ms) | E2E P95 (ms) | Success Rate

Best Run for Output TPS
128 | 512 | 512x | 5,005.29 | 1,324.84 | 0.05 | 2,427.87 | 5,140.84 | 43,582.32 | 100.0%

Remaining Runs
128 | 128 | 1x | 27.92 | 26.61 | 2.68 | 795.04 | 795.04 | 4,583.48 | 100.0%
128 | 128 | 2x | 73.08 | 71.37 | 1.18 | 150.24 | 164.63 | 3,493.24 | 100.0%
128 | 128 | 4x | 142.02 | 140.92 | 0.60 | 192.74 | 215.24 | 3,590.97 | 100.0%
128 | 128 | 8x | 278.26 | 279.08 | 0.32 | 257.27 | 289.09 | 3,654.82 | 100.0%
128 | 128 | 16x | 482.30 | 503.79 | 0.18 | 360.08 | 417.61 | 4,058.49 | 100.0%
128 | 128 | 32x | 752.67 | 768.60 | 0.12 | 442.65 | 591.24 | 5,192.43 | 100.0%
128 | 128 | 64x | 1,289.49 | 1,299.86 | 0.08 | 541.75 | 865.63 | 6,041.39 | 100.0%
128 | 128 | 128x | 2,650.37 | 2,661.09 | 0.05 | 788.60 | 1,412.55 | 5,696.16 | 100.0%
128 | 128 | 256x | 3,398.56 | 3,431.27 | 0.04 | 1,373.25 | 3,773.89 | 8,499.89 | 100.0%
128 | 128 | 512x | 2,970.73 | 3,084.06 | 0.04 | 4,658.18 | 7,243.61 | 19,628.31 | 100.0%
128 | 512 | 1x | 37.81 | 9.01 | 4.32 | 89.35 | 89.35 | 13,541.46 | 100.0%
128 | 512 | 2x | 75.83 | 18.51 | 2.17 | 130.16 | 139.73 | 13,494.58 | 100.0%
128 | 512 | 4x | 149.26 | 37.02 | 1.10 | 155.39 | 164.07 | 13,712.81 | 100.0%
128 | 512 | 8x | 289.12 | 74.51 | 0.58 | 238.93 | 257.77 | 13,771.64 | 100.0%
128 | 512 | 16x | 603.17 | 152.74 | 0.29 | 255.61 | 268.53 | 13,452.28 | 100.0%
128 | 512 | 32x | 920.12 | 252.04 | 0.19 | 338.18 | 360.78 | 16,146.04 | 100.0%
128 | 512 | 64x | 1,856.85 | 484.35 | 0.11 | 418.79 | 508.80 | 16,556.82 | 100.0%
128 | 512 | 128x | 3,275.42 | 853.07 | 0.07 | 632.27 | 844.45 | 18,653.78 | 100.0%
128 | 512 | 256x | 4,186.70 | 1,109.88 | 0.06 | 697.02 | 2,597.40 | 25,169.79 | 100.0%
128 | 512 | 1024x | 4,755.61 | 1,263.01 | 0.05 | 19,924.91 | 47,890.56 | 86,526.48 | 100.0%
128 | 1,024 | 1x | 36.72 | 5.28 | 4.95 | 92.31 | 92.31 | 23,067.54 | 100.0%
128 | 1,024 | 2x | 63.71 | 10.95 | 2.81 | 133.18 | 145.18 | 22,430.66 | 100.0%
128 | 1,024 | 4x | 117.27 | 20.47 | 1.53 | 186.52 | 202.67 | 24,085.42 | 100.0%
128 | 1,024 | 8x | 223.77 | 37.67 | 0.82 | 251.62 | 276.75 | 25,891.73 | 100.0%
128 | 1,024 | 16x | 457.13 | 81.44 | 0.42 | 314.47 | 371.98 | 22,830.40 | 100.0%
128 | 1,024 | 32x | 840.78 | 146.31 | 0.24 | 411.31 | 543.31 | 26,436.87 | 100.0%
128 | 1,024 | 64x | 1,494.27 | 248.33 | 0.14 | 518.99 | 809.35 | 32,503.49 | 100.0%
128 | 1,024 | 128x | 2,760.55 | 462.11 | 0.09 | 755.12 | 1,386.43 | 34,825.00 | 100.0%
128 | 1,024 | 256x | 3,914.60 | 678.59 | 0.07 | 931.25 | 1,602.15 | 46,388.73 | 100.0%
128 | 1,024 | 512x | 4,600.33 | 798.65 | 0.06 | 1,185.02 | 2,683.50 | 76,297.81 | 100.0%
128 | 1,024 | 1024x | 4,528.41 | 785.95 | 0.06 | 11,352.58 | 26,843.03 | 107,604.55 | 100.0%
128 | 2,048 | 1x | 37.88 | 6.28 | 4.68 | 91.42 | 91.42 | 19,400.24 | 100.0%
128 | 2,048 | 2x | 60.08 | 10.31 | 2.99 | 134.36 | 146.61 | 23,717.39 | 100.0%
128 | 2,048 | 4x | 118.36 | 20.92 | 1.52 | 189.71 | 202.43 | 23,676.08 | 100.0%
128 | 2,048 | 8x | 213.82 | 40.51 | 0.84 | 258.96 | 281.88 | 23,275.86 | 100.0%
128 | 2,048 | 16x | 503.08 | 88.18 | 0.38 | 353.34 | 403.18 | 22,803.79 | 100.0%
128 | 2,048 | 32x | 671.79 | 121.13 | 0.30 | 399.77 | 515.36 | 26,463.96 | 100.0%
128 | 2,048 | 64x | 1,474.69 | 252.27 | 0.15 | 549.11 | 865.90 | 29,670.19 | 100.0%
128 | 2,048 | 128x | 2,246.25 | 382.41 | 0.11 | 767.07 | 1,288.36 | 35,194.88 | 100.0%
128 | 2,048 | 256x | 3,381.87 | 568.64 | 0.07 | 939.70 | 1,608.96 | 50,723.91 | 100.0%
128 | 2,048 | 512x | 4,195.17 | 729.80 | 0.06 | 1,499.58 | 2,814.41 | 77,645.75 | 100.0%
128 | 2,048 | 1024x | 4,270.75 | 736.14 | 0.06 | 21,339.36 | 53,492.77 | 122,579.06 | 100.0%
512 | 128 | 1x | 37.47 | 145.20 | 1.00 | 131.14 | 131.14 | 3,415.33 | 100.0%
512 | 128 | 2x | 53.82 | 209.17 | 0.58 | 1,451.36 | 1,452.90 | 4,753.99 | 100.0%
512 | 128 | 4x | 139.02 | 537.06 | 0.27 | 259.85 | 291.68 | 3,674.69 | 100.0%
512 | 128 | 8x | 267.92 | 1,033.49 | 0.14 | 329.30 | 434.61 | 3,797.43 | 100.0%
512 | 128 | 16x | 502.82 | 1,948.44 | 0.09 | 494.13 | 721.57 | 4,026.33 | 100.0%
512 | 128 | 32x | 816.04 | 3,291.61 | 0.06 | 738.66 | 1,307.29 | 4,749.16 | 100.0%
512 | 128 | 64x | 1,266.51 | 4,929.62 | 0.04 | 1,220.14 | 2,326.78 | 6,276.78 | 100.0%
512 | 128 | 128x | 1,760.98 | 6,903.37 | 0.03 | 1,862.24 | 4,230.29 | 8,836.21 | 100.0%
512 | 128 | 256x | 2,035.71 | 7,948.80 | 0.03 | 2,873.79 | 7,833.28 | 15,098.84 | 100.0%
512 | 128 | 512x | 2,093.83 | 8,200.93 | 0.03 | 4,745.84 | 14,808.81 | 26,285.51 | 100.0%
512 | 128 | 1024x | 2,157.03 | 8,418.74 | 0.03 | 10,380.22 | 37,590.13 | 47,400.88 | 100.0%
512 | 512 | 1x | 38.19 | 38.66 | 2.68 | 123.85 | 123.85 | 12,801.30 | 100.0%
512 | 512 | 2x | 75.37 | 73.23 | 1.38 | 158.62 | 160.15 | 13,584.37 | 100.0%
512 | 512 | 4x | 147.38 | 142.34 | 0.74 | 252.07 | 274.96 | 13,890.65 | 100.0%
512 | 512 | 8x | 279.83 | 278.35 | 0.38 | 328.40 | 441.99 | 14,168.39 | 100.0%
512 | 512 | 16x | 541.17 | 560.37 | 0.21 | 568.60 | 696.02 | 14,130.77 | 100.0%
512 | 512 | 32x | 959.12 | 1,037.37 | 0.12 | 719.10 | 1,274.92 | 15,281.08 | 100.0%
512 | 512 | 64x | 1,696.68 | 1,751.87 | 0.08 | 1,067.15 | 2,296.39 | 18,000.80 | 100.0%
512 | 512 | 128x | 2,700.04 | 2,823.38 | 0.06 | 1,864.28 | 4,337.15 | 22,154.65 | 100.0%
512 | 512 | 256x | 3,411.42 | 3,486.46 | 0.05 | 2,826.83 | 7,823.03 | 35,474.48 | 100.0%
512 | 512 | 512x | 3,590.84 | 3,666.14 | 0.04 | 4,569.40 | 14,453.39 | 63,376.18 | 100.0%
512 | 512 | 1024x | 3,293.07 | 3,356.23 | 0.05 | 43,812.06 | 94,115.43 | 145,345.51 | 99.7%
512 | 1,024 | 1x | 36.22 | 24.28 | 3.35 | 132.18 | 132.18 | 20,401.79 | 100.0%
512 | 1,024 | 2x | 58.08 | 46.01 | 1.92 | 197.58 | 226.31 | 21,253.58 | 100.0%
512 | 1,024 | 4x | 134.28 | 112.40 | 0.84 | 255.14 | 286.95 | 17,424.14 | 100.0%
512 | 1,024 | 8x | 212.96 | 144.93 | 0.61 | 329.41 | 435.94 | 25,094.45 | 100.0%
512 | 1,024 | 16x | 413.65 | 310.78 | 0.31 | 605.18 | 771.92 | 24,535.04 | 100.0%
512 | 1,024 | 32x | 674.40 | 491.53 | 0.19 | 740.83 | 1,324.67 | 30,777.87 | 100.0%
512 | 1,024 | 64x | 1,296.67 | 881.48 | 0.11 | 1,242.08 | 2,355.63 | 35,918.05 | 100.0%
512 | 1,024 | 128x | 2,443.69 | 1,632.67 | 0.07 | 1,841.44 | 4,333.95 | 38,517.16 | 100.0%
512 | 1,024 | 256x | 3,216.14 | 2,201.34 | 0.06 | 2,882.40 | 7,831.90 | 56,774.49 | 100.0%
512 | 1,024 | 512x | 3,289.63 | 2,231.17 | 0.06 | 4,712.96 | 15,020.97 | 103,062.90 | 100.0%
512 | 1,024 | 1024x | 3,199.39 | 2,185.86 | 0.06 | 43,269.87 | 119,174.75 | 176,237.73 | 79.2%
512 | 2,048 | 1x | 33.91 | 32.47 | 2.88 | 132.64 | 132.64 | 15,245.17 | 100.0%
512 | 2,048 | 2x | 69.60 | 47.30 | 1.79 | 147.85 | 175.12 | 20,835.47 | 100.0%
512 | 2,048 | 4x | 131.75 | 98.56 | 0.93 | 251.57 | 283.98 | 19,943.97 | 100.0%
512 | 2,048 | 8x | 209.58 | 158.32 | 0.59 | 355.20 | 439.76 | 24,817.30 | 100.0%
512 | 2,048 | 16x | 397.11 | 320.34 | 0.32 | 499.62 | 730.87 | 22,648.00 | 100.0%
512 | 2,048 | 32x | 696.97 | 492.43 | 0.21 | 779.04 | 1,347.22 | 27,111.52 | 100.0%
512 | 2,048 | 64x | 788.77 | 532.04 | 0.18 | 1,251.45 | 2,380.14 | 34,781.22 | 100.0%
512 | 2,048 | 128x | 1,953.87 | 1,318.56 | 0.09 | 1,809.66 | 4,134.34 | 40,569.76 | 100.0%
512 | 2,048 | 256x | 2,982.51 | 1,995.08 | 0.06 | 2,882.33 | 7,894.19 | 57,801.13 | 100.0%
512 | 2,048 | 512x | 3,007.98 | 2,023.12 | 0.06 | 4,777.11 | 14,798.78 | 105,435.54 | 100.0%
512 | 2,048 | 1024x | 3,074.73 | 2,068.80 | 0.06 | 43,876.74 | 118,550.31 | 178,204.13 | 80.0%
1,024 | 128 | 1x | 36.15 | 275.35 | 0.57 | 189.49 | 189.49 | 3,540.71 | 100.0%
1,024 | 128 | 2x | 71.63 | 546.73 | 0.29 | 220.71 | 221.92 | 3,571.25 | 100.0%
1,024 | 128 | 4x | 134.03 | 1,023.30 | 0.16 | 350.91 | 413.11 | 3,814.30 | 100.0%
1,024 | 128 | 8x | 247.88 | 1,896.15 | 0.09 | 584.74 | 715.59 | 4,121.60 | 100.0%
1,024 | 128 | 16x | 389.90 | 2,992.57 | 0.06 | 1,542.10 | 1,863.52 | 5,214.90 | 100.0%
1,024 | 128 | 32x | 689.10 | 5,287.18 | 0.04 | 1,140.11 | 2,303.10 | 5,845.70 | 100.0%
1,024 | 128 | 64x | 948.96 | 7,329.76 | 0.03 | 1,827.68 | 4,293.72 | 8,400.83 | 100.0%
1,024 | 128 | 128x | 1,197.35 | 9,203.72 | 0.03 | 3,025.68 | 8,419.82 | 13,266.40 | 100.0%
1,024 | 128 | 256x | 1,308.59 | 10,106.31 | 0.03 | 5,067.11 | 15,421.40 | 23,981.27 | 100.0%
1,024 | 128 | 512x | 1,254.41 | 9,644.75 | 0.03 | 10,452.63 | 38,032.41 | 47,257.35 | 100.0%
1,024 | 512 | 1x | 32.57 | 62.03 | 2.01 | 192.21 | 192.21 | 15,717.97 | 100.0%
1,024 | 512 | 2x | 74.72 | 142.58 | 0.94 | 222.23 | 224.36 | 13,702.00 | 100.0%
1,024 | 512 | 4x | 144.99 | 276.74 | 0.49 | 343.23 | 405.11 | 14,117.09 | 100.0%
1,024 | 512 | 8x | 283.36 | 541.89 | 0.27 | 574.24 | 702.11 | 14,439.58 | 100.0%
1,024 | 512 | 16x | 544.54 | 1,056.34 | 0.15 | 938.37 | 1,253.85 | 14,837.07 | 100.0%
1,024 | 512 | 32x | 976.17 | 1,872.55 | 0.09 | 1,240.04 | 2,263.43 | 16,685.81 | 100.0%
1,024 | 512 | 64x | 1,516.78 | 2,977.14 | 0.06 | 1,825.45 | 4,371.97 | 20,923.48 | 100.0%
1,024 | 512 | 128x | 2,254.67 | 4,477.24 | 0.05 | 2,997.03 | 8,358.80 | 27,654.52 | 100.0%
1,024 | 512 | 256x | 2,652.84 | 5,292.04 | 0.04 | 5,018.41 | 15,351.38 | 46,566.72 | 100.0%
1,024 | 512 | 512x | 2,208.24 | 4,403.71 | 0.05 | 18,729.17 | 82,899.88 | 108,919.26 | 100.0%
1,024 | 1,024 | 1x | 37.21 | 45.70 | 2.50 | 197.47 | 197.47 | 21,307.30 | 100.0%
1,024 | 1,024 | 2x | 69.12 | 87.47 | 1.33 | 191.88 | 247.97 | 22,153.56 | 100.0%
1,024 | 1,024 | 4x | 120.64 | 139.64 | 0.82 | 352.89 | 418.27 | 27,367.78 | 100.0%
1,024 | 1,024 | 8x | 220.76 | 277.08 | 0.44 | 584.42 | 718.45 | 27,241.81 | 100.0%
1,024 | 1,024 | 16x | 422.16 | 573.65 | 0.23 | 937.53 | 1,251.88 | 25,331.13 | 100.0%
1,024 | 1,024 | 32x | 742.71 | 1,028.34 | 0.14 | 1,092.03 | 2,277.18 | 28,857.04 | 100.0%
1,024 | 1,024 | 64x | 1,239.86 | 1,751.81 | 0.09 | 1,799.75 | 4,324.09 | 33,495.82 | 100.0%
1,024 | 1,024 | 128x | 2,002.23 | 2,800.45 | 0.06 | 2,993.19 | 8,386.62 | 44,129.81 | 100.0%
1,024 | 1,024 | 256x | 2,634.61 | 3,612.40 | 0.05 | 5,022.06 | 15,430.19 | 67,922.29 | 100.0%
1,024 | 1,024 | 512x | 2,312.13 | 3,169.70 | 0.06 | 23,316.56 | 109,238.54 | 149,854.09 | 100.0%
1,024 | 2,048 | 1x | 37.73 | 44.97 | 2.55 | 193.25 | 193.25 | 21,654.08 | 100.0%
1,024 | 2,048 | 2x | 73.80 | 96.01 | 1.23 | 231.57 | 245.90 | 20,292.35 | 100.0%
1,024 | 2,048 | 4x | 114.20 | 139.15 | 0.81 | 259.21 | 293.32 | 27,554.89 | 100.0%
1,024 | 2,048 | 8x | 212.57 | 249.07 | 0.47 | 381.98 | 457.79 | 29,761.01 | 100.0%
1,024 | 2,048 | 16x | 357.09 | 461.26 | 0.28 | 615.23 | 990.38 | 28,212.58 | 100.0%
1,024 | 2,048 | 32x | 514.74 | 638.84 | 0.20 | 1,104.98 | 2,285.09 | 33,406.86 | 100.0%
1,024 | 2,048 | 64x | 1,076.83 | 1,404.89 | 0.10 | 1,753.62 | 4,248.03 | 37,716.58 | 100.0%
1,024 | 2,048 | 128x | 1,326.67 | 1,778.96 | 0.09 | 2,998.11 | 8,342.38 | 43,383.76 | 100.0%
1,024 | 2,048 | 256x | 2,179.74 | 2,964.39 | 0.06 | 4,890.72 | 14,737.14 | 69,248.52 | 100.0%
1,024 | 2,048 | 512x | 2,201.84 | 2,973.83 | 0.06 | 23,117.45 | 109,174.36 | 150,616.80 | 100.0%
1,024 | 2,048 | 1024x | 1,747.02 | 2,354.15 | 0.07 | 26,081.90 | 49,347.33 | 155,676.79 | 41.6%
2,048 | 128 | 1x | 33.98 | 522.43 | 0.35 | 396.35 | 396.35 | 3,767.08 | 100.0%
2,048 | 128 | 2x | 68.61 | 1,050.39 | 0.17 | 252.68 | 360.49 | 3,722.55 | 100.0%
2,048 | 128 | 4x | 125.18 | 1,916.63 | 0.10 | 538.45 | 662.00 | 4,080.61 | 100.0%
2,048 | 128 | 8x | 219.13 | 3,349.45 | 0.06 | 774.73 | 1,200.31 | 4,647.26 | 100.0%
2,048 | 128 | 16x | 353.46 | 5,443.96 | 0.04 | 1,357.33 | 2,244.19 | 5,693.25 | 100.0%
2,048 | 128 | 32x | 501.59 | 7,662.50 | 0.03 | 1,825.28 | 4,347.49 | 8,045.08 | 100.0%
2,048 | 128 | 64x | 624.30 | 9,537.49 | 0.03 | 2,979.22 | 8,482.95 | 12,887.04 | 100.0%
2,048 | 128 | 128x | 721.87 | 11,042.39 | 0.03 | 5,221.11 | 16,636.89 | 22,153.17 | 100.0%
2,048 | 128 | 256x | 676.48 | 10,366.11 | 0.03 | 10,927.62 | 38,866.21 | 46,286.87 | 100.0%
2,048 | 512 | 1x | 37.66 | 144.74 | 1.18 | 372.84 | 372.84 | 13,596.60 | 100.0%
2,048 | 512 | 2x | 73.79 | 282.41 | 0.58 | 257.63 | 367.45 | 13,871.15 | 100.0%
2,048 | 512 | 4x | 141.68 | 542.30 | 0.31 | 542.82 | 668.95 | 14,449.08 | 100.0%
2,048 | 512 | 8x | 240.51 | 1,032.79 | 0.18 | 700.49 | 1,175.35 | 15,131.33 | 100.0%
2,048 | 512 | 16x | 474.02 | 1,930.21 | 0.10 | 1,322.21 | 2,217.70 | 16,150.44 | 100.0%
2,048 | 512 | 32x | 768.24 | 3,214.76 | 0.07 | 1,815.93 | 4,368.52 | 19,348.21 | 100.0%
2,048 | 512 | 64x | 1,168.97 | 4,773.66 | 0.05 | 2,990.34 | 8,496.96 | 25,973.89 | 100.0%
2,048 | 512 | 128x | 1,615.10 | 6,443.91 | 0.04 | 5,186.61 | 16,614.46 | 38,315.72 | 100.0%
2,048 | 512 | 256x | 1,407.60 | 5,666.35 | 0.04 | 15,135.78 | 66,343.06 | 86,245.12 | 100.0%
2,048 | 1,024 | 1x | 34.56 | 125.04 | 1.29 | 330.72 | 330.72 | 15,711.60 | 100.0%
2,048 | 1,024 | 2x | 62.85 | 177.84 | 0.90 | 367.62 | 382.34 | 21,664.27 | 100.0%
2,048 | 1,024 | 4x | 121.19 | 332.74 | 0.48 | 465.67 | 616.18 | 23,207.62 | 100.0%
2,048 | 1,024 | 8x | 196.99 | 503.65 | 0.31 | 1,077.80 | 1,209.99 | 31,060.05 | 100.0%
2,048 | 1,024 | 16x | 351.09 | 1,059.87 | 0.16 | 1,624.96 | 2,247.09 | 27,929.83 | 100.0%
2,048 | 1,024 | 32x | 560.38 | 1,683.67 | 0.11 | 1,817.18 | 4,324.11 | 36,818.86 | 100.0%
2,048 | 1,024 | 64x | 1,023.30 | 3,047.48 | 0.07 | 2,978.01 | 8,366.42 | 39,153.75 | 100.0%
2,048 | 1,024 | 128x | 1,604.62 | 4,468.88 | 0.05 | 5,197.87 | 16,839.42 | 53,503.91 | 100.0%
2,048 | 1,024 | 256x | 1,564.89 | 4,367.55 | 0.05 | 16,177.46 | 73,770.45 | 106,816.38 | 100.0%
2,048 | 2,048 | 1x | 38.06 | 118.51 | 1.37 | 332.52 | 332.52 | 16,578.68 | 100.0%
2,048 | 2,048 | 2x | 71.48 | 251.46 | 0.66 | 245.84 | 354.24 | 15,494.76 | 100.0%
2,048 | 2,048 | 4x | 75.22 | 143.56 | 1.00 | 527.52 | 651.59 | 49,795.42 | 100.0%
2,048 | 2,048 | 8x | 195.94 | 541.74 | 0.28 | 585.93 | 944.75 | 27,621.33 | 100.0%
2,048 | 2,048 | 16x | 295.81 | 977.75 | 0.17 | 1,221.68 | 1,716.47 | 31,237.02 | 100.0%
2,048 | 2,048 | 32x | 368.85 | 1,042.50 | 0.16 | 1,625.86 | 3,324.90 | 38,792.39 | 100.0%
2,048 | 2,048 | 64x | 679.14 | 1,893.87 | 0.10 | 2,320.12 | 6,195.84 | 38,422.89 | 100.0%
2,048 | 2,048 | 128x | 1,131.85 | 3,012.85 | 0.07 | 5,092.48 | 16,115.27 | 55,709.06 | 100.0%
2,048 | 2,048 | 256x | 1,402.47 | 3,893.69 | 0.06 | 16,574.35 | 76,412.29 | 109,099.05 | 100.0%
2,048 | 2,048 | 512x | 1,171.67 | 3,201.60 | 0.07 | 46,965.95 | 116,072.02 | 156,419.55 | 59.6%
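
One way to read a concurrency sweep from the results above is to divide aggregate Output TPS by the number of concurrent requests and to look for the concurrency at which aggregate throughput peaks. The sketch below is a minimal illustration of that arithmetic using the 128-input / 128-output rows; the helper names are hypothetical and the approach is an assumption about how a reader might analyse the data, not part of the benchmark itself.

```python
# (concurrency, aggregate Output TPS) for input=128, output=128,
# copied from the test matrix results.
SWEEP_128_128 = [
    (1, 27.92), (2, 73.08), (4, 142.02), (8, 278.26), (16, 482.30),
    (32, 752.67), (64, 1289.49), (128, 2650.37), (256, 3398.56),
    (512, 2970.73),
]


def per_request_tps(concurrency: int, output_tps: float) -> float:
    """Average generation speed seen by each concurrent request."""
    return output_tps / concurrency


def saturation_point(sweep):
    """Return the (concurrency, output_tps) pair with the highest throughput."""
    return max(sweep, key=lambda row: row[1])


if __name__ == "__main__":
    for concurrency, tps in SWEEP_128_128:
        print(f"{concurrency:>4}x  aggregate={tps:>8.2f}  "
              f"per-request={per_request_tps(concurrency, tps):6.2f}")
    print("peak:", saturation_point(SWEEP_128_128))
```

For this sweep the aggregate throughput peaks at 256x and drops at 512x, while TTFT and E2E latency keep rising, which suggests the server is saturated beyond that point for this token combination.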

Hardware Configuration

GPU Manufacturer: NVIDIA
GPU Model: NVIDIA H200 NVL
GPU Count: 2
GPU Memory (Total): 280 GB
GPU Driver: 580.95.05
CUDA Version: Unknown
Compute Capability: 9.0
Power Limit (per GPU): 600 W
CPU Model: Intel(R) Xeon(R) 6960P
RAM: 2,267 GB

Software Configuration

Inference Framework: vLLM
Framework Version: 0.11.0
OS: Ubuntu
OS Version: 22.04.5 LTS (Jammy Jellyfish)
Kernel Version: 5.15.0-88-generic
Python Version: 3.10.12

Model Configuration

Provider: meta-llama
Model Name: llama-3.3-70b-instruct
Quantization: FP16

Inference Configuration

Runtime parameters used across all benchmark runs

Max Model Length: 8192
Tensor Parallel Size: 1
Pipeline Parallel Size: 1
GPU Memory Utilization: 0.90
Temperature: 0.70
Top-P: 1.00
Top-K: -1
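
The report does not show how the server was launched. As a rough sketch only, the parameters above could be mapped onto a `vllm serve` command line and a per-request sampling payload as follows. The Hugging Face model ID casing, the helper names, and the use of the OpenAI-compatible server are assumptions; the numeric values are copied from the configuration listed above.

```python
import shlex

# Engine parameters copied from the Inference Configuration section.
REPORTED_CONFIG = {
    "max_model_len": 8192,
    "tensor_parallel_size": 1,
    "pipeline_parallel_size": 1,
    "gpu_memory_utilization": 0.90,
}

# Assumption: the report gives only provider "meta-llama" and model name
# "llama-3.3-70b-instruct"; this repository ID casing is a guess.
MODEL_ID = "meta-llama/Llama-3.3-70B-Instruct"


def build_vllm_serve_command(model_id: str, config: dict) -> list:
    """Turn snake_case engine parameters into `vllm serve` CLI flags."""
    argv = ["vllm", "serve", model_id]
    for key, value in config.items():
        argv += ["--" + key.replace("_", "-"), str(value)]
    return argv


def sampling_payload(max_tokens: int) -> dict:
    """Sampling fields for one request; max_tokens is the output length under test."""
    return {"temperature": 0.70, "top_p": 1.00, "top_k": -1,
            "max_tokens": max_tokens}


if __name__ == "__main__":
    print(shlex.join(build_vllm_serve_command(MODEL_ID, REPORTED_CONFIG)))
    print(sampling_payload(512))
```

Keeping the engine parameters in a dictionary makes it easy to compare a local reproduction against the reported configuration before launching anything.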