NVIDIA H200 NVL (2x) - llama-2-70b-hf (50% Max Batch Token)

November 6, 2025 at 05:32 AM

Dataset: reference (v1.0)

Best Performance

Best Output TPS: 4,620.81 (peak generation speed)
Best Input TPS: 8,844.22 (peak prefill speed)
Best Energy Efficiency: 0.03 kWh/MT (energy cost per 1M tokens)
Best TTFT (P95): 51.53 ms (lowest latency)
Best E2E (P95): 417.20 ms (lowest latency)
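
The energy figure is expressed in kWh per million tokens (kWh/MT). The report does not state how the value is derived or which tokens it counts, so the following is only a sketch of one common way to relate such a number to average power draw and throughput. The function name and the sample wattage and throughput are hypothetical and are not taken from the benchmark.

```python
def kwh_per_million_tokens(avg_power_watts: float, tokens_per_second: float) -> float:
    """Energy needed to process one million tokens, in kWh.

    Assumes a constant average power draw and a constant throughput.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    seconds_per_million = 1_000_000 / tokens_per_second
    joules = avg_power_watts * seconds_per_million
    return joules / 3_600_000  # 1 kWh = 3.6e6 J


# Hypothetical input: 1,000 W average draw at 5,000 tokens/s.
example = kwh_per_million_tokens(1_000, 5_000)
print(f"{example:.3f} kWh/MT")
```

Under these assumptions, a higher throughput at the same power draw lowers the energy cost per token, which would be consistent with the lowest kWh/MT values appearing in high-concurrency runs.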

Test Matrix Results

Performance across different input/output token combinations and concurrency levels

Input Tokens | Output Tokens | Concurrency | Output TPS | Input TPS | Energy Cost (kWh/MT) | TTFT Mean (ms) | TTFT P95 (ms) | E2E P95 (ms) | Success Rate
Best Run for Output TPS
128 | 512 | 1024x | 4,620.81 | 1,285.70 | 0.05 | 13,890.51 | 45,498.05 | 75,536.60 | 99.7%
128 | 128 | 1x | 40.70 | 38.79 | 2.06 | 249.17 | 249.17 | 3,144.66 | 100.0%
128 | 128 | 2x | 84.58 | 85.27 | 1.21 | 68.94 | 79.71 | 2,921.35 | 100.0%
128 | 128 | 4x | 132.08 | 167.33 | 0.65 | 98.40 | 112.04 | 3,026.15 | 100.0%
128 | 128 | 8x | 290.96 | 330.54 | 0.32 | 158.48 | 181.50 | 3,098.11 | 100.0%
128 | 128 | 16x | 429.53 | 432.68 | 0.21 | 1,713.49 | 1,756.24 | 4,744.26 | 100.0%
128 | 128 | 32x | 1,062.73 | 1,102.41 | 0.10 | 335.31 | 509.94 | 3,505.37 | 96.9%
128 | 128 | 64x | 1,873.37 | 1,929.94 | 0.06 | 365.14 | 622.47 | 4,079.11 | 100.0%
128 | 128 | 128x | 2,806.30 | 3,015.84 | 0.05 | 461.99 | 1,103.72 | 5,017.62 | 100.0%
128 | 128 | 256x | 2,730.77 | 3,254.06 | 0.05 | 2,215.28 | 3,906.33 | 9,279.74 | 100.0%
128 | 512 | 1x | 42.79 | 53.82 | 1.56 | 57.54 | 57.54 | 2,243.45 | 100.0%
128 | 512 | 2x | 53.62 | 81.25 | 1.67 | 63.57 | 75.82 | 2,938.26 | 100.0%
128 | 512 | 4x | 130.33 | 42.88 | 1.31 | 91.88 | 101.89 | 11,841.40 | 100.0%
128 | 512 | 8x | 262.65 | 79.45 | 0.63 | 112.31 | 129.96 | 12,908.75 | 100.0%
128 | 512 | 16x | 460.65 | 136.19 | 0.34 | 159.64 | 177.10 | 15,109.66 | 100.0%
128 | 512 | 32x | 1,232.45 | 321.47 | 0.16 | 161.22 | 192.47 | 12,688.76 | 100.0%
128 | 512 | 64x | 2,094.19 | 570.37 | 0.10 | 159.15 | 224.91 | 13,812.23 | 98.4%
128 | 512 | 128x | 3,606.33 | 956.10 | 0.07 | 153.98 | 237.20 | 15,618.61 | 100.0%
128 | 512 | 256x | 4,385.66 | 1,197.76 | 0.06 | 545.95 | 2,438.67 | 22,224.11 | 99.6%
128 | 512 | 512x | 4,586.66 | 1,263.56 | 0.05 | 2,147.84 | 4,978.44 | 44,935.51 | 99.6%
128 | 1,024 | 1x | 42.73 | 5.18 | 4.70 | 56.32 | 56.32 | 23,542.34 | 100.0%
128 | 1,024 | 2x | 63.24 | 10.70 | 3.11 | 71.62 | 73.65 | 22,697.16 | 100.0%
128 | 1,024 | 4x | 138.51 | 21.43 | 1.44 | 108.94 | 124.79 | 23,698.66 | 100.0%
128 | 1,024 | 8x | 171.09 | 43.36 | 1.10 | 176.07 | 199.06 | 23,661.75 | 100.0%
128 | 1,024 | 16x | 572.52 | 82.73 | 0.37 | 205.08 | 248.19 | 24,895.21 | 100.0%
128 | 1,024 | 32x | 1,059.84 | 160.32 | 0.22 | 283.33 | 425.72 | 25,503.85 | 100.0%
128 | 1,024 | 64x | 2,076.91 | 288.09 | 0.12 | 339.40 | 566.79 | 28,165.67 | 100.0%
128 | 1,024 | 128x | 3,290.92 | 499.37 | 0.08 | 464.35 | 1,073.86 | 32,070.36 | 99.2%
128 | 1,024 | 256x | 4,418.04 | 645.46 | 0.06 | 729.18 | 1,796.83 | 48,636.28 | 99.6%
128 | 1,024 | 512x | 4,462.24 | 657.55 | 0.06 | 1,524.86 | 3,810.41 | 92,176.35 | 99.6%
128 | 1,024 | 1024x | 3,445.74 | 507.68 | 0.07 | 8,253.12 | 50,705.05 | 148,280.87 | 61.9%
128 | 2,048 | 1x | 43.54 | 2.59 | 5.02 | 75.11 | 75.11 | 47,039.28 | 100.0%
128 | 2,048 | 2x | 49.18 | 5.32 | 4.24 | 82.34 | 93.87 | 44,914.89 | 100.0%
128 | 2,048 | 4x | 125.37 | 10.77 | 1.72 | 95.40 | 108.35 | 47,180.83 | 100.0%
128 | 2,048 | 8x | 341.78 | 21.42 | 0.66 | 166.10 | 183.24 | 47,921.86 | 100.0%
128 | 2,048 | 16x | 507.51 | 41.08 | 0.45 | 267.32 | 300.71 | 50,194.55 | 100.0%
128 | 2,048 | 32x | 1,049.47 | 79.21 | 0.24 | 246.18 | 404.91 | 51,690.45 | 100.0%
128 | 2,048 | 64x | 1,502.65 | 137.33 | 0.17 | 572.38 | 736.04 | 59,289.11 | 100.0%
128 | 2,048 | 128x | 3,175.12 | 239.08 | 0.10 | 527.09 | 1,067.38 | 68,093.49 | 100.0%
128 | 2,048 | 256x | 4,069.91 | 326.50 | 0.07 | 3,921.75 | 4,456.26 | 99,645.97 | 99.6%
128 | 2,048 | 512x | 3,412.85 | 270.33 | 0.09 | 2,408.95 | 4,874.55 | 237,316.31 | 99.6%
512 | 128 | 1x | 42.11 | 163.16 | 0.87 | 122.34 | 122.34 | 3,040.15 | 100.0%
512 | 128 | 2x | 84.57 | 328.71 | 0.46 | 134.54 | 135.92 | 3,022.41 | 100.0%
512 | 128 | 4x | 162.54 | 627.94 | 0.26 | 186.28 | 210.92 | 3,147.19 | 100.0%
512 | 128 | 8x | 307.23 | 1,185.12 | 0.15 | 333.18 | 375.53 | 3,323.56 | 100.0%
512 | 128 | 16x | 494.28 | 2,112.32 | 0.09 | 477.20 | 665.09 | 3,721.80 | 100.0%
512 | 128 | 32x | 920.24 | 3,585.94 | 0.05 | 639.93 | 1,269.74 | 4,370.13 | 100.0%
512 | 128 | 64x | 1,327.72 | 5,309.90 | 0.04 | 1,019.72 | 2,223.40 | 5,873.71 | 100.0%
512 | 128 | 128x | 1,703.84 | 7,070.54 | 0.03 | 1,862.52 | 4,495.59 | 8,663.99 | 99.2%
512 | 128 | 256x | 1,925.04 | 7,784.95 | 0.03 | 2,968.33 | 8,556.92 | 15,231.40 | 99.6%
512 | 128 | 512x | 1,924.50 | 7,897.10 | 0.03 | 5,269.31 | 16,052.70 | 26,054.14 | 99.4%
512 | 512 | 1x | 43.35 | 52.83 | 2.39 | 60.55 | 60.55 | 9,364.70 | 100.0%
512 | 512 | 2x | 87.49 | 85.01 | 1.28 | 70.46 | 83.18 | 11,701.14 | 100.0%
512 | 512 | 4x | 171.40 | 165.54 | 0.66 | 83.86 | 91.57 | 11,941.06 | 100.0%
512 | 512 | 8x | 309.53 | 329.11 | 0.36 | 116.45 | 130.34 | 11,992.62 | 100.0%
512 | 512 | 16x | 613.37 | 593.96 | 0.20 | 166.61 | 192.61 | 12,498.51 | 93.8%
512 | 512 | 32x | 1,057.27 | 1,233.56 | 0.11 | 179.87 | 198.46 | 12,854.04 | 100.0%
512 | 512 | 64x | 1,528.89 | 1,747.26 | 0.07 | 4,259.35 | 4,367.72 | 18,181.65 | 100.0%
512 | 512 | 128x | 3,372.60 | 3,663.02 | 0.05 | 188.10 | 281.42 | 16,684.77 | 100.0%
512 | 512 | 256x | 3,334.69 | 3,555.79 | 0.05 | 2,994.70 | 8,656.51 | 34,082.19 | 100.0%
512 | 1,024 | 1x | 43.26 | 70.35 | 1.92 | 61.79 | 61.79 | 7,026.31 | 100.0%
512 | 1,024 | 2x | 42.14 | 40.83 | 2.66 | 863.62 | 906.76 | 23,198.44 | 100.0%
512 | 2,048 | 1x | 43.46 | 10.52 | 4.31 | 51.53 | 51.53 | 47,126.23 | 100.0%
512 | 2,048 | 2x | 51.70 | 68.86 | 1.90 | 73.49 | 83.37 | 13,838.71 | 100.0%
512 | 2,048 | 4x | 100.27 | 41.75 | 1.65 | 87.47 | 95.13 | 44,550.29 | 100.0%
512 | 2,048 | 8x | 321.17 | 81.98 | 0.60 | 113.17 | 127.38 | 48,166.78 | 100.0%
512 | 2,048 | 16x | 488.09 | 158.81 | 0.39 | 142.52 | 160.70 | 49,933.75 | 100.0%
512 | 2,048 | 32x | 991.49 | 299.97 | 0.21 | 805.26 | 861.23 | 53,165.25 | 100.0%
512 | 2,048 | 64x | 1,835.62 | 540.92 | 0.13 | 182.63 | 247.64 | 58,736.83 | 100.0%
512 | 2,048 | 128x | 2,857.34 | 851.80 | 0.09 | 1,797.36 | 3,975.00 | 74,388.41 | 100.0%
512 | 2,048 | 256x | 2,982.35 | 862.35 | 0.08 | 2,967.73 | 8,685.29 | 141,902.96 | 100.0%
512 | 2,048 | 512x | 2,618.11 | 803.74 | 0.09 | 11,511.43 | 21,940.83 | 254,397.10 | 81.8%
1,024 | 128 | 1x | 9.69 | 497.45 | 0.12 | 1,519.79 | 1,519.79 | 1,936.12 | 100.0%
1,024 | 128 | 2x | 82.74 | 631.54 | 0.28 | 133.74 | 194.72 | 3,085.61 | 100.0%
1,024 | 128 | 4x | 152.93 | 1,167.56 | 0.16 | 298.57 | 369.67 | 3,339.51 | 100.0%
1,024 | 128 | 8x | 202.10 | 1,680.54 | 0.11 | 1,602.00 | 1,684.22 | 4,651.79 | 100.0%
1,024 | 128 | 16x | 271.02 | 2,198.27 | 0.07 | 2,309.51 | 4,002.79 | 7,111.53 | 100.0%
1,024 | 128 | 32x | 445.74 | 3,745.32 | 0.05 | 1,904.33 | 4,833.67 | 8,042.69 | 96.9%
1,024 | 128 | 64x | 830.59 | 6,837.55 | 0.04 | 2,187.68 | 5,153.33 | 8,906.10 | 98.4%
1,024 | 128 | 128x | 857.53 | 6,828.21 | 0.03 | 4,397.35 | 10,507.61 | 17,954.48 | 99.2%
1,024 | 128 | 256x | 1,032.84 | 8,288.91 | 0.03 | 10,138.35 | 21,984.10 | 29,981.81 | 99.6%
1,024 | 128 | 512x | 1,092.93 | 8,844.22 | 0.03 | 14,013.45 | 43,277.18 | 52,334.98 | 99.8%
1,024 | 128 | 1024x | 685.89 | 5,606.02 | 0.05 | 60,078.96 | 113,340.03 | 128,184.25 | 73.5%
1,024 | 512 | 1x | 31.02 | 1,440.18 | 0.05 | 183.14 | 183.14 | 652.55 | 100.0%
1,024 | 512 | 2x | 85.93 | 163.97 | 0.90 | 128.13 | 188.75 | 11,909.97 | 100.0%
1,024 | 512 | 4x | 156.54 | 320.38 | 0.48 | 291.51 | 367.23 | 12,188.99 | 100.0%
1,024 | 512 | 8x | 321.52 | 622.46 | 0.26 | 477.74 | 601.26 | 12,560.76 | 100.0%
1,024 | 512 | 16x | 590.29 | 1,149.05 | 0.14 | 798.79 | 1,208.26 | 13,638.44 | 100.0%
1,024 | 512 | 32x | 948.33 | 2,060.79 | 0.09 | 1,206.60 | 2,346.03 | 15,166.26 | 100.0%
1,024 | 512 | 64x | 1,505.77 | 3,301.70 | 0.06 | 1,829.48 | 4,394.43 | 18,881.47 | 100.0%
1,024 | 512 | 128x | 2,097.30 | 4,384.79 | 0.05 | 3,303.03 | 9,726.76 | 28,008.01 | 99.2%
1,024 | 512 | 256x | 2,446.92 | 5,275.40 | 0.04 | 5,439.27 | 18,069.73 | 46,143.28 | 100.0%
1,024 | 512 | 512x | 1,825.26 | 4,024.35 | 0.05 | 24,749.71 | 93,250.52 | 119,667.35 | 99.8%
1,024 | 1,024 | 1x | 43.08 | 41.02 | 2.75 | 188.65 | 188.65 | 23,769.04 | 100.0%
1,024 | 1,024 | 2x | 62.24 | 82.34 | 1.60 | 145.88 | 207.36 | 23,067.85 | 100.0%
1,024 | 1,024 | 4x | 168.14 | 160.47 | 0.71 | 296.28 | 370.58 | 24,355.57 | 100.0%
1,024 | 1,024 | 8x | 314.69 | 300.90 | 0.38 | 1,786.46 | 1,863.89 | 26,015.40 | 100.0%
1,024 | 1,024 | 16x | 458.83 | 526.07 | 0.23 | 1,345.14 | 4,549.04 | 29,825.89 | 100.0%
1,024 | 1,024 | 32x | 933.03 | 998.82 | 0.13 | 3,475.68 | 5,066.08 | 31,399.46 | 100.0%
1,024 | 1,024 | 64x | 1,419.52 | 1,772.64 | 0.09 | 2,193.38 | 6,274.14 | 35,307.02 | 100.0%
1,024 | 1,024 | 128x | 2,373.51 | 2,529.94 | 0.06 | 6,542.23 | 12,501.92 | 49,579.72 | 100.0%
1,024 | 1,024 | 256x | 2,608.18 | 3,006.57 | 0.05 | 10,830.80 | 24,957.29 | 81,464.75 | 100.0%
1,024 | 1,024 | 512x | 1,978.49 | 2,256.15 | 0.07 | 14,339.67 | 35,081.08 | 161,952.43 | 75.4%
1,024 | 2,048 | 1x | 42.70 | 20.33 | 3.61 | 978.17 | 978.17 | 47,956.82 | 100.0%
1,024 | 2,048 | 2x | 87.49 | 41.74 | 1.81 | 132.12 | 194.24 | 46,812.05 | 100.0%
1,024 | 2,048 | 4x | 170.36 | 81.29 | 0.93 | 327.71 | 371.53 | 48,079.55 | 100.0%
1,024 | 2,048 | 8x | 267.68 | 161.02 | 0.56 | 478.01 | 604.04 | 48,617.19 | 100.0%
1,024 | 2,048 | 16x | 595.18 | 302.06 | 0.29 | 687.61 | 1,214.41 | 51,970.53 | 100.0%
1,024 | 2,048 | 32x | 936.98 | 559.04 | 0.18 | 1,341.59 | 2,388.55 | 56,127.24 | 100.0%
1,024 | 2,048 | 64x | 1,507.27 | 913.95 | 0.12 | 5,965.82 | 6,993.19 | 67,506.85 | 98.4%
1,024 | 2,048 | 128x | 2,487.84 | 1,493.64 | 0.08 | 3,249.22 | 9,338.93 | 83,896.30 | 100.0%
1,024 | 2,048 | 256x | 2,239.68 | 1,301.36 | 0.09 | 5,993.57 | 19,325.88 | 183,930.16 | 99.6%
2,048 | 128 | 1x | 39.35 | 604.98 | 0.29 | 325.10 | 325.10 | 3,252.94 | 100.0%
2,048 | 128 | 2x | 29.06 | 882.86 | 0.19 | 1,397.48 | 1,529.51 | 4,292.67 | 100.0%
2,048 | 512 | 1x | 43.46 | 167.06 | 1.10 | 66.43 | 66.43 | 11,779.43 | 100.0%
2,048 | 512 | 2x | 59.76 | 332.65 | 0.57 | 82.63 | 92.13 | 11,406.95 | 100.0%
2,048 | 512 | 4x | 163.03 | 624.02 | 0.29 | 367.43 | 633.56 | 12,552.53 | 100.0%
2,048 | 512 | 8x | 307.39 | 1,174.63 | 0.16 | 1,063.43 | 1,213.83 | 13,315.82 | 100.0%
2,048 | 512 | 16x | 510.75 | 2,068.44 | 0.10 | 1,135.89 | 2,330.28 | 15,072.88 | 100.0%
2,048 | 512 | 32x | 790.78 | 3,332.60 | 0.07 | 2,092.53 | 4,577.24 | 18,089.91 | 96.9%
2,048 | 512 | 64x | 1,168.50 | 4,812.39 | 0.05 | 3,425.18 | 9,691.27 | 25,821.04 | 100.0%
2,048 | 512 | 128x | 1,330.64 | 5,654.14 | 0.04 | 11,200.16 | 23,041.55 | 44,109.18 | 100.0%
2,048 | 512 | 256x | 1,151.44 | 4,949.83 | 0.05 | 20,999.01 | 79,174.78 | 99,405.53 | 100.0%
2,048 | 1,024 | 1x | 9.07 | 4,462.59 | 0.05 | 347.24 | 347.24 | 417.20 | 100.0%
2,048 | 1,024 | 2x | 85.97 | 164.50 | 0.92 | 203.37 | 325.81 | 23,816.30 | 100.0%
2,048 | 1,024 | 4x | 146.63 | 280.62 | 0.49 | 3,667.37 | 3,813.33 | 27,924.54 | 100.0%
2,048 | 1,024 | 8x | 262.00 | 612.24 | 0.28 | 837.25 | 1,093.74 | 25,539.01 | 100.0%
2,048 | 1,024 | 16x | 453.08 | 1,088.72 | 0.16 | 2,541.60 | 2,700.34 | 28,715.24 | 100.0%
2,048 | 1,024 | 32x | 766.32 | 1,872.46 | 0.10 | 4,596.24 | 5,711.80 | 33,371.68 | 100.0%
2,048 | 1,024 | 64x | 1,211.88 | 2,868.20 | 0.07 | 4,116.15 | 11,827.31 | 43,470.76 | 100.0%
2,048 | 1,024 | 128x | 1,749.03 | 4,195.99 | 0.05 | 6,169.51 | 18,528.04 | 59,352.02 | 100.0%
2,048 | 1,024 | 256x | 1,329.91 | 2,981.27 | 0.07 | 22,969.04 | 113,885.97 | 152,654.48 | 91.8%
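
The Best Performance figures correspond to column-wise maxima and minima over the rows of the table above. As an illustration only, the sketch below selects the best run per metric from four rows copied from that table. The `Run` class and its field names are hypothetical, and the selection is a plain max/min that takes every value at face value, with no filtering by success rate or for outliers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    input_tokens: int
    output_tokens: int
    concurrency: int
    output_tps: float
    input_tps: float
    ttft_p95_ms: float
    e2e_p95_ms: float


# A four-row subset of the test matrix, copied from the table above.
RUNS = [
    Run(128, 512, 1024, 4620.81, 1285.70, 45498.05, 75536.60),
    Run(1024, 128, 512, 1092.93, 8844.22, 43277.18, 52334.98),
    Run(512, 2048, 1, 43.46, 10.52, 51.53, 47126.23),
    Run(2048, 1024, 1, 9.07, 4462.59, 347.24, 417.20),
]

# Throughput metrics are maximized; latency metrics are minimized.
best = {
    "output_tps": max(RUNS, key=lambda r: r.output_tps),
    "input_tps": max(RUNS, key=lambda r: r.input_tps),
    "ttft_p95_ms": min(RUNS, key=lambda r: r.ttft_p95_ms),
    "e2e_p95_ms": min(RUNS, key=lambda r: r.e2e_p95_ms),
}

for metric, run in best.items():
    print(
        f"{metric}: {getattr(run, metric)} "
        f"(in={run.input_tokens}, out={run.output_tokens}, c={run.concurrency})"
    )
```

A stricter comparison might first drop runs below a chosen success-rate threshold; that threshold would be a reader's choice and is not something the report specifies.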

Hardware Configuration

GPU Manufacturer: NVIDIA
GPU Model: NVIDIA H200 NVL
GPU Count: 2
GPU Memory (Total): 280 GB
GPU Driver: 580.95.05
CUDA Version: Unknown
Compute Capability: 9.0
Power Limit (per GPU): 600 W
CPU Model: Intel(R) Xeon(R) 6960P
RAM: 2,267 GB

Software Configuration

Inference Framework: vLLM
Framework Version: 0.11.0
OS: Ubuntu
OS Version: 22.04.5 LTS (Jammy Jellyfish)
Kernel Version: 5.15.0-88-generic
Python Version: 3.10.12

Model Configuration

Provider: meta-llama
Model Name: llama-2-70b-hf
Quantization: FP16

Inference Configuration

Runtime parameters used across all benchmark runs

Max Model Length: 4096
Tensor Parallel Size: 1
Pipeline Parallel Size: 1
GPU Memory Utilization: 0.93
Temperature: 0.70
Top-P: 1.00
Top-K: -1
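
As a rough illustration, the runtime parameters listed above could map onto vLLM's offline Python API as follows. This is a sketch under assumptions: the report does not say whether the benchmark used the offline API or a server, the Hugging Face model ID is inferred from the provider and model name, and `float16` is inferred from the FP16 entry. The dictionary names and `build_engine` are hypothetical helpers.

```python
# Values copied from the Inference Configuration list; "model" and "dtype"
# are assumptions inferred from the Model Configuration section.
ENGINE_ARGS = {
    "model": "meta-llama/Llama-2-70b-hf",
    "dtype": "float16",
    "max_model_len": 4096,
    "tensor_parallel_size": 1,
    "pipeline_parallel_size": 1,
    "gpu_memory_utilization": 0.93,
}

SAMPLING_ARGS = {
    "temperature": 0.70,
    "top_p": 1.00,
    "top_k": -1,  # -1 disables top-k filtering
}


def build_engine():
    """Create a vLLM engine and sampling parameters (requires vLLM and GPUs)."""
    # Imported lazily so the configuration can be inspected without vLLM installed.
    from vllm import LLM, SamplingParams

    return LLM(**ENGINE_ARGS), SamplingParams(**SAMPLING_ARGS)
```

On a suitable machine, `llm, params = build_engine()` followed by `llm.generate(["Hello"], params)` would run a single request with these settings. It does not reproduce the benchmark's concurrency levels or token-length matrix.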