NVIDIA H200 NVL (2x) - mistral-nemo-instruct-2407

November 7, 2025 at 04:42 AM

Dataset: reference (v1.0)

Best Performance


Best Output TPS: 12,204.48 (peak generation speed)
Best Input TPS: 47,690.47 (peak prefill speed)
Best Energy Efficiency: 0.01 kWh/MT (energy cost per 1M tokens)
Best TTFT (P95): 68.02 ms (lowest latency)
Best E2E (P95): 1,203.89 ms (lowest latency)
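
Energy efficiency is reported in kWh per million tokens (kWh/MT). The report does not say how the figure is derived. A minimal sketch, assuming it is average power draw divided by token throughput, could look like the following. The function name, the 1,200 W input (two GPUs at their 600 W power limit, an upper bound rather than a measured value) and the choice of which tokens to count are all assumptions.

```python
def energy_kwh_per_million_tokens(avg_power_watts: float, tokens_per_second: float) -> float:
    """Energy in kWh needed to process one million tokens.

    Assumes constant average power draw over the run. Whether the report
    counts input tokens, output tokens, or both is not stated.
    """
    if avg_power_watts < 0:
        raise ValueError("avg_power_watts must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    seconds_per_million = 1_000_000 / tokens_per_second
    kilowatts = avg_power_watts / 1000
    return kilowatts * seconds_per_million / 3600  # kW * hours = kWh


if __name__ == "__main__":
    # Hypothetical input: both GPUs drawing their full 600 W limit,
    # combined with the best output throughput from this report.
    print(f"{energy_kwh_per_million_tokens(1200, 12204.48):.3f} kWh/MT")
```

Because the 600 W power limit is a ceiling, this calculation bounds the energy cost from above; it is not expected to reproduce the reported values exactly.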

Test Matrix Results

Performance across different input/output token combinations and concurrency levels

Input Tokens | Output Tokens | Concurrency | Output TPS | Input TPS | Energy Cost (kWh/MT) | TTFT Mean (ms) | TTFT P95 (ms) | E2E P95 (ms) | Success Rate
Best run for Output TPS:
128 | 2,048 | 1024x | 12,204.48 | 2,169.01 | 0.02 | 1,856.57 | 4,985.40 | 19,485.78 | 100.0%
All runs:
128 | 128 | 1x | 69.68 | 66.41 | 0.49 | 702.56 | 702.56 | 1,836.58 | 100.0%
128 | 128 | 2x | 202.05 | 197.32 | 0.26 | 119.93 | 127.74 | 1,260.44 | 100.0%
128 | 128 | 4x | 279.48 | 277.29 | 0.19 | 287.06 | 631.23 | 1,748.18 | 100.0%
128 | 128 | 8x | 724.70 | 726.82 | 0.09 | 229.06 | 261.00 | 1,400.90 | 100.0%
128 | 128 | 16x | 1,349.14 | 1,359.03 | 0.04 | 276.20 | 326.03 | 1,498.17 | 100.0%
128 | 128 | 32x | 2,530.32 | 2,538.37 | 0.03 | 292.63 | 347.70 | 1,550.68 | 100.0%
128 | 128 | 64x | 4,310.38 | 4,386.23 | 0.02 | 362.19 | 413.66 | 1,694.31 | 100.0%
128 | 128 | 128x | 3,927.23 | 4,054.21 | 0.02 | 1,236.39 | 2,469.52 | 3,891.30 | 100.0%
128 | 512 | 1x | 112.48 | 26.80 | 0.80 | 92.47 | 92.47 | 4,551.90 | 100.0%
128 | 512 | 2x | 217.58 | 55.62 | 0.40 | 135.37 | 136.40 | 4,462.76 | 100.0%
128 | 512 | 4x | 447.55 | 111.01 | 0.31 | 145.74 | 168.75 | 4,568.10 | 100.0%
128 | 512 | 8x | 839.00 | 223.75 | 0.16 | 201.98 | 223.05 | 4,579.30 | 100.0%
128 | 512 | 16x | 1,196.29 | 318.86 | 0.10 | 225.31 | 261.21 | 6,444.62 | 100.0%
128 | 512 | 32x | 2,085.68 | 550.01 | 0.06 | 327.38 | 410.19 | 7,376.47 | 100.0%
128 | 512 | 64x | 5,405.75 | 1,429.77 | 0.03 | 310.36 | 356.00 | 5,485.44 | 100.0%
128 | 512 | 128x | 8,038.04 | 2,186.20 | 0.02 | 306.26 | 410.72 | 6,435.18 | 100.0%
128 | 512 | 256x | 8,397.62 | 2,437.95 | 0.02 | 402.51 | 760.72 | 7,047.09 | 100.0%
128 | 512 | 512x | 9,057.00 | 2,747.04 | 0.02 | 994.99 | 3,864.42 | 11,657.82 | 100.0%
128 | 512 | 1024x | 10,837.68 | 3,411.94 | 0.02 | 1,245.94 | 3,747.99 | 10,674.00 | 100.0%
128 | 1,024 | 1x | 108.60 | 29.98 | 0.81 | 78.91 | 78.91 | 4,059.39 | 100.0%
128 | 1,024 | 2x | 219.95 | 48.23 | 0.64 | 79.07 | 81.53 | 5,158.56 | 100.0%
128 | 1,024 | 4x | 372.63 | 71.30 | 0.41 | 114.82 | 122.63 | 7,038.65 | 100.0%
128 | 1,024 | 8x | 663.11 | 148.17 | 0.20 | 181.92 | 210.34 | 6,885.42 | 100.0%
128 | 1,024 | 16x | 1,125.52 | 209.68 | 0.12 | 226.93 | 261.61 | 9,728.69 | 100.0%
128 | 1,024 | 32x | 2,312.61 | 408.77 | 0.07 | 265.74 | 313.70 | 9,909.10 | 100.0%
128 | 1,024 | 64x | 4,202.86 | 742.42 | 0.04 | 307.05 | 387.27 | 10,746.09 | 100.0%
128 | 1,024 | 128x | 6,766.04 | 1,232.15 | 0.03 | 300.19 | 454.46 | 11,700.08 | 100.0%
128 | 1,024 | 256x | 8,352.47 | 1,597.27 | 0.02 | 391.28 | 879.52 | 13,019.80 | 100.0%
128 | 1,024 | 512x | 10,274.08 | 2,013.10 | 0.02 | 666.49 | 1,496.55 | 14,521.35 | 100.0%
128 | 1,024 | 1024x | 11,047.33 | 2,294.49 | 0.02 | 2,332.12 | 5,789.89 | 19,348.84 | 100.0%
128 | 2,048 | 1x | 113.31 | 22.81 | 0.91 | 89.32 | 89.32 | 5,337.62 | 100.0%
128 | 2,048 | 2x | 191.86 | 47.30 | 0.50 | 126.60 | 132.57 | 5,207.93 | 100.0%
128 | 2,048 | 4x | 368.38 | 78.43 | 0.37 | 128.50 | 152.76 | 6,340.38 | 100.0%
128 | 2,048 | 8x | 682.96 | 150.32 | 0.22 | 170.83 | 202.10 | 6,733.27 | 100.0%
128 | 2,048 | 16x | 1,088.25 | 208.30 | 0.14 | 241.45 | 276.03 | 9,582.11 | 100.0%
128 | 2,048 | 32x | 1,907.73 | 337.34 | 0.08 | 281.07 | 332.17 | 11,233.38 | 100.0%
128 | 2,048 | 64x | 3,110.18 | 519.89 | 0.05 | 296.64 | 335.82 | 13,039.77 | 100.0%
128 | 2,048 | 128x | 4,600.62 | 805.50 | 0.03 | 316.61 | 406.62 | 12,705.36 | 100.0%
128 | 2,048 | 256x | 8,568.58 | 1,584.31 | 0.02 | 385.19 | 839.31 | 14,284.56 | 100.0%
128 | 2,048 | 512x | 9,457.33 | 1,716.22 | 0.02 | 611.26 | 1,387.13 | 16,407.42 | 100.0%
512 | 128 | 1x | 106.31 | 411.96 | 0.22 | 116.17 | 116.17 | 1,203.89 | 100.0%
512 | 128 | 2x | 212.10 | 824.36 | 0.13 | 89.08 | 108.14 | 1,204.93 | 100.0%
512 | 128 | 4x | 387.29 | 1,496.22 | 0.07 | 150.89 | 184.66 | 1,307.86 | 100.0%
512 | 128 | 8x | 747.99 | 2,885.32 | 0.04 | 233.26 | 251.57 | 1,357.80 | 100.0%
512 | 128 | 16x | 1,320.45 | 5,200.52 | 0.02 | 277.80 | 306.26 | 1,506.48 | 100.0%
512 | 128 | 32x | 2,265.08 | 8,832.87 | 0.01 | 376.38 | 455.51 | 1,759.32 | 100.0%
512 | 128 | 64x | 3,710.45 | 14,486.36 | 0.01 | 401.58 | 649.47 | 2,061.17 | 100.0%
512 | 128 | 128x | 5,639.83 | 22,087.44 | 0.01 | 457.01 | 711.95 | 2,625.68 | 100.0%
512 | 128 | 256x | 6,075.28 | 26,114.46 | 0.01 | 406.32 | 604.80 | 3,722.33 | 100.0%
512 | 128 | 512x | 5,928.91 | 24,659.18 | 0.01 | 1,382.32 | 3,390.60 | 6,475.88 | 100.0%
512 | 128 | 1024x | 4,393.17 | 20,312.10 | 0.01 | 4,094.47 | 8,631.40 | 11,388.98 | 100.0%
512 | 512 | 1x | 110.08 | 106.64 | 0.58 | 68.02 | 68.02 | 4,650.20 | 100.0%
512 | 512 | 2x | 223.73 | 218.25 | 0.39 | 71.74 | 71.94 | 4,547.90 | 100.0%
512 | 512 | 4x | 450.01 | 434.63 | 0.19 | 114.18 | 118.89 | 4,545.26 | 100.0%
512 | 512 | 8x | 829.15 | 806.29 | 0.10 | 192.79 | 222.86 | 4,883.39 | 100.0%
512 | 512 | 16x | 1,520.55 | 1,598.71 | 0.06 | 203.40 | 250.20 | 4,946.15 | 100.0%
512 | 512 | 32x | 2,840.07 | 2,947.55 | 0.03 | 269.81 | 312.58 | 5,347.57 | 100.0%
512 | 512 | 64x | 5,017.33 | 5,113.11 | 0.02 | 298.92 | 356.84 | 6,036.25 | 100.0%
512 | 512 | 128x | 8,285.62 | 8,612.64 | 0.01 | 344.02 | 475.00 | 7,025.60 | 100.0%
512 | 512 | 256x | 8,454.57 | 9,306.82 | 0.01 | 365.05 | 615.42 | 8,147.38 | 100.0%
512 | 512 | 512x | 10,293.75 | 11,221.56 | 0.01 | 548.93 | 1,520.98 | 9,820.31 | 100.0%
512 | 512 | 1024x | 9,793.78 | 10,922.49 | 0.01 | 4,870.68 | 13,158.59 | 21,459.89 | 100.0%
512 | 1,024 | 1x | 112.09 | 89.39 | 0.61 | 86.68 | 86.68 | 5,538.41 | 100.0%
512 | 1,024 | 2x | 170.99 | 108.85 | 0.56 | 94.09 | 96.51 | 8,925.21 | 100.0%
512 | 1,024 | 4x | 358.66 | 212.85 | 0.32 | 159.80 | 178.66 | 9,227.02 | 100.0%
512 | 1,024 | 8x | 692.67 | 502.67 | 0.15 | 208.56 | 243.60 | 7,664.15 | 100.0%
512 | 1,024 | 16x | 1,075.25 | 925.91 | 0.09 | 283.78 | 364.81 | 8,191.71 | 100.0%
512 | 1,024 | 32x | 2,109.43 | 1,592.12 | 0.05 | 291.26 | 326.54 | 8,584.55 | 100.0%
512 | 1,024 | 64x | 3,961.48 | 2,802.74 | 0.03 | 404.35 | 596.72 | 11,081.75 | 100.0%
512 | 1,024 | 128x | 7,168.14 | 4,927.70 | 0.02 | 350.03 | 481.79 | 12,498.84 | 100.0%
512 | 1,024 | 256x | 8,254.25 | 6,080.82 | 0.02 | 376.13 | 723.47 | 15,082.69 | 100.0%
512 | 1,024 | 512x | 9,794.10 | 7,186.37 | 0.02 | 1,415.22 | 4,208.14 | 19,596.39 | 100.0%
512 | 1,024 | 1024x | 11,035.23 | 8,000.49 | 0.01 | 6,939.05 | 14,278.89 | 28,329.94 | 100.0%
512 | 2,048 | 1x | 113.95 | 95.31 | 0.56 | 111.52 | 111.52 | 5,194.05 | 100.0%
512 | 2,048 | 2x | 224.70 | 178.86 | 0.30 | 130.36 | 149.07 | 5,549.27 | 100.0%
512 | 2,048 | 4x | 366.67 | 247.28 | 0.28 | 127.38 | 145.98 | 7,851.58 | 100.0%
512 | 2,048 | 8x | 630.89 | 465.36 | 0.15 | 211.89 | 249.25 | 8,133.60 | 100.0%
512 | 2,048 | 16x | 1,084.54 | 862.33 | 0.09 | 301.86 | 383.21 | 8,383.82 | 100.0%
512 | 2,048 | 32x | 1,865.00 | 1,422.30 | 0.05 | 315.76 | 359.17 | 9,439.40 | 100.0%
512 | 2,048 | 64x | 3,792.58 | 2,627.37 | 0.03 | 371.18 | 526.55 | 10,996.24 | 100.0%
512 | 2,048 | 128x | 4,302.45 | 2,877.89 | 0.03 | 458.69 | 715.25 | 13,757.68 | 100.0%
512 | 2,048 | 256x | 7,144.44 | 4,825.20 | 0.02 | 508.18 | 779.17 | 17,324.46 | 100.0%
512 | 2,048 | 512x | 8,207.83 | 5,955.42 | 0.02 | 1,004.02 | 3,364.23 | 20,255.05 | 100.0%
512 | 2,048 | 1024x | 10,223.85 | 7,396.28 | 0.02 | 6,139.10 | 14,628.50 | 28,345.80 | 100.0%
1,024 | 128 | 1x | 104.32 | 794.62 | 0.11 | 128.73 | 128.73 | 1,226.72 | 100.0%
1,024 | 128 | 2x | 205.62 | 1,569.48 | 0.07 | 108.79 | 132.71 | 1,237.50 | 100.0%
1,024 | 128 | 4x | 383.81 | 2,930.28 | 0.04 | 170.16 | 215.16 | 1,328.39 | 100.0%
1,024 | 128 | 8x | 680.85 | 5,208.11 | 0.02 | 265.34 | 334.02 | 1,493.71 | 100.0%
1,024 | 128 | 16x | 1,228.55 | 9,424.72 | 0.01 | 334.99 | 380.17 | 1,652.48 | 100.0%
1,024 | 128 | 32x | 2,004.90 | 15,390.30 | 0.01 | 512.30 | 620.80 | 1,986.45 | 100.0%
1,024 | 128 | 64x | 3,068.99 | 23,586.05 | 0.01 | 610.82 | 997.97 | 2,570.15 | 100.0%
1,024 | 128 | 128x | 4,146.71 | 32,048.90 | 0.01 | 735.54 | 1,357.54 | 3,744.96 | 100.0%
1,024 | 128 | 256x | 4,478.03 | 34,448.60 | 0.01 | 1,859.30 | 3,771.87 | 7,056.80 | 100.0%
1,024 | 128 | 512x | 5,004.63 | 39,504.51 | 0.01 | 2,254.88 | 4,604.16 | 9,693.22 | 100.0%
1,024 | 128 | 1024x | 5,207.31 | 41,923.76 | 0.01 | 3,861.25 | 11,262.96 | 14,070.71 | 100.0%
1,024 | 512 | 1x | 107.36 | 204.45 | 0.37 | 110.68 | 110.68 | 4,768.86 | 100.0%
1,024 | 512 | 2x | 215.17 | 410.59 | 0.27 | 95.35 | 109.67 | 4,741.91 | 100.0%
1,024 | 512 | 4x | 425.07 | 811.33 | 0.14 | 141.66 | 161.07 | 4,814.59 | 100.0%
1,024 | 512 | 8x | 822.49 | 1,572.89 | 0.07 | 217.77 | 273.87 | 4,962.44 | 100.0%
1,024 | 512 | 16x | 1,454.07 | 2,824.20 | 0.04 | 310.79 | 397.17 | 5,537.62 | 100.0%
1,024 | 512 | 32x | 2,780.48 | 5,505.78 | 0.02 | 309.03 | 388.29 | 5,643.09 | 100.0%
1,024 | 512 | 64x | 3,020.53 | 6,033.96 | 0.02 | 1,147.69 | 4,011.65 | 10,279.75 | 100.0%
1,024 | 512 | 128x | 5,306.32 | 10,655.83 | 0.01 | 731.05 | 1,579.77 | 11,617.70 | 100.0%
1,024 | 512 | 256x | 7,868.92 | 15,677.96 | 0.01 | 1,163.11 | 2,218.17 | 12,812.71 | 100.0%
1,024 | 512 | 512x | 8,187.55 | 16,642.48 | 0.01 | 2,992.34 | 10,205.97 | 23,187.44 | 100.0%
1,024 | 512 | 1024x | 8,051.35 | 16,957.46 | 0.01 | 7,244.87 | 20,471.84 | 32,474.42 | 100.0%
1,024 | 1,024 | 1x | 112.01 | 120.15 | 0.53 | 116.99 | 116.99 | 8,105.19 | 100.0%
1,024 | 1,024 | 2x | 201.19 | 264.91 | 0.36 | 118.11 | 152.69 | 7,277.54 | 100.0%
1,024 | 1,024 | 4x | 358.15 | 467.14 | 0.21 | 190.96 | 250.62 | 8,137.48 | 100.0%
1,024 | 1,024 | 8x | 664.10 | 901.38 | 0.12 | 270.69 | 314.81 | 8,626.64 | 100.0%
1,024 | 1,024 | 16x | 1,166.07 | 1,522.24 | 0.07 | 375.60 | 456.18 | 10,310.05 | 100.0%
1,024 | 1,024 | 32x | 2,194.15 | 2,899.97 | 0.04 | 482.23 | 621.70 | 10,572.99 | 100.0%
1,024 | 1,024 | 64x | 3,541.12 | 4,988.82 | 0.03 | 671.58 | 1,138.90 | 12,356.82 | 100.0%
1,024 | 1,024 | 128x | 5,990.24 | 8,414.63 | 0.02 | 809.53 | 1,405.67 | 14,378.36 | 100.0%
1,024 | 1,024 | 256x | 7,368.38 | 10,177.01 | 0.01 | 1,487.67 | 3,723.63 | 20,970.09 | 100.0%
1,024 | 1,024 | 512x | 8,355.17 | 11,666.36 | 0.01 | 5,473.89 | 13,068.43 | 31,362.48 | 100.0%
1,024 | 1,024 | 1024x | 9,158.93 | 13,390.29 | 0.01 | 8,288.49 | 20,222.25 | 36,298.65 | 100.0%
1,024 | 2,048 | 1x | 111.46 | 160.76 | 0.43 | 133.49 | 133.49 | 6,054.36 | 100.0%
1,024 | 2,048 | 2x | 185.63 | 204.93 | 0.31 | 152.44 | 175.74 | 9,344.93 | 100.0%
1,024 | 2,048 | 4x | 389.57 | 491.08 | 0.20 | 171.10 | 191.53 | 7,916.79 | 100.0%
1,024 | 2,048 | 8x | 646.95 | 786.76 | 0.11 | 307.39 | 365.24 | 9,883.49 | 100.0%
1,024 | 2,048 | 16x | 1,074.91 | 1,421.30 | 0.08 | 325.09 | 394.41 | 10,838.65 | 100.0%
1,024 | 2,048 | 32x | 2,140.73 | 2,900.24 | 0.04 | 512.78 | 637.40 | 10,469.82 | 100.0%
1,024 | 2,048 | 64x | 3,308.78 | 4,416.49 | 0.03 | 549.42 | 893.10 | 12,424.01 | 100.0%
1,024 | 2,048 | 128x | 4,022.91 | 5,609.13 | 0.02 | 824.47 | 1,670.41 | 14,477.22 | 100.0%
1,024 | 2,048 | 256x | 6,932.46 | 9,553.80 | 0.02 | 1,195.15 | 2,341.08 | 19,901.69 | 100.0%
1,024 | 2,048 | 512x | 7,861.06 | 10,863.85 | 0.01 | 5,729.68 | 14,761.58 | 33,087.58 | 100.0%
1,024 | 2,048 | 1024x | 8,817.98 | 12,405.08 | 0.01 | 9,741.81 | 21,527.84 | 39,762.38 | 100.0%
2,048 | 128 | 1x | 100.55 | 1,545.95 | 0.06 | 169.63 | 169.63 | 1,272.69 | 100.0%
2,048 | 128 | 2x | 184.04 | 2,817.40 | 0.03 | 233.20 | 266.10 | 1,384.03 | 100.0%
2,048 | 128 | 4x | 354.08 | 5,421.16 | 0.02 | 221.09 | 296.98 | 1,438.78 | 100.0%
2,048 | 128 | 8x | 636.02 | 9,721.74 | 0.01 | 266.68 | 374.52 | 1,590.17 | 100.0%
2,048 | 128 | 16x | 997.55 | 15,311.31 | 0.01 | 492.37 | 746.47 | 2,021.00 | 100.0%
2,048 | 128 | 32x | 1,578.48 | 24,131.12 | 0.01 | 638.48 | 1,063.87 | 2,551.46 | 100.0%
2,048 | 128 | 64x | 2,304.36 | 35,433.47 | 0.01 | 951.09 | 1,704.83 | 3,425.81 | 100.0%
2,048 | 128 | 128x | 2,547.36 | 38,936.07 | 0.01 | 1,616.20 | 3,621.07 | 6,225.80 | 100.0%
2,048 | 128 | 256x | 2,840.12 | 43,495.35 | 0.01 | 2,358.40 | 7,786.21 | 10,081.43 | 100.0%
2,048 | 128 | 512x | 3,112.03 | 47,690.47 | 0.01 | 5,168.94 | 13,691.07 | 17,860.64 | 100.0%
2,048 | 128 | 1024x | 2,932.90 | 45,241.38 | 0.01 | 14,626.17 | 31,166.66 | 37,831.28 | 100.0%
2,048 | 512 | 1x | 110.30 | 423.96 | 0.23 | 146.76 | 146.76 | 4,641.44 | 100.0%
2,048 | 512 | 2x | 222.75 | 852.51 | 0.16 | 148.17 | 152.08 | 4,593.56 | 100.0%
2,048 | 512 | 4x | 360.24 | 1,543.11 | 0.06 | 280.91 | 324.15 | 5,071.59 | 100.0%
2,048 | 512 | 8x | 686.18 | 3,116.69 | 0.05 | 307.18 | 446.57 | 5,015.09 | 100.0%
2,048 | 512 | 16x | 1,238.17 | 5,502.37 | 0.03 | 396.66 | 559.33 | 5,655.40 | 100.0%
2,048 | 512 | 32x | 2,253.63 | 9,357.26 | 0.02 | 586.43 | 930.80 | 6,611.41 | 100.0%
2,048 | 512 | 64x | 3,271.86 | 13,458.54 | 0.01 | 1,043.28 | 2,066.72 | 9,187.38 | 100.0%
2,048 | 512 | 128x | 5,004.05 | 20,257.34 | 0.01 | 1,400.26 | 2,948.78 | 12,119.59 | 100.0%
2,048 | 512 | 256x | 6,600.23 | 26,646.57 | 0.01 | 2,343.34 | 5,607.22 | 18,348.60 | 100.0%
2,048 | 512 | 512x | 6,297.64 | 25,322.79 | 0.01 | 9,668.00 | 22,137.93 | 34,725.82 | 100.0%
2,048 | 512 | 1024x | 4,353.12 | 16,892.41 | 0.01 | 41,756.34 | 81,539.19 | 95,689.78 | 100.0%
2,048 | 1,024 | 1x | 107.10 | 328.82 | 0.28 | 156.14 | 156.14 | 5,974.14 | 100.0%
2,048 | 1,024 | 2x | 194.54 | 610.91 | 0.14 | 142.63 | 144.16 | 6,345.17 | 100.0%
2,048 | 1,024 | 4x | 313.18 | 1,072.51 | 0.13 | 155.23 | 200.06 | 7,125.15 | 100.0%
2,048 | 1,024 | 8x | 544.21 | 1,845.32 | 0.07 | 249.49 | 293.45 | 8,036.32 | 100.0%
2,048 | 1,024 | 16x | 935.52 | 3,361.74 | 0.04 | 312.13 | 388.03 | 8,249.38 | 100.0%
2,048 | 1,024 | 32x | 1,832.66 | 5,528.54 | 0.03 | 586.91 | 769.12 | 11,238.94 | 100.0%
2,048 | 1,024 | 64x | 2,929.88 | 8,646.01 | 0.02 | 953.63 | 2,117.62 | 14,283.44 | 100.0%
2,048 | 1,024 | 128x | 4,549.94 | 12,634.13 | 0.01 | 1,456.65 | 4,610.67 | 19,300.41 | 100.0%
2,048 | 1,024 | 256x | 6,007.34 | 17,341.22 | 0.01 | 3,281.32 | 8,225.39 | 27,349.94 | 100.0%
2,048 | 1,024 | 512x | 6,552.09 | 18,933.22 | 0.01 | 12,466.77 | 31,520.06 | 46,939.03 | 100.0%
2,048 | 1,024 | 1024x | 6,444.40 | 18,223.99 | 0.01 | 37,450.01 | 80,752.71 | 98,674.39 | 100.0%
2,048 | 2,048 | 1x | 111.34 | 337.62 | 0.27 | 159.68 | 159.68 | 5,818.77 | 100.0%
2,048 | 2,048 | 2x | 204.79 | 551.97 | 0.23 | 154.41 | 165.40 | 7,020.07 | 100.0%
2,048 | 2,048 | 4x | 321.06 | 998.73 | 0.12 | 303.08 | 375.65 | 7,820.41 | 100.0%
2,048 | 2,048 | 8x | 537.97 | 1,967.81 | 0.08 | 317.68 | 467.60 | 7,495.69 | 100.0%
2,048 | 2,048 | 16x | 762.90 | 2,542.55 | 0.06 | 459.87 | 626.31 | 9,881.09 | 100.0%
2,048 | 2,048 | 32x | 1,489.72 | 4,835.55 | 0.03 | 688.87 | 1,211.08 | 11,135.92 | 100.0%
2,048 | 2,048 | 64x | 2,345.05 | 7,069.88 | 0.02 | 990.78 | 1,870.56 | 13,594.13 | 100.0%
2,048 | 2,048 | 128x | 3,280.73 | 8,850.55 | 0.02 | 1,464.33 | 3,030.25 | 19,351.30 | 100.0%
2,048 | 2,048 | 256x | 5,485.44 | 15,397.95 | 0.01 | 2,468.71 | 5,759.08 | 27,811.93 | 100.0%
2,048 | 2,048 | 512x | 6,022.90 | 17,082.97 | 0.01 | 11,761.15 | 31,396.88 | 47,875.87 | 100.0%
2,048 | 2,048 | 1024x | 6,189.46 | 16,978.32 | 0.01 | 39,877.95 | 87,219.64 | 103,400.26 | 100.0%
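
The Best Performance figures can be cross-checked against the table by taking the maximum or minimum of each column. The sketch below does this for four rows copied from the table (the ones holding the reported bests). The `Run` class and `best_runs` helper are hypothetical names for illustration and are not part of the benchmark tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    input_tokens: int
    output_tokens: int
    concurrency: int
    output_tps: float
    input_tps: float
    ttft_p95_ms: float
    e2e_p95_ms: float


# Four rows copied from the results table; extend with the rest as needed.
RUNS = [
    Run(128, 2048, 1024, 12204.48, 2169.01, 4985.40, 19485.78),
    Run(512, 128, 1, 106.31, 411.96, 116.17, 1203.89),
    Run(512, 512, 1, 110.08, 106.64, 68.02, 4650.20),
    Run(2048, 128, 512, 3112.03, 47690.47, 13691.07, 17860.64),
]


def best_runs(runs: list[Run]) -> dict[str, Run]:
    """Pick the best run per metric: highest throughput, lowest latency."""
    return {
        "output_tps": max(runs, key=lambda r: r.output_tps),
        "input_tps": max(runs, key=lambda r: r.input_tps),
        "ttft_p95_ms": min(runs, key=lambda r: r.ttft_p95_ms),
        "e2e_p95_ms": min(runs, key=lambda r: r.e2e_p95_ms),
    }


if __name__ == "__main__":
    for metric, run in best_runs(RUNS).items():
        print(metric, getattr(run, metric), f"at {run.concurrency}x concurrency")
```

Note that the throughput bests come from high-concurrency runs (1024x and 512x) while both latency bests come from single-request runs, so no single configuration is best on every metric.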

Hardware Configuration

GPU Manufacturer: NVIDIA
GPU Model: NVIDIA H200 NVL
GPU Count: 2
GPU Memory (Total): 280 GB
GPU Driver: 580.95.05
CUDA Version: Unknown
Compute Capability: 9.0
Power Limit (per GPU): 600 W
CPU Model: Intel(R) Xeon(R) 6960P
RAM: 2,267 GB

Software Configuration

Inference Framework: vLLM
Framework Version: 0.11.0
OS: Ubuntu
OS Version: 22.04.5 LTS (Jammy Jellyfish)
Kernel Version: 5.15.0-88-generic
Python Version: 3.10.12

Model Configuration

Provider: mistralai
Model Name: mistral-nemo-instruct-2407
Quantization: FP16

Inference Configuration

Runtime parameters used across all benchmark runs

Max Model Length: 8192
Tensor Parallel Size: 1
Pipeline Parallel Size: 1
GPU Memory Utilization: 0.90
Temperature: 0.70
Top-P: 1.00
Top-K: -1
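
As an illustration of how these parameters map onto vLLM's Python API, the sketch below collects them as keyword arguments for `vllm.LLM` and `vllm.SamplingParams`. It is not the harness used for this report, which is not described here. The Hugging Face repo ID, the `dtype` value (inferred from the "FP16" entry) and the `build_engine_and_sampling` helper with its `max_tokens` argument are assumptions.

```python
# Engine-level settings taken from the Inference Configuration section.
ENGINE_KWARGS = {
    "model": "mistralai/Mistral-Nemo-Instruct-2407",  # assumed repo ID
    "dtype": "float16",  # assumption: report lists "FP16" under Quantization
    "max_model_len": 8192,
    "tensor_parallel_size": 1,
    "pipeline_parallel_size": 1,
    "gpu_memory_utilization": 0.90,
}

# Per-request sampling settings; top_k=-1 disables top-k filtering.
SAMPLING_KWARGS = {
    "temperature": 0.70,
    "top_p": 1.00,
    "top_k": -1,
}


def build_engine_and_sampling(max_tokens: int):
    """Hypothetical helper: create a vLLM engine and sampling parameters.

    vllm is imported lazily so the configuration above can be inspected
    on a machine without a GPU or a vLLM installation.
    """
    from vllm import LLM, SamplingParams

    llm = LLM(**ENGINE_KWARGS)
    sampling = SamplingParams(max_tokens=max_tokens, **SAMPLING_KWARGS)
    return llm, sampling
```

On a machine with vLLM installed, `build_engine_and_sampling(128)` would correspond to the 128-output-token rows of the test matrix, assuming generation length is capped through `max_tokens`.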