NVIDIA Models

haimaker.ai provides API access to all 5 NVIDIA models, with context windows up to 1M tokens, priced from $0.04 to $0.50 per 1M input tokens. OpenAI-compatible endpoint, instant access.

Models

Max context

From

$0.04

/1M input

Modes

chat

NewestNVIDIA Nemotron 3 Ultra 550B A55B BF16

NVIDIA Nemotron 3 Ultra 550B A55B BF16

NVIDIA

nvidia/nemotron-3-ultra-550b-a55b

560.5B params1M ctxIn: $0.50/1MOut: $2.50/1M

function callingreasoning

Nemotron 3 Nano Omni 30B A3B Reasoning BF16

NVIDIA

nvidia/nemotron-3-nano-30b-a3b

33.0B params66K ctxIn: $0.05/1MOut: $0.20/1M

function callingreasoning

NVIDIA Nemotron 3 Super 120B A12B NVFP4

NVIDIA

nvidia/nemotron-3-super-120b-a12b

67.2B params1M ctxIn: $0.09/1MOut: $0.45/1M

function callingreasoning

NVIDIA Nemotron Nano 9B v2

NVIDIA

nvidia/nemotron-nano-9b-v2

8.9B params66K ctxIn: $0.04/1MOut: $0.16/1M

function callingreasoning

Llama 3.3 Nemotron Super 49B V1.5

NVIDIA

nvidia/llama-3.3-nemotron-super-49b-v1.5

131K ctxIn: $0.10/1MOut: $0.40/1M

function callingreasoning