meta-llama/llama-4-scout
Llama 4 Scout 17B 16E Instruct (meta-llama/llama-4-scout) is a llama4 model from Meta Llama with a 10,000,000-token context window and a maximum of 16,384 output tokens, priced at $0.08 per 1M input tokens and $0.30 per 1M output tokens. It is available via the haimaker.ai OpenAI-compatible API.
Llama 4 Scout is a chat model by Meta Llama. It supports a 10M-token context window, function calling, and vision.
| Property | Value |
| --- | --- |
| Mode | chat |
| Context Window | 10,000,000 tokens |
| Max Output | 16,384 tokens |
| Function Calling | Supported |
| Vision | Supported |
| Reasoning | - |
| Web Search | - |
| URL Context | - |
| Architecture | Llama4ForConditionalGeneration |
| Model Type | llama4 |
| Base Model | meta-llama/Llama-4-Scout-17B-16E |
| Languages | ar, de, en, es, fr, hi, id, it, pt, th, tl, vi |
| Library | transformers |
```python
from openai import OpenAI

# Point the OpenAI SDK at the haimaker.ai endpoint.
client = OpenAI(
    base_url="https://api.haimaker.ai/v1",
    api_key="YOUR_API_KEY",
)

# Send a single-turn chat request to Llama 4 Scout.
response = client.chat.completions.create(
    model="meta-llama/llama-4-scout",
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ],
)

# Print the text of the first completion choice.
print(response.choices[0].message.content)
```

Llama 4 Scout 17B 16E Instruct (meta-llama/llama-4-scout) has a 10,000,000-token context window and supports up to 16,384 output tokens per request.
Llama 4 Scout 17B 16E Instruct is priced at $0.08 per 1M input tokens and $0.30 per 1M output tokens when accessed via the haimaker.ai OpenAI-compatible API.
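To make the pricing concrete, the following is a minimal sketch of a per-request cost estimate. It uses only the two prices quoted above; the function name `estimate_cost` and the example token counts are illustrative assumptions, and actual billing may differ in rounding or in how tokens are counted.

```python
# Prices quoted in the text above, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.08
OUTPUT_PRICE_PER_M = 0.30


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: a 200,000-token prompt with a 2,000-token reply.
cost = estimate_cost(200_000, 2_000)
print(f"${cost:.4f}")
```

In practice, the token counts can be read from the `usage` field of a chat completion response rather than guessed up front.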
Llama 4 Scout 17B 16E Instruct supports function calling and vision.
Send requests to https://api.haimaker.ai/v1/chat/completions with model "meta-llama/llama-4-scout" using any OpenAI-compatible SDK. Authentication uses a Bearer API key from https://app.haimaker.ai.
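For readers who prefer not to use an SDK, here is a minimal sketch of the same request built with Python's standard library only. The URL, model name, and Bearer scheme come from the text above; the environment variable name `HAIMAKER_API_KEY` and the helper names are assumptions made for illustration.

```python
import json
import os
import urllib.request

API_URL = "https://api.haimaker.ai/v1/chat/completions"


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completions POST request."""
    payload = {
        "model": "meta-llama/llama-4-scout",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request) -> str:
    """Send the request and return the first choice's text."""
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# HAIMAKER_API_KEY is a hypothetical variable name; use whatever
# secret storage fits your setup.
api_key = os.environ.get("HAIMAKER_API_KEY", "YOUR_API_KEY")
req = build_request(api_key, "Hello, how are you?")
# print(send(req))  # uncomment to make a real, billable API call
```

Keeping request construction separate from sending makes it easy to inspect the headers and body before any network call is made.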