Llama Guard 4 12B
meta-llama/llama-guard-4-12b
Llama Guard 4 12B (meta-llama/llama-guard-4-12b) is a 12.0B-parameter llama4 model from Meta Llama with a 163,840-token context window and a 16,384-token output limit, priced at $0.18 per 1M input tokens and $0.18 per 1M output tokens. It is available via the haimaker.ai OpenAI-compatible API.
Overview
Llama Guard 4 12B is a chat model from Meta Llama with 12.0B parameters. It has a 163,840-token (164K) context window and supports vision.
Features & Capabilities
| Mode | chat |
| Context Window | 163,840 tokens |
| Max Output | 16,384 tokens |
| Function Calling | Not supported |
| Vision | Supported |
| Reasoning | Not supported |
| Web Search | Not supported |
| URL Context | Not supported |
Technical Details
| Architecture | Llama4ForConditionalGeneration |
| Model Type | llama4 |
| Languages | en |
| Library | transformers |
API Usage
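from openai import OpenAI

# Point the OpenAI SDK at the haimaker.ai OpenAI-compatible endpoint.
client = OpenAI(
    base_url="https://api.haimaker.ai/v1",
    api_key="YOUR_API_KEY",  # replace with your own key
)

# Send a single-turn chat request to Llama Guard 4 12B.
response = client.chat.completions.create(
    model="meta-llama/llama-guard-4-12b",
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ],
)

# Print the text of the first returned choice.
print(response.choices[0].message.content)
Frequently Asked Questions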
What is the context window of Llama Guard 4 12B?
Llama Guard 4 12B (meta-llama/llama-guard-4-12b) has a 163,840-token context window and supports up to 16,384 output tokens per request.
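The two limits above can be combined into a quick pre-flight check. The sketch below is illustrative only: `output_budget` is a hypothetical helper, and it assumes that prompt tokens and output tokens share the same 163,840-token window, which the listing does not state explicitly. Token counts must come from your own tokenizer or estimate.

```python
# Limits quoted in the listing above.
CONTEXT_WINDOW = 163_840
MAX_OUTPUT = 16_384


def output_budget(prompt_tokens: int, requested_output: int = MAX_OUTPUT) -> int:
    """Hypothetical helper: largest output-token limit that still fits.

    Assumes prompt and output tokens count against one shared window.
    """
    if prompt_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt alone fills or exceeds the context window")
    # Respect the caller's request, the per-request cap, and the space left.
    return min(requested_output, MAX_OUTPUT, remaining)


if __name__ == "__main__":
    print(output_budget(1_000))
    print(output_budget(160_000))
```

A short prompt is capped by the per-request output limit, while a prompt near the window size is capped by whatever space remains.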
How much does Llama Guard 4 12B cost?
Llama Guard 4 12B is priced at $0.18 per 1M input tokens and $0.18 per 1M output tokens when accessed via the haimaker.ai OpenAI-compatible API.
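As a rough worked example of the pricing above, the following sketch estimates the cost of one request. `estimate_cost_usd` is a hypothetical helper, not part of any SDK, and it assumes billing is strictly linear in token count with no minimums or rounding, which the listing does not confirm.

```python
from decimal import Decimal

# Prices quoted above, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.18")
OUTPUT_PRICE_PER_M = Decimal("0.18")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated USD cost of a single request."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # 10,000 input tokens and 500 output tokens.
    print(estimate_cost_usd(10_000, 500))
```

`Decimal` is used instead of floats so that small per-request amounts add up without binary rounding drift.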
What features does Llama Guard 4 12B support?
Llama Guard 4 12B supports vision. Function calling, reasoning, web search, and URL context are not supported.
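The listing does not show how image input is sent. A minimal sketch, assuming the haimaker.ai endpoint accepts the OpenAI chat format's `image_url` content parts with a base64 data URL, might build the message like this. `build_image_message` is a hypothetical helper name, and the assumption should be checked against the provider's documentation.

```python
import base64


def build_image_message(prompt: str, image_bytes: bytes,
                        mime_type: str = "image/png") -> dict:
    """Hypothetical helper: one user message with text plus one image.

    Assumes OpenAI-style content parts are accepted by the endpoint.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime_type};base64,{encoded}"
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }


if __name__ == "__main__":
    # Placeholder bytes; in practice, read them from an image file.
    message = build_image_message("Describe this image.", b"\x89PNG")
    print(message["content"][0])
```

The returned dictionary can be passed as an element of `messages` in a chat completions request for `meta-llama/llama-guard-4-12b`.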
How do I use Llama Guard 4 12B via API?
Send requests to https://api.haimaker.ai/v1/chat/completions with model "meta-llama/llama-guard-4-12b" using any OpenAI-compatible SDK. Authentication uses a Bearer API key from https://app.haimaker.ai.
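For readers not using an SDK, the same request can be sketched with the Python standard library only. The endpoint URL, model ID, and Bearer scheme come from the answer above; the helper names `build_request` and `send` are hypothetical, and the response is assumed to follow the OpenAI chat completions shape.

```python
import json
import urllib.request

API_URL = "https://api.haimaker.ai/v1/chat/completions"
MODEL_ID = "meta-llama/llama-guard-4-12b"


def build_request(api_key: str, user_text: str) -> urllib.request.Request:
    """Hypothetical helper: build (but do not send) a chat completions POST."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)


if __name__ == "__main__":
    req = build_request("YOUR_API_KEY", "Hello, how are you?")
    print(req.get_method(), req.full_url)
```

Calling `send(req)` with a valid key performs the network call; under the assumed response shape, the reply text would be at `["choices"][0]["message"]["content"]`.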