mistralai/ministral-8b-2512
Ministral 3 8B Instruct 2512 GGUF (mistralai/ministral-8b-2512) is an AI model from Mistral AI with a 262,144-token context window and a maximum output of 262,144 tokens, priced at $0.15 per 1M input tokens and $0.15 per 1M output tokens. It is available via the haimaker.ai OpenAI-compatible API.
A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.
This release provides the instruct post-trained version in GGUF format at several quantization levels. It is fine-tuned for instruction following, which makes it well suited to chat and instruction-based use cases.
The Ministral 3 family is designed for edge deployment and runs on a wide range of hardware. Ministral 3 8B can be deployed locally: it fits in 12 GB of VRAM in FP8, and in less if further quantized.
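As a rough sanity check on the VRAM figure above, the sketch below estimates the memory taken by the weights alone at a few precisions. It assumes a nominal 8 billion parameters (the exact count is not stated here) and ignores the vision encoder, KV cache, activations, and runtime overhead, which is where the remaining headroom in a 12 GB budget would go. The function name `weight_memory_gb` is made up for illustration.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 1e9


# Assumption: "8B" is treated as exactly 8 billion weights.
NOMINAL_PARAMS = 8e9

for label, bits in [("FP16", 16), ("FP8", 8), ("4-bit", 4)]:
    gb = weight_memory_gb(NOMINAL_PARAMS, bits)
    print(f"{label}: ~{gb:.1f} GB for weights only")
```

Under these assumptions FP8 weights come to about 8 GB, which is consistent with the model fitting in 12 GB once cache and overhead are added, and a 4-bit quantization roughly halves that.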
Learn more in our blog post and paper.
We recommend deploying with the following best practices:
This model is licensed under the Apache 2.0 License.
You must not use this model in a manner that infringes, misappropriates, or otherwise violates any third party’s rights, including intellectual property rights.

| Property | Value |
| --- | --- |
| Mode | chat |
| Context Window | 262,144 tokens |
| Max Output | 262,144 tokens |
| Function Calling | Supported |
| Vision | Supported |
| Reasoning | - |
| Web Search | - |
| Url Context | - |
| Base Model | mistralai/Ministral-3-8B-Instruct-2512 |
| Languages | en, fr, es, de, it, pt, nl, zh, ja, ko, ar |
| Library | vllm |
from openai import OpenAI

client = OpenAI(
    base_url="https://api.haimaker.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="mistralai/ministral-8b-2512",
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ],
)
print(response.choices[0].message.content)

Ministral 3 8B Instruct 2512 GGUF (mistralai/ministral-8b-2512) has a 262,144-token context window and supports up to 262,144 output tokens per request.
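Because the context window and the output cap are the same size, a long prompt can leave less room for output than the cap suggests. The sketch below computes a safe output limit for a request. It assumes that prompt and output tokens share the context window, which is the usual convention but is not confirmed for this endpoint; `output_budget` is a hypothetical helper, not part of any SDK.

```python
CONTEXT_WINDOW = 262_144  # tokens, from the listing above
MAX_OUTPUT = 262_144      # tokens, from the listing above


def output_budget(prompt_tokens: int, requested: int) -> int:
    """Clamp a requested output length to what the limits allow.

    Assumption: prompt and output tokens count against the same window.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = max(CONTEXT_WINDOW - prompt_tokens, 0)
    return min(requested, MAX_OUTPUT, remaining)


print(output_budget(prompt_tokens=200_000, requested=100_000))
```

The result could be passed as `max_tokens` in a chat completion request, with the prompt token count taken from whatever tokenizer or usage report is available.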
Ministral 3 8B Instruct 2512 GGUF is priced at $0.15 per 1M input tokens and $0.15 per 1M output tokens when accessed via the haimaker.ai OpenAI-compatible API.
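To make the pricing concrete, here is a minimal cost estimator using the two rates quoted above. The function name and the example token counts are illustrative only; actual billing may round or meter differently.

```python
INPUT_PRICE_PER_M = 0.15   # USD per 1M input tokens, from the listing
OUTPUT_PRICE_PER_M = 0.15  # USD per 1M output tokens, from the listing


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in US dollars."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: a 20,000-token prompt with a 1,000-token reply.
print(f"${estimate_cost_usd(20_000, 1_000):.5f}")
```

At these rates a request that fills the entire 262,144-token window costs roughly four cents.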
Ministral 3 8B Instruct 2512 GGUF supports function calling and vision.
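The two request bodies below sketch what function calling and vision input look like in the OpenAI chat completions format. The `get_weather` tool and the image URL are hypothetical placeholders, and it is an assumption that this endpoint accepts the standard `tools` field and `image_url` content parts exactly as OpenAI defines them.

```python
import json

MODEL = "mistralai/ministral-8b-2512"

# Hypothetical tool, described with the OpenAI "tools" schema.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Function calling: the model may answer with a tool call instead of text.
function_call_request = {
    "model": MODEL,
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [weather_tool],
}

# Vision: the user message mixes a text part and an image part.
vision_request = {
    "model": MODEL,
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                # Placeholder URL; replace with a real, reachable image.
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
            ],
        }
    ],
}

print(json.dumps(function_call_request, indent=2))
```

With the OpenAI Python SDK, either dictionary can be sent as `client.chat.completions.create(**function_call_request)`.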
Send requests to https://api.haimaker.ai/v1/chat/completions with model "mistralai/ministral-8b-2512" using any OpenAI-compatible SDK. Authentication uses a Bearer API key from https://app.haimaker.ai.
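For callers who prefer not to use an SDK, the sketch below builds the same request with Python's standard library, showing the Bearer header and JSON body explicitly. `build_request` is a hypothetical helper; the response shape in the commented-out section assumes the endpoint mirrors OpenAI's chat completion response.

```python
import json
import urllib.request

API_URL = "https://api.haimaker.ai/v1/chat/completions"


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request."""
    payload = {
        "model": "mistralai/ministral-8b-2512",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_request("YOUR_API_KEY", "Hello, how are you?")

# Sending requires a valid key and network access:
# with urllib.request.urlopen(req) as resp:
#     body = json.load(resp)
#     print(body["choices"][0]["message"]["content"])
```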