Ministral 3 8B Instruct 2512 GGUF

mistralai/ministral-8b-2512
Chat · apache-2.0 · Mistral AI · Function Calling · Vision
Released Oct 2025 · Updated Jan 2026

Ministral 3 8B Instruct 2512 GGUF (mistralai/ministral-8b-2512) is an AI model from Mistral AI with a 262,144-token context window and 262,144 max output tokens, priced at $0.15/1M input and $0.15/1M output tokens. Available via the haimaker.ai OpenAI-compatible API.

Context Window: 262K tokens
Max Output: 262K tokens
Input Price: $0.15 / 1M tokens
Output Price: $0.15 / 1M tokens

Overview

A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.

Model Card

This release provides the instruct post-trained version in GGUF at several quantization levels. It is fine-tuned for instruction following, which makes it well suited to chat and instruction-based use cases.

The Ministral 3 family is designed for edge deployment and runs on a wide range of hardware. Ministral 3 8B can be deployed locally: it fits in 12GB of VRAM in FP8, and in less if further quantized.
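
As a rough sanity check on the 12GB figure, the weight footprint can be estimated from the parameter counts this model card lists (an 8.4B language model plus a 0.4B vision encoder). The sketch below assumes exactly 1 byte per parameter for FP8 and ignores the KV cache, activations, and runtime overhead, which take up the remaining headroom. The 4-bit case is an idealized, hypothetical quantization level, not a published number.

```python
# Back-of-the-envelope weight footprint; ignores KV cache and activations.
LANGUAGE_MODEL_PARAMS = 8.4e9   # from the model card
VISION_ENCODER_PARAMS = 0.4e9   # from the model card


def weight_gigabytes(bits_per_param: float) -> float:
    """Return the approximate weight size in GB (10^9 bytes)."""
    total_params = LANGUAGE_MODEL_PARAMS + VISION_ENCODER_PARAMS
    return total_params * bits_per_param / 8 / 1e9


fp8_gb = weight_gigabytes(8)   # FP8: 1 byte per parameter
q4_gb = weight_gigabytes(4)    # hypothetical, idealized 4-bit quantization
print(f"FP8 weights: ~{fp8_gb:.1f} GB, 4-bit weights: ~{q4_gb:.1f} GB")
```

Real GGUF quantization formats store some extra metadata per block, so actual file sizes will differ somewhat from this idealized estimate.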

Learn more in our blog post and paper.

Key Features

Ministral 3 8B consists of two main architectural components:
  • 8.4B Language Model
  • 0.4B Vision Encoder
The Ministral 3 8B Instruct model offers the following capabilities:
  • Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
  • Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
  • System Prompt: Maintains strong adherence and support for system prompts.
  • Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
  • Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
  • Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
  • Large Context Window: Supports a 256k context window (262,144 tokens).
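
To make the vision capability concrete, here is a minimal sketch of a user message that pairs text with an inline image. It assumes the endpoint accepts OpenAI-style `image_url` content parts with a base64 data URL; the page only states that vision is supported and that the API is OpenAI-compatible, so treat the exact message shape as an assumption. The helper name `image_message` is hypothetical.

```python
import base64


def image_message(prompt: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Build a user message with one text part and one inline image part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime};base64,{encoded}"},
            },
        ],
    }


# Placeholder bytes stand in for a real image file read from disk.
message = image_message("Describe this image.", b"placeholder image bytes")
```

The resulting dictionary can be passed as one entry of the `messages` list in a chat completion request.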

Recommended Settings

We recommend deploying with the following best practices:

  • System Prompt: Define a clear environment and use case, including guidance on how to effectively leverage tools in agentic systems.

  • Sampling Parameters: Use a temperature below 0.1 for daily-driver and production environments. Higher temperatures may be explored for creative use cases; developers are encouraged to experiment with alternative settings.

  • Tools: Keep the set of tools well-defined and limit their number to the minimum required for the use case. Avoid overloading the model with an excessive number of tools.

  • Vision: When deploying with vision capabilities, we recommend maintaining an aspect ratio close to 1:1 (width-to-height) for images. Avoid overly thin or wide images; crop them as needed to ensure optimal performance.
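
The recommendations above can be sketched as two small helpers: one that builds a request with a clear system prompt, a temperature below 0.1, and a single tool, and one that computes a center-crop box for overly wide or tall images. Everything specific here is an assumption for illustration: the `get_weather` tool, the system prompt wording, the helper names, and the `max_ratio` threshold of 1.5 are hypothetical and do not come from Mistral AI or haimaker.ai.

```python
# Hypothetical tool definition in the OpenAI-style function-calling format.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

SYSTEM_PROMPT = (
    "You are a weather assistant. Call get_weather whenever the user asks "
    "about current conditions; otherwise answer directly."
)


def build_request(user_text: str) -> dict:
    """Return chat-completion keyword arguments following the recommendations."""
    return {
        "model": "mistralai/ministral-8b-2512",
        "temperature": 0.05,       # below 0.1, as recommended for production
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_text},
        ],
        "tools": [WEATHER_TOOL],   # keep the tool list minimal
    }


def center_crop_box(width: int, height: int, max_ratio: float = 1.5) -> tuple:
    """Return (left, top, right, bottom), trimming the long side to max_ratio."""
    if width > height * max_ratio:
        new_width = int(height * max_ratio)
        left = (width - new_width) // 2
        return (left, 0, left + new_width, height)
    if height > width * max_ratio:
        new_height = int(width * max_ratio)
        top = (height - new_height) // 2
        return (0, top, width, top + new_height)
    return (0, 0, width, height)   # already close enough to 1:1


request = build_request("What's the weather in Paris?")
box = center_crop_box(3000, 1000)
```

The request dictionary can be passed to the SDK as `client.chat.completions.create(**request)`, and the box uses the (left, upper, right, lower) ordering that image libraries such as Pillow expect for cropping.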


License

This model is licensed under the Apache 2.0 License.

You must not use this model in a manner that infringes, misappropriates, or otherwise violates any third party’s rights, including intellectual property rights.

Features & Capabilities

Mode: chat
Context Window: 262,144 tokens
Max Output: 262,144 tokens
Function Calling: Supported
Vision: Supported
Reasoning: -
Web Search: -
URL Context: -

Technical Details

Base Model: mistralai/Ministral-3-8B-Instruct-2512
Languages: en, fr, es, de, it, pt, nl, zh, ja, ko, ar
Library: vllm

API Usage

from openai import OpenAI

client = OpenAI(
    base_url="https://api.haimaker.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="mistralai/ministral-8b-2512",
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ],
)

print(response.choices[0].message.content)

Frequently Asked Questions

What is the context window of Ministral 3 8B Instruct 2512 GGUF?

Ministral 3 8B Instruct 2512 GGUF (mistralai/ministral-8b-2512) has a 262,144-token context window and supports up to 262,144 output tokens per request.

How much does Ministral 3 8B Instruct 2512 GGUF cost?

Ministral 3 8B Instruct 2512 GGUF is priced at $0.15 per 1M input tokens and $0.15 per 1M output tokens when accessed via the haimaker.ai OpenAI-compatible API.
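
For a quick estimate, the listed prices translate into a per-request cost as in the sketch below. The token counts in the example are made up for illustration, and the helper name `estimate_cost` is hypothetical.

```python
INPUT_PRICE_PER_MILLION = 0.15    # USD per 1M input tokens, as listed
OUTPUT_PRICE_PER_MILLION = 0.15   # USD per 1M output tokens, as listed


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost of one request in USD."""
    total = (
        input_tokens * INPUT_PRICE_PER_MILLION
        + output_tokens * OUTPUT_PRICE_PER_MILLION
    )
    return total / 1_000_000


# Example: a 200,000-token prompt with a 2,000-token reply costs about $0.03.
cost = estimate_cost(200_000, 2_000)
print(f"${cost:.4f}")
```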

What features does Ministral 3 8B Instruct 2512 GGUF support?

Ministral 3 8B Instruct 2512 GGUF supports function calling and vision.

How do I use Ministral 3 8B Instruct 2512 GGUF via API?

Send requests to https://api.haimaker.ai/v1/chat/completions with model "mistralai/ministral-8b-2512" using any OpenAI-compatible SDK. Authentication uses a Bearer API key from https://app.haimaker.ai.
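
For callers who prefer not to use an SDK, the same request can be sketched with the Python standard library. The example only builds the request object and does not send it. The helper name `build_http_request` is hypothetical, and the JSON body assumes the standard OpenAI chat completions schema.

```python
import json
import urllib.request


def build_http_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for the chat completions endpoint."""
    body = json.dumps({
        "model": "mistralai/ministral-8b-2512",
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        "https://api.haimaker.ai/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",   # Bearer API key
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_http_request("YOUR_API_KEY", "Hello, how are you?")
# To send it: urllib.request.urlopen(req), then json.load() the response.
```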
