inclusionai/ring-2.6-1t

Ring 2.6 1T (inclusionai/ring-2.6-1t) is a bailing_hybrid 1025.7B-parameter model from Inclusionai with a 262,144-token context window and 65,536 max output tokens, priced at $0.07/1M input and $0.63/1M output tokens. It is available via the haimaker.ai OpenAI-compatible API.
Ring 2.6 1T is a chat model by Inclusionai with 1025.7B parameters and a 262K-token context window. It supports function calling and reasoning.
Hugging Face | ModelScope | ling.tbox.cn
Introducing Ring-2.6-1T, a trillion-parameter flagship reasoning model designed for complex real-world tasks. It is available to developers, researchers, and enterprises for validation, adaptation, and further development.
The goal of Ring-2.6-1T is not simply to pursue a larger parameter scale, but to address the real production environments that large models are entering: agent workflows, engineering development, scientific research analysis, complex business systems, and enterprise automation processes. In these scenarios, models need not only to "answer questions" but also to understand context, plan steps, invoke tools, execute continuously, and remain stable over long-horizon tasks.
Ring-2.6-1T has achieved key upgrades in three areas:
You can download Ring-2.6-1T from the following table. If you are located in mainland China, we also provide the model on ModelScope to speed up the download process.
| Model | Context Length | Download |
| :---------: | :-----------------: | :----------------------: |
| Ring-2.6-1T | 128K -> 256K (YaRN) | HuggingFace / ModelScope |
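The context-length entry in the table can be read as a simple scaling calculation. The following is a minimal sketch, assuming "128K" and "256K" are binary multiples (K = 1024 tokens); the scaling factor is derived here from the two lengths and is not quoted from the model's configuration.

```python
# Assumption: "128K" and "256K" mean 128 * 1024 and 256 * 1024 tokens.
NATIVE_CONTEXT = 128 * 1024      # context length before extension
EXTENDED_CONTEXT = 256 * 1024    # context length after YaRN extension

# Scaling factor implied by the two lengths (derived, not read from a config file).
yarn_factor = EXTENDED_CONTEXT / NATIVE_CONTEXT

print(f"native={NATIVE_CONTEXT:,} extended={EXTENDED_CONTEXT:,} factor={yarn_factor}")
# prints: native=131,072 extended=262,144 factor=2.0
```

Under that reading, the extended length equals the 262,144-token context window listed for the hosted model.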
Note: If you are interested in previous versions, please visit the past model collections on Hugging Face or ModelScope.
In real business systems, models often face not isolated Q&A, but continuous, multi-turn, complex tasks that require tool collaboration. Ring-2.6-1T has been specifically enhanced for such scenarios, enabling more stable task decomposition, step planning, tool invocation, error correction, and context continuation.
On benchmarks, Ring-2.6-1T (high) performs strongly in real-world task execution evaluations. It scores 87.60 on PinchBench, notably higher than GPT-5.4 xHigh and Gemini-3.1-Pro high; 63.82 on ClawEval, ranking among the top comparable models; and 95.32 on Tau2-Bench in the Telecom scenario, less than 1 point behind the highest-scoring model. These results indicate stable execution in complex business processes, tool collaboration, and industry-specific tasks.
This means that Ring-2.6-1T not only understands user intent but can also keep driving tasks forward in real workflows. In personal assistant agents, in enterprise process automation, and in coding agent scenarios that involve code generation, task decomposition, and engineering collaboration, Ring-2.6-1T functions more like a workflow engine that is executable, responsive to feedback, and capable of iteration.
In practice, not all tasks require the same level of reasoning resources. A format conversion or information organization task has entirely different demands on the model's depth of thinking compared to a math competition problem or a complex system analysis.
To address this, Ring-2.6-1T introduces an adjustable Reasoning Effort mechanism, supporting two reasoning effort levels: high and xhigh.
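How the effort level is selected per request is not shown here. As a rough sketch only, assuming the serving endpoint accepts an OpenAI-style `reasoning_effort` field in the chat-completions body (an assumption this page does not confirm), a request body could be assembled as follows. `build_chat_payload` is a hypothetical helper introduced for illustration.

```python
import json

# The two effort levels named in the text.
ALLOWED_EFFORTS = ("high", "xhigh")


def build_chat_payload(prompt: str, effort: str = "high") -> dict:
    """Build a chat-completions body carrying a reasoning-effort hint.

    ASSUMPTION: the field name `reasoning_effort` follows the OpenAI
    convention; the parameter a given Ring-2.6-1T deployment actually
    accepts may differ.
    """
    if effort not in ALLOWED_EFFORTS:
        raise ValueError(f"effort must be one of {ALLOWED_EFFORTS}, got {effort!r}")
    return {
        "model": "inclusionai/ring-2.6-1t",
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": effort,  # assumed field name
    }


# A light formatting task versus a harder mathematical one.
print(json.dumps(build_chat_payload("Convert this CSV row to JSON: a,b,c"), indent=2))
print(json.dumps(build_chat_payload("Prove that sqrt(2) is irrational.", "xhigh"), indent=2))
```

Validating the level on the client side keeps an unsupported value from reaching the server, where the failure mode would depend on the deployment.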
Conducting reinforcement learning training on trillion-parameter models is itself an enormous engineering challenge. In traditional synchronous RL training, policy generation (rollout) and gradient updates are tightly coupled, leading to:
We will submit the model to the official SGLang release later. For now, prepare the environment with the following steps:

```bash
git clone -b ling_2_5 git@github.com:antgroup/sglang.git
cd sglang

# Install the Python packages
pip install --upgrade pip
pip install -e "python"
```
SGLang now supports both BF16 and FP8 models; which one is used depends on the dtype of the model in ${MODEL_PATH}.
Here is an example of running Ring-2.6-1T across multiple GPU nodes, where the master node IP is ${MASTER_IP} and the server port is ${PORT}:
```bash
# Node 0:
python -m sglang.launch_server --model-path $MODEL_PATH --tp-size 8 --pp-size 4 --dp-size 1 --trust-remote-code --dist-init-addr $MASTER_IP:2345 --port $PORT --nnodes 4 --node-rank 0

# Node 1:
python -m sglang.launch_server --model-path $MODEL_PATH --tp-size 8 --pp-size 4 --dp-size 1 --trust-remote-code --dist-init-addr $MASTER_IP:2345 --port $PORT --nnodes 4 --node-rank 1

# Node 2:
python -m sglang.launch_server --model-path $MODEL_PATH --tp-size 8 --pp-size 4 --dp-size 1 --trust-remote-code --dist-init-addr $MASTER_IP:2345 --port $PORT --nnodes 4 --node-rank 2

# Node 3:
python -m sglang.launch_server --model-path $MODEL_PATH --tp-size 8 --pp-size 4 --dp-size 1 --trust-remote-code --dist-init-addr $MASTER_IP:2345 --port $PORT --nnodes 4 --node-rank 3
```
This is only an example. Please adjust arguments according to your actual environment.
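Because the per-node commands differ only in `--node-rank`, they can be generated rather than copied by hand. The sketch below uses only the flags that appear in the example; `launch_commands` is a hypothetical helper, and the model path, IP address, and port in the usage lines are placeholders.

```python
import shlex


def launch_commands(model_path: str, master_ip: str, port: int,
                    tp_size: int = 8, pp_size: int = 4, dp_size: int = 1,
                    nnodes: int = 4, init_port: int = 2345) -> list[str]:
    """Return one sglang.launch_server command line per node rank."""
    commands = []
    for rank in range(nnodes):
        args = [
            "python", "-m", "sglang.launch_server",
            "--model-path", model_path,
            "--tp-size", str(tp_size),
            "--pp-size", str(pp_size),
            "--dp-size", str(dp_size),
            "--trust-remote-code",
            "--dist-init-addr", f"{master_ip}:{init_port}",
            "--port", str(port),
            "--nnodes", str(nnodes),
            "--node-rank", str(rank),
        ]
        # shlex.join quotes any argument that needs it for the shell.
        commands.append(shlex.join(args))
    return commands


# Placeholder values; replace with your own path and addresses.
for rank, cmd in enumerate(launch_commands("/models/Ring-2.6-1T", "10.0.0.1", 30000)):
    print(f"# Node {rank}:\n{cmd}\n")
```

Assuming the total GPU count is the product of the tensor-parallel and pipeline-parallel sizes, the example values correspond to 8 × 4 = 32 GPUs, or 8 per node across 4 nodes.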
```bash
curl -s http://${MASTER_IP}:${PORT}/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "auto", "messages": [{"role": "user", "content": "What is the capital of France?"}]}'
```
This code repository is licensed under the MIT License.

| Property | Value |
| :--- | :--- |
| Mode | chat |
| Context Window | 262,144 tokens |
| Max Output | 65,536 tokens |
| Function Calling | Supported |
| Vision | - |
| Reasoning | Supported |
| Web Search | - |
| URL Context | - |
| Architecture | BailingMoeV2_5ForCausalLM |
| Model Type | bailing_hybrid |
| Library | transformers |
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.haimaker.ai/v1",
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="inclusionai/ring-2.6-1t",
    messages=[
        {"role": "user", "content": "Hello, how are you?"}
    ],
)

print(response.choices[0].message.content)
```

Ring 2.6 1T (inclusionai/ring-2.6-1t) has a 262,144-token context window and supports up to 65,536 output tokens per request.
Ring 2.6 1T is priced at $0.07 per 1M input tokens and $0.63 per 1M output tokens when accessed via the haimaker.ai OpenAI-compatible API.
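The per-token prices translate into a per-request estimate with simple arithmetic. This is a minimal sketch, assuming billing is linear in token counts with no minimums, caching discounts, or other adjustments; `estimate_cost` and the token counts in the usage line are illustrative.

```python
# Prices quoted above, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.07
OUTPUT_PRICE_PER_M = 0.63


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


# Hypothetical request: 20,000 prompt tokens and 4,000 generated tokens.
print(f"${estimate_cost(20_000, 4_000):.6f}")  # prints $0.003920
```

Since output tokens cost nine times as much as input tokens, long generations tend to dominate the bill even when prompts are large.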
Ring 2.6 1T supports function calling and reasoning.
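As an illustration of what a function-calling request might look like, the sketch below builds a request body in the OpenAI `tools` format. The `get_weather` tool is hypothetical, and it is an assumption that the endpoint accepts this schema unchanged.

```python
import json

# Hypothetical tool definition in the OpenAI "tools" format.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # illustrative name, not a real service
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = {
    "model": "inclusionai/ring-2.6-1t",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [get_weather_tool],
    "tool_choice": "auto",  # let the model decide whether to call the tool
}

print(json.dumps(payload, indent=2))
```

With the OpenAI Python SDK, the same `tools` and `tool_choice` values can be passed to `client.chat.completions.create`; if the model decides to call the tool, the call appears in `response.choices[0].message.tool_calls`.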
Send requests to https://api.haimaker.ai/v1/chat/completions with model "inclusionai/ring-2.6-1t" using any OpenAI-compatible SDK. Authentication uses a Bearer API key from https://app.haimaker.ai.
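For environments without an SDK, the request can be assembled with the Python standard library. The sketch below builds the authenticated request but does not send it; `build_request` is a hypothetical helper, and `YOUR_API_KEY` is a placeholder.

```python
import json
import urllib.request

API_URL = "https://api.haimaker.ai/v1/chat/completions"


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated chat-completions request."""
    body = {
        "model": "inclusionai/ring-2.6-1t",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # Bearer token authentication
        },
        method="POST",
    )


request = build_request("YOUR_API_KEY", "Hello, how are you?")
print(request.get_method(), request.full_url)
```

To send it, pass the object to `urllib.request.urlopen(request)` and decode the JSON response body; a real API key is required for the call to succeed.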