OpenCode with Ollama is the setup people want when they are tired of sending every coding prompt to a cloud API. It works. It is also slower and more fragile than the demos make it look.

The right expectation is simple: local OpenCode is excellent for small, private, repetitive work. It is not the setup you should trust with a messy multi-file migration unless you enjoy babysitting.

Install Ollama

On macOS:

brew install --cask ollama-app
open -a Ollama

On Linux:

curl -fsSL https://ollama.com/install.sh | sh
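
Either way, the server listens on port 11434 once it is running. A quick sanity check:

curl http://localhost:11434

If it replies with "Ollama is running", you are good.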

Then pull a model:

ollama pull gemma4

Check that it is available:

ollama list
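
The output is a small table. The ID, size, and timestamp below are illustrative; only the name column matters here:

NAME             ID              SIZE      MODIFIED
gemma4:latest    0a1b2c3d4e5f    17 GB     2 minutes ago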

Use the exact model name from that output in your OpenCode config.

Configure OpenCode

OpenCode can talk to OpenAI-compatible providers. Ollama exposes a compatible endpoint at:

http://localhost:11434/v1
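
You can confirm the endpoint answers before touching any config:

curl http://localhost:11434/v1/models

It should return a JSON list that includes the model you pulled.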

Add an Ollama provider in your OpenCode config (a project-level opencode.json, or the global OpenCode config file):

{
  "provider": {
    "ollama": {
      "npm": "@ai-sdk/openai-compatible",
      "options": {
        "baseURL": "http://localhost:11434/v1"
      },
      "models": {
        "gemma4:latest": {}
      }
    }
  }
}

If OpenCode asks for an API key, use a placeholder. Ollama does not check it, but the client needs a non-empty value:

{
  "ollama": {
    "type": "api",
    "key": "ollama"
  }
}
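
If you prefer not to hand-edit credentials, OpenCode also has an interactive flow; assuming your build ships the auth subcommand, run it, pick the Ollama provider you defined, and enter any placeholder string:

opencode auth login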

Restart OpenCode and switch to the Ollama model from the model picker.

Models to try first

Gemma 4

Good first pick. It handles explanations, simple edits, and small coding tasks well. Runs on modest machines compared with bigger coding models.

Qwen3.5

Often better for code, especially if you can run a larger variant. The 27B-class models are more useful than tiny models, but they need real memory.

Llama 3.3

Good general model if you have the hardware. Less convenient on smaller laptops.
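
If you want to compare them, pull the candidates ahead of time. The tags below are placeholders; check the Ollama library for the exact names and sizes that actually exist for your hardware:

ollama pull gemma4
ollama pull qwen3.5
ollama pull llama3.3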

Performance expectations

Recent local-model threads all say the quiet part out loud: prompts can work, code can be good, and the whole thing can still feel slow once tool calls start stacking up.

That is normal. A coding agent is not a single chat request. It reads files, plans, edits, checks output, and loops. Local inference makes every loop more visible.

To make it tolerable:

  • Keep context small
  • Use smaller models for simple edits
  • Keep the model warm
  • Close memory-heavy apps
  • Use a cloud fallback for long refactors
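
Before tuning anything, check what is actually resident:

ollama ps

It shows which models are loaded, how much memory they use, whether they are split between GPU and CPU, and how long they will stay loaded.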

Keep Ollama warm

export OLLAMA_KEEP_ALIVE="-1"

A value of -1 keeps models loaded indefinitely instead of unloading them after the default five minutes. The variable has to reach the Ollama server process, not just your shell, so restart Ollama after setting it. This avoids repeated cold starts during a coding session.
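
How you set it depends on how Ollama runs. On Linux with the systemd service (assuming the default unit installed by the script), set it on the service and restart:

sudo systemctl edit ollama
# add under [Service]:
#   Environment="OLLAMA_KEEP_ALIVE=-1"
sudo systemctl restart ollama

On macOS with the menu bar app, set it for launchd, then quit and reopen Ollama:

launchctl setenv OLLAMA_KEEP_ALIVE -1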

When to use a cloud fallback

Use local OpenCode for:

  • Reading unfamiliar code
  • Drafting small changes
  • Generating tests
  • Explaining errors
  • Working with private files

Use a cloud model for:

  • Multi-file refactors
  • Hard debugging
  • Architecture changes
  • Anything you do not want to review line by line

The best setup is not local-only. It is local-first.

If your real target is Gemma 4 specifically, read Gemma 4 Ollama setup. If you are using OpenClaw instead of OpenCode, use Gemma 4 with OpenClaw.