The search demand around Gemma 4 is pretty clear: people want the shortest path from “I heard this runs locally” to “my agent is using it without an API bill.”
Ollama is that path. It is not the fastest possible runtime, and it is not where you go if you want to hand-tune every quant. But for a local model that you can install in a few minutes and wire into coding tools, it is the least annoying option.
Quick install
On macOS:
brew install --cask ollama-app
open -a Ollama
ollama pull gemma4
ollama run gemma4:latest "Write a tiny Python function"
On Linux:
curl -fsSL https://ollama.com/install.sh | sh
ollama pull gemma4
ollama run gemma4:latest "Explain what Ollama is doing"
On Windows, install the Ollama app, open PowerShell, and run:
ollama pull gemma4
ollama run gemma4:latest "Hello from Gemma 4"
That is enough to prove the model works. The local API runs at http://localhost:11434.
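If you want to check the server without the CLI, a raw request against the native API works too. This is a minimal sanity check, assuming the default port and the gemma4:latest tag pulled above:

# Native API: single non-streaming completion
curl http://localhost:11434/api/generate -d '{
  "model": "gemma4:latest",
  "prompt": "Say hello in one short sentence.",
  "stream": false
}'

A JSON reply with a response field means the model is loaded and answering.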
Use the local API
Most coding agents do not talk to Ollama’s native API directly. They expect an OpenAI-compatible endpoint. Ollama provides one at:
http://localhost:11434/v1
Use any placeholder API key if the tool requires one; Ollama does not validate it locally. In most agents, the relevant settings look something like this:
{
"baseURL": "http://localhost:11434/v1",
"apiKey": "ollama",
"model": "gemma4:latest"
}
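Before pointing an agent at that config, it is worth verifying the endpoint by hand. A quick sketch with curl, using the same placeholder key (the Authorization header is only there for tools that insist on one):

# OpenAI-compatible endpoint: chat completion against the local server
curl http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ollama" \
  -d '{
    "model": "gemma4:latest",
    "messages": [{"role": "user", "content": "Write a one-line Python hello world."}]
  }'

If this returns a chat.completion object, any OpenAI-compatible agent should work with the same base URL.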
For OpenClaw specifically, use the dedicated setup guide: Gemma 4 with OpenClaw using Ollama.
For OpenCode, use Ollama with OpenCode.
Keep the model warm
The first request after a model unloads feels slow because Ollama has to load weights back into memory. For a coding assistant, that gets old quickly.
On macOS or Linux:
export OLLAMA_KEEP_ALIVE="-1"
Then restart Ollama so the variable takes effect. This keeps Gemma 4 loaded instead of unloading it after a few idle minutes. Note that a shell export only reaches a server you start from that shell with ollama serve; if you run the macOS app, set the variable with launchctl setenv OLLAMA_KEEP_ALIVE -1 instead.
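If you would rather not touch server configuration at all, Ollama also accepts a keep_alive value on individual requests, where -1 means keep the model loaded indefinitely. A small sketch against the native API:

# Warm the model once and pin it in memory for this server session
curl http://localhost:11434/api/generate -d '{
  "model": "gemma4:latest",
  "prompt": "warm-up",
  "keep_alive": -1,
  "stream": false
}'

One warm-up request like this at the start of a coding session is usually enough.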
Pick the right machine
For casual local coding work, 16GB of unified memory or 16GB of system RAM is the floor. That is enough for a smaller Gemma 4 variant and a normal editor.
For bigger variants, give yourself 24GB or more. On a 24GB GPU or a 32GB Mac, local coding agents become much less painful. On weaker hardware, the model may still run, but every tool call feels like waiting for a build that should have been cached.
What works well
Gemma 4 through Ollama is good for:
- Explaining code
- Writing small functions
- Generating config files
- Drafting tests
- Summarizing logs
- Handling private code that should not leave your laptop
It is weaker on long, multi-file refactors. If the task requires keeping a whole system in its head, use Gemma 4 for the first pass and escalate to a stronger cloud model when accuracy matters.
Common fixes
Ollama is not responding
Check that the server is running:
curl http://localhost:11434/api/tags
If that fails, start the app again or run ollama serve.
The agent says the model does not exist
Run:
ollama list
Use the exact model name shown there, usually gemma4:latest.
Responses are too slow
Reduce the context size, close memory-heavy apps, and keep the model warm. If you are trying to run a large Gemma 4 variant on a 16GB machine, switch to a smaller variant before blaming the agent.
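Context size is the biggest lever. On the native API you can cap it per request with the num_ctx option; the 4096 below is an illustration, not a universal recommendation:

# Smaller context window = less memory pressure, faster responses
curl http://localhost:11434/api/generate -d '{
  "model": "gemma4:latest",
  "prompt": "Summarize this log line: connection refused on port 5432",
  "options": { "num_ctx": 4096 },
  "stream": false
}'

If your agent only speaks the /v1 endpoint and cannot pass options, you can bake the limit into a variant instead, using a Modelfile with PARAMETER num_ctx 4096 and ollama create.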
The honest version
Gemma 4 + Ollama is a very good local setup for routine work. It is private, cheap, and easy to roll back if you do not like it.
It is not a Claude Opus replacement. Treat it as the local model you use first, not the only model you use forever. That one change makes the setup much more useful.