Deploy a Local AI Coding Assistant with Ollama and Open WebUI on Linux (Docker Guide)

Running a coding-focused AI assistant locally is no longer a “lab-only” experiment. With modern open-source tooling, you can host a private chatbot on your own Linux machine, avoid sending prompts to third-party services, and keep sensitive code snippets inside your network. In this tutorial, you will install Ollama (a lightweight local LLM runtime) and Open WebUI (a clean web interface) using Docker. The result is a fast, browser-based AI assistant you can use for debugging, code review, documentation drafts, and scripting help.

What You Will Build

By the end, you will have a local web app accessible at http://localhost:3000 (or your server’s IP) that talks to an Ollama service running on the same host. You will also learn how to persist data, pull a model, and validate that everything is working. This guide assumes Ubuntu Server or another modern Linux distribution with Docker support.

Prerequisites

Before starting, confirm you have: (1) a Linux server or desktop with at least 8 GB RAM (16 GB+ is better for larger models), (2) Docker and Docker Compose, and (3) enough disk space for model files (often 4–20 GB depending on the model). If you plan to access the UI from another device, ensure your firewall allows inbound TCP 3000.
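For example, if your server uses UFW (common on Ubuntu), a rule like the following opens the Open WebUI port; adapt it if your distribution uses firewalld or raw nftables instead:

sudo ufw allow 3000/tcp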

Step 1: Install Docker and Docker Compose

If Docker is not installed yet, you can install it on Ubuntu from the distribution repositories. On Ubuntu 22.04 and later, the Compose v2 plugin is packaged as docker-compose-v2. Run:

sudo apt update
sudo apt install -y docker.io docker-compose-v2

Enable and start Docker:

sudo systemctl enable --now docker
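To confirm that both the engine and the Compose plugin are available, check their versions:

docker --version
docker compose version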

Optional but recommended: allow your user to run Docker without sudo (log out and back in after):

sudo usermod -aG docker $USER
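If you want the new group membership to take effect in your current shell without logging out, newgrp is a convenient stopgap, and docker info confirms you can reach the daemon without sudo:

newgrp docker
docker info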

Step 2: Create a Project Folder

Create a dedicated directory so your configuration and persistent volumes stay organized:

mkdir -p ~/local-ai-webui
cd ~/local-ai-webui

Step 3: Create a Docker Compose File

Create a file named docker-compose.yml and paste the following. This configuration runs Ollama and Open WebUI, and stores data in Docker volumes so updates do not wipe your models or settings:

nano docker-compose.yml

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - openwebui:/app/backend/data
    depends_on:
      - ollama

volumes:
  ollama:
  openwebui:

Save and exit. The key line is OLLAMA_BASE_URL, which tells Open WebUI how to reach the Ollama container internally.
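If you want to catch indentation or syntax mistakes before launching anything, Compose can parse and render the file for you:

docker compose config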

Step 4: Start the Services

Launch everything in the background:

docker compose up -d

Check container status:

docker ps

You should see both ollama and open-webui running.
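You can also confirm that the Ollama API answers from the host. The root endpoint should report that Ollama is running, and /api/tags lists installed models (the list will be empty until you pull one in the next step):

curl http://localhost:11434/
curl http://localhost:11434/api/tags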

Step 5: Download a Model

Ollama doesn’t ship with a model by default. Pull one that fits your hardware. For a solid balance of speed and quality, many users start with a 7–8B class model. Run:

docker exec -it ollama ollama pull llama3.1:8b

If disk space is tight, choose a smaller model. If you have more RAM and want higher quality, try a larger model, but expect slower responses on CPU-only systems.
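As a reference point, the commands below list what is already installed and pull two alternatives: a smaller general-purpose model and a coding-tuned one. Model tags change over time, so treat these as examples and check the Ollama library for current options:

docker exec -it ollama ollama list
docker exec -it ollama ollama pull llama3.2:3b
docker exec -it ollama ollama pull qwen2.5-coder:7b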

Step 6: Open the Web Interface

In your browser, open:

http://localhost:3000

If you’re on a remote server, use:

http://SERVER_IP:3000

On first launch, Open WebUI typically asks you to create an admin account. After logging in, select your downloaded model (for example, llama3.1:8b) and send a test prompt like “Explain this Bash one-liner” or “Refactor this Python function for readability.”
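As an additional sanity check that bypasses the browser entirely, you can send a one-off prompt to the model through the Ollama CLI inside the container:

docker exec -it ollama ollama run llama3.1:8b "Explain what set -euo pipefail does in a Bash script"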

Step 7: Basic Troubleshooting

Open WebUI loads but no models appear: Confirm the model exists inside Ollama with docker exec -it ollama ollama list. If the list is empty, re-run the pull command.

Connection errors to Ollama: Verify the containers are on the same Docker network (Compose does this automatically) and that OLLAMA_BASE_URL points to http://ollama:11434, not localhost.

Slow responses: Local inference is hardware-dependent. Smaller models respond faster. Also check CPU and memory usage with docker stats. If the system is swapping heavily, reduce model size.
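If none of the above explains the problem, the container logs usually will. For example:

docker logs --tail 50 open-webui
docker logs --tail 50 ollama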

Step 8: Updating Safely

To update to newer images without losing data, run:

docker compose pull
docker compose up -d

Because models and settings are stored in volumes, your downloaded models and WebUI configuration should remain intact.
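If disk usage creeps up after a few updates, you can remove the superseded image layers; this only deletes images that no container references and does not touch your volumes:

docker image prune -f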

Conclusion

A local AI assistant is a practical upgrade for developers and IT teams who want speed, privacy, and control. With Ollama handling model execution and Open WebUI providing a friendly interface, you can run a capable coding helper on a single Linux host and keep your prompts and snippets in-house. Once this baseline is working, consider placing it behind a reverse proxy, enabling HTTPS, or restricting access to your LAN for a more production-ready setup.
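For the LAN-only option, one approach with UFW is to replace the broad port rule from the prerequisites with one scoped to your local subnet (the 192.168.1.0/24 range below is a placeholder; substitute your own network):

sudo ufw delete allow 3000/tcp
sudo ufw allow from 192.168.1.0/24 to any port 3000 proto tcp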
