Set Up a Local AI Coding Assistant with Ollama and Open WebUI (No Cloud Required)

Why run a local AI assistant?

If you write code or manage infrastructure, an AI assistant can speed up tasks like generating scripts, explaining logs, reviewing configs, and drafting documentation. The problem is that many tools send prompts and code to a cloud service. In regulated environments, that is not always allowed. Running an AI model locally gives you better privacy, predictable costs, and the option to keep everything inside your LAN.

In this tutorial, you will install Ollama (a lightweight local model runtime) and Open WebUI (a clean browser interface) on Linux using Docker. The result is a self-hosted, ChatGPT-like web app that talks to your local models.

What you need

Requirements: a modern Linux server or workstation (Ubuntu/Debian/Fedora are fine), at least 8 GB RAM (16 GB+ recommended), and enough disk space for models (10–30 GB is common). CPU-only works, but a supported GPU can improve speed significantly. You also need a user with sudo privileges and outbound internet access to download containers and models.

Step 1: Install Docker (and Docker Compose)

If Docker is already installed, you can skip this section. On Ubuntu/Debian, one of the simplest approaches is using the official repository packages. Install Docker Engine and enable the service. If your distribution provides Docker Compose as a plugin, you can use docker compose (with a space) instead of the older docker-compose binary.
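
One simple route, for example, is Docker's official convenience script, which configures the upstream repository and installs Docker Engine together with the Compose plugin; review the script first if your policy requires it:

curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo systemctl enable --now docker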

After installation, verify it works by running a test container. If you want to avoid using sudo for every Docker command, add your user to the docker group and log out/in once.
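
For example:

sudo docker run --rm hello-world
sudo usermod -aG docker $USER

The first command pulls and runs Docker's small test image; the second adds your current user to the docker group, which takes effect after you log out and back in.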

Step 2: Create a project folder

Create a dedicated directory for your stack so it is easy to manage and back up later. For example:

mkdir -p ~/local-ai && cd ~/local-ai

Step 3: Create a Docker Compose file for Ollama and Open WebUI

Create a file named docker-compose.yml in the folder. This setup uses persistent volumes so that downloaded models and WebUI data survive reboots and container upgrades.

nano docker-compose.yml

Paste the following:

version: "3.9"

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    depends_on:
      - ollama
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - openwebui:/app/backend/data

volumes:
  ollama:
  openwebui:

Step 4: Start the services

From the same directory, start the stack:

docker compose up -d

Check that both containers are running:

docker ps

If something fails, view logs:

docker logs -f ollama

docker logs -f open-webui

Step 5: Download a model into Ollama

Ollama pulls models on demand. A practical starting point is a small to mid-sized model that fits in your RAM. For example, you can pull a general-purpose model like:

docker exec -it ollama ollama pull llama3.2

You can list installed models at any time:

docker exec -it ollama ollama list

If you prefer a coding-focused model, search Ollama’s library and choose one that matches your hardware. The key is to start small, confirm everything works, then scale up to larger models.
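
You can also sanity-check a model from the command line before opening the browser. For example, assuming you pulled llama3.2 as shown above, run a one-off prompt directly in the container:

docker exec -it ollama ollama run llama3.2 "Summarize what 'df -h' shows in one sentence."

If this prints a sensible answer, the runtime and model are working.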

Step 6: Open WebUI in your browser

Open your browser and go to:

http://localhost:3000

If you installed this on a server, replace localhost with the server’s IP or hostname (for example, http://10.0.0.20:3000). The first time you load the page, you will create an admin account. After login, Open WebUI should detect Ollama automatically via the OLLAMA_BASE_URL setting.

Step 7: Basic usage tips (for real work)

To make the assistant useful for sysadmin and helpdesk tasks, be specific and provide context. Instead of “Fix this,” try prompts like: “Explain what this Nginx config does and suggest safer defaults” or “Write a bash script that checks disk usage and emails an alert”. When you paste logs, remove secrets and tokens, even if you are staying local.
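
To illustrate the second prompt, here is a minimal sketch of the kind of script you might end up with; the threshold and address are placeholders, and it assumes a working local mail command (for example from mailutils):

#!/usr/bin/env bash
# Alert when any real filesystem exceeds a usage threshold.
THRESHOLD=85                    # percent; placeholder value
ALERT_EMAIL="ops@example.com"   # placeholder address

df -P -x tmpfs -x devtmpfs | awk -v t="$THRESHOLD" \
  'NR > 1 { gsub("%", "", $5); if ($5 + 0 > t) print $6 " is at " $5 "%" }' |
while read -r line; do
  echo "$line" | mail -s "Disk usage alert on $(hostname)" "$ALERT_EMAIL"
done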

If responses are slow, use a smaller model, reduce parallel users, or move the stack to a machine with more RAM. Local models are sensitive to memory pressure; swapping to disk can make them feel unusable.

Troubleshooting common problems

Open WebUI loads but shows no models: confirm Ollama is reachable from the WebUI container. The environment variable should be OLLAMA_BASE_URL=http://ollama:11434. Also confirm the model is actually pulled with ollama list.
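
Assuming the default port mapping from this guide, two quick checks from the host are:

curl http://localhost:11434/api/tags
docker exec -it ollama ollama list

The first queries the Ollama API for locally installed models; the second lists them via the CLI. If both come back empty, pull a model first.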

Port already in use: if another service uses port 3000, change the mapping to something else, for example "8085:8080", then restart with docker compose up -d.
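
For example, in docker-compose.yml under the open-webui service:

    ports:
      - "8085:8080"

The interface would then be reachable at http://localhost:8085.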

Downloads are slow or failing: this is usually DNS, proxy, or firewall related. Test connectivity from the host, and check whether your environment requires an HTTP proxy for container traffic.
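
As a first sanity check, confirm the host can resolve and reach the registries involved; the hostname below is taken from the image used in this guide and is only an example:

getent hosts ghcr.io
curl -I https://ghcr.io

If these succeed but pulls still fail, check the proxy configuration for the Docker daemon and for the Ollama container itself.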

Next steps: make it production-friendly

For a lab setup, HTTP on a LAN is fine. For teams, place Open WebUI behind a reverse proxy like Nginx or Caddy, add TLS, and restrict access with SSO or at least strong passwords. Finally, back up Docker volumes so you do not lose your settings and conversation history when you migrate.
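
As an illustration, a minimal Caddyfile entry in front of Open WebUI could look like the sketch below; the hostname is a placeholder, and for a purely internal name you would add a "tls internal" line so Caddy issues a local certificate instead of requesting a public one:

ai.example.lan {
    reverse_proxy 127.0.0.1:3000
}

For backups, one common pattern is to archive the named volumes with a throwaway container; the volume name depends on your Compose project name, so confirm it first with docker volume ls:

docker run --rm -v local-ai_openwebui:/data -v "$PWD":/backup alpine tar czf /backup/openwebui-backup.tar.gz -C /data .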
