Self-Host Open WebUI with Ollama on Ubuntu Using Docker Compose (GPU Optional)

Overview

This tutorial shows how to self-host Open WebUI with Ollama on Ubuntu using Docker Compose. You will get a clean, repeatable setup that runs on CPU or GPU, stores model data persistently, and can be upgraded with a single command. Open WebUI provides a modern interface, while Ollama runs local large language models such as Llama 3, Mistral, Phi-3, and CodeLlama.

Prerequisites

You will need a fresh Ubuntu 22.04 or 24.04 server (cloud VM or local machine), a user with sudo rights, and at least 8 GB of RAM. GPU acceleration is optional; this guide covers NVIDIA cards only. Open port 3000 (Web UI) on your firewall or security group, and port 11434 (Ollama API) only if remote clients need to call the Ollama API directly.
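
If the server runs ufw, the ports can be opened like this (skip this step if you manage access through your cloud provider's security group instead):

sudo ufw allow 3000/tcp     # Open WebUI
sudo ufw allow 11434/tcp    # Ollama API, only if you need direct remote access
sudo ufw status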

1) Install Docker and Compose

Install the official Docker Engine and the Compose plugin on Ubuntu. Log out and back in (or run newgrp) after adding your user to the docker group.

sudo apt update
sudo apt install -y ca-certificates curl gnupg

sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
  sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg

echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
  https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo $VERSION_CODENAME) stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

sudo usermod -aG docker $USER
newgrp docker
docker --version
docker compose version

2) Optional: Enable NVIDIA GPU Support

If you have an NVIDIA GPU, install the NVIDIA driver and the NVIDIA Container Toolkit. This lets Ollama use your GPU for faster inference.

sudo apt install -y ubuntu-drivers-common
sudo ubuntu-drivers autoinstall
sudo reboot

After the reboot, install the container toolkit and configure Docker:

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

curl -fsSL https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

sudo apt update
sudo apt install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

Verify GPU access with Docker:

docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi

3) Create the Docker Compose Stack

Create a project directory and a Compose file that defines two services: Ollama (LLM runtime) and Open WebUI (frontend). The volumes preserve your models and settings across restarts.

mkdir -p ~/openwebui-ollama
cd ~/openwebui-ollama
nano docker-compose.yml

Paste the following content and save:

version: "3.9"  # optional; Compose v2 treats this key as obsolete

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
    environment:
      - OLLAMA_KEEP_ALIVE=24h
    # Uncomment the next line to give Ollama GPU access (requires a recent
    # Compose release; a more widely supported form is shown after this file):
    # gpus: all

  open-webui:
    image: ghcr.io/open-webui/open-webui:latest
    container_name: open-webui
    depends_on:
      - ollama
    restart: unless-stopped
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open-webui:/app/backend/data

volumes:
  ollama:
  open-webui:
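
The gpus: all shortcut above is only understood by newer Docker Compose releases. If your version rejects it, the same result can be expressed as a device reservation under the ollama service; a minimal sketch of that block:

  ollama:
    # ...keep the settings shown above, then add:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]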

4) Start the Services and Download a Model

Bring the stack online. The first start will download container images.

docker compose up -d
docker compose ps

Pull a small model to test. You can add more models later.

docker exec -it ollama ollama pull llama3.2:3b
# Other options: mistral:7b, phi3:mini, qwen2.5:7b

Open your browser to http://SERVER-IP:3000 and create your account in Open WebUI; the first account created becomes the administrator. In the model selector, choose the model you pulled and send a prompt to verify everything works.
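
You can also confirm the Ollama API responds from the command line. Assuming the default port mapping from the Compose file and the model pulled above:

curl http://localhost:11434/api/tags
curl http://localhost:11434/api/generate -d '{"model": "llama3.2:3b", "prompt": "Say hello", "stream": false}'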

5) Persist Data and Backups

Docker volumes keep your models and UI data under /var/lib/docker/volumes. To back them up, stop the stack and archive the data directories. This ensures quick recovery after an OS reinstall or server migration.

docker compose down
sudo tar -czf ollama_data.tgz -C /var/lib/docker/volumes \
  $(docker volume ls -q | grep "_ollama$")/_data

sudo tar -czf openwebui_data.tgz -C /var/lib/docker/volumes \
  $(docker volume ls -q | grep "_open-webui$")/_data

docker compose up -d
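
To restore on another machine, create the stack once so the named volumes exist, stop it, unpack the archives into the volumes directory, and start it again. A sketch, assuming the same project directory name (the volume names include it):

docker compose up -d && docker compose down
sudo tar -xzf ollama_data.tgz -C /var/lib/docker/volumes
sudo tar -xzf openwebui_data.tgz -C /var/lib/docker/volumes
docker compose up -d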

6) Secure and Publish (Optional)

If you expose the service on the internet, put it behind a reverse proxy with HTTPS (Caddy, Nginx, or Traefik) and set strong authentication in Open WebUI. Use a DNS name, issue a TLS certificate (Let’s Encrypt), and restrict access with IP allowlists or an identity provider. For small teams, consider running it only on a private network or VPN.
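
As one illustration, a minimal Caddyfile for Caddy (with a hypothetical domain chat.example.com pointing at this server) proxies HTTPS traffic to the UI and obtains a Let's Encrypt certificate automatically:

chat.example.com {
    reverse_proxy localhost:3000
}

If you go this route, consider binding the UI to localhost only by changing the port mapping in docker-compose.yml to "127.0.0.1:3000:8080", so the proxy is the only public entry point.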

7) Update and Maintenance

Update to the newest images regularly. This pulls security updates, new UI features, and performance improvements.

cd ~/openwebui-ollama
docker compose pull
docker compose up -d
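
Old image versions stay on disk after a pull; if you want to reclaim the space, prune them once the updated containers are running:

docker image prune -f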

To update models to the latest quantizations or fixes, re-pull them in Ollama. You can remove old ones you no longer need.

docker exec -it ollama ollama pull mistral:7b
docker exec -it ollama ollama list
docker exec -it ollama ollama rm modelname:tag

Troubleshooting

Port already in use: Change the host port mappings in docker-compose.yml (for example, 3001:8080 or 11435:11434) and restart.
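
To see which process currently holds a port before changing the mapping, a check like this works on a stock Ubuntu install:

sudo ss -ltnp | grep -E ':3000|:11434'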

GPU not detected: Verify nvidia-smi works on the host and in a test container. Ensure the gpus: all line (or the deploy reservation block from step 3) is enabled in docker-compose.yml and that Docker was restarted after installing the NVIDIA Container Toolkit.

Slow or failed model pulls: Models can be large. Check disk space (df -h), network speed, and try a smaller model first. You can also mirror models by pre-downloading on another machine and copying the volume data.

Permission errors: Ensure your user is in the docker group (check with id) and that you have logged out and back in, or run newgrp docker.

What You Achieved

You now have a production-friendly, self-hosted AI chat stack powered by Open WebUI and Ollama. With Docker Compose, you can start, stop, back up, and upgrade the entire setup with a couple of commands. Add or swap models as your use cases evolve—coding assistants, knowledge chat, or creative writing—while keeping your data local and under your control.