Overview
This tutorial shows how to self-host Open WebUI with Ollama on Ubuntu using Docker Compose. You will get a clean, repeatable setup that runs on CPU or GPU, stores model data persistently, and can be upgraded with a single command. Open WebUI provides a modern interface, while Ollama runs local large language models such as Llama 3, Mistral, Phi-3, and CodeLlama.
Prerequisites
You will need a fresh Ubuntu 22.04 or 24.04 server (cloud VM or local machine), a user with sudo rights, and at least 8 GB of RAM. If you plan to use a GPU, this guide covers NVIDIA cards. Open port 3000 (Web UI) on your firewall or security group, and open 11434 (Ollama API) only if clients other than Open WebUI need to reach the API directly.
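If you are unsure about the hardware, a quick check before you start can save time; these are standard Ubuntu commands:
free -h    # total and available RAM
df -h /    # free disk space (quantized 7B models typically need several GB each)
nproc      # CPU cores available for inference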
1) Install Docker and Compose
Install the official Docker Engine and the Compose plugin on Ubuntu. Log out and back in (or run newgrp) after adding your user to the docker group.
sudo apt update
sudo apt install -y ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] \
https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo $VERSION_CODENAME) stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo usermod -aG docker $USER
newgrp docker
docker --version
docker compose version
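To confirm that the daemon is running and that your group membership took effect, you can run the standard hello-world image; it prints a short success message and exits:
docker run --rm hello-world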
2) Optional: Enable NVIDIA GPU Support
If you have an NVIDIA GPU, install the NVIDIA driver and the NVIDIA Container Toolkit. This lets Ollama use your GPU for faster inference.
sudo apt install -y ubuntu-drivers-common
sudo ubuntu-drivers autoinstall
sudo reboot
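When the server comes back up, you can first confirm the driver loaded; nvidia-smi should list your GPU:
nvidia-smi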
After the reboot, install the container toolkit and configure Docker:
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit.gpg
curl -fsSL https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update
sudo apt install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
Verify GPU access with Docker:
docker run --rm --gpus all nvidia/cuda:12.4.0-base-ubuntu22.04 nvidia-smi
3) Create the Docker Compose Stack
Create a project directory and a Compose file that defines two services: Ollama (LLM runtime) and Open WebUI (frontend). The volumes preserve your models and settings across restarts.
mkdir -p ~/openwebui-ollama
cd ~/openwebui-ollama
nano docker-compose.yml
Paste the following content and save:
version: "3.9"
services:
ollama:
image: ollama/ollama:latest
container_name: ollama
restart: unless-stopped
ports:
- "11434:11434"
volumes:
- ollama:/root/.ollama
environment:
- OLLAMA_KEEP_ALIVE=24h
# Uncomment the next line if you have GPU configured:
# gpus: all
open-webui:
image: ghcr.io/open-webui/open-webui:latest
container_name: open-webui
depends_on:
- ollama
restart: unless-stopped
ports:
- "3000:8080"
environment:
- OLLAMA_BASE_URL=http://ollama:11434
volumes:
- open-webui:/app/backend/data
volumes:
ollama:
open-webui:
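The short gpus: all syntax requires a recent Docker Compose release. If your Compose version rejects it, the equivalent device reservation block, added under the ollama service, is a common alternative:
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]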
4) Start the Services and Download a Model
Bring the stack online. The first start will download container images.
docker compose up -d
docker compose ps
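If you want to watch the containers come up, following the logs is convenient (press Ctrl+C to stop following; the services keep running in the background):
docker compose logs -f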
Pull a small model to test. You can add more models later.
docker exec -it ollama ollama pull llama3.2:3b
# Other options: mistral:7b, phi3:mini, qwen2.5:7b
Open your browser to http://SERVER-IP:3000. Create your account in Open WebUI; the first account created becomes the administrator. In the model selector, choose the model you pulled and send a prompt to verify everything works.
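You can also check the Ollama API directly from the command line. This example assumes you pulled llama3.2:3b as above and that port 11434 is reachable from where you run it:
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2:3b",
  "prompt": "Say hello in one sentence.",
  "stream": false
}'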
5) Persist Data and Backups
Docker volumes keep your models and UI data under /var/lib/docker/volumes. To back them up, stop the stack and archive the data directories. This ensures quick recovery after an OS reinstall or server migration.
docker compose down
sudo tar -czf ollama_data.tgz -C /var/lib/docker/volumes \
$(docker volume ls -q | grep "_ollama$")/_data
sudo tar -czf openwebui_data.tgz -C /var/lib/docker/volumes \
$(docker volume ls -q | grep "_open-webui$")/_data
docker compose up -d
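Restoring follows the same pattern in reverse. This sketch assumes the stack has been started at least once on the target machine (so the named volumes exist) and that the volume names match the ones in the archives:
docker compose down
sudo tar -xzf ollama_data.tgz -C /var/lib/docker/volumes
sudo tar -xzf openwebui_data.tgz -C /var/lib/docker/volumes
docker compose up -d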
6) Secure and Publish (Optional)
If you expose the service on the internet, put it behind a reverse proxy with HTTPS (Caddy, Nginx, or Traefik) and set strong authentication in Open WebUI. Use a DNS name, issue a TLS certificate (Let’s Encrypt), and restrict access with IP allowlists or an identity provider. For small teams, consider running it only on a private network or VPN.
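As a minimal example, a Caddyfile like the following proxies a hypothetical chat.example.com to the Web UI and obtains a Let's Encrypt certificate automatically (adjust the domain, point its DNS record at the server, and make sure ports 80 and 443 are open):
chat.example.com {
    reverse_proxy localhost:3000
}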
7) Update and Maintenance
Update to the newest images regularly. This pulls security updates, new UI features, and performance improvements.
cd ~/openwebui-ollama
docker compose pull
docker compose up -d
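Repeated pulls leave old, now-untagged image layers behind; after confirming the new containers run, you can reclaim the disk space:
docker image prune -f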
To update models to the latest quantizations or fixes, re-pull them in Ollama. You can remove old ones you no longer need.
docker exec -it ollama ollama pull mistral:7b
docker exec -it ollama ollama list
docker exec -it ollama ollama rm modelname:tag
Troubleshooting
Port already in use: Change the host port mappings in docker-compose.yml (for example, 3001:8080 or 11435:11434) and restart.
GPU not detected: Verify nvidia-smi works on the host and in a test container. Ensure the gpus: all line is uncommented and Docker was restarted after installing the NVIDIA Toolkit.
Slow or failed model pulls: Models can be large. Check disk space (df -h), network speed, and try a smaller model first. You can also mirror models by pre-downloading on another machine and copying the volume data.
Permission errors: Ensure your user is in the docker group (check with id) and that you have logged out and back in, or run newgrp docker.
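For anything not covered above, the container logs are the first place to look:
docker compose logs --tail=100 ollama
docker compose logs --tail=100 open-webui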
What You Achieved
You now have a production-friendly, self-hosted AI chat stack powered by Open WebUI and Ollama. With Docker Compose, you can start, stop, back up, and upgrade the entire setup with a couple of commands. Add or swap models as your use cases evolve—coding assistants, knowledge chat, or creative writing—while keeping your data local and under your control.