How to Build a Private AI Assistant with Ollama and Open WebUI on Ubuntu (No Cloud Required)

Running an AI assistant locally is one of the most practical “advanced” upgrades you can make to a Linux workstation or home lab. You get faster iteration, more privacy, and you avoid sending sensitive text to a third-party cloud service. In this tutorial, you’ll install Ollama (a lightweight local LLM runtime) and Open WebUI (a clean web interface) on Ubuntu, then secure access and confirm everything is working.

This setup is great for a personal helpdesk bot, drafting and summarizing documents, generating scripts, or building an internal knowledge tool. It works on CPU-only systems, but performance improves significantly if you have a modern GPU. The steps below focus on a reliable, repeatable install that you can maintain like any other server service.

Prerequisites

Before you start, make sure you have: Ubuntu 22.04/24.04 (or a compatible Debian-based distro), a user with sudo permissions, at least 8 GB RAM (16 GB recommended for larger models), and roughly 15–30 GB of free disk space depending on which models you download.
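A quick way to check your hardware against these numbers before you start (free -h shows memory, df -h / shows free space on the root filesystem, nproc shows CPU cores):

free -h && df -h / && nproc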

Step 1: Update the system

Open a terminal and update your packages to avoid dependency issues:

sudo apt update && sudo apt -y upgrade

Step 2: Install Ollama

Ollama provides a simple command-line experience for downloading and running models. Install it using the official installer:

curl -fsSL https://ollama.com/install.sh | sudo sh

After installation, confirm the service is running:

systemctl status ollama

If it’s not active, start and enable it:

sudo systemctl enable --now ollama
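You can also confirm the command-line client is installed and on your PATH:

ollama --version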

Step 3: Download and test a model

Now pull a model. A good starting point is a general-purpose chat model to verify your environment before you experiment with larger ones (the default llama3.1 tag pulls the 8B variant, a download of a few gigabytes):

ollama pull llama3.1

Then run a quick test:

ollama run llama3.1

Type a prompt like: “Write a bash one-liner to list the 10 largest files in a directory.” If you get a response, the core engine is working. Type /bye to exit the interactive session.
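You can also test Ollama’s HTTP API directly; this is the same endpoint Open WebUI will talk to later. A JSON reply containing a "response" field confirms the API is reachable:

curl http://127.0.0.1:11434/api/generate -d '{"model": "llama3.1", "prompt": "Say hello in one short sentence.", "stream": false}'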

Step 4: Install Docker (recommended for Open WebUI)

Open WebUI is easiest to deploy in a container. Install Docker using Ubuntu packages:

sudo apt install -y docker.io

Enable the Docker service:

sudo systemctl enable --now docker

Optional but convenient: allow your user to run Docker without sudo (log out and back in after this):

sudo usermod -aG docker $USER
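After logging back in, confirm Docker works without sudo by running the tiny test image:

docker run --rm hello-world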

Step 5: Run Open WebUI and connect it to Ollama

Run the container, publish the UI on local port 3000, and mount a named volume so settings and chat history survive restarts. The --add-host flag maps host.docker.internal to the Docker host so the container can reach Ollama:

docker run -d --name open-webui --restart unless-stopped -p 3000:8080 --add-host=host.docker.internal:host-gateway -e OLLAMA_BASE_URL=http://host.docker.internal:11434 -v open-webui:/app/backend/data ghcr.io/open-webui/open-webui:main

On Linux, host.docker.internal only resolves because of the --add-host flag above, and the container can still fail to reach Ollama if the Ollama service is listening only on 127.0.0.1 (the default). If you open the web UI and it cannot reach Ollama, rerun the container using host networking instead:

docker rm -f open-webui

docker run -d --name open-webui --restart unless-stopped --network=host -e OLLAMA_BASE_URL=http://127.0.0.1:11434 -v open-webui:/app/backend/data ghcr.io/open-webui/open-webui:main

Now open your browser and go to http://localhost:3000 (or the server IP with port 3000). Create the first admin account. In most cases, Open WebUI will automatically detect the Ollama endpoint once the environment variable is set.
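If the page doesn’t load, check that the container is running and that the port is answering locally:

docker ps --filter name=open-webui

curl -I http://localhost:3000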

Step 6: Basic security hardening (don’t skip this)

If this is only for your local machine, binding to localhost is usually enough. If you plan to access it from other devices, you should secure it properly. At a minimum, configure the firewall to allow only trusted networks.

With UFW, you can allow port 3000 only from your LAN (example uses 192.168.1.0/24):

sudo ufw allow from 192.168.1.0/24 to any port 3000 proto tcp
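If you administer this server over SSH, also allow SSH before turning the firewall on so the next command doesn’t lock you out (use sudo ufw allow 22/tcp if the OpenSSH profile isn’t listed by sudo ufw app list):

sudo ufw allow OpenSSH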

sudo ufw enable

For a more professional setup, place Open WebUI behind Nginx with HTTPS (Let’s Encrypt) and optionally basic auth or SSO. That way, you’re not exposing a plain HTTP admin login to the network.
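As a rough sketch, assuming a hypothetical hostname ai.example.lan and certificates already issued by certbot for that name, the Nginx server block would look something like this (Open WebUI streams responses over WebSockets, hence the Upgrade headers):

server {
    listen 443 ssl;
    server_name ai.example.lan;  # placeholder hostname for this sketch

    # Certificate paths as issued by certbot for this hostname
    ssl_certificate     /etc/letsencrypt/live/ai.example.lan/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/ai.example.lan/privkey.pem;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_http_version 1.1;
        # Required for WebSocket connections used by the chat UI
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}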

Step 7: Troubleshooting common issues

Open WebUI can’t see any models: confirm Ollama is running and reachable. Test locally with curl http://127.0.0.1:11434, which should return “Ollama is running”. If the container can’t reach the host, switch to --network=host as shown above.
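The container logs usually show which Ollama URL it tried to reach and why the connection failed:

docker logs open-webui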

Model downloads are slow or fail: try again later or switch networks. Large models are multi-GB downloads. Ensure you have enough disk space under /usr/share/ollama (or your configured storage path).
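To see which models are already downloaded and how much space is left on the model store (the path below assumes the default service install):

ollama list

df -h /usr/share/ollama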

High CPU/RAM usage: use a smaller model, reduce parallel usage, or move the server to a machine with more memory. For older hardware, smaller models typically feel much more responsive.
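For example, at the time of writing the Ollama library includes much lighter general-purpose models than the 8B default, such as the 3B llama3.2 variant:

ollama pull llama3.2:3b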

Final check

At this point you have a private AI assistant running entirely on your own Ubuntu system. Ollama handles the model runtime, and Open WebUI provides an easy interface for chat, prompt testing, and daily use. Once you’re comfortable with the basics, you can explore model choices, system prompts for a “helpdesk” personality, and integrating local documents for internal Q&A workflows.
