Running an AI assistant locally is no longer a science project. With modern open-source tools, you can host a private chatbot on your own Linux machine, keep sensitive data off third-party servers, and still get fast, high-quality responses. In this tutorial, you’ll install Ollama (a lightweight local LLM runtime) and Open WebUI (a clean web interface) to create a self-hosted AI assistant you can access from your browser.
This guide targets Ubuntu/Debian-based systems, but the same approach works on many other Linux distributions with small adjustments. The setup is great for IT documentation drafting, code review, internal knowledge-base Q&A, and quick command-line help—without sending prompts to the cloud.
What You’ll Build
By the end, you will have:
1) Ollama installed and running as a local service
2) A model downloaded and ready to use (for example, Llama 3.x class models)
3) Open WebUI running in Docker, connected to Ollama
4) Optional remote access for your LAN with basic safety notes
Prerequisites
Before you start, make sure you have:
• A Linux server or workstation (8 GB RAM minimum; 16 GB+ recommended)
• At least 15–30 GB free disk space (model files can be large)
• Sudo access
• Docker installed (for Open WebUI)
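If you want to verify these requirements before installing anything, a quick shell check might look like the following (thresholds taken from the list above; `/proc/meminfo` and GNU `df` assume a typical Linux system):

```shell
#!/bin/sh
# Rough prerequisite check: 8 GB RAM minimum, ~15 GB free disk on /.
ram_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
disk_kb=$(df -k --output=avail / | tail -1 | tr -d ' ')

[ "$ram_kb" -ge $((8 * 1024 * 1024)) ] \
    && echo "RAM: OK" || echo "RAM: below the 8 GB minimum"
[ "$disk_kb" -ge $((15 * 1024 * 1024)) ] \
    && echo "Disk: OK" || echo "Disk: under 15 GB free"
command -v docker >/dev/null \
    && echo "Docker: found" || echo "Docker: missing (see Step 3)"
```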
Step 1: Install Ollama on Linux
Ollama provides a simple way to download and run local models. Install it with the official script:
Command:
curl -fsSL https://ollama.com/install.sh | sh
After installation, confirm the binary is on your PATH:
ollama --version
If your system uses systemd, Ollama typically runs as a service. You can also test it by listing models (it will likely be empty at first):
ollama list
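On systemd distros, the install script typically registers Ollama as a service named `ollama`. A quick status check (the unit name is an assumption; adjust it if yours differs, and the check degrades gracefully on systems without systemd):

```shell
# Check whether the (assumed) "ollama" systemd unit is running.
if systemctl is-active --quiet ollama 2>/dev/null; then
    state="active"
else
    state="inactive or not a systemd service"
fi
echo "ollama service: $state"
# If inactive, try: sudo systemctl enable --now ollama
```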
Step 2: Pull a Model and Test It
Now download a model. A common starting point is a Llama-family instruct model. Pull it using:
ollama pull llama3
Once the download completes, run a quick interactive test:
ollama run llama3
Type a short question (for example, “Explain systemd targets in simple terms”) and confirm you get a response. If this works, your local AI runtime is ready.
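Ollama also listens on an HTTP API (port 11434 by default), which is exactly what Open WebUI will talk to in Step 4. You can smoke-test that API non-interactively with a one-shot generation request; this assumes the `llama3` model pulled above, and the fallback message is only there for when the server isn't reachable:

```shell
# One-shot generation request against the local Ollama API.
resp=$(curl -s http://127.0.0.1:11434/api/generate \
    -d '{"model": "llama3", "prompt": "Say hello in one word.", "stream": false}' \
    || echo '{"error": "Ollama API not reachable on 127.0.0.1:11434"}')
echo "$resp"
```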
Step 3: Install Docker (If Needed)
If Docker is not installed yet, install it on Ubuntu/Debian with:
sudo apt update && sudo apt install -y docker.io
Enable and start Docker:
sudo systemctl enable --now docker
Optional but useful: allow your user to run Docker without sudo (log out and back in after this):
sudo usermod -aG docker $USER
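Group membership only applies to new login sessions, which is why the log-out step matters. A small check to confirm the change has taken effect in your current session:

```shell
# Does the current session already include the docker group?
if id -nG | grep -qw docker; then
    docker_group="active"
else
    docker_group="not active yet (log out and back in)"
fi
echo "docker group: $docker_group"
```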
Step 4: Run Open WebUI and Connect It to Ollama
Open WebUI provides a friendly ChatGPT-like interface and supports Ollama as a backend. Start it with Docker:
docker run -d --name open-webui \
  -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  -v open-webui:/app/backend/data \
  --restart unless-stopped \
  ghcr.io/open-webui/open-webui:main
On Linux, host.docker.internal is not resolvable inside containers by default. Recent Docker versions (20.10+) can map it by adding --add-host=host.docker.internal:host-gateway to the docker run command. If the WebUI still can't connect, remove the container and rerun it in host network mode:
docker rm -f open-webui
docker run -d --name open-webui \
  --network=host \
  -e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
  -v open-webui:/app/backend/data \
  --restart unless-stopped \
  ghcr.io/open-webui/open-webui:main
Note that with --network=host, port mappings are ignored and Open WebUI listens directly on port 8080, so browse to http://localhost:8080 instead of port 3000.
Open your browser and go to:
http://localhost:3000
Create the admin account when prompted. After login, you should see your Ollama model available in the model selector. Start a chat and confirm it responds.
Step 5: Make It Usable on Your Local Network (Optional)
If you want to access the WebUI from another device on your LAN, ensure the server firewall allows TCP port 3000. On Ubuntu with UFW:
sudo ufw allow 3000/tcp
Then browse to:
http://YOUR_SERVER_IP:3000
Security note: Don’t expose this directly to the internet without authentication and TLS. If you need remote access, put it behind a VPN (WireGuard is a solid choice) or a reverse proxy with HTTPS.
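If you go the reverse-proxy route, a minimal nginx sketch looks like the one below. The hostname and certificate paths are placeholders you'd replace with your own; the WebSocket headers matter because Open WebUI streams chat responses over a persistent connection:

```nginx
# /etc/nginx/sites-available/open-webui  (hypothetical name and paths)
server {
    listen 443 ssl;
    server_name ai.example.lan;                          # placeholder hostname

    ssl_certificate     /etc/ssl/certs/ai.example.lan.pem;    # your cert
    ssl_certificate_key /etc/ssl/private/ai.example.lan.key;  # your key

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        # WebSocket support for streaming chat responses
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```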
Troubleshooting Tips
WebUI loads but no models appear: Verify Ollama is running and reachable. On the host, test: curl http://127.0.0.1:11434 — it should respond with "Ollama is running". If Docker networking is the issue, use the --network=host method.
Slow responses: Try a smaller model, close heavy applications, or run on a machine with more RAM/CPU. Local LLM performance is mostly hardware-dependent.
Disk fills up quickly: Models can consume many gigabytes. Remove unused models with: ollama list then ollama rm MODELNAME.
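To see how much space models are actually taking before removing any, you can inspect the model directory directly. The paths below are the usual defaults (user-level install vs. the systemd service install) and may differ on your system:

```shell
# Typical Ollama model storage locations; report whichever exist.
for dir in "$HOME/.ollama/models" /usr/share/ollama/.ollama/models; do
    if [ -d "$dir" ]; then
        du -sh "$dir"
    else
        echo "not present: $dir"
    fi
done
```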
Wrap-Up
With Ollama and Open WebUI, you can run a capable private AI assistant on Linux in under an hour. It’s an excellent setup for IT pros, developers, and small teams who want AI features without cloud costs or privacy concerns. Once it’s working, you can experiment with different models, create prompt presets, and build a local workflow that feels like a modern AI platform—fully under your control.