Run Your Own AI Code Assistant with Ollama + Open WebUI on Linux (No Cloud Needed)

Why host a local AI assistant?

If you write scripts, manage servers, or handle helpdesk tickets, an AI assistant can speed up routine work like summarizing logs, drafting commands, or explaining configuration files. The problem is that many cloud tools send your prompts and snippets to third-party services. A local setup keeps sensitive data on your own machine, works offline, and can be tuned for your workflow.

In this tutorial, you will install Ollama (a lightweight local LLM runtime) and Open WebUI (a web interface similar to popular chat tools) on Linux. The result is a private AI assistant you can access from your browser on your LAN.

What you need

Hardware: A modern 64-bit Linux system. For acceptable performance, aim for 16 GB RAM or more. A GPU helps but is not required for basic use. Lighter models can run on CPU-only machines, including small servers.

Software: A recent Linux distribution (Ubuntu/Debian/Fedora are all fine), Docker for Open WebUI, and basic terminal access with sudo.
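Two quick checks will confirm the basics: uname -m should report a 64-bit architecture (x86_64 or aarch64), and free -h shows how much RAM is available.

uname -m

free -h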

Step 1: Install Ollama

Ollama runs the model locally and exposes an API that other tools (like Open WebUI) can call. Install it using the official script:

Command:

curl -fsSL https://ollama.com/install.sh | sh

After installation, confirm the CLI is installed:

ollama --version

On most distributions, the install script also sets Ollama up as a systemd service. To confirm it is active:

systemctl status ollama
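You can also confirm that the API itself is reachable; the root endpoint simply answers with a short "Ollama is running" message:

curl http://127.0.0.1:11434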

Step 2: Pull a model and test it

Next, download a model. If you are running CPU-only or want faster responses, start with a smaller model. Once the basics work, you can also try a code-focused model for coding help.

Example (general model):

ollama pull llama3.1
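If you mainly want coding help and your hardware can keep up, the Ollama library also carries code-tuned models; codellama is one commonly used example (model names and sizes change over time, so check the library on ollama.com for current options):

ollama pull codellama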

Run a quick prompt to confirm everything works:

ollama run llama3.1

Type a question like “Explain what journald does on Linux” and confirm you get a response. Exit the interactive session with /bye (Ctrl+D also works).
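Because Open WebUI talks to Ollama over the same HTTP API, it is also worth confirming that the API answers prompts, not just the interactive CLI. A minimal check against the generate endpoint (the reply comes back as JSON):

curl http://127.0.0.1:11434/api/generate -d '{"model": "llama3.1", "prompt": "Say hello in one sentence.", "stream": false}'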

Step 3: Install Docker (if not installed)

Open WebUI is easiest to deploy with Docker. On Ubuntu/Debian, you can install Docker like this:

sudo apt update

sudo apt install -y docker.io

sudo systemctl enable --now docker

Optional but recommended: allow your user to run Docker without sudo (log out and back in after this):

sudo usermod -aG docker $USER
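Before moving on, a quick sanity check that Docker works end to end (this pulls and runs Docker's tiny test image, then removes the container):

docker run --rm hello-world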

Step 4: Run Open WebUI and connect it to Ollama

Open WebUI will provide a clean browser interface and conversation history. The key is pointing it at Ollama’s API endpoint.

First, make sure Ollama is listening locally. By default it is typically available at http://127.0.0.1:11434. Now start Open WebUI in Docker:

docker run -d --name open-webui --restart unless-stopped -p 3000:8080 --add-host=host.docker.internal:host-gateway -e OLLAMA_BASE_URL=http://host.docker.internal:11434 -v open-webui:/app/backend/data ghcr.io/open-webui/open-webui:main

On Linux, the host.docker.internal name only resolves inside the container when the --add-host mapping shown above is supported (reasonably recent Docker versions). If the UI still cannot connect to Ollama, remove the container and rerun it with host networking instead:

docker rm -f open-webui

docker run -d --name open-webui --restart unless-stopped --network=host -e OLLAMA_BASE_URL=http://127.0.0.1:11434 -v open-webui:/app/backend/data ghcr.io/open-webui/open-webui:main

Now open your browser and visit:

http://localhost:3000

If you switched to the host-networking command, the -p port mapping no longer applies and Open WebUI listens on its default port, so use http://localhost:8080 instead.

Create the first admin user when prompted. Once logged in, you should see available Ollama models. If you do not, go to settings and verify the Ollama base URL.
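If the page does not load or the model list stays empty, the container logs usually show what went wrong:

docker logs -f open-webui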

Step 5: Enable LAN access (optional)

If you want to access the assistant from another device on your network, restrict it with firewall rules. With the port-published Docker command above, port 3000 is already reachable from other machines, so your firewall should only allow trusted subnets to connect. Note that Docker manages its own iptables rules for published ports and can bypass UFW's defaults, so verify from another device that the restriction actually holds.

For example, on Ubuntu with UFW you can allow only your local subnet (adjust the CIDR):

sudo ufw allow from 192.168.1.0/24 to any port 3000 proto tcp

Avoid exposing the service directly to the internet. If you need remote access, put it behind a VPN (WireGuard is a good choice) or a reverse proxy with authentication.
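From another device on the allowed subnet, a quick reachability check confirms the rule behaves as intended (replace 192.168.1.10 with your server's actual LAN address):

curl -I http://192.168.1.10:3000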

Troubleshooting tips

Open WebUI shows “cannot reach Ollama”: Confirm Ollama is running with systemctl status ollama. Then check connectivity from the container. If you are using port mapping, the simplest fix on Linux is often --network=host.
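If you are unsure whether Ollama is listening at all, a host-side check with ss (part of iproute2 on most distributions) shows whether anything is bound to port 11434:

ss -tlnp | grep 11434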

Model downloads are slow or fail: Verify DNS and outbound access. Large models can be tens of gigabytes. If disk space is tight, remove unused models with ollama list and ollama rm <model>.
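For example, to see which models are installed (and how much disk they use) and remove one you no longer need:

ollama list

ollama rm llama3.1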

Responses are too slow: Try a smaller model, reduce context size in settings, and close other memory-heavy applications. CPU-only systems benefit from lightweight models and shorter prompts.

Next steps: make it useful for real admin work

Once the UI is running, build a few saved prompts for your daily tasks: “Summarize this syslog excerpt,” “Write a Bash one-liner to find large files,” or “Draft a polite helpdesk reply.” Because the assistant is local, you can safely paste internal error messages, configuration snippets, or playbook fragments without sending them to a third party.

With Ollama and Open WebUI, you get a practical self-hosted AI assistant that fits nicely into a Linux admin toolbox: fast to deploy, easy to maintain, and private by design.
