Why Run a Local AI Chatbot?
If you like using ChatGPT-style assistants but you work with sensitive data, cloud tools can be a non-starter. Running a local AI chatbot on your own Linux machine gives you control over privacy, lets you work offline, and can reduce ongoing costs. Thanks to modern lightweight model runners, you can now deploy an AI assistant in minutes without building anything from source.
In this tutorial, you will set up Ollama (a simple local LLM runtime) and Open WebUI (a clean web interface) on a Linux server. The result is a private, browser-based AI chatbot you can access on your LAN.
What You Need
System requirements: A modern Linux distribution (Ubuntu/Debian/RHEL-based), at least 8 GB RAM (16 GB recommended), and 15–30 GB of free disk space depending on the model you choose. A GPU is optional; CPU-only works, but responses may be slower.
Network requirements: If you want to access the chatbot from other devices, ensure you can reach the server over the network and that any firewall rules allow the chosen port.
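Before you start, it can help to confirm the server actually meets these requirements. On most distributions, the following commands report available memory and free disk space:

free -h
df -h /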
Step 1: Install Ollama
Ollama is the engine that downloads and runs local AI models. On most Linux systems, the fastest method is the official install script. Open a terminal and run:
curl -fsSL https://ollama.com/install.sh | sh
After installation, confirm the service is working:
ollama --version
If your system uses systemd (most servers do), Ollama typically runs as a service. You can check its status with:
systemctl status ollama
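Ollama also exposes an HTTP API, listening on port 11434 by default. For a quick sanity check that the service answers requests (assuming a local install on the default port), a curl call like this should return a short JSON response:

curl http://127.0.0.1:11434/api/version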
Step 2: Pull a Model and Test It
Next, download a model. Many users start with a smaller model, which offers a good balance between quality and speed on typical hardware. Example:
ollama pull llama3.2
Then run a quick interactive test:
ollama run llama3.2
Type a prompt, press Enter, and confirm you get a response. If the model feels slow, try a smaller one or make sure the server is not memory-constrained.
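You can also run a one-shot prompt without entering the interactive session, which is handy for scripting or quick checks, and list which models are already downloaded. The exact output will vary by model, but something like this should work:

ollama run llama3.2 "Explain what a systemd service is in one sentence."
ollama list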
Step 3: Install Open WebUI (Docker Method)
Open WebUI provides the familiar chat interface in your browser. The most reliable way to install it is using Docker, because updates are easy and dependencies stay isolated.
First, install Docker if you don’t already have it. On Ubuntu/Debian, this common approach works (adjust for your distro if needed):
sudo apt update && sudo apt install -y docker.io
sudo systemctl enable --now docker
Now start Open WebUI. The key is to point it to Ollama. If Ollama runs on the same machine, you can use host networking for simplicity:
sudo docker run -d --name open-webui --restart unless-stopped --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 ghcr.io/open-webui/open-webui:main
Open WebUI will typically be available on port 8080. From a browser on the server, test:
http://localhost:8080
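If you prefer not to use host networking, you can publish a port instead and let the container reach Ollama through Docker's host gateway. The variant below is a sketch based on the project's commonly documented Docker options; adjust the published port (3000 here) as needed:

sudo docker run -d --name open-webui --restart unless-stopped -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://host.docker.internal:11434 ghcr.io/open-webui/open-webui:main

With this approach the interface is served on port 3000 (http://localhost:3000) rather than 8080, so adjust the URLs and firewall rules in the next step accordingly.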
Step 4: Access It from Another Device (LAN)
To use the chatbot from your laptop or phone on the same network, browse to:
http://SERVER_IP:8080
If it doesn’t load, check firewall rules. On Ubuntu with UFW, you can allow the port like this:
sudo ufw allow 8080/tcp
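On RHEL-based systems that use firewalld instead of UFW, the equivalent is:

sudo firewall-cmd --permanent --add-port=8080/tcp
sudo firewall-cmd --reload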
Also confirm that Docker is running and the container is healthy:
sudo docker ps
Step 5: Add and Switch Models in the Web Interface
Once logged into Open WebUI, you can select from the models Ollama has already downloaded. If you want more choices, pull additional models on the server:
ollama pull mistral
ollama pull qwen2.5
Refresh the model list in the UI and switch models depending on your task. Smaller models respond faster; larger models can be better at reasoning and writing, but need more RAM and CPU.
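Disk space fills up quickly with multiple models. If you no longer need one, you can remove it on the server and it will disappear from the UI after a refresh. For example, assuming you want to drop the mistral model pulled above:

ollama rm mistral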
Troubleshooting Tips
Open WebUI loads, but no models appear: Verify that the OLLAMA_BASE_URL environment variable points to Ollama correctly. If you used host networking, http://127.0.0.1:11434 is usually correct. Also confirm Ollama is listening:
ss -lntp | grep 11434
Model downloads are slow: Try again off-peak, confirm DNS/network stability, and ensure you have enough disk space. Model pulls can be several gigabytes.
Responses are very slow: Check RAM usage with free -h. If the system is swapping, performance will drop sharply. Consider a smaller model or upgrading memory.
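The interface itself misbehaves (errors on login, blank pages): the container logs usually point to the cause. For example:

sudo docker logs --tail 50 open-webui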
Keep It Updated
To update Open WebUI, pull the latest container and recreate it:
sudo docker pull ghcr.io/open-webui/open-webui:main
sudo docker stop open-webui && sudo docker rm open-webui
sudo docker run -d --name open-webui --restart unless-stopped --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 ghcr.io/open-webui/open-webui:main
For Ollama, rerun the installer script occasionally or follow your distro’s recommended update path if you installed it through a package manager.
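After pulling a new image and recreating the container, the old image layers remain on disk. If space is tight, you can optionally prune unused images (note this removes all dangling images on the host, not just Open WebUI's):

sudo docker image prune -f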
Conclusion
You now have a fully local AI chatbot running on Linux with Ollama and Open WebUI. This setup is practical for internal helpdesk use, drafting documentation, summarizing logs, or experimenting with prompts without sending data to third-party services. From here, you can harden access with a reverse proxy, enable HTTPS, and standardize your model choices for your team.