Running an AI assistant locally is no longer just a hobby project. With today’s open models and lightweight runtimes, you can build a private “coding helper” that works even when the internet is down, keeps your prompts off third-party servers, and still feels fast enough for daily use. In this tutorial, you’ll install Ollama (a local LLM runtime) and Open WebUI (a clean web interface) on a Linux machine, then connect them and load a practical code-focused model.
This setup is ideal for admins, developers, and helpdesk teams who want quick answers for scripting, log parsing, config explanations, and command-line guidance without sending internal details to a cloud AI provider.
What You’ll Need
Requirements: A modern Linux distro (Ubuntu/Debian/Fedora are all fine), at least 8 GB RAM (16 GB recommended), and 15–30 GB free disk space depending on the model. CPU-only is supported; if you have a compatible GPU, responses may be faster, but it is not required for a functional installation.
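If you're not sure how your machine measures up, a quick check with standard tools will tell you (these commands only read system information and change nothing):
Commands (optional check):
free -h
df -h /
nproc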
Step 1: Install Ollama
Ollama provides a simple way to download and run large language models locally. On most distributions, the fastest method is the official install script. Open a terminal and run:
Command:
curl -fsSL https://ollama.com/install.sh | sh
After installation, confirm the service is available:
Command:
ollama --version
If you’re on a systemd-based distro, Ollama usually starts automatically. If not, start it manually or check the service status:
Commands:
sudo systemctl status ollama
sudo systemctl enable --now ollama
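If the service refuses to start, the systemd journal usually shows why:
Command:
sudo journalctl -u ollama --no-pager -n 50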
Step 2: Download a Coding-Friendly Model
You can choose different models depending on your hardware. For a balanced setup on a typical workstation, start with a mid-sized (roughly 7B-parameter) instruction-tuned model. For coding tasks, many users prefer models tuned for code completion and explanation.
Download a model with:
Example command:
ollama pull deepseek-coder:latest
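The :latest tag can be large. If your hardware is limited, a smaller variant may serve you better; tag names change over time, so verify them on the model's page at ollama.com/library before pulling. A smaller pull typically looks like this:
Example command (verify the tag first):
ollama pull deepseek-coder:6.7b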
Then test a quick prompt:
Example command:
ollama run deepseek-coder:latest
Type something like “Explain what a reverse proxy is in simple terms” or “Write a bash script to rotate logs.” If you get a response, the runtime is working. Type /bye to leave the interactive session.
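You can also pass a prompt directly and get a single answer without entering the interactive session, which is handy for quick sanity checks:
Example command:
ollama run deepseek-coder:latest "Write a bash script to rotate logs"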
Step 3: Install Docker (for Open WebUI)
Open WebUI is commonly deployed as a container. If Docker is not installed, install it using your distro’s package manager. On Ubuntu/Debian, this usually works:
Commands (Ubuntu/Debian example):
sudo apt update
sudo apt install -y docker.io
sudo systemctl enable --now docker
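To confirm Docker itself is working before continuing:
Commands:
docker --version
sudo docker run --rm hello-world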
To avoid typing sudo for every Docker command, add your user to the docker group (log out and back in afterward):
Command:
sudo usermod -aG docker $USER
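After logging back in, confirm the group change took effect; plain docker commands should now work without sudo:
Commands:
id -nG | grep -w docker
docker ps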
Step 4: Run Open WebUI and Connect It to Ollama
Now you’ll start Open WebUI and point it to Ollama’s API. By default, Ollama listens on http://localhost:11434. The container needs to reach the host service. On many Linux systems, you can use the host network mode to keep it simple.
Command:
docker run -d --name open-webui --restart always --network=host \
  -e OLLAMA_BASE_URL=http://127.0.0.1:11434 \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main
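Before opening the browser, it's worth confirming the container started cleanly:
Commands:
docker ps --filter name=open-webui
docker logs --tail 50 open-webui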
When it’s running, open your browser and visit:
URL: http://localhost:8080
Create an admin user when prompted. After login, Open WebUI should automatically detect available Ollama models. If you don’t see your model, check the model list from the terminal:
Command:
ollama list
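You can also ask Ollama's API directly which models it has; the /api/tags endpoint returns the local model list as JSON, and it should match what Open WebUI shows:
Command:
curl -s http://127.0.0.1:11434/api/tags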
Step 5: Make It Useful for Real Work (Practical Settings)
To get consistent, “helpdesk-ready” answers, create a default system prompt in Open WebUI that matches your environment. For example, set a short instruction like: “You are a Linux and networking assistant. Ask clarifying questions before suggesting risky commands. Provide commands with brief explanations.” This reduces accidental destructive advice and keeps responses focused.
For troubleshooting and scripting, you can also create saved prompts such as the following (a scripted take on the first one appears after the list):
• “Analyze this error log and list likely root causes in order.”
• “Suggest a safe rollback plan before applying changes.”
• “Convert this one-liner into a readable bash script with comments.”
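These prompts also work outside the browser. The sketch below sends the tail of a log file to Ollama's /api/generate endpoint for a first-pass analysis; the model name, log path, and prompt wording are placeholders to adapt, and it assumes curl and jq are installed.
Example script (analyze-log.sh, adjust to your environment):
#!/usr/bin/env bash
# analyze-log.sh - send the end of a log file to the local Ollama API (sketch)
# Usage: ./analyze-log.sh /path/to/error.log
set -euo pipefail

LOGFILE="$1"                    # log file passed as the first argument
MODEL="deepseek-coder:latest"   # any model you have pulled with ollama

PROMPT="Analyze this error log and list likely root causes in order:
$(tail -n 100 "$LOGFILE")"

# Build the JSON body with jq so the log text is escaped safely,
# send it to /api/generate, and print only the model's answer.
jq -n --arg model "$MODEL" --arg prompt "$PROMPT" \
  '{model: $model, prompt: $prompt, stream: false}' |
curl -s -X POST http://127.0.0.1:11434/api/generate -d @- |
jq -r '.response'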
Common Problems and Fixes
Open WebUI loads, but no models appear: Verify Ollama is running (systemctl status ollama), then confirm the base URL is correct and reachable. If you didn’t use --network=host, the container cannot reach the host’s Ollama through localhost, because localhost inside a container refers to the container itself, not the host.
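If you'd rather keep the container off the host network, a common alternative is to publish the port and point the container at the host gateway (Docker 20.10 or newer). Note that Ollama listens only on 127.0.0.1 by default, so this route may also require setting OLLAMA_HOST=0.0.0.0 in the Ollama service environment, which exposes the API to your network and should be firewalled accordingly. A sketch:
Commands (alternative to --network=host):
docker rm -f open-webui
docker run -d --name open-webui --restart always -p 8080:8080 \
  --add-host=host.docker.internal:host-gateway \
  -e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
  -v open-webui:/app/backend/data \
  ghcr.io/open-webui/open-webui:main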
Responses are slow: Use a smaller model, close other memory-heavy apps, and consider increasing swap. On limited hardware, a 7B model often feels far more responsive than a 13B+ model.
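If the machine has disk to spare, adding a swap file is a quick way to buy headroom (the 8G size is an example; adjust to your system):
Commands (example):
sudo fallocate -l 8G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab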
Disk space disappears quickly: Models are large. Remove unused models with ollama rm MODELNAME, and periodically review ollama list.
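To see how much space the model store takes, check both likely locations; which one applies depends on whether Ollama runs as a system service or under your own user account:
Commands:
sudo du -sh /usr/share/ollama/.ollama/models 2>/dev/null
du -sh ~/.ollama/models 2>/dev/null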
Next Steps: Hardening and Remote Access
Once everything works locally, you can place Open WebUI behind a reverse proxy (like Nginx) with HTTPS, restrict access by IP, and enable authentication. If you plan to share it with a small team, consider running it on a dedicated VM and keeping a strict update routine for the container image and the host OS.
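A simple first hardening step, before a full reverse-proxy setup, is to restrict who can reach the UI at the firewall. A sketch with ufw, assuming a trusted subnet of 192.168.1.0/24 (adjust to your network and firewall tooling):
Commands (ufw example):
sudo ufw allow from 192.168.1.0/24 to any port 8080 proto tcp
sudo ufw deny 8080/tcp
sudo ufw status numbered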
With Ollama and Open WebUI, you get a clean, private AI assistant that can help with scripts, configs, and troubleshooting—without turning your internal prompts into someone else’s training data.