Running a private AI assistant locally is becoming a practical option for developers and IT teams who want faster responses, lower cloud costs, and better control over sensitive code. In this tutorial, you will set up a self-hosted AI “code helper” on a Linux server using Ollama (for running large language models locally) and Open WebUI (a clean web interface). The result is a browser-based assistant you can use for code reviews, script generation, troubleshooting, and documentation drafts—without sending prompts to external services.
What You’ll Build
You will deploy two components: Ollama, which downloads and serves models via a local API, and Open WebUI, which connects to Ollama and provides a chat UI with conversation history. This guide uses Docker to keep the installation clean and easy to update.
Prerequisites
Before you start, prepare a Linux machine (Ubuntu 22.04/24.04, Debian 12, or similar) with at least 8 GB RAM (16 GB is better for larger models) and 20+ GB free disk. A GPU is optional, but a modern CPU works fine for smaller models. You also need Docker and Docker Compose (or the Docker Compose plugin).
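To confirm the machine meets these requirements, you can run a quick check with standard Linux tools (memory, root-filesystem free space, and CPU core count):
free -h
df -h /
nproc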
Step 1: Install Docker (Ubuntu/Debian)
If Docker is not installed, run the commands below. On other distributions, use the official Docker documentation for your package manager.
Commands:
sudo apt update
sudo apt install -y ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo $VERSION_CODENAME) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
Optional but recommended: allow your user to run Docker without sudo.
sudo usermod -aG docker $USER
Log out and back in after changing group membership.
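To verify the installation and your new group membership, run a quick test (hello-world is Docker's standard test image):
docker --version
docker compose version
docker run hello-world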
Step 2: Create a Project Directory
Create a dedicated folder for your deployment so configuration and volumes stay organized.
mkdir -p ~/private-ai && cd ~/private-ai
Step 3: Create a Docker Compose File
Create a file named docker-compose.yml with the content below. It starts Ollama and Open WebUI, stores model data on disk, and makes the web UI available on port 3000.
cat > docker-compose.yml <<'EOF'
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - openwebui:/app/backend/data
    depends_on:
      - ollama
volumes:
  ollama:
  openwebui:
EOF
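Before starting anything, you can ask Compose to parse the file and print the merged configuration; if the YAML indentation is off, this command fails with an error instead:
docker compose config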
Step 4: Start the Services
Bring the stack up in the background and confirm both containers are healthy.
docker compose up -d
docker ps
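You can also confirm the Ollama API is answering from the host. With port 11434 published as in the Compose file above, the version endpoint returns a short JSON response:
curl http://localhost:11434/api/version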
Now open your browser and go to http://YOUR_SERVER_IP:3000. In current versions of Open WebUI, the first account you create becomes the administrator.
Step 5: Download a Model with Ollama
Ollama pulls models on demand. For a lightweight, code-focused start, try a smaller model first. Run the commands below to download a model and start an interactive test session inside the Ollama container (type /bye to leave the prompt).
docker exec -it ollama ollama pull codellama:7b
docker exec -it ollama ollama run codellama:7b
If you prefer a general assistant model, you can also try:
docker exec -it ollama ollama pull llama3.1:8b
Once pulled, go back to Open WebUI, start a new chat, and select the model. Your prompts will be processed locally on your server.
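Because Ollama exposes a local REST API, you can also script against it directly from the host. For example, the first command below lists the models you have downloaded and the second sends a one-off prompt (adjust the model name to whichever model you pulled):
curl http://localhost:11434/api/tags
curl http://localhost:11434/api/generate -d '{"model": "codellama:7b", "prompt": "Write a bash one-liner that counts files in a directory", "stream": false}'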
Step 6: Basic Hardening and Access Tips
If this server is not strictly internal, place Open WebUI behind a reverse proxy such as Nginx or Caddy and enable HTTPS. At a minimum, restrict access with a firewall so only your office IP/VPN can reach port 3000. On Ubuntu with UFW, you can allow only your admin workstation and block the rest.
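If you manage this server over SSH, allow SSH before enabling UFW so the firewall does not lock you out (use sudo ufw allow 22/tcp if the OpenSSH application profile is not registered):
sudo ufw allow OpenSSH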
sudo ufw allow from YOUR_IP to any port 3000 proto tcp
sudo ufw enable
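If you take the reverse-proxy route instead, a minimal Caddyfile like the following proxies a hypothetical domain to Open WebUI and lets Caddy obtain HTTPS certificates automatically (replace ai.example.com with a DNS name that points at this server; ports 80 and 443 must be reachable for certificate issuance):
ai.example.com {
    reverse_proxy localhost:3000
}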
Troubleshooting Common Problems
Open WebUI can’t see Ollama models: confirm that the OLLAMA_BASE_URL environment variable points to http://ollama:11434 (container-to-container networking), and check the Ollama logs with docker logs ollama to verify it started and is listening on port 11434.
Slow responses: smaller models respond faster on CPU. Also check system load and RAM usage. If the machine is swapping heavily, upgrade RAM or choose a smaller model.
Disk usage grows quickly: model files are large. Keep an eye on volumes and remove unused models with docker exec -it ollama ollama list and docker exec -it ollama ollama rm MODELNAME.
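To see where the space is going, Docker can report per-volume usage (the ollama volume holds the model files):
docker system df -v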
Conclusion
With Ollama and Open WebUI, you can run a capable private AI code assistant on your own Linux server in under an hour. This setup is ideal for testing prompts safely, speeding up daily scripting tasks, and keeping sensitive code and logs under your control. Once it’s running, you can experiment with different models, tighten access via HTTPS and VPN, and even dedicate a GPU host later for faster generation.