Run a Local AI Assistant on Windows 11: Install Ollama and Open WebUI with Optional GPU Acceleration

Overview

This step-by-step guide shows you how to run a local AI assistant on Windows 11 using Ollama and Open WebUI. You will install Ollama, download a model, and connect a user-friendly web interface via Docker. The tutorial is beginner-friendly yet covers advanced options like GPU acceleration, authentication, and storage tuning. By the end, you will have a private, fast, and offline-capable AI setup on your own PC.

Prerequisites

Before you start, make sure you have: Windows 11 (22H2 or newer), administrator rights, and at least 8 GB RAM (16 GB is more comfortable for 7B-class models). For GPU acceleration, install the latest graphics driver. Ollama uses CUDA on NVIDIA GPUs and ROCm on supported AMD GPUs; GPU use is automatic when a compatible driver is present, and it falls back to the CPU otherwise. Ollama runs natively on Windows, so you do not need to set up WSL yourself (Docker Desktop manages its own WSL 2 backend). Docker Desktop is required only if you want the Open WebUI interface from Step 3 onward.
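If you want to double-check the hardware before installing anything, two quick PowerShell commands cover it. The first reports installed RAM; the second applies only if you have an NVIDIA card (nvidia-smi ships with the NVIDIA driver and will not exist otherwise):
Get-CimInstance Win32_ComputerSystem | Select-Object @{n='RAM_GB';e={[math]::Round($_.TotalPhysicalMemory/1GB)}}
nvidia-smi
If nvidia-smi lists your GPU and a driver version, Ollama will be able to use it without any further configuration.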

Step 1 — Install Ollama for Windows

1) Download the official installer from https://ollama.com/download and complete the setup.
2) Open PowerShell and verify the installation: ollama --version.
3) On Windows, the installer starts Ollama in the background automatically; look for the llama icon in the system tray. If it is not running, start it from the Start menu, or run ollama serve in a PowerShell window and leave that window open (closing it stops the server).
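As a quick sanity check that the server's HTTP API is reachable (Ollama listens on port 11434 by default), query the version endpoint from PowerShell:
Invoke-RestMethod http://localhost:11434/api/version
A small JSON object with the installed version means the service is up and Step 2 will work.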

Step 2 — Pull and test a model

1) In PowerShell, download a model. For a good balance of speed and quality, try: ollama pull llama3.
2) Run it interactively: ollama run llama3, then ask a question such as "What can you do?"
3) Exit the session with /bye when finished. Models are stored locally in %USERPROFILE%\.ollama\models by default.
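A few everyday commands help you keep track of what is on disk; llama3 here is just the model pulled above, so substitute whatever you downloaded:
ollama list
ollama show llama3
ollama rm llama3
list prints every local model with its size and modification date, show displays a model's details such as parameters and template, and rm deletes a model you no longer need.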

Step 3 — Install Docker Desktop (for Open WebUI)

Open WebUI gives you a clean web interface for prompts, chat history, and multi-model workflows. Install Docker Desktop from https://www.docker.com/products/docker-desktop/ and start it. Ensure the Docker engine is running (the whale icon should be active in the system tray).
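Two quick checks confirm the engine is actually up before you launch anything; both should succeed once Docker Desktop has finished starting:
docker version
docker run --rm hello-world
docker version prints both client and server details (the Server section is missing when the engine is stopped), and hello-world pulls and runs a tiny test container, then removes it.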

Step 4 — Launch Open WebUI linked to Ollama

Run the following Docker command in PowerShell to start Open WebUI and connect it to your local Ollama instance exposed at http://localhost:11434:
docker run -d --name open-webui -p 3000:8080 -e OLLAMA_BASE_URL=http://host.docker.internal:11434 -v open-webui:/app/backend/data ghcr.io/open-webui/open-webui:main
Once the container is healthy, open http://localhost:3000 in your browser. On your first visit you will be asked to create an account; that first account becomes the administrator. Choose a model (for example, llama3) and start chatting.
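If the page does not load right away, check that the container started cleanly; the logs usually make the cause obvious:
docker ps --filter name=open-webui
docker logs -f open-webui
The first command shows whether the container is running and which ports are mapped; the second follows its log output (press Ctrl+C to stop following). Open WebUI can take a minute to finish its first startup.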

Optional — Manage authentication for Open WebUI

Open WebUI enables authentication by default (the WEBUI_AUTH variable defaults to true), so there is nothing extra to pass: the first account you register on the sign-up screen becomes the administrator, and additional users can be managed from the admin panel. If you run the UI only for yourself on a trusted machine and would rather skip the login entirely, recreate the container with authentication disabled:
docker rm -f open-webui
docker run -d --name open-webui -p 3000:8080 -e OLLAMA_BASE_URL=http://host.docker.internal:11434 -e WEBUI_AUTH=False -v open-webui:/app/backend/data ghcr.io/open-webui/open-webui:main
Your chats and settings survive the recreate because they live in the open-webui named volume. Note that Open WebUI may refuse to switch to the no-login mode once accounts already exist, in which case you would need to start from a fresh volume. Visit http://localhost:3000 again to confirm the behavior you expect.

Optional — GPU acceleration tips

Ollama uses your GPU automatically when a supported driver is present; there is nothing to enable. If you have an NVIDIA GPU, make sure the latest Game Ready or Studio driver is installed (CUDA support ships with the driver). On AMD, Ollama uses ROCm on supported Radeon cards, so keep your driver and Windows up to date; unsupported GPUs simply fall back to the CPU. The first prompt after starting a model is slower because the weights are being loaded into RAM and VRAM; later prompts are much faster while the model stays loaded.
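To confirm a model is actually being offloaded to the GPU, load it and then list the running models; the PROCESSOR column shows the CPU/GPU split. On NVIDIA you can also watch VRAM usage with nvidia-smi while the model answers a prompt:
ollama run llama3 "Say hello."
ollama ps
nvidia-smi
If ollama ps reports something like 100% GPU, you are getting full acceleration; a CPU/GPU mix usually means the model does not fit entirely in VRAM, and a smaller model or quantization will be faster.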

Optional — Move the models folder to another drive

If you want models on a larger drive, quit Ollama from the system tray, set the OLLAMA_MODELS environment variable (for example, setx OLLAMA_MODELS "D:\Ollama\Models"), and move the existing folder from %USERPROFILE%\.ollama\models to the new location to avoid re-downloading large files. Then start Ollama again.
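A minimal sequence in PowerShell, assuming D:\Ollama\Models is the destination you want (adjust the path to taste); quit Ollama from the tray icon first so no files are locked:
setx OLLAMA_MODELS "D:\Ollama\Models"
New-Item -ItemType Directory -Force "D:\Ollama\Models" | Out-Null
robocopy "$env:USERPROFILE\.ollama\models" "D:\Ollama\Models" /E /MOVE
Open a new PowerShell window afterwards (setx only affects newly started processes), start Ollama again, and run ollama list to confirm your models are still visible.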

Troubleshooting

Open WebUI cannot connect to Ollama: Make sure Ollama is running: curl http://localhost:11434/api/tags should return a JSON list of models (in Windows PowerShell, curl may resolve to the Invoke-WebRequest alias, in which case the JSON sits in the Content property; Invoke-RestMethod http://localhost:11434/api/tags prints it directly). If it works on the host but not in Docker, confirm the container uses host.docker.internal and port 11434 as shown in the command. Also check Windows Firewall for any blocked inbound rules on Docker or Ollama.
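If the API still does not respond, a quick way to see whether anything is listening on the Ollama port is PowerShell's Test-NetConnection:
Test-NetConnection localhost -Port 11434
TcpTestSucceeded : True means the server is up and the problem is more likely the container-to-host networking or the OLLAMA_BASE_URL value; False means Ollama itself is not running.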

Models are slow or fail to load: Try a smaller model first: ollama pull phi3:mini and run ollama run phi3:mini. Close heavy apps, ensure you have enough RAM/VRAM, and avoid aggressive antivirus scanning of the models folder.

Docker errors on startup: Open Docker Desktop and verify that the engine is running. If ports are already in use, change the mapping (for example, -p 3001:8080) and refresh the browser at the new address.
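If you are not sure what is already occupying port 3000, you can look up the owning process by PID before picking a new mapping (replace <PID> with the number netstat prints in its last column):
netstat -ano | findstr :3000
tasklist /FI "PID eq <PID>"
Once you know the culprit, either stop it or leave it running and remap Open WebUI to a free port as described above.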

Usage tips

Inside Open WebUI, create multiple chats per model for different tasks, enable markdown rendering, and configure system prompts for role-specific behavior. In PowerShell, you can also run one-off prompts without the UI: ollama run llama3 "Write a haiku about morning coffee." Your chat history and settings live in the open-webui named volume, so back it up regularly, as sketched below.
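Because the data lives in a named Docker volume rather than a plain folder, a simple way to back it up is to mount the volume into a throwaway container and archive its contents into your current directory; this is a generic Docker volume-backup pattern, not an Open WebUI feature:
docker run --rm -v open-webui:/data -v ${PWD}:/backup alpine tar czf /backup/open-webui-backup.tar.gz -C /data .
The resulting open-webui-backup.tar.gz can be copied anywhere and restored later by reversing the process.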

What you achieved

You now have a private, local AI assistant on Windows 11 powered by Ollama and Open WebUI. You can switch models, run fully offline, and take advantage of your GPU for faster responses. This setup is ideal for coding help, note-taking, drafting, and research without sending your data to external servers.
