← Back to Home

Complete Ollama + OpenClaw Local AI Assistant Setup: Multi-Platform Messaging Integration and Model Configuration

OllamaOpenClawlocal AIAI assistantVPSself-hosted

After running local AI on a Raspberry Pi for six months, I migrated my entire workflow to a VPS this year: Ollama for inference, OpenClaw as the messaging gateway, connected to WeChat/Telegram/QQ, running 24/7 without downtime. The standout feature of this combo is that OpenClaw has native support for Ollama as its inference backend — ollama launch openclaw connects everything in one command. This guide walks through my complete process from installation to production-ready setup, covering **Ollama deployment config**, **OpenClaw installation and init**, **multi-platform channel access**, and **model selection with cloud API fallback** — plus 3 real pitfalls I hit along the way.

Why the Ollama + OpenClaw Combo

If you want to build your own AI assistant, there are two common paths:

Route A: Cloud API only (OpenAI / Claude / MiniMax, etc.). Convenient, but you pay per conversation and your data passes through third-party servers.

Route B: Fully local deployment (Ollama + frontend UI). Your data stays private, but connecting to multiple messaging platforms requires custom code.

Ollama + OpenClaw is a third path — local inference as primary, cloud API as backup. Ollama handles model management and inference, OpenClaw bridges the AI to messaging channels (WeChat, Telegram, Discord, QQ, etc.). You can run local models at home, then automatically switch to a cloud API when you're away.

According to Ollama's official docs (verified 2026-05), ollama launch openclaw automatically handles OpenClaw installation, security setup, model selection, and gateway startup — no manual multi-line configuration required.

Step 1: Install Ollama (VPS Environment)

On a Linux VPS, there are three ways to install Ollama, in order of recommendation:

Method 1: Official install script (recommended)

curl -fsSL https://ollama.com/install.sh | sh

The script auto-detects your system environment, installs to /usr/local/bin/ollama, creates an ollama user and systemd service. After installation, start the service:

ollama serve  # Start Ollama service

Verify it's running:

curl http://localhost:11434/api/generate -d '{"model":"gemma3","prompt":"hello","stream":false}'

A JSON response containing an response field means Ollama is operational.

Method 2: Docker (for VPSes already running Docker)

docker run -d \
  --name ollama \
  -p 11434:11434 \
  -v ollama_data:/root/.ollama \
  ollama/ollama:latest

Note: Docker mode doesn't run as a systemd service by default — after a VPS reboot you'll need docker start ollama manually. For production, Method 1 is recommended.

Method 3: Manual binary download

Download the correct architecture binary from ollama.com/download, ideal for minimal Linux environments where the install script won't work:

chmod +x Ollama-linux-amd64
sudo mv Ollama-linux-amd64 /usr/local/bin/ollama

Version check:

ollama --version

At the time of writing, the current version is in the 0.5.x series. Check github.com/ollama/ollama/releases for the exact latest version.

Step 2: Pull and Manage Models

After installing Ollama, no models are included by default. Pull them manually:

Pull base models

ollama pull gemma3               # Pull Gemma 3b (good for light tasks)
ollama pull qwen3.5              # Pull Qwen 3.5b (vision support)
ollama pull deepseek-v4-flash    # DeepSeek Flash (cost-effective)

List downloaded models

ollama list

Output looks like:

NAME                    SIZE      MODIFIED
deepseek-v4-flash       4.7GB     5 hours ago
gemma3                  4.9GB     2 days ago
qwen3.5                 7.2GB     3 days ago

**Disk space note:** Model files are stored in ~/.ollama/models. If your VPS storage is tight, use ollama pull --no-cache to avoid redundant downloads.

GPU config (if VPS has NVIDIA GPU):

Ollama auto-detects NVIDIA GPUs — no extra config needed. For multi-GPU systems, specify with an environment variable:

CUDA_VISIBLE_DEVICES=0 ollama serve

On VPSes without a GPU, CPU inference is significantly slower. My tests showed qwen3.5b running at about 3-5 seconds/token on a 2-core CPU VPS.

Step 3: Install and Start OpenClaw

OpenClaw can be installed two ways: via Ollama (simplest), or manually via npm.

Method 1: Via Ollama (recommended)

ollama launch openclaw

This command runs through these steps:

1. Detects whether OpenClaw is installed; if not, installs it via npm

2. Starts the security setup wizard (first run only)

3. Prompts you to select a default model from your downloaded list

4. Auto-starts the Gateway daemon and opens the TUI

Method 2: npm manual install

npm install -g openclaw
openclaw gateway start
openclaw configure  # Interactive setup

Verify OpenClaw is running:

openclaw gateway status

Normal output should show Gateway running with the currently configured model name.

OpenClaw config file locations:

Step 4: Connect Messaging Channels (WeChat/Telegram/QQ)

OpenClaw supports multiple messaging channels simultaneously. Using Telegram and QQ as examples:

Connecting Telegram:

1. Create a bot via Telegram's BotFather (@BotFather) and get your HTTP API Token

2. Run the config command:

openclaw configure --section channels

3. Select Telegram and enter your token

Connecting QQ (QQ Channels):

openclaw configure --section channels

QQ Channel mode requires a QQ Bot Token and AppID from the QQ Open Platform.

Multiple channels online simultaneously:

OpenClaw's Gateway is designed as a unified entry point for multi-channel messaging. Messages from different channels converge to the same AI backend — no need to configure AI separately for each channel.

Pitfalls during channel setup:

Step 5: Configure Cloud API Fallback (Auto-Switch When Away)

Local models are limited by VPS compute power. When you're away, you may need to switch to a cloud API. OpenClaw's model config supports multi-model routing.

Config file example (~/.openclaw/config.yml):

model: qwen3.5:local
fallback:
  - name: kimi-k2.5:cloud
    type: cloud
  - name: minimax-m2.7:cloud
    type: cloud

When the local model times out (default 30 seconds), OpenClaw automatically tries the cloud models in the fallback list. Cloud API keys must be configured in OpenClaw in advance.

When to use cloud API:

The 3 Real Pitfalls I Hit

Pitfall 1: Models disappear after Ollama service restart

Ollama's models are stored in ~/.ollama/models. If a faulty cleanup script or Docker prune runs before a VPS reboot, the models directory can get wiped too.

**Fix:** Redirect ~/.ollama to a data volume (if you have one), or back up regularly:

# Backup models (as needed)
tar -czf ollama_models_backup.tar.gz ~/.ollama/models

**Pitfall 2: OpenClaw's --yes flag fails in non-interactive environments**

ollama launch openclaw --yes is the officially recommended automation flag, theoretically skipping interactive prompts. In my testing, if ~/.openclaw/config.yml already exists with complete model configuration, --yes skips selection and uses the saved config. But if the config file doesn't exist, --yes still attempts interactive setup — causing it to hang in cron scripts.

**Fix:** Prepare your config.yml in advance. Make sure the config is complete before using --yes in cron:

# Check if config file exists
ls -la ~/.openclaw/config.yml

If it doesn't exist, run a full interactive config session once to generate the config file first, then use --yes in cron.

Pitfall 3: Telegram bot message delays

If the network path between OpenClaw's Gateway and Telegram is unstable, message delays can exceed 10 seconds. This isn't an OpenClaw issue — it's Telegram API's polling mechanism limitation.

Fix: Set up a reverse proxy on your VPS and switch Telegram from polling to webhook mode to reduce network round-trips. You'll need to set a webhook URL pointing to your proxy in Telegram BotFather.

Who This Is For — and Who It Isn't

Good fit:

Not a good fit:

Related Tools Comparison

ToolModel ManagementMessaging ChannelsLocal-FirstOpen Source
Ollama + OpenClaw✅ Ollama✅ Multi-channel✅ Yes✅ Fully open source
OpenAI API + Telegram Bot❌ None✅ Requires custom build❌ Cloud only✅ Bot is open source
Jan AI✅ Built-in❌ None✅ Yes✅ Open source
LM Studio✅ Built-in❌ None✅ Yes✅ Open source

If you only need to use AI at your desk, LM Studio or Jan AI are simpler. If you need multi-platform messaging access, Ollama + OpenClaw is currently the most complete open-source solution.

---

**Disclosure:** Ollama and OpenClaw are both open-source projects (MIT/Apache 2.0). The author has no commercial relationship with either project. The MiniMax API mentioned at the end is a paid service — check MiniMax official pricing before configuring.

👉 Get started: https://platform.minimaxi.com/subscribe/token-plan?code=E5yur9NOub&source=link

📌 This article was AI-assisted generated and human-reviewed | TechPassive — An AI-driven content testing site focused on real tool reviews

🔗 Recommended Tools

These are carefully selected tools. Using our affiliate links supports us to keep producing quality content:

DigitalOcean Cloud Vultr VPS 🏠 Amazon Best Sellers 📱 Amazon Devices 🔧 Amazon Renewed 🏠 Home Appliances 🎮 Apps & Games 📚 Books 💊 Health & Home 🎬 Movies & TV ⚽ Sports & Outdoors 🎯 Video Games 💻 Computers ⭐ MiniMax Token Plan
← Back to Home