# Langfuse Self-Hosted Deployment Traps: 6 Real Problems with PostgreSQL, ClickHouse, and Redis

Langfuse (GitHub 27,403 ⭐, verified May 2026) is a tool I can't live without when building AI development platforms. As an open-source LLM engineering platform, it integrates with OpenTelemetry, LangChain, the OpenAI SDK, and LiteLLM, and supports LLM observability, evaluation metrics, prompt management, and datasets.

But when I upgraded from v2 to v3, the complexity far exceeded my expectations. v3's architecture changed dramatically: it now requires ClickHouse (OLAP storage), Redis (queueing and caching), and S3/MinIO (object storage) as additional dependencies. A single PostgreSQL instance is no longer enough.

This article documents the 6 real pitfalls I hit deploying Langfuse v3 on an Ubuntu 24.04 VPS, with specific solutions.

## Architecture Changes: Why v3 Is Way More Complex Than v2

Langfuse v2 needed just one container plus PostgreSQL. v3 (stable release: Dec 6, 2024) introduced a completely new architecture:

| Component | Purpose | Minimum Config |
| --- | --- | --- |
| Langfuse Server + Worker | API/Web UI + background tasks | 2 CPU + 3 GB RAM (Server), 2 CPU + 4 GB RAM (Worker) |
| PostgreSQL >=12 | Transactional data storage | 1 CPU + 1 GB RAM |
| ClickHouse | Traces/observations/scores storage | 2 CPU + 4 GB RAM |
| Redis/Valkey | Queue management and caching | 1 CPU + 1 GB RAM |
| S3/MinIO | File storage | 50 GB SSD |

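Summed as dedicated allocations, the table comes to 8 CPU cores and 13 GB RAM. Components can share cores on a single VPS, but 4 CPU cores and 9 GB RAM is the practical minimum for v3. If your VPS falls short, the Worker container is likely to be OOM-killed almost immediately.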
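
The per-component figures in the table can be totalled to check a VPS before deploying. This is a minimal sketch, assuming every component gets a dedicated allocation, which is stricter than the shared 4-core / 9 GB floor. The dictionary keys are illustrative labels, not Compose service names.

```python
# Per-component minimums copied from the table above: (CPU cores, RAM in GB).
# S3/MinIO is left out because the table lists only disk space for it.
COMPONENT_MINIMUMS = {
    "langfuse-server": (2, 3),
    "langfuse-worker": (2, 4),
    "postgresql": (1, 1),
    "clickhouse": (2, 4),
    "redis": (1, 1),
}


def dedicated_totals(components):
    """Sum CPU and RAM as if every component had its own allocation."""
    cpu = sum(c for c, _ in components.values())
    ram = sum(r for _, r in components.values())
    return cpu, ram


def fits(vps_cpu, vps_ram_gb, components=COMPONENT_MINIMUMS):
    """Return True if the VPS covers the dedicated totals without sharing."""
    cpu, ram = dedicated_totals(components)
    return vps_cpu >= cpu and vps_ram_gb >= ram


if __name__ == "__main__":
    print(dedicated_totals(COMPONENT_MINIMUMS))
    print(fits(4, 9))
    print(fits(8, 16))
```

A VPS that fails this check can still run v3 with shared cores, but it has little headroom once the Worker starts processing queues.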

## Pitfall 1: PostgreSQL Table Permissions Causing Migration Failure

After deployment, container logs showed:

DbError { severity: "ERROR", code: SqlState(E42501), message: "permission denied for table projects" }

This happens because Langfuse uses Prisma for database migrations. When the connecting database user is not the owner of the existing tables, it cannot alter them, and migrations fail.

Solution: Create a dedicated migration user or transfer table ownership.

-- Check table owners (pg_tables exposes the owner; information_schema.tables does not)
SELECT
    schemaname,
    tablename,
    tableowner
FROM pg_tables
WHERE schemaname = 'public';

-- Transfer ownership (replace table_name and new_user with your actual table and migration user)
ALTER TABLE table_name OWNER TO new_user;

Alternatively, use the DIRECT_URL environment variable to configure a dedicated migration user:

DIRECT_URL=postgresql://migration_user:password@host:5432/langfuse
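
One detail that is easy to miss when writing a connection string by hand: characters such as `@`, `/`, or `:` in the password must be percent-encoded, or the URL parses incorrectly. Here is a small sketch using only the standard library. The helper name `build_postgres_url` and the sample password are hypothetical.

```python
from urllib.parse import quote


def build_postgres_url(user, password, host, database, port=5432):
    """Build a postgresql:// URL, percent-encoding the credentials."""
    # safe="" makes quote() also encode '/', '@', ':' and other reserved characters
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    return f"postgresql://{user_enc}:{password_enc}@{host}:{port}/{database}"


if __name__ == "__main__":
    # 'migration_user', 'host' and the password are placeholders
    url = build_postgres_url("migration_user", "p@ss/word:1", "host", "langfuse")
    print("DIRECT_URL=" + url)
```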

## Pitfall 2: NEXTAUTH_URL Mismatch

The login page kept spinning forever. Container logs revealed:

Error: NEXTAUTH_URL does not match current URL

Langfuse uses NextAuth for authentication, and the container's NEXTAUTH_URL must exactly match the external access URL. If you're behind an Nginx reverse proxy that sets X-Forwarded-Proto/X-Forwarded-Host headers, set both:

NEXTAUTH_URL=https://your-langfuse-domain.com
NEXTAUTH_URL_INTERNAL=http://localhost:3000

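If Langfuse is served under a sub-path (e.g., https://domain.com/langfuse), NEXTAUTH_URL must include that path. Langfuse's custom base path setup needs further build-time configuration, so check the official docs before relying on this alone: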

NEXTAUTH_URL=https://domain.com/langfuse
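
Before restarting containers, it can help to compare the configured value against the URL users actually open. The following is a rough sketch, not part of Langfuse or NextAuth. The function name is hypothetical, and it checks only scheme, host, port, and sub-path.

```python
from urllib.parse import urlsplit


def nextauth_url_mismatches(nextauth_url, browser_url):
    """Return the parts of NEXTAUTH_URL that differ from the URL users open."""
    configured = urlsplit(nextauth_url)
    actual = urlsplit(browser_url)
    problems = []
    if configured.scheme != actual.scheme:
        problems.append(f"scheme: {configured.scheme} != {actual.scheme}")
    # hostname is lowercased by urlsplit; port is None when not written out,
    # so an explicit :443 and an implicit one count as different here
    if (configured.hostname, configured.port) != (actual.hostname, actual.port):
        problems.append(f"host: {configured.netloc} != {actual.netloc}")
    base_path = configured.path.rstrip("/")
    inside = actual.path == base_path or actual.path.startswith(base_path + "/")
    if base_path and not inside:
        problems.append(f"path: {actual.path} is outside {base_path}")
    return problems


if __name__ == "__main__":
    print(nextauth_url_mismatches(
        "http://localhost:3000",
        "https://your-langfuse-domain.com/auth/sign-in",
    ))
```

An empty list means the two URLs agree on everything the sketch checks. Anything else points at the part to fix.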

## Pitfall 3: ClickHouse Migration SSL Configuration

When using cloud databases (e.g., Alibaba Cloud RDS), ClickHouse migrations may fail because the endpoint requires SSL. The error looks like:

Error: connection refused or timeout

Solution: Enable ClickHouse migration SSL:

CLICKHOUSE_MIGRATION_SSL=true
CLICKHOUSE_URL=https://your-clickhouse-host:8443

Self-hosted ClickHouse listens on 8123 (HTTP) and 9000 (native TCP) by default; the TLS counterparts are 8443 (HTTPS) and 9440 (secure native TCP).
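
The plain and TLS variants differ only in scheme, port, and the SSL flag, so they can be generated from one switch. This is a sketch under assumptions: it uses ClickHouse's default ports, which a managed provider may change, and it includes `CLICKHOUSE_MIGRATION_URL`, a Langfuse setting not discussed above that points migrations at the native protocol. Verify both against your provider and the Langfuse configuration docs.

```python
# ClickHouse default ports: plain HTTP / native TCP, and their TLS equivalents.
DEFAULT_PORTS = {
    False: {"http": 8123, "native": 9000},
    True: {"http": 8443, "native": 9440},
}


def clickhouse_env(host, ssl):
    """Return Langfuse ClickHouse settings for a plain or TLS endpoint."""
    ports = DEFAULT_PORTS[ssl]
    scheme = "https" if ssl else "http"
    return {
        "CLICKHOUSE_URL": f"{scheme}://{host}:{ports['http']}",
        # Assumption: migrations connect over the native protocol
        "CLICKHOUSE_MIGRATION_URL": f"clickhouse://{host}:{ports['native']}",
        "CLICKHOUSE_MIGRATION_SSL": "true" if ssl else "false",
    }


if __name__ == "__main__":
    for key, value in clickhouse_env("your-clickhouse-host", ssl=True).items():
        print(f"{key}={value}")
```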

## Pitfall 4: Timezone Misconfiguration Causing Data Chaos

Langfuse requires every infrastructure component to run in UTC. If PostgreSQL or ClickHouse defaults to another timezone, time-based queries return incorrect results.

Check PostgreSQL timezone:

SHOW timezone;

If it doesn't return UTC, modify postgresql.conf:

timezone = 'UTC'

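For ClickHouse, set the timezone at startup (if this variable has no effect on your image, set `<timezone>UTC</timezone>` in the server config instead):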

CLICKHOUSE_TZ=UTC

Symptom: trace timestamps on the dashboard are 8 hours off from the actual time, and sorting is completely broken.
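
The 8-hour figure is what you get when a timestamp written without zone information is read by a component running at UTC+8. A minimal illustration with the standard library follows. The fixed offset stands in for whatever non-UTC zone the server was left on, and `skew` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc
UTC_PLUS_8 = timezone(timedelta(hours=8))  # stand-in for a server left on UTC+8


def skew(naive, assumed_tz):
    """How far a naive timestamp shifts when read as assumed_tz instead of UTC."""
    as_utc = naive.replace(tzinfo=UTC)
    as_assumed = naive.replace(tzinfo=assumed_tz)
    return as_utc - as_assumed


if __name__ == "__main__":
    stored = datetime(2026, 5, 1, 12, 0, 0)  # no tzinfo, as written by one component
    print(skew(stored, UTC_PLUS_8))
    print(skew(stored, UTC))
```

When every component agrees on UTC the skew is zero, which is why the fix is configuration rather than data repair.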

## Pitfall 5: Worker Container OOM

When server RAM is insufficient, the Worker container is OOM-killed and restarts in a loop while queued tasks keep piling up. The logs show:

JavaScript heap out of memory

Either set the Node.js heap limit explicitly or increase server RAM:

NODE_OPTIONS="--max-old-space-size=3072"

Also set a container memory limit in docker-compose.yml, and keep it above the Node.js heap size:

services:
  worker:
    deploy:
      resources:
        limits:
          memory: 4G
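
The two values above are related: 3072 MB is 75% of the 4 GB container limit, which leaves room for non-heap memory such as buffers and the runtime itself. The sketch below derives one from the other. The 75% ratio is inferred from these two numbers rather than taken from Langfuse or Node.js documentation, and the function names are hypothetical.

```python
def max_old_space_mb(container_limit_gb, heap_fraction=0.75):
    """Suggest a --max-old-space-size value (MB) below the container limit."""
    if not 0 < heap_fraction < 1:
        raise ValueError("heap_fraction must be between 0 and 1")
    return int(container_limit_gb * 1024 * heap_fraction)


def node_options(container_limit_gb):
    """Format the NODE_OPTIONS value for a given container memory limit."""
    return f"--max-old-space-size={max_old_space_mb(container_limit_gb)}"


if __name__ == "__main__":
    print(f'NODE_OPTIONS="{node_options(4)}"')
```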

## Pitfall 6: v2 to v3 Data Migration

Upgrading from v2 to v3 means dealing with database schema changes. Langfuse provides an official migration script, but the process is lengthy:

# Backup v2 data
pg_dump -h localhost -U langfuse -d langfuse > langfuse_v2_backup.sql

# Check migration docs
# https://langfuse.com/self-hosting/upgrade/upgrade-guides/upgrade-v2-to-v3

# Migration runs automatically on v3 startup
docker compose up -d

Migration time depends on data volume: roughly 30 minutes for 10 GB.
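
Because the migration starts as soon as v3 boots, the backup is worth scripting so it cannot be skipped. This sketch only builds the `pg_dump` argument list, with a timestamped file name and the custom archive format. The helper name and file naming scheme are assumptions, and the host, user, and database mirror the command above.

```python
from datetime import datetime


def pg_dump_command(host, user, database, when):
    """Build a pg_dump argv list that writes a timestamped custom-format dump."""
    outfile = f"{database}_v2_backup_{when:%Y%m%d_%H%M%S}.dump"
    return [
        "pg_dump",
        "--host", host,
        "--username", user,
        "--dbname", database,
        "--format", "custom",  # compressed archive; restore with pg_restore
        "--file", outfile,
    ]


if __name__ == "__main__":
    cmd = pg_dump_command("localhost", "langfuse", "langfuse", datetime.now())
    print(" ".join(cmd))
```

Pass the list to `subprocess.run(cmd, check=True)` to execute it, and confirm that the dump file exists and is non-empty before running `docker compose up -d`.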

## Minimum Configuration Recommendations

| Scenario | CPU | RAM | Storage |
| --- | --- | --- | --- |
| Test/Dev | 2 cores | 4 GB | 20 GB SSD |
| Small production | 4 cores | 16 GB | 100 GB NVMe |
| Medium production | 8 cores | 32 GB | 200 GB NVMe |

## Summary

Langfuse v3's architectural complexity increased dramatically, from a single container in v2 to a system of 5 coordinated components. Self-hosting it means paying attention to:

1. Resource planning: v3 minimum is 4 cores and 9 GB RAM, 3x higher than v2

2. Permission configuration: PostgreSQL migration user needs correct authorization

3. Timezone consistency: UTC is the only supported timezone

4. SSL configuration: Cloud databases need extra SSL setup

5. OOM monitoring: Worker memory limits must be set

If you just want to try Langfuse, use the Docker Compose quickstart (the official one-liner script). For production use, provision at least 4 cores and 16 GB RAM.

📌 This article was AI-assisted generated and human-reviewed | TechPassive — An AI-driven content testing site focused on real tool reviews
