curated_compose/OTEL_INTEGRATION_NOTES.md

OTEL / Observability Integration Notes

Last updated: 2026-06-15
Author: Agent Zero analysis
Scope: All curated_compose stacks


TL;DR

  • LGTM is the central OTEL backend (traces in Tempo, metrics in Prometheus, logs in Loki, all viewed through Grafana).
  • n8n → LGTM directly (working).
  • Langfuse → LGTM (Langfuse's own self-traces, working).
  • Headroom → Langfuse (intentional: LLM-specific observability).
  • Chroma: not wired (env vars exist in .env.example, but compose.yaml does not pass them to the service).
  • Dify: no OTEL support yet.

Stack-by-Stack Telemetry Status

| Stack | Sends to LGTM? | Sends to Langfuse? | Configured? | Notes |
|---|---|---|---|---|
| LGTM | | | | Receives OTLP: gRPC :4317, HTTP :4318 |
| n8n | Yes | No | Active | OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318 on main + worker |
| Langfuse | Yes | | Active | Own traces to LGTM; stack at docker/langfuse/compose.yaml |
| Headroom | No | Yes | Active | OTEL_EXPORTER_OTLP_ENDPOINT=http://langfuse-web:3000/api/public/otel/v1 |
| Chroma | No | No | Not wired | .env.example has CHROMA_OPEN_TELEMETRY__ENDPOINT, compose ignores it |
| Dify | No | No | None | No OTEL env vars in compose or .env.example |
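
The endpoint values in the table are base URLs. Under the OpenTelemetry OTLP exporter specification, an OTLP/HTTP exporter configured through OTEL_EXPORTER_OTLP_ENDPOINT appends a per-signal path (v1/traces, v1/metrics, v1/logs) to that base. The sketch below illustrates that rule only; the helper name is hypothetical, and whether each stack's exporter follows the spec exactly is an assumption to check per service.

```python
# Per-signal paths that the OTLP/HTTP exporter spec appends to a base
# OTEL_EXPORTER_OTLP_ENDPOINT value.
SIGNAL_PATHS = {
    "traces": "v1/traces",
    "metrics": "v1/metrics",
    "logs": "v1/logs",
}


def otlp_http_signal_url(base_endpoint: str, signal: str) -> str:
    """Return the URL an OTLP/HTTP exporter would post `signal` data to.

    Hypothetical helper for illustration; it mirrors the spec rule only.
    """
    if signal not in SIGNAL_PATHS:
        raise ValueError(f"unknown signal: {signal!r}")
    return base_endpoint.rstrip("/") + "/" + SIGNAL_PATHS[signal]


# n8n's configured base endpoint from the table above.
print(otlp_http_signal_url("http://lgtm:4318", "traces"))
```

Applied to Headroom's configured value, the same rule would yield http://langfuse-web:3000/api/public/otel/v1/v1/traces. If Headroom's exporter follows the spec, that doubled v1 segment deserves a check against Langfuse's documented OTLP path; if Headroom builds the URL differently, the value may be correct as written.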

Architecture

Headroom Proxy ──OTEL──→ Langfuse ──OTEL──→ LGTM
                              │
                              └── ClickHouse (analytics)
                              └── Postgres (metadata)
                              └── Redis (queues)
                              └── MinIO (S3 storage)

n8n (main + worker) ──OTEL──→ LGTM

[Chroma] ──❌──→ LGTM
[Dify]   ──❌──→ LGTM

Why Headroom → Langfuse (not direct to LGTM)?

Langfuse is purpose-built for LLM observability — it tracks cost per token, prompt versions, user attribution, and LLM-specific metrics that Tempo/Grafana don't natively understand. Headroom's traces are most valuable inside Langfuse.

Langfuse then exports its own internal traces to LGTM for infrastructure-wide correlation.


Known Issues / Action Items

🔴 Chroma — OTEL Not Wired

Problem: docker/chroma/.env.example defines:

CHROMA_OPEN_TELEMETRY__ENDPOINT=
CHROMA_OPEN_TELEMETRY__SERVICE_NAME=chromadb
OTEL_EXPORTER_OTLP_HEADERS=

But docker/chroma/compose.yaml does not pass these env vars into the chroma service.

Fix: Add the following to the environment: block of the chroma service in docker/chroma/compose.yaml:

CHROMA_OPEN_TELEMETRY__ENDPOINT: ${CHROMA_OPEN_TELEMETRY__ENDPOINT:-http://lgtm:4318}
CHROMA_OPEN_TELEMETRY__SERVICE_NAME: ${CHROMA_OPEN_TELEMETRY__SERVICE_NAME:-chromadb}
OTEL_EXPORTER_OTLP_HEADERS: ${OTEL_EXPORTER_OTLP_HEADERS:-}
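
Compose's ${NAME:-default} form falls back to the default when NAME is unset or empty. Because .env.example leaves CHROMA_OPEN_TELEMETRY__ENDPOINT empty, an unmodified copy of it would still resolve to http://lgtm:4318. The sketch below emulates only this one interpolation form to show the resulting values; the resolve helper is hypothetical and is not how Compose is implemented.

```python
import re

# Matches only the ${NAME:-default} form used in the fix above.
_DEFAULT_FORM = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*):-([^}]*)\}")


def resolve(template: str, env: dict) -> str:
    """Emulate ${NAME:-default}: use the default if NAME is unset or empty."""

    def _substitute(match: re.Match) -> str:
        name, default = match.group(1), match.group(2)
        value = env.get(name, "")
        return value if value else default

    return _DEFAULT_FORM.sub(_substitute, template)


# Values as they would be read from an unmodified copy of .env.example.
env_file = {
    "CHROMA_OPEN_TELEMETRY__ENDPOINT": "",
    "CHROMA_OPEN_TELEMETRY__SERVICE_NAME": "chromadb",
    "OTEL_EXPORTER_OTLP_HEADERS": "",
}

print(resolve("${CHROMA_OPEN_TELEMETRY__ENDPOINT:-http://lgtm:4318}", env_file))
print(resolve("${CHROMA_OPEN_TELEMETRY__SERVICE_NAME:-chromadb}", env_file))
```

Setting CHROMA_OPEN_TELEMETRY__ENDPOINT to a non-empty value in .env overrides the default, so the fix keeps per-deployment control while working out of the box.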

🟡 Dify — No OTEL Support

Problem: Dify doesn't expose OTEL configuration natively. It is Python/Flask-based, but the current compose configures neither auto-instrumentation nor manual instrumentation.

Recommendation: Wait for upstream Dify to add native OTEL support. Per SKILL.md conventions, do not create custom patches.

🟢 Langfuse Network Verification

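Status: Headroom joins the external network langfuse, declared with name: langfuse_langfuse. Docker Compose creates that network automatically when the stack in docker/langfuse/ is brought up. Because Compose does not create external networks itself, this should work on deployment as long as the Langfuse stack is started before Headroom.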

Verify after deploy: docker network inspect langfuse_langfuse should show both langfuse-web and headroom-proxy containers.
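
The check above can be scripted. A minimal sketch, assuming the docker CLI is on PATH and that the container names contain the fragments langfuse-web and headroom-proxy (Compose may add a project prefix or replica suffix, so the match is by substring). Both helper names are hypothetical.

```python
import json
import subprocess


def inspect_network(name: str) -> str:
    """Run `docker network inspect NAME` and return its JSON output."""
    result = subprocess.run(
        ["docker", "network", "inspect", name],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def missing_containers(inspect_json: str, expected: list) -> list:
    """Return the expected name fragments not found among attached containers."""
    networks = json.loads(inspect_json)  # inspect prints a JSON array
    containers = networks[0].get("Containers") or {}
    names = [info.get("Name", "") for info in containers.values()]
    return [want for want in expected if not any(want in name for name in names)]
```

Calling `missing_containers(inspect_network("langfuse_langfuse"), ["langfuse-web", "headroom-proxy"])` returns an empty list when both containers are attached, and otherwise lists whichever is absent.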

🟡 Unified Log Collection

All stacks emit container logs. For collecting these into LGTM/Loki:

  • Option A (simplest): Configure the Docker daemon on Unraid to use the Loki logging driver globally.
  • Option B (per-stack): Add a Promtail sidecar to each compose stack.

Recommendation: Option A — configure once at the Docker daemon level.
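
For Option A, the Docker daemon takes its default logging driver from daemon.json. The sketch below renders such a file, assuming the Grafana Loki Docker driver plugin is already installed under the alias loki and that Loki's push API is reachable from the host at the URL passed in. The address shown is hypothetical, and where Unraid persists daemon settings is not covered here.

```python
import json


def loki_daemon_config(loki_push_url: str) -> str:
    """Render daemon.json content that makes `loki` the default log driver."""
    config = {
        "log-driver": "loki",
        "log-opts": {
            # Push endpoint of the Loki instance. It must be reachable from
            # the host, because the daemon does not resolve Compose service
            # names such as `lgtm`.
            "loki-url": loki_push_url,
        },
    }
    return json.dumps(config, indent=2)


# Hypothetical host address and port; adjust to wherever Loki is published.
print(loki_daemon_config("http://192.168.1.10:3100/loki/api/v1/push"))
```

A changed default driver applies only to containers created after the daemon restarts, so existing stacks would need to be recreated to start shipping logs.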


Files Referenced

| File | Purpose |
|---|---|
| docker/chroma/compose.yaml | Chroma vector DB stack |
| docker/chroma/.env.example | Chroma config (OTEL vars present) |
| docker/dify/docker-compose.yaml | Dify LLM platform |
| docker/dify/.env.example | Dify config (no OTEL vars) |
| docker/headroom/compose.yaml | Headroom LLM proxy |
| docker/langfuse/compose.yaml | Langfuse observability |
| docker/lgtm/docker-compose.yaml | LGTM (OTEL backend) |
| docker/lgtm/.env.example | LGTM config |
| docker/n8n/docker-compose.yaml | n8n automation |
| docker/n8n/.env.example | n8n config (OTEL vars present) |
| SKILL.md | Homelab conventions and design rules |