Add OTEL wiring and env examples across compose stacks
This commit is contained in:
parent
666212a5c2
commit
8f8827aba1
6 changed files with 448 additions and 0 deletions
114
OTEL_INTEGRATION_NOTES.md
Normal file
@@ -0,0 +1,114 @@
# OTEL / Observability Integration Notes

> **Last updated**: 2026-06-15
> **Author**: Agent Zero analysis
> **Scope**: All `curated_compose` stacks

---

## TL;DR

- **LGTM** is the central OTEL backend (traces, metrics, and logs via Grafana, Tempo, Loki, and Prometheus).
- **n8n** → LGTM directly (✅ working).
- **Langfuse** → LGTM (Langfuse's own self-traces, ✅ working).
- **Headroom** → Langfuse (intentional: LLM-specific observability).
- **Chroma** → ❌ not wired (the env vars exist in `.env.example`, but compose does not pass them to the service).
- **Dify** → ❌ no OTEL support yet.

---

## Stack-by-Stack Telemetry Status

| Stack | Sends to LGTM? | Sends to Langfuse? | Configured? | Notes |
|-------|----------------|--------------------|-------------|-------|
| **LGTM** | n/a | n/a | ✅ | Receives OTLP over gRPC (`:4317`) and HTTP (`:4318`) |
| **n8n** | ✅ Yes | ❌ No | ✅ Active | `OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318` on main + worker |
| **Langfuse** | ✅ Yes | n/a | ✅ Active | Own traces to LGTM; stack at `docker/langfuse/compose.yaml` |
| **Headroom** | ❌ No | ✅ Yes | ✅ Active | `OTEL_EXPORTER_OTLP_ENDPOINT=http://langfuse-web:3000/api/public/otel/v1` |
| **Chroma** | ❌ No | ❌ No | ❌ Not wired | `.env.example` has `CHROMA_OPEN_TELEMETRY__ENDPOINT`, but compose does not pass it to the service |
| **Dify** | ❌ No | ❌ No | ❌ None | No OTEL env vars in compose or `.env.example` |

---

## Architecture

```
Headroom Proxy ──OTEL──→ Langfuse ──OTEL──→ LGTM
                         │
                         ├── ClickHouse (analytics)
                         ├── Postgres (metadata)
                         ├── Redis (queues)
                         └── MinIO (S3 storage)

n8n (main + worker) ──OTEL──→ LGTM

[Chroma] ──❌──→ LGTM
[Dify]   ──❌──→ LGTM
```
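
The Headroom → Langfuse hop in the diagram could be wired roughly as follows. This is a minimal sketch, not the actual `docker/headroom/compose.yaml`: the service name `headroom-proxy` and the endpoint come from these notes, while the image reference, the `OTEL_SERVICE_NAME` value, and the `LANGFUSE_OTEL_AUTH` variable are hypothetical placeholders. Langfuse's OTLP endpoint expects HTTP Basic auth built from a project's public and secret keys, so the credential is left to the `.env` file.

```yaml
# Sketch only: image and LANGFUSE_OTEL_AUTH are hypothetical placeholders.
services:
  headroom-proxy:
    image: example/headroom:latest  # placeholder, not the real image reference
    environment:
      # Endpoint as recorded in the status table above.
      OTEL_EXPORTER_OTLP_ENDPOINT: http://langfuse-web:3000/api/public/otel/v1
      # Langfuse authenticates OTLP requests with HTTP Basic auth:
      # base64("<public key>:<secret key>"), supplied via .env.
      # ":?" makes Compose fail fast if the variable is unset or empty.
      OTEL_EXPORTER_OTLP_HEADERS: "Authorization=Basic ${LANGFUSE_OTEL_AUTH:?set in .env}"
      OTEL_SERVICE_NAME: headroom-proxy
```

Exporters that follow the OpenTelemetry specification append `/v1/traces` to a base `OTEL_EXPORTER_OTLP_ENDPOINT` for OTLP/HTTP, so it is worth confirming how Headroom interprets the `/v1` suffix shown here.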

### Why Headroom → Langfuse (not direct to LGTM)?

Langfuse is purpose-built for LLM observability: it tracks cost per token, prompt versions, user attribution, and other LLM-specific metrics that Tempo and Grafana don't natively understand. Headroom's traces are therefore most valuable inside Langfuse.

Langfuse then exports its own internal traces (its self-traces, not Headroom's LLM traces) to LGTM for infrastructure-wide correlation.

---

## Known Issues / Action Items

### 🔴 Chroma: OTEL Not Wired

**Problem**: `docker/chroma/.env.example` defines:

```
CHROMA_OPEN_TELEMETRY__ENDPOINT=
CHROMA_OPEN_TELEMETRY__SERVICE_NAME=chromadb
OTEL_EXPORTER_OTLP_HEADERS=
```

But `docker/chroma/compose.yaml` does **not** pass these env vars into the `chroma` service, so setting them in `.env` has no effect.

**Fix**: Add the following to the `chroma` service's `environment:` block in `compose.yaml`:

```yaml
# Entries for services.chroma.environment.
# ":-" falls back to the default when the variable is unset or empty in .env.
CHROMA_OPEN_TELEMETRY__ENDPOINT: ${CHROMA_OPEN_TELEMETRY__ENDPOINT:-http://lgtm:4318}
CHROMA_OPEN_TELEMETRY__SERVICE_NAME: ${CHROMA_OPEN_TELEMETRY__SERVICE_NAME:-chromadb}
OTEL_EXPORTER_OTLP_HEADERS: ${OTEL_EXPORTER_OTLP_HEADERS:-}
```
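
In context, the fix might look like the sketch below. The three environment entries are the ones listed above; the image tag, the `lgtm` network key, and its external name `lgtm_default` are assumptions for illustration. The `chroma` container has to share a Docker network with the LGTM stack for the hostname `lgtm` to resolve, and whether this Chroma version exports over OTLP gRPC (`:4317`) or HTTP (`:4318`) should be checked before settling on a port.

```yaml
# Sketch only: image tag and network names are assumptions, not taken from the repo.
services:
  chroma:
    image: chromadb/chroma:latest
    environment:
      CHROMA_OPEN_TELEMETRY__ENDPOINT: ${CHROMA_OPEN_TELEMETRY__ENDPOINT:-http://lgtm:4318}
      CHROMA_OPEN_TELEMETRY__SERVICE_NAME: ${CHROMA_OPEN_TELEMETRY__SERVICE_NAME:-chromadb}
      OTEL_EXPORTER_OTLP_HEADERS: ${OTEL_EXPORTER_OTLP_HEADERS:-}
    networks:
      - default
      - lgtm   # needed so the hostname "lgtm" resolves from this container

networks:
  lgtm:
    external: true
    name: lgtm_default  # hypothetical; use the network the LGTM stack actually creates
```

After redeploying, running `docker compose config` in `docker/chroma/` prints the resolved values, which is a quick way to confirm that the variables now reach the service.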

### 🟡 Dify: No OTEL Support

**Problem**: Dify does not expose OTEL configuration natively. It is Python/Flask-based, but the current compose sets up neither auto-instrumentation nor manual instrumentation.

**Recommendation**: Wait for upstream Dify to add native OTEL support. Do not create custom patches, per `SKILL.md` conventions.

### 🟢 Langfuse Network Verification

**Status**: Headroom joins the external network `langfuse`, declared with `name: langfuse_langfuse`. Docker Compose creates that network automatically when the Langfuse stack is started from the `docker/langfuse/` directory, so the Langfuse stack needs to have been started before Headroom. **This should work** on deployment.

**Verify after deploy**: `docker network inspect langfuse_langfuse` should list both the `langfuse-web` and `headroom-proxy` containers.
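
The network attachment described above corresponds to a declaration along these lines in the Headroom compose file. This is a sketch reconstructed from the names in these notes (`langfuse`, `langfuse_langfuse`, `headroom-proxy`), not a copy of the actual file; the image reference is a placeholder.

```yaml
# Sketch only: reconstructed from the names in these notes.
services:
  headroom-proxy:
    image: example/headroom:latest  # placeholder, not the real image reference
    networks:
      - default
      - langfuse   # lets headroom-proxy resolve langfuse-web by name

networks:
  langfuse:
    external: true            # Compose will not create it; it must already exist
    name: langfuse_langfuse   # <project>_<network>, as created by the Langfuse stack
```

Because the network is external, `docker compose up` for Headroom fails with a network-not-found error if the Langfuse stack has never been started.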

### 🟡 Unified Log Collection

All stacks emit container logs. There are two options for collecting them into Loki in the LGTM stack:

- **Option A** (simplest): Configure the Docker daemon with the Loki log driver globally on Unraid.
- **Option B** (per-stack): Add a Promtail sidecar to each compose file.

**Recommendation**: Option A, since it is configured once at the Docker daemon level.
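
Before changing the daemon globally, the same Loki log driver can be trialled on a single service. The sketch below assumes the Grafana Loki Docker driver plugin is installed on the host under the alias `loki` and that the LGTM stack publishes Loki's push API on host port 3100; neither is confirmed by these notes, and the service and image names are placeholders. The `loki-url` is resolved by the Docker daemon on the host, not inside a compose network, so a service name such as `lgtm` would not resolve there.

```yaml
# Sketch only: assumes the Loki Docker driver plugin (alias "loki") is installed
# and Loki's push API is reachable from the host on port 3100.
services:
  example-app:                 # placeholder service
    image: example/app:latest  # placeholder image
    logging:
      driver: loki
      options:
        loki-url: "http://localhost:3100/loki/api/v1/push"
```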

---

## Files Referenced

| File | Purpose |
|------|---------|
| `docker/chroma/compose.yaml` | Chroma vector DB stack |
| `docker/chroma/.env.example` | Chroma config (OTEL vars present) |
| `docker/dify/docker-compose.yaml` | Dify LLM platform |
| `docker/dify/.env.example` | Dify config (no OTEL vars) |
| `docker/headroom/compose.yaml` | Headroom LLM proxy |
| `docker/langfuse/compose.yaml` | Langfuse observability |
| `docker/lgtm/docker-compose.yaml` | LGTM (OTEL backend) |
| `docker/lgtm/.env.example` | LGTM config |
| `docker/n8n/docker-compose.yaml` | n8n automation |
| `docker/n8n/.env.example` | n8n config (OTEL vars present) |
| `SKILL.md` | Homelab conventions and design rules |