feat: signoz

Robbie 2026-06-16 11:20:25 -04:00
parent dc39e02242
commit ebcc6e4d2d
4 changed files with 219 additions and 30 deletions

View file

@@ -1,6 +1,6 @@
 # OTEL / Observability Integration Notes
-> **Last updated**: 2026-06-15
+> **Last updated**: 2026-06-16
 > **Author**: Agent Zero analysis
 > **Scope**: All `curated_compose` stacks
@@ -8,9 +8,10 @@
 ## TL;DR
-- **LGTM** is the central OTEL backend (traces, metrics, logs via Grafana/Tempo/Loki/Prometheus).
-- **n8n** → LGTM directly (✅ working).
-- **Langfuse** → LGTM (Langfuse's own self-traces, ✅ working).
+- **SigNoz** is the central OTEL backend (traces, metrics, logs, APM in one platform).
+- **Alloy** and **LGTM** are deprecated — replaced by SigNoz.
+- **n8n** → SigNoz directly (✅ working).
+- **Langfuse** → SigNoz (Langfuse's own self-traces, ✅ working).
 - **Headroom** → Langfuse (intentional — LLM-specific observability).
 - **Chroma** → ❌ not wired (env vars exist but compose ignores them).
 - **Dify** → ❌ no OTEL support yet.
@@ -19,38 +20,67 @@
 ## Stack-by-Stack Telemetry Status
-| Stack | Sends to LGTM? | Sends to Langfuse? | Configured? | Notes |
-|-------|---------------|-------------------|-------------|-------|
-| **LGTM** | — | — | ✅ | Receives OTLP gRPC `:4317`, HTTP `:4318` |
+| Stack | Sends to SigNoz? | Sends to Langfuse? | Configured? | Notes |
+|-------|-----------------|-------------------|-------------|-------|
+| **SigNoz** | — | — | ✅ | Receives OTLP gRPC `:4317`, HTTP `:4318` |
 | **n8n** | ✅ Yes | ❌ No | ✅ Active | `OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318` on main + worker |
-| **Langfuse** | ✅ Yes | — | ✅ Active | Own traces to LGTM; stack at `docker/langfuse/compose.yaml` |
+| **Langfuse** | ✅ Yes | — | ✅ Active | Own traces to SigNoz; stack at `docker/langfuse/compose.yaml` |
 | **Headroom** | ❌ No | ✅ Yes | ✅ Active | `OTEL_EXPORTER_OTLP_ENDPOINT=http://langfuse-web:3000/api/public/otel/v1` |
 | **Chroma** | ❌ No | ❌ No | ❌ Not wired | `.env.example` has `CHROMA_OPEN_TELEMETRY__ENDPOINT`, compose ignores it |
 | **Dify** | ❌ No | ❌ No | ❌ None | No OTEL env vars in compose or `.env.example` |
+| **Zitadel** | ✅ Yes | ❌ No | ✅ Active | `ZITADEL_INSTRUMENTATION_TRACE_EXPORTER_ENDPOINT=http://lgtm:4318` |
 ---
 ## Architecture
 ```
-Headroom Proxy ──OTEL──→ Langfuse ──OTEL──→ LGTM
+Headroom Proxy ──OTEL──→ Langfuse ──OTEL──→ SigNoz
 └── ClickHouse (analytics)
 └── Postgres (metadata)
 └── Redis (queues)
 └── MinIO (S3 storage)
-n8n (main + worker) ──OTEL──→ LGTM
+n8n (main + worker) ──OTEL──→ SigNoz
+Zitadel ──────────────OTEL──→ SigNoz
-[Chroma] ──❌──→ LGTM
-[Dify] ──❌──→ LGTM
+[Chroma] ──❌──→ SigNoz
+[Dify] ──❌──→ SigNoz
 ```
-### Why Headroom → Langfuse (not direct to LGTM)?
+### Why Headroom → Langfuse (not direct to SigNoz)?
-Langfuse is purpose-built for LLM observability — it tracks cost per token, prompt versions, user attribution, and LLM-specific metrics that Tempo/Grafana don't natively understand. Headroom's traces are most valuable inside Langfuse.
+Langfuse is purpose-built for LLM observability — it tracks cost per token, prompt versions, user attribution, and LLM-specific metrics that SigNoz doesn't natively understand. Headroom's traces are most valuable inside Langfuse.
-Langfuse then exports its own internal traces to LGTM for infrastructure-wide correlation.
+Langfuse then exports its own internal traces to SigNoz for infrastructure-wide correlation.
+---
+## Migration Notes (Alloy + LGTM → SigNoz)
+### What changed
+- **Removed**: `docker/alloy/` stack (OTEL collector) and `docker/lgtm/` stack (Grafana all-in-one)
+- **Added**: `docker/signoz/` stack (all-in-one observability: collector + UI + storage)
+- **SigNoz pipeline aliases**: `signoz`, `otel`, `lgtm` — existing stacks referencing `lgtm:4318` or `otel:4318` continue to work without changes
+### Cross-stack endpoint mapping
+| Old | New | Notes |
+|-----|-----|-------|
+| `alloy:4317` (gRPC) | `signoz:4317` (gRPC) | Same port, new host |
+| `alloy:4318` / `lgtm:4318` (HTTP) | `signoz:4318` (HTTP) | Same port, new host |
+| `lgtm:3000` (Grafana UI) | `signoz:3301` (SigNoz UI) | Different port |
+### Stacks that need endpoint updates
+Stacks with hardcoded `lgtm:4318` in their compose will still resolve via the `lgtm` alias on the `pipeline` network. No immediate changes required, but consider updating to `signoz:4318` for clarity:
+- `docker/n8n/docker-compose.yaml`: `OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318`
+- `docker/langfuse/compose.yaml`: `OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318`
+- `docker/chroma/compose.yaml`: `CHROMA_OPEN_TELEMETRY__ENDPOINT=http://lgtm:4318`
+- `docker/zitadel/compose.yaml`: `ZITADEL_INSTRUMENTATION_TRACE_EXPORTER_ENDPOINT=http://lgtm:4318`
 ---
@@ -69,7 +99,7 @@ But `docker/chroma/compose.yaml` does **not** pass these env vars into the `chro
 **Fix**: Add to `compose.yaml` service `environment:`:
 ```yaml
-CHROMA_OPEN_TELEMETRY__ENDPOINT: ${CHROMA_OPEN_TELEMETRY__ENDPOINT:-http://lgtm:4318}
+CHROMA_OPEN_TELEMETRY__ENDPOINT: ${CHROMA_OPEN_TELEMETRY__ENDPOINT:-http://signoz:4318}
 CHROMA_OPEN_TELEMETRY__SERVICE_NAME: ${CHROMA_OPEN_TELEMETRY__SERVICE_NAME:-chromadb}
 OTEL_EXPORTER_OTLP_HEADERS: ${OTEL_EXPORTER_OTLP_HEADERS:-}
 ```
@@ -80,15 +110,9 @@ OTEL_EXPORTER_OTLP_HEADERS: ${OTEL_EXPORTER_OTLP_HEADERS:-}
 **Recommendation**: Wait for upstream Dify to add native OTEL support. Do not create custom patches per SKILL.md conventions.
-### 🟢 Langfuse Network Verification
-**Status**: Headroom joins external network `langfuse` with `name: langfuse_langfuse`. This is auto-created by Docker Compose from the `docker/langfuse/` directory. **This should work** on deployment.
-**Verify after deploy**: `docker network inspect langfuse_langfuse` should show both `langfuse-web` and `headroom-proxy` containers.
 ### 🟡 Unified Log Collection
-All stacks emit container logs. For collecting these into LGTM/Loki:
+All stacks emit container logs. For collecting these into SigNoz/Loki:
 - **Option A** (simplest): Configure Docker daemon with Loki log driver globally on Unraid.
 - **Option B** (per-stack): Add Promtail sidecar to each compose.
@@ -101,14 +125,15 @@ All stacks emit container logs. For collecting these into LGTM/Loki:
 | File | Purpose |
 |------|---------|
+| `docker/signoz/compose.yaml` | SigNoz observability stack (replaces Alloy + LGTM) |
+| `docker/signoz/.env.example` | SigNoz config |
 | `docker/chroma/compose.yaml` | Chroma vector DB stack |
 | `docker/chroma/.env.example` | Chroma config (OTEL vars present) |
 | `docker/dify/docker-compose.yaml` | Dify LLM platform |
 | `docker/dify/.env.example` | Dify config (no OTEL vars) |
 | `docker/headroom/compose.yaml` | Headroom LLM proxy |
 | `docker/langfuse/compose.yaml` | Langfuse observability |
-| `docker/lgtm/docker-compose.yaml` | LGTM (OTEL backend) |
-| `docker/lgtm/.env.example` | LGTM config |
 | `docker/n8n/docker-compose.yaml` | n8n automation |
 | `docker/n8n/.env.example` | n8n config (OTEL vars present) |
+| `docker/zitadel/compose.yaml` | Zitadel IAM |
 | `SKILL.md` | Homelab conventions and design rules |
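
For stacks that do get moved to the new host names, the cross-stack endpoint mapping in the migration notes can be applied mechanically. The sketch below is illustrative only and not part of this commit: `ENDPOINT_MAP` and `migrate_endpoint` are hypothetical names, and the host/port pairs are copied from the mapping table.

```python
from urllib.parse import urlsplit, urlunsplit

# Old (host, port) -> new (host, port), taken from the
# "Cross-stack endpoint mapping" table in the migration notes.
ENDPOINT_MAP = {
    ("alloy", 4317): ("signoz", 4317),
    ("alloy", 4318): ("signoz", 4318),
    ("lgtm", 4318): ("signoz", 4318),
    ("lgtm", 3000): ("signoz", 3301),
}


def migrate_endpoint(url: str) -> str:
    """Rewrite an old Alloy/LGTM URL to its SigNoz equivalent.

    URLs whose host/port pair is not in the table are returned unchanged.
    Assumption: the URLs carry no userinfo (user:pass@), which this
    sketch would drop when rebuilding the network location.
    """
    parts = urlsplit(url)
    key = (parts.hostname, parts.port)
    if key not in ENDPOINT_MAP:
        return url
    host, port = ENDPOINT_MAP[key]
    return urlunsplit(parts._replace(netloc=f"{host}:{port}"))
```

Endpoints that are meant to stay where they are, such as Headroom's `http://langfuse-web:3000/api/public/otel/v1`, pass through untouched because their host is not in the table.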

View file

@@ -0,0 +1,35 @@
# =============================================================================
# SigNoz — OpenTelemetry Observability Platform
# =============================================================================
# Copy to .env and edit for your deployment.
# cp .env.example .env
# The actual .env is deployed by Dockhand and should not be committed.
#
# Replaces both Alloy (OTEL collector) and LGTM (Grafana/Prometheus/Tempo/Loki).
# All stacks should point their OTLP exporters to signoz on the pipeline network.
# =============================================================================
# -----------------------------------------------------------------------------
# SigNoz Image Version
# -----------------------------------------------------------------------------
# Pin a specific version for reproducibility. Check releases at:
# https://github.com/SigNoz/signoz/releases
SIGNOZ_VERSION=latest
# -----------------------------------------------------------------------------
# ClickHouse
# -----------------------------------------------------------------------------
CLICKHOUSE_VERSION=25.5
CLICKHOUSE_DB=signoz
CLICKHOUSE_USER=admin
CLICKHOUSE_PASSWORD=change-me-clickhouse-password
# -----------------------------------------------------------------------------
# Exposed Ports
# -----------------------------------------------------------------------------
# SigNoz UI
EXPOSE_SIGNOZ_UI_PORT=3301
# OTLP gRPC receiver (used by instrumented apps/services)
EXPOSE_OTLP_GRPC_PORT=4317
# OTLP HTTP receiver (used by instrumented apps/services)
EXPOSE_OTLP_HTTP_PORT=4318
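
The example file ships with `SIGNOZ_VERSION=latest` and a placeholder ClickHouse password, while its own comments recommend pinning a version. A small pre-deploy check along these lines could catch both. This is a minimal sketch under stated assumptions: `parse_env` and `lint_env` are hypothetical helpers, and the parser only handles plain `KEY=value` lines (no quoting or `export`).

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def lint_env(values: dict) -> list:
    """Return warnings for settings left at their .env.example defaults."""
    warnings = []
    # compose.yaml falls back to 'latest' when the variable is unset.
    if values.get("SIGNOZ_VERSION", "latest") == "latest":
        warnings.append("SIGNOZ_VERSION is 'latest'; pin a release tag")
    # compose.yaml falls back to the placeholder when the variable is unset.
    password = values.get("CLICKHOUSE_PASSWORD", "change-me-clickhouse-password")
    if password.startswith("change-me"):
        warnings.append("CLICKHOUSE_PASSWORD still has the placeholder value")
    return warnings
```

Running `lint_env(parse_env(open(".env").read()))` before deployment would return an empty list once both values have been set.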

View file

@@ -0,0 +1,92 @@
name: signoz

services:
  # ===========================================================================
  # ClickHouse — columnar storage for all telemetry data
  # ===========================================================================
  clickhouse:
    image: clickhouse/clickhouse-server:${CLICKHOUSE_VERSION:-25.5}
    restart: unless-stopped
    environment:
      CLICKHOUSE_DB: ${CLICKHOUSE_DB:-signoz}
      CLICKHOUSE_USER: ${CLICKHOUSE_USER:-admin}
      CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD:-change-me-clickhouse-password}
    volumes:
      - ./clickhouse-data:/var/lib/clickhouse
    healthcheck:
      test: wget --no-verbose --tries=1 --spider http://localhost:8123/ping || exit 1
      interval: 5s
      timeout: 3s
      retries: 30
      start_period: 10s
    networks:
      - signoz

  # ===========================================================================
  # SigNoz — all-in-one observability platform (query service + UI + collector)
  # ===========================================================================
  # Replaces both Alloy (OTEL collector) and LGTM (Grafana/Prometheus/Tempo/Loki).
  # Accepts OTLP gRPC (4317) and OTLP HTTP (4318) from all stacks.
  # UI on port 3301.
  #
  # Docs: https://signoz.io/docs/install/docker/
  # ===========================================================================
  signoz:
    image: signoz/signoz:${SIGNOZ_VERSION:-latest}
    restart: unless-stopped
    depends_on:
      clickhouse:
        condition: service_healthy
    environment:
      SIGNOZ_TELEMETRY_STORE: clickhouse
      DSN: tcp://clickhouse:9000
      CLICKHOUSE_USER: ${CLICKHOUSE_USER:-admin}
      CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD:-change-me-clickhouse-password}
      CLICKHOUSE_DATABASE: ${CLICKHOUSE_DB:-signoz}
      STORAGE: clickhouse
      CLICKHOUSE_ENDPOINT: tcp://clickhouse:9000
      SIGNOZ_CLICKHOUSE_DSN: tcp://clickhouse:9000
    ports:
      # SigNoz UI
      - ${EXPOSE_SIGNOZ_UI_PORT:-3301}:3301
      # OTLP gRPC receiver
      - ${EXPOSE_OTLP_GRPC_PORT:-4317}:4317
      # OTLP HTTP receiver
      - ${EXPOSE_OTLP_HTTP_PORT:-4318}:4318
    volumes:
      - ./signoz-data:/var/lib/signoz
    healthcheck:
      test:
        - CMD
        - wget
        - --no-verbose
        - --tries=1
        - --spider
        - http://localhost:3301/api/v1/health
      interval: 15s
      timeout: 5s
      retries: 10
      start_period: 30s
    networks:
      signoz: {}
      pipeline:
        aliases:
          - signoz
          - otel
          - lgtm
      # swag:
      #   aliases:
      #     - signoz

networks:
  signoz:
    name: signoz
    driver: bridge
  pipeline:
    name: pipeline
    external: true
  # swag:
  #   name: swag
  #   external: true

volumes: {}
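
The `pipeline` aliases in this compose file are what let stacks still pointing at `lgtm:4318` or `otel:4318` reach SigNoz. One rough way to confirm that after deployment is a TCP reachability check run from any container attached to the `pipeline` network. The sketch below assumes Python is available in that container; `can_connect` and `check_aliases` are hypothetical helpers, and a successful TCP connect only shows that something is listening, not that OTLP ingestion works.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def check_aliases(hosts, ports, timeout: float = 2.0) -> dict:
    """Map every (host, port) pair to whether it accepted a TCP connection."""
    return {
        (host, port): can_connect(host, port, timeout)
        for host in hosts
        for port in ports
    }


# Intended usage on the pipeline network (alias names and ports from this compose file):
#   check_aliases(["signoz", "otel", "lgtm"], [4317, 4318])
```

Any pair that maps to `False` points to either a missing alias or a receiver that is not listening on that port.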

View file

@@ -0,0 +1,37 @@
## -----------------------------------------------------------------------------
## SWAG proxy config for SigNoz
## Domain: signoz.ld50.xyz
## Upstream: signoz:3301 (shared Docker network: ${NETWORKS_EXTERNAL_NAME:-swag})
##
## Install:
## 1) Copy this file into SWAG: /config/nginx/proxy-confs/signoz.subdomain.conf
## 2) Ensure both stacks share the same external Docker network (e.g. `swag`).
## 3) In curated_compose/signoz/compose.yaml, uncomment the swag network + service attachment.
## 4) Reload SWAG.
## -----------------------------------------------------------------------------
server {
    listen 443 ssl;
    listen [::]:443 ssl;
    server_name signoz.ld50.xyz;
    include /config/nginx/ssl.conf;
    location / {
        include /config/nginx/proxy.conf;
        set $upstream_app signoz;
        set $upstream_port 3301;
        set $upstream_proto http;
        proxy_pass $upstream_proto://$upstream_app:$upstream_port;
        # SigNoz UI uses WebSocket for live query results
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_read_timeout 3600s;
        proxy_send_timeout 3600s;
    }
}