feat: signoz
parent dc39e02242
commit ebcc6e4d2d
4 changed files with 219 additions and 30 deletions
|
|
@ -1,6 +1,6 @@
|
||||||
# OTEL / Observability Integration Notes
|
# OTEL / Observability Integration Notes
|
||||||
|
|
||||||
> **Last updated**: 2026-06-15
|
> **Last updated**: 2026-06-16
|
||||||
> **Author**: Agent Zero analysis
|
> **Author**: Agent Zero analysis
|
||||||
> **Scope**: All `curated_compose` stacks
|
> **Scope**: All `curated_compose` stacks
|
||||||
|
|
||||||
|
|
@ -8,9 +8,10 @@
|
||||||
|
|
||||||
## TL;DR
|
## TL;DR
|
||||||
|
|
||||||
- **LGTM** is the central OTEL backend (traces, metrics, logs via Grafana/Tempo/Loki/Prometheus).
|
- **SigNoz** is the central OTEL backend (traces, metrics, logs, APM in one platform).
|
||||||
- **n8n** → LGTM directly (✅ working).
|
- **Alloy** and **LGTM** are deprecated — replaced by SigNoz.
|
||||||
- **Langfuse** → LGTM (Langfuse's own self-traces, ✅ working).
|
- **n8n** → SigNoz directly (✅ working).
|
||||||
|
- **Langfuse** → SigNoz (Langfuse's own self-traces, ✅ working).
|
||||||
- **Headroom** → Langfuse (intentional — LLM-specific observability).
|
- **Headroom** → Langfuse (intentional — LLM-specific observability).
|
||||||
- **Chroma** → ❌ not wired (env vars exist but compose ignores them).
|
- **Chroma** → ❌ not wired (env vars exist but compose ignores them).
|
||||||
- **Dify** → ❌ no OTEL support yet.
|
- **Dify** → ❌ no OTEL support yet.
|
||||||
|
|
@ -19,38 +20,67 @@
|
||||||
|
|
||||||
## Stack-by-Stack Telemetry Status
|
## Stack-by-Stack Telemetry Status
|
||||||
|
|
||||||
| Stack | Sends to LGTM? | Sends to Langfuse? | Configured? | Notes |
|
| Stack | Sends to SigNoz? | Sends to Langfuse? | Configured? | Notes |
|
||||||
|-------|---------------|-------------------|-------------|-------|
|
|-------|-----------------|-------------------|-------------|-------|
|
||||||
| **LGTM** | — | — | ✅ | Receives OTLP gRPC `:4317`, HTTP `:4318` |
|
| **SigNoz** | — | — | ✅ | Receives OTLP gRPC `:4317`, HTTP `:4318` |
|
||||||
| **n8n** | ✅ Yes | ❌ No | ✅ Active | `OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318` on main + worker |
|
| **n8n** | ✅ Yes | ❌ No | ✅ Active | `OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318` on main + worker |
|
||||||
| **Langfuse** | ✅ Yes | — | ✅ Active | Own traces to LGTM; stack at `docker/langfuse/compose.yaml` |
|
| **Langfuse** | ✅ Yes | — | ✅ Active | Own traces to SigNoz; stack at `docker/langfuse/compose.yaml` |
|
||||||
| **Headroom** | ❌ No | ✅ Yes | ✅ Active | `OTEL_EXPORTER_OTLP_ENDPOINT=http://langfuse-web:3000/api/public/otel/v1` |
|
| **Headroom** | ❌ No | ✅ Yes | ✅ Active | `OTEL_EXPORTER_OTLP_ENDPOINT=http://langfuse-web:3000/api/public/otel/v1` |
|
||||||
| **Chroma** | ❌ No | ❌ No | ❌ Not wired | `.env.example` has `CHROMA_OPEN_TELEMETRY__ENDPOINT`, compose ignores it |
|
| **Chroma** | ❌ No | ❌ No | ❌ Not wired | `.env.example` has `CHROMA_OPEN_TELEMETRY__ENDPOINT`, compose ignores it |
|
||||||
| **Dify** | ❌ No | ❌ No | ❌ None | No OTEL env vars in compose or `.env.example` |
|
| **Dify** | ❌ No | ❌ No | ❌ None | No OTEL env vars in compose or `.env.example` |
|
||||||
|
| **Zitadel** | ✅ Yes | ❌ No | ✅ Active | `ZITADEL_INSTRUMENTATION_TRACE_EXPORTER_ENDPOINT=http://lgtm:4318` |
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Architecture
|
## Architecture
|
||||||
|
|
||||||
```
|
```
|
||||||
Headroom Proxy ──OTEL──→ Langfuse ──OTEL──→ LGTM
|
Headroom Proxy ──OTEL──→ Langfuse ──OTEL──→ SigNoz
|
||||||
│
|
│
|
||||||
└── ClickHouse (analytics)
|
└── ClickHouse (analytics)
|
||||||
└── Postgres (metadata)
|
└── Postgres (metadata)
|
||||||
└── Redis (queues)
|
└── Redis (queues)
|
||||||
└── MinIO (S3 storage)
|
└── MinIO (S3 storage)
|
||||||
|
|
||||||
n8n (main + worker) ──OTEL──→ LGTM
|
n8n (main + worker) ──OTEL──→ SigNoz
|
||||||
|
Zitadel ──────────────OTEL──→ SigNoz
|
||||||
|
|
||||||
[Chroma] ──❌──→ LGTM
|
[Chroma] ──❌──→ SigNoz
|
||||||
[Dify] ──❌──→ LGTM
|
[Dify] ──❌──→ SigNoz
|
||||||
```
|
```
|
||||||
|
|
||||||
### Why Headroom → Langfuse (not direct to LGTM)?
|
### Why Headroom → Langfuse (not direct to SigNoz)?
|
||||||
|
|
||||||
Langfuse is purpose-built for LLM observability — it tracks cost per token, prompt versions, user attribution, and LLM-specific metrics that Tempo/Grafana don't natively understand. Headroom's traces are most valuable inside Langfuse.
|
Langfuse is purpose-built for LLM observability — it tracks cost per token, prompt versions, user attribution, and LLM-specific metrics that SigNoz doesn't natively understand. Headroom's traces are most valuable inside Langfuse.
|
||||||
|
|
||||||
Langfuse then exports its own internal traces to LGTM for infrastructure-wide correlation.
|
Langfuse then exports its own internal traces to SigNoz for infrastructure-wide correlation.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Migration Notes (Alloy + LGTM → SigNoz)
|
||||||
|
|
||||||
|
### What changed
|
||||||
|
|
||||||
|
- **Removed**: `docker/alloy/` stack (OTEL collector) and `docker/lgtm/` stack (Grafana all-in-one)
|
||||||
|
- **Added**: `docker/signoz/` stack (all-in-one observability: collector + UI + storage)
|
||||||
|
- **SigNoz pipeline aliases**: `signoz`, `otel`, `lgtm` — existing stacks referencing `lgtm:4318` or `otel:4318` continue to work without changes
|
||||||
|
|
||||||
|
### Cross-stack endpoint mapping
|
||||||
|
|
||||||
|
| Old | New | Notes |
|
||||||
|
|-----|-----|-------|
|
||||||
|
| `alloy:4317` (gRPC) | `signoz:4317` (gRPC) | Same port, new host |
|
||||||
|
| `alloy:4318` / `lgtm:4318` (HTTP) | `signoz:4318` (HTTP) | Same port, new host |
|
||||||
|
| `lgtm:3000` (Grafana UI) | `signoz:3301` (SigNoz UI) | Different port |
|
||||||
|
|
||||||
|
### Stacks that need endpoint updates
|
||||||
|
|
||||||
|
Stacks with hardcoded `lgtm:4318` in their compose will still resolve via the `lgtm` alias on the `pipeline` network. No immediate changes required, but consider updating to `signoz:4318` for clarity:
|
||||||
|
|
||||||
|
- `docker/n8n/docker-compose.yaml` — `OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318`
|
||||||
|
- `docker/langfuse/compose.yaml` — `OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318`
|
||||||
|
- `docker/chroma/compose.yaml` — `CHROMA_OPEN_TELEMETRY__ENDPOINT=http://lgtm:4318`
|
||||||
|
- `docker/zitadel/compose.yaml` — `ZITADEL_INSTRUMENTATION_TRACE_EXPORTER_ENDPOINT=http://lgtm:4318`
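
For stacks that do opt in to the clearer hostnames, the rename can be expressed as a small text rewrite. The following is a minimal sketch, assuming the only endpoints of interest are the OTLP rows of the mapping table; `ENDPOINT_MAP` and `rewrite_endpoints` are hypothetical names introduced here for illustration, not part of this repository.

```python
import re

# Old -> new OTLP endpoints, taken from the cross-stack endpoint mapping table.
# The UI row (lgtm:3000 -> signoz:3301) is deliberately left out: this sketch
# only touches OTLP receivers.
ENDPOINT_MAP = {
    "alloy:4317": "signoz:4317",
    "alloy:4318": "signoz:4318",
    "lgtm:4318": "signoz:4318",
}

# Match host:port only when it is not part of a longer hostname or port number.
_PATTERN = re.compile(
    r"(?<![\w.-])(" + "|".join(re.escape(k) for k in ENDPOINT_MAP) + r")(?!\d)"
)


def rewrite_endpoints(text: str) -> str:
    """Return `text` with legacy OTLP host:port pairs replaced by SigNoz ones."""
    return _PATTERN.sub(lambda m: ENDPOINT_MAP[m.group(1)], text)


if __name__ == "__main__":
    print(rewrite_endpoints("OTEL_EXPORTER_OTLP_ENDPOINT=http://lgtm:4318"))
```

Because the `lgtm` alias keeps resolving on the `pipeline` network, a rewrite like this is cosmetic and can be applied one stack at a time.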
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|
@ -69,7 +99,7 @@ But `docker/chroma/compose.yaml` does **not** pass these env vars into the `chro
|
||||||
|
|
||||||
**Fix**: Add to `compose.yaml` service `environment:`:
|
**Fix**: Add to `compose.yaml` service `environment:`:
|
||||||
```yaml
|
```yaml
|
||||||
CHROMA_OPEN_TELEMETRY__ENDPOINT: ${CHROMA_OPEN_TELEMETRY__ENDPOINT:-http://lgtm:4318}
|
CHROMA_OPEN_TELEMETRY__ENDPOINT: ${CHROMA_OPEN_TELEMETRY__ENDPOINT:-http://signoz:4318}
|
||||||
CHROMA_OPEN_TELEMETRY__SERVICE_NAME: ${CHROMA_OPEN_TELEMETRY__SERVICE_NAME:-chromadb}
|
CHROMA_OPEN_TELEMETRY__SERVICE_NAME: ${CHROMA_OPEN_TELEMETRY__SERVICE_NAME:-chromadb}
|
||||||
OTEL_EXPORTER_OTLP_HEADERS: ${OTEL_EXPORTER_OTLP_HEADERS:-}
|
OTEL_EXPORTER_OTLP_HEADERS: ${OTEL_EXPORTER_OTLP_HEADERS:-}
|
||||||
```
|
```
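
The `${NAME:-default}` form used in the fix is Compose variable interpolation: the default applies when the variable is unset or empty. The sketch below emulates only that one form, to make the fallback behaviour concrete. `interpolate` is a hypothetical helper, and real Compose supports further forms (`${NAME-default}`, `${NAME:?error}`, and others) that are not handled here.

```python
import re

# Matches ${NAME:-default}; the default may be empty,
# as in ${OTEL_EXPORTER_OTLP_HEADERS:-}.
_VAR = re.compile(r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*):-(?P<default>[^}]*)\}")


def interpolate(value: str, env: dict[str, str]) -> str:
    """Expand ${NAME:-default}: use env[NAME] if set and non-empty, else default."""

    def _sub(match: re.Match) -> str:
        # An unset (None) or empty ("") variable both fall through to the default.
        return env.get(match.group("name")) or match.group("default")

    return _VAR.sub(_sub, value)


if __name__ == "__main__":
    template = "${CHROMA_OPEN_TELEMETRY__ENDPOINT:-http://signoz:4318}"
    print(interpolate(template, {}))
    print(interpolate(template, {"CHROMA_OPEN_TELEMETRY__ENDPOINT": "http://lgtm:4318"}))
```

In practice this means a deployment whose `.env` still sets the old `lgtm` endpoint keeps using it, and only deployments without an override pick up the new `signoz` default.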
|
||||||
|
|
@ -80,15 +110,9 @@ OTEL_EXPORTER_OTLP_HEADERS: ${OTEL_EXPORTER_OTLP_HEADERS:-}
|
||||||
|
|
||||||
**Recommendation**: Wait for upstream Dify to add native OTEL support. Do not create custom patches per SKILL.md conventions.
|
**Recommendation**: Wait for upstream Dify to add native OTEL support. Do not create custom patches per SKILL.md conventions.
|
||||||
|
|
||||||
### 🟢 Langfuse Network Verification
|
|
||||||
|
|
||||||
**Status**: Headroom joins external network `langfuse` with `name: langfuse_langfuse`. This is auto-created by Docker Compose from the `docker/langfuse/` directory. **This should work** on deployment.
|
|
||||||
|
|
||||||
**Verify after deploy**: `docker network inspect langfuse_langfuse` should show both `langfuse-web` and `headroom-proxy` containers.
|
|
||||||
|
|
||||||
### 🟡 Unified Log Collection
|
### 🟡 Unified Log Collection
|
||||||
|
|
||||||
All stacks emit container logs. For collecting these into LGTM/Loki:
|
All stacks emit container logs. For collecting these into SigNoz:
|
||||||
|
|
||||||
- **Option A** (simplest): Configure Docker daemon with Loki log driver globally on Unraid.
|
- **Option A** (simplest): Configure Docker daemon with Loki log driver globally on Unraid.
|
||||||
- **Option B** (per-stack): Add Promtail sidecar to each compose.
|
- **Option B** (per-stack): Add Promtail sidecar to each compose.
|
||||||
|
|
@ -101,14 +125,15 @@ All stacks emit container logs. For collecting these into LGTM/Loki:
|
||||||
|
|
||||||
| File | Purpose |
|
| File | Purpose |
|
||||||
|------|---------|
|
|------|---------|
|
||||||
|
| `docker/signoz/compose.yaml` | SigNoz observability stack (replaces Alloy + LGTM) |
|
||||||
|
| `docker/signoz/.env.example` | SigNoz config |
|
||||||
| `docker/chroma/compose.yaml` | Chroma vector DB stack |
|
| `docker/chroma/compose.yaml` | Chroma vector DB stack |
|
||||||
| `docker/chroma/.env.example` | Chroma config (OTEL vars present) |
|
| `docker/chroma/.env.example` | Chroma config (OTEL vars present) |
|
||||||
| `docker/dify/docker-compose.yaml` | Dify LLM platform |
|
| `docker/dify/docker-compose.yaml` | Dify LLM platform |
|
||||||
| `docker/dify/.env.example` | Dify config (no OTEL vars) |
|
| `docker/dify/.env.example` | Dify config (no OTEL vars) |
|
||||||
| `docker/headroom/compose.yaml` | Headroom LLM proxy |
|
| `docker/headroom/compose.yaml` | Headroom LLM proxy |
|
||||||
| `docker/langfuse/compose.yaml` | Langfuse observability |
|
| `docker/langfuse/compose.yaml` | Langfuse observability |
|
||||||
| `docker/lgtm/docker-compose.yaml` | LGTM (OTEL backend) |
|
|
||||||
| `docker/lgtm/.env.example` | LGTM config |
|
|
||||||
| `docker/n8n/docker-compose.yaml` | n8n automation |
|
| `docker/n8n/docker-compose.yaml` | n8n automation |
|
||||||
| `docker/n8n/.env.example` | n8n config (OTEL vars present) |
|
| `docker/n8n/.env.example` | n8n config (OTEL vars present) |
|
||||||
|
| `docker/zitadel/compose.yaml` | Zitadel IAM |
|
||||||
| `SKILL.md` | Homelab conventions and design rules |
|
| `SKILL.md` | Homelab conventions and design rules |
|
||||||
|
|
|
||||||
35
docker/signoz/.env.example
Normal file
|
|
@ -0,0 +1,35 @@
|
||||||
|
# =============================================================================
|
||||||
|
# SigNoz — OpenTelemetry Observability Platform
|
||||||
|
# =============================================================================
|
||||||
|
# Copy to .env and edit for your deployment.
|
||||||
|
# cp .env.example .env
|
||||||
|
# The actual .env is deployed by Dockhand and should not be committed.
|
||||||
|
#
|
||||||
|
# Replaces both Alloy (OTEL collector) and LGTM (Grafana/Prometheus/Tempo/Loki).
|
||||||
|
# All stacks should point their OTLP exporters to signoz on the pipeline network.
|
||||||
|
# =============================================================================
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# SigNoz Image Version
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# Pin a specific version for reproducibility. Check releases at:
|
||||||
|
# https://github.com/SigNoz/signoz/releases
|
||||||
|
SIGNOZ_VERSION=latest
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# ClickHouse
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
CLICKHOUSE_VERSION=25.5
|
||||||
|
CLICKHOUSE_DB=signoz
|
||||||
|
CLICKHOUSE_USER=admin
|
||||||
|
CLICKHOUSE_PASSWORD=change-me-clickhouse-password
|
||||||
|
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# Exposed Ports
|
||||||
|
# -----------------------------------------------------------------------------
|
||||||
|
# SigNoz UI
|
||||||
|
EXPOSE_SIGNOZ_UI_PORT=3301
|
||||||
|
# OTLP gRPC receiver (used by instrumented apps/services)
|
||||||
|
EXPOSE_OTLP_GRPC_PORT=4317
|
||||||
|
# OTLP HTTP receiver (used by instrumented apps/services)
|
||||||
|
EXPOSE_OTLP_HTTP_PORT=4318
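
The example file ships a placeholder password and an unpinned `latest` tag, although its own comment recommends pinning a version. A pre-deploy check along the following lines could catch a `.env` that was copied without editing. This is a sketch under assumptions: `parse_env` and `find_unsafe_defaults` are hypothetical helpers, and the parser handles only plain `KEY=VALUE` lines, not quoting or `export` prefixes.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def find_unsafe_defaults(env: dict[str, str]) -> list[str]:
    """Return warnings for values that look like untouched example defaults."""
    warnings = []
    for key, value in env.items():
        if value.startswith("change-me"):
            warnings.append(f"{key} still holds a placeholder value")
    # A missing SIGNOZ_VERSION falls back to `latest` in the compose file.
    if env.get("SIGNOZ_VERSION", "latest") == "latest":
        warnings.append("SIGNOZ_VERSION is not pinned to a release")
    return warnings


if __name__ == "__main__":
    sample = "SIGNOZ_VERSION=latest\nCLICKHOUSE_PASSWORD=change-me-clickhouse-password\n"
    for warning in find_unsafe_defaults(parse_env(sample)):
        print(warning)
```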
|
||||||
92
docker/signoz/compose.yaml
Normal file
|
|
@ -0,0 +1,92 @@
|
||||||
|
name: signoz
|
||||||
|
|
||||||
|
services:
|
||||||
|
# ===========================================================================
|
||||||
|
# ClickHouse — columnar storage for all telemetry data
|
||||||
|
# ===========================================================================
|
||||||
|
clickhouse:
|
||||||
|
image: clickhouse/clickhouse-server:${CLICKHOUSE_VERSION:-25.5}
|
||||||
|
restart: unless-stopped
|
||||||
|
environment:
|
||||||
|
CLICKHOUSE_DB: ${CLICKHOUSE_DB:-signoz}
|
||||||
|
CLICKHOUSE_USER: ${CLICKHOUSE_USER:-admin}
|
||||||
|
CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD:-change-me-clickhouse-password}
|
||||||
|
volumes:
|
||||||
|
- ./clickhouse-data:/var/lib/clickhouse
|
||||||
|
healthcheck:
|
||||||
|
test: wget --no-verbose --tries=1 --spider http://localhost:8123/ping || exit 1
|
||||||
|
interval: 5s
|
||||||
|
timeout: 3s
|
||||||
|
retries: 30
|
||||||
|
start_period: 10s
|
||||||
|
networks:
|
||||||
|
- signoz
|
||||||
|
|
||||||
|
# ===========================================================================
|
||||||
|
# SigNoz — all-in-one observability platform (query service + UI + collector)
|
||||||
|
# ===========================================================================
|
||||||
|
# Replaces both Alloy (OTEL collector) and LGTM (Grafana/Prometheus/Tempo/Loki).
|
||||||
|
# Accepts OTLP gRPC (4317) and OTLP HTTP (4318) from all stacks.
|
||||||
|
# UI on port 3301.
|
||||||
|
#
|
||||||
|
# Docs: https://signoz.io/docs/install/docker/
|
||||||
|
# ===========================================================================
|
||||||
|
signoz:
|
||||||
|
image: signoz/signoz:${SIGNOZ_VERSION:-latest}
|
||||||
|
restart: unless-stopped
|
||||||
|
depends_on:
|
||||||
|
clickhouse:
|
||||||
|
condition: service_healthy
|
||||||
|
environment:
|
||||||
|
SIGNOZ_TELEMETRY_STORE: clickhouse
|
||||||
|
DSN: tcp://clickhouse:9000
|
||||||
|
CLICKHOUSE_USER: ${CLICKHOUSE_USER:-admin}
|
||||||
|
CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD:-change-me-clickhouse-password}
|
||||||
|
CLICKHOUSE_DATABASE: ${CLICKHOUSE_DB:-signoz}
|
||||||
|
STORAGE: clickhouse
|
||||||
|
CLICKHOUSE_ENDPOINT: tcp://clickhouse:9000
|
||||||
|
SIGNOZ_CLICKHOUSE_DSN: tcp://clickhouse:9000
|
||||||
|
ports:
|
||||||
|
# SigNoz UI
|
||||||
|
- ${EXPOSE_SIGNOZ_UI_PORT:-3301}:3301
|
||||||
|
# OTLP gRPC receiver
|
||||||
|
- ${EXPOSE_OTLP_GRPC_PORT:-4317}:4317
|
||||||
|
# OTLP HTTP receiver
|
||||||
|
- ${EXPOSE_OTLP_HTTP_PORT:-4318}:4318
|
||||||
|
volumes:
|
||||||
|
- ./signoz-data:/var/lib/signoz
|
||||||
|
healthcheck:
|
||||||
|
test:
|
||||||
|
- CMD
|
||||||
|
- wget
|
||||||
|
- --no-verbose
|
||||||
|
- --tries=1
|
||||||
|
- --spider
|
||||||
|
- http://localhost:3301/api/v1/health
|
||||||
|
interval: 15s
|
||||||
|
timeout: 5s
|
||||||
|
retries: 10
|
||||||
|
start_period: 30s
|
||||||
|
networks:
|
||||||
|
signoz: {}
|
||||||
|
pipeline:
|
||||||
|
aliases:
|
||||||
|
- signoz
|
||||||
|
- otel
|
||||||
|
- lgtm
|
||||||
|
# swag:
|
||||||
|
# aliases:
|
||||||
|
# - signoz
|
||||||
|
|
||||||
|
networks:
|
||||||
|
signoz:
|
||||||
|
name: signoz
|
||||||
|
driver: bridge
|
||||||
|
pipeline:
|
||||||
|
name: pipeline
|
||||||
|
external: true
|
||||||
|
# swag:
|
||||||
|
# name: swag
|
||||||
|
# external: true
|
||||||
|
|
||||||
|
volumes: {}
|
||||||
37
docker/signoz/swag/signoz.subdomain.conf
Normal file
|
|
@ -0,0 +1,37 @@
|
||||||
|
## -----------------------------------------------------------------------------
|
||||||
|
## SWAG proxy config for SigNoz
|
||||||
|
## Domain: signoz.ld50.xyz
|
||||||
|
## Upstream: signoz:3301 (shared Docker network: ${NETWORKS_EXTERNAL_NAME:-swag})
|
||||||
|
##
|
||||||
|
## Install:
|
||||||
|
## 1) Copy this file into SWAG: /config/nginx/proxy-confs/signoz.subdomain.conf
|
||||||
|
## 2) Ensure both stacks share the same external Docker network (e.g. `swag`).
|
||||||
|
## 3) In docker/signoz/compose.yaml, uncomment the swag network + service attachment.
|
||||||
|
## 4) Reload SWAG.
|
||||||
|
## -----------------------------------------------------------------------------
|
||||||
|
|
||||||
|
server {
|
||||||
|
listen 443 ssl;
|
||||||
|
listen [::]:443 ssl;
|
||||||
|
|
||||||
|
server_name signoz.ld50.xyz;
|
||||||
|
|
||||||
|
include /config/nginx/ssl.conf;
|
||||||
|
|
||||||
|
location / {
|
||||||
|
include /config/nginx/proxy.conf;
|
||||||
|
|
||||||
|
set $upstream_app signoz;
|
||||||
|
set $upstream_port 3301;
|
||||||
|
set $upstream_proto http;
|
||||||
|
|
||||||
|
proxy_pass $upstream_proto://$upstream_app:$upstream_port;
|
||||||
|
|
||||||
|
# SigNoz UI uses WebSocket for live query results
|
||||||
|
proxy_set_header Upgrade $http_upgrade;
|
||||||
|
proxy_set_header Connection "upgrade";
|
||||||
|
|
||||||
|
proxy_read_timeout 3600s;
|
||||||
|
proxy_send_timeout 3600s;
|
||||||
|
}
|
||||||
|
}
|
||||||