Claude Code Monitoring
Claude Code exports telemetry over OTLP (the OpenTelemetry Protocol): metrics go to Prometheus and logs go to Loki. Both are visualized in a dedicated Grafana dashboard.
Architecture
graph LR
    CC[Claude Code CLI]
    CC -->|http/protobuf OTLP metrics| Prom[Prometheus\nOTLP receiver]
    CC -->|http/protobuf OTLP logs| Loki[Loki]
    Prom --> Grafana[Grafana\nClaude Code dashboard]
    Loki --> Grafana
Configuration
Telemetry is configured in ~/.claude/settings.json (managed in the dotfiles repo):
{
  "env": {
    "CLAUDE_CODE_ENABLE_TELEMETRY": "1",
    "OTEL_METRICS_EXPORTER": "otlp",
    "OTEL_LOGS_EXPORTER": "otlp",
    "OTEL_EXPORTER_OTLP_METRICS_PROTOCOL": "http/protobuf",
    "OTEL_EXPORTER_OTLP_METRICS_ENDPOINT": "https://prometheus.hdhomelab.com/api/v1/otlp/v1/metrics",
    "OTEL_EXPORTER_OTLP_LOGS_PROTOCOL": "http/protobuf",
    "OTEL_EXPORTER_OTLP_LOGS_ENDPOINT": "https://loki.hdhomelab.com/otlp/v1/logs",
    "OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE": "cumulative"
  }
}
Signal-specific endpoint path
OTEL_EXPORTER_OTLP_METRICS_ENDPOINT is used as-is — the SDK does not append /v1/metrics. The full path must be specified. The general OTEL_EXPORTER_OTLP_ENDPOINT auto-appends signal paths, but signal-specific vars do not.
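The difference between the two kinds of variable can be sketched in a few lines of Python. This is a simplified model of the rule described above, not the SDK's actual code. The function name `resolve_metrics_url` and the base-only URLs in the usage lines are hypothetical and exist only for illustration.

```python
from typing import Dict, Optional


def resolve_metrics_url(env: Dict[str, str]) -> Optional[str]:
    """Return the URL an OTLP/HTTP metrics exporter would post to (simplified model)."""
    specific = env.get("OTEL_EXPORTER_OTLP_METRICS_ENDPOINT")
    if specific:
        # Signal-specific variable: used exactly as written, nothing is appended.
        return specific
    general = env.get("OTEL_EXPORTER_OTLP_ENDPOINT")
    if general:
        # General variable: the per-signal path is appended to the base URL.
        return general.rstrip("/") + "/v1/metrics"
    return None


# Hypothetical inputs: the same base URL given through each kind of variable.
base = "https://prometheus.hdhomelab.com/api/v1/otlp"
print(resolve_metrics_url({"OTEL_EXPORTER_OTLP_METRICS_ENDPOINT": base}))
print(resolve_metrics_url({"OTEL_EXPORTER_OTLP_ENDPOINT": base}))
```

Under this model the first call returns the base URL unchanged, which is missing the `/v1/metrics` suffix that the receiver path in this setup ends with; the second call returns the full path.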
Key settings
| Variable | Value | Notes |
|---|---|---|
| OTEL_EXPORTER_OTLP_METRICS_TEMPORALITY_PREFERENCE | cumulative | Required for Prometheus _total counters |
| OTEL_METRIC_EXPORT_INTERVAL | 60000 ms (default) | Set to 10000 temporarily when debugging |
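As a rough illustration of why the temporality setting matters: with cumulative temporality every export carries the running total, which is the shape a Prometheus counter expects, whereas delta temporality would send only the change since the previous export. The numbers below are made up for the example.

```python
from itertools import accumulate

# Hypothetical per-export increments (what delta temporality would report).
deltas = [2, 0, 3, 1]

# Cumulative temporality reports the running total at each export instead.
cumulative = list(accumulate(deltas))

print(cumulative)  # [2, 2, 5, 6]
```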
Prometheus OTLP receiver
The native OTLP receiver is enabled in flux/monitoring/noah/kube-prometheus-stack/helmrelease.yaml:
Metrics are received at https://prometheus.hdhomelab.com/api/v1/otlp/v1/metrics.
Metrics
| Metric | Description |
|---|---|
| claude_code_cost_usage_USD_total | API cost in USD, labeled by model |
| claude_code_token_usage_tokens_total | Tokens used, labeled by type (input/output/cacheRead/cacheCreation) |
| claude_code_session_count_total | CLI sessions started |
| claude_code_active_time_seconds_total | Active time, labeled by type (user/cli) |
| claude_code_lines_of_code_count_total | Lines of code, labeled by type (added/removed) |
| claude_code_commit_count_total | Git commits created via Claude |
| claude_code_pull_request_count_total | Pull requests opened via Claude |
| claude_code_code_edit_tool_decision_total | Edit tool decisions, labeled by decision and tool_name |
Note
commit_count and pull_request_count are emitted only when Claude Code actually runs git commit or opens a PR via the Bash tool. Until that happens, neither series appears in Prometheus.
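To show how the labels in the table can be consumed outside Grafana, here is a minimal sketch that totals cost per model from a Prometheus instant-query response. The response body is a hand-written stand-in in the shape of the Prometheus HTTP API; the model names and values are invented, and `cost_by_model` is a hypothetical helper.

```python
import json
from collections import defaultdict

# Hypothetical body of an instant query over claude_code_cost_usage_USD_total.
# Model names and values are invented for illustration.
SAMPLE = """
{"status": "success",
 "data": {"resultType": "vector",
          "result": [
            {"metric": {"model": "model-a"}, "value": [1700000000, "1.25"]},
            {"metric": {"model": "model-b"}, "value": [1700000000, "0.5"]}
          ]}}
"""


def cost_by_model(body: str) -> dict:
    """Sum sample values per 'model' label from an instant-query response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError("query did not succeed")
    totals = defaultdict(float)
    for series in payload["data"]["result"]:
        model = series["metric"].get("model", "unknown")
        _timestamp, value = series["value"]
        totals[model] += float(value)  # sample values arrive as strings
    return dict(totals)


print(cost_by_model(SAMPLE))
```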
Grafana dashboard
The dashboard is stored at flux/monitoring/noah/grafana-dashboards/claude-code.json and served from the claude-code ConfigMap with the grafana_dashboard: "1" label.
PromQL notes
Count panels (sessions, commits, PRs, lines of code) use max_over_time - min_over_time rather than increase():
sum(max_over_time(claude_code_commit_count_total[$__range]))
- sum(min_over_time(claude_code_commit_count_total[$__range]))
increase() extrapolates at time range boundaries and returns fractional values even for integer counters. The max - min approach reads exact counter values and avoids extrapolation entirely.
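The effect can be reproduced with a simplified model of the extrapolation. The sketch below is an approximation written for this page, not Prometheus's actual code: it ignores counter resets, and the sample timestamps and values are invented.

```python
def increase_extrapolated(samples, range_start, range_end):
    """Approximate increase(): scale the raw delta out to the window edges.

    samples: sorted (timestamp_seconds, value) pairs with no counter resets.
    """
    if len(samples) < 2:
        return None
    first_t, first_v = samples[0]
    last_t, last_v = samples[-1]
    raw = last_v - first_v
    sampled = last_t - first_t
    avg_gap = sampled / (len(samples) - 1)
    threshold = avg_gap * 1.1
    to_start = first_t - range_start
    to_end = range_end - last_t
    # Far from an edge: extrapolate only half an average gap in that direction.
    if to_start >= threshold:
        to_start = avg_gap / 2
    # A counter cannot be extrapolated backwards past zero.
    if raw > 0 and first_v >= 0:
        to_start = min(to_start, sampled * (first_v / raw))
    if to_end >= threshold:
        to_end = avg_gap / 2
    return raw * (sampled + to_start + to_end) / sampled


def increase_exact(samples):
    """max_over_time - min_over_time: reads the stored values directly."""
    values = [v for _, v in samples]
    return max(values) - min(values)


# Hypothetical commit counter exported every 60s, queried over a 300s window.
samples = [(30, 3), (90, 3), (150, 4), (210, 5), (270, 5)]
print(increase_extrapolated(samples, 0, 300))  # 2.5
print(increase_exact(samples))                 # 2
```

Two commits happened in the window, yet the extrapolated figure is 2.5. Note that the max - min form assumes the counter never resets inside the window.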
The token rate panel uses a fixed 5-minute window:
5m is 5× the 60s export interval, which clears the commonly used 4× minimum and leaves rate() with at least two data points even when an export arrives late. $__rate_interval is not used because Grafana cannot determine a scrape interval for OTLP push metrics.
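The arithmetic behind this choice can be checked with a short sketch. It assumes the formula Grafana documents for `$__rate_interval` and the Prometheus data source's default 15s scrape interval setting; the 15s panel interval is an arbitrary example, and both function names are hypothetical.

```python
def grafana_rate_interval(interval_s: int, scrape_interval_s: int = 15) -> int:
    """$__rate_interval as documented: max($__interval + scrape, 4 * scrape)."""
    return max(interval_s + scrape_interval_s, 4 * scrape_interval_s)


def min_samples_in_window(window_s: int, export_interval_s: int = 60) -> int:
    """Lower bound on evenly spaced samples that fall inside a window."""
    return window_s // export_interval_s


auto_window = grafana_rate_interval(interval_s=15)
print(auto_window)                         # 60
print(min_samples_in_window(auto_window))  # 1, too few for rate()
print(min_samples_in_window(300))          # 5 with the fixed 5m window
```

With only one sample guaranteed in the window, rate() has nothing to compare against and returns no data, which is why the fixed window is used here.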