feat(observability): implement Prometheus metrics backend with /metrics endpoint

- Adds PrometheusObserver backend with counters, histograms, and gauges
- Tracks agent starts/duration, tool calls, channel messages, heartbeat ticks, errors, request latency, tokens, sessions, queue depth
- Adds GET /metrics endpoint to gateway for Prometheus scraping
- Adds provider/model labels to AgentStart and AgentEnd events for better observability
- Adds as_any() method to Observer trait for backend-specific downcast
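
The as_any() addition can be illustrated with a minimal sketch. Everything below is a pared-down stand-in rather than the project's actual code: the `name` method, the `agent_starts` field, the `encode` method, `NoopObserver`, and `render_metrics` are all hypothetical. It only shows how a handler holding a `&dyn Observer` might recover the concrete Prometheus backend through `std::any::Any`.

```rust
use std::any::Any;

// Hypothetical, reduced version of an Observer trait.
trait Observer {
    fn name(&self) -> &str;
    // Exposes the concrete type so callers can downcast to a specific backend.
    fn as_any(&self) -> &dyn Any;
}

// Stand-in backend; the field is an assumption made for illustration.
struct PrometheusObserver {
    agent_starts: u64,
}

impl PrometheusObserver {
    // Hypothetical backend-specific method that is not part of the trait.
    fn encode(&self) -> String {
        format!("zeroclaw_agent_starts_total {}\n", self.agent_starts)
    }
}

impl Observer for PrometheusObserver {
    fn name(&self) -> &str {
        "prometheus"
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// A second backend that has nothing to export.
struct NoopObserver;

impl Observer for NoopObserver {
    fn name(&self) -> &str {
        "noop"
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

// What a /metrics handler could do: return text only when the active
// observer really is the Prometheus backend.
fn render_metrics(observer: &dyn Observer) -> Option<String> {
    observer
        .as_any()
        .downcast_ref::<PrometheusObserver>()
        .map(|p| p.encode())
}
```

With this shape, any other backend simply yields `None`, which a handler could map to an error response of its choosing.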

Metrics exposed:
- zeroclaw_agent_starts_total (Counter) with provider/model labels
- zeroclaw_agent_duration_seconds (Histogram) with provider/model labels
- zeroclaw_tool_calls_total (Counter) with tool/success labels
- zeroclaw_tool_duration_seconds (Histogram) with tool label
- zeroclaw_channel_messages_total (Counter) with channel/direction labels
- zeroclaw_heartbeat_ticks_total (Counter)
- zeroclaw_errors_total (Counter) with component label
- zeroclaw_request_latency_seconds (Histogram)
- zeroclaw_tokens_used_last (Gauge)
- zeroclaw_active_sessions (Gauge)
- zeroclaw_queue_depth (Gauge)
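
As a rough illustration of what a scrape of one of these metrics might return, the sketch below hand-rolls a labeled counter and renders it in the Prometheus text exposition format. It is not the commit's PrometheusObserver, which presumably relies on a metrics library; the `AgentStarts` type and its methods are hypothetical, and label-value escaping is deliberately left out.

```rust
use std::collections::BTreeMap;

// Hypothetical labeled counter keyed by (provider, model).
// A BTreeMap keeps the rendered output in a stable order.
#[derive(Default)]
struct AgentStarts {
    counts: BTreeMap<(String, String), u64>,
}

impl AgentStarts {
    // Increment the counter for one provider/model label pair.
    fn inc(&mut self, provider: &str, model: &str) {
        *self
            .counts
            .entry((provider.to_string(), model.to_string()))
            .or_insert(0) += 1;
    }

    // Render the counter as Prometheus text exposition lines.
    // Assumes label values contain no quotes, backslashes, or newlines.
    fn render(&self) -> String {
        let mut out = String::from("# TYPE zeroclaw_agent_starts_total counter\n");
        for ((provider, model), n) in &self.counts {
            out.push_str(&format!(
                "zeroclaw_agent_starts_total{{provider=\"{provider}\",model=\"{model}\"}} {n}\n"
            ));
        }
        out
    }
}
```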

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
argenis de la rosa 2026-02-17 14:01:37 -05:00 committed by Chummy
parent c04f2855e4
commit eba544dbd4
11 changed files with 575 additions and 228 deletions


@@ -1129,4 +1129,40 @@ mod tests {
             "https://opencode.ai/zen/v1/chat/completions"
         );
     }
+    // ══════════════════════════════════════════════════════════
+    // Issue #580: Custom provider URL construction tests
+    // ══════════════════════════════════════════════════════════
+    #[test]
+    fn chat_completions_url_custom_provider_opencode_issue_580() {
+        // Issue #580: Custom provider should correctly append /chat/completions
+        // The error log format "{provider_name}/{current_model}" was confusing
+        // but the actual URL construction was always correct.
+        let p = make_provider("custom", "https://opencode.ai/zen/v1", None);
+        assert_eq!(
+            p.chat_completions_url(),
+            "https://opencode.ai/zen/v1/chat/completions"
+        );
+    }
+    #[test]
+    fn chat_completions_url_custom_provider_standard() {
+        // Standard custom provider without /v1 path
+        let p = make_provider("custom", "https://my-api.example.com", None);
+        assert_eq!(
+            p.chat_completions_url(),
+            "https://my-api.example.com/chat/completions"
+        );
+    }
+    #[test]
+    fn chat_completions_url_custom_provider_with_v1() {
+        // Custom provider with /v1 path
+        let p = make_provider("custom", "https://my-api.example.com/v1", None);
+        assert_eq!(
+            p.chat_completions_url(),
+            "https://my-api.example.com/v1/chat/completions"
+        );
+    }
 }
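
The behavior these tests pin down can be sketched as a free function. This is an assumption-laden stand-in, not the provider's real `chat_completions_url()` method: the function signature is hypothetical, and trimming a trailing slash is an extra guard that the tests above do not exercise.

```rust
// Hypothetical stand-in for the provider method exercised by the tests:
// append "/chat/completions" to whatever base URL the user configured.
fn chat_completions_url(base_url: &str) -> String {
    // Trimming trailing slashes is an assumption added here to avoid "//".
    format!("{}/chat/completions", base_url.trim_end_matches('/'))
}
```

Under this sketch the model name never enters the URL, which matches the point made in the issue #580 test comment: only the log format looked misleading.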


@@ -193,8 +193,18 @@ impl Provider for ReliableProvider {
 } else {
     "retryable"
 };
+// For custom providers, strip the URL from the provider name
+// to avoid confusion. The format "custom:https://..." in error
+// logs makes it look like the model is being appended to the URL.
+let display_provider = if provider_name.starts_with("custom:") {
+    "custom"
+} else if provider_name.starts_with("anthropic-custom:") {
+    "anthropic-custom"
+} else {
+    provider_name
+};
 failures.push(format!(
-    "provider={provider_name} model={current_model} attempt {}/{}: {failure_reason}",
+    "{display_provider}/{current_model} attempt {}/{}: {failure_reason}",
     attempt + 1,
     self.max_retries + 1
 ));
@@ -298,8 +308,18 @@ impl Provider for ReliableProvider {
 } else {
     "retryable"
 };
+// For custom providers, strip the URL from the provider name
+// to avoid confusion. The format "custom:https://..." in error
+// logs makes it look like the model is being appended to the URL.
+let display_provider = if provider_name.starts_with("custom:") {
+    "custom"
+} else if provider_name.starts_with("anthropic-custom:") {
+    "anthropic-custom"
+} else {
+    provider_name
+};
 failures.push(format!(
-    "provider={provider_name} model={current_model} attempt {}/{}: {failure_reason}",
+    "{display_provider}/{current_model} attempt {}/{}: {failure_reason}",
     attempt + 1,
     self.max_retries + 1
 ));
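
Since both hunks carry the same prefix-stripping logic, one possible way to view it in isolation is as a small helper. The sketch below is hypothetical: neither `display_provider` as a function nor `failure_line` exists in the diff, and the parameter types are assumptions. It only reproduces the branching and the new message format shown above.

```rust
// Hypothetical helper mirroring the branches in the diff: drop the URL part
// of "custom:<url>" style provider names before they reach the error log.
fn display_provider(provider_name: &str) -> &str {
    if provider_name.starts_with("custom:") {
        "custom"
    } else if provider_name.starts_with("anthropic-custom:") {
        "anthropic-custom"
    } else {
        provider_name
    }
}

// Hypothetical wrapper that builds one failure entry in the new format.
// `attempt` is zero-based, as in the diff's `attempt + 1`.
fn failure_line(
    provider_name: &str,
    current_model: &str,
    attempt: usize,
    max_retries: usize,
    failure_reason: &str,
) -> String {
    let display_provider = display_provider(provider_name);
    format!(
        "{display_provider}/{current_model} attempt {}/{}: {failure_reason}",
        attempt + 1,
        max_retries + 1
    )
}
```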