feat(observability): implement Prometheus metrics backend with /metrics endpoint
- Adds PrometheusObserver backend with counters, histograms, and gauges
- Tracks agent starts/duration, tool calls, channel messages, heartbeat ticks,
  errors, request latency, tokens, sessions, queue depth
- Adds GET /metrics endpoint to gateway for Prometheus scraping
- Adds provider/model labels to AgentStart and AgentEnd events for better observability
- Adds as_any() method to Observer trait for backend-specific downcast

Metrics exposed:
- zeroclaw_agent_starts_total (Counter) with provider/model labels
- zeroclaw_agent_duration_seconds (Histogram) with provider/model labels
- zeroclaw_tool_calls_total (Counter) with tool/success labels
- zeroclaw_tool_duration_seconds (Histogram) with tool label
- zeroclaw_channel_messages_total (Counter) with channel/direction labels
- zeroclaw_heartbeat_ticks_total (Counter)
- zeroclaw_errors_total (Counter) with component label
- zeroclaw_request_latency_seconds (Histogram)
- zeroclaw_tokens_used_last (Gauge)
- zeroclaw_active_sessions (Gauge)
- zeroclaw_queue_depth (Gauge)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
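
The as_any() downcast and the /metrics scrape described in this message can be sketched roughly as follows. This is a std-only illustration, not the code from this commit: the trait shape, the `record_agent_start` method, the `encode` method, and the `metrics_body` function are all hypothetical, and the real backend presumably uses a Prometheus client library rather than a hand-rolled map. Only the metric name `zeroclaw_agent_starts_total`, its provider/model labels, and the `as_any()` idea come from the commit message.

```rust
use std::any::Any;
use std::collections::BTreeMap;
use std::sync::Mutex;

/// Hypothetical, simplified stand-in for the project's Observer trait.
trait Observer: Send + Sync {
    fn record_agent_start(&self, provider: &str, model: &str);
    /// Lets a caller holding a `dyn Observer` recover the concrete backend.
    fn as_any(&self) -> &dyn Any;
}

/// Hypothetical backend keeping one counter per (provider, model) pair.
#[derive(Default)]
struct PrometheusObserver {
    agent_starts: Mutex<BTreeMap<(String, String), u64>>,
}

impl PrometheusObserver {
    /// Render the counter in the Prometheus text exposition format.
    fn encode(&self) -> String {
        let mut out = String::from("# TYPE zeroclaw_agent_starts_total counter\n");
        for ((provider, model), count) in self.agent_starts.lock().unwrap().iter() {
            out.push_str(&format!(
                "zeroclaw_agent_starts_total{{provider=\"{provider}\",model=\"{model}\"}} {count}\n"
            ));
        }
        out
    }
}

impl Observer for PrometheusObserver {
    fn record_agent_start(&self, provider: &str, model: &str) {
        let key = (provider.to_string(), model.to_string());
        *self.agent_starts.lock().unwrap().entry(key).or_insert(0) += 1;
    }

    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// What a GET /metrics handler might do: downcast to the Prometheus
/// backend and, if that succeeds, return the encoded body.
fn metrics_body(observer: &dyn Observer) -> Option<String> {
    observer
        .as_any()
        .downcast_ref::<PrometheusObserver>()
        .map(PrometheusObserver::encode)
}
```

In this sketch `metrics_body` returns `None` when a different observer backend is configured, so a handler built on it could answer with an error or an empty body instead of panicking.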
parent c04f2855e4
commit eba544dbd4
11 changed files with 575 additions and 228 deletions
@@ -193,8 +193,18 @@ impl Provider for ReliableProvider {
     } else {
         "retryable"
     };
+    // For custom providers, strip the URL from the provider name
+    // to avoid confusion. The format "custom:https://..." in error
+    // logs makes it look like the model is being appended to the URL.
+    let display_provider = if provider_name.starts_with("custom:") {
+        "custom"
+    } else if provider_name.starts_with("anthropic-custom:") {
+        "anthropic-custom"
+    } else {
+        provider_name
+    };
     failures.push(format!(
-        "provider={provider_name} model={current_model} attempt {}/{}: {failure_reason}",
+        "{display_provider}/{current_model} attempt {}/{}: {failure_reason}",
         attempt + 1,
         self.max_retries + 1
     ));
@@ -298,8 +308,18 @@ impl Provider for ReliableProvider {
     } else {
         "retryable"
     };
+    // For custom providers, strip the URL from the provider name
+    // to avoid confusion. The format "custom:https://..." in error
+    // logs makes it look like the model is being appended to the URL.
+    let display_provider = if provider_name.starts_with("custom:") {
+        "custom"
+    } else if provider_name.starts_with("anthropic-custom:") {
+        "anthropic-custom"
+    } else {
+        provider_name
+    };
     failures.push(format!(
-        "provider={provider_name} model={current_model} attempt {}/{}: {failure_reason}",
+        "{display_provider}/{current_model} attempt {}/{}: {failure_reason}",
         attempt + 1,
         self.max_retries + 1
     ));
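
Both hunks in ReliableProvider apply the same prefix check before formatting the failure line. One possible way to share that logic is a small helper, sketched below. The names `display_provider_name` and `failure_line` are hypothetical and do not appear in this commit; the prefixes and the message format are taken from the diff.

```rust
/// Hypothetical helper: drop the URL suffix from custom provider names so a
/// failure line reads "custom/<model>" rather than "custom:https://.../<model>".
fn display_provider_name(provider_name: &str) -> &str {
    if provider_name.starts_with("custom:") {
        "custom"
    } else if provider_name.starts_with("anthropic-custom:") {
        "anthropic-custom"
    } else {
        provider_name
    }
}

/// Hypothetical helper that builds the failure line in the format the diff
/// introduces. `attempt` is zero-based, as in the original call sites.
fn failure_line(
    provider_name: &str,
    current_model: &str,
    attempt: usize,
    max_retries: usize,
    failure_reason: &str,
) -> String {
    format!(
        "{}/{current_model} attempt {}/{}: {failure_reason}",
        display_provider_name(provider_name),
        attempt + 1,
        max_retries + 1
    )
}
```

With a helper like this, each call site would reduce to a single `failures.push(failure_line(...))`, and a new custom prefix would only need to be handled in one place.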