Commit graph

26 commits

Author SHA1 Message Date
Alex Gorevski
9a6fa76825 readd tests, remove markdown files 2026-02-18 14:42:39 +08:00
Chummy
2560399423 feat(observability): focus PR 596 on Prometheus backend 2026-02-18 12:06:05 +08:00
argenis de la rosa
eba544dbd4 feat(observability): implement Prometheus metrics backend with /metrics endpoint
- Adds PrometheusObserver backend with counters, histograms, and gauges
- Tracks agent starts/duration, tool calls, channel messages, heartbeat ticks, errors, request latency, tokens, sessions, queue depth
- Adds GET /metrics endpoint to gateway for Prometheus scraping
- Adds provider/model labels to AgentStart and AgentEnd events for better observability
- Adds as_any() method to Observer trait for backend-specific downcast

Metrics exposed:
- zeroclaw_agent_starts_total (Counter) with provider/model labels
- zeroclaw_agent_duration_seconds (Histogram) with provider/model labels
- zeroclaw_tool_calls_total (Counter) with tool/success labels
- zeroclaw_tool_duration_seconds (Histogram) with tool label
- zeroclaw_channel_messages_total (Counter) with channel/direction labels
- zeroclaw_heartbeat_ticks_total (Counter)
- zeroclaw_errors_total (Counter) with component label
- zeroclaw_request_latency_seconds (Histogram)
- zeroclaw_tokens_used_last (Gauge)
- zeroclaw_active_sessions (Gauge)
- zeroclaw_queue_depth (Gauge)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 12:06:05 +08:00
Edvard
508fb53ac1 fix(provider): delegate native tool calling through ReliableProvider
ReliableProvider wraps underlying providers with retry/fallback logic
but did not delegate `supports_native_tools()` or `chat_with_tools()`.
This caused the agent loop to fall back to prompt-based tool calling
for all providers, even those with native tool support (OpenRouter,
OpenAI, Anthropic). Models like Gemini 2.0 Flash would then output
tool calls as text instead of structured API responses, breaking the
tool execution loop entirely.

Add `supports_native_tools()` delegation to the primary provider and
`chat_with_tools()` with the same retry/fallback logic as the existing
`chat_with_system()` and `chat_with_history()` methods.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:15:46 +08:00
Will Sarg
3c4ed2e28e
fix(providers): clarify reliable failure entries for custom providers (#594)
* fix(workflows): standardize runner configuration for security jobs

* ci(actionlint): add Blacksmith runner label to config

Add blacksmith-2vcpu-ubuntu-2404 to actionlint self-hosted-runner labels config
to suppress "unknown label" warnings during workflow linting.

This label is used across all workflows after the Blacksmith migration.

* fix(actionlint): adjust indentation for self-hosted runner labels

* feat(security): enhance security workflow with CodeQL analysis steps

* fix(security): update CodeQL action to version 4 for improved analysis

* fix(security): remove duplicate permissions in security workflow

* fix(security): revert CodeQL action to v3 for stability

The v4 version was causing workflow file validation failures.
Reverting to proven v3 version that is working on main branch.

* fix(security): remove duplicate permissions causing workflow validation failure

The permissions block had duplicate security-events and actions keys,
which caused YAML validation errors and prevented workflow execution.

Fixes: workflow file validation failures on main branch

* fix(security): remove pull_request trigger to reduce costs

* fix(security): restore PR trigger but skip codeql on PRs

* fix(security): resolve YAML syntax error in security workflow

* refactor(security): split CodeQL into dedicated scheduled workflow

* fix(security): update workflow name to Rust Package Security Audit

* fix(codeql): remove push trigger, keep schedule and on-demand only

* feat(codeql): add CodeQL configuration file to ignore specific paths

* Potential fix for code scanning alert no. 39: Hard-coded cryptographic value

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* fix(ci): resolve auto-response workflow merge markers

* fix(build): restore ChannelMessage reply_target usage

* ci(workflows): run workflow sanity on workflow pushes for all branches

* ci(workflows): rename auto-response workflow to PR Auto Responder

* ci(workflows): require owner approval for workflow file changes

* ci: add lint-first PR feedback gate

* ci(workflows): split label policy checks from workflow sanity

* ci(workflows): consolidate policy and rust workflow setup

* ci: add safe pull request intake sanity checks

* ci(security): switch audit to pinned rustsec audit-check

* fix(providers): clarify reliable failure entries for custom providers

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2026-02-17 13:53:03 -05:00
Chummy
0aa35eb669 fix(build): complete strict lint and test cleanup (replacement for #476) 2026-02-18 00:18:54 +08:00
Will Sarg
9e0958dee5 fix(ci): repair parking_lot migration regressions in PR #535 2026-02-17 09:10:40 -05:00
Will Sarg
ee05d62ce4
Merge branch 'main' into pr-484-clean 2026-02-17 08:54:24 -05:00
argenis de la rosa
1908af3248 fix(discord): use channel_id instead of sender for replies (fixes #483)
fix(misc): complete parking_lot::Mutex migration (fixes #505)

- DiscordChannel: store actual channel_id in ChannelMessage.channel
  instead of hardcoded "discord" string
- channels/mod.rs: use msg.channel instead of msg.sender for replies
- Migrate all std::sync::Mutex to parking_lot::Mutex:
  * src/security/audit.rs
  * src/memory/sqlite.rs
  * src/memory/response_cache.rs
  * src/memory/lucid.rs
  * src/channels/email_channel.rs
  * src/gateway/mod.rs
  * src/observability/traits.rs
  * src/providers/reliable.rs
  * src/providers/router.rs
  * src/agent/agent.rs
- Remove all .lock().unwrap() and .map_err(PoisonError) patterns
  since parking_lot::Mutex never poisons

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 08:05:25 -05:00
Chummy
5d131a8903 fix(security): tighten provider credential log hygiene
- remove as_deref credential routing path in provider factory
- avoid raw provider error text in warmup/retry failure summaries
- keep retry telemetry while reducing secret propagation risk
2026-02-17 19:19:06 +08:00
Argenis
69a9adde33
Merge PR #500: streaming support and security fixes
- feat(streaming): add streaming support for LLM responses (fixes #211)
- security(deps): remove vulnerable xmas-elf dependency via embuild (fixes #399)
- fix: resolve merge conflicts and integrate chat_with_tools from main

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 05:05:57 -05:00
argenis de la rosa
4070131bb8 fix: apply cargo fmt to fix formatting issues
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 05:05:23 -05:00
argenis de la rosa
d94e78c621 feat(streaming): add streaming support for LLM responses (fixes #211)
Implement Server-Sent Events (SSE) streaming for OpenAI-compatible providers:

- Add StreamChunk, StreamOptions, and StreamError types to traits module
- Add supports_streaming() and stream_chat_with_system() to Provider trait
- Implement SSE parser for OpenAI streaming responses (data: {...} format)
- Add streaming support to OpenAiCompatibleProvider
- Add streaming support to ReliableProvider with error propagation
- Add futures dependency for async stream support

Features:
- Token-by-token streaming for real-time feedback
- Token counting option (estimated ~4 chars per token)
- Graceful error handling and logging
- Channel-based stream bridging for async compatibility

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 05:01:13 -05:00
Chummy
b2dd3582a4 fix(ci): align reliable tests with simple_chat contract 2026-02-17 01:01:56 +08:00
mai1015
b341fdb368 feat: add agent structure and improve tooling for provider 2026-02-17 01:01:56 +08:00
Chummy
3234159c6c
chore(clippy): clear warning backlog and harden conversions (#383) 2026-02-17 00:32:33 +08:00
Chummy
8bcb5efa8a fix(ci): align reliable provider tests with ChatResponse 2026-02-16 22:06:40 +08:00
stawky
9a5db46cf7 feat(providers): model failover chain + API key rotation
- Add model_fallbacks and api_keys to ReliabilityConfig
- Implement per-model fallback chain in ReliableProvider
- Add API key rotation on auth failures (401/403)
- Add retry-after header parsing and exponential backoff
- Integrate failover into chat_with_system and chat_with_history
- 20 unit tests covering failover, rotation, and retry logic
2026-02-16 21:59:35 +08:00
chumyin
3b4a4de457 refactor(provider): unify Provider responses with ChatResponse
- Switch Provider trait methods to return structured ChatResponse
- Map OpenAI-compatible tool_calls into shared ToolCall type
- Update reliable/router wrappers and provider tests for new interface
- Make agent loop prefer structured tool calls with text fallback parsing
- Adapt gateway replies to structured responses with safe tool-call fallback
2026-02-16 19:16:22 +08:00
Chummy
b442a07530
fix(memory): prevent autosave key collisions across runtime flows
Fixes #221 - SQLite Memory Override bug.

This PR resolves memory overwrite behavior in autosave paths by replacing fixed memory keys with unique keys, and improves short-horizon recall quality in channel runtime.

**Root Cause**
SQLite memory uses a unique constraint on `memories.key` and writes with `ON CONFLICT(key) DO UPDATE`.
Several autosave paths reused fixed keys (or sender-stable keys), so newer messages overwrote earlier conversation entries.

**Changes**
- Channel runtime: autosave key changed from `channel_sender` to `channel_sender_messageId`
- Added memory-context injection before provider calls (aligned with agent loop behavior)
- Agent loop: autosave keys changed from fixed `user_msg`/`assistant_resp` to UUID-suffixed keys
- Gateway: Webhook/WhatsApp autosave keys changed to UUID-suffixed keys

All CI checks passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 22:55:52 -05:00
Edvard Schøyen
89b1ec6fa2
feat: add multi-turn conversation history and tool execution
* feat: add multi-turn conversation history and tool execution

Major enhancement to the agent loop:

**Multi-turn conversation:**
- Add `ChatMessage` type with system/user/assistant constructors
- Add `chat_with_history` method to Provider trait (default impl
  delegates to `chat_with_system` for backward compatibility)
- Implement native `chat_with_history` on OpenRouter, Compatible,
  Reliable, and Router providers to send full message history
- Interactive mode now maintains persistent history across turns

**Tool execution:**
- Agent loop now parses `<tool_call>` XML tags from LLM responses
- Executes tools from the registry and feeds results back as
  `<tool_result>` messages
- Agentic loop continues until LLM produces final text (no tool calls)
- MAX_TOOL_ITERATIONS (10) safety limit prevents runaway loops
- System prompt includes structured tool-use protocol with JSON schemas

**Types:**
- `ChatMessage`, `ChatResponse`, `ToolCall`, `ToolResultMessage`,
  `ConversationMessage` — full conversation modeling types

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address review comments on multi-turn + tool execution

- Add history sliding window (MAX_HISTORY_MESSAGES=50) to prevent
  unbounded conversation history growth in interactive mode
- Add 404→Responses API fallback in compatible.rs chat_with_history,
  matching chat_with_system behavior
- Use super::api_error() for error sanitization in compatible.rs
  instead of raw error body (prevents secret leakage)
- Add missing operational logs in reliable.rs chat_with_history:
  recovery, non-retryable, fallback switch warnings
- Add trim_history tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: address second round of review comments

- Sanitize raw error text in compatible.rs chat_with_system using
  sanitize_api_error (prevents leaking secrets in error messages)
- Add chat_with_history to MockProvider in reliable.rs tests so
  the retry/fallback path is exercised end-to-end
- Add chat_with_history_retries_then_recovers and
  chat_with_history_falls_back tests
- Log warning on malformed <tool_call> JSON instead of silent drop
- Flush stdout after print! in agent_turn so output appears before
  tool execution on line-buffered terminals
- Make interactive mode resilient to transient errors (continue
  loop instead of terminating session)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 14:43:02 -05:00
Edvard Schøyen
49bb20f961
fix(providers): use Bearer auth for Gemini CLI OAuth tokens
* fix(providers): use Bearer auth for Gemini CLI OAuth tokens

When credentials come from ~/.gemini/oauth_creds.json (Gemini CLI),
send them as Authorization: Bearer header instead of ?key= query
parameter. API keys from env vars or config continue using ?key=.

Fixes #194

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(gemini): harden OAuth bearer auth flow and tests

* fix(gemini): granular auth source tracking and review fixes

Build on chumyin's auth model refactor with:
- Expand GeminiAuth to 4 variants (ExplicitKey/EnvGeminiKey/EnvGoogleKey/
  OAuthToken) so auth_source() uses stored discriminant without re-reading
  env vars at call time
- Add is_api_key()/credential() helpers on the enum
- Upgrade expired OAuth token log from debug to warn
- Add tests: provider_rejects_empty_key, auth_source_explicit_key,
  auth_source_none_without_credentials

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: apply rustfmt to fix CI lint failures

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: root <root@instance-20220913-1738.vcn09131738.oraclevcn.com>
Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
2026-02-15 14:32:33 -05:00
Argenis
8694c2e2d2
fix(providers): skip retries on non-retryable HTTP errors (4xx)
Skip retries on non-retryable HTTP client errors (4xx) to avoid wasting time on requests that will never succeed.

- Added is_non_retryable() function to detect non-retryable errors
- 4xx client errors (400, 401, 403, 404) are now non-retryable
- Exceptions: 429 (rate limiting) and 408 (timeout) remain retryable
- 5xx server errors remain retryable
- Fallback logic now skips retries for non-retryable errors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 10:11:32 -05:00
Argenis
1e19b12efd
fix(providers): warn on shared API key for fallbacks and warm up all providers (#130)
- Warn when fallback providers share the same API key as primary (could fail
  if providers require different keys)
- Warm up all providers instead of just the first, continuing on warmup failures

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 08:23:50 -05:00
Edvard
cc13fec16d fix: add provider warmup to prevent cold-start timeout on first channel message
The first API request after daemon startup consistently timed out (120s)
when using channels (Telegram, Discord, etc.), requiring a retry before
succeeding. This happened because the reqwest HTTP client's connection
pool was cold — no TLS handshake, DNS resolution, or HTTP/2 negotiation
had occurred yet.

The fix adds a `warmup()` method to the Provider trait that establishes
the connection pool on startup by hitting a lightweight endpoint
(`/api/v1/auth/key` for OpenRouter). The channel server calls this
immediately after creating the provider, before entering the message
processing loop.

Tested on Raspberry Pi 5 (aarch64) with OpenRouter + DeepSeek v3.2 via
Telegram channel. Before: first message took 2-7 minutes (120s timeout +
retries). After: first message responds in <30s with no retries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 18:43:26 -05:00
argenis de la rosa
ec2d5cc93d feat: enhance agent personality, tool guidance, and memory hygiene
- Expand communication style presets (professional, expressive, custom)
- Enrich SOUL.md with human-like tone and emoji-awareness guidance
- Add crash recovery and sub-task scoping guidance to AGENTS.md scaffold
- Add 'Use when / Don't use when' guidance to TOOLS.md and runtime prompts
- Implement memory hygiene system with configurable archiving and retention
- Add MemoryConfig options: hygiene_enabled, archive_after_days, purge_after_days, conversation_retention_days
- Archive old daily memory and session files to archive subdirectories
- Purge old archives and prune stale SQLite conversation rows
- Add comprehensive tests for new features
2026-02-14 11:28:39 -05:00