Commit graph

86 commits

Author SHA1 Message Date
Chummy
b4b379e3e7 fix(providers): harden tool fallback and refresh model catalogs 2026-02-18 22:50:02 +08:00
Chummy
50fd5b81e1 fix(test): stabilize cron output capture and clippy cleanups 2026-02-18 20:29:26 +08:00
Chummy
483acccdb7 feat(memory): add configurable postgres storage backend 2026-02-18 20:29:26 +08:00
Vernon Stinebaker
3b0133596c feat(providers): add native tool calling for OpenAI-compatible providers
Implement chat_with_tools() on CompatibleProvider so OpenAI-compatible
endpoints (OpenRouter, local LLMs, etc.) can use structured tool calling
instead of prompt-injected tool descriptions.

Changes:
- CompatibleProvider: capabilities() reports native_tool_calling, new
  chat_with_tools() sends tools in API request and parses tool_calls
  from response, chat() bridges to chat_with_tools() when ToolSpecs
  are provided
- RouterProvider: chat_with_tools() delegation with model hint resolution
- loop_.rs: expose tools_to_openai_format as pub(crate), add
  tools_to_openai_format_from_specs for ToolSpec-based conversion

Adds 9 new tests and updates 1 existing test.
2026-02-18 18:06:36 +08:00
Chummy
219764d4d8 fix(channels): recover malformed invoke/tool_call output in daemon mode 2026-02-18 17:01:36 +08:00
Xiangjun Ma
f1db63219c refactor(telegram): address code review findings
- Add strip_tool_call_tags() to finalize_draft to prevent Markdown
  parse failures from tool-call tags reaching Telegram API
- Deduplicate parse_reply_target() call in update_draft (was called
  twice, discarding thread_id both times)
- Replace body.as_object_mut().unwrap() mutation with separate
  plain_body JSON literal (eliminates unwrap in runtime path)
- Clean up per-chat rate-limit HashMap entry in finalize_draft to
  prevent unbounded growth over long uptimes
- Extract magic number 80 to STREAM_CHUNK_MIN_CHARS constant in
  agent loop
2026-02-18 16:33:33 +08:00
Xiangjun Ma
93538a70e3 fix(agent): relay final response as progressive chunks via on_delta
Previously on_delta sent the entire completed response as a single
message, defeating the purpose of the streaming draft updates. Now
the text is split into ~80-char chunks on whitespace boundaries
(UTF-8 safe via split_inclusive) and sent progressively through the
channel, so Telegram draft edits show text arriving incrementally.

The consumer in process_channel_message already accumulates chunks
and calls update_draft with the full text so far, and Telegram's
rate-limiting (draft_update_interval_ms) throttles editMessageText
calls to avoid API spam.
2026-02-18 16:33:33 +08:00
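Note on 93538a70e3: the chunking described above could look roughly like the sketch below. It is an illustration only, not the code from the commit. The function name `chunk_on_whitespace` and the `min_chars` parameter are hypothetical; the only details taken from the message are the use of `split_inclusive`, the whitespace boundaries, and the ~80-character target.

```rust
/// Hypothetical helper: split `text` into chunks of at least `min_chars`
/// characters, breaking only directly after a whitespace character.
/// Counting `chars()` rather than bytes keeps the split UTF-8 safe.
fn chunk_on_whitespace(text: &str, min_chars: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();

    // split_inclusive keeps the whitespace attached to the preceding word,
    // so concatenating the chunks reproduces the input exactly.
    for piece in text.split_inclusive(char::is_whitespace) {
        current.push_str(piece);
        if current.chars().count() >= min_chars {
            chunks.push(std::mem::take(&mut current));
        }
    }

    // Flush whatever is left over as the final, possibly shorter, chunk.
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

A caller would send each chunk through the `on_delta` sender in order, for example with `chunk_on_whitespace(&response, 80)`.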
Xiangjun Ma
118cd53922 feat(channel): stream LLM responses to Telegram via draft message edits
Wire the existing provider-layer streaming infrastructure through the
channel trait and agent loop so Telegram users see tokens arrive
progressively via editMessageText, instead of waiting for the full
response.

Changes:
- Add StreamMode enum (off/partial/block) and draft_update_interval_ms
  to TelegramConfig (backward-compatible defaults: off, 1000ms)
- Add supports_draft_updates/send_draft/update_draft/finalize_draft to
  Channel trait with no-op defaults (zero impact on existing channels)
- Implement draft methods on TelegramChannel using sendMessage +
  editMessageText with rate limiting and Markdown fallback
- Add on_delta mpsc::Sender<String> parameter to run_tool_call_loop
  (None preserves existing behavior)
- Wire streaming in process_channel_message: when channel supports
  drafts, send initial draft, spawn updater task, finalize on completion

Edge cases handled:
- 4096-char limit: finalize draft and fall back to chunked send
- Broken Markdown: use no parse_mode during streaming, apply on finalize
- Edit failures: fall back to sending complete response as new message
- Rate limiting: configurable draft_update_interval_ms (default 1s)
2026-02-18 16:33:33 +08:00
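Note on 118cd53922: the "no-op defaults" approach can be illustrated with a simplified, synchronous trait. This is a sketch under assumptions: the real Channel trait is presumably async and has different signatures, and `PlainChannel`, `deliver`, and the `u64` draft id are hypothetical. Only the method names and the off/partial/block modes come from the commit message.

```rust
/// Streaming mode; `Off` is the backward-compatible default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum StreamMode {
    #[default]
    Off,
    Partial,
    Block,
}

/// Simplified stand-in for the Channel trait (assumed shape).
trait Channel {
    fn send(&mut self, text: &str);

    // No-op defaults: channels that do not override these behave as before.
    fn supports_draft_updates(&self) -> bool {
        false
    }
    fn send_draft(&mut self, _text: &str) -> Option<u64> {
        None
    }
    fn update_draft(&mut self, _draft_id: u64, _text: &str) {}
    fn finalize_draft(&mut self, _draft_id: u64, _text: &str) {}
}

/// A channel that implements only `send` and inherits every default.
struct PlainChannel {
    sent: Vec<String>,
}

impl Channel for PlainChannel {
    fn send(&mut self, text: &str) {
        self.sent.push(text.to_string());
    }
}

/// Deliver a response, using draft edits only when the channel opts in.
fn deliver(channel: &mut dyn Channel, chunks: &[&str]) {
    let full: String = chunks.concat();
    if !channel.supports_draft_updates() {
        channel.send(&full);
        return;
    }
    // If the draft cannot be created, fall back to a normal send.
    let Some(id) = channel.send_draft("") else {
        channel.send(&full);
        return;
    };
    let mut so_far = String::new();
    for chunk in chunks {
        so_far.push_str(chunk);
        channel.update_draft(id, &so_far);
    }
    channel.finalize_draft(id, &full);
}
```

Because every draft method has a default body, existing channel implementations compile unchanged and take the plain `send` path.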
Chummy
f3bdff1d69 fix(agent): harden glm tool-call parsing and scope PR 2026-02-18 15:23:35 +08:00
adisusilayasa
58c81aa258 feat(agent): add GLM-style tool call parsing
GLM models output tool calls in proprietary formats that ZeroClaw
doesn't natively support. This adds parsing for GLM-specific formats:

- browser_open/url>https://... -> shell tool with curl command
- shell/command>ls -> shell tool with command arg
- http_request/url>... -> http_request tool
- Plain URLs -> shell tool with curl command

Also adds:
- find_json_end() helper for parsing JSON objects
- Unclosed <toolcall> tag handling
- Unit tests for GLM-style parsing

The parsing is deliberately placed after XML and markdown code block
parsing, so it acts as a fallback for models that don't use standard
tool calling formats.

This enables GLM models (via Z.AI or other providers) to successfully
execute tools in ZeroClaw.
2026-02-18 15:23:35 +08:00
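Note on 58c81aa258: the `tool/arg>value` mappings listed above can be sketched as a small line parser. This is not the parser from the commit. `ToolCall`, `parse_glm_line`, and `shell_curl` are hypothetical names, and the exact curl invocation is an assumption, since the message only says "curl command".

```rust
/// Hypothetical parsed tool call: a tool name plus one named argument.
#[derive(Debug, PartialEq)]
struct ToolCall {
    tool: String,
    arg: String,
    value: String,
}

/// Build a shell tool call that fetches `url` (the curl flags are assumed).
fn shell_curl(url: &str) -> ToolCall {
    ToolCall {
        tool: "shell".to_string(),
        arg: "command".to_string(),
        value: format!("curl -s '{}'", url),
    }
}

/// Parse one GLM-style line such as `shell/command>ls`.
fn parse_glm_line(line: &str) -> Option<ToolCall> {
    let line = line.trim();

    // Plain URLs map to the shell tool with a curl command.
    if line.starts_with("http://") || line.starts_with("https://") {
        return Some(shell_curl(line));
    }

    // Everything else must look like `tool/arg>value`.
    let (head, value) = line.split_once('>')?;
    let (tool, arg) = head.split_once('/')?;
    let value = value.trim();
    if tool.is_empty() || arg.is_empty() || value.is_empty() {
        return None;
    }

    match tool {
        "browser_open" => Some(shell_curl(value)),
        "shell" | "http_request" => Some(ToolCall {
            tool: tool.to_string(),
            arg: arg.to_string(),
            value: value.to_string(),
        }),
        _ => None,
    }
}
```

A production version would need to escape or validate the URL before embedding it in a shell command; the sketch omits that for brevity.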
Edvard
6e53341bb1 feat(agent): add rule-based query classification for automatic model routing
Classify incoming user messages by keyword/pattern and route to the
appropriate model hint automatically, feeding into the existing
RouterProvider. Disabled by default; opt-in via [query_classification]
config section.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 14:41:58 +08:00
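Note on 6e53341bb1: keyword-based classification into a model hint might look like the sketch below. The `Rule` struct, the `classify` function, and the hint names "code" and "fast" are all hypothetical; the commit message does not describe the rule format or the matching order, so first-match-wins is an assumption.

```rust
/// Hypothetical rule: if any keyword occurs in the message, use `hint`.
struct Rule {
    keywords: &'static [&'static str],
    hint: &'static str,
}

/// Return the hint of the first matching rule, or None to keep the default model.
fn classify(message: &str, rules: &[Rule]) -> Option<&'static str> {
    // Lowercase once so keyword matching is case-insensitive.
    let lower = message.to_lowercase();
    rules
        .iter()
        .find(|rule| rule.keywords.iter().any(|kw| lower.contains(*kw)))
        .map(|rule| rule.hint)
}
```

A `None` result corresponds to the disabled or unmatched case, in which routing falls through to whatever model is already configured.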
Chummy
431287184b style(tests): apply rustfmt to brittle-test hardening changes 2026-02-18 14:17:58 +08:00
Alex Gorevski
45cdd25b3d fix(tests): harden brittle tests for cross-platform stability and refactoring resilience
## Problem

The test suite contained several categories of latent brittleness
identified in docs/testing-brittle-tests.md that would surface during
refactoring or cross-platform (Windows) CI execution:

1. Hardcoded Unix paths: `Path::new("/tmp")` and `PathBuf::from("/tmp")`
   used as workspace directories in agent tests, which fail on Windows
   where /tmp does not exist.

2. Exact string match assertions: ~20 `assert_eq!(response, "exact text")`
   assertions in agent unit and e2e tests that break on any mock wording
   change, even when the underlying orchestration behavior is correct.

3. Fragile error message string matching: `.contains("specific message")`
   assertions coupled to internal error wording rather than testing the
   error category or behavioral outcome.

## What Changed

### Hardcoded paths → platform-agnostic temp dirs (4 files, 7 locations)
- `src/agent/tests.rs`: Replaced all 4 instances of `Path::new("/tmp")`
  and `PathBuf::from("/tmp")` with `std::env::temp_dir()` in
  `make_memory()`, `build_agent_with()`, `build_agent_with_memory()`,
  and `build_agent_with_config()` helpers.
- `tests/agent_e2e.rs`: Replaced all 3 instances in `make_memory()`,
  `build_agent()`, and `build_agent_xml()` helpers.

### Exact string assertions → behavioral checks (2 files, ~20 locations)
- `src/agent/tests.rs`: Converted 10 `assert_eq!(response, "...")` to
  `assert!(!response.is_empty(), "descriptive message")` across tests for
  text pass-through, tool execution, tool failure recovery, XML dispatch,
  mixed text+tool responses, multi-tool batch, and run_single delegation.
- `tests/agent_e2e.rs`: Converted 9 exact-match assertions to behavioral
  checks. Multi-turn test now uses `assert_ne!(r1, r2)` to verify
  sequential responses are distinct without coupling to exact wording.
- Provider error propagation test simplified to `assert!(result.is_err())`
  without asserting on the error message string.

### Fragile error message assertions → structural checks (2 files)
- `src/tools/git_operations.rs`: Replaced fragile OR-branch string match
  (`contains("git repository") || contains("Git command failed")`) with
  structural assertions: checks `!result.success`, error is non-empty,
  and error does NOT mention autonomy/read-only (verifying the failure
  is git-related, not permission-related).
- `src/cron/scheduler.rs`: Replaced `contains("agent job failed:")` with
  `!success` and `!output.is_empty()` checks that verify failure behavior
  without coupling to exact log format.

## What Was NOT Changed (and why)
- `src/agent/loop_.rs` parser tests: Exact string assertions are the
  contract for XML tool call parsing; the exact output IS the spec.
- `src/providers/reliable.rs`: Error message assertions test the error
  format contract (provider/model attribution in failure messages).
- `src/service/mod.rs`: Already platform-gated with `#[cfg]`; XML escape
  test is a formatting contract where exact match is appropriate.
- `src/config/schema.rs`: TOML test strings use /tmp as data values for
  deserialization tests, not filesystem access; HOME tests already use
  `std::env::temp_dir()`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-02-18 14:17:58 +08:00
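Note on 45cdd25b3d: the two main patterns from this commit, platform-agnostic temp directories and behavioral assertions, are sketched below. The helpers `test_workspace` and `mock_response` are hypothetical stand-ins and not the helpers named in the commit; only `std::env::temp_dir()` and the assertion styles come from the message.

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical test helper: a workspace directory that exists on every
/// platform, replacing a hardcoded "/tmp".
fn test_workspace() -> PathBuf {
    env::temp_dir()
}

/// Hypothetical stand-in for an agent call whose exact wording may change.
fn mock_response(turn: u32) -> String {
    format!("mock reply #{}", turn)
}

/// Behavioral check: assert on properties of the responses, not their text.
fn responses_look_sane(first: &str, second: &str) -> bool {
    !first.is_empty() && !second.is_empty() && first != second
}
```

In a test this would read `assert!(responses_look_sane(&mock_response(1), &mock_response(2)))`, which survives any rewording of the mock replies.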
Edvard
8a1e7cc7ef fix(agent): use config max_tool_iterations, add memory relevance filtering, rebalance search weights
Three fixes for conversation quality issues:

1. loop_.rs and channels now read max_tool_iterations from AgentConfig
   instead of using a hardcoded constant of 10, making it configurable.

2. Memory recall now filters entries below a configurable
   min_relevance_score threshold (default 0.4), preventing unrelated
   memories from bleeding into conversation context.

3. Default hybrid search weights rebalanced from 70/30 vector/keyword
   to 40/60, reducing cross-topic semantic bleed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 14:14:33 +08:00
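Note on 8a1e7cc7ef: the effect of the new weights and threshold can be checked with a small worked example. The sketch assumes the hybrid score is a weighted sum of vector and keyword scores that are each normalized to [0, 1]; the commit message does not state the formula, so treat that as an assumption. `Scored`, `hybrid_score`, and `recall` are hypothetical names.

```rust
/// Hypothetical memory entry with pre-computed, normalized scores.
struct Scored {
    text: &'static str,
    vector: f64,
    keyword: f64,
}

/// Assumed combination rule: a plain weighted sum.
fn hybrid_score(vector: f64, keyword: f64, vector_weight: f64, keyword_weight: f64) -> f64 {
    vector_weight * vector + keyword_weight * keyword
}

/// Keep only entries whose 40/60 hybrid score reaches the threshold.
fn recall(entries: &[Scored], min_relevance_score: f64) -> Vec<&'static str> {
    entries
        .iter()
        .filter(|e| hybrid_score(e.vector, e.keyword, 0.4, 0.6) >= min_relevance_score)
        .map(|e| e.text)
        .collect()
}
```

Under this assumption, an entry with vector score 0.9 and keyword score 0.0 scores 0.63 with the old 70/30 weights but only 0.36 with 40/60, so it falls below the 0.4 default and is dropped. An entry with vector 0.2 and keyword 0.8 scores 0.56 and is kept.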
Edvard
6d8725c9e6 fix(agent): log warning when native tool call arguments fail JSON parsing
The NativeToolDispatcher silently defaults to an empty object when tool
call arguments from the LLM fail to parse as JSON. The XML dispatcher
already logs a warning for the same case (line 68). Add a matching
tracing::warn with tool name and parse error for observability parity.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 12:56:56 +08:00
Edvard
63602a262f fix(agent): use config-driven limits in run_tool_call_loop and trim_history
run_tool_call_loop used a hardcoded MAX_TOOL_ITERATIONS (10) and
trim_history/auto_compact_history used a hardcoded MAX_HISTORY_MESSAGES (50),
ignoring the user-configurable agent.max_tool_iterations and
agent.max_history_messages values in config.toml.

Meanwhile, agent.rs correctly reads from config — creating an inconsistency
where CLI single-shot mode respected config but the channel runtime and
interactive CLI loop silently ignored it.

Changes:
- Rename constants to DEFAULT_* to clarify they are fallback defaults
- Add max_tool_iterations parameter to run_tool_call_loop
- Add max_history parameter to trim_history and auto_compact_history
- Thread config.agent.max_tool_iterations through ChannelRuntimeContext
- Both CLI code paths now pass config values to run_tool_call_loop

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 12:49:28 +08:00
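Note on 63602a262f: threading the limit through as a parameter, with the constant kept only as a fallback, can be sketched as follows. This simplified `trim_history` works on plain strings and drops the oldest entries; the real function may treat system messages differently, which the commit message does not say.

```rust
/// Fallback used only when no configured value is available.
const DEFAULT_MAX_HISTORY_MESSAGES: usize = 50;

/// Simplified sketch: keep at most `max_history` of the newest messages.
fn trim_history(history: &mut Vec<String>, max_history: usize) {
    if history.len() > max_history {
        let excess = history.len() - max_history;
        // Remove the oldest messages from the front.
        history.drain(..excess);
    }
}

/// Hypothetical caller: prefer the configured value, else the default.
fn effective_max_history(configured: Option<usize>) -> usize {
    configured.unwrap_or(DEFAULT_MAX_HISTORY_MESSAGES)
}
```

The point of the change is that every code path calls `trim_history(&mut history, limit)` with the same configured limit, instead of some paths reading the constant directly.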
Chummy
2560399423 feat(observability): focus PR 596 on Prometheus backend 2026-02-18 12:06:05 +08:00
argenis de la rosa
eba544dbd4 feat(observability): implement Prometheus metrics backend with /metrics endpoint
- Adds PrometheusObserver backend with counters, histograms, and gauges
- Tracks agent starts/duration, tool calls, channel messages, heartbeat ticks, errors, request latency, tokens, sessions, queue depth
- Adds GET /metrics endpoint to gateway for Prometheus scraping
- Adds provider/model labels to AgentStart and AgentEnd events for better observability
- Adds as_any() method to Observer trait for backend-specific downcast

Metrics exposed:
- zeroclaw_agent_starts_total (Counter) with provider/model labels
- zeroclaw_agent_duration_seconds (Histogram) with provider/model labels
- zeroclaw_tool_calls_total (Counter) with tool/success labels
- zeroclaw_tool_duration_seconds (Histogram) with tool label
- zeroclaw_channel_messages_total (Counter) with channel/direction labels
- zeroclaw_heartbeat_ticks_total (Counter)
- zeroclaw_errors_total (Counter) with component label
- zeroclaw_request_latency_seconds (Histogram)
- zeroclaw_tokens_used_last (Gauge)
- zeroclaw_active_sessions (Gauge)
- zeroclaw_queue_depth (Gauge)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 12:06:05 +08:00
Chummy
3467d34596 fix(agent): avoid duplicate text in markdown tool_call fallback 2026-02-18 10:15:46 +08:00
Edvard
cb7df7c87f style(agent): apply rustfmt formatting to loop_.rs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:15:46 +08:00
Edvard
0c46b56555 fix(agent): satisfy clippy::if_not_else lint in tool history push
Flip conditional to use positive check (is_empty) in the if-branch
to resolve clippy::if_not_else error in CI strict delta lint gate.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:15:46 +08:00
Edvard
0e5a785015 fix(agent): use native format for tool result history in run_tool_call_loop
When use_native_tools is true, the agent loop now:
- Formats assistant history as JSON with tool_calls array (matching
  what convert_messages() expects to reconstruct NativeMessage)
- Pushes each tool result as ChatMessage::tool with tool_call_id
  (instead of a single ChatMessage::user with XML tool_result tags)
- Adds fallback parsing for markdown code block tool calls
  (```tool_call ... ``` and hybrid ```tool_call ... </tool_call>)

Without this, the second LLM call (sending tool results back) gets
rejected with 4xx by OpenRouter/Gemini because the message format
doesn't match the OpenAI tool calling API expectations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:15:46 +08:00
Kieran
d756293871 feat: add /clear command 2026-02-18 10:01:22 +08:00
Chummy
5942caa083 chore(pr539): scope to dingtalk daemon fixes only 2026-02-18 00:42:40 +08:00
JamesYin
9eff7a13bb fix(agent): parse legacy schedule tool_call payloads 2026-02-18 00:42:40 +08:00
JamesYin
af5d1f3066 fix(agent): recover malformed tool_call blocks with leading text 2026-02-18 00:42:40 +08:00
JamesYin
59f74e8f39 fix(agent): retry malformed prefixed tool_call markup 2026-02-18 00:42:40 +08:00
JamesYin
128e888d7a style: format rebased conflict resolutions 2026-02-18 00:42:40 +08:00
JamesYin
3522d51f98 fix(agent): retry malformed tool_call payloads in tool loop 2026-02-18 00:42:40 +08:00
Chummy
40ab5c3507 fix(agent): rebase alias-tag parser and align channel send API 2026-02-18 00:28:08 +08:00
Chummy
4243d8ec86 fix(agent): parse tool-call alias tags in channel runtime 2026-02-18 00:28:08 +08:00
Chummy
ed675d4e6b test(agent): add comprehensive loop test suite 2026-02-18 00:26:31 +08:00
Chummy
0aa35eb669 fix(build): complete strict lint and test cleanup (replacement for #476) 2026-02-18 00:18:54 +08:00
Chummy
fc6e8eb521 fix(provider): follow-up CN/global consistency for Z.AI and aliases (#554)
* fix(provider): harden CN/global routing consistency for Chinese vendors

* fix(agent): migrate CLI channel send to SendMessage

* fix(onboard): deduplicate Z.AI key URL match arms
2026-02-18 00:04:56 +08:00
Chummy
bb641d28c2 fix(approval): harden CLI approval flow and summaries 2026-02-17 23:06:12 +08:00
stawky
ab561baa97 feat(approval): interactive approval workflow for supervised mode (#215)
- Add auto_approve / always_ask fields to AutonomyConfig
- New src/approval/ module: ApprovalManager with session-scoped allowlist,
  ApprovalRequest/Response types, audit logging, CLI interactive prompt
- Insert approval hook in agent_turn before tool execution
- Non-CLI channels auto-approve; CLI shows Y/N/A prompt
- Skip approval for read-only tools (file_read, memory_recall) by default
- 15 unit tests covering all approval logic
2026-02-17 23:06:12 +08:00
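Note on ab561baa97: a session-scoped allowlist with `auto_approve` and `always_ask` lists might be modeled as below. This is a sketch, not the ApprovalManager from the commit. It assumes that "A" in the Y/N/A prompt means "always allow for this session" and that `always_ask` takes precedence over the other lists; neither is stated in the message.

```rust
use std::collections::HashSet;

/// Assumed meaning of the Y/N/A prompt answers.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ApprovalResponse {
    Yes,
    No,
    Always,
}

/// Simplified stand-in for the approval manager.
struct ApprovalManager {
    auto_approve: HashSet<String>,
    always_ask: HashSet<String>,
    session_allowlist: HashSet<String>,
}

impl ApprovalManager {
    /// Decide whether the user must be prompted before running `tool`.
    fn needs_prompt(&self, tool: &str) -> bool {
        // Assumption: always_ask wins over auto_approve and the allowlist.
        if self.always_ask.contains(tool) {
            return true;
        }
        !(self.auto_approve.contains(tool) || self.session_allowlist.contains(tool))
    }

    /// Record the user's answer; returns true if the tool may run.
    fn record(&mut self, tool: &str, response: ApprovalResponse) -> bool {
        match response {
            ApprovalResponse::Yes => true,
            ApprovalResponse::No => false,
            ApprovalResponse::Always => {
                // Remember the choice for the rest of the session only.
                self.session_allowlist.insert(tool.to_string());
                true
            }
        }
    }
}
```

Read-only tools such as `file_read` would sit in `auto_approve` by default, so they never trigger a prompt.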
Will Sarg
ee05d62ce4 Merge branch 'main' into pr-484-clean 2026-02-17 08:54:24 -05:00
Vernon Stinebaker
df31359ec4 feat(agent): scrub credentials from tool output (#532)
* feat(channels): add channel capabilities to system prompt

Add channel capabilities section to system prompt so the agent knows
it can send Discord messages directly without asking permission.
Also reminds agent not to repeat or echo credentials.

Co-authored-by: Vernon Stinebaker <vernon.stinebaker@gmail.com>

* feat(agent): scrub credentials from tool output

* chore: fix clippy and formatting for scrubbing
2026-02-17 08:23:11 -05:00
argenis de la rosa
1908af3248 fix(discord): use channel_id instead of sender for replies (fixes #483)
fix(misc): complete parking_lot::Mutex migration (fixes #505)

- DiscordChannel: store actual channel_id in ChannelMessage.channel
  instead of hardcoded "discord" string
- channels/mod.rs: use msg.channel instead of msg.sender for replies
- Migrate all std::sync::Mutex to parking_lot::Mutex:
  * src/security/audit.rs
  * src/memory/sqlite.rs
  * src/memory/response_cache.rs
  * src/memory/lucid.rs
  * src/channels/email_channel.rs
  * src/gateway/mod.rs
  * src/observability/traits.rs
  * src/providers/reliable.rs
  * src/providers/router.rs
  * src/agent/agent.rs
- Remove all .lock().unwrap() and .map_err(PoisonError) patterns
  since parking_lot::Mutex never poisons

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 08:05:25 -05:00
Vernon Stinebaker
efa6e5aa4a feat(channel): add capabilities to system prompt (#531)
* feat(channels): add channel capabilities to system prompt

Add channel capabilities section to system prompt so the agent knows
it can send Discord messages directly without asking permission.
Also reminds agent not to repeat or echo credentials.

Co-authored-by: Vernon Stinebaker <vernon.stinebaker@gmail.com>

* chore: fix formatting and clippy warnings
2026-02-17 08:02:11 -05:00
fettpl
ebb78afda4 feat(memory): add session_id isolation to Memory trait (#530)
* feat(memory): add session_id isolation to Memory trait

Add optional session_id parameter to store(), recall(), and list()
methods across the Memory trait and all four backends (sqlite, markdown,
lucid, none). This enables per-session memory isolation so different
agent sessions cannot cross-read each other's stored memories.

Changes:
- traits.rs: Add session_id: Option<&str> to store/recall/list
- sqlite.rs: Schema migration (ALTER TABLE ADD COLUMN session_id),
  index, persist/filter by session_id in all query paths
- markdown.rs, lucid.rs, none.rs: Updated signatures
- All callers pass None for backward compatibility
- 5 new tests: session-filtered recall, cross-session isolation,
  session-filtered list, no-filter returns all, migration idempotency

Closes #518

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(channels): fix discord _channel_id typo and lark missing reply_to

Pre-existing compilation errors on main after reply_to was added to
ChannelMessage: discord.rs used _channel_id (underscore prefix) but
referenced channel_id, and lark.rs was missing the reply_to field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 07:44:05 -05:00
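Note on ebb78afda4: the optional `session_id` filter can be illustrated with an in-memory store. This is a simplified sketch; the real Memory trait has more parameters and is backed by sqlite, markdown, or lucid. `InMemoryStore` and `Entry` are hypothetical. The behavior that `None` applies no filter follows the "no-filter returns all" test named in the commit.

```rust
/// Hypothetical stored entry with an optional owning session.
struct Entry {
    content: String,
    session_id: Option<String>,
}

/// Simplified in-memory stand-in for a Memory backend.
#[derive(Default)]
struct InMemoryStore {
    entries: Vec<Entry>,
}

impl InMemoryStore {
    fn store(&mut self, content: &str, session_id: Option<&str>) {
        self.entries.push(Entry {
            content: content.to_string(),
            session_id: session_id.map(str::to_string),
        });
    }

    /// Return matching contents; `Some(id)` restricts results to that session,
    /// `None` searches across all sessions (the backward-compatible path).
    fn recall(&self, query: &str, session_id: Option<&str>) -> Vec<&str> {
        self.entries
            .iter()
            .filter(|e| e.content.contains(query))
            .filter(|e| match session_id {
                Some(id) => e.session_id.as_deref() == Some(id),
                None => true,
            })
            .map(|e| e.content.as_str())
            .collect()
    }
}
```

Existing callers that pass `None` see the same results as before, which is why the change is backward compatible.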
Chummy
8f5da70283 fix(api): retain agent and observability re-exports 2026-02-17 20:00:08 +08:00
DeadManAI
4fca1abee8 fix: resolve all clippy warnings, formatting, and Mistral endpoint
- Fix Mistral provider base URL (missing /v1 prefix caused 404s)
- Resolve 55 clippy warnings across 28 warning types
- Apply cargo fmt to 44 formatting violations
- Remove unused imports (process_message, MultiObserver, VerboseObserver,
  ChatResponse, ToolCall, Path, TempDir)
- Replace format!+push_str with write! macro
- Fix unchecked Duration subtraction, redundant closures, clamp patterns
- Declare missing feature flags (sandbox-landlock, sandbox-bubblewrap,
  browser-native) in Cargo.toml
- Derive Default where manual impls were redundant
- Add separators to long numeric literals (115200 → 115_200)
- Restructure unreachable code in arduino_flash platform branches

All 1,500 tests pass. Zero clippy warnings. Clean formatting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 20:00:08 +08:00
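Note on 4fca1abee8: a few of the clippy-driven rewrites listed above are shown below using only the standard library. The function names are hypothetical, and `saturating_sub` is one possible fix for unchecked Duration subtraction; the commit does not say which method was used.

```rust
use std::fmt::Write as _;
use std::time::Duration;

/// `write!`/`writeln!` append in place instead of `format!` + `push_str`.
fn render(names: &[&str]) -> String {
    let mut out = String::new();
    for name in names {
        // Writing to a String cannot fail, so the Result is ignored.
        let _ = writeln!(out, "- {}", name);
    }
    out
}

/// `budget - elapsed` panics on underflow; saturating_sub returns zero instead.
fn remaining(budget: Duration, elapsed: Duration) -> Duration {
    budget.saturating_sub(elapsed)
}

/// `clamp` replaces a manual `min(...).max(...)` chain.
fn clamp_percent(value: i64) -> i64 {
    value.clamp(0, 100)
}

/// Digit separators make long numeric literals easier to read.
const BAUD_RATE: u32 = 115_200;
```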
Kieran
808450c48e feat: custom global api_url 2026-02-17 18:48:45 +08:00
Chummy
8371f412f8 feat(observability): propagate optional cost_usd on agent end 2026-02-17 18:16:12 +08:00
Chummy
f75f73a50d fix(agent): preserve native tool-call fallbacks and history fidelity 2026-02-17 17:55:38 +08:00
Vernon Stinebaker
f322360248 feat(providers): add native tool-call API support via chat_with_tools
Add chat_with_tools() to the Provider trait with a default fallback to
chat_with_history(). Implement native tool calling in OpenRouterProvider,
reusing existing NativeChatRequest/NativeChatResponse structs. Wire the
agent loop to use native tool calls when the provider supports them,
falling back to XML-based parsing otherwise.

Changes are purely additive to traits.rs and openrouter.rs. The only
deletions (36 lines) are within run_tool_call_loop() in loop_.rs where
the LLM call section was replaced with a branching if/else for native
vs XML tool calling.

Includes 5 new tests covering:
- chat_with_tools error path (missing API key)
- NativeChatResponse deserialization (tool calls only, mixed)
- parse_native_response conversion to ChatResponse
- tools_to_openai_format schema validation
2026-02-17 17:55:38 +08:00
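Note on f322360248: a default trait method that falls back to `chat_with_history()` can be sketched as below. The signatures are simplified assumptions (synchronous, no error type); `EchoProvider`, the reduced `ChatResponse` and `ToolSpec` structs, and `supports_native_tools` are hypothetical.

```rust
/// Reduced stand-in for the provider response type.
#[derive(Debug, PartialEq)]
struct ChatResponse {
    text: String,
    tool_calls: Vec<String>,
}

/// Reduced stand-in for a tool description.
struct ToolSpec {
    name: String,
}

trait Provider {
    fn chat_with_history(&self, messages: &[String]) -> ChatResponse;

    /// Providers without native tool calling keep this default.
    fn supports_native_tools(&self) -> bool {
        false
    }

    /// Default fallback: ignore the tool specs and use the plain chat path,
    /// leaving tool-call extraction to text-based (XML) parsing.
    fn chat_with_tools(&self, messages: &[String], _tools: &[ToolSpec]) -> ChatResponse {
        self.chat_with_history(messages)
    }
}

/// A provider that implements only the required method.
struct EchoProvider;

impl Provider for EchoProvider {
    fn chat_with_history(&self, messages: &[String]) -> ChatResponse {
        let last = messages.last().map(String::as_str).unwrap_or("");
        ChatResponse {
            text: format!("echo: {}", last),
            tool_calls: Vec::new(),
        }
    }
}
```

With this shape the agent loop can always call `chat_with_tools`, and only providers that override it send structured tool definitions to the API.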
mai1015
0e9852ec06 feat: pass a cloned config to all_tools_with_runtime for improved tool initialization 2026-02-17 17:06:28 +08:00
mai1015
fb2d1cea0b Implement cron job management tools and types
- Added `JobType`, `SessionTarget`, `Schedule`, `DeliveryConfig`, `CronJob`, `CronRun`, and `CronJobPatch` types in `src/cron/types.rs` for cron job configuration and management.
- Introduced `CronAddTool`, `CronListTool`, `CronRemoveTool`, `CronRunTool`, `CronRunsTool`, and `CronUpdateTool` in `src/tools` for adding, listing, removing, running, and updating cron jobs.
- Updated the `run` function in `src/daemon/mod.rs` to conditionally start the scheduler based on the cron configuration.
- Modified command-line argument parsing in `src/lib.rs` and `src/main.rs` to support new cron job commands.
- Enhanced the onboarding wizard in `src/onboard/wizard.rs` to include cron configuration.
- Added tests for cron job tools to ensure functionality and error handling.
2026-02-17 17:06:28 +08:00
darwin808
4413790859 chore(lint): remove unused imports, variables, and redundant mut bindings
Eliminate low-risk clippy warnings as part of the strict lint backlog (#409):

- Remove unused `uuid::Uuid` imports from slack and telegram channels
- Remove unnecessary `mut` and redundant rebindings in agent loop
- Prefix unused `channel_id` variable in discord channel
- Remove unused test imports (`ChatResponse`, `ToolCall`, `TempDir`, `Path`)
2026-02-17 16:40:58 +08:00