zeroclaw

Author	SHA1	Message	Date
Chummy	d9a94fc763	fix(skills): escape inlined skill XML content	2026-02-20 01:28:49 +08:00
Edvard	8a4da141d6	fix(skills): inject skill prompts and tools into agent system prompt Skill prompts and tool definitions from SKILL.toml were parsed and stored correctly but never included in the agent's system prompt. Both prompt-building paths (channels/mod.rs and agent/prompt.rs) only emitted skill metadata (name, description, location), telling the LLM to "read" the SKILL.toml on demand. This caused the agent to attempt manual file reads that often failed, leaving skills effectively ignored. Now both paths inline <instructions> and <tools> blocks inside each <skill> XML element, so the agent receives full skill context without extra tool calls. Closes #877 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-20 01:28:49 +08:00
Chummy	d714d3984e	fix(memory): stop autosaving assistant summaries and filter legacy entries	2026-02-20 01:14:08 +08:00
Alex Gorevski	dce7280812	Merge pull request #865 from agorevski/feat/systematic-test-coverage-852 test: add systematic test coverage for 7 bug pattern groups (#852)	2026-02-19 07:02:20 -08:00
Chummy	dcd0bf641d	feat: add multimodal image marker support with Ollama vision	2026-02-19 21:25:21 +08:00
Chummy	a5d7911923	feat(runtime): add reasoning toggle for ollama	2026-02-19 21:05:19 +08:00
Chummy	572aa77c2a	feat(memory): add embedding hint routes and upgrade guidance	2026-02-19 20:49:53 +08:00
Chummy	d6dca4b890	fix(provider): align native tool system-flattening and add regressions	2026-02-19 17:44:07 +08:00
YubinghanBai	48eb1d1f30	fix(agent): inject full datetime into system prompt and allow date command Three related agent UX issues found during MiniMax channel testing: 1. DateTimeSection injected only timezone, not the actual date/time. Models have no reliable way to know the current date from training data alone, causing wrong or hallucinated dates in responses. Fix: include full timestamp (YYYY-MM-DD HH:MM:SS TZ) in the prompt. 2. The `date` shell command was absent from the security policy allowed_commands default list. When a model tried to call shell("date") to get the current time, it received a policy rejection and told the user it was "blocked by security policy". Fix: add "date" to the default allowed_commands list. The command is read-only, side-effect-free, and carries no security risk. 3. (Context) The datetime prompt fix makes the date command fallback largely unnecessary, but the allowlist addition ensures the tool works correctly if models choose to call it anyway. Non-goals: - Not changing the autonomy model or risk classification - Not adding new config keys Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-19 17:44:07 +08:00
Devin AI	44fa7f3d3d	fix(agent): include workspace files when AIEOS identity is configured Remove early return in IdentitySection::build() that caused AGENTS.md, SOUL.md, and other workspace files to be silently skipped when AIEOS identity loaded successfully. Both AIEOS identity and workspace files now coexist in the system prompt. Closes zeroclaw-labs/zeroclaw#856 Co-Authored-By: Kristofer Mondlane <kmondlane@gmail.com>	2026-02-19 15:24:58 +08:00
Alex Gorevski	7f03ab77a9	test: add systematic test coverage for 7 bug pattern groups (#852) Add ~105 test cases across 7 test groups identified in issue #852: TG1 - Provider resolution (27 tests): Factory resolution, alias mapping, custom URLs, auth styles, credential wiring TG2 - Config persistence (18 tests): Config defaults, TOML roundtrip, agent/memory config, workspace dirs TG3 - Channel routing (14 tests): ChannelMessage identity contracts, SendMessage construction, Channel trait send/listen roundtrip TG4 - Agent loop robustness (12 integration + 14 inline tests): Malformed tool calls, failing tools, iteration limits, empty responses, unicode TG5 - Memory restart (14 tests): Dedup on same key, restart persistence, session scoping, recall, concurrent stores, categories TG6 - Channel message splitting (8+8 inline tests): Code blocks at boundary, long words, emoji, CJK chars, whitespace edge cases TG7 - Provider schema (21 tests): ChatMessage/ToolCall/ChatResponse serialization, tool_call_id preservation, auth style variants Also fixes a bug in split_message_for_telegram() where byte-based indexing could panic on multi-byte characters (emoji, CJK). Now uses char_indices() consistent with the Discord split implementation. Closes #852 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-02-18 15:28:34 -08:00
Chummy	b4b379e3e7	fix(providers): harden tool fallback and refresh model catalogs	2026-02-18 22:50:02 +08:00
Chummy	50fd5b81e1	fix(test): stabilize cron output capture and clippy cleanups	2026-02-18 20:29:26 +08:00
Chummy	483acccdb7	feat(memory): add configurable postgres storage backend	2026-02-18 20:29:26 +08:00
Vernon Stinebaker	3b0133596c	feat(providers): add native tool calling for OpenAI-compatible providers Implement chat_with_tools() on CompatibleProvider so OpenAI-compatible endpoints (OpenRouter, local LLMs, etc.) can use structured tool calling instead of prompt-injected tool descriptions. Changes: - CompatibleProvider: capabilities() reports native_tool_calling, new chat_with_tools() sends tools in API request and parses tool_calls from response, chat() bridges to chat_with_tools() when ToolSpecs are provided - RouterProvider: chat_with_tools() delegation with model hint resolution - loop_.rs: expose tools_to_openai_format as pub(crate), add tools_to_openai_format_from_specs for ToolSpec-based conversion Adds 9 new tests and updates 1 existing test.	2026-02-18 18:06:36 +08:00
Chummy	219764d4d8	fix(channels): recover malformed invoke/tool_call output in daemon mode	2026-02-18 17:01:36 +08:00
Xiangjun Ma	f1db63219c	refactor(telegram): address code review findings - Add strip_tool_call_tags() to finalize_draft to prevent Markdown parse failures from tool-call tags reaching Telegram API - Deduplicate parse_reply_target() call in update_draft (was called twice, discarding thread_id both times) - Replace body.as_object_mut().unwrap() mutation with separate plain_body JSON literal (eliminates unwrap in runtime path) - Clean up per-chat rate-limit HashMap entry in finalize_draft to prevent unbounded growth over long uptimes - Extract magic number 80 to STREAM_CHUNK_MIN_CHARS constant in agent loop	2026-02-18 16:33:33 +08:00
Xiangjun Ma	93538a70e3	fix(agent): relay final response as progressive chunks via on_delta Previously on_delta sent the entire completed response as a single message, defeating the purpose of the streaming draft updates. Now the text is split into ~80-char chunks on whitespace boundaries (UTF-8 safe via split_inclusive) and sent progressively through the channel, so Telegram draft edits show text arriving incrementally. The consumer in process_channel_message already accumulates chunks and calls update_draft with the full text so far, and Telegram's rate-limiting (draft_update_interval_ms) throttles editMessageText calls to avoid API spam.	2026-02-18 16:33:33 +08:00
Xiangjun Ma	118cd53922	feat(channel): stream LLM responses to Telegram via draft message edits Wire the existing provider-layer streaming infrastructure through the channel trait and agent loop so Telegram users see tokens arrive progressively via editMessageText, instead of waiting for the full response. Changes: - Add StreamMode enum (off/partial/block) and draft_update_interval_ms to TelegramConfig (backward-compatible defaults: off, 1000ms) - Add supports_draft_updates/send_draft/update_draft/finalize_draft to Channel trait with no-op defaults (zero impact on existing channels) - Implement draft methods on TelegramChannel using sendMessage + editMessageText with rate limiting and Markdown fallback - Add on_delta mpsc::Sender<String> parameter to run_tool_call_loop (None preserves existing behavior) - Wire streaming in process_channel_message: when channel supports drafts, send initial draft, spawn updater task, finalize on completion Edge cases handled: - 4096-char limit: finalize draft and fall back to chunked send - Broken Markdown: use no parse_mode during streaming, apply on finalize - Edit failures: fall back to sending complete response as new message - Rate limiting: configurable draft_update_interval_ms (default 1s)	2026-02-18 16:33:33 +08:00
Chummy	f3bdff1d69	fix(agent): harden glm tool-call parsing and scope PR	2026-02-18 15:23:35 +08:00
adisusilayasa	58c81aa258	feat(agent): add GLM-style tool call parsing GLM models output tool calls in proprietary formats that ZeroClaw doesn't natively support. This adds parsing for GLM-specific formats: - browser_open/url>https://... -> shell tool with curl command - shell/command>ls -> shell tool with command arg - http_request/url>... -> http_request tool - Plain URLs -> shell tool with curl command Also adds: - find_json_end() helper for parsing JSON objects - Unclosed <toolcall> tag handling - Unit tests for GLM-style parsing The parsing is deliberately placed after XML and markdown code block parsing, so it acts as a fallback for models that don't use standard tool calling formats. This enables GLM models (via Z.AI or other providers) to successfully execute tools in ZeroClaw.	2026-02-18 15:23:35 +08:00
Edvard	6e53341bb1	feat(agent): add rule-based query classification for automatic model routing Classify incoming user messages by keyword/pattern and route to the appropriate model hint automatically, feeding into the existing RouterProvider. Disabled by default; opt-in via [query_classification] config section. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 14:41:58 +08:00
Chummy	431287184b	style(tests): apply rustfmt to brittle-test hardening changes	2026-02-18 14:17:58 +08:00
Alex Gorevski	45cdd25b3d	fix(tests): harden brittle tests for cross-platform stability and refactoring resilience ## Problem The test suite contained several categories of latent brittleness identified in docs/testing-brittle-tests.md that would surface during refactoring or cross-platform (Windows) CI execution: 1. Hardcoded Unix paths: `Path::new("/tmp")` and `PathBuf::from("/tmp")` used as workspace directories in agent tests, which fail on Windows where /tmp does not exist. 2. Exact string match assertions: ~20 `assert_eq!(response, "exact text")` assertions in agent unit and e2e tests that break on any mock wording change, even when the underlying orchestration behavior is correct. 3. Fragile error message string matching: `.contains("specific message")` assertions coupled to internal error wording rather than testing the error category or behavioral outcome. ## What Changed ### Hardcoded paths → platform-agnostic temp dirs (4 files, 7 locations) - `src/agent/tests.rs`: Replaced all 4 instances of `Path::new("/tmp")` and `PathBuf::from("/tmp")` with `std::env::temp_dir()` in `make_memory()`, `build_agent_with()`, `build_agent_with_memory()`, and `build_agent_with_config()` helpers. - `tests/agent_e2e.rs`: Replaced all 3 instances in `make_memory()`, `build_agent()`, and `build_agent_xml()` helpers. ### Exact string assertions → behavioral checks (2 files, ~20 locations) - `src/agent/tests.rs`: Converted 10 `assert_eq!(response, "...")` to `assert!(!response.is_empty(), "descriptive message")` across tests for text pass-through, tool execution, tool failure recovery, XML dispatch, mixed text+tool responses, multi-tool batch, and run_single delegation. - `tests/agent_e2e.rs`: Converted 9 exact-match assertions to behavioral checks. Multi-turn test now uses `assert_ne!(r1, r2)` to verify sequential responses are distinct without coupling to exact wording. - Provider error propagation test simplified to `assert!(result.is_err())` without asserting on the error message string. ### Fragile error message assertions → structural checks (2 files) - `src/tools/git_operations.rs`: Replaced fragile OR-branch string match (`contains("git repository") || contains("Git command failed")`) with structural assertions: checks `!result.success`, error is non-empty, and error does NOT mention autonomy/read-only (verifying the failure is git-related, not permission-related). - `src/cron/scheduler.rs`: Replaced `contains("agent job failed:")` with `!success` and `!output.is_empty()` checks that verify failure behavior without coupling to exact log format. ## What Was NOT Changed (and why) - `src/agent/loop_.rs` parser tests: Exact string assertions are the contract for XML tool call parsing — the exact output IS the spec. - `src/providers/reliable.rs`: Error message assertions test the error format contract (provider/model attribution in failure messages). - `src/service/mod.rs`: Already platform-gated with `#[cfg]`; XML escape test is a formatting contract where exact match is appropriate. - `src/config/schema.rs`: TOML test strings use /tmp as data values for deserialization tests, not filesystem access; HOME tests already use `std::env::temp_dir()`. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-02-18 14:17:58 +08:00
Edvard	8a1e7cc7ef	fix(agent): use config max_tool_iterations, add memory relevance filtering, rebalance search weights Three fixes for conversation quality issues: 1. loop_.rs and channels now read max_tool_iterations from AgentConfig instead of using a hardcoded constant of 10, making it configurable. 2. Memory recall now filters entries below a configurable min_relevance_score threshold (default 0.4), preventing unrelated memories from bleeding into conversation context. 3. Default hybrid search weights rebalanced from 70/30 vector/keyword to 40/60, reducing cross-topic semantic bleed. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 14:14:33 +08:00
Edvard	6d8725c9e6	fix(agent): log warning when native tool call arguments fail JSON parsing The NativeToolDispatcher silently defaults to an empty object when tool call arguments from the LLM fail to parse as JSON. The XML dispatcher already logs a warning for the same case (line 68). Add a matching tracing::warn with tool name and parse error for observability parity. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 12:56:56 +08:00
Edvard	63602a262f	fix(agent): use config-driven limits in run_tool_call_loop and trim_history run_tool_call_loop used a hardcoded MAX_TOOL_ITERATIONS (10) and trim_history/auto_compact_history used a hardcoded MAX_HISTORY_MESSAGES (50), ignoring the user-configurable agent.max_tool_iterations and agent.max_history_messages values in config.toml. Meanwhile, agent.rs correctly reads from config — creating an inconsistency where CLI single-shot mode respected config but the channel runtime and interactive CLI loop silently ignored it. Changes: - Rename constants to DEFAULT_* to clarify they are fallback defaults - Add max_tool_iterations parameter to run_tool_call_loop - Add max_history parameter to trim_history and auto_compact_history - Thread config.agent.max_tool_iterations through ChannelRuntimeContext - Both CLI code paths now pass config values to run_tool_call_loop Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 12:49:28 +08:00
Chummy	2560399423	feat(observability): focus PR 596 on Prometheus backend	2026-02-18 12:06:05 +08:00
argenis de la rosa	eba544dbd4	feat(observability): implement Prometheus metrics backend with /metrics endpoint - Adds PrometheusObserver backend with counters, histograms, and gauges - Tracks agent starts/duration, tool calls, channel messages, heartbeat ticks, errors, request latency, tokens, sessions, queue depth - Adds GET /metrics endpoint to gateway for Prometheus scraping - Adds provider/model labels to AgentStart and AgentEnd events for better observability - Adds as_any() method to Observer trait for backend-specific downcast Metrics exposed: - zeroclaw_agent_starts_total (Counter) with provider/model labels - zeroclaw_agent_duration_seconds (Histogram) with provider/model labels - zeroclaw_tool_calls_total (Counter) with tool/success labels - zeroclaw_tool_duration_seconds (Histogram) with tool label - zeroclaw_channel_messages_total (Counter) with channel/direction labels - zeroclaw_heartbeat_ticks_total (Counter) - zeroclaw_errors_total (Counter) with component label - zeroclaw_request_latency_seconds (Histogram) - zeroclaw_tokens_used_last (Gauge) - zeroclaw_active_sessions (Gauge) - zeroclaw_queue_depth (Gauge) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 12:06:05 +08:00
Chummy	3467d34596	fix(agent): avoid duplicate text in markdown tool_call fallback	2026-02-18 10:15:46 +08:00
Edvard	cb7df7c87f	style(agent): apply rustfmt formatting to loop_.rs Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 10:15:46 +08:00
Edvard	0c46b56555	fix(agent): satisfy clippy::if_not_else lint in tool history push Flip conditional to use positive check (is_empty) in the if-branch to resolve clippy::if_not_else error in CI strict delta lint gate. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 10:15:46 +08:00
Edvard	0e5a785015	fix(agent): use native format for tool result history in run_tool_call_loop When use_native_tools is true, the agent loop now: - Formats assistant history as JSON with tool_calls array (matching what convert_messages() expects to reconstruct NativeMessage) - Pushes each tool result as ChatMessage::tool with tool_call_id (instead of a single ChatMessage::user with XML tool_result tags) - Adds fallback parsing for markdown code block tool calls (```tool_call ... ``` and hybrid ```tool_call ... </tool_call>) Without this, the second LLM call (sending tool results back) gets rejected with 4xx by OpenRouter/Gemini because the message format doesn't match the OpenAI tool calling API expectations. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-18 10:15:46 +08:00
Kieran	d756293871	feat: add /clear command	2026-02-18 10:01:22 +08:00
Chummy	5942caa083	chore(pr539): scope to dingtalk daemon fixes only	2026-02-18 00:42:40 +08:00
JamesYin	9eff7a13bb	fix(agent): parse legacy schedule tool_call payloads	2026-02-18 00:42:40 +08:00
JamesYin	af5d1f3066	fix(agent): recover malformed tool_call blocks with leading text	2026-02-18 00:42:40 +08:00
JamesYin	59f74e8f39	fix(agent): retry malformed prefixed tool_call markup	2026-02-18 00:42:40 +08:00
JamesYin	128e888d7a	style: format rebased conflict resolutions	2026-02-18 00:42:40 +08:00
JamesYin	3522d51f98	fix(agent): retry malformed tool_call payloads in tool loop	2026-02-18 00:42:40 +08:00
Chummy	40ab5c3507	fix(agent): rebase alias-tag parser and align channel send API	2026-02-18 00:28:08 +08:00
Chummy	4243d8ec86	fix(agent): parse tool-call alias tags in channel runtime	2026-02-18 00:28:08 +08:00
Chummy	ed675d4e6b	test(agent): add comprehensive loop test suite	2026-02-18 00:26:31 +08:00
Chummy	0aa35eb669	fix(build): complete strict lint and test cleanup (replacement for #476)	2026-02-18 00:18:54 +08:00
Chummy	fc6e8eb521	fix(provider): follow-up CN/global consistency for Z.AI and aliases (#554) * fix(provider): harden CN/global routing consistency for Chinese vendors * fix(agent): migrate CLI channel send to SendMessage * fix(onboard): deduplicate Z.AI key URL match arms	2026-02-18 00:04:56 +08:00
Chummy	bb641d28c2	fix(approval): harden CLI approval flow and summaries	2026-02-17 23:06:12 +08:00
stawky	ab561baa97	feat(approval): interactive approval workflow for supervised mode (#215) - Add auto_approve / always_ask fields to AutonomyConfig - New src/approval/ module: ApprovalManager with session-scoped allowlist, ApprovalRequest/Response types, audit logging, CLI interactive prompt - Insert approval hook in agent_turn before tool execution - Non-CLI channels auto-approve; CLI shows Y/N/A prompt - Skip approval for read-only tools (file_read, memory_recall) by default - 15 unit tests covering all approval logic	2026-02-17 23:06:12 +08:00
Will Sarg	ee05d62ce4	Merge branch 'main' into pr-484-clean	2026-02-17 08:54:24 -05:00
Vernon Stinebaker	df31359ec4	feat(agent): scrub credentials from tool output (#532) * feat(channels): add channel capabilities to system prompt Add channel capabilities section to system prompt so the agent knows it can send Discord messages directly without asking permission. Also reminds agent not to repeat or echo credentials. Co-authored-by: Vernon Stinebaker <vernon.stinebaker@gmail.com> * feat(agent): scrub credentials from tool output * chore: fix clippy and formatting for scrubbing	2026-02-17 08:23:11 -05:00
argenis de la rosa	1908af3248	fix(discord): use channel_id instead of sender for replies (fixes #483) fix(misc): complete parking_lot::Mutex migration (fixes #505) - DiscordChannel: store actual channel_id in ChannelMessage.channel instead of hardcoded "discord" string - channels/mod.rs: use msg.channel instead of msg.sender for replies - Migrate all std::sync::Mutex to parking_lot::Mutex: * src/security/audit.rs * src/memory/sqlite.rs * src/memory/response_cache.rs * src/memory/lucid.rs * src/channels/email_channel.rs * src/gateway/mod.rs * src/observability/traits.rs * src/providers/reliable.rs * src/providers/router.rs * src/agent/agent.rs - Remove all .lock().unwrap() and .map_err(PoisonError) patterns since parking_lot::Mutex never poisons Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-17 08:05:25 -05:00
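
Note on 7f03ab77a9: the panic described there comes from slicing a `&str` at a byte offset that falls inside a multi-byte character. The sketch below shows the general `char_indices()` technique the commit message names. It is not the project's `split_message_for_telegram()`; the helper name `split_by_chars` and its signature are hypothetical, and it splits by character count only, without the whitespace or code-block handling the real splitter may have.

```rust
/// Hypothetical helper (not from the repository): split `text` into pieces
/// of at most `max_chars` characters, cutting only on char boundaries.
fn split_by_chars(text: &str, max_chars: usize) -> Vec<&str> {
    assert!(max_chars > 0, "max_chars must be positive");
    let mut parts = Vec::new();
    let mut start = 0; // byte offset where the current piece begins
    let mut count = 0; // number of characters in the current piece

    // char_indices() yields the byte offset of each character, so every
    // offset used for slicing below is a valid char boundary.
    for (idx, _) in text.char_indices() {
        if count == max_chars {
            parts.push(&text[start..idx]);
            start = idx;
            count = 0;
        }
        count += 1;
    }
    // Push the trailing piece, if any text remains.
    if start < text.len() {
        parts.push(&text[start..]);
    }
    parts
}
```

Calling `split_by_chars("héllo🙂", 2)` cuts after every second character rather than every second byte, so the emoji and the accented letter are never split in half.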


97 commits
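
Note on 8a1e7cc7ef: that commit describes filtering recalled memories below `min_relevance_score` (default 0.4) and rebalancing hybrid search weights from 70/30 to 40/60 vector/keyword. The sketch below illustrates the effect under a loudly stated assumption: that the hybrid score is a plain weighted sum of a vector score and a keyword score. The commit message does not say how the two are combined, and the type `Scored` and both function names are hypothetical.

```rust
/// Hypothetical scored memory entry; field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
struct Scored {
    key: String,
    vector_score: f64,
    keyword_score: f64,
}

/// Assumed combination rule: a weighted sum of the two scores.
fn hybrid_score(e: &Scored, vector_weight: f64, keyword_weight: f64) -> f64 {
    vector_weight * e.vector_score + keyword_weight * e.keyword_score
}

/// Keep only entries whose hybrid score reaches `min_relevance_score`.
fn filter_relevant(
    entries: Vec<Scored>,
    vector_weight: f64,
    keyword_weight: f64,
    min_relevance_score: f64,
) -> Vec<Scored> {
    entries
        .into_iter()
        .filter(|e| hybrid_score(e, vector_weight, keyword_weight) >= min_relevance_score)
        .collect()
}

/// Two sample entries: one semantically close but with no keyword overlap,
/// one with modest similarity but a solid keyword match.
fn sample_entries() -> Vec<Scored> {
    vec![
        Scored { key: "semantic_only".into(), vector_score: 0.9, keyword_score: 0.0 },
        Scored { key: "keyword_match".into(), vector_score: 0.2, keyword_score: 0.6 },
    ]
}
```

Under this assumed rule, an entry with high vector similarity and no keyword overlap passes the 0.4 threshold at 70/30 weights but is dropped at 40/60, which matches the stated goal of reducing cross-topic semantic bleed.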