Skill prompts and tool definitions from SKILL.toml were parsed and stored
correctly but never included in the agent's system prompt. Both prompt-building
paths (channels/mod.rs and agent/prompt.rs) only emitted skill metadata (name,
description, location), telling the LLM to "read" the SKILL.toml on demand.
This caused the agent to attempt manual file reads that often failed, leaving
skills effectively ignored.
Now both paths inline <instructions> and <tools> blocks inside each <skill>
XML element, so the agent receives full skill context without extra tool calls.
Closes#877
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Three related agent UX issues found during MiniMax channel testing:
1. DateTimeSection injected only timezone, not the actual date/time.
Models have no reliable way to know the current date from training
data alone, causing wrong or hallucinated dates in responses.
Fix: include full timestamp (YYYY-MM-DD HH:MM:SS TZ) in the prompt.
2. The `date` shell command was absent from the security policy
allowed_commands default list. When a model tried to call
shell("date") to get the current time, it received a policy
rejection and told the user it was "blocked by security policy".
Fix: add "date" to the default allowed_commands list. The command
is read-only, side-effect-free, and carries no security risk.
3. (Context) The datetime prompt fix makes the date command fallback
largely unnecessary, but the allowlist addition ensures the tool
works correctly if models choose to call it anyway.
Non-goals:
- Not changing the autonomy model or risk classification
- Not adding new config keys
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Remove early return in IdentitySection::build() that caused AGENTS.md,
SOUL.md, and other workspace files to be silently skipped when AIEOS
identity loaded successfully. Both AIEOS identity and workspace files
now coexist in the system prompt.
Closeszeroclaw-labs/zeroclaw#856
Co-Authored-By: Kristofer Mondlane <kmondlane@gmail.com>
Implement chat_with_tools() on CompatibleProvider so OpenAI-compatible
endpoints (OpenRouter, local LLMs, etc.) can use structured tool calling
instead of prompt-injected tool descriptions.
Changes:
- CompatibleProvider: capabilities() reports native_tool_calling, new
chat_with_tools() sends tools in API request and parses tool_calls
from response, chat() bridges to chat_with_tools() when ToolSpecs
are provided
- RouterProvider: chat_with_tools() delegation with model hint resolution
- loop_.rs: expose tools_to_openai_format as pub(crate), add
tools_to_openai_format_from_specs for ToolSpec-based conversion
Adds 9 new tests and updates 1 existing test.
- Add strip_tool_call_tags() to finalize_draft to prevent Markdown
parse failures from tool-call tags reaching Telegram API
- Deduplicate parse_reply_target() call in update_draft (was called
twice, discarding thread_id both times)
- Replace body.as_object_mut().unwrap() mutation with separate
plain_body JSON literal (eliminates unwrap in runtime path)
- Clean up per-chat rate-limit HashMap entry in finalize_draft to
prevent unbounded growth over long uptimes
- Extract magic number 80 to STREAM_CHUNK_MIN_CHARS constant in
agent loop
Previously on_delta sent the entire completed response as a single
message, defeating the purpose of the streaming draft updates. Now
the text is split into ~80-char chunks on whitespace boundaries
(UTF-8 safe via split_inclusive) and sent progressively through the
channel, so Telegram draft edits show text arriving incrementally.
The consumer in process_channel_message already accumulates chunks
and calls update_draft with the full text so far, and Telegram's
rate-limiting (draft_update_interval_ms) throttles editMessageText
calls to avoid API spam.
Wire the existing provider-layer streaming infrastructure through the
channel trait and agent loop so Telegram users see tokens arrive
progressively via editMessageText, instead of waiting for the full
response.
Changes:
- Add StreamMode enum (off/partial/block) and draft_update_interval_ms
to TelegramConfig (backward-compatible defaults: off, 1000ms)
- Add supports_draft_updates/send_draft/update_draft/finalize_draft to
Channel trait with no-op defaults (zero impact on existing channels)
- Implement draft methods on TelegramChannel using sendMessage +
editMessageText with rate limiting and Markdown fallback
- Add on_delta mpsc::Sender<String> parameter to run_tool_call_loop
(None preserves existing behavior)
- Wire streaming in process_channel_message: when channel supports
drafts, send initial draft, spawn updater task, finalize on completion
Edge cases handled:
- 4096-char limit: finalize draft and fall back to chunked send
- Broken Markdown: use no parse_mode during streaming, apply on finalize
- Edit failures: fall back to sending complete response as new message
- Rate limiting: configurable draft_update_interval_ms (default 1s)
GLM models output tool calls in proprietary formats that ZeroClaw
doesn't natively support. This adds parsing for GLM-specific formats:
- browser_open/url>https://... -> shell tool with curl command
- shell/command>ls -> shell tool with command arg
- http_request/url>... -> http_request tool
- Plain URLs -> shell tool with curl command
Also adds:
- find_json_end() helper for parsing JSON objects
- Unclosed <toolcall> tag handling
- Unit tests for GLM-style parsing
The parsing is deliberately placed after XML and markdown code block
parsing, so it acts as a fallback for models that don't use standard
tool calling formats.
This enables GLM models (via Z.AI or other providers) to successfully
execute tools in ZeroClaw.
Classify incoming user messages by keyword/pattern and route to the
appropriate model hint automatically, feeding into the existing
RouterProvider. Disabled by default; opt-in via [query_classification]
config section.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Problem
The test suite contained several categories of latent brittleness
identified in docs/testing-brittle-tests.md that would surface during
refactoring or cross-platform (Windows) CI execution:
1. Hardcoded Unix paths: \Path::new("/tmp")\ and \PathBuf::from("/tmp")\
used as workspace directories in agent tests, which fail on Windows
where /tmp does not exist.
2. Exact string match assertions: ~20 \ssert_eq!(response, "exact text")\
assertions in agent unit and e2e tests that break on any mock wording
change, even when the underlying orchestration behavior is correct.
3. Fragile error message string matching: \.contains("specific message")\
assertions coupled to internal error wording rather than testing the
error category or behavioral outcome.
## What Changed
### Hardcoded paths → platform-agnostic temp dirs (4 files, 7 locations)
- \src/agent/tests.rs\: Replaced all 4 instances of \Path::new("/tmp")\
and \PathBuf::from("/tmp")\ with \std::env::temp_dir()\ in
\make_memory()\, \uild_agent_with()\, \uild_agent_with_memory()\,
and \uild_agent_with_config()\ helpers.
- \ ests/agent_e2e.rs\: Replaced all 3 instances in \make_memory()\,
\uild_agent()\, and \uild_agent_xml()\ helpers.
### Exact string assertions → behavioral checks (2 files, ~20 locations)
- \src/agent/tests.rs\: Converted 10 \ssert_eq!(response, "...")\ to
\ssert!(!response.is_empty(), "descriptive message")\ across tests for
text pass-through, tool execution, tool failure recovery, XML dispatch,
mixed text+tool responses, multi-tool batch, and run_single delegation.
- \ ests/agent_e2e.rs\: Converted 9 exact-match assertions to behavioral
checks. Multi-turn test now uses \ssert_ne!(r1, r2)\ to verify
sequential responses are distinct without coupling to exact wording.
- Provider error propagation test simplified to \ssert!(result.is_err())\
without asserting on the error message string.
### Fragile error message assertions → structural checks (2 files)
- \src/tools/git_operations.rs\: Replaced fragile OR-branch string match
(\contains("git repository") || contains("Git command failed")\) with
structural assertions: checks \!result.success\, error is non-empty,
and error does NOT mention autonomy/read-only (verifying the failure
is git-related, not permission-related).
- \src/cron/scheduler.rs\: Replaced \contains("agent job failed:")\ with
\!success\ and \!output.is_empty()\ checks that verify failure behavior
without coupling to exact log format.
## What Was NOT Changed (and why)
- \src/agent/loop_.rs\ parser tests: Exact string assertions are the
contract for XML tool call parsing — the exact output IS the spec.
- \src/providers/reliable.rs\: Error message assertions test the error
format contract (provider/model attribution in failure messages).
- \src/service/mod.rs\: Already platform-gated with \#[cfg]\; XML escape
test is a formatting contract where exact match is appropriate.
- \src/config/schema.rs\: TOML test strings use /tmp as data values for
deserialization tests, not filesystem access; HOME tests already use
\std::env::temp_dir()\.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three fixes for conversation quality issues:
1. loop_.rs and channels now read max_tool_iterations from AgentConfig
instead of using a hardcoded constant of 10, making it configurable.
2. Memory recall now filters entries below a configurable
min_relevance_score threshold (default 0.4), preventing unrelated
memories from bleeding into conversation context.
3. Default hybrid search weights rebalanced from 70/30 vector/keyword
to 40/60, reducing cross-topic semantic bleed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The NativeToolDispatcher silently defaults to an empty object when tool
call arguments from the LLM fail to parse as JSON. The XML dispatcher
already logs a warning for the same case (line 68). Add a matching
tracing::warn with tool name and parse error for observability parity.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
run_tool_call_loop used a hardcoded MAX_TOOL_ITERATIONS (10) and
trim_history/auto_compact_history used a hardcoded MAX_HISTORY_MESSAGES (50),
ignoring the user-configurable agent.max_tool_iterations and
agent.max_history_messages values in config.toml.
Meanwhile, agent.rs correctly reads from config — creating an inconsistency
where CLI single-shot mode respected config but the channel runtime and
interactive CLI loop silently ignored it.
Changes:
- Rename constants to DEFAULT_* to clarify they are fallback defaults
- Add max_tool_iterations parameter to run_tool_call_loop
- Add max_history parameter to trim_history and auto_compact_history
- Thread config.agent.max_tool_iterations through ChannelRuntimeContext
- Both CLI code paths now pass config values to run_tool_call_loop
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Flip conditional to use positive check (is_empty) in the if-branch
to resolve clippy::if_not_else error in CI strict delta lint gate.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When use_native_tools is true, the agent loop now:
- Formats assistant history as JSON with tool_calls array (matching
what convert_messages() expects to reconstruct NativeMessage)
- Pushes each tool result as ChatMessage::tool with tool_call_id
(instead of a single ChatMessage::user with XML tool_result tags)
- Adds fallback parsing for markdown code block tool calls
(```tool_call ... ``` and hybrid ```tool_call ... </tool_call>)
Without this, the second LLM call (sending tool results back) gets
rejected with 4xx by OpenRouter/Gemini because the message format
doesn't match the OpenAI tool calling API expectations.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(channels): add channel capabilities to system prompt
Add channel capabilities section to system prompt so the agent knows
it can send Discord messages directly without asking permission.
Also reminds agent not to repeat or echo credentials.
Co-authored-by: Vernon Stinebaker <vernon.stinebaker@gmail.com>
* feat(agent): scrub credentials from tool output
* chore: fix clippy and formatting for scrubbing
fix(misc): complete parking_lot::Mutex migration (fixes#505)
- DiscordChannel: store actual channel_id in ChannelMessage.channel
instead of hardcoded "discord" string
- channels/mod.rs: use msg.channel instead of msg.sender for replies
- Migrate all std::sync::Mutex to parking_lot::Mutex:
* src/security/audit.rs
* src/memory/sqlite.rs
* src/memory/response_cache.rs
* src/memory/lucid.rs
* src/channels/email_channel.rs
* src/gateway/mod.rs
* src/observability/traits.rs
* src/providers/reliable.rs
* src/providers/router.rs
* src/agent/agent.rs
- Remove all .lock().unwrap() and .map_err(PoisonError) patterns
since parking_lot::Mutex never poisons
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>