Commit graph

759 commits

Author SHA1 Message Date
Vernon Stinebaker
3b0133596c feat(providers): add native tool calling for OpenAI-compatible providers
Implement chat_with_tools() on CompatibleProvider so OpenAI-compatible
endpoints (OpenRouter, local LLMs, etc.) can use structured tool calling
instead of prompt-injected tool descriptions.

Changes:
- CompatibleProvider: capabilities() reports native_tool_calling, new
  chat_with_tools() sends tools in API request and parses tool_calls
  from response, chat() bridges to chat_with_tools() when ToolSpecs
  are provided
- RouterProvider: chat_with_tools() delegation with model hint resolution
- loop_.rs: expose tools_to_openai_format as pub(crate), add
  tools_to_openai_format_from_specs for ToolSpec-based conversion

Adds 9 new tests and updates 1 existing test.
2026-02-18 18:06:36 +08:00
Chummy
6acec94666 docs(custom-providers): update anthropic model example to sonnet-4-6 2026-02-18 18:06:13 +08:00
Chummy
461a4563f8 docs(config): align inline comments and sync model defaults 2026-02-18 18:06:13 +08:00
Chummy
9410e4e78e docs(agent-guides): fix section references after numbering sync 2026-02-18 18:06:13 +08:00
Chummy
d7277a3b40 docs(agent-guides): align AGENTS and CLAUDE with new docs system 2026-02-18 18:06:13 +08:00
Chummy
f6cf004800 docs(readme): refine feature messaging and de-duplicate top navigation 2026-02-18 18:06:13 +08:00
Chummy
e1990c7fb8 docs(readme): update default model to claude-sonnet-4-6 2026-02-18 18:06:13 +08:00
Chummy
93e5383cb2 docs: overhaul docs IA and multilingual navigation 2026-02-18 18:06:13 +08:00
Chummy
5e800c38f1 fix(channel): cancel and join scoped typing task safely 2026-02-18 18:01:29 +08:00
Jayson Reis
12c5473083 fix: Keep typing status on Telegram while message is being processed
# Conflicts:
#	src/channels/mod.rs
2026-02-18 18:01:29 +08:00
Chummy
1bfd50bce9 fix(mattermost): preserve threaded default and docs 2026-02-18 17:46:19 +08:00
Vernon Stinebaker
58120b1c69 feat(mattermost): add thread_replies config and typing indicator
Add two Mattermost channel enhancements:

1. thread_replies config option (default: false)
   - When false, replies go to the channel root instead of threading.
   - When true, replies thread on the original post.
   - Existing thread replies always stay in-thread regardless of setting.

2. Typing indicator (start_typing/stop_typing)
   - Implements the Channel trait's typing methods for Mattermost.
   - Fires POST /api/v4/users/me/typing every 4s in a background task.
   - Supports parent_id for threaded typing indicators.
   - Aborts cleanly on stop_typing via JoinHandle.
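
The thread routing rules in item 1 can be sketched as a single decision function. This is a minimal illustration, not the code from this commit: `reply_root_id` and its parameters are hypothetical names, and the real channel may carry this state differently.

```rust
/// Hypothetical helper: picks the post a reply should thread under.
/// `incoming_root_id` is the root of the thread the incoming post already
/// belongs to, if any. `None` as a result means "post to the channel root".
fn reply_root_id(
    thread_replies: bool,
    incoming_post_id: &str,
    incoming_root_id: Option<&str>,
) -> Option<String> {
    match incoming_root_id {
        // Existing thread replies always stay in-thread, regardless of setting.
        Some(root) if !root.is_empty() => Some(root.to_string()),
        // Top-level post with threading enabled: thread on the original post.
        _ if thread_replies => Some(incoming_post_id.to_string()),
        // Top-level post with threading disabled: reply at the channel root.
        _ => None,
    }
}
```

Keeping the decision in one pure function makes the edge cases (empty root id, already-threaded post) easy to cover in unit tests.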

Updated all MattermostChannel::new call sites (start_channels, scheduler)
and added 9 unit tests covering thread routing and edge cases.
2026-02-18 17:46:19 +08:00
Chummy
41c3e62dad fix(docker): unblock workspace build and auto-publish latest image 2026-02-18 17:14:46 +08:00
Chummy
bc5b1a7841 fix(providers): harden reasoning_content fallback behavior 2026-02-18 17:07:38 +08:00
Vernon Stinebaker
dd4f5271d1 feat(providers): support reasoning_content fallback for thinking models
Reasoning/thinking models (Qwen3, GLM-4, DeepSeek, etc.) may return
output in `reasoning_content` instead of `content`. Add automatic
fallback for both OpenAI and OpenAI-compatible providers, including
streaming SSE support.

Changes:
- Add `reasoning_content` field to response structs in both providers
- Add `effective_content()` helper that prefers `content` but falls
  back to `reasoning_content` when content is empty/null/missing
- Update all extraction sites to use `effective_content()`
- Add streaming SSE fallback for `reasoning_content` chunks
- Add 16 focused unit tests covering all edge cases
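
The fallback described above can be sketched as follows. Only the `content` and `reasoning_content` field names and the `effective_content()` name come from the commit message; the struct name, return type, and exact emptiness check are assumptions made for illustration.

```rust
/// Hypothetical response struct; the real provider structs may differ.
#[derive(Debug, Default)]
struct ResponseMessage {
    content: Option<String>,
    reasoning_content: Option<String>,
}

/// Treats both a missing field and an empty string as "no text".
fn non_empty(field: &Option<String>) -> Option<&str> {
    field.as_deref().filter(|text| !text.is_empty())
}

impl ResponseMessage {
    /// Prefers `content`, falling back to `reasoning_content` when
    /// `content` is empty or absent.
    fn effective_content(&self) -> Option<&str> {
        non_empty(&self.content).or_else(|| non_empty(&self.reasoning_content))
    }
}
```

With serde-style deserialization, a JSON `null` and a missing key would both arrive as `None`, so the same check would cover the empty, null, and missing cases.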

Tested end-to-end against GLM-4.7-flash via local LLM server.
2026-02-18 17:07:38 +08:00
Chummy
219764d4d8 fix(channels): recover malformed invoke/tool_call output in daemon mode 2026-02-18 17:01:36 +08:00
Chummy
75a9eb383c test(security): enforce lowercase token hex assertion 2026-02-18 16:56:45 +08:00
Chummy
918be53a30 test(security): harden token format regression coverage 2026-02-18 16:56:45 +08:00
hayoial
58958d9991 fix: add per-sender conversation history for channel messages
Channel messages (Telegram, Discord, etc.) previously had no multi-turn
context — each incoming message was processed with a fresh history
containing only the system prompt and the current user message.

This patch:
- Maintains a per-sender conversation history map (Arc<Mutex<HashMap>>)
- Restores prior turns when processing each new message
- Saves user + assistant turns after successful LLM response
- Caps history at 50 messages per sender to bound memory usage
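
A minimal sketch of such a per-sender history map is shown below. The `Arc<Mutex<HashMap>>` shape and the cap of 50 come from the message above; the `Turn` and `HistoryStore` types, the method names, and the choice to evict the oldest turns first are assumptions.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::{Arc, Mutex};

const MAX_HISTORY_PER_SENDER: usize = 50;

/// Hypothetical message type; the real code likely uses its own chat type.
#[derive(Debug, Clone, PartialEq)]
struct Turn {
    role: String,
    content: String,
}

/// Shared, cloneable handle to the per-sender history map.
#[derive(Clone, Default)]
struct HistoryStore {
    inner: Arc<Mutex<HashMap<String, VecDeque<Turn>>>>,
}

impl HistoryStore {
    /// Returns a copy of the prior turns for a sender (empty if unknown).
    fn prior_turns(&self, sender: &str) -> Vec<Turn> {
        let map = self.inner.lock().unwrap();
        map.get(sender)
            .map(|history| history.iter().cloned().collect())
            .unwrap_or_default()
    }

    /// Saves a user + assistant pair, then drops the oldest turns over the cap.
    fn record_exchange(&self, sender: &str, user: &str, assistant: &str) {
        let mut map = self.inner.lock().unwrap();
        let history = map.entry(sender.to_string()).or_default();
        history.push_back(Turn { role: "user".to_string(), content: user.to_string() });
        history.push_back(Turn { role: "assistant".to_string(), content: assistant.to_string() });
        while history.len() > MAX_HISTORY_PER_SENDER {
            history.pop_front();
        }
    }
}
```

Calling `record_exchange` only after a successful LLM response keeps failed turns out of the stored context.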

Fixes the channel context continuity issue where the bot would respond
with 'I have no context' to every follow-up question.
2026-02-18 16:35:38 +08:00
Xiangjun Ma
f1db63219c refactor(telegram): address code review findings
- Add strip_tool_call_tags() to finalize_draft to prevent Markdown
  parse failures from tool-call tags reaching Telegram API
- Deduplicate parse_reply_target() call in update_draft (was called
  twice, discarding thread_id both times)
- Replace body.as_object_mut().unwrap() mutation with separate
  plain_body JSON literal (eliminates unwrap in runtime path)
- Clean up per-chat rate-limit HashMap entry in finalize_draft to
  prevent unbounded growth over long uptimes
- Extract magic number 80 to STREAM_CHUNK_MIN_CHARS constant in
  agent loop
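
The per-chat rate-limit map and its cleanup can be sketched as below. All type and method names here are hypothetical, and the clock is passed in as a parameter purely to keep the sketch testable.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical limiter: tracks the last draft edit per chat id.
struct DraftRateLimiter {
    interval: Duration,
    last_edit: Mutex<HashMap<String, Instant>>,
}

impl DraftRateLimiter {
    fn new(interval: Duration) -> Self {
        Self { interval, last_edit: Mutex::new(HashMap::new()) }
    }

    /// Returns true (and records `now`) if this chat may edit its draft.
    fn try_acquire(&self, chat_id: &str, now: Instant) -> bool {
        let mut map = self.last_edit.lock().unwrap();
        let allowed = map
            .get(chat_id)
            .map_or(true, |prev| now.saturating_duration_since(*prev) >= self.interval);
        if allowed {
            map.insert(chat_id.to_string(), now);
        }
        allowed
    }

    /// Called when a draft is finalized so the map does not grow forever.
    fn forget(&self, chat_id: &str) {
        self.last_edit.lock().unwrap().remove(chat_id);
    }

    /// Number of chats currently tracked.
    fn tracked(&self) -> usize {
        self.last_edit.lock().unwrap().len()
    }
}
```

Removing the entry on finalize bounds the map by the number of drafts in flight rather than by the number of chats ever seen.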
2026-02-18 16:33:33 +08:00
Chummy
e326e12039 test(telegram): cover draft streaming paths and simplify stream modes 2026-02-18 16:33:33 +08:00
Xiangjun Ma
e21fe1ff55 fix(telegram): address Copilot review feedback
- Fix silent parse failures: message_id.parse().unwrap_or(0) replaced
  with match + tracing::warn on parse error (update_draft, finalize_draft)
- Fix UTF-8 panic: byte-based truncation replaced with char_indices()
  safe boundary detection for TELEGRAM_MAX_MESSAGE_LENGTH
- Fix global rate limiter: Mutex<Option<Instant>> replaced with
  Mutex<HashMap<String, Instant>> for per-chat rate limiting so
  concurrent conversations don't interfere with each other
- Document Block variant: clarify it's reserved for future use and
  currently behaves the same as Partial
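
The UTF-8 fix can be illustrated with a truncation helper built on `char_indices()`. The function name is hypothetical, and the byte budget is an assumption for the sketch; the actual unit behind TELEGRAM_MAX_MESSAGE_LENGTH is not stated here.

```rust
/// Returns the longest prefix of `text` that fits in `max_bytes` and ends on
/// a character boundary, so slicing can never split a multi-byte character.
fn truncate_at_char_boundary(text: &str, max_bytes: usize) -> &str {
    if text.len() <= max_bytes {
        return text;
    }
    let end = text
        .char_indices()
        // Byte offset just past each character.
        .map(|(idx, ch)| idx + ch.len_utf8())
        // Keep only the offsets that still fit in the budget.
        .take_while(|&end| end <= max_bytes)
        .last()
        .unwrap_or(0);
    &text[..end]
}
```

A naive `&text[..max_bytes]` panics when `max_bytes` falls inside a multi-byte character, which is the failure mode the commit describes.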
2026-02-18 16:33:33 +08:00
Xiangjun Ma
93538a70e3 fix(agent): relay final response as progressive chunks via on_delta
Previously on_delta sent the entire completed response as a single
message, defeating the purpose of the streaming draft updates. Now
the text is split into ~80-char chunks on whitespace boundaries
(UTF-8 safe via split_inclusive) and sent progressively through the
channel, so Telegram draft edits show text arriving incrementally.
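
The chunking step might look roughly like the sketch below. It assumes chunks are closed once they reach a minimum character count. The function name is hypothetical, and the constant borrows the name STREAM_CHUNK_MIN_CHARS from the later refactor commit.

```rust
const STREAM_CHUNK_MIN_CHARS: usize = 80;

/// Splits `text` into chunks of at least `min_chars` characters, breaking
/// only after whitespace. `split_inclusive` keeps the separator attached to
/// each piece, so concatenating the chunks reproduces the input exactly.
fn split_into_chunks(text: &str, min_chars: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    for piece in text.split_inclusive(char::is_whitespace) {
        current.push_str(piece);
        if current.chars().count() >= min_chars {
            chunks.push(std::mem::take(&mut current));
        }
    }
    // Flush whatever is left as the final (possibly short) chunk.
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

A caller would pass `STREAM_CHUNK_MIN_CHARS` as `min_chars` and send each chunk through the `on_delta` sender in order.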

The consumer in process_channel_message already accumulates chunks
and calls update_draft with the full text so far, and Telegram's
rate-limiting (draft_update_interval_ms) throttles editMessageText
calls to avoid API spam.
2026-02-18 16:33:33 +08:00
Xiangjun Ma
118cd53922 feat(channel): stream LLM responses to Telegram via draft message edits
Wire the existing provider-layer streaming infrastructure through the
channel trait and agent loop so Telegram users see tokens arrive
progressively via editMessageText, instead of waiting for the full
response.

Changes:
- Add StreamMode enum (off/partial/block) and draft_update_interval_ms
  to TelegramConfig (backward-compatible defaults: off, 1000ms)
- Add supports_draft_updates/send_draft/update_draft/finalize_draft to
  Channel trait with no-op defaults (zero impact on existing channels)
- Implement draft methods on TelegramChannel using sendMessage +
  editMessageText with rate limiting and Markdown fallback
- Add on_delta mpsc::Sender<String> parameter to run_tool_call_loop
  (None preserves existing behavior)
- Wire streaming in process_channel_message: when channel supports
  drafts, send initial draft, spawn updater task, finalize on completion

Edge cases handled:
- 4096-char limit: finalize draft and fall back to chunked send
- Broken Markdown: use no parse_mode during streaming, apply on finalize
- Edit failures: fall back to sending complete response as new message
- Rate limiting: configurable draft_update_interval_ms (default 1s)
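
The "no-op defaults" idea can be sketched with a simplified, synchronous trait. The method names and the StreamMode variants follow the commit message; the signatures, the `deliver` function, and both channel types are assumptions, and the real trait is likely async.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum StreamMode {
    #[default]
    Off,
    Partial,
    Block, // reserved; not exercised in this sketch
}

trait Channel {
    fn send(&mut self, text: &str);
    // Draft methods default to no-ops, so existing channels compile unchanged.
    fn supports_draft_updates(&self) -> bool { false }
    fn send_draft(&mut self, _initial: &str) -> Option<String> { None }
    fn update_draft(&mut self, _draft_id: &str, _text: &str) {}
    fn finalize_draft(&mut self, _draft_id: &str, _text: &str) {}
}

/// A channel that only implements the required method.
struct PlainChannel { sent: Vec<String> }

impl Channel for PlainChannel {
    fn send(&mut self, text: &str) { self.sent.push(text.to_string()); }
}

/// A channel that opts in to draft streaming unless the mode is Off.
struct DraftChannel {
    mode: StreamMode,
    sent: Vec<String>,
    edits: Vec<String>,
    finalized: Option<String>,
}

impl Channel for DraftChannel {
    fn send(&mut self, text: &str) { self.sent.push(text.to_string()); }
    fn supports_draft_updates(&self) -> bool { self.mode != StreamMode::Off }
    fn send_draft(&mut self, _initial: &str) -> Option<String> { Some("draft-1".to_string()) }
    fn update_draft(&mut self, _draft_id: &str, text: &str) { self.edits.push(text.to_string()); }
    fn finalize_draft(&mut self, _draft_id: &str, text: &str) { self.finalized = Some(text.to_string()); }
}

/// Streams via draft edits when supported, else sends one complete message.
fn deliver(channel: &mut dyn Channel, chunks: &[&str]) {
    let full = chunks.concat();
    let draft_id = if channel.supports_draft_updates() {
        channel.send_draft("...")
    } else {
        None
    };
    match draft_id {
        Some(id) => {
            let mut so_far = String::new();
            for chunk in chunks {
                so_far.push_str(chunk);
                channel.update_draft(&id, &so_far); // full text so far
            }
            channel.finalize_draft(&id, &full);
        }
        None => channel.send(&full),
    }
}
```

Because the caller checks `supports_draft_updates()` first, channels that never override the draft methods keep their existing single-send behavior.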
2026-02-18 16:33:33 +08:00
Chummy
a0b277b21e fix(web-search): harden config handling and trim unrelated CI edit 2026-02-18 15:24:21 +08:00
adisusilayasa
1757add64a feat(tools): add web_search_tool for internet search
Add native web search capability that works regardless of LLM tool-calling
support. This is particularly useful for GLM models via Z.AI that don't
reliably support standard tool calling formats.

Features:
- DuckDuckGo provider (free, no API key required)
- Brave Search provider (optional, requires API key)
- Configurable max results and timeout
- Enabled by default

Configuration (config.toml):
  [web_search]
  enabled = true
  provider = "duckduckgo"
  max_results = 5

The tool allows agents to search the web for current information without
requiring proper tool calling support from the LLM.

Also includes CI workflow fix for first-interaction action inputs.
2026-02-18 15:24:21 +08:00
Chummy
f3bdff1d69 fix(agent): harden glm tool-call parsing and scope PR 2026-02-18 15:23:35 +08:00
adisusilayasa
16c5784212 fix(ci): include workflow fix for CI to pass
The first-interaction action requires snake_case input names.
2026-02-18 15:23:35 +08:00
adisusilayasa
58c81aa258 feat(agent): add GLM-style tool call parsing
GLM models output tool calls in proprietary formats that ZeroClaw
doesn't natively support. This adds parsing for GLM-specific formats:

- browser_open/url>https://... -> shell tool with curl command
- shell/command>ls -> shell tool with command arg
- http_request/url>... -> http_request tool
- Plain URLs -> shell tool with curl command
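
As a rough illustration of the `tool/arg>value` shape listed above, the sketch below splits one such line into its three parts. It deliberately stops there: the mapping of `browser_open` and plain URLs to a curl command is not reproduced, and `ParsedCall` and `parse_glm_line` are hypothetical names, not the functions added by this commit.

```rust
#[derive(Debug, PartialEq)]
struct ParsedCall {
    tool: String,
    arg: String,
    value: String,
}

/// Parses a single `tool/arg>value` line; returns None for anything else.
fn parse_glm_line(line: &str) -> Option<ParsedCall> {
    // Split on the first '>' only, so values may themselves contain '>'.
    let (head, value) = line.trim().split_once('>')?;
    let (tool, arg) = head.split_once('/')?;
    let is_ident = |s: &str| {
        !s.is_empty() && s.chars().all(|c| c.is_ascii_alphanumeric() || c == '_')
    };
    let value = value.trim();
    if !is_ident(tool) || !is_ident(arg) || value.is_empty() {
        return None;
    }
    Some(ParsedCall {
        tool: tool.to_string(),
        arg: arg.to_string(),
        value: value.to_string(),
    })
}
```

Requiring identifier-like tool and argument names keeps ordinary prose that happens to contain `/` and `>` from being misread as a tool call.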

Also adds:
- find_json_end() helper for parsing JSON objects
- Unclosed <toolcall> tag handling
- Unit tests for GLM-style parsing

The parsing is deliberately placed after XML and markdown code block
parsing, so it acts as a fallback for models that don't use standard
tool calling formats.

This enables GLM models (via Z.AI or other providers) to successfully
execute tools in ZeroClaw.
2026-02-18 15:23:35 +08:00
mikeboensel
9f34e2465e
Merge pull request #755 from zeroclaw-labs/ISSUE-754
fix(token): update token generation to use rand::rng()

Addresses warning coming from compiler:
❯ rspberrypi@localhost:~/zeroclaw$ cargo build --release --locked
warning: use of deprecated function rand::thread_rng: Renamed to rng
   --> src/security/pairing.rs:186:11
    |
186 |     rand::thread_rng().fill_bytes(&mut bytes);
    |           ^^^^^^^^^^
    |
    = note: #[warn(deprecated)] on by default
2026-02-18 02:14:22 -05:00
Chummy
57be369771 chore(docker): keep install list indentation unchanged 2026-02-18 15:14:05 +08:00
Cemal Y. Dalar
7f15627f8c fix(docker): restore benches/ copy after stub removal in builder stage
The dep-caching layer creates stub files (src/main.rs and
benches/agent_benchmarks.rs) to warm the cargo registry cache, then
removes them with `rm -rf src benches`. The subsequent real source copy
only restored `src/` — leaving `benches/` absent. Cargo's manifest
parser then failed to locate `benches/agent_benchmarks.rs` referenced
in Cargo.toml, aborting the release build with:

  error: failed to parse manifest at `/app/Cargo.toml`
  Caused by: can't find `agent_benchmarks` bench at
  `benches/agent_benchmarks.rs`

Fix: add `COPY benches/ benches/` alongside the `COPY src/ src/` step
so the real bench source is present for the incremental release build.
2026-02-18 15:14:05 +08:00
Mike Boensel
0166f2d4de fix(token): update token generation to use rand::rng() to resolve deprecation warnings 2026-02-18 02:11:51 -05:00
Chummy
a3eedfdc78 docs(zai): align setup guide with runtime defaults
- remove trailing whitespace in .env.example Z.AI block
- align documented model defaults/options with current onboard/provider behavior
- keep this PR docs-focused by reverting incidental workflow edits
2026-02-18 15:10:55 +08:00
adisusilayasa
e3d6058424 fix(ci): include workflow fix for CI to pass
The first-interaction action requires snake_case input names.
This fix is needed for CI to pass on this PR.
2026-02-18 15:10:55 +08:00
adisusilayasa
402d8f0a32 docs: add Z.AI GLM coding plan setup guide
- Add comprehensive documentation for Z.AI GLM models
- Include curl examples for testing Z.AI API
- Document available models and troubleshooting
- Update .env.example with Z.AI configuration

Z.AI provides GLM models (glm-4.5, glm-4.6, glm-4.7, glm-5) through
the OpenAI-compatible endpoint at api.z.ai/api/coding/paas/v4.

Existing tests verify:
- zai_base_url() returns correct URLs for global/CN variants
- create_provider('zai', key) successfully creates provider
- Regional alias predicates cover all variants
2026-02-18 15:10:55 +08:00
Chummy
42bf05df47 docs: clarify custom provider env vars and URL scheme 2026-02-18 15:04:11 +08:00
ZeroClaw Bot
f13553014b docs: add custom provider endpoint configuration guide
Add comprehensive documentation for custom API endpoint configuration
to address missing documentation reported in issue #567.

Changes:
- Create docs/custom-providers.md with detailed guide for custom: and anthropic-custom: formats
- Add custom endpoint examples to README.md configuration section
- Add note about daemon requirement for channels in Quick Start
- Add reference link to custom providers guide

Addresses: #567

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-18 15:04:11 +08:00
Chummy
dd454178ed perf(memory): fold recall/vector/list optimizations into spawn_blocking refactor 2026-02-18 14:46:51 +08:00
Alex Gorevski
4e528dde7d perf(memory): wrap blocking SQLite calls in tokio::task::spawn_blocking
Problem:
Every async fn in SqliteMemory acquired self.conn.lock() and ran
synchronous rusqlite queries directly on the Tokio runtime thread.
This blocks the async executor, preventing other tasks from making
progress — especially harmful under concurrent recall/store load.

Fix:
- Change conn from Mutex<Connection> to Arc<Mutex<Connection>> so
  the connection handle can be cloned into spawn_blocking closures.
- Wrap all synchronous database operations (store, recall, get, list,
  forget, count, health_check) in tokio::task::spawn_blocking.
- Split get_or_compute_embedding into three phases: cache check
  (blocking), embedding computation (async I/O), cache store
  (blocking) — ensuring no lock is held across await points.
- Apply the same pattern to the reindex method.

The async I/O (embedding computation) remains on the Tokio runtime
while all SQLite access runs on the blocking thread pool, preventing
executor starvation.

Ref: zeroclaw-labs/zeroclaw#710 (Item 4)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-02-18 14:46:51 +08:00
Chummy
83b098d7ac fix(imessage): preserve sqlite conn across polling safely 2026-02-18 14:45:05 +08:00
Alex Gorevski
1ddcb0a573 perf(imessage): reuse persistent SQLite connection across poll cycles
Problem:
The iMessage listener opened a new SQLite connection to the Messages
database on every ~3-second poll cycle via get_max_rowid() and
fetch_new_messages(), creating ~40 connection open/close cycles per
minute. Each cycle incurs filesystem syscalls, WAL header reads,
and potential page cache cold starts.

Fix:
Open a single read-only connection before the poll loop and reuse it
across iterations using the 'shuttle' pattern: the connection is moved
into each spawn_blocking closure and returned alongside the results,
then reassigned for the next iteration. This eliminates per-poll
connection overhead while preserving the spawn_blocking pattern that
keeps SQLite I/O off the Tokio runtime thread.
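
The 'shuttle' pattern can be sketched with the standard library alone. Here `std::thread::spawn` stands in for `tokio::task::spawn_blocking`, and `Conn` is a hypothetical stand-in for the real SQLite connection. What the sketch shows is the ownership flow: the connection moves into the closure, comes back with the results, and is reassigned.

```rust
use std::thread;

/// Hypothetical stand-in for a read-only database connection.
struct Conn {
    rows: Vec<i64>,
}

impl Conn {
    fn rows_after(&self, last_rowid: i64) -> Vec<i64> {
        self.rows.iter().copied().filter(|&id| id > last_rowid).collect()
    }
}

/// Runs one blocking query off-thread and hands the connection back
/// alongside the results.
fn poll_once(conn: Conn, last_rowid: i64) -> (Conn, Vec<i64>) {
    thread::spawn(move || {
        let rows = conn.rows_after(last_rowid);
        (conn, rows)
    })
    .join()
    .expect("worker thread panicked")
}

/// Reuses a single connection across all poll cycles.
fn poll_loop(mut conn: Conn, cycles: usize) -> (Conn, Vec<i64>) {
    let mut last_rowid = 0;
    let mut seen = Vec::new();
    for _ in 0..cycles {
        let (returned, rows) = poll_once(conn, last_rowid);
        conn = returned; // reassign for the next iteration
        if let Some(&max) = rows.iter().max() {
            last_rowid = max;
        }
        seen.extend(rows);
    }
    (conn, seen)
}
```

Moving the value in and out avoids wrapping the connection in a lock while still satisfying the `'static` bound on the closure.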

The standalone get_max_rowid() and fetch_new_messages() helper
functions are retained for use by tests and other callers.

Ref: zeroclaw-labs/zeroclaw#710 (Item 9)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-02-18 14:45:05 +08:00
Chummy
14066d094f test(runtime): stabilize docker root mount assertion 2026-02-18 14:42:39 +08:00
Alex Gorevski
9a6fa76825 re-add tests, remove Markdown files 2026-02-18 14:42:39 +08:00
Chummy
e2634c72c2 test(config): include query_classification in config fixtures 2026-02-18 14:41:58 +08:00
Edvard
6e53341bb1 feat(agent): add rule-based query classification for automatic model routing
Classify incoming user messages by keyword/pattern and route to the
appropriate model hint automatically, feeding into the existing
RouterProvider. Disabled by default; opt-in via [query_classification]
config section.
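
A keyword-based classifier of this kind could be sketched as follows. The types, the first-match-wins ordering, and the example hint names are all assumptions; only the "disabled by default" behavior and the keyword-to-hint idea come from the message above.

```rust
/// One hypothetical rule: if any keyword matches, use this model hint.
struct Rule {
    keywords: Vec<String>, // expected to be lowercase
    hint: String,
}

struct QueryClassifier {
    enabled: bool,
    rules: Vec<Rule>,
}

impl QueryClassifier {
    /// Returns the hint of the first matching rule, or None when the
    /// classifier is disabled or nothing matches.
    fn classify(&self, message: &str) -> Option<&str> {
        if !self.enabled {
            return None;
        }
        let lower = message.to_lowercase();
        self.rules
            .iter()
            .find(|rule| rule.keywords.iter().any(|k| lower.contains(k.as_str())))
            .map(|rule| rule.hint.as_str())
    }
}

/// Small constructor to keep rule tables readable.
fn rule(keywords: &[&str], hint: &str) -> Rule {
    Rule {
        keywords: keywords.iter().map(|k| k.to_lowercase()).collect(),
        hint: hint.to_string(),
    }
}
```

Returning `None` leaves the choice to whatever default model the router already uses, so enabling the feature only changes routing for messages that match a rule.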

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 14:41:58 +08:00
Edvard
1336c2f03e feat(providers): add warmup() for OpenAI, Anthropic, Gemini, Compatible, GLM
All five providers have HTTP clients but did not implement warmup(),
relying on the trait default no-op. This adds lightweight warmup calls
to establish TLS + HTTP/2 connection pools on startup, reducing
first-request latency. Each warmup is skipped when credentials are
absent, matching the OpenRouter pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 14:35:03 +08:00
Chummy
a85a4a8194 fix(config): resolve ZEROCLAW_WORKSPACE root/workspace paths safely 2026-02-18 14:30:53 +08:00
bhagwan
b2976eb474 fix(config): support both legacy and new ZEROCLAW_WORKSPACE structure
ZEROCLAW_WORKSPACE can now be either:
- Legacy path: /path/to/workspace (config at /path/to/.zeroclaw/config.toml)
- Parent path: /path/to (config at /path/to/config.toml, workspace at /path/to/workspace)

This maintains backward compatibility with Docker's legacy folder structure
while also supporting the new parent-dir layout.
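
The two layouts can be sketched as a path-resolution function. The rule used to tell them apart (whether `config.toml` exists directly under the given directory) is an assumption made for this sketch, as are the type and function names. The existence check is passed in as a closure so the logic can be tested without touching the filesystem.

```rust
use std::path::{Path, PathBuf};

#[derive(Debug, PartialEq)]
struct ResolvedPaths {
    config: PathBuf,
    workspace: PathBuf,
}

/// Resolves config and workspace paths from the ZEROCLAW_WORKSPACE value.
/// ASSUMPTION: the parent layout is detected by `<dir>/config.toml` existing.
fn resolve_workspace(dir: &Path, exists: impl Fn(&Path) -> bool) -> ResolvedPaths {
    let parent_config = dir.join("config.toml");
    if exists(&parent_config) {
        // Parent layout: /path/to/config.toml + /path/to/workspace
        ResolvedPaths {
            config: parent_config,
            workspace: dir.join("workspace"),
        }
    } else {
        // Legacy layout: /path/to/workspace + /path/to/.zeroclaw/config.toml
        let root = dir.parent().unwrap_or(dir);
        ResolvedPaths {
            config: root.join(".zeroclaw").join("config.toml"),
            workspace: dir.to_path_buf(),
        }
    }
}
```

In production code the closure would typically be `|p| p.exists()`.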
2026-02-18 14:30:53 +08:00
Chummy
da7c21f469 style(anthropic): format cache conversation test block 2026-02-18 14:29:50 +08:00