Commit graph

746 commits

Author SHA1 Message Date
Chummy
bc5b1a7841 fix(providers): harden reasoning_content fallback behavior 2026-02-18 17:07:38 +08:00
Vernon Stinebaker
dd4f5271d1 feat(providers): support reasoning_content fallback for thinking models
Reasoning/thinking models (Qwen3, GLM-4, DeepSeek, etc.) may return
output in `reasoning_content` instead of `content`. Add automatic
fallback for both OpenAI and OpenAI-compatible providers, including
streaming SSE support.

Changes:
- Add `reasoning_content` field to response structs in both providers
- Add `effective_content()` helper that prefers `content` but falls
  back to `reasoning_content` when content is empty/null/missing
- Update all extraction sites to use `effective_content()`
- Add streaming SSE fallback for `reasoning_content` chunks
- Add 16 focused unit tests covering all edge cases

Tested end-to-end against GLM-4.7-flash via local LLM server.
2026-02-18 17:07:38 +08:00
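The fallback described in dd4f5271d1 can be sketched as follows. This is a minimal illustration, not the project's code: the `ResponseMessage` struct and its field types are assumptions, and only the name `effective_content()` and the "prefer `content`, fall back to `reasoning_content`" rule come from the commit message.

```rust
/// Hypothetical response shape; the real provider structs may differ.
#[derive(Debug, Default)]
struct ResponseMessage {
    content: Option<String>,
    reasoning_content: Option<String>,
}

impl ResponseMessage {
    /// Prefers `content`; falls back to `reasoning_content` when
    /// `content` is missing or empty. Returns `None` if both are unusable.
    fn effective_content(&self) -> Option<&str> {
        match self.content.as_deref() {
            Some(text) if !text.is_empty() => Some(text),
            _ => self
                .reasoning_content
                .as_deref()
                .filter(|text| !text.is_empty()),
        }
    }
}
```

Returning `Option<&str>` rather than an owned `String` keeps the helper cheap at every extraction site; callers can decide what to do when neither field holds text.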
Chummy
219764d4d8 fix(channels): recover malformed invoke/tool_call output in daemon mode 2026-02-18 17:01:36 +08:00
Chummy
75a9eb383c test(security): enforce lowercase token hex assertion 2026-02-18 16:56:45 +08:00
Chummy
918be53a30 test(security): harden token format regression coverage 2026-02-18 16:56:45 +08:00
hayoial
58958d9991 fix: add per-sender conversation history for channel messages
Channel messages (Telegram, Discord, etc.) previously had no multi-turn
context — each incoming message was processed with a fresh history
containing only the system prompt and the current user message.

This patch:
- Maintains a per-sender conversation history map (Arc<Mutex<HashMap>>)
- Restores prior turns when processing each new message
- Saves user + assistant turns after successful LLM response
- Caps history at 50 messages per sender to bound memory usage

Fixes the channel context continuity issue where the bot would respond
with 'I have no context' to every follow-up question.
2026-02-18 16:35:38 +08:00
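A rough sketch of the per-sender history map from 58958d9991, assuming a `(role, text)` tuple as the turn type and hypothetical method names `restore` and `save_turn`. Only the `Arc<Mutex<HashMap>>` shape and the cap of 50 messages come from the commit message.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Cap stated in the commit message.
const MAX_HISTORY_PER_SENDER: usize = 50;

/// Hypothetical turn representation: (role, text).
type Turn = (String, String);

#[derive(Clone, Default)]
struct ConversationHistories {
    inner: Arc<Mutex<HashMap<String, Vec<Turn>>>>,
}

impl ConversationHistories {
    /// Returns a copy of the sender's prior turns (empty if none exist).
    fn restore(&self, sender: &str) -> Vec<Turn> {
        let map = self.inner.lock().unwrap_or_else(|e| e.into_inner());
        map.get(sender).cloned().unwrap_or_default()
    }

    /// Appends a user + assistant pair, then drops the oldest entries
    /// so the sender never holds more than the cap.
    fn save_turn(&self, sender: &str, user: &str, assistant: &str) {
        let mut map = self.inner.lock().unwrap_or_else(|e| e.into_inner());
        let history = map.entry(sender.to_string()).or_default();
        history.push(("user".to_string(), user.to_string()));
        history.push(("assistant".to_string(), assistant.to_string()));
        if history.len() > MAX_HISTORY_PER_SENDER {
            let excess = history.len() - MAX_HISTORY_PER_SENDER;
            history.drain(..excess);
        }
    }
}
```

Saving only after a successful LLM response, as the commit describes, means a failed request leaves the stored history untouched.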
Xiangjun Ma
f1db63219c refactor(telegram): address code review findings
- Add strip_tool_call_tags() to finalize_draft to prevent Markdown
  parse failures from tool-call tags reaching Telegram API
- Deduplicate parse_reply_target() call in update_draft (was called
  twice, discarding thread_id both times)
- Replace body.as_object_mut().unwrap() mutation with separate
  plain_body JSON literal (eliminates unwrap in runtime path)
- Clean up per-chat rate-limit HashMap entry in finalize_draft to
  prevent unbounded growth over long uptimes
- Extract magic number 80 to STREAM_CHUNK_MIN_CHARS constant in
  agent loop
2026-02-18 16:33:33 +08:00
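The per-chat rate-limit map and its cleanup on finalize (f1db63219c, building on e21fe1ff55) might look roughly like this. The type name `DraftRateLimiter` and the methods `try_acquire` and `release` are hypothetical; the `Mutex<HashMap<String, Instant>>` shape is from the log. Passing `now` as a parameter is a choice made here to keep the sketch testable.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical limiter keyed by chat id.
struct DraftRateLimiter {
    interval: Duration,
    last_edit: Mutex<HashMap<String, Instant>>,
}

impl DraftRateLimiter {
    fn new(interval: Duration) -> Self {
        Self { interval, last_edit: Mutex::new(HashMap::new()) }
    }

    /// Returns true (and records `now`) if this chat may be edited;
    /// other chats are unaffected by this chat's timestamp.
    fn try_acquire(&self, chat_id: &str, now: Instant) -> bool {
        let mut map = self.last_edit.lock().unwrap_or_else(|e| e.into_inner());
        let throttled = map
            .get(chat_id)
            .map_or(false, |last| now.duration_since(*last) < self.interval);
        if throttled {
            return false;
        }
        map.insert(chat_id.to_string(), now);
        true
    }

    /// Removes the chat's entry once its draft is finalized, so the
    /// map does not grow without bound over long uptimes.
    fn release(&self, chat_id: &str) {
        let mut map = self.last_edit.lock().unwrap_or_else(|e| e.into_inner());
        map.remove(chat_id);
    }
}
```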
Chummy
e326e12039 test(telegram): cover draft streaming paths and simplify stream modes 2026-02-18 16:33:33 +08:00
Xiangjun Ma
e21fe1ff55 fix(telegram): address Copilot review feedback
- Fix silent parse failures: message_id.parse().unwrap_or(0) replaced
  with match + tracing::warn on parse error (update_draft, finalize_draft)
- Fix UTF-8 panic: byte-based truncation replaced with char_indices()
  safe boundary detection for TELEGRAM_MAX_MESSAGE_LENGTH
- Fix global rate limiter: Mutex<Option<Instant>> replaced with
  Mutex<HashMap<String, Instant>> for per-chat rate limiting so
  concurrent conversations don't interfere with each other
- Document Block variant: clarify it's reserved for future use and
  currently behaves the same as Partial
2026-02-18 16:33:33 +08:00
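The UTF-8 fix in e21fe1ff55 replaces byte slicing with a boundary found via `char_indices()`. A minimal sketch of that idea, assuming a hypothetical helper name `truncate_on_char_boundary` and a byte budget as the limit (the log does not say whether the real limit is counted in bytes or characters):

```rust
/// Returns the longest prefix of `text` that fits in `max_bytes`
/// and ends on a character boundary, so slicing can never panic.
fn truncate_on_char_boundary(text: &str, max_bytes: usize) -> &str {
    if text.len() <= max_bytes {
        return text;
    }
    let end = text
        .char_indices()
        // End offset (exclusive) of each character.
        .map(|(idx, ch)| idx + ch.len_utf8())
        // Keep only characters that fit entirely within the budget.
        .take_while(|&end| end <= max_bytes)
        .last()
        .unwrap_or(0);
    &text[..end]
}
```

With plain `&text[..max_bytes]`, a limit that lands in the middle of a multi-byte character would panic; here the cut moves back to the previous complete character.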
Xiangjun Ma
93538a70e3 fix(agent): relay final response as progressive chunks via on_delta
Previously on_delta sent the entire completed response as a single
message, defeating the purpose of the streaming draft updates. Now
the text is split into ~80-char chunks on whitespace boundaries
(UTF-8 safe via split_inclusive) and sent progressively through the
channel, so Telegram draft edits show text arriving incrementally.

The consumer in process_channel_message already accumulates chunks
and calls update_draft with the full text so far, and Telegram's
rate-limiting (draft_update_interval_ms) throttles editMessageText
calls to avoid API spam.
2026-02-18 16:33:33 +08:00
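The chunking described in 93538a70e3 can be approximated as below. The function name `split_into_chunks` is hypothetical and the real agent loop may differ; the use of `split_inclusive` on whitespace and the value 80 (later named `STREAM_CHUNK_MIN_CHARS` in f1db63219c) come from the log.

```rust
/// Chunk size named in the log; chunks are emitted once they reach it.
const STREAM_CHUNK_MIN_CHARS: usize = 80;

/// Splits `text` into chunks that always end on a whitespace boundary
/// (except possibly the last) once they hold at least `min_chars` chars.
fn split_into_chunks(text: &str, min_chars: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    // split_inclusive keeps the whitespace with the preceding word and
    // only cuts at char boundaries, so the pieces are valid UTF-8.
    for piece in text.split_inclusive(char::is_whitespace) {
        current.push_str(piece);
        if current.chars().count() >= min_chars {
            chunks.push(std::mem::take(&mut current));
        }
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

Concatenating the chunks reproduces the input exactly, which is what lets the consumer accumulate them and pass the full text so far to `update_draft`.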
Xiangjun Ma
118cd53922 feat(channel): stream LLM responses to Telegram via draft message edits
Wire the existing provider-layer streaming infrastructure through the
channel trait and agent loop so Telegram users see tokens arrive
progressively via editMessageText, instead of waiting for the full
response.

Changes:
- Add StreamMode enum (off/partial/block) and draft_update_interval_ms
  to TelegramConfig (backward-compatible defaults: off, 1000ms)
- Add supports_draft_updates/send_draft/update_draft/finalize_draft to
  Channel trait with no-op defaults (zero impact on existing channels)
- Implement draft methods on TelegramChannel using sendMessage +
  editMessageText with rate limiting and Markdown fallback
- Add on_delta mpsc::Sender<String> parameter to run_tool_call_loop
  (None preserves existing behavior)
- Wire streaming in process_channel_message: when channel supports
  drafts, send initial draft, spawn updater task, finalize on completion

Edge cases handled:
- 4096-char limit: finalize draft and fall back to chunked send
- Broken Markdown: use no parse_mode during streaming, apply on finalize
- Edit failures: fall back to sending complete response as new message
- Rate limiting: configurable draft_update_interval_ms (default 1s)
2026-02-18 16:33:33 +08:00
Chummy
a0b277b21e fix(web-search): harden config handling and trim unrelated CI edit 2026-02-18 15:24:21 +08:00
adisusilayasa
1757add64a feat(tools): add web_search_tool for internet search
Add native web search capability that works regardless of LLM tool-calling
support. This is particularly useful for GLM models via Z.AI that don't
reliably support standard tool calling formats.

Features:
- DuckDuckGo provider (free, no API key required)
- Brave Search provider (optional, requires API key)
- Configurable max results and timeout
- Enabled by default

Configuration (config.toml):
  [web_search]
  enabled = true
  provider = "duckduckgo"
  max_results = 5

The tool allows agents to search the web for current information without
requiring proper tool calling support from the LLM.

Also includes CI workflow fix for first-interaction action inputs.
2026-02-18 15:24:21 +08:00
Chummy
f3bdff1d69 fix(agent): harden glm tool-call parsing and scope PR 2026-02-18 15:23:35 +08:00
adisusilayasa
16c5784212 fix(ci): include workflow fix for CI to pass
The first-interaction action requires snake_case input names.
2026-02-18 15:23:35 +08:00
adisusilayasa
58c81aa258 feat(agent): add GLM-style tool call parsing
GLM models output tool calls in proprietary formats that ZeroClaw
doesn't natively support. This adds parsing for GLM-specific formats:

- browser_open/url>https://... -> shell tool with curl command
- shell/command>ls -> shell tool with command arg
- http_request/url>... -> http_request tool
- Plain URLs -> shell tool with curl command

Also adds:
- find_json_end() helper for parsing JSON objects
- Unclosed <toolcall> tag handling
- Unit tests for GLM-style parsing

The parsing is deliberately placed after XML and markdown code block
parsing, so it acts as a fallback for models that don't use standard
tool calling formats.

This enables GLM models (via Z.AI or other providers) to successfully
execute tools in ZeroClaw.
2026-02-18 15:23:35 +08:00
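A simplified sketch of the GLM-style line formats listed in 58c81aa258. Everything here is illustrative: `ParsedToolCall`, `parse_glm_tool_call`, and the bare `curl <url>` command string are assumptions, and the sketch does no shell escaping, which real code would need.

```rust
/// Hypothetical parsed result: tool name plus a single named argument.
#[derive(Debug, PartialEq)]
struct ParsedToolCall {
    tool: String,
    arg_name: String,
    arg_value: String,
}

fn is_identifier(s: &str) -> bool {
    !s.is_empty() && s.chars().all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Maps a URL to a shell call running curl (assumed command shape).
fn shell_curl(url: &str) -> ParsedToolCall {
    ParsedToolCall {
        tool: "shell".to_string(),
        arg_name: "command".to_string(),
        arg_value: format!("curl {url}"),
    }
}

/// Parses `tool/arg>value` lines and plain URLs; returns None otherwise.
fn parse_glm_tool_call(line: &str) -> Option<ParsedToolCall> {
    let line = line.trim();
    if let Some((head, value)) = line.split_once('>') {
        if let Some((tool, arg)) = head.split_once('/') {
            if is_identifier(tool) && is_identifier(arg) && !value.is_empty() {
                return Some(match tool {
                    "browser_open" => shell_curl(value),
                    _ => ParsedToolCall {
                        tool: tool.to_string(),
                        arg_name: arg.to_string(),
                        arg_value: value.to_string(),
                    },
                });
            }
        }
    }
    if line.starts_with("http://") || line.starts_with("https://") {
        return Some(shell_curl(line));
    }
    None
}
```

As in the commit, a parser like this would run only after the XML and Markdown code-block parsers have found nothing, so ordinary prose is not misread as a tool call.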
mikeboensel
9f34e2465e
Merge pull request #755 from zeroclaw-labs/ISSUE-754
fix(token): update token generation to use rand::rng()

Addresses warning coming from compiler:
❯ rspberrypi@localhost:~/zeroclaw$ cargo build --release --locked
warning: use of deprecated function rand::thread_rng: Renamed to rng
   --> src/security/pairing.rs:186:11
    |
186 |     rand::thread_rng().fill_bytes(&mut bytes);
    |           ^^^^^^^^^^
    |
    = note: #[warn(deprecated)] on by default
2026-02-18 02:14:22 -05:00
Chummy
57be369771 chore(docker): keep install list indentation unchanged 2026-02-18 15:14:05 +08:00
Cemal Y. Dalar
7f15627f8c fix(docker): restore benches/ copy after stub removal in builder stage
The dep-caching layer creates stub files (src/main.rs and
benches/agent_benchmarks.rs) to warm the cargo registry cache, then
removes them with `rm -rf src benches`. The subsequent real source copy
only restored `src/` — leaving `benches/` absent. Cargo's manifest
parser then failed to locate `benches/agent_benchmarks.rs` referenced
in Cargo.toml, aborting the release build with:

  error: failed to parse manifest at `/app/Cargo.toml`
  Caused by: can't find `agent_benchmarks` bench at
  `benches/agent_benchmarks.rs`

Fix: add `COPY benches/ benches/` alongside the `COPY src/ src/` step
so the real bench source is present for the incremental release build.
2026-02-18 15:14:05 +08:00
Mike Boensel
0166f2d4de fix(token): update token generation to use rand::rng() to resolve deprecation warnings 2026-02-18 02:11:51 -05:00
Chummy
a3eedfdc78 docs(zai): align setup guide with runtime defaults
- remove trailing whitespace in .env.example Z.AI block
- align documented model defaults/options with current onboard/provider behavior
- keep this PR docs-focused by reverting incidental workflow edits
2026-02-18 15:10:55 +08:00
adisusilayasa
e3d6058424 fix(ci): include workflow fix for CI to pass
The first-interaction action requires snake_case input names.
This fix is needed for CI to pass on this PR.
2026-02-18 15:10:55 +08:00
adisusilayasa
402d8f0a32 docs: add Z.AI GLM coding plan setup guide
- Add comprehensive documentation for Z.AI GLM models
- Include curl examples for testing Z.AI API
- Document available models and troubleshooting
- Update .env.example with Z.AI configuration

Z.AI provides GLM models (glm-4.5, glm-4.6, glm-4.7, glm-5) through
the OpenAI-compatible endpoint at api.z.ai/api/coding/paas/v4.

Existing tests verify:
- zai_base_url() returns correct URLs for global/CN variants
- create_provider('zai', key) successfully creates provider
- Regional alias predicates cover all variants
2026-02-18 15:10:55 +08:00
Chummy
42bf05df47 docs: clarify custom provider env vars and URL scheme 2026-02-18 15:04:11 +08:00
ZeroClaw Bot
f13553014b docs: add custom provider endpoint configuration guide
Add comprehensive documentation for custom API endpoint configuration
to address missing documentation reported in issue #567.

Changes:
- Create docs/custom-providers.md with detailed guide for custom: and anthropic-custom: formats
- Add custom endpoint examples to README.md configuration section
- Add note about daemon requirement for channels in Quick Start
- Add reference link to custom providers guide

Addresses: #567

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-18 15:04:11 +08:00
Chummy
dd454178ed perf(memory): fold recall/vector/list optimizations into spawn_blocking refactor 2026-02-18 14:46:51 +08:00
Alex Gorevski
4e528dde7d perf(memory): wrap blocking SQLite calls in tokio::task::spawn_blocking
Problem:
Every async fn in SqliteMemory acquired self.conn.lock() and ran
synchronous rusqlite queries directly on the Tokio runtime thread.
This blocks the async executor, preventing other tasks from making
progress — especially harmful under concurrent recall/store load.

Fix:
- Change conn from Mutex<Connection> to Arc<Mutex<Connection>> so
  the connection handle can be cloned into spawn_blocking closures.
- Wrap all synchronous database operations (store, recall, get, list,
  forget, count, health_check) in tokio::task::spawn_blocking.
- Split get_or_compute_embedding into three phases: cache check
  (blocking), embedding computation (async I/O), cache store
  (blocking) — ensuring no lock is held across await points.
- Apply the same pattern to the reindex method.

The async I/O (embedding computation) remains on the Tokio runtime
while all SQLite access runs on the blocking thread pool, preventing
executor starvation.

Ref: zeroclaw-labs/zeroclaw#710 (Item 4)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-02-18 14:46:51 +08:00
Chummy
83b098d7ac fix(imessage): preserve sqlite conn across polling safely 2026-02-18 14:45:05 +08:00
Alex Gorevski
1ddcb0a573 perf(imessage): reuse persistent SQLite connection across poll cycles
Problem:
The iMessage listener opened a new SQLite connection to the Messages
database on every ~3-second poll cycle via get_max_rowid() and
fetch_new_messages(), creating ~40 connection open/close cycles per
minute. Each cycle incurs filesystem syscalls, WAL header reads,
and potential page cache cold starts.

Fix:
Open a single read-only connection before the poll loop and reuse it
across iterations using the 'shuttle' pattern: the connection is moved
into each spawn_blocking closure and returned alongside the results,
then reassigned for the next iteration. This eliminates per-poll
connection overhead while preserving the spawn_blocking pattern that
keeps SQLite I/O off the Tokio runtime thread.

The standalone get_max_rowid() and fetch_new_messages() helper
functions are retained for use by tests and other callers.

Ref: zeroclaw-labs/zeroclaw#710 (Item 9)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-02-18 14:45:05 +08:00
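The "shuttle" pattern from 1ddcb0a573 moves the connection into each blocking closure and hands it back with the results. The sketch below uses `std::thread::spawn` as a stand-in for `tokio::task::spawn_blocking` so it needs no external crates, and `Connection` is a hypothetical placeholder for the read-only SQLite handle.

```rust
use std::thread;

/// Hypothetical stand-in for a persistent read-only connection.
struct Connection {
    queries_run: u32,
}

impl Connection {
    /// Pretends to fetch rows newer than `since_rowid`.
    fn fetch_new_messages(&mut self, since_rowid: i64) -> Vec<i64> {
        self.queries_run += 1;
        vec![since_rowid + 1]
    }
}

/// Runs `cycles` polls, reusing one connection across all of them.
fn poll_loop(mut conn: Connection, cycles: u32) -> (Connection, i64) {
    let mut last_rowid = 0;
    for _ in 0..cycles {
        let since = last_rowid;
        // Move the connection into the worker...
        let handle = thread::spawn(move || {
            let mut conn = conn;
            let rows = conn.fetch_new_messages(since);
            // ...and return it alongside the results.
            (conn, rows)
        });
        let (returned, rows) = handle.join().expect("poll worker panicked");
        // Reassign so the next iteration reuses the same connection.
        conn = returned;
        if let Some(&max) = rows.iter().max() {
            last_rowid = max;
        }
    }
    (conn, last_rowid)
}
```

Because ownership travels with the closure, no lock or `Arc` is needed, and the connection is never touched from two threads at once.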
Chummy
14066d094f test(runtime): stabilize docker root mount assertion 2026-02-18 14:42:39 +08:00
Alex Gorevski
9a6fa76825 readd tests, remove markdown files 2026-02-18 14:42:39 +08:00
Chummy
e2634c72c2 test(config): include query_classification in config fixtures 2026-02-18 14:41:58 +08:00
Edvard
6e53341bb1 feat(agent): add rule-based query classification for automatic model routing
Classify incoming user messages by keyword/pattern and route to the
appropriate model hint automatically, feeding into the existing
RouterProvider. Disabled by default; opt-in via [query_classification]
config section.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 14:41:58 +08:00
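Rule-based classification as described in 6e53341bb1 could be sketched like this. The `ClassificationRule` struct, the first-match-wins policy, and the example hint names in the usage note are all assumptions; the log only states that messages are classified by keyword/pattern and mapped to a model hint.

```rust
/// Hypothetical rule: if any keyword occurs in the message,
/// the message is routed to `hint`.
struct ClassificationRule {
    keywords: &'static [&'static str],
    hint: &'static str,
}

/// Returns the hint of the first matching rule, or None so the
/// caller can fall back to the default model.
fn classify(message: &str, rules: &[ClassificationRule]) -> Option<&'static str> {
    let lowered = message.to_lowercase();
    rules
        .iter()
        .find(|rule| rule.keywords.iter().any(|kw| lowered.contains(*kw)))
        .map(|rule| rule.hint)
}
```

For example, a rule with keywords `["code", "compile"]` and hint `"coding"` would match "Why does this CODE fail?" after lowercasing, while a message with no matching keyword yields `None`.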
Edvard
1336c2f03e feat(providers): add warmup() for OpenAI, Anthropic, Gemini, Compatible, GLM
All five providers have HTTP clients but did not implement warmup(),
relying on the trait default no-op. This adds lightweight warmup calls
to establish TLS + HTTP/2 connection pools on startup, reducing
first-request latency. Each warmup is skipped when credentials are
absent, matching the OpenRouter pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 14:35:03 +08:00
Chummy
a85a4a8194 fix(config): resolve ZEROCLAW_WORKSPACE root/workspace paths safely 2026-02-18 14:30:53 +08:00
bhagwan
b2976eb474 fix(config): support both legacy and new ZEROCLAW_WORKSPACE structure
ZEROCLAW_WORKSPACE can now be either:
- Legacy path: /path/to/workspace (config at /path/to/.zeroclaw/config.toml)
- Parent path: /path/to (config at /path/to/config.toml, workspace at /path/to/workspace)

This maintains backward compatibility with Docker's legacy folder structure
while also supporting the new parent-dir layout.
2026-02-18 14:30:53 +08:00
Chummy
da7c21f469 style(anthropic): format cache conversation test block 2026-02-18 14:29:50 +08:00
tercerapersona
455eb3b847 feat: add prompt caching support to Anthropic provider
Implements Anthropic's prompt caching API to enable significant cost
reduction (up to 90%) and latency improvements (up to 85%) for
requests with repeated content.

Key features:
- Auto-caching heuristics: large system prompts (>3KB), tool
  definitions, and long conversations (>4 messages)
- Full backward compatibility: cache_control fields are optional
- Supports both string and block-array system prompt formats
- Cache control on all content types (text, tool_use, tool_result)

Implementation details:
- Added CacheControl, SystemPrompt, and SystemBlock structures
- Updated NativeContentOut and NativeToolSpec with cache_control
- Strategic cache breakpoint placement (last tool, last message)
- Comprehensive test coverage for serialization and heuristics

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
(cherry picked from commit fff04f4edb5e4cb7e581b1b16035da8cc2e55cef)
2026-02-18 14:29:50 +08:00
Maya Walcher
63bc4721e3 feat(onboard): add signup URL, model catalog, and live fetch for Astrai
Add three onboarding improvements for the Astrai provider:

- Signup URL: users now see "Get your API key at: https://as-trai.com"
  during onboarding instead of a blank prompt
- Curated model list: auto (best execution), GPT-4o, Claude Sonnet 4.5,
  DeepSeek V3, Llama 3.3 70B
- Live model fetch: Astrai's OpenAI-compatible /v1/models endpoint is
  now queried when an API key is present, matching other providers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 14:19:21 +08:00
Chummy
d70324f4f7 fix(robot-kit): format crate and harden cross-platform feature gating 2026-02-18 14:19:17 +08:00
Lumi-node
0dfc707c49 feat: add zeroclaw-robot-kit crate for AI-powered robotics
Standalone robot toolkit providing AI agents with physical world interaction.

Features:
- 6 tools: drive, look, listen, speak, sense, emote
- Multiple backends: ROS2, serial, GPIO, mock
- Independent SafetyMonitor with E-stop, collision avoidance
- Designed for Raspberry Pi 5 + Ollama offline operation
- 55 unit/integration tests
- Complete Pi 5 hardware setup guide
2026-02-18 14:19:17 +08:00
Chummy
431287184b style(tests): apply rustfmt to brittle-test hardening changes 2026-02-18 14:17:58 +08:00
Alex Gorevski
45cdd25b3d fix(tests): harden brittle tests for cross-platform stability and refactoring resilience
## Problem

The test suite contained several categories of latent brittleness
identified in docs/testing-brittle-tests.md that would surface during
refactoring or cross-platform (Windows) CI execution:

1. Hardcoded Unix paths: `Path::new("/tmp")` and `PathBuf::from("/tmp")`
   used as workspace directories in agent tests, which fail on Windows
   where /tmp does not exist.

2. Exact string match assertions: ~20 `assert_eq!(response, "exact text")`
   assertions in agent unit and e2e tests that break on any mock wording
   change, even when the underlying orchestration behavior is correct.

3. Fragile error message string matching: `.contains("specific message")`
   assertions coupled to internal error wording rather than testing the
   error category or behavioral outcome.

## What Changed

### Hardcoded paths → platform-agnostic temp dirs (4 files, 7 locations)
- `src/agent/tests.rs`: Replaced all 4 instances of `Path::new("/tmp")`
  and `PathBuf::from("/tmp")` with `std::env::temp_dir()` in
  `make_memory()`, `build_agent_with()`, `build_agent_with_memory()`,
  and `build_agent_with_config()` helpers.
- `tests/agent_e2e.rs`: Replaced all 3 instances in `make_memory()`,
  `build_agent()`, and `build_agent_xml()` helpers.

### Exact string assertions → behavioral checks (2 files, ~20 locations)
- `src/agent/tests.rs`: Converted 10 `assert_eq!(response, "...")` to
  `assert!(!response.is_empty(), "descriptive message")` across tests for
  text pass-through, tool execution, tool failure recovery, XML dispatch,
  mixed text+tool responses, multi-tool batch, and run_single delegation.
- `tests/agent_e2e.rs`: Converted 9 exact-match assertions to behavioral
  checks. Multi-turn test now uses `assert_ne!(r1, r2)` to verify
  sequential responses are distinct without coupling to exact wording.
- Provider error propagation test simplified to `assert!(result.is_err())`
  without asserting on the error message string.

### Fragile error message assertions → structural checks (2 files)
- `src/tools/git_operations.rs`: Replaced fragile OR-branch string match
  (`contains("git repository") || contains("Git command failed")`) with
  structural assertions: checks `!result.success`, error is non-empty,
  and error does NOT mention autonomy/read-only (verifying the failure
  is git-related, not permission-related).
- `src/cron/scheduler.rs`: Replaced `contains("agent job failed:")` with
  `!success` and `!output.is_empty()` checks that verify failure behavior
  without coupling to exact log format.

## What Was NOT Changed (and why)
- `src/agent/loop_.rs` parser tests: Exact string assertions are the
  contract for XML tool call parsing — the exact output IS the spec.
- `src/providers/reliable.rs`: Error message assertions test the error
  format contract (provider/model attribution in failure messages).
- `src/service/mod.rs`: Already platform-gated with `#[cfg]`; XML escape
  test is a formatting contract where exact match is appropriate.
- `src/config/schema.rs`: TOML test strings use /tmp as data values for
  deserialization tests, not filesystem access; HOME tests already use
  `std::env::temp_dir()`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-02-18 14:17:58 +08:00
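The path change in 45cdd25b3d boils down to deriving test workspaces from `std::env::temp_dir()` instead of a literal `/tmp`. A minimal sketch, with `test_workspace` as a hypothetical helper name (the commit itself edits existing helpers such as `make_memory()`):

```rust
use std::path::PathBuf;

/// Builds a workspace path under the platform's temp directory,
/// which resolves correctly on Windows as well as Unix.
fn test_workspace(name: &str) -> PathBuf {
    std::env::temp_dir().join(name)
}
```

The same commit's behavioral checks follow a similar spirit: asserting `!response.is_empty()` or `result.is_err()` verifies the outcome without tying the test to exact wording.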
Chummy
decea532ed refactor(memory): keep default hybrid weights while adding relevance threshold 2026-02-18 14:14:33 +08:00
Edvard
8a1e7cc7ef fix(agent): use config max_tool_iterations, add memory relevance filtering, rebalance search weights
Three fixes for conversation quality issues:

1. loop_.rs and channels now read max_tool_iterations from AgentConfig
   instead of using a hardcoded constant of 10, making it configurable.

2. Memory recall now filters entries below a configurable
   min_relevance_score threshold (default 0.4), preventing unrelated
   memories from bleeding into conversation context.

3. Default hybrid search weights rebalanced from 70/30 vector/keyword
   to 40/60, reducing cross-topic semantic bleed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 14:14:33 +08:00
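The relevance filtering from 8a1e7cc7ef can be illustrated as below. The weighted-sum scoring formula, the `Candidate` struct, and the function name `recall_filtered` are assumptions made for this sketch; the log only gives the threshold name `min_relevance_score` (default 0.4) and the vector/keyword weight split.

```rust
/// Hypothetical recall candidate with per-signal scores in [0, 1].
#[derive(Debug, Clone)]
struct Candidate {
    key: String,
    vector_score: f32,
    keyword_score: f32,
}

/// Combines both signals with the given weights (assumed to be a
/// weighted sum), drops entries below `min_relevance_score`, and
/// returns the rest ordered from most to least relevant.
fn recall_filtered(
    candidates: &[Candidate],
    vector_weight: f32,
    keyword_weight: f32,
    min_relevance_score: f32,
) -> Vec<(String, f32)> {
    let mut scored: Vec<(String, f32)> = candidates
        .iter()
        .map(|c| {
            let score = vector_weight * c.vector_score + keyword_weight * c.keyword_score;
            (c.key.clone(), score)
        })
        .filter(|(_, score)| *score >= min_relevance_score)
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored
}
```

Keeping the weights as parameters matters here because the follow-up commit decea532ed retains the default hybrid weights while still adding the threshold.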
Alex Gorevski
21c5f58363 perf(cron): wrap record_run INSERT+DELETE in explicit transaction
Problem:
In record_run(), an INSERT into cron_runs followed by a pruning DELETE
ran as separate implicit transactions. If the INSERT succeeded but the
DELETE failed (e.g., due to disk pressure or lock contention), the run
table would grow unboundedly since the pruning step was lost while the
new row persisted.

Fix:
Wrap both statements in an explicit transaction using
conn.unchecked_transaction(). If either statement fails, the entire
transaction is rolled back, maintaining the invariant that the run
history stays bounded by max_run_history.

Ref: zeroclaw-labs/zeroclaw#710 (Item 5)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-02-18 14:07:31 +08:00
Alex Gorevski
9967eeb954 perf(cron): add composite index on cron_runs(job_id, started_at)
Problem:
The pruning query in record_run uses WHERE job_id = ?1 with
ORDER BY started_at DESC, but only single-column indexes exist
for job_id and started_at separately. SQLite must scan one index
and then sort or scan the other, which is suboptimal for the
combined filter + sort pattern used during pruning.

Fix:
Add a composite index CREATE INDEX IF NOT EXISTS
idx_cron_runs_job_started ON cron_runs(job_id, started_at).
This lets SQLite satisfy the WHERE job_id = ?1 ORDER BY
started_at DESC subquery in a single index scan without a
separate sort step. The existing single-column indexes are
retained for other queries that filter on only one column.

Ref: zeroclaw-labs/zeroclaw#710 (Item 7)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-02-18 14:06:59 +08:00
Chummy
151bc6a600 fix(scripts): align installer filename and docs placement 2026-02-18 13:56:35 +08:00
reidliu41
fdef03e455 feat(scripts): add one-line install script
- Add `scripts/install.sh` — a single `curl | bash` installer that handles system deps, Rust, clone, build, and install
  automatically.
- Update README Linux/macOS section with a "One-Line Installer (Recommended)" block above the existing manual steps.

1. Detects OS (Linux apt/dnf, macOS Xcode CLT)
2. Installs build deps + git via system package manager (sudo only here)
3. Installs Rust via rustup (skipped if already present)
4. Shallow-clones the repo to `/tmp/zeroclaw-install`
5. `cargo build --release --locked` + `cargo install --path . --force --locked`
6. Cleans up temp dir and prints next steps (`source ~/.cargo/env`, `zeroclaw onboard`)
2026-02-18 13:56:35 +08:00
ikunali
61eb72f6eb docs(readme): dark mode support for Star History chart 2026-02-18 13:50:42 +08:00