Add three onboarding improvements for the Astrai provider:
- Signup URL: users now see "Get your API key at: https://as-trai.com"
during onboarding instead of a blank prompt
- Curated model list: auto (best execution), GPT-4o, Claude Sonnet 4.5,
DeepSeek V3, Llama 3.3 70B
- Live model fetch: Astrai's OpenAI-compatible /v1/models endpoint is
now queried when an API key is present, matching other providers
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
## Problem
The test suite contained several categories of latent brittleness
identified in docs/testing-brittle-tests.md that would surface during
refactoring or cross-platform (Windows) CI execution:
1. Hardcoded Unix paths: \Path::new("/tmp")\ and \PathBuf::from("/tmp")\
used as workspace directories in agent tests, which fail on Windows
where /tmp does not exist.
2. Exact string match assertions: ~20 \ssert_eq!(response, "exact text")\
assertions in agent unit and e2e tests that break on any mock wording
change, even when the underlying orchestration behavior is correct.
3. Fragile error message string matching: \.contains("specific message")\
assertions coupled to internal error wording rather than testing the
error category or behavioral outcome.
## What Changed
### Hardcoded paths → platform-agnostic temp dirs (4 files, 7 locations)
- \src/agent/tests.rs\: Replaced all 4 instances of \Path::new("/tmp")\
and \PathBuf::from("/tmp")\ with \std::env::temp_dir()\ in
\make_memory()\, \uild_agent_with()\, \uild_agent_with_memory()\,
and \uild_agent_with_config()\ helpers.
- \ ests/agent_e2e.rs\: Replaced all 3 instances in \make_memory()\,
\uild_agent()\, and \uild_agent_xml()\ helpers.
### Exact string assertions → behavioral checks (2 files, ~20 locations)
- \src/agent/tests.rs\: Converted 10 \ssert_eq!(response, "...")\ to
\ssert!(!response.is_empty(), "descriptive message")\ across tests for
text pass-through, tool execution, tool failure recovery, XML dispatch,
mixed text+tool responses, multi-tool batch, and run_single delegation.
- \ ests/agent_e2e.rs\: Converted 9 exact-match assertions to behavioral
checks. Multi-turn test now uses \ssert_ne!(r1, r2)\ to verify
sequential responses are distinct without coupling to exact wording.
- Provider error propagation test simplified to \ssert!(result.is_err())\
without asserting on the error message string.
### Fragile error message assertions → structural checks (2 files)
- \src/tools/git_operations.rs\: Replaced fragile OR-branch string match
(\contains("git repository") || contains("Git command failed")\) with
structural assertions: checks \!result.success\, error is non-empty,
and error does NOT mention autonomy/read-only (verifying the failure
is git-related, not permission-related).
- \src/cron/scheduler.rs\: Replaced \contains("agent job failed:")\ with
\!success\ and \!output.is_empty()\ checks that verify failure behavior
without coupling to exact log format.
## What Was NOT Changed (and why)
- \src/agent/loop_.rs\ parser tests: Exact string assertions are the
contract for XML tool call parsing — the exact output IS the spec.
- \src/providers/reliable.rs\: Error message assertions test the error
format contract (provider/model attribution in failure messages).
- \src/service/mod.rs\: Already platform-gated with \#[cfg]\; XML escape
test is a formatting contract where exact match is appropriate.
- \src/config/schema.rs\: TOML test strings use /tmp as data values for
deserialization tests, not filesystem access; HOME tests already use
\std::env::temp_dir()\.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three fixes for conversation quality issues:
1. loop_.rs and channels now read max_tool_iterations from AgentConfig
instead of using a hardcoded constant of 10, making it configurable.
2. Memory recall now filters entries below a configurable
min_relevance_score threshold (default 0.4), preventing unrelated
memories from bleeding into conversation context.
3. Default hybrid search weights rebalanced from 70/30 vector/keyword
to 40/60, reducing cross-topic semantic bleed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Problem:
In record_run(), an INSERT into cron_runs followed by a pruning DELETE
ran as separate implicit transactions. If the INSERT succeeded but the
DELETE failed (e.g., due to disk pressure or lock contention), the run
table would grow unboundedly since the pruning step was lost while the
new row persisted.
Fix:
Wrap both statements in an explicit transaction using
conn.unchecked_transaction(). If either statement fails, the entire
transaction is rolled back, maintaining the invariant that the run
history stays bounded by max_run_history.
Ref: zeroclaw-labs/zeroclaw#710 (Item 5)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem:
The pruning query in record_run uses WHERE job_id = ?1 with
ORDER BY started_at DESC, but only single-column indexes exist
for job_id and started_at separately. SQLite must scan one index
and then sort or scan the other, which is suboptimal for the
combined filter + sort pattern used during pruning.
Fix:
Add a composite index CREATE INDEX IF NOT EXISTS
idx_cron_runs_job_started ON cron_runs(job_id, started_at).
This lets SQLite satisfy the WHERE job_id = ?1 ORDER BY
started_at DESC subquery in a single index scan without a
separate sort step. The existing single-column indexes are
retained for other queries that filter on only one column.
Ref: zeroclaw-labs/zeroclaw#710 (Item 7)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add `scripts/install.sh` — a single `curl | bash` installer that handles system deps, Rust, clone, build, and install
automatically.
- Update README Linux/macOS section with a "One-Line Installer (Recommended)" block above the existing manual steps.
1. Detects OS (Linux apt/dnf, macOS Xcode CLT)
2. Installs build deps + git via system package manager (sudo only here)
3. Installs Rust via rustup (skipped if already present)
4. Shallow-clones the repo to `/tmp/zeroclaw-install`
5. `cargo build --release --locked` + `cargo install --path . --force --locked`
6. Cleans up temp dir and prints next steps (`source ~/.cargo/env`, `zeroclaw onboard`)
The NativeToolDispatcher silently defaults to an empty object when tool
call arguments from the LLM fail to parse as JSON. The XML dispatcher
already logs a warning for the same case (line 68). Add a matching
tracing::warn with tool name and parse error for observability parity.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Problem: add_column_if_missing() checks PRAGMA table_info for column existence, then issues ALTER TABLE ADD COLUMN if not found. When two concurrent processes both pass the check before either executes the ALTER, the second process fails with a 'duplicate column name' error.
Fix: Catch the 'duplicate column name' SQLite error after the ALTER TABLE and treat it as a benign no-op. Also explicitly drop statement/rows handles before ALTER to release locks.
Ref: #710 (Item 8)
run_tool_call_loop used a hardcoded MAX_TOOL_ITERATIONS (10) and
trim_history/auto_compact_history used a hardcoded MAX_HISTORY_MESSAGES (50),
ignoring the user-configurable agent.max_tool_iterations and
agent.max_history_messages values in config.toml.
Meanwhile, agent.rs correctly reads from config — creating an inconsistency
where CLI single-shot mode respected config but the channel runtime and
interactive CLI loop silently ignored it.
Changes:
- Rename constants to DEFAULT_* to clarify they are fallback defaults
- Add max_tool_iterations parameter to run_tool_call_loop
- Add max_history parameter to trim_history and auto_compact_history
- Thread config.agent.max_tool_iterations through ChannelRuntimeContext
- Both CLI code paths now pass config values to run_tool_call_loop
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Discord channel was setting msg.channel to the numeric Discord
channel ID instead of the literal string 'discord'. This caused
process_channel_message() to fail the channels_by_name lookup since
the map is keyed by channel name (e.g. 'discord', 'telegram', 'slack').
The result: the bot receives messages and generates LLM responses but
never sends them back -- target_channel resolves to None so the send
call is silently skipped.
Every other channel (telegram, slack, whatsapp, matrix, signal, irc,
imessage, lark, dingtalk, qq, email, mattermost) correctly sets this
field to its channel name string. Discord was the only one using the
platform-specific ID.
The GLM provider previously relied on the trait default for
chat_with_history, which only forwarded the last user message. This adds
a proper multi-turn implementation that sends the full conversation
history to the GLM API, matching the pattern used by OpenRouter, Ollama,
and other providers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add memory.sqlite_open_timeout_secs config (None = wait indefinitely).
- When set, open the DB in a thread with recv_timeout; cap at 300s.
- Default remains None for backward compatibility.
- Document in README; add tests for timeout path and default.
Add start_typing/stop_typing overrides to TelegramChannel following the
same pattern as DiscordChannel: spawn a tokio task that sends
sendChatAction every 4 seconds (Telegram typing expires after 5s),
and abort it on stop_typing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The MemoryCategory::Custom variant already exists in the memory backend
but the memory_store tool only accepted core/daily/conversation. Now any
string is accepted as a category, passing through to Custom(name) for
non-builtin values.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace broad COPY . . with targeted COPY src/ and firmware/ to
preserve Docker layer cache across non-build file changes (4.1)
- Inline permissions/config prep into builder stage, removing the
extra busybox stage and its maintenance/security overhead (4.2)
- Strip heavy dev tools (vim, git, iputils-ping, openssl) from dev
image, keeping only ca-certificates and curl (4.3)
- Replace expensive zeroclaw doctor healthcheck with lightweight
zeroclaw status; increase interval from 30s to 60s (4.4)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Gemini CLI OAuth tokens are scoped for Google's internal Code Assist
API at cloudcode-pa.googleapis.com/v1internal, not the public
generativelanguage.googleapis.com/v1beta endpoint.
This commit:
- Routes OAuth requests to the correct internal endpoint
- Wraps the request payload with model metadata (internal API format)
- Keeps API key auth unchanged on the public endpoint
Fixes#578
Implement Telegram Forum (topic) support to allow the bot to respond
in the same topic where it was mentioned/called.
Changes:
- parse_update_message(): Extract message_thread_id and format reply_target as 'chat_id:thread_id'
- send(): Parse recipient to extract chat_id and optional thread_id
- All send methods now pass thread_id parameter and include message_thread_id in API requests
- Added test for forum topic message parsing
This ensures bot replies stay within the same forum topic thread.
Flip conditional to use positive check (is_empty) in the if-branch
to resolve clippy::if_not_else error in CI strict delta lint gate.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When use_native_tools is true, the agent loop now:
- Formats assistant history as JSON with tool_calls array (matching
what convert_messages() expects to reconstruct NativeMessage)
- Pushes each tool result as ChatMessage::tool with tool_call_id
(instead of a single ChatMessage::user with XML tool_result tags)
- Adds fallback parsing for markdown code block tool calls
(```tool_call ... ``` and hybrid ```tool_call ... </tool_call>)
Without this, the second LLM call (sending tool results back) gets
rejected with 4xx by OpenRouter/Gemini because the message format
doesn't match the OpenAI tool calling API expectations.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ReliableProvider wraps underlying providers with retry/fallback logic
but did not delegate `supports_native_tools()` or `chat_with_tools()`.
This caused the agent loop to fall back to prompt-based tool calling
for all providers, even those with native tool support (OpenRouter,
OpenAI, Anthropic). Models like Gemini 2.0 Flash would then output
tool calls as text instead of structured API responses, breaking the
tool execution loop entirely.
Add `supports_native_tools()` delegation to the primary provider and
`chat_with_tools()` with the same retry/fallback logic as the existing
`chat_with_system()` and `chat_with_history()` methods.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Ensures consistent LF line endings in the repository across all platforms.
Critical for shell scripts that execute on Linux CI/CD environments.
Key changes:
- Shell scripts (*.sh) enforce eol=lf to prevent bash interpreter errors
- Rust source and config files normalized to LF
- Binary files explicitly marked to prevent conversion
- Default text=auto provides safe handling for unlisted file types
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Simplified
Add explicit linting and formatting configuration files to document
intent and provide consistent defaults across editors and platforms.
- .editorconfig: UTF-8, LF line endings, 4-space indent for Rust,
2-space for YAML/TOML, preserve trailing whitespace in Markdown.
- rustfmt.toml: Pin edition to 2021 matching Cargo.toml. Uses
standard defaults; file documents that this is intentional.
- clippy.toml: Set cognitive-complexity-threshold to 30,
too-many-arguments-threshold to 10, and too-many-lines-threshold
to 200. Thresholds tuned to match existing codebase patterns and
reduce noise from existing allow-attributes.
All values match current implicit defaults or are tuned to avoid
triggering on existing code. No source code changes required.
Validated: cargo fmt --check and cargo clippy -D clippy::correctness
both pass with no regressions.
Resolves#662
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add "astrai" to factory_all_providers_create_successfully test
- Add "astrai" => "ASTRAI_API_KEY" in provider_env_var() for onboarding
- Add Astrai to onboarding provider selection list (Gateway tier)
- Add provider_env_var("astrai") assertion in known_providers test
Addresses review comments from @chumyin on #486.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove 'branch' from requires_write_access() to resolve the
contradiction where branch listing was classified as both read-only
and write-requiring. Branch listing only enumerates local refs and
has no side effects, so it should remain available under ReadOnly
autonomy mode.
Add regression tests:
- branch_is_not_write_gated: verifies classification consistency
- allows_branch_listing_in_readonly_mode: verifies end-to-end
execution under ReadOnly autonomy
- is_read_only_detection: now explicitly asserts branch is read-only
Resolveszeroclaw-labs/zeroclaw#612
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* feat: Add GitHub Actions workflows for security audits, CodeQL analysis, contributor updates, performance benchmarks, integration tests, fuzz testing, and reusable Rust build jobs
- Implemented `sec-audit.yml` for Rust package security audits using `rustsec/audit-check` and `cargo-deny-action`.
- Created `sec-codeql.yml` for CodeQL analysis scheduled twice daily.
- Added `sync-contributors.yml` to update the NOTICE file with new contributors automatically.
- Introduced `test-benchmarks.yml` for performance benchmarks using Criterion.
- Established `test-e2e.yml` for running integration and end-to-end tests.
- Developed `test-fuzz.yml` for fuzz testing with configurable runtime.
- Created `test-rust-build.yml` as a reusable job for executing Rust commands with customizable parameters.
- Documented main branch delivery flows in `main-branch-flow.md` for clarity on CI/CD processes.
* ci(workflows): update workflow scripts and rename for clarity; remove obsolete lint feedback script
* chore(ci): externalize workflow scripts and relocate main flow doc
* chore(ci): align workflow names with file naming style
* feat: Add GitHub Actions workflows for security audits, CodeQL analysis, contributor updates, performance benchmarks, integration tests, fuzz testing, and reusable Rust build jobs
- Implemented `sec-audit.yml` for Rust package security audits using `rustsec/audit-check` and `cargo-deny-action`.
- Created `sec-codeql.yml` for CodeQL analysis scheduled twice daily.
- Added `sync-contributors.yml` to update the NOTICE file with new contributors automatically.
- Introduced `test-benchmarks.yml` for performance benchmarks using Criterion.
- Established `test-e2e.yml` for running integration and end-to-end tests.
- Developed `test-fuzz.yml` for fuzz testing with configurable runtime.
- Created `test-rust-build.yml` as a reusable job for executing Rust commands with customizable parameters.
- Documented main branch delivery flows in `main-branch-flow.md` for clarity on CI/CD processes.
* ci(workflows): update workflow scripts and rename for clarity; remove obsolete lint feedback script
* chore(ci): externalize workflow scripts and relocate main flow doc
Generate CycloneDX and SPDX Software Bill of Materials during
release builds. SBOMs are included in release artifacts and
covered by SHA256 checksums and cosign signatures.
Addresses item #5 in #618.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>