Flip conditional to use positive check (is_empty) in the if-branch
to resolve clippy::if_not_else error in CI strict delta lint gate.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When use_native_tools is true, the agent loop now:
- Formats assistant history as JSON with tool_calls array (matching
what convert_messages() expects to reconstruct NativeMessage)
- Pushes each tool result as ChatMessage::tool with tool_call_id
(instead of a single ChatMessage::user with XML tool_result tags)
- Adds fallback parsing for markdown code block tool calls
(```tool_call ... ``` and hybrid ```tool_call ... </tool_call>)
Without this, the second LLM call (sending tool results back) gets
rejected with 4xx by OpenRouter/Gemini because the message format
doesn't match the OpenAI tool calling API expectations.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ReliableProvider wraps underlying providers with retry/fallback logic
but did not delegate `supports_native_tools()` or `chat_with_tools()`.
This caused the agent loop to fall back to prompt-based tool calling
for all providers, even those with native tool support (OpenRouter,
OpenAI, Anthropic). Models like Gemini 2.0 Flash would then output
tool calls as text instead of structured API responses, breaking the
tool execution loop entirely.
Add `supports_native_tools()` delegation to the primary provider and
`chat_with_tools()` with the same retry/fallback logic as the existing
`chat_with_system()` and `chat_with_history()` methods.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Ensures consistent LF line endings in the repository across all platforms.
Critical for shell scripts that execute on Linux CI/CD environments.
Key changes:
- Shell scripts (*.sh) enforce eol=lf to prevent bash interpreter errors
- Rust source and config files normalized to LF
- Binary files explicitly marked to prevent conversion
- Default text=auto provides safe handling for unlisted file types
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Simplified
Add explicit linting and formatting configuration files to document
intent and provide consistent defaults across editors and platforms.
- .editorconfig: UTF-8, LF line endings, 4-space indent for Rust,
2-space for YAML/TOML, preserve trailing whitespace in Markdown.
- rustfmt.toml: Pin edition to 2021 matching Cargo.toml. Uses
standard defaults; file documents that this is intentional.
- clippy.toml: Set cognitive-complexity-threshold to 30,
too-many-arguments-threshold to 10, and too-many-lines-threshold
to 200. Thresholds tuned to match existing codebase patterns and
reduce noise from existing allow-attributes.
All values match current implicit defaults or are tuned to avoid
triggering on existing code. No source code changes required.
Validated: cargo fmt --check and cargo clippy -D clippy::correctness
both pass with no regressions.
Resolves#662
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add "astrai" to factory_all_providers_create_successfully test
- Add "astrai" => "ASTRAI_API_KEY" in provider_env_var() for onboarding
- Add Astrai to onboarding provider selection list (Gateway tier)
- Add provider_env_var("astrai") assertion in known_providers test
Addresses review comments from @chumyin on #486.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove 'branch' from requires_write_access() to resolve the
contradiction where branch listing was classified as both read-only
and write-requiring. Branch listing only enumerates local refs and
has no side effects, so it should remain available under ReadOnly
autonomy mode.
Add regression tests:
- branch_is_not_write_gated: verifies classification consistency
- allows_branch_listing_in_readonly_mode: verifies end-to-end
execution under ReadOnly autonomy
- is_read_only_detection: now explicitly asserts branch is read-only
Resolveszeroclaw-labs/zeroclaw#612
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* feat: Add GitHub Actions workflows for security audits, CodeQL analysis, contributor updates, performance benchmarks, integration tests, fuzz testing, and reusable Rust build jobs
- Implemented `sec-audit.yml` for Rust package security audits using `rustsec/audit-check` and `cargo-deny-action`.
- Created `sec-codeql.yml` for CodeQL analysis scheduled twice daily.
- Added `sync-contributors.yml` to update the NOTICE file with new contributors automatically.
- Introduced `test-benchmarks.yml` for performance benchmarks using Criterion.
- Established `test-e2e.yml` for running integration and end-to-end tests.
- Developed `test-fuzz.yml` for fuzz testing with configurable runtime.
- Created `test-rust-build.yml` as a reusable job for executing Rust commands with customizable parameters.
- Documented main branch delivery flows in `main-branch-flow.md` for clarity on CI/CD processes.
* ci(workflows): update workflow scripts and rename for clarity; remove obsolete lint feedback script
* chore(ci): externalize workflow scripts and relocate main flow doc
* chore(ci): align workflow names with file naming style
* feat: Add GitHub Actions workflows for security audits, CodeQL analysis, contributor updates, performance benchmarks, integration tests, fuzz testing, and reusable Rust build jobs
- Implemented `sec-audit.yml` for Rust package security audits using `rustsec/audit-check` and `cargo-deny-action`.
- Created `sec-codeql.yml` for CodeQL analysis scheduled twice daily.
- Added `sync-contributors.yml` to update the NOTICE file with new contributors automatically.
- Introduced `test-benchmarks.yml` for performance benchmarks using Criterion.
- Established `test-e2e.yml` for running integration and end-to-end tests.
- Developed `test-fuzz.yml` for fuzz testing with configurable runtime.
- Created `test-rust-build.yml` as a reusable job for executing Rust commands with customizable parameters.
- Documented main branch delivery flows in `main-branch-flow.md` for clarity on CI/CD processes.
* ci(workflows): update workflow scripts and rename for clarity; remove obsolete lint feedback script
* chore(ci): externalize workflow scripts and relocate main flow doc
Generate CycloneDX and SPDX Software Bill of Materials during
release builds. SBOMs are included in release artifacts and
covered by SHA256 checksums and cosign signatures.
Addresses item #5 in #618.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Closes#607
The http_request tool validated the initial URL against the domain
allowlist and private-host rules, but reqwest's default redirect policy
followed redirects automatically without revalidating each hop. This
allowed SSRF via redirect chains from allowed domains to internal hosts.
Set redirect policy to Policy::none() so 3xx responses are returned
as-is. Callers that need to follow redirects must issue a new request,
which goes through validate_url again.
Severity: High — SSRF/allowlist bypass via redirect chains.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add benchmarks using Criterion for:
- XML tool-call parsing (single and multi-call)
- Native tool-call parsing
- SQLite memory store/recall/count operations
- Full agent turn cycle (text-only and with tool call)
Add CI workflow (.github/workflows/benchmarks.yml) that:
- Runs benchmarks on push to main and on PRs
- Uploads Criterion results as artifacts
- Posts benchmark summary as PR comment for regression visibility
Ref: https://github.com/zeroclaw-labs/zeroclaw/issues/618 (item 7)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Closes#601
The Linux screenshot path uses sh -c with single-quote interpolation.
A filename containing quote characters could break quoting and inject
shell tokens. Add a check that rejects filenames with any shell-breaking
characters (quotes, backticks, dollar signs, semicolons, pipes, etc.)
before passing to the shell command.
Severity: High — command injection in tool execution path.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem: Security-critical parsing surfaces (config loading, tool
parameter deserialization) have no fuzz testing coverage. Malformed
inputs to these surfaces could cause panics, memory issues, or
unexpected behavior in production.
Solution: Add a weekly cargo-fuzz CI workflow with two initial
harnesses:
- fuzz_config_parse: fuzzes TOML config deserialization
- fuzz_tool_params: fuzzes JSON tool parameter parsing
The workflow runs each target for 300 seconds (configurable via
workflow_dispatch input), uses nightly Rust toolchain (required by
libfuzzer), and uploads crash artifacts for triage with 30-day
retention. Step summaries report pass/fail status per target.
Files added:
- .github/workflows/fuzz.yml (scheduled + manual dispatch)
- fuzz/Cargo.toml (fuzz crate manifest)
- fuzz/fuzz_targets/fuzz_config_parse.rs
- fuzz/fuzz_targets/fuzz_tool_params.rs
Testing: Validated YAML syntax and Cargo.toml structure. Fuzz
harnesses use standard libfuzzer-sys patterns. Actual fuzzing
will execute on first scheduled or manual CI run.
Ref: zeroclaw-labs/zeroclaw#618 (item 4 — Fuzz Testing)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem: The Dependabot configuration monitors Cargo and GitHub Actions
dependencies but does not track Docker base image updates. Stale base
images in the Dockerfile can accumulate unpatched vulnerabilities.
Solution: Add a Docker package-ecosystem entry to dependabot.yml that
proposes weekly base image updates, grouped by minor/patch, with a
3-PR concurrency limit. Labels (ci, dependencies) match the existing
GitHub Actions ecosystem entry for consistent triage routing.
Testing: Validated YAML syntax. Dependabot will activate automatically
on the next scheduled scan after merge.
Ref: zeroclaw-labs/zeroclaw#618 (item 1 — Dependency Update Automation)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add end-to-end integration tests that exercise the full agent turn
cycle through the public API using mock providers and tools:
- Simple text response (no tools)
- Single tool call → tool execution → final response
- Multi-step tool chain
- XML dispatcher path
- Multi-turn conversation coherence
- Unknown tool recovery
- Parallel tool dispatch
Add CI workflow (.github/workflows/e2e.yml) that runs these tests
on push to main and on PRs.
Ref: https://github.com/zeroclaw-labs/zeroclaw/issues/618 (item 6)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem: CI only tests the default feature set. The codebase defines
multiple Cargo features (hardware, browser-native, sandbox-landlock,
sandbox-bubblewrap, probe, rag-pdf) behind conditional compilation.
Feature-gated code can silently break without CI coverage.
Solution: Add a dedicated feature-matrix workflow that tests key
feature combinations in a matrix strategy:
- --no-default-features (bare minimum compiles)
- --all-features (everything together)
- --no-default-features --features hardware (isolated hardware)
- --no-default-features --features browser-native (isolated browser)
Each combination runs both cargo check and cargo test. The
workflow triggers on Cargo.toml/lock/src changes and weekly schedule.
Testing: Validated YAML syntax and matrix expansion logic. Actual
feature compilation will be verified by CI on first run.
Ref: zeroclaw-labs/zeroclaw#618 (item 2 — Feature Matrix Testing)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Problem: The release workflow warns when binaries exceed 5MB but does
not block the build. Since small binary size is a stated project goal
(release profile uses opt-level="z", LTO, strip, panic=abort), size
regressions can silently ship to users without any enforcement.
Solution: Convert the binary size check to a tiered gate:
- >5MB: emits a GitHub Actions warning (soft target, informational)
- >15MB: emits a GitHub Actions error and fails the build (hard limit)
- Adds a step summary with per-target binary size metrics for
visibility in the Actions UI.
The 15MB hard limit provides headroom for legitimate growth while
catching catastrophic regressions (e.g., debug symbols not stripped,
accidental fat dependency additions).
Testing: Validated YAML syntax. The shell script logic is
straightforward (stat + arithmetic comparison). The existing
unner.os != 'Windows' guard is preserved.
Ref: zeroclaw-labs/zeroclaw#618 (item 3 — Binary Size Gating)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* fix(gateway): honor config bind settings and persist pairing
Resolve docker-compose startup and restart friction by:
- using config host/port defaults for gateway/daemon unless CLI flags are passed
- persisting paired token hashes to config.toml on successful /pair
- running container default command as 'zeroclaw gateway' (no hardcoded --host/--port overrides)
- updating compose image/docs to zeroclaw-labs namespace
- adding MODEL env fallback for default_model override and targeted regression tests
* chore(ci): sync lockfile and restore rustfmt parity
Update Cargo.lock to match Cargo.toml and format src/service/mod.rs so rust quality gates stop failing with unrelated baseline drift.
* docs(readme): restore Buy Me a Coffee links
Re-add the donation badge in the header and restore a dedicated support section so the donation path remains visible in the README.
* docs(readme): polish hero copy and stats layout
Improve intro readability below badges by refining community copy, centering the tagline, and replacing the wide code block with a centered inline stats row.
The CI workflow contained a ~90-line bash script for change-detection
(lines 38-128) and a ~80-line JavaScript block for lint feedback
(lines 292-370) directly inline in the YAML. Large inline scripts are
harder to test, lint, and maintain than standalone files.
Extract:
- Change-detection logic → scripts/ci/detect_change_scope.sh
- Lint feedback logic → scripts/ci/lint_feedback.js
The workflow now calls these external scripts. GitHub expression values
that were previously interpolated inline are passed as environment
variables instead.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* ci(release): add concurrency group to prevent duplicate release builds
When two tags are pushed in quick succession, the release workflow could
run concurrently, producing corrupted or incomplete GitHub releases.
Add a concurrency group scoped to the tag ref so that release runs for
the same tag are serialized. cancel-in-progress is set to false to ensure
a running release completes rather than being aborted.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* ci(release): serialize all release runs globally
Use a constant workflow concurrency group so release publish jobs run one-at-a-time across tags, avoiding cross-tag race conditions.
---------
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Will Sarg <12886992+willsarg@users.noreply.github.com>
* fix(workflows): standardize runner configuration for security jobs
* ci(actionlint): add Blacksmith runner label to config
Add blacksmith-2vcpu-ubuntu-2404 to actionlint self-hosted-runner labels config
to suppress "unknown label" warnings during workflow linting.
This label is used across all workflows after the Blacksmith migration.
* fix(actionlint): adjust indentation for self-hosted runner labels
* feat(security): enhance security workflow with CodeQL analysis steps
* fix(security): update CodeQL action to version 4 for improved analysis
* fix(security): remove duplicate permissions in security workflow
* fix(security): revert CodeQL action to v3 for stability
The v4 version was causing workflow file validation failures.
Reverting to proven v3 version that is working on main branch.
* fix(security): remove duplicate permissions causing workflow validation failure
The permissions block had duplicate security-events and actions keys,
which caused YAML validation errors and prevented workflow execution.
Fixes: workflow file validation failures on main branch
* fix(security): remove pull_request trigger to reduce costs
* fix(security): restore PR trigger but skip codeql on PRs
* fix(security): resolve YAML syntax error in security workflow
* refactor(security): split CodeQL into dedicated scheduled workflow
* fix(security): update workflow name to Rust Package Security Audit
* fix(codeql): remove push trigger, keep schedule and on-demand only
* feat(codeql): add CodeQL configuration file to ignore specific paths
* Potential fix for code scanning alert no. 39: Hard-coded cryptographic value
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* fix(ci): resolve auto-response workflow merge markers
* fix(build): restore ChannelMessage reply_target usage
* ci(workflows): run workflow sanity on workflow pushes for all branches
* ci(workflows): rename auto-response workflow to PR Auto Responder
* ci(workflows): require owner approval for workflow file changes
* ci: add lint-first PR feedback gate
* ci(workflows): split label policy checks from workflow sanity
* ci(workflows): consolidate policy and rust workflow setup
* ci: add safe pull request intake sanity checks
* ci(security): switch audit to pinned rustsec audit-check
* fix(providers): clarify reliable failure entries for custom providers
* ci(pr-intake): make template/format checks advisory
Keep PR Intake Sanity non-blocking for template completeness and formatting findings, while still failing on dangerous merge-conflict markers in added lines.
---------
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* fix(workflows): standardize runner configuration for security jobs
* ci(actionlint): add Blacksmith runner label to config
Add blacksmith-2vcpu-ubuntu-2404 to actionlint self-hosted-runner labels config
to suppress "unknown label" warnings during workflow linting.
This label is used across all workflows after the Blacksmith migration.
* fix(actionlint): adjust indentation for self-hosted runner labels
* feat(security): enhance security workflow with CodeQL analysis steps
* fix(security): update CodeQL action to version 4 for improved analysis
* fix(security): remove duplicate permissions in security workflow
* fix(security): revert CodeQL action to v3 for stability
The v4 version was causing workflow file validation failures.
Reverting to proven v3 version that is working on main branch.
* fix(security): remove duplicate permissions causing workflow validation failure
The permissions block had duplicate security-events and actions keys,
which caused YAML validation errors and prevented workflow execution.
Fixes: workflow file validation failures on main branch
* fix(security): remove pull_request trigger to reduce costs
* fix(security): restore PR trigger but skip codeql on PRs
* fix(security): resolve YAML syntax error in security workflow
* refactor(security): split CodeQL into dedicated scheduled workflow
* fix(security): update workflow name to Rust Package Security Audit
* fix(codeql): remove push trigger, keep schedule and on-demand only
* feat(codeql): add CodeQL configuration file to ignore specific paths
* Potential fix for code scanning alert no. 39: Hard-coded cryptographic value
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* fix(ci): resolve auto-response workflow merge markers
* fix(build): restore ChannelMessage reply_target usage
* ci(workflows): run workflow sanity on workflow pushes for all branches
* ci(workflows): rename auto-response workflow to PR Auto Responder
* ci(workflows): require owner approval for workflow file changes
* ci: add lint-first PR feedback gate
* ci(workflows): split label policy checks from workflow sanity
* ci(workflows): consolidate policy and rust workflow setup
* ci: add safe pull request intake sanity checks
* ci(security): switch audit to pinned rustsec audit-check
* fix(providers): clarify reliable failure entries for custom providers
---------
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>