Implement chat_with_tools() on CompatibleProvider so OpenAI-compatible
endpoints (OpenRouter, local LLMs, etc.) can use structured tool calling
instead of prompt-injected tool descriptions.
Changes:
- CompatibleProvider: capabilities() reports native_tool_calling; the new
chat_with_tools() sends tools in the API request and parses tool_calls
from the response; chat() bridges to chat_with_tools() when ToolSpecs
are provided (request/response shapes sketched below)
- RouterProvider: chat_with_tools() delegation with model hint resolution
- loop_.rs: expose tools_to_openai_format as pub(crate), add
tools_to_openai_format_from_specs for ToolSpec-based conversion
Adds 9 new tests and updates 1 existing test.
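A minimal sketch of the request and response shapes involved, assuming a
simplified ToolSpec; the project's actual ToolSpec fields and the
tools_to_openai_format helpers in loop_.rs differ in detail:

```rust
use serde_json::{json, Value};

/// Illustrative stand-in for the project's ToolSpec; field names are assumptions.
struct ToolSpec {
    name: String,
    description: String,
    parameters: Value, // JSON Schema describing the tool's arguments
}

/// Build the OpenAI-style `tools` array sent with the chat request.
fn tools_to_openai_format(specs: &[ToolSpec]) -> Value {
    Value::Array(
        specs
            .iter()
            .map(|s| {
                json!({
                    "type": "function",
                    "function": {
                        "name": &s.name,
                        "description": &s.description,
                        "parameters": &s.parameters,
                    }
                })
            })
            .collect(),
    )
}

/// Pull structured tool calls (name, JSON argument string) out of the response.
fn parse_tool_calls(response: &Value) -> Vec<(String, String)> {
    response["choices"][0]["message"]["tool_calls"]
        .as_array()
        .map(|calls| {
            calls
                .iter()
                .filter_map(|c| {
                    Some((
                        c["function"]["name"].as_str()?.to_string(),
                        c["function"]["arguments"].as_str()?.to_string(),
                    ))
                })
                .collect()
        })
        .unwrap_or_default()
}
```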
Reasoning/thinking models (Qwen3, GLM-4, DeepSeek, etc.) may return
output in `reasoning_content` instead of `content`. Add automatic
fallback for both OpenAI and OpenAI-compatible providers, including
streaming SSE support.
Changes:
- Add `reasoning_content` field to response structs in both providers
- Add `effective_content()` helper (sketched below) that prefers `content`
but falls back to `reasoning_content` when content is empty, null, or missing
- Update all extraction sites to use `effective_content()`
- Add streaming SSE fallback for `reasoning_content` chunks
- Add 16 focused unit tests covering all edge cases
Tested end-to-end against GLM-4.7-flash via local LLM server.
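A minimal sketch of the fallback, with field names taken from the description
above rather than the providers' exact response structs:

```rust
/// Illustrative response-message shape; real structs carry more fields.
#[derive(serde::Deserialize)]
struct ChoiceMessage {
    content: Option<String>,
    reasoning_content: Option<String>,
}

impl ChoiceMessage {
    /// Prefer `content`, but fall back to `reasoning_content` when
    /// `content` is null, missing, or empty (as reasoning models do).
    fn effective_content(&self) -> Option<&str> {
        match self.content.as_deref() {
            Some(c) if !c.trim().is_empty() => Some(c),
            _ => self.reasoning_content.as_deref(),
        }
    }
}
```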
All five providers have HTTP clients but previously did not implement warmup(),
relying on the trait's default no-op. This adds lightweight warmup calls
to establish TLS + HTTP/2 connection pools on startup, reducing
first-request latency. Each warmup is skipped when credentials are
absent, matching the OpenRouter pattern.
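Roughly the pattern each provider follows; the endpoint path and parameter
names below are placeholders, not the providers' real ones:

```rust
/// Best-effort warmup: establish the TLS + HTTP/2 connection pool up front.
async fn warmup(client: &reqwest::Client, base_url: &str, api_key: Option<&str>) {
    // Skip entirely when no credentials are configured.
    let Some(key) = api_key else { return };

    // A cheap request is enough to prime the connection pool;
    // the result is ignored, warmup is best-effort.
    let _ = client
        .get(format!("{base_url}/models"))
        .bearer_auth(key)
        .send()
        .await;
}
```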
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements Anthropic's prompt caching API to enable significant cost
reduction (up to 90%) and latency improvements (up to 85%) for
requests with repeated content.
Key features:
- Auto-caching heuristics: large system prompts (>3KB), tool
definitions, and long conversations (>4 messages)
- Full backward compatibility: cache_control fields are optional
- Supports both string and block-array system prompt formats
- Cache control on all content types (text, tool_use, tool_result)
Implementation details:
- Added CacheControl, SystemPrompt, and SystemBlock structures
- Updated NativeContentOut and NativeToolSpec with cache_control
- Strategic cache breakpoint placement (last tool, last message)
- Comprehensive test coverage for serialization and heuristics
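A simplified sketch of the cache_control wire format and the breakpoint
heuristic; the real CacheControl/SystemBlock structs carry more fields and
variants:

```rust
use serde::Serialize;

#[derive(Serialize)]
struct CacheControl {
    #[serde(rename = "type")]
    kind: &'static str, // "ephemeral" cache type
}

#[derive(Serialize)]
struct SystemBlock {
    #[serde(rename = "type")]
    kind: &'static str, // "text"
    text: String,
    #[serde(skip_serializing_if = "Option::is_none")]
    cache_control: Option<CacheControl>,
}

/// Heuristic: mark only the last system block as a cache breakpoint, and only
/// when the prompt is large enough (>3KB here) to be worth caching.
fn apply_system_cache_heuristic(blocks: &mut [SystemBlock]) {
    let total: usize = blocks.iter().map(|b| b.text.len()).sum();
    if total > 3 * 1024 {
        if let Some(last) = blocks.last_mut() {
            last.cache_control = Some(CacheControl { kind: "ephemeral" });
        }
    }
}
```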
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
(cherry picked from commit fff04f4edb5e4cb7e581b1b16035da8cc2e55cef)
The GLM provider previously relied on the trait default for
chat_with_history, which only forwarded the last user message. This adds
a proper multi-turn implementation that sends the full conversation
history to the GLM API, matching the pattern used by OpenRouter, Ollama,
and other providers.
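The payload change, sketched with stand-in types (the real ChatMessage and GLM
request structs differ):

```rust
use serde_json::{json, Value};

/// Illustrative conversation turn; the project's message type looks different.
struct ChatMessage {
    role: String,    // "system" | "user" | "assistant"
    content: String,
}

/// Send the whole conversation, not just the final user turn.
fn build_glm_messages(history: &[ChatMessage]) -> Value {
    Value::Array(
        history
            .iter()
            .map(|m| json!({ "role": &m.role, "content": &m.content }))
            .collect(),
    )
}
```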
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Gemini CLI OAuth tokens are scoped for Google's internal Code Assist
API at cloudcode-pa.googleapis.com/v1internal, not the public
generativelanguage.googleapis.com/v1beta endpoint.
This commit:
- Routes OAuth requests to the correct internal endpoint
- Wraps the request payload with model metadata (internal API format)
- Keeps API key auth unchanged on the public endpoint
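A rough sketch of the routing decision; the wrapper field names and the exact
internal path suffix are assumptions, not the documented schema:

```rust
use serde_json::{json, Value};

enum GeminiAuth {
    ApiKey(String),
    OAuth(String),
}

fn endpoint_and_body(auth: &GeminiAuth, model: &str, request: Value) -> (String, Value) {
    match auth {
        // API key auth keeps using the public endpoint with the payload as-is.
        GeminiAuth::ApiKey(key) => (
            format!(
                "https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent?key={key}"
            ),
            request,
        ),
        // OAuth tokens are only valid for the internal Code Assist surface,
        // which expects the request wrapped with model metadata.
        GeminiAuth::OAuth(_token) => (
            // Placeholder path on the v1internal endpoint named above.
            "https://cloudcode-pa.googleapis.com/v1internal:generateContent".to_string(),
            // Wrapper field names here are assumptions.
            json!({ "model": model, "request": request }),
        ),
    }
}
```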
Fixes #578
ReliableProvider wraps underlying providers with retry/fallback logic
but did not delegate `supports_native_tools()` or `chat_with_tools()`.
This caused the agent loop to fall back to prompt-based tool calling
for all providers, even those with native tool support (OpenRouter,
OpenAI, Anthropic). Models like Gemini 2.0 Flash would then output
tool calls as text instead of structured API responses, breaking the
tool execution loop entirely.
Add `supports_native_tools()` delegation to the primary provider and
`chat_with_tools()` with the same retry/fallback logic as the existing
`chat_with_system()` and `chat_with_history()` methods.
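A condensed sketch of the delegation under simplified trait and error types;
the real retry policy and fallback bookkeeping are richer than this:

```rust
use std::sync::Arc;

#[async_trait::async_trait]
trait Provider: Send + Sync {
    fn supports_native_tools(&self) -> bool { false }
    async fn chat_with_tools(&self, prompt: &str, tools_json: &str) -> anyhow::Result<String>;
}

struct ReliableProvider {
    primary: Arc<dyn Provider>,
    fallbacks: Vec<Arc<dyn Provider>>,
    max_retries: usize,
}

#[async_trait::async_trait]
impl Provider for ReliableProvider {
    fn supports_native_tools(&self) -> bool {
        // Capability questions go straight to the primary provider.
        self.primary.supports_native_tools()
    }

    async fn chat_with_tools(&self, prompt: &str, tools_json: &str) -> anyhow::Result<String> {
        // Same retry-then-fallback shape as chat_with_system / chat_with_history.
        let mut last_err = None;
        for _ in 0..=self.max_retries {
            match self.primary.chat_with_tools(prompt, tools_json).await {
                Ok(out) => return Ok(out),
                Err(e) => last_err = Some(e),
            }
        }
        for fallback in &self.fallbacks {
            if let Ok(out) = fallback.chat_with_tools(prompt, tools_json).await {
                return Ok(out);
            }
        }
        Err(last_err.unwrap_or_else(|| anyhow::anyhow!("no providers available")))
    }
}
```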
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add "astrai" to factory_all_providers_create_successfully test
- Add "astrai" => "ASTRAI_API_KEY" in provider_env_var() for onboarding
- Add Astrai to onboarding provider selection list (Gateway tier)
- Add provider_env_var("astrai") assertion in known_providers test
Addresses review comments from @chumyin on #486.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(workflows): standardize runner configuration for security jobs
* ci(actionlint): add Blacksmith runner label to config
Add blacksmith-2vcpu-ubuntu-2404 to actionlint self-hosted-runner labels config
to suppress "unknown label" warnings during workflow linting.
This label is used across all workflows after the Blacksmith migration.
* fix(actionlint): adjust indentation for self-hosted runner labels
* feat(security): enhance security workflow with CodeQL analysis steps
* fix(security): update CodeQL action to version 4 for improved analysis
* fix(security): remove duplicate permissions in security workflow
* fix(security): revert CodeQL action to v3 for stability
The v4 version was causing workflow file validation failures.
Reverting to the proven v3 version that is working on the main branch.
* fix(security): remove duplicate permissions causing workflow validation failure
The permissions block had duplicate security-events and actions keys,
which caused YAML validation errors and prevented workflow execution.
Fixes: workflow file validation failures on main branch
* fix(security): remove pull_request trigger to reduce costs
* fix(security): restore PR trigger but skip codeql on PRs
* fix(security): resolve YAML syntax error in security workflow
* refactor(security): split CodeQL into dedicated scheduled workflow
* fix(security): update workflow name to Rust Package Security Audit
* fix(codeql): remove push trigger, keep schedule and on-demand only
* feat(codeql): add CodeQL configuration file to ignore specific paths
* Potential fix for code scanning alert no. 39: Hard-coded cryptographic value
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
* fix(ci): resolve auto-response workflow merge markers
* fix(build): restore ChannelMessage reply_target usage
* ci(workflows): run workflow sanity on workflow pushes for all branches
* ci(workflows): rename auto-response workflow to PR Auto Responder
* ci(workflows): require owner approval for workflow file changes
* ci: add lint-first PR feedback gate
* ci(workflows): split label policy checks from workflow sanity
* ci(workflows): consolidate policy and rust workflow setup
* ci: add safe pull request intake sanity checks
* ci(security): switch audit to pinned rustsec audit-check
* fix(providers): clarify reliable failure entries for custom providers
---------
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Resolved conflicts in:
- Cargo.toml: kept both `ring` (JWT auth) and `prost` (protobuf) dependencies
- src/onboard/wizard.rs: accepted main branch version
- src/providers/mod.rs: accepted main branch version
- Cargo.lock: accepted main branch version
Note: The custom `glm::GlmProvider` from this PR was replaced with
main's OpenAiCompatibleProvider approach for GLM, which uses base URLs.
The main purpose of this PR is Windows daemon support via Task Scheduler.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add `zeroclaw providers` CLI command that lists all 28 supported AI providers
- Each entry shows: config ID, display name, local/cloud tag, active marker, and aliases
- Also shows `custom:<URL>` and `anthropic-custom:<URL>` escape hatches at the bottom
Previously, users had no way to discover available providers without reading the source code. The
unknown-provider error message suggests `run zeroclaw onboard --interactive` but doesn't list
options. This command gives immediate visibility.
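Roughly how an entry line could be rendered, with a stand-in registry type
(the real provider registry and its fields look different):

```rust
/// Illustrative registry entry; field names are assumptions.
struct ProviderEntry {
    id: &'static str,
    display_name: &'static str,
    local: bool,
    aliases: &'static [&'static str],
}

fn print_providers(entries: &[ProviderEntry], active_id: &str) {
    for e in entries {
        let tag = if e.local { "local" } else { "cloud" };
        let marker = if e.id == active_id { "*" } else { " " };
        println!(
            "{marker} {:<16} {:<24} [{tag}]  aliases: {}",
            e.id,
            e.display_name,
            e.aliases.join(", ")
        );
    }
    // Escape hatches for arbitrary compatible endpoints.
    println!("  custom:<URL>            OpenAI-compatible endpoint");
    println!("  anthropic-custom:<URL>  Anthropic-compatible endpoint");
}
```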
Integrate cloud endpoint behavior into existing ollama provider flow, avoid a separate standalone doc, and keep configuration minimal via api_url/api_key.
Also align reply_target and memory trait call sites needed for current baseline compatibility.
* fix(providers): add CN/global endpoint variants for Chinese vendors
* fix(onboard): deduplicate provider key-url match arms
* chore(i18n): normalize non-English literals to English
The existing Copilot provider passes a static Bearer token, but the
Copilot API requires short-lived session tokens obtained via GitHub's
OAuth device code flow, plus mandatory editor headers.
This replaces the stub with a dedicated CopilotProvider that:
- Runs the OAuth device code flow on first use (same client ID as VS Code)
- Exchanges the OAuth token for a Copilot API key via
api.github.com/copilot_internal/v2/token
- Sends required Editor-Version/Editor-Plugin-Version headers
- Caches tokens to disk (~/.config/zeroclaw/copilot/) with auto-refresh
- Uses Mutex to prevent concurrent refresh races / duplicate device prompts
- Writes token files with 0600 permissions (owner-only)
- Respects GitHub's polling interval and code expiry from device flow
- Sanitizes error messages to prevent token leakage
- Uses async filesystem I/O (tokio::fs) throughout
- Optionally accepts a pre-supplied GitHub token via config api_key
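A condensed sketch of the token exchange step only; the device code flow,
caching, and refresh logic are omitted, and the header values and response
field names here are assumptions:

```rust
use serde::Deserialize;

/// Field names are assumptions about the copilot_internal token response.
#[derive(Deserialize)]
struct CopilotToken {
    token: String,
    expires_at: u64,
}

async fn copilot_session_token(
    client: &reqwest::Client,
    github_oauth_token: &str,
) -> anyhow::Result<CopilotToken> {
    // Exchange the long-lived GitHub OAuth token for a short-lived Copilot key.
    let resp = client
        .get("https://api.github.com/copilot_internal/v2/token")
        .bearer_auth(github_oauth_token)
        // The Copilot API rejects requests that lack editor identification;
        // these header values are placeholders, not pinned versions.
        .header("Editor-Version", "vscode/1.96.0")
        .header("Editor-Plugin-Version", "copilot/1.0.0")
        .header("User-Agent", "zeroclaw")
        .send()
        .await?
        .error_for_status()?;
    Ok(resp.json::<CopilotToken>().await?)
}
```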
Fixes: 403 'Access to this endpoint is forbidden'
Fixes: 400 'missing Editor-Version header for IDE auth'
Add Astrai (https://as-trai.com) as a first-class OpenAI-compatible
provider. Astrai is an AI inference router with built-in cost
optimization, PII stripping, and compliance logging.
- Register ASTRAI_API_KEY env var in resolve_api_key
- Add "astrai" entry in provider factory → as-trai.com/v1
- Add factory_astrai unit test
- Add Astrai to compatible provider test list
- Update README provider count (22+ → 23+) and list
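The two registration points, sketched in simplified form (the real
key-resolution and factory functions differ):

```rust
/// Map a provider id to its API-key environment variable.
fn provider_env_var(provider: &str) -> Option<&'static str> {
    match provider {
        "astrai" => Some("ASTRAI_API_KEY"),
        // ...existing providers elided...
        _ => None,
    }
}

fn astrai_base_url() -> &'static str {
    // Astrai speaks the OpenAI-compatible protocol at this base URL.
    "https://as-trai.com/v1"
}
```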
Co-authored-by: Maya Walcher <maya.walcher@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
fix(misc): complete parking_lot::Mutex migration (fixes #505)
- DiscordChannel: store actual channel_id in ChannelMessage.channel
instead of hardcoded "discord" string
- channels/mod.rs: use msg.channel instead of msg.sender for replies
- Migrate all std::sync::Mutex to parking_lot::Mutex:
* src/security/audit.rs
* src/memory/sqlite.rs
* src/memory/response_cache.rs
* src/memory/lucid.rs
* src/channels/email_channel.rs
* src/gateway/mod.rs
* src/observability/traits.rs
* src/providers/reliable.rs
* src/providers/router.rs
* src/agent/agent.rs
- Remove all .lock().unwrap() and .map_err(PoisonError) patterns
since parking_lot::Mutex never poisons
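The call-site change in miniature, with a stand-in type:

```rust
use parking_lot::Mutex;

struct Counter {
    inner: Mutex<u64>,
}

impl Counter {
    fn bump(&self) -> u64 {
        // std::sync::Mutex required self.inner.lock().unwrap() (poisoning had
        // to be handled); parking_lot never poisons, so lock() returns the
        // guard directly.
        let mut n = self.inner.lock();
        *n += 1;
        *n
    }
}
```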
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add `lmstudio` / `lm-studio` as a built-in provider alias for local LM Studio instances
(`http://localhost:1234/v1`)
- Uses a dummy API key when none is provided, since LM Studio does not require authentication
- Users can connect to remote LM Studio instances via `custom:http://<ip>:1234/v1`
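A sketch of the alias handling; the function shape and the dummy key value are
illustrative, not the factory's exact code:

```rust
fn lmstudio_config(api_key: Option<String>) -> (String, String) {
    // Default local LM Studio endpoint; remote instances go through custom:<URL>.
    let base_url = "http://localhost:1234/v1".to_string();
    // LM Studio does not check authentication, but the OpenAI-compatible
    // client still sends a bearer token, so substitute a dummy value.
    let api_key = api_key.unwrap_or_else(|| "lm-studio".to_string());
    (base_url, api_key)
}
```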
Add ProviderCapabilities struct to enable runtime detection of
provider-specific features, starting with native tool calling support.
This is a foundational change that enables future PRs to implement
intelligent tool calling mode selection (native vs prompt-guided).
Changes:
- Add ProviderCapabilities struct with native_tool_calling field
- Add capabilities() method to Provider trait with default impl
- Add unit tests for capabilities equality and defaults
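Close to what the commit describes, though the definitions below are a
simplified sketch rather than the exact code:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct ProviderCapabilities {
    /// True when the provider accepts structured tool definitions and returns
    /// structured tool calls, rather than relying on prompt text.
    pub native_tool_calling: bool,
}

pub trait Provider {
    /// Default: no special capabilities; providers override as needed.
    fn capabilities(&self) -> ProviderCapabilities {
        ProviderCapabilities::default()
    }
}
```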
Why:
- The current design cannot distinguish providers that support native tool calling from those that do not
- Needed to enable Gemini/Anthropic/OpenAI native function calling
- Fully backward compatible (all providers inherit default)
What did NOT change:
- No existing Provider methods modified
- No behavior changes for existing code
- Zero breaking changes
Testing:
- cargo test: all tests passed
- cargo fmt: pass
- cargo clippy: pass