Commit graph

113 commits

Vernon Stinebaker
3b0133596c feat(providers): add native tool calling for OpenAI-compatible providers
Implement chat_with_tools() on CompatibleProvider so OpenAI-compatible
endpoints (OpenRouter, local LLMs, etc.) can use structured tool calling
instead of prompt-injected tool descriptions.

Changes:
- CompatibleProvider: capabilities() reports native_tool_calling, new
  chat_with_tools() sends tools in API request and parses tool_calls
  from response, chat() bridges to chat_with_tools() when ToolSpecs
  are provided
- RouterProvider: chat_with_tools() delegation with model hint resolution
- loop_.rs: expose tools_to_openai_format as pub(crate), add
  tools_to_openai_format_from_specs for ToolSpec-based conversion

Adds 9 new tests and updates 1 existing test.
2026-02-18 18:06:36 +08:00
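The `tools_to_openai_format` conversion this commit exposes is not shown in the log. As a rough sketch — with a stand-in `ToolSpec` type and hand-rolled JSON rather than the project's actual serde-based code — each spec becomes one entry in OpenAI's `tools` array, of type `function`:

```rust
// Stand-in for the project's ToolSpec; the real type lives in loop_.rs
// and the real conversion would use serde_json, not string formatting.
struct ToolSpec {
    name: String,
    description: String,
    parameters_json: String, // pre-serialized JSON Schema for the arguments
}

// Render one spec as an OpenAI-style `tools` array entry.
fn to_openai_tool(spec: &ToolSpec) -> String {
    format!(
        r#"{{"type":"function","function":{{"name":"{}","description":"{}","parameters":{}}}}}"#,
        spec.name, spec.description, spec.parameters_json
    )
}
```

The provider then sends this array in the request body and reads structured `tool_calls` back from the response instead of parsing tool invocations out of prose.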
Chummy
bc5b1a7841 fix(providers): harden reasoning_content fallback behavior 2026-02-18 17:07:38 +08:00
Vernon Stinebaker
dd4f5271d1 feat(providers): support reasoning_content fallback for thinking models
Reasoning/thinking models (Qwen3, GLM-4, DeepSeek, etc.) may return
output in `reasoning_content` instead of `content`. Add automatic
fallback for both OpenAI and OpenAI-compatible providers, including
streaming SSE support.

Changes:
- Add `reasoning_content` field to response structs in both providers
- Add `effective_content()` helper that prefers `content` but falls
  back to `reasoning_content` when content is empty/null/missing
- Update all extraction sites to use `effective_content()`
- Add streaming SSE fallback for `reasoning_content` chunks
- Add 16 focused unit tests covering all edge cases

Tested end-to-end against GLM-4.7-flash via local LLM server.
2026-02-18 17:07:38 +08:00
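The fallback rule above can be sketched in a few lines. The helper name mirrors the commit message, but the signature and the empty/whitespace handling are assumptions, not the project's actual code:

```rust
// Prefer `content`; fall back to `reasoning_content` when content is
// empty, whitespace-only, or missing entirely.
fn effective_content(content: Option<&str>, reasoning_content: Option<&str>) -> Option<String> {
    match content {
        // `content` wins whenever it carries real text.
        Some(c) if !c.trim().is_empty() => Some(c.to_string()),
        // Otherwise try `reasoning_content`, with the same emptiness check.
        _ => reasoning_content
            .filter(|r| !r.trim().is_empty())
            .map(|r| r.to_string()),
    }
}
```

Every extraction site (including the streaming SSE path) routes through one helper like this, so the fallback behaves identically everywhere.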
Alex Gorevski
9a6fa76825 re-add tests, remove markdown files 2026-02-18 14:42:39 +08:00
Edvard
1336c2f03e feat(providers): add warmup() for OpenAI, Anthropic, Gemini, Compatible, GLM
All five providers have HTTP clients but did not implement warmup(),
relying on the trait default no-op. This adds lightweight warmup calls
to establish TLS + HTTP/2 connection pools on startup, reducing
first-request latency. Each warmup is skipped when credentials are
absent, matching the OpenRouter pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 14:35:03 +08:00
Chummy
da7c21f469 style(anthropic): format cache conversation test block 2026-02-18 14:29:50 +08:00
tercerapersona
455eb3b847 feat: add prompt caching support to Anthropic provider
Implements Anthropic's prompt caching API to enable significant cost
reduction (up to 90%) and latency improvements (up to 85%) for
requests with repeated content.

Key features:
- Auto-caching heuristics: large system prompts (>3KB), tool
  definitions, and long conversations (>4 messages)
- Full backward compatibility: cache_control fields are optional
- Supports both string and block-array system prompt formats
- Cache control on all content types (text, tool_use, tool_result)

Implementation details:
- Added CacheControl, SystemPrompt, and SystemBlock structures
- Updated NativeContentOut and NativeToolSpec with cache_control
- Strategic cache breakpoint placement (last tool, last message)
- Comprehensive test coverage for serialization and heuristics

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
(cherry picked from commit fff04f4edb5e4cb7e581b1b16035da8cc2e55cef)
2026-02-18 14:29:50 +08:00
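The auto-caching heuristics reduce to a simple predicate. The 3KB and 4-message thresholds come straight from the commit message; the function name and shape are illustrative, not the actual implementation:

```rust
// Decide whether to inject cache_control breakpoints automatically.
fn should_auto_cache(system_prompt: &str, message_count: usize, has_tools: bool) -> bool {
    system_prompt.len() > 3 * 1024  // large system prompts amortize well
        || has_tools                // tool definitions repeat on every request
        || message_count > 4        // long conversations repeat their prefix
}
```

When the predicate fires, breakpoints are placed at the stable prefix boundaries the commit names (last tool definition, last message), so repeated requests hit the cache for everything before them.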
Chummy
d42cb1e906 fix(auth): rebase PR #200 onto main and restore auth CLI flow 2026-02-18 12:57:44 +08:00
Codex
e8aa63822a fix PR #200 review issues 2026-02-18 12:57:44 +08:00
Codex
39087a446d Fix OpenAI Codex contract, SSE parsing, and default xhigh reasoning 2026-02-18 12:57:44 +08:00
Codex
007368d586 feat(auth): add subscription auth profiles and codex/claude flows 2026-02-18 12:57:44 +08:00
Edvard
89d0fb9a1e feat(providers): implement chat_with_history for GLM provider
The GLM provider previously relied on the trait default for
chat_with_history, which only forwarded the last user message. This adds
a proper multi-turn implementation that sends the full conversation
history to the GLM API, matching the pattern used by OpenRouter, Ollama,
and other providers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 12:33:51 +08:00
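The difference between the trait default and a real multi-turn implementation can be shown with a stand-in message type (the real code builds an API payload rather than returning a `Vec`):

```rust
#[derive(Clone, PartialEq, Debug)]
struct Msg {
    role: String, // "user" / "assistant" / "system"
    content: String,
}

// What the trait default effectively did: keep only the final user turn,
// discarding all earlier context.
fn last_user_only(history: &[Msg]) -> Vec<Msg> {
    history
        .iter()
        .rev()
        .find(|m| m.role == "user")
        .cloned()
        .into_iter()
        .collect()
}

// What a proper chat_with_history does: ship every turn to the API.
fn full_history(history: &[Msg]) -> Vec<Msg> {
    history.to_vec()
}
```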
Chummy
2560399423 feat(observability): focus PR 596 on Prometheus backend 2026-02-18 12:06:05 +08:00
argenis de la rosa
eba544dbd4 feat(observability): implement Prometheus metrics backend with /metrics endpoint
- Adds PrometheusObserver backend with counters, histograms, and gauges
- Tracks agent starts/duration, tool calls, channel messages, heartbeat ticks, errors, request latency, tokens, sessions, queue depth
- Adds GET /metrics endpoint to gateway for Prometheus scraping
- Adds provider/model labels to AgentStart and AgentEnd events for better observability
- Adds as_any() method to Observer trait for backend-specific downcast

Metrics exposed:
- zeroclaw_agent_starts_total (Counter) with provider/model labels
- zeroclaw_agent_duration_seconds (Histogram) with provider/model labels
- zeroclaw_tool_calls_total (Counter) with tool/success labels
- zeroclaw_tool_duration_seconds (Histogram) with tool label
- zeroclaw_channel_messages_total (Counter) with channel/direction labels
- zeroclaw_heartbeat_ticks_total (Counter)
- zeroclaw_errors_total (Counter) with component label
- zeroclaw_request_latency_seconds (Histogram)
- zeroclaw_tokens_used_last (Gauge)
- zeroclaw_active_sessions (Gauge)
- zeroclaw_queue_depth (Gauge)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 12:06:05 +08:00
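What the `/metrics` endpoint returns is Prometheus's plain-text exposition format. A real backend would use a metrics crate rather than hand-formatting, but as a sketch of the wire format the names above are exposed in:

```rust
// Render one labeled sample in Prometheus text exposition format, e.g.
//   zeroclaw_tool_calls_total{tool="shell",success="true"} 3
fn render_counter(name: &str, labels: &[(&str, &str)], value: u64) -> String {
    let label_str = labels
        .iter()
        .map(|(k, v)| format!("{k}=\"{v}\""))
        .collect::<Vec<_>>()
        .join(",");
    if label_str.is_empty() {
        format!("{name} {value}")
    } else {
        format!("{name}{{{label_str}}} {value}")
    }
}
```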
Chummy
9e9a4a53ab style(gemini): apply rustfmt to oauth endpoint patch 2026-02-18 10:25:15 +08:00
KNIGHTABDO
1d8e57d388 fix(gemini): route OAuth tokens to cloudcode-pa.googleapis.com
Gemini CLI OAuth tokens are scoped for Google's internal Code Assist
API at cloudcode-pa.googleapis.com/v1internal, not the public
generativelanguage.googleapis.com/v1beta endpoint.

This commit:
- Routes OAuth requests to the correct internal endpoint
- Wraps the request payload with model metadata (internal API format)
- Keeps API key auth unchanged on the public endpoint

Fixes #578
2026-02-18 10:25:15 +08:00
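The routing decision reduces to a switch on the auth method. The two URLs are taken from the commit message; the enum and function are illustrative, not the provider's actual types:

```rust
enum GeminiAuth {
    OAuth,  // Gemini CLI OAuth token: scoped for the internal Code Assist API
    ApiKey, // plain API key: public Generative Language API
}

fn gemini_endpoint(auth: &GeminiAuth) -> &'static str {
    match auth {
        GeminiAuth::OAuth => "https://cloudcode-pa.googleapis.com/v1internal",
        GeminiAuth::ApiKey => "https://generativelanguage.googleapis.com/v1beta",
    }
}
```

On the OAuth path the request payload is additionally wrapped with model metadata to match the internal API's envelope, while the API-key path stays unchanged.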
Edvard
508fb53ac1 fix(provider): delegate native tool calling through ReliableProvider
ReliableProvider wraps underlying providers with retry/fallback logic
but did not delegate `supports_native_tools()` or `chat_with_tools()`.
This caused the agent loop to fall back to prompt-based tool calling
for all providers, even those with native tool support (OpenRouter,
OpenAI, Anthropic). Models like Gemini 2.0 Flash would then output
tool calls as text instead of structured API responses, breaking the
tool execution loop entirely.

Add `supports_native_tools()` delegation to the primary provider and
`chat_with_tools()` with the same retry/fallback logic as the existing
`chat_with_system()` and `chat_with_history()` methods.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:15:46 +08:00
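The retry/fallback shape the delegation reuses can be sketched synchronously (the real `ReliableProvider` is async and trait-based; names here are illustrative):

```rust
// Try the primary provider up to `retries` times, then delegate to the
// fallback provider once. chat_with_tools(), chat_with_system(), and
// chat_with_history() all follow this same pattern.
fn call_with_fallback<T, E>(
    retries: usize,
    mut primary: impl FnMut() -> Result<T, E>,
    mut fallback: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    for _ in 0..retries {
        if let Ok(v) = primary() {
            return Ok(v);
        }
    }
    // Every primary attempt failed: hand off to the fallback provider.
    fallback()
}
```

The bug was that `chat_with_tools()` never entered this wrapper at all, so the agent loop saw `supports_native_tools() == false` and fell back to prompt-injected tools for every provider.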
Maya Walcher
c830a513a5 fix(provider): address Astrai follow-up review from #486
- Add "astrai" to factory_all_providers_create_successfully test
- Add "astrai" => "ASTRAI_API_KEY" in provider_env_var() for onboarding
- Add Astrai to onboarding provider selection list (Gateway tier)
- Add provider_env_var("astrai") assertion in known_providers test

Addresses review comments from @chumyin on #486.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:00:32 +08:00
Will Sarg
3c4ed2e28e
fix(providers): clarify reliable failure entries for custom providers (#594)
* fix(workflows): standardize runner configuration for security jobs

* ci(actionlint): add Blacksmith runner label to config

Add blacksmith-2vcpu-ubuntu-2404 to actionlint self-hosted-runner labels config
to suppress "unknown label" warnings during workflow linting.

This label is used across all workflows after the Blacksmith migration.

* fix(actionlint): adjust indentation for self-hosted runner labels

* feat(security): enhance security workflow with CodeQL analysis steps

* fix(security): update CodeQL action to version 4 for improved analysis

* fix(security): remove duplicate permissions in security workflow

* fix(security): revert CodeQL action to v3 for stability

The v4 version was causing workflow file validation failures.
Reverting to proven v3 version that is working on main branch.

* fix(security): remove duplicate permissions causing workflow validation failure

The permissions block had duplicate security-events and actions keys,
which caused YAML validation errors and prevented workflow execution.

Fixes: workflow file validation failures on main branch

* fix(security): remove pull_request trigger to reduce costs

* fix(security): restore PR trigger but skip codeql on PRs

* fix(security): resolve YAML syntax error in security workflow

* refactor(security): split CodeQL into dedicated scheduled workflow

* fix(security): update workflow name to Rust Package Security Audit

* fix(codeql): remove push trigger, keep schedule and on-demand only

* feat(codeql): add CodeQL configuration file to ignore specific paths

* Potential fix for code scanning alert no. 39: Hard-coded cryptographic value

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

* fix(ci): resolve auto-response workflow merge markers

* fix(build): restore ChannelMessage reply_target usage

* ci(workflows): run workflow sanity on workflow pushes for all branches

* ci(workflows): rename auto-response workflow to PR Auto Responder

* ci(workflows): require owner approval for workflow file changes

* ci: add lint-first PR feedback gate

* ci(workflows): split label policy checks from workflow sanity

* ci(workflows): consolidate policy and rust workflow setup

* ci: add safe pull request intake sanity checks

* ci(security): switch audit to pinned rustsec audit-check

* fix(providers): clarify reliable failure entries for custom providers

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
2026-02-17 13:53:03 -05:00
argenis de la rosa
34af6a223a Merge remote-tracking branch 'origin/main' into feat/glm-provider
Resolved conflicts in:
- Cargo.toml: kept both `ring` (JWT auth) and `prost` (protobuf) dependencies
- src/onboard/wizard.rs: accepted main branch version
- src/providers/mod.rs: accepted main branch version
- Cargo.lock: accepted main branch version

Note: The custom `glm::GlmProvider` from this PR was replaced with
main's OpenAiCompatibleProvider approach for GLM, which uses base URLs.
The main purpose of this PR is Windows daemon support via Task Scheduler.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 13:27:58 -05:00
Chummy
f97f995ac0 refactor(provider): unify China alias families across modules 2026-02-18 01:01:57 +08:00
Chummy
ce23cbaeea fix(cli): harden providers listing and keep provider map aligned 2026-02-18 00:50:51 +08:00
reidliu41
feaa4aba60 feat(cli): add zeroclaw providers command to list supported providers
- Add `zeroclaw providers` CLI command that lists all 28 supported AI providers
- Each entry shows: config ID, display name, local/cloud tag, active marker, and aliases
- Also shows `custom:<URL>` and `anthropic-custom:<URL>` escape hatches at the bottom

Previously users had no way to discover available providers without reading source code. The
unknown-provider error message suggests `run zeroclaw onboard --interactive` but doesn't list
options. This command gives immediate visibility.
2026-02-18 00:50:51 +08:00
Chummy
0aa35eb669 fix(build): complete strict lint and test cleanup (replacement for #476) 2026-02-18 00:18:54 +08:00
Chummy
fc6e8eb521
fix(provider): follow-up CN/global consistency for Z.AI and aliases (#554)
* fix(provider): harden CN/global routing consistency for Chinese vendors

* fix(agent): migrate CLI channel send to SendMessage

* fix(onboard): deduplicate Z.AI key URL match arms
2026-02-18 00:04:56 +08:00
Chummy
d94d7baa14 feat(ollama): unify local and remote endpoint routing
Integrate cloud endpoint behavior into existing ollama provider flow, avoid a separate standalone doc, and keep configuration minimal via api_url/api_key.

Also align reply_target and memory trait call sites needed for current baseline compatibility.
2026-02-17 22:52:09 +08:00
Chummy
85de9b5625
fix(provider): split CN/global endpoints for Chinese provider variants (#542)
* fix(providers): add CN/global endpoint variants for Chinese vendors

* fix(onboard): deduplicate provider key-url match arms

* chore(i18n): normalize non-English literals to English
2026-02-17 22:51:51 +08:00
Chummy
b2690f6809 feat(provider): add native tool calling API (supersedes #450)
Co-authored-by: YubinghanBai <baiyubinghan@gmail.com>
2026-02-17 22:47:10 +08:00
Will Sarg
a62c7a5893 fix(clippy): satisfy strict delta lints in SSE streaming path 2026-02-17 09:26:21 -05:00
Will Sarg
9e0958dee5 fix(ci): repair parking_lot migration regressions in PR #535 2026-02-17 09:10:40 -05:00
Will Sarg
ee05d62ce4
Merge branch 'main' into pr-484-clean 2026-02-17 08:54:24 -05:00
Chummy
01c419bb57 test(providers): keep unicode boundary test in English text 2026-02-17 21:51:58 +08:00
Khoi Tran
3c62b59a72 fix(copilot): add proper OAuth device-flow authentication
The existing Copilot provider passes a static Bearer token, but the
Copilot API requires short-lived session tokens obtained via GitHub's
OAuth device code flow, plus mandatory editor headers.

This replaces the stub with a dedicated CopilotProvider that:

- Runs the OAuth device code flow on first use (same client ID as VS Code)
- Exchanges the OAuth token for a Copilot API key via
  api.github.com/copilot_internal/v2/token
- Sends required Editor-Version/Editor-Plugin-Version headers
- Caches tokens to disk (~/.config/zeroclaw/copilot/) with auto-refresh
- Uses Mutex to prevent concurrent refresh races / duplicate device prompts
- Writes token files with 0600 permissions (owner-only)
- Respects GitHub's polling interval and code expiry from device flow
- Sanitizes error messages to prevent token leakage
- Uses async filesystem I/O (tokio::fs) throughout
- Optionally accepts a pre-supplied GitHub token via config api_key

Fixes: 403 'Access to this endpoint is forbidden'
Fixes: 400 'missing Editor-Version header for IDE auth'
2026-02-17 21:51:58 +08:00
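The polling half of the device flow can be sketched with network I/O abstracted behind a closure and time simulated rather than slept, so it stays testable. The enum variants follow RFC 8628 / GitHub's device-flow responses; everything else is illustrative:

```rust
enum DevicePoll {
    AuthorizationPending,  // user has not approved yet: keep polling
    SlowDown,              // server asks us to add 5s to the interval
    Token(String),         // success: exchangeable for a Copilot session token
    Denied,                // user rejected the request
}

fn poll_device_flow(
    mut poll_once: impl FnMut() -> DevicePoll,
    mut interval_s: u64,
    expires_in_s: u64,
) -> Option<String> {
    let mut elapsed = 0;
    while elapsed < expires_in_s {
        match poll_once() {
            DevicePoll::Token(t) => return Some(t),
            DevicePoll::Denied => return None,
            DevicePoll::SlowDown => interval_s += 5, // back off as instructed
            DevicePoll::AuthorizationPending => {}
        }
        elapsed += interval_s; // real code sleeps interval_s between polls
    }
    None // device code expired before the user approved
}
```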
beee003
8ad5b6146b
feat: add Astrai as a named provider (#486)
Add Astrai (https://as-trai.com) as a first-class OpenAI-compatible
provider. Astrai is an AI inference router with built-in cost
optimization, PII stripping, and compliance logging.

- Register ASTRAI_API_KEY env var in resolve_api_key
- Add "astrai" entry in provider factory → as-trai.com/v1
- Add factory_astrai unit test
- Add Astrai to compatible provider test list
- Update README provider count (22+ → 23+) and list

Co-authored-by: Maya Walcher <maya.walcher@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 08:22:38 -05:00
argenis de la rosa
1908af3248 fix(discord): use channel_id instead of sender for replies (fixes #483)
fix(misc): complete parking_lot::Mutex migration (fixes #505)

- DiscordChannel: store actual channel_id in ChannelMessage.channel
  instead of hardcoded "discord" string
- channels/mod.rs: use msg.channel instead of msg.sender for replies
- Migrate all std::sync::Mutex to parking_lot::Mutex:
  * src/security/audit.rs
  * src/memory/sqlite.rs
  * src/memory/response_cache.rs
  * src/memory/lucid.rs
  * src/channels/email_channel.rs
  * src/gateway/mod.rs
  * src/observability/traits.rs
  * src/providers/reliable.rs
  * src/providers/router.rs
  * src/agent/agent.rs
- Remove all .lock().unwrap() and .map_err(PoisonError) patterns
  since parking_lot::Mutex never poisons

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 08:05:25 -05:00
reidliu41
77640e2198 feat(provider): add LM Studio provider alias
- Add `lmstudio` / `lm-studio` as a built-in provider alias for local LM Studio instances
(`http://localhost:1234/v1`)
- Uses a dummy API key when none is provided, since LM Studio does not require authentication
- Users can connect to remote LM Studio instances via `custom:http://<ip>:1234/v1`
2026-02-17 20:02:40 +08:00
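The alias resolution and the `custom:` escape hatch compose naturally in the provider factory. A minimal sketch, with only the arms relevant to this commit (the real factory handles dozens of providers):

```rust
// Map a provider name to its OpenAI-compatible base URL.
fn resolve_base_url(provider: &str) -> Option<String> {
    // `custom:<URL>` bypasses the alias table entirely.
    if let Some(url) = provider.strip_prefix("custom:") {
        return Some(url.to_string());
    }
    match provider {
        "lmstudio" | "lm-studio" => Some("http://localhost:1234/v1".to_string()),
        _ => None, // the real factory matches many more providers here
    }
}
```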
DeadManAI
4fca1abee8 fix: resolve all clippy warnings, formatting, and Mistral endpoint
- Fix Mistral provider base URL (missing /v1 prefix caused 404s)
- Resolve 55 clippy warnings across 28 warning types
- Apply cargo fmt to 44 formatting violations
- Remove unused imports (process_message, MultiObserver, VerboseObserver,
  ChatResponse, ToolCall, Path, TempDir)
- Replace format!+push_str with write! macro
- Fix unchecked Duration subtraction, redundant closures, clamp patterns
- Declare missing feature flags (sandbox-landlock, sandbox-bubblewrap,
  browser-native) in Cargo.toml
- Derive Default where manual impls were redundant
- Add separators to long numeric literals (115200 → 115_200)
- Restructure unreachable code in arduino_flash platform branches

All 1,500 tests pass. Zero clippy warnings. Clean formatting.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 20:00:08 +08:00
Chummy
0087bcc496 fix(security): resolve rebase conflicts and provider regressions 2026-02-17 19:19:06 +08:00
Chummy
5d131a8903 fix(security): tighten provider credential log hygiene
- remove as_deref credential routing path in provider factory
- avoid raw provider error text in warmup/retry failure summaries
- keep retry telemetry while reducing secret propagation risk
2026-02-17 19:19:06 +08:00
Chummy
a1bb72767a fix(security): remove provider init error detail logging 2026-02-17 19:19:06 +08:00
Chummy
e5a8cd3f57 fix(ci): suppress option_as_ref_deref on credential refs 2026-02-17 19:19:06 +08:00
Chummy
a6ca68a4fb fix(ci): satisfy strict lint delta on security follow-ups 2026-02-17 19:19:06 +08:00
Chummy
60d81fb706 fix(security): reduce residual CodeQL logging flows
- remove secret-presence logging path in gateway startup output
- reduce credential-derived warning path in provider fallback setup
- avoid as_deref credential propagation in delegate/provider wiring
- harden Composio error rendering to avoid raw body leakage
- simplify onboarding secrets status output to non-sensitive wording
2026-02-17 19:19:06 +08:00
Chummy
1711f140be fix(security): remediate unassigned CodeQL findings
- harden URL/request handling for composio and whatsapp integrations
- reduce cleartext logging exposure across providers/tools/gateway
- hash and constant-time compare gateway webhook secrets
- expand nested secret encryption coverage in config
- align feature aliases and add regression tests for security paths
- fix bubblewrap all-features test invocation surfaced during deep validation
2026-02-17 19:19:06 +08:00
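The constant-time comparison for webhook secrets is worth spelling out, since a naive `==` short-circuits at the first mismatched byte and leaks position via timing. A minimal sketch of the idea (production code should prefer a vetted crate such as `subtle`):

```rust
// Compare two byte strings without early exit on mismatch: XOR every byte
// pair and accumulate, so the running time depends only on the length.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false; // length is not secret here, so bailing early is fine
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}
```

Hashing the stored secret first (as the commit does) adds a second layer: even a timing leak then reveals only the hash, not the secret itself.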
Chummy
f9d681063d fix(fmt): align providers test formatting with rustfmt 2026-02-17 19:10:09 +08:00
Chummy
e9e45acd6d providers: map native tool support from capabilities 2026-02-17 18:59:04 +08:00
YubinghanBai
b5869d424e feat(provider): add capabilities detection mechanism
Add ProviderCapabilities struct to enable runtime detection of
provider-specific features, starting with native tool calling support.

This is a foundational change that enables future PRs to implement
intelligent tool calling mode selection (native vs prompt-guided).

Changes:
- Add ProviderCapabilities struct with native_tool_calling field
- Add capabilities() method to Provider trait with default impl
- Add unit tests for capabilities equality and defaults

Why:
- Current design cannot distinguish providers with native tool calling
- Needed to enable Gemini/Anthropic/OpenAI native function calling
- Fully backward compatible (all providers inherit default)

What did NOT change:
- No existing Provider methods modified
- No behavior changes for existing code
- Zero breaking changes

Testing:
- cargo test: all tests passed
- cargo fmt: pass
- cargo clippy: pass
2026-02-17 18:59:04 +08:00
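The mechanism is small enough to sketch in full. Struct and method names follow the commit message; the example providers are illustrative:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
struct ProviderCapabilities {
    native_tool_calling: bool,
}

trait Provider {
    // Default impl: providers that don't override this report no native
    // tool calling, which keeps every existing implementation compatible.
    fn capabilities(&self) -> ProviderCapabilities {
        ProviderCapabilities::default()
    }
}

struct LegacyProvider;
impl Provider for LegacyProvider {} // inherits the default

struct NativeToolProvider;
impl Provider for NativeToolProvider {
    fn capabilities(&self) -> ProviderCapabilities {
        ProviderCapabilities { native_tool_calling: true }
    }
}
```

The agent loop can then branch on `capabilities().native_tool_calling` to choose native structured tool calls versus prompt-guided ones at runtime.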
Chummy
42fa802bad fix(ollama): sanitize provider payload logging 2026-02-17 18:48:45 +08:00
Kieran
1c0d7bbcb8 feat: ollama tools 2026-02-17 18:48:45 +08:00
Kieran
808450c48e feat: custom global api_url 2026-02-17 18:48:45 +08:00