146 commits

Author SHA1 Message Date
Chummy
772bb15ed9 fix(tests): stabilize issue #868 model refresh regression 2026-02-19 19:15:08 +08:00
Aleksandr Prilipko
5dd11e6b0f fix(provider): use output_text content type for assistant messages in Codex history
The OpenAI Responses API requires assistant messages to use content type
"output_text" while user messages use "input_text". The prior implementation
used "input_text" for both roles, causing 400 errors on multi-turn history.

Extract build_responses_input() helper for testability and add 3 unit tests
covering role→content-type mapping, default instructions, and unknown roles.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 19:04:02 +08:00
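The role-to-content-type rule described in commit 5dd11e6b0f can be sketched as follows. This is a minimal illustration, not the project's build_responses_input(): the function name content_type_for_role is hypothetical, and sending unknown roles as "input_text" is an assumption, since the commit message does not say how they are handled.

```rust
/// Hypothetical helper: picks the Responses API content type for a chat role.
/// Assumption: any role other than "assistant" is sent as "input_text".
fn content_type_for_role(role: &str) -> &'static str {
    match role {
        // Assistant turns in the history must use "output_text".
        "assistant" => "output_text",
        // User turns (and, by assumption, everything else) use "input_text".
        _ => "input_text",
    }
}
```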
Aleksandr Prilipko
1b57be7223 fix(provider): implement chat_with_history for OpenAI Codex and Gemini
Both providers only implemented chat_with_system, so the default
chat_with_history trait method was discarding all conversation history
except the last user message. This caused the Telegram bot to lose
context between messages.

Changes:
- OpenAiCodexProvider: extract send_responses_request helper, add
  chat_with_history that maps full ChatMessage history to ResponsesInput
- GeminiProvider: extract send_generate_content helper, add
  chat_with_history that maps ChatMessage history to Gemini Content
  (with assistant→model role mapping)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 19:04:02 +08:00
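To illustrate commit 1b57be7223: the sketch below contrasts the lossy default (keeping only the last user message) with a full-history mapping that renames the assistant role to model for Gemini. The ChatMessage struct and all function names here are simplified stand-ins, not the project's types, and passing other roles through unchanged is an assumption.

```rust
/// Simplified stand-in for the project's ChatMessage (assumed shape).
#[derive(Debug, Clone, PartialEq)]
struct ChatMessage {
    role: String,
    content: String,
}

/// What the commit says the default did: keep only the last user message.
fn last_user_message(history: &[ChatMessage]) -> Option<&ChatMessage> {
    history.iter().rev().find(|m| m.role == "user")
}

/// Hypothetical role mapping for Gemini: "assistant" becomes "model".
/// Assumption: every other role is passed through unchanged.
fn gemini_role(role: &str) -> &str {
    if role == "assistant" {
        "model"
    } else {
        role
    }
}

/// Maps the whole history to (role, text) pairs instead of dropping turns.
fn to_gemini_turns(history: &[ChatMessage]) -> Vec<(String, String)> {
    history
        .iter()
        .map(|m| (gemini_role(&m.role).to_string(), m.content.clone()))
        .collect()
}
```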
Chummy
a0098de28c fix(bedrock): normalize aws-bedrock alias and harden docs/tests 2026-02-19 19:01:45 +08:00
KevinZhao
0e4e0d590d feat(provider): add dedicated AWS Bedrock Converse API provider
Replace the non-functional OpenAI-compatible stub with a purpose-built
Bedrock provider that implements AWS SigV4 signing from first principles
using hmac/sha2/hex crates — no AWS SDK dependency.

Key capabilities:
- SigV4 authentication (AKSK + optional session token)
- Converse API with native tool calling support
- Prompt caching via cachePoint heuristics
- Proper URI encoding for model IDs containing colons
- Resilient response parsing with unknown block type fallback

Also updates:
- Factory wiring and credential resolution bypass for AKSK auth
- Onboard wizard with Bedrock-specific model selection and guidance
- Provider reference docs with auth, region, and model ID details

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-19 19:01:45 +08:00
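The URI-encoding point in commit 0e4e0d590d (model IDs containing colons) can be illustrated with a small percent-encoder. This is a sketch using only the standard library; the function name encode_path_segment is hypothetical, the model ID in the usage note is only an example, and the actual provider may encode differently.

```rust
/// Hypothetical helper: percent-encodes one URI path segment, leaving only
/// the RFC 3986 unreserved characters (A-Z, a-z, 0-9, '-', '_', '.', '~') as-is.
fn encode_path_segment(segment: &str) -> String {
    let mut out = String::with_capacity(segment.len());
    for byte in segment.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            // Everything else, including ':', becomes %XX with uppercase hex.
            other => out.push_str(&format!("%{other:02X}")),
        }
    }
    out
}
```

With this sketch, a model ID such as anthropic.claude-v2:1 is encoded as anthropic.claude-v2%3A1 before it is placed in the request path.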
Chummy
ba018a38ef chore(provider): normalize fallback test comments to ASCII punctuation 2026-02-19 18:43:45 +08:00
Chummy
435c33d408 fix(provider): preserve fallback runtime options when resolving credentials 2026-02-19 18:43:45 +08:00
Vernon Stinebaker
bb22bdc8fb fix(provider): resolve fallback provider credentials independently
Fallback providers in create_resilient_provider_with_options() were
created via create_provider_with_options() which passed the primary
provider's api_key as credential_override. This caused
resolve_provider_credential() to short-circuit on the override and
never check the fallback provider's own env var (e.g. DEEPSEEK_API_KEY
for a deepseek fallback), resulting in auth failures (401) when the
primary and fallback use different API services.

Switch to create_provider_with_url(fallback, None, None) so each
fallback resolves its own credential via provider-specific env vars.
This also enables custom: URL prefixes (e.g.
custom:http://host.docker.internal:1234/v1) to work as fallback
entries, which was previously impossible through the options path.

Add three focused tests covering independent credential resolution,
custom URL fallbacks, and mixed fallback chains.
2026-02-19 18:43:45 +08:00
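The short-circuit described in commit bb22bdc8fb can be sketched like this. The function below is a simplified stand-in for resolve_provider_credential(): its signature, the HashMap used in place of the process environment, and the "<PROVIDER>_API_KEY" naming pattern are all assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Simplified stand-in for credential resolution (assumed signature).
/// `env` replaces the real process environment so the sketch is testable.
fn resolve_credential(
    provider: &str,
    credential_override: Option<&str>,
    env: &HashMap<String, String>,
) -> Option<String> {
    // An override wins immediately; the provider's own env var is never read.
    if let Some(key) = credential_override {
        return Some(key.to_string());
    }
    // Assumption: provider-specific variables follow "<PROVIDER>_API_KEY".
    let var = format!("{}_API_KEY", provider.to_uppercase());
    env.get(&var).cloned()
}
```

Passing the primary provider's key as the override returns that key for a deepseek fallback, which is the 401 scenario from the commit message; passing None lets the fallback pick up DEEPSEEK_API_KEY.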
Chummy
1aec9ad9c0 fix(rebase): resolve duplicate tests and gateway AppState fields 2026-02-19 18:03:09 +08:00
Chummy
d6dca4b890 fix(provider): align native tool system-flattening and add regressions 2026-02-19 17:44:07 +08:00
Chummy
ff254b4bb3 fix(provider): harden think-tag fallback and add edge-case tests 2026-02-19 16:54:52 +08:00
YubinghanBai
db7b24b319 fix(provider): strip <think> tags and merge system messages for MiniMax
MiniMax API rejects role: system in the messages array with error
2013 (invalid message role: system). In channel mode, the history
builder prepends a system message and optionally appends a second
one for delivery instructions, causing 400 errors on every channel
turn.

Additionally, MiniMax reasoning models embed chain-of-thought in
the content field as <think>...</think> blocks rather than using
the separate reasoning_content field, causing raw thinking output
to leak into user-visible responses.

Changes:
- Add merge_system_into_user flag to OpenAiCompatibleProvider;
  when set, all system messages are concatenated and prepended to
  the first user message before sending to the API
- Add new_merge_system_into_user() constructor used by MiniMax
- Add strip_think_tags() helper that removes <think>...</think>
  blocks from response content before returning to the caller
- Apply strip_think_tags in effective_content() and
  effective_content_optional() so all non-streaming paths are covered
- Update MiniMax factory registration to use new_merge_system_into_user
- Fix pre-existing rustfmt violation on apply_auth_header call

All other providers continue to use the default path unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 16:54:52 +08:00
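The two behaviors added in commit db7b24b319 can be sketched with the standard library alone. These are illustrations under stated assumptions, not the project's code: the Msg struct is a simplified stand-in, dropping the remainder after an unclosed <think> tag is an assumption, and so are the blank-line separator and the handling of a history with no user message.

```rust
/// Removes <think>...</think> blocks from a response string.
/// Assumption: text after an unclosed <think> tag is dropped.
fn strip_think_tags(content: &str) -> String {
    let mut out = String::with_capacity(content.len());
    let mut rest = content;
    while let Some(start) = rest.find("<think>") {
        out.push_str(&rest[..start]);
        match rest[start..].find("</think>") {
            // Skip past the closing tag and keep scanning.
            Some(end) => rest = &rest[start + end + "</think>".len()..],
            None => {
                rest = "";
                break;
            }
        }
    }
    out.push_str(rest);
    out.trim().to_string()
}

/// Simplified stand-in for a chat message (assumed shape).
#[derive(Debug, Clone, PartialEq)]
struct Msg {
    role: String,
    content: String,
}

/// Concatenates all system messages and prepends them to the first user message.
fn merge_system_into_user(messages: &[Msg]) -> Vec<Msg> {
    let system: Vec<&str> = messages
        .iter()
        .filter(|m| m.role == "system")
        .map(|m| m.content.as_str())
        .collect();
    let mut out: Vec<Msg> = messages.iter().filter(|m| m.role != "system").cloned().collect();
    if system.is_empty() {
        return out;
    }
    let prefix = system.join("\n\n");
    let first_user = out.iter().position(|m| m.role == "user");
    match first_user {
        Some(pos) => {
            let merged = format!("{prefix}\n\n{}", out[pos].content);
            out[pos].content = merged;
        }
        // Assumption: with no user message, the merged text becomes one.
        None => out.insert(0, Msg { role: "user".to_string(), content: prefix }),
    }
    out
}
```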
Chummy
1461b00ad1 fix(provider): fallback to responses on chat transport errors 2026-02-19 15:42:38 +08:00
Chummy
78e0594e5f fix(openai): align chat_with_tools with http client and strict tool parsing 2026-02-19 15:11:18 +08:00
Lucien Loiseau
f76c1226f1 fix(providers): implement chat_with_tools for OpenAiProvider
The OpenAiProvider overrode chat() with native tool support but never
overrode chat_with_tools(), which is the method called by
run_tool_call_loop in channel mode (IRC/Discord/etc). The trait default
for chat_with_tools() silently drops the tools parameter, sending plain
ChatRequest with no tools — causing the model to never use native tool
calls in channel mode.

Add chat_with_tools() override that deserializes tool specs, uses
convert_messages() for proper tool_call_id handling, and sends
NativeChatRequest with tools and tool_choice.

Also add Deserialize derive to NativeToolSpec and NativeToolFunctionSpec
to support deserialization from OpenAI-format JSON.
2026-02-19 15:11:18 +08:00
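The failure mode in commit f76c1226f1, a trait default that silently drops the tools parameter, can be reproduced in a few lines. The trait and both structs below are simplified, hypothetical stand-ins; the real Provider trait is async and uses richer types.

```rust
/// Simplified, synchronous stand-in for the provider trait (assumed shape).
trait Provider {
    fn chat(&self, prompt: &str) -> String;

    /// Default: ignores `tools`, mirroring the behavior the commit describes.
    fn chat_with_tools(&self, prompt: &str, _tools: &[&str]) -> String {
        self.chat(prompt)
    }
}

/// Overrides only chat(), so tools are lost on the chat_with_tools() path.
struct PlainProvider;

impl Provider for PlainProvider {
    fn chat(&self, prompt: &str) -> String {
        format!("chat({prompt}) tools=0")
    }
}

/// Overrides chat_with_tools(), so tools reach the request.
struct NativeToolProvider;

impl Provider for NativeToolProvider {
    fn chat(&self, prompt: &str) -> String {
        format!("chat({prompt}) tools=0")
    }

    fn chat_with_tools(&self, prompt: &str, tools: &[&str]) -> String {
        format!("chat({prompt}) tools={}", tools.len())
    }
}
```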
Jayson Reis
b9af601943 chore: Remove blocking read strings 2026-02-19 14:52:29 +08:00
Chummy
cf476a81c1 fix(provider): preserve native Ollama tool history structure 2026-02-19 14:32:43 +08:00
reidliu41
cd59dc65c4 fix(provider): enable native tool calling for OllamaProvider 2026-02-19 14:32:43 +08:00
Alex Gorevski
4a9fc9b6cc fix(security): prevent cleartext logging of sensitive data
Address CodeQL rust/cleartext-logging alerts by breaking data-flow taint
chains from sensitive variables (api_key, credential, session_id, user_id)
to log/print sinks. Changes include:

- Replace tainted profile IDs in println! with untainted local variables
- Add redact() helper for safe logging of sensitive values
- Redact account identifiers in auth status output
- Rename session_id locals in memory backends to break name-based taint
- Rename user_id/user_id_hint in channels to break name-based taint
- Custom Debug impl for ComputerUseConfig to redact api_key field
- Break taint chain in provider credential factory via string reconstruction
- Remove client IP from gateway rate-limit log messages
- Break taint on auth token extraction and wizard credential flow
- Rename composio account ref variable to break name-based taint

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-02-18 20:12:45 -08:00
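Two of the techniques listed in commit 4a9fc9b6cc, a redact() helper and a custom Debug impl, might look roughly like the sketch below. The masking policy (show four characters of values longer than eight) and the struct fields are assumptions; the commit message does not describe the real redact() output or the full ComputerUseConfig layout.

```rust
use std::fmt;

/// Hypothetical redaction policy: short values are fully masked,
/// longer ones keep a four-character prefix for identification.
fn redact(value: &str) -> String {
    let visible: String = value.chars().take(4).collect();
    if value.chars().count() <= 8 {
        "***".to_string()
    } else {
        format!("{visible}***")
    }
}

/// Stand-in for a config struct that holds a secret (fields are assumed).
struct ComputerUseConfigSketch {
    endpoint: String,
    api_key: Option<String>,
}

/// Manual Debug impl so `{:?}` never prints the api_key value.
impl fmt::Debug for ComputerUseConfigSketch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("ComputerUseConfigSketch")
            .field("endpoint", &self.endpoint)
            .field("api_key", &self.api_key.as_ref().map(|_| "[REDACTED]"))
            .finish()
    }
}
```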
Youhana Sheriff
b43e9eb325 fix(provider): polish kimi-code wiring and onboarding parity 2026-02-19 01:15:02 +08:00
Youhana Sheriff
cb91a2f914 feat(provider): add dedicated kimi-code provider support 2026-02-19 01:15:02 +08:00
Chummy
e8e9c0ea6c Revert "feat(provider): add dedicated kimi-code provider support"
This reverts commit 88dcd17a30.
2026-02-19 01:15:02 +08:00
Chummy
5563b755dc Revert "fix(provider): polish kimi-code wiring and onboarding parity"
This reverts commit 0b66ed026c.
2026-02-19 01:15:02 +08:00
Chummy
e1aeabdb5f fix(providers): align compatible chat client and response test 2026-02-18 22:50:02 +08:00
Chummy
b4b379e3e7 fix(providers): harden tool fallback and refresh model catalogs 2026-02-18 22:50:02 +08:00
Chummy
0bd2fbba2a feat(providers): add MiniMax OAuth credential flow 2026-02-18 22:31:20 +08:00
Chummy
0b66ed026c fix(provider): polish kimi-code wiring and onboarding parity 2026-02-18 22:22:10 +08:00
Chummy
88dcd17a30 feat(provider): add dedicated kimi-code provider support 2026-02-18 22:22:10 +08:00
Chummy
ce104bed45 feat(proxy): add scoped proxy configuration and docs runbooks
- add scope-aware proxy schema and runtime wiring for providers/channels/tools
- add agent-callable proxy_config tool for fast proxy setup
- standardize docs system with index, template, and playbooks
2026-02-18 22:10:42 +08:00
Chummy
58acf1efd3 fix(provider): surface actionable custom-provider failure diagnostics 2026-02-18 21:50:14 +08:00
Lucien Loiseau
6062888d1b feat(providers): add OVHcloud AI Endpoints as native provider
Route OVHcloud through OpenAiProvider (with proper tool_call_id
serialization) instead of OpenAiCompatibleProvider, fixing tool-call
round-trips against vLLM-based endpoints.

- Add base_url field and with_base_url() constructor to OpenAiProvider
- Replace all hardcoded api.openai.com URLs with self.base_url
- Pass api_url through for the openai provider arm
- Register ovhcloud/ovh provider with env var OVH_AI_ENDPOINTS_ACCESS_TOKEN
2026-02-18 20:54:49 +08:00
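The base_url change in commit 6062888d1b follows a common pattern, sketched below. The struct, the default URL with its /v1 suffix, and the chat_completions_url() helper are assumptions for illustration; only the with_base_url() name comes from the commit message, and its real signature may differ.

```rust
/// Stand-in for a provider with a configurable base URL (assumed shape).
struct OpenAiProviderSketch {
    base_url: String,
}

impl OpenAiProviderSketch {
    /// Assumed default; the commit only says api.openai.com was hardcoded.
    fn new() -> Self {
        Self::with_base_url("https://api.openai.com/v1")
    }

    /// Stores the base URL without a trailing slash.
    fn with_base_url(base_url: &str) -> Self {
        Self {
            base_url: base_url.trim_end_matches('/').to_string(),
        }
    }

    /// Builds request URLs from self.base_url instead of a literal host.
    fn chat_completions_url(&self) -> String {
        format!("{}/chat/completions", self.base_url)
    }
}
```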
Chummy
50fd5b81e1 fix(test): stabilize cron output capture and clippy cleanups 2026-02-18 20:29:26 +08:00
Chummy
c70d9b181d test: stabilize cron shell output capture and gemini warmup noop 2026-02-18 19:26:07 +08:00
Vernon Stinebaker
3b0133596c feat(providers): add native tool calling for OpenAI-compatible providers
Implement chat_with_tools() on CompatibleProvider so OpenAI-compatible
endpoints (OpenRouter, local LLMs, etc.) can use structured tool calling
instead of prompt-injected tool descriptions.

Changes:
- CompatibleProvider: capabilities() reports native_tool_calling, new
  chat_with_tools() sends tools in API request and parses tool_calls
  from response, chat() bridges to chat_with_tools() when ToolSpecs
  are provided
- RouterProvider: chat_with_tools() delegation with model hint resolution
- loop_.rs: expose tools_to_openai_format as pub(crate), add
  tools_to_openai_format_from_specs for ToolSpec-based conversion

Adds 9 new tests and updates 1 existing test.
2026-02-18 18:06:36 +08:00
Chummy
bc5b1a7841 fix(providers): harden reasoning_content fallback behavior 2026-02-18 17:07:38 +08:00
Vernon Stinebaker
dd4f5271d1 feat(providers): support reasoning_content fallback for thinking models
Reasoning/thinking models (Qwen3, GLM-4, DeepSeek, etc.) may return
output in `reasoning_content` instead of `content`. Add automatic
fallback for both OpenAI and OpenAI-compatible providers, including
streaming SSE support.

Changes:
- Add `reasoning_content` field to response structs in both providers
- Add `effective_content()` helper that prefers `content` but falls
  back to `reasoning_content` when content is empty/null/missing
- Update all extraction sites to use `effective_content()`
- Add streaming SSE fallback for `reasoning_content` chunks
- Add 16 focused unit tests covering all edge cases

Tested end-to-end against GLM-4.7-flash via local LLM server.
2026-02-18 17:07:38 +08:00
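The fallback rule from commit dd4f5271d1 (prefer `content`, fall back to `reasoning_content` when content is empty, null, or missing) can be sketched as below. The struct is a simplified stand-in for the real response types, and treating whitespace-only content as non-empty is an assumption.

```rust
/// Simplified stand-in for a provider response message (assumed shape).
struct ResponseMessage {
    content: Option<String>,
    reasoning_content: Option<String>,
}

impl ResponseMessage {
    /// Prefers `content`; falls back to `reasoning_content` when content
    /// is missing or empty. Returns an empty string if both are absent.
    fn effective_content(&self) -> String {
        match self.content.as_deref() {
            Some(text) if !text.is_empty() => text.to_string(),
            _ => self.reasoning_content.clone().unwrap_or_default(),
        }
    }
}
```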
Alex Gorevski
9a6fa76825 re-add tests, remove markdown files 2026-02-18 14:42:39 +08:00
Edvard
1336c2f03e feat(providers): add warmup() for OpenAI, Anthropic, Gemini, Compatible, GLM
All five providers have HTTP clients but did not implement warmup(),
relying on the trait default no-op. This adds lightweight warmup calls
to establish TLS + HTTP/2 connection pools on startup, reducing
first-request latency. Each warmup is skipped when credentials are
absent, matching the OpenRouter pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 14:35:03 +08:00
Chummy
da7c21f469 style(anthropic): format cache conversation test block 2026-02-18 14:29:50 +08:00
tercerapersona
455eb3b847 feat: add prompt caching support to Anthropic provider
Implements Anthropic's prompt caching API to enable significant cost
reduction (up to 90%) and latency improvements (up to 85%) for
requests with repeated content.

Key features:
- Auto-caching heuristics: large system prompts (>3KB), tool
  definitions, and long conversations (>4 messages)
- Full backward compatibility: cache_control fields are optional
- Supports both string and block-array system prompt formats
- Cache control on all content types (text, tool_use, tool_result)

Implementation details:
- Added CacheControl, SystemPrompt, and SystemBlock structures
- Updated NativeContentOut and NativeToolSpec with cache_control
- Strategic cache breakpoint placement (last tool, last message)
- Comprehensive test coverage for serialization and heuristics

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
(cherry picked from commit fff04f4edb5e4cb7e581b1b16035da8cc2e55cef)
2026-02-18 14:29:50 +08:00
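The auto-caching thresholds named in commit 455eb3b847 can be expressed as two predicates. This is a sketch: the constant and function names are hypothetical, and reading ">3KB" as more than 3 * 1024 bytes is an assumption.

```rust
/// Assumption: "3KB" means 3 * 1024 bytes of UTF-8 text.
const SYSTEM_PROMPT_CACHE_BYTES: usize = 3 * 1024;
/// Conversations longer than this many messages qualify for caching.
const CONVERSATION_CACHE_MESSAGES: usize = 4;

/// True when the system prompt is large enough to be worth caching.
fn should_cache_system(prompt: &str) -> bool {
    prompt.len() > SYSTEM_PROMPT_CACHE_BYTES
}

/// True when the conversation is long enough to be worth caching.
fn should_cache_conversation(message_count: usize) -> bool {
    message_count > CONVERSATION_CACHE_MESSAGES
}
```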
Chummy
d42cb1e906 fix(auth): rebase PR #200 onto main and restore auth CLI flow 2026-02-18 12:57:44 +08:00
Codex
e8aa63822a fix PR #200 review issues 2026-02-18 12:57:44 +08:00
Codex
39087a446d Fix OpenAI Codex contract, SSE parsing, and default xhigh reasoning 2026-02-18 12:57:44 +08:00
Codex
007368d586 feat(auth): add subscription auth profiles and codex/claude flows 2026-02-18 12:57:44 +08:00
Edvard
89d0fb9a1e feat(providers): implement chat_with_history for GLM provider
The GLM provider previously relied on the trait default for
chat_with_history, which only forwarded the last user message. This adds
a proper multi-turn implementation that sends the full conversation
history to the GLM API, matching the pattern used by OpenRouter, Ollama,
and other providers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 12:33:51 +08:00
Chummy
2560399423 feat(observability): focus PR 596 on Prometheus backend 2026-02-18 12:06:05 +08:00
argenis de la rosa
eba544dbd4 feat(observability): implement Prometheus metrics backend with /metrics endpoint
- Adds PrometheusObserver backend with counters, histograms, and gauges
- Tracks agent starts/duration, tool calls, channel messages, heartbeat ticks, errors, request latency, tokens, sessions, queue depth
- Adds GET /metrics endpoint to gateway for Prometheus scraping
- Adds provider/model labels to AgentStart and AgentEnd events for better observability
- Adds as_any() method to Observer trait for backend-specific downcast

Metrics exposed:
- zeroclaw_agent_starts_total (Counter) with provider/model labels
- zeroclaw_agent_duration_seconds (Histogram) with provider/model labels
- zeroclaw_tool_calls_total (Counter) with tool/success labels
- zeroclaw_tool_duration_seconds (Histogram) with tool label
- zeroclaw_channel_messages_total (Counter) with channel/direction labels
- zeroclaw_heartbeat_ticks_total (Counter)
- zeroclaw_errors_total (Counter) with component label
- zeroclaw_request_latency_seconds (Histogram)
- zeroclaw_tokens_used_last (Gauge)
- zeroclaw_active_sessions (Gauge)
- zeroclaw_queue_depth (Gauge)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 12:06:05 +08:00
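For orientation, the sketch below renders one labeled counter in the Prometheus text exposition format that a GET /metrics endpoint returns. It is hand-rolled with the standard library for illustration only; the commit does not show how PrometheusObserver produces its output, the function name is hypothetical, and label-value escaping is omitted.

```rust
/// Hypothetical renderer for a single counter sample in the Prometheus
/// text format. Label values are not escaped in this sketch.
fn render_counter(name: &str, labels: &[(&str, &str)], value: u64) -> String {
    let label_text: Vec<String> = labels
        .iter()
        .map(|(key, val)| format!("{key}=\"{val}\""))
        .collect();
    if label_text.is_empty() {
        format!("# TYPE {name} counter\n{name} {value}\n")
    } else {
        // `{{` and `}}` emit the literal braces around the label set.
        format!("# TYPE {name} counter\n{name}{{{}}} {value}\n", label_text.join(","))
    }
}
```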
Chummy
9e9a4a53ab style(gemini): apply rustfmt to oauth endpoint patch 2026-02-18 10:25:15 +08:00
KNIGHTABDO
1d8e57d388 fix(gemini): route OAuth tokens to cloudcode-pa.googleapis.com
Gemini CLI OAuth tokens are scoped for Google's internal Code Assist
API at cloudcode-pa.googleapis.com/v1internal, not the public
generativelanguage.googleapis.com/v1beta endpoint.

This commit:
- Routes OAuth requests to the correct internal endpoint
- Wraps the request payload with model metadata (internal API format)
- Keeps API key auth unchanged on the public endpoint

Fixes #578
2026-02-18 10:25:15 +08:00
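The routing rule from commit 1d8e57d388 amounts to choosing a base URL by auth kind. A minimal sketch follows; the enum and function names are hypothetical, while the two hosts and path prefixes are the ones given in the commit message.

```rust
/// Hypothetical representation of the two Gemini auth modes.
enum GeminiAuth {
    ApiKey(String),
    OAuthToken(String),
}

/// Picks the endpoint that matches the credential's scope.
fn gemini_base_url(auth: &GeminiAuth) -> &'static str {
    match auth {
        // API keys stay on the public endpoint.
        GeminiAuth::ApiKey(_) => "https://generativelanguage.googleapis.com/v1beta",
        // Gemini CLI OAuth tokens are scoped for the Code Assist endpoint.
        GeminiAuth::OAuthToken(_) => "https://cloudcode-pa.googleapis.com/v1internal",
    }
}
```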
Edvard
508fb53ac1 fix(provider): delegate native tool calling through ReliableProvider
ReliableProvider wraps underlying providers with retry/fallback logic
but did not delegate `supports_native_tools()` or `chat_with_tools()`.
This caused the agent loop to fall back to prompt-based tool calling
for all providers, even those with native tool support (OpenRouter,
OpenAI, Anthropic). Models like Gemini 2.0 Flash would then output
tool calls as text instead of structured API responses, breaking the
tool execution loop entirely.

Add `supports_native_tools()` delegation to the primary provider and
`chat_with_tools()` with the same retry/fallback logic as the existing
`chat_with_system()` and `chat_with_history()` methods.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 10:15:46 +08:00
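The delegation gap fixed in commit 508fb53ac1 can be shown with a wrapper that does or does not forward a capability query. The trait and structs below are simplified, hypothetical stand-ins; the real ReliableProvider also carries retry and fallback logic that is omitted here.

```rust
/// Simplified stand-in for the provider trait (assumed shape).
trait Provider {
    /// Trait default: no native tool support.
    fn supports_native_tools(&self) -> bool {
        false
    }
}

/// An inner provider that does support native tools.
struct NativeProvider;

impl Provider for NativeProvider {
    fn supports_native_tools(&self) -> bool {
        true
    }
}

/// Wrapper that forgets to delegate, so the trait default (false) applies.
struct NonDelegatingWrapper {
    #[allow(dead_code)]
    primary: Box<dyn Provider>,
}

impl Provider for NonDelegatingWrapper {}

/// Wrapper that forwards the query to the primary provider.
struct ReliableProviderSketch {
    primary: Box<dyn Provider>,
}

impl Provider for ReliableProviderSketch {
    fn supports_native_tools(&self) -> bool {
        self.primary.supports_native_tools()
    }
}
```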