The OpenAiProvider overrode chat() with native tool support but never
overrode chat_with_tools(), which is the method called by
run_tool_call_loop in channel mode (IRC/Discord/etc). The trait default
for chat_with_tools() silently drops the tools parameter, sending plain
ChatRequest with no tools — causing the model to never use native tool
calls in channel mode.
Add chat_with_tools() override that deserializes tool specs, uses
convert_messages() for proper tool_call_id handling, and sends
NativeChatRequest with tools and tool_choice.
Also add Deserialize derive to NativeToolSpec and NativeToolFunctionSpec
to support deserialization from OpenAI-format JSON.
- add scope-aware proxy schema and runtime wiring for providers/channels/tools
- add agent callable proxy_config tool for fast proxy setup
- standardize docs system with index, template, and playbooks
Route OVHcloud through OpenAiProvider (with proper tool_call_id
serialization) instead of OpenAiCompatibleProvider, fixing tool-call
round-trips against vLLM-based endpoints.
- Add base_url field and with_base_url() constructor to OpenAiProvider
- Replace all hardcoded api.openai.com URLs with self.base_url
- Pass api_url through for the openai provider arm
- Register ovhcloud/ovh provider with env var OVH_AI_ENDPOINTS_ACCESS_TOKEN
Reasoning/thinking models (Qwen3, GLM-4, DeepSeek, etc.) may return
output in `reasoning_content` instead of `content`. Add automatic
fallback for both OpenAI and OpenAI-compatible providers, including
streaming SSE support.
Changes:
- Add `reasoning_content` field to response structs in both providers
- Add `effective_content()` helper that prefers `content` but falls
back to `reasoning_content` when content is empty/null/missing
- Update all extraction sites to use `effective_content()`
- Add streaming SSE fallback for `reasoning_content` chunks
- Add 16 focused unit tests covering all edge cases
Tested end-to-end against GLM-4.7-flash via local LLM server.
All five providers have HTTP clients but did not implement warmup(),
relying on the trait default no-op. This adds lightweight warmup calls
to establish TLS + HTTP/2 connection pools on startup, reducing
first-request latency. Each warmup is skipped when credentials are
absent, matching the OpenRouter pattern.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Switch Provider trait methods to return structured ChatResponse
- Map OpenAI-compatible tool_calls into shared ToolCall type
- Update reliable/router wrappers and provider tests for new interface
- Make agent loop prefer structured tool calls with text fallback parsing
- Adapt gateway replies to structured responses with safe tool-call fallback