* fix(gemini): align OAuth cloudcode payload and response parsing * docs(gemini): document OAuth vs API key endpoint behavior
254 lines
8.5 KiB
Markdown
254 lines
8.5 KiB
Markdown
# ZeroClaw Providers Reference
|
|
|
|
This document maps provider IDs, aliases, and credential environment variables.
|
|
|
|
Last verified: **February 19, 2026**.
|
|
|
|
## How to List Providers
|
|
|
|
```bash
|
|
zeroclaw providers
|
|
```
|
|
|
|
## Credential Resolution Order
|
|
|
|
Runtime resolution order is:
|
|
|
|
1. Explicit credential from config/CLI
|
|
2. Provider-specific env var(s)
|
|
3. Generic fallback env vars: `ZEROCLAW_API_KEY` then `API_KEY`
|
|
|
|
For resilient fallback chains (`reliability.fallback_providers`), each fallback
|
|
provider resolves credentials independently. The primary provider's explicit
|
|
credential is not reused for fallback providers.
|
|
|
|
## Provider Catalog
|
|
|
|
| Canonical ID | Aliases | Local | Provider-specific env var(s) |
|
|
|---|---|---:|---|
|
|
| `openrouter` | — | No | `OPENROUTER_API_KEY` |
|
|
| `anthropic` | — | No | `ANTHROPIC_OAUTH_TOKEN`, `ANTHROPIC_API_KEY` |
|
|
| `openai` | — | No | `OPENAI_API_KEY` |
|
|
| `ollama` | — | Yes | `OLLAMA_API_KEY` (optional) |
|
|
| `gemini` | `google`, `google-gemini` | No | `GEMINI_API_KEY`, `GOOGLE_API_KEY` |
|
|
| `venice` | — | No | `VENICE_API_KEY` |
|
|
| `vercel` | `vercel-ai` | No | `VERCEL_API_KEY` |
|
|
| `cloudflare` | `cloudflare-ai` | No | `CLOUDFLARE_API_KEY` |
|
|
| `moonshot` | `kimi` | No | `MOONSHOT_API_KEY` |
|
|
| `kimi-code` | `kimi_coding`, `kimi_for_coding` | No | `KIMI_CODE_API_KEY`, `MOONSHOT_API_KEY` |
|
|
| `synthetic` | — | No | `SYNTHETIC_API_KEY` |
|
|
| `opencode` | `opencode-zen` | No | `OPENCODE_API_KEY` |
|
|
| `zai` | `z.ai` | No | `ZAI_API_KEY` |
|
|
| `glm` | `zhipu` | No | `GLM_API_KEY` |
|
|
| `minimax` | `minimax-intl`, `minimax-io`, `minimax-global`, `minimax-cn`, `minimaxi`, `minimax-oauth`, `minimax-oauth-cn`, `minimax-portal`, `minimax-portal-cn` | No | `MINIMAX_OAUTH_TOKEN`, `MINIMAX_API_KEY` |
|
|
| `bedrock` | `aws-bedrock` | No | `AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY` (optional: `AWS_REGION`) |
|
|
| `qianfan` | `baidu` | No | `QIANFAN_API_KEY` |
|
|
| `qwen` | `dashscope`, `qwen-intl`, `dashscope-intl`, `qwen-us`, `dashscope-us`, `qwen-code`, `qwen-oauth`, `qwen_oauth` | No | `QWEN_OAUTH_TOKEN`, `DASHSCOPE_API_KEY` |
|
|
| `groq` | — | No | `GROQ_API_KEY` |
|
|
| `mistral` | — | No | `MISTRAL_API_KEY` |
|
|
| `xai` | `grok` | No | `XAI_API_KEY` |
|
|
| `deepseek` | — | No | `DEEPSEEK_API_KEY` |
|
|
| `together` | `together-ai` | No | `TOGETHER_API_KEY` |
|
|
| `fireworks` | `fireworks-ai` | No | `FIREWORKS_API_KEY` |
|
|
| `perplexity` | — | No | `PERPLEXITY_API_KEY` |
|
|
| `cohere` | — | No | `COHERE_API_KEY` |
|
|
| `copilot` | `github-copilot` | No | (use config/`API_KEY` fallback with GitHub token) |
|
|
| `lmstudio` | `lm-studio` | Yes | (optional; local by default) |
|
|
| `nvidia` | `nvidia-nim`, `build.nvidia.com` | No | `NVIDIA_API_KEY` |
|
|
|
|
### Gemini Notes
|
|
|
|
- Provider ID: `gemini` (aliases: `google`, `google-gemini`)
|
|
- Auth can come from `GEMINI_API_KEY`, `GOOGLE_API_KEY`, or Gemini CLI OAuth cache (`~/.gemini/oauth_creds.json`)
|
|
- API key requests use `generativelanguage.googleapis.com/v1beta`
|
|
- Gemini CLI OAuth requests use `cloudcode-pa.googleapis.com/v1internal` with Code Assist request envelope semantics
|
|
|
|
### Ollama Vision Notes
|
|
|
|
- Provider ID: `ollama`
|
|
- Vision input is supported through user message image markers: ``[IMAGE:<source>]``.
|
|
- After multimodal normalization, ZeroClaw sends image payloads through Ollama's native `messages[].images` field.
|
|
- If a non-vision provider is selected, ZeroClaw returns a structured capability error instead of silently ignoring images.
|
|
|
|
### Bedrock Notes
|
|
|
|
- Provider ID: `bedrock` (alias: `aws-bedrock`)
|
|
- API: [Converse API](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_Converse.html)
|
|
- Authentication: AWS AKSK (not a single API key). Set `AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY` environment variables.
|
|
- Optional: `AWS_SESSION_TOKEN` for temporary/STS credentials, `AWS_REGION` or `AWS_DEFAULT_REGION` (default: `us-east-1`).
|
|
- Default onboarding model: `anthropic.claude-sonnet-4-5-20250929-v1:0`
|
|
- Supports native tool calling and prompt caching (`cachePoint`).
|
|
- Cross-region inference profiles supported (e.g., `us.anthropic.claude-*`).
|
|
- Model IDs use Bedrock format: `anthropic.claude-sonnet-4-6`, `anthropic.claude-opus-4-6-v1`, etc.
|
|
|
|
### Ollama Reasoning Toggle
|
|
|
|
You can control Ollama reasoning/thinking behavior from `config.toml`:
|
|
|
|
```toml
|
|
[runtime]
|
|
reasoning_enabled = false
|
|
```
|
|
|
|
Behavior:
|
|
|
|
- `false`: sends `think: false` to Ollama `/api/chat` requests.
|
|
- `true`: sends `think: true`.
|
|
- Unset: omits `think` and keeps Ollama/model defaults.
|
|
|
|
### Kimi Code Notes
|
|
|
|
- Provider ID: `kimi-code`
|
|
- Endpoint: `https://api.kimi.com/coding/v1`
|
|
- Default onboarding model: `kimi-for-coding` (alternative: `kimi-k2.5`)
|
|
- Runtime auto-adds `User-Agent: KimiCLI/0.77` for compatibility.
|
|
|
|
### NVIDIA NIM Notes
|
|
|
|
- Canonical provider ID: `nvidia`
|
|
- Aliases: `nvidia-nim`, `build.nvidia.com`
|
|
- Base API URL: `https://integrate.api.nvidia.com/v1`
|
|
- Model discovery: `zeroclaw models refresh --provider nvidia`
|
|
|
|
Recommended starter model IDs (verified against NVIDIA API catalog on February 18, 2026):
|
|
|
|
- `meta/llama-3.3-70b-instruct`
|
|
- `deepseek-ai/deepseek-v3.2`
|
|
- `nvidia/llama-3.3-nemotron-super-49b-v1.5`
|
|
- `nvidia/llama-3.1-nemotron-ultra-253b-v1`
|
|
|
|
## Custom Endpoints
|
|
|
|
- OpenAI-compatible endpoint:
|
|
|
|
```toml
|
|
default_provider = "custom:https://your-api.example.com"
|
|
```
|
|
|
|
- Anthropic-compatible endpoint:
|
|
|
|
```toml
|
|
default_provider = "anthropic-custom:https://your-api.example.com"
|
|
```
|
|
|
|
## MiniMax OAuth Setup (config.toml)
|
|
|
|
Set the MiniMax provider and OAuth placeholder in config:
|
|
|
|
```toml
|
|
default_provider = "minimax-oauth"
|
|
api_key = "minimax-oauth"
|
|
```
|
|
|
|
Then provide one of the following credentials via environment variables:
|
|
|
|
- `MINIMAX_OAUTH_TOKEN` (preferred, direct access token)
|
|
- `MINIMAX_API_KEY` (legacy/static token)
|
|
- `MINIMAX_OAUTH_REFRESH_TOKEN` (auto-refreshes access token at startup)
|
|
|
|
Optional:
|
|
|
|
- `MINIMAX_OAUTH_REGION=global` or `cn` (defaults by provider alias)
|
|
- `MINIMAX_OAUTH_CLIENT_ID` to override the default OAuth client id
|
|
|
|
Channel compatibility note:
|
|
|
|
- For MiniMax-backed channel conversations, runtime history is normalized to keep valid `user`/`assistant` turn order.
|
|
- Channel-specific delivery guidance (for example Telegram attachment markers) is merged into the leading system prompt instead of being appended as a trailing `system` turn.
|
|
|
|
## Qwen Code OAuth Setup (config.toml)
|
|
|
|
Set Qwen Code OAuth mode in config:
|
|
|
|
```toml
|
|
default_provider = "qwen-code"
|
|
api_key = "qwen-oauth"
|
|
```
|
|
|
|
Credential resolution for `qwen-code`:
|
|
|
|
1. Explicit `api_key` value (if not the placeholder `qwen-oauth`)
|
|
2. `QWEN_OAUTH_TOKEN`
|
|
3. `~/.qwen/oauth_creds.json` (reuses Qwen Code cached OAuth credentials)
|
|
4. Optional refresh via `QWEN_OAUTH_REFRESH_TOKEN` (or cached refresh token)
|
|
5. If no OAuth placeholder is used, `DASHSCOPE_API_KEY` can still be used as fallback
|
|
|
|
Optional endpoint override:
|
|
|
|
- `QWEN_OAUTH_RESOURCE_URL` (normalized to `https://.../v1` if needed)
|
|
- If unset, `resource_url` from cached OAuth credentials is used when available
|
|
|
|
## Model Routing (`hint:<name>`)
|
|
|
|
You can route model calls by hint using `[[model_routes]]`:
|
|
|
|
```toml
|
|
[[model_routes]]
|
|
hint = "reasoning"
|
|
provider = "openrouter"
|
|
model = "anthropic/claude-opus-4-20250514"
|
|
|
|
[[model_routes]]
|
|
hint = "fast"
|
|
provider = "groq"
|
|
model = "llama-3.3-70b-versatile"
|
|
```
|
|
|
|
Then call with a hint model name (for example from tool or integration paths):
|
|
|
|
```text
|
|
hint:reasoning
|
|
```
|
|
|
|
## Embedding Routing (`hint:<name>`)
|
|
|
|
You can route embedding calls with the same hint pattern using `[[embedding_routes]]`.
|
|
Set `[memory].embedding_model` to a `hint:<name>` value to activate routing.
|
|
|
|
```toml
|
|
[memory]
|
|
embedding_model = "hint:semantic"
|
|
|
|
[[embedding_routes]]
|
|
hint = "semantic"
|
|
provider = "openai"
|
|
model = "text-embedding-3-small"
|
|
dimensions = 1536
|
|
|
|
[[embedding_routes]]
|
|
hint = "archive"
|
|
provider = "custom:https://embed.example.com/v1"
|
|
model = "your-embedding-model-id"
|
|
dimensions = 1024
|
|
```
|
|
|
|
Supported embedding providers:
|
|
|
|
- `none`
|
|
- `openai`
|
|
- `custom:<url>` (OpenAI-compatible embeddings endpoint)
|
|
|
|
Optional per-route key override:
|
|
|
|
```toml
|
|
[[embedding_routes]]
|
|
hint = "semantic"
|
|
provider = "openai"
|
|
model = "text-embedding-3-small"
|
|
api_key = "sk-route-specific"
|
|
```
|
|
|
|
## Upgrading Models Safely
|
|
|
|
Use stable hints and update only route targets when providers deprecate model IDs.
|
|
|
|
Recommended workflow:
|
|
|
|
1. Keep call sites stable (`hint:reasoning`, `hint:semantic`).
|
|
2. Change only the target model under `[[model_routes]]` or `[[embedding_routes]]`.
|
|
3. Run:
|
|
- `zeroclaw doctor`
|
|
- `zeroclaw status`
|
|
4. Smoke test one representative flow (chat + memory retrieval) before rollout.
|
|
|
|
This minimizes breakage because integrations and prompts do not need to change when model IDs are upgraded.
|