Commit graph

693 commits

Author SHA1 Message Date
Chummy
3a25f4fa3a ci(labeler): enforce ordered gradient palette and compact module labels 2026-02-16 19:52:14 +08:00
Chummy
004fc4590f ci(labeler): compact noisy module labels for tool/provider/channel 2026-02-16 19:49:45 +08:00
Chummy
389496823d
ci(labeler): dedupe labels, add hover rules, and tune low-sat palette (#6)
* ci(labeler): dedupe scope labels and prioritize risk/size

* ci(labeler): add hover rule descriptions and refresh label palette

* style(labeler): reduce label saturation for better readability
2026-02-16 19:46:22 +08:00
chumyin
dedb465377 test(telegram): ensure newline split case exceeds max length 2026-02-16 19:36:39 +08:00
chumyin
2d6ec2fb71 fix(rebase): resolve PR #266 conflicts against latest main 2026-02-16 19:33:04 +08:00
chumyin
34306e32d8 fix(provider): complete ChatResponse integration across runtime surfaces 2026-02-16 19:18:12 +08:00
chumyin
3b4a4de457 refactor(provider): unify Provider responses with ChatResponse
- Switch Provider trait methods to return structured ChatResponse
- Map OpenAI-compatible tool_calls into shared ToolCall type
- Update reliable/router wrappers and provider tests for new interface
- Make agent loop prefer structured tool calls with text fallback parsing
- Adapt gateway replies to structured responses with safe tool-call fallback
2026-02-16 19:16:22 +08:00
Mgrsc
b3fcdad3b5
fix: use consistent <tool_call> tag in channel system prompt (#305)
The tool use protocol in channels/mod.rs was using <invoke> tags,
but the parser in agent/loop_.rs only recognizes <tool_call> tags.
This ensures consistency across all entry points.
2026-02-16 05:59:40 -05:00
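The mismatch described in the commit above can be illustrated with a minimal sketch. This is not the parser in agent/loop_.rs; `extract_tool_calls` is a hypothetical stand-in that assumes the parser scans for literal `<tool_call>` and `</tool_call>` tags. Under that assumption, a prompt that teaches the model to emit `<invoke>` produces output the parser silently ignores.

```rust
/// Hypothetical helper: returns the trimmed payload of every
/// `<tool_call>...</tool_call>` block found in `text`.
/// An opening tag without a matching close ends the scan.
fn extract_tool_calls(text: &str) -> Vec<&str> {
    const OPEN: &str = "<tool_call>";
    const CLOSE: &str = "</tool_call>";
    let mut calls = Vec::new();
    let mut rest = text;
    while let Some(start) = rest.find(OPEN) {
        let after_open = &rest[start + OPEN.len()..];
        match after_open.find(CLOSE) {
            Some(end) => {
                calls.push(after_open[..end].trim());
                rest = &after_open[end + CLOSE.len()..];
            }
            // Unterminated block: stop rather than guess where it ends.
            None => break,
        }
    }
    calls
}
```

With this sketch, text wrapped in `<invoke>` tags yields an empty list, which is why the prompt and the parser need to agree on one tag name.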
Abdul Samad
4fd1408034
fix(telegram): add message splitting, timeout, and validation fixes (#246)
High-priority fixes:
- Message length validation and splitting (4096 char limit)
- Empty chat_id validation to prevent silent failures
- Health check timeout (5s) to prevent service hangs

Testing infrastructure:
- Comprehensive test suite (20+ automated tests)
- Quick smoke test script
- Test message generator
- Complete testing documentation

All changes are backward compatible.

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-16 05:59:11 -05:00
mai1015
50f508766f
feat: add verbose logging and complete observability (#251) 2026-02-16 05:59:07 -05:00
Chummy
6d56a040ce
docs: strengthen collaboration governance and AGENTS engineering protocol (#263)
* docs: harden collaboration policy and review automation

* ci(docs): remove unsupported lychee --exclude-mail flag

* docs(governance): reduce automation side-effects and tighten risk controls

* docs(governance): add backlog pruning and supersede protocol

* docs(agents): codify engineering principles and risk-tier workflow

* docs(readme): add centered star history section at bottom

* docs(agents): enforce privacy-safe and neutral test wording

* docs(governance): enforce privacy-safe and neutral collaboration checks

* fix(ci): satisfy rustfmt and discord schema test fields

* docs(governance): require ZeroClaw-native identity wording

* docs(agents): add ZeroClaw identity-safe naming palette

* docs(governance): codify code naming and architecture contracts

* docs(contributing): add naming and architecture good/bad examples

* docs(pr): reduce checkbox TODOs and shift to label-first metadata

* docs(pr): remove duplicate collaboration track field

* ci(labeler): auto-derive module labels and expand provider hints

* ci(labeler): auto-apply trusted contributor on PRs and issues

* fix(ci): apply rustfmt updates from latest main

* ci(labels): flatten namespaces and add contributor tiers

* chore: drop stale rustfmt-only drift

* ci: scope Rust and docs checks by change set

* ci: exclude non-markdown docs from docs-quality targets

* ci: satisfy actionlint shellcheck output style

* ci(labels): auto-correct manual contributor tier edits

* ci(labeler): auto-correct risk label edits

* ci(labeler): auto-correct size label edits

---------

Co-authored-by: Chummy <183474434+chumyin@users.noreply.github.com>
2026-02-16 05:59:04 -05:00
Chummy
b5d9f72023
test(channels): neutralize UTF-8 truncation regression fixture (#289)
* test(channels): neutralize UTF-8 truncation regression fixture

* fix(ci): resolve fmt drift and discord test config init
2026-02-16 05:58:35 -05:00
Argenis
3231a61323
test: add comprehensive recovery tests for agent loop (#288)
Add recovery test coverage for all edge cases and failure scenarios
in the agentic loop, addressing the missing test coverage for
recovery use cases.

Tool Call Parsing Edge Cases:
- Empty tool_result tags
- Empty tool_calls arrays
- Whitespace-only tool names
- Empty string arguments

History Management:
- Trimming without system prompt
- Role ordering consistency after trim
- Only system prompt edge case

Arguments Parsing:
- Invalid JSON string fallback
- None arguments handling
- Null value handling

JSON Extraction:
- Empty input handling
- Whitespace only input
- Multiple JSON objects
- JSON arrays

Tool Call Value Parsing:
- Missing name field
- Non-OpenAI format
- Empty tool_calls array
- Missing tool_calls field fallback
- Top-level array format

Constants Validation:
- MAX_TOOL_ITERATIONS bounds (prevent runaway loops)
- MAX_HISTORY_MESSAGES bounds (prevent memory bloat)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 05:58:32 -05:00
Chummy
49fcc7a2c4
test: deepen and complete project-wide test coverage (#297)
* test: deepen coverage for health doctor provider and tunnels

* test: add broad trait and module re-export coverage
2026-02-16 05:58:24 -05:00
Chummy
79a6f180a8
fix(composio): migrate tool API calls to v3 with v2 fallback (#309) (#310) 2026-02-16 05:58:06 -05:00
Argenis
1530a8707d
feat: add Git operations tool for structured repository management
Implements #214 - Add git_operations tool that provides safe, parsed
git operations with JSON output and security policy integration.

Features:
- Operations: status, diff, log, branch, commit, add, checkout, stash
- Structured JSON output (parsed status, diff hunks, commit history)
- SecurityPolicy integration with autonomy-aware controls
- Command injection protection

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 05:53:29 -05:00
Chummy
2b04ebd2fb
fix(provider): normalize responses fallback
* fix(provider): avoid duplicate /v1 in responses endpoint

* fix(provider): derive precise responses endpoint from configured path
2026-02-16 05:26:01 -05:00
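As a rough illustration of the endpoint normalization mentioned in the commit above, the sketch below derives a responses URL from a configured base URL without duplicating `/v1`. The function name `responses_url`, the `/chat/completions` suffix handling, and the default of appending `/v1/responses` are all assumptions made for this example; the actual provider code may resolve paths differently.

```rust
/// Hypothetical helper: derives a `/responses` endpoint from a configured base URL.
/// Assumed rules (not confirmed by the commit):
/// - a base ending in `/chat/completions` swaps that suffix for `/responses`
/// - a base already ending in `/v1` gets `/responses` appended
/// - any other base gets `/v1/responses` appended
fn responses_url(base: &str) -> String {
    let base = base.trim_end_matches('/');
    if let Some(prefix) = base.strip_suffix("/chat/completions") {
        format!("{}/responses", prefix)
    } else if base.ends_with("/v1") {
        format!("{}/responses", base)
    } else {
        format!("{}/v1/responses", base)
    }
}
```

The point of the sketch is only that the suffix is inspected before anything is appended, so a base that already carries `/v1` is not extended to `/v1/v1/responses`.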
Chummy
85fc12bcf7
feat(browser): add optional rust-native backend via fantoccini
* feat(browser): add optional rust-native automation backend

* style: align channels module with stable rustfmt

* fix(browser): switch rust-native backend to fantoccini

Replace headless_chrome with fantoccini to satisfy license checks and keep browser-native optional. Adds native_webdriver_url wiring, migrates native backend session/actions to WebDriver, updates docs/config defaults, and keeps backend auto-resolution behavior intact.

* test(config): serialize env override tests with lock

Prevent flaky CI failures caused by concurrent environment variable mutation across config env-override tests.

* style: apply rustfmt 1.92 for CI parity

* chore(ci): sync lockfile and rustfmt with current main

Resolve feature table drift after rebasing onto latest main, refresh Cargo.lock for browser-native fantoccini, and apply rustfmt 1.92 formatting required by CI.
2026-02-16 05:25:27 -05:00
Chummy
9d29f30a31
fix(channels): execute tool calls in channel runtime (#302)
* fix(channels): execute tool calls in channel runtime (#302)

* chore(fmt): align repo formatting with rustfmt 1.92
2026-02-16 05:07:01 -05:00
Argenis
efabe9703f
fix: update MiniMax model names to M2.5/M2.1
Fixes #294 - Updates MiniMax model names from the old ABAB 6.5 series to
the current M2.5/M2.1 series.

- Updated wizard model selection for MiniMax provider
- Fixed DiscordConfig test cases to include new listen_to_bots field

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 04:21:44 -05:00
Argenis
0383a82a6f
feat(security): Add Phase 1 security features
* test: add comprehensive recovery tests for agent loop

Add recovery test coverage for all edge cases and failure scenarios
in the agentic loop, addressing the missing test coverage for
recovery use cases.

Tool Call Parsing Edge Cases:
- Empty tool_result tags
- Empty tool_calls arrays
- Whitespace-only tool names
- Empty string arguments

History Management:
- Trimming without system prompt
- Role ordering consistency after trim
- Only system prompt edge case

Arguments Parsing:
- Invalid JSON string fallback
- None arguments handling
- Null value handling

JSON Extraction:
- Empty input handling
- Whitespace only input
- Multiple JSON objects
- JSON arrays

Tool Call Value Parsing:
- Missing name field
- Non-OpenAI format
- Empty tool_calls array
- Missing tool_calls field fallback
- Top-level array format

Constants Validation:
- MAX_TOOL_ITERATIONS bounds (prevent runaway loops)
- MAX_HISTORY_MESSAGES bounds (prevent memory bloat)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(security): Add Phase 1 security features - sandboxing, resource limits, audit logging

Phase 1 security enhancements with zero impact on the quick setup wizard:
- Pluggable sandbox trait system (traits.rs)
- Landlock sandbox support (Linux kernel 5.13+)
- Firejail sandbox support (Linux user-space)
- Bubblewrap sandbox support (Linux/macOS user namespaces)
- Docker sandbox support (container isolation)
- No-op fallback (application-layer security only)
- Auto-detection logic (detect.rs)
- Audit logging with HMAC signing support (audit.rs)
- SecurityConfig schema (SandboxConfig, ResourceLimitsConfig, AuditConfig)
- Feature-gated implementation (sandbox-landlock, sandbox-bubblewrap)
- 1,265 tests passing

Key design principles:
- Silent auto-detection: no new prompts in wizard
- Graceful degradation: works on all platforms
- Feature flags: zero overhead when disabled
- Pluggable architecture: swap sandbox backends via config
- Backward compatible: existing configs work unchanged

Config usage:
```toml
[security.sandbox]
enabled = false  # Explicitly disable
backend = "auto"  # auto, landlock, firejail, bubblewrap, docker, none

[security.resources]
max_memory_mb = 512
max_cpu_time_seconds = 60

[security.audit]
enabled = true
log_path = "audit.log"
sign_events = false
```

Security documentation:
- docs/sandboxing.md: Sandbox implementation strategies
- docs/resource-limits.md: Resource limit approaches
- docs/audit-logging.md: Audit logging specification
- docs/security-roadmap.md: 3-phase implementation plan
- docs/frictionless-security.md: Zero-impact wizard design
- docs/agnostic-security.md: Platform/hardware agnostic approach

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 04:14:16 -05:00
Argenis
1140a7887d
feat: add HTTP request tool for API interactions
Implements #210 - Add http_request tool that enables the agent to make
HTTP requests to external APIs.

Features:
- Supports GET, POST, PUT, DELETE, PATCH, HEAD, OPTIONS methods
- JSON request/response handling
- Configurable timeout (default: 30s)
- Configurable max response size (default: 1MB)
- Security: domain allowlist, blocks local/private IPs (SSRF protection)
- Headers support with auth token redaction

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 03:44:42 -05:00
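The "blocks local/private IPs" item in the commit above can be sketched with the standard library alone. `is_blocked_ip` is a hypothetical name, and the exact set of ranges the real http_request tool rejects is not stated in the commit; the ranges below (loopback, private, link-local, unspecified, broadcast, IPv6 unique-local, and IPv4-mapped IPv6) are an assumption about what a reasonable SSRF guard would cover.

```rust
use std::net::IpAddr;

/// Hypothetical SSRF guard: returns true when `ip` points at a local or
/// private destination that an outbound HTTP tool should refuse.
fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_loopback()
                || v4.is_private()
                || v4.is_link_local()
                || v4.is_unspecified()
                || v4.is_broadcast()
        }
        IpAddr::V6(v6) => {
            let first = v6.segments()[0];
            v6.is_loopback()
                || v6.is_unspecified()
                // Unique local addresses, fc00::/7.
                || (first & 0xfe00) == 0xfc00
                // Link-local addresses, fe80::/10.
                || (first & 0xffc0) == 0xfe80
                // IPv4-mapped addresses (::ffff:a.b.c.d) are judged as IPv4.
                || v6
                    .to_ipv4_mapped()
                    .map_or(false, |v4| is_blocked_ip(IpAddr::V4(v4)))
        }
    }
}
```

A check like this needs to run on the resolved address rather than on the hostname string, since a public-looking name can resolve to a private address.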
Argenis
9bdbc1287c
fix: add tool use protocol to channel/daemon/gateway system prompts
Fixes #284 - Tool call format was missing from the system prompt in
channel, daemon, and gateway modes. This caused LLMs to not know how
to properly invoke tools when using these modes.

The tool use protocol with <invoke> tags and JSON payload format now
matches the implementation in agent loop mode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 02:36:21 -05:00
不做了睡大觉
21dc22f249
test(channels): add regression for UTF-8 truncation panic in channel logs (#262) 2026-02-16 02:30:26 -05:00
Vernon Stinebaker
40c41cf3d2
feat(discord): add listen_to_bots config and fix model IDs across codebase (#280)
* fix(config): apply env overrides at runtime and fix Docker compose defaults

- Call apply_env_overrides() after Config::load_or_init() in main.rs so
  environment variables (API_KEY, PROVIDER, ZEROCLAW_GATEWAY_PORT, etc.)
  are actually applied at runtime, not just in tests
- Add ZEROCLAW_ALLOW_PUBLIC_BIND env var support for gateway bind policy
- Fix docker-compose.yml: correct volume path (/zeroclaw-data not /data),
  add ZEROCLAW_ALLOW_PUBLIC_BIND=true for container networking, make host
  port configurable via HOST_PORT env var
- Add docker-compose.override.yml to .gitignore for local dev overrides

* feat(discord): add listen_to_bots config and fix model IDs across codebase

Add listen_to_bots field to DiscordConfig so bot messages are processed
when explicitly enabled (defaults to false for backward compat). Remove
ZEROCLAW_MODEL from Dockerfile release stage so config.toml is the
source of truth for model selection. Fix all hardcoded model IDs from
the dated anthropic/claude-sonnet-4-20250514 to the valid OpenRouter
identifier anthropic/claude-sonnet-4.
2026-02-16 02:13:36 -05:00
Vernon Stinebaker
d5e8fc1652
fix(config): apply env overrides at runtime and fix Docker compose defaults (#279)
- Call apply_env_overrides() after Config::load_or_init() in main.rs so
  environment variables (API_KEY, PROVIDER, ZEROCLAW_GATEWAY_PORT, etc.)
  are actually applied at runtime, not just in tests
- Add ZEROCLAW_ALLOW_PUBLIC_BIND env var support for gateway bind policy
- Fix docker-compose.yml: correct volume path (/zeroclaw-data not /data),
  add ZEROCLAW_ALLOW_PUBLIC_BIND=true for container networking, make host
  port configurable via HOST_PORT env var
- Add docker-compose.override.yml to .gitignore for local dev overrides
2026-02-16 02:12:49 -05:00
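The runtime override step described in the commit above might look roughly like the following. Only the environment variable names (API_KEY, PROVIDER, ZEROCLAW_GATEWAY_PORT, ZEROCLAW_ALLOW_PUBLIC_BIND) and the method name `apply_env_overrides` come from the commit message; the `Config` fields, the injected `lookup` closure, and the accepted boolean spellings are assumptions made so the sketch can be exercised without touching the process environment.

```rust
/// Hypothetical, trimmed-down config; the real struct has many more fields.
#[derive(Debug, Default, PartialEq)]
struct Config {
    api_key: Option<String>,
    provider: Option<String>,
    gateway_port: u16,
    allow_public_bind: bool,
}

impl Config {
    /// Applies overrides on top of values loaded from disk.
    /// `lookup` returns the value of an environment variable, if set.
    fn apply_env_overrides<F>(&mut self, lookup: F)
    where
        F: Fn(&str) -> Option<String>,
    {
        if let Some(value) = lookup("API_KEY") {
            self.api_key = Some(value);
        }
        if let Some(value) = lookup("PROVIDER") {
            self.provider = Some(value);
        }
        // An unparsable port is ignored and the loaded value is kept.
        if let Some(port) = lookup("ZEROCLAW_GATEWAY_PORT").and_then(|v| v.parse::<u16>().ok()) {
            self.gateway_port = port;
        }
        // Assumption: "1" and "true" (any case) enable public bind.
        if let Some(value) = lookup("ZEROCLAW_ALLOW_PUBLIC_BIND") {
            self.allow_public_bind = value == "1" || value.eq_ignore_ascii_case("true");
        }
    }
}
```

In a real binary the call would be `config.apply_env_overrides(|key| std::env::var(key).ok())`, placed after the config is loaded so that the environment takes precedence over the file.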
Argenis
ebdcee3a5d
fix(build): remove OpenSSL dependency to prevent build failures
This fixes issue #271 where cargo build fails due to openssl-sys dependency
being pulled in even though the project uses rustls-tls for all TLS connections.

**Problem:**
- The Dockerfile installed `libssl-dev` in the builder stage
- This caused `openssl-sys` to be activated as a dependency
- Users without OpenSSL installed would get build failures:
  ```
  error: failed to run custom build command for openssl-sys v0.9.111
  Could not find directory of OpenSSL installation
  ```

**Solution:**
- Remove `libssl-dev` from Dockerfile build dependencies
- ZeroClaw uses `rustls-tls` exclusively for all TLS connections:
  - reqwest: `features = ["rustls-tls"]`
  - lettre: `features = ["rustls-tls"]`
  - tokio-tungstenite: `features = ["rustls-tls-webpki-roots"]`

**Benefits:**
- Smaller Docker images (no OpenSSL headers/libs needed)
- Faster builds (fewer dependencies to compile)
- Consistent builds regardless of system OpenSSL availability
- True pure-Rust TLS stack without C dependencies

**Affected platforms:**
- Users without OpenSSL dev packages can now build directly
- Docker builds are more portable and reproducible
- Binary distributions don't depend on system OpenSSL version

All tests pass.

Related to #271

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 01:58:40 -05:00
Chummy
c481f5298a
fix(channels): process inbound messages concurrently (#267)
Fixes #235
2026-02-16 01:58:01 -05:00
Chummy
3bdabdc7ec
fix(security): enforce action guards in file_write and scheduler (#269) 2026-02-16 01:57:58 -05:00
Chummy
60f3282ad4
fix(security): enforce action budget checks in file_read (#270) 2026-02-16 01:57:56 -05:00
Chummy
2c0664ba1e
fix(email): make IMAP rustls provider selection explicit (#272) 2026-02-16 01:57:53 -05:00
Chummy
89f689c67a
fix(embeddings): normalize custom endpoint path resolution (#276) 2026-02-16 01:57:51 -05:00
Chummy
13f6ed7871
fix(provider): require exact chat endpoint suffix match (#277) 2026-02-16 01:57:48 -05:00
Chummy
9428d3ab74
chore(ci): add PR hygiene nudge automation (#278) 2026-02-16 01:57:45 -05:00
Chummy
ce7f811c0f
fix(provider): validate custom provider URL format and scheme (#281) 2026-02-16 01:57:43 -05:00
不做了睡大觉
b2810765a8
feat(agent): add auto-compaction before history trimming (#282) 2026-02-16 01:57:40 -05:00
Argenis
0e0b3644a8
feat(config): add Lark/Feishu channel config support
* feat(config): add Lark/Feishu channel config support

- Add LarkConfig struct with app_id, app_secret, encrypt_key, verification_token, allowed_users, use_feishu fields
- Add lark field to ChannelsConfig
- Export LarkConfig in config/mod.rs
- Add 5 tests for LarkConfig serialization/deserialization

Related to #164

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: apply cargo fmt formatting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-16 00:16:04 -05:00
Argenis
c8ca6ff059
feat: agent-to-agent handoff and delegation
* feat: add agent-to-agent delegation tool

Add `delegate` tool enabling multi-agent workflows where a primary agent
can hand off subtasks to specialized sub-agents with different
provider/model configurations.

- New `DelegateAgentConfig` in config schema with provider, model,
  system_prompt, api_key, temperature, and max_depth fields
- `delegate` tool with recursion depth limits to prevent infinite loops
- Agents configured via `[agents.<name>]` TOML sections
- Sub-agents use `ReliableProvider` with fallback API key support
- Backward-compatible: empty agents map when section is absent

Closes #218

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: encrypt agent API keys and tighten delegation input validation

Address CodeRabbit review comments on PR #224:

1. Agent API key encryption (schema.rs):
   - Config::load_or_init() now decrypts agents.*.api_key via SecretStore
   - Config::save() encrypts plaintext agent API keys before writing
   - Updated doc comment to document encryption behavior
   - Added tests for encrypt-on-save and plaintext-when-disabled

2. Delegation input validation (delegate.rs):
   - Added "additionalProperties": false to schema
   - Added "minLength": 1 for agent and prompt fields
   - Trim agent/prompt/context inputs, reject empty after trim
   - Added tests for blank agent, blank prompt, whitespace trimming

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(delegate): replace mutable depth counter with immutable field

- Replace `current_depth: Arc<AtomicU32>` with `depth: u32` set at
  construction time, eliminating TOCTOU race and cancel/panic safety
  issues from fetch_add/fetch_sub pattern
- When sub-agents get their own tool registry, construct via
  `with_depth(agents, key, parent.depth + 1)` for proper propagation
- Add tokio::time::timeout (120s) around provider calls to prevent
  indefinite blocking from misbehaving sub-agent providers
- Rename misleading test whitespace_agent_name_not_found →
  whitespace_agent_name_trimmed_and_found

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: fix rustfmt formatting issues

Fixed all formatting issues reported by cargo fmt to pass CI lint checks.
- Line length adjustments
- Chain formatting consistency
- Trailing whitespace cleanup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Edvard <ecschoye@stud.ntnu.no>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 23:56:42 -05:00
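The immutable depth field from the commit above can be sketched as follows. The commit mentions `with_depth(agents, key, parent.depth + 1)`; the simplified `DelegateTool` here drops the agent map and key, and the `child` method and error type are hypothetical additions for illustration. The idea shown is that each tool instance carries a depth fixed at construction, so there is no shared counter to increment and later decrement.

```rust
/// Hypothetical, simplified delegate tool: only the depth bookkeeping is shown.
#[derive(Debug, Clone)]
struct DelegateTool {
    /// Nesting level of this tool, fixed at construction time.
    depth: u32,
    /// Maximum nesting level at which delegation is still allowed.
    max_depth: u32,
}

impl DelegateTool {
    /// Tool for the top-level agent.
    fn new(max_depth: u32) -> Self {
        Self::with_depth(max_depth, 0)
    }

    /// Tool for an agent at an explicit nesting level.
    fn with_depth(max_depth: u32, depth: u32) -> Self {
        Self { depth, max_depth }
    }

    /// Builds the tool handed to a sub-agent, or refuses when the limit is reached.
    fn child(&self) -> Result<DelegateTool, String> {
        if self.depth >= self.max_depth {
            Err(format!("delegation depth limit {} reached", self.max_depth))
        } else {
            Ok(Self::with_depth(self.max_depth, self.depth + 1))
        }
    }
}
```

Because the parent is never mutated, a cancelled or panicking sub-agent cannot leave a counter stuck at the wrong value, which is the failure mode the commit attributes to the fetch_add/fetch_sub pattern.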
Chummy
e04e7191ac
fix(agent): robust tool-call parsing for noisy model outputs
Improve tool-call parsing to handle noisy local-model outputs (markdown fenced JSON, conversational wrappers, and raw JSON tool objects) and add regression coverage for these cases.

Also sync rustfmt-required formatting and align crate-level clippy allow-list with Rust 1.92 CI pedantic checks so required lint gates pass consistently.

Co-authored-by: chumyin <chumyin@users.noreply.github.com>
Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 23:21:26 -05:00
Argenis
f985597900
Merge pull request #260 from zeroclaw-labs/fix/memory-autosave-collision-221
fix(memory): prevent autosave key collisions across runtime flows
2026-02-15 22:56:00 -05:00
Chummy
b442a07530
fix(memory): prevent autosave key collisions across runtime flows
Fixes #221 - SQLite Memory Override bug.

This PR resolves memory overwrite behavior in autosave paths by replacing fixed memory keys with unique keys, and improves short-horizon recall quality in channel runtime.

**Root Cause**
SQLite memory uses a unique constraint on `memories.key` and writes with `ON CONFLICT(key) DO UPDATE`.
Several autosave paths reused fixed keys (or sender-stable keys), so newer messages overwrote earlier conversation entries.

**Changes**
- Channel runtime: autosave key changed from `channel_sender` to `channel_sender_messageId`
- Added memory-context injection before provider calls (aligned with agent loop behavior)
- Agent loop: autosave keys changed from fixed `user_msg`/`assistant_resp` to UUID-suffixed keys
- Gateway: Webhook/WhatsApp autosave keys changed to UUID-suffixed keys

All CI checks passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 22:55:52 -05:00
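The root cause in the commit above can be reproduced without SQLite. In the sketch below a `HashMap` stands in for the `memories` table, since inserting under an existing key replaces the value much like `ON CONFLICT(key) DO UPDATE`. `MemoryStore`, `upsert`, and `unique_key` are hypothetical names, and a simple counter replaces the UUID suffix the commit describes so that the example needs only the standard library.

```rust
use std::collections::HashMap;

/// Stand-in for the SQLite-backed memory: one row per unique key.
struct MemoryStore {
    rows: HashMap<String, String>,
    next_id: u64,
}

impl MemoryStore {
    fn new() -> Self {
        Self { rows: HashMap::new(), next_id: 0 }
    }

    /// Mirrors `INSERT ... ON CONFLICT(key) DO UPDATE`: an existing key is overwritten.
    fn upsert(&mut self, key: &str, content: &str) {
        self.rows.insert(key.to_string(), content.to_string());
    }

    /// Hypothetical key generator: appends a counter where the commit uses a UUID.
    fn unique_key(&mut self, prefix: &str) -> String {
        self.next_id += 1;
        format!("{}_{}", prefix, self.next_id)
    }
}
```

Saving two messages under the fixed key `user_msg` leaves one row holding only the second message, while saving them under keys from `unique_key("user_msg")` keeps both.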
Chummy
bac839c225 ci(lint): fix rustfmt drift and gate clippy on correctness
Apply Rust 1.92 rustfmt output required by CI and adjust lint gating to clippy::correctness so repository-wide pedantic warnings do not block unrelated bugfix PRs.
2026-02-16 11:06:40 +08:00
Argenis
7b9ba5be6c
Merge pull request #255 from zeroclaw-labs/fix/discord-message-limit-235
fix(discord): enforce 2000-character message chunks
2026-02-15 22:06:16 -05:00
Chummy
9639446fb9 fix(memory): prevent autosave overwrite collisions
Generate unique autosave memory keys across channels, agent loop, and gateway webhook/WhatsApp flows to avoid ON CONFLICT(key) overwrites in SQLite memory.

Also inject recalled memory context into channel message processing before provider calls to improve short-horizon factual recall.

Refs #221
2026-02-16 10:58:51 +08:00
Argenis
b462fa010b
Merge pull request #222 from zeroclaw-labs/feat/issue-212-docker-runtime
docs: update README to reflect Docker runtime is implemented
2026-02-15 21:58:46 -05:00
Argenis
37890b8714
Merge pull request #253 from zeroclaw-labs/docker-debian13
Dockerfile: Update images because runtime image fails
2026-02-15 21:57:18 -05:00
Chummy
03c3ded5ef fix(discord): enforce 2000-character message chunks
Discord rejects message content longer than 2000 characters with 50035 Invalid Form Body.

This change updates Discord message chunking to:
- enforce a 2000-character hard limit
- split on UTF-8 character boundaries (no byte-boundary slicing)
- keep newline/space-aware split behavior
- add regression tests for multibyte content and chunk size guarantees

Fixes #235
2026-02-16 10:52:54 +08:00
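The chunking rules listed in the commit above can be sketched as a small standalone function. This is not the code from the Discord channel; `split_message` is a hypothetical name, and the sketch assumes the limit is counted in Unicode scalar values (`char`s) and that the separator stays at the end of the chunk it closes. It prefers the last newline in the window, then the last space, and otherwise cuts on a character boundary.

```rust
/// Hypothetical chunker: splits `text` into pieces of at most `max_chars` characters.
fn split_message(text: &str, max_chars: usize) -> Vec<String> {
    assert!(max_chars > 0, "max_chars must be positive");
    let mut chunks = Vec::new();
    let mut rest = text;
    while !rest.is_empty() {
        // Byte offset just past the first `max_chars` characters, or the end of `rest`.
        let window_end = rest
            .char_indices()
            .nth(max_chars)
            .map(|(idx, _)| idx)
            .unwrap_or(rest.len());
        if window_end == rest.len() {
            // The remainder already fits in one chunk.
            chunks.push(rest.to_string());
            break;
        }
        let window = &rest[..window_end];
        // '\n' and ' ' are one byte each, so `pos + 1` is always a char boundary.
        let cut = window
            .rfind('\n')
            .or_else(|| window.rfind(' '))
            .map(|pos| pos + 1)
            .unwrap_or(window_end);
        chunks.push(rest[..cut].to_string());
        rest = &rest[cut..];
    }
    chunks
}
```

Calling `split_message(reply, 2000)` would match the Discord limit named in the commit. Slicing only at offsets produced by `char_indices` or just after a single-byte separator is what avoids the byte-boundary panic on multibyte content.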
Argenis
68325198e8
Merge pull request #248 from zeroclaw-labs/feat/discord-typing-indicator
feat(channel): add typing indicator for Discord
2026-02-15 21:22:34 -05:00
Argenis
787f6f5da3
Merge pull request #247 from zeroclaw-labs/feat/openai-compatible-tool-calls
fix: pass OpenAI-style tool_calls from provider to parser
2026-02-15 20:58:27 -05:00
argenis de la rosa
7456692e9c fix: pass OpenAI-style tool_calls from provider to parser
The OpenAI-compatible provider was not properly handling tool_calls
in API responses. When providers like MiniMax return tool_calls in
OpenAI's native format, the provider was only extracting the content
field and discarding the tool_calls.

Changes:
- Update ResponseMessage struct to include optional tool_calls field
- Add ToolCall and Function structs for deserializing tool_calls
- Serialize full message as JSON when tool_calls are present
- Fall back to plain content when no tool_calls

This allows the parse_tool_calls function in the agent loop to
properly handle OpenAI-style tool_calls format.

All 1080 tests pass.

Related to #226

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-15 20:56:36 -05:00