refactor(opencode): allow @test inside #[cfg(test)] mod blocks, drop file gate

The previous design routed Rust unit tests to NOT_TESTABLE: Rust unit-only because @test was forbidden from touching src/, which forced @make to write both the production code and the inline #[cfg(test)] mod tests in one dispatch — losing TDD's RED→GREEN separation. But Rust module tests inside #[cfg(test)] mod tests { ... } are the canonical unit-testing idiom, not an edge case. @test's File Constraint now allows modifying src/**/*.rs, but strictly inside #[cfg(test)] mod <name> { ... } blocks. Every line outside such a block stays read-only — adding pub, importing crates, declaring siblings, or any other production change is forbidden. Integration tests at tests/**/*.rs continue to work as before. The Phase 6 post-step file gate (git status snapshot + comm -23 diff against test-pattern globs) is removed. With @test legitimately writing inside src/, a path-based gate proves nothing — production edits and cfg(test) edits live in the same files. The boundary is enforced by the prompt rule and Phase 8 reviewer scrutiny. Phase 5 test-file guidance updated to distinguish module vs integration tests for Rust, with stub-first TDD applying to both when symbols don't yet exist. The "Rust integration TDD: stub-first" section is renamed to "Rust stub-first TDD" and now covers module tests too. NOT_TESTABLE's "Rust unit-only" reason is replaced with "Missing testability seam" for cases where the production code needs a small change before tests can be authored.
2026-05-07 14:34:04 +02:00 · 2026-05-07 14:34:04 +02:00 · 4dc3cffba6
commit 4dc3cffba6
parent 8373e32f34
2 changed files with 28 additions and 67 deletions
--- a/config/opencode/agents/test.md
+++ b/config/opencode/agents/test.md
@ -113,29 +113,18 @@ Python:
 - `**/test_data/**`
 - `**/test_fixtures/**`
-Rust (integration tests only — see "Rust unit tests" below):
+Rust:
- `tests/**/*.rs` (crate-level integration tests directory)
+- **Integration tests:** `tests/**/*.rs` and `**/tests/**/*.rs` (workspace-style `<crate>/tests/...`). Create new files; do not modify existing integration tests in unrelated tasks.
- `**/tests/**/*.rs` (per-crate integration tests in workspace layouts)
+- **Module tests:** `src/**/*.rs` — but **only inside `#[cfg(test)] mod <name> { … }` blocks**. You may:
- `**/test_data/**`
+  - Append a new `#[cfg(test)] mod tests { use super::*; … }` block at the end of an existing source file.
- `**/test_fixtures/**`
+  - Add new `#[test] fn` items inside an already-existing `#[cfg(test)] mod` block.
  - Edit/remove `#[test] fn` items you previously authored inside such a block.
 - **Test data / fixtures:** `**/test_data/**`, `**/test_fixtures/**`.
-**Anti-patterns — refuse the path even if the glob above matches:**
+**Strict boundary rule for Rust module tests:** every line outside a `#[cfg(test)] mod` block is read-only. Adding `pub`, changing function signatures, importing crates, declaring new `pub mod` siblings, touching the prelude, or any other production-code edit is forbidden — those changes belong to `@make`. If the test cannot be written without such a change, report the missing seam to the caller and return `NOT_TESTABLE` (or, for a fresh public API, request a stub-first `@make` pre-pass).
 - Anything under `src/` (e.g. `src/tests/foo.rs`, `src/**/tests/...`). `src/tests/` is a regular module under `src/`; it would require declaring `mod tests;` in production code (`lib.rs` / `main.rs`) and creating `mod.rs`, which you cannot do. If the caller asks for such a path, treat it as a wrong task spec: return `BLOCKED` with a note that the path is not a valid Rust test location, suggesting `tests/<feature>.rs` (or `NOT_TESTABLE: Rust unit-only` if the test really needs to be in-source).
-**You may NOT modify production/source code under any circumstances.**
+**Anti-patterns — refuse the path even if it would technically be writable:**
-
+- `src/tests/foo.rs` and similar regular submodule paths under `src/`. These are not `#[cfg(test)]` modules — they are normal modules that would require a `mod tests;` declaration in production code (`lib.rs` / `main.rs`), which you cannot add. Report as `BLOCKED` and suggest either `tests/<feature>.rs` (integration) or a `#[cfg(test)] mod tests` block inside the relevant `src/<module>.rs`.
 ### Rust unit tests
 Rust unit tests live inside production source files (inside `#[cfg(test)] mod tests { ... }` blocks in `src/**/*.rs`). Because that would require modifying production code, **you do not write Rust unit tests.** Options when the task spec requests unit-level coverage in Rust:
 1. Convert to an integration test under `tests/` if the unit is part of the public API.
 2. Return `NOT_TESTABLE` with reason `pure-wiring` or `external-system` if no integration-level seam exists, and let `@make` write the in-source tests.
 Report this constraint to the caller rather than silently degrading coverage.
 If you believe source code needs changes to be testable, report this to the caller — do not edit it yourself.
 This constraint is enforced by a post-step file gate. Violations cause your output to be discarded.
 ## Test Philosophy
@ -257,7 +246,7 @@ You may return `NOT_TESTABLE` only for these allowed reasons:
 | **External system without harness** | Change only affects API call to service with no local mock possible |
 | **Non-deterministic** | GPU numerical results, timing-dependent behavior |
 | **Pure wiring** | Decorator swap, import / `use` reorganization, no logic change |
-| **Rust unit-only** | Coverage requires `#[cfg(test)]` mod tests in production source; @test cannot write those — let @make handle it |
+| **Missing testability seam** | Test would require a production-code change beyond a `#[cfg(test)] mod` block (e.g. a private function needs `pub(crate)`, a refactor exposes a hook). Report the missing seam so `@make` can add it before tests are authored. |
 Must provide:
 - Which allowed reason applies
--- a/config/opencode/commands/workflow.md
+++ b/config/opencode/commands/workflow.md
@ -105,7 +105,7 @@ Do **not** provide:
 - Drop-in code blocks longer than ~5 lines that constitute "the answer"
 - Full function bodies for the changes being planned
 - Complete `match` arms / branch logic / loop bodies for new behavior
- Pre-written test bodies (those come from `@test`, or from `@make` for Rust unit-only)
+- Pre-written test bodies (those come from `@test`)
 - Stage-by-stage code transformations spelled out as ready-to-commit diffs
 If you've already written the implementation in the plan or spec, the artifact has overstepped. Convert finished code into structural description (signature + intent) and let `@make` produce the body.
@ -176,9 +176,10 @@ The test file path must follow the language's actual test layout. **Do not inven
  - Colocated: `<module>/tests/test_<feature>.py (create)`
  - Top-level: `tests/test_<feature>.py (create)`
 - **Rust**
-  - Crate-level integration tests: `tests/<feature>.rs (create)` (or, in a workspace, `<crate>/tests/<feature>.rs`).
+  - **Module tests** (most common — testing private/crate-internal functions): pick the relevant production source file, e.g. `src/<module>.rs`. `@test` is permitted to add or edit content **only inside `#[cfg(test)] mod <name> { … }` blocks** in that file (per `@test`'s File Constraint). The rest of the file remains read-only to `@test`.
-    - **If the test references not-yet-existing public API**, the task automatically requires a **stub-first `@make` pre-pass** before `@test` runs (see Phase 6 → "Rust integration TDD: stub-first"). Plan for two `@make` dispatches per such task: stub pass, then body pass.
+  - **Integration tests** (testing the crate's public API as a black box): `tests/<feature>.rs (create)`, or in a workspace `<crate>/tests/<feature>.rs`.
-  - **Unit-test-only tasks (in-source `#[cfg(test)] mod tests`):** mark the task as `NOT_TESTABLE` with reason `Rust unit-only` — `@test` cannot write inside production source. `@make` writes those inline as part of its production change.
+  - **In both cases**, if the test references not-yet-existing functions/types, the task requires a **stub-first `@make` pre-pass** so the symbols exist as `todo!()` bodies before `@test` runs. See Phase 6 → "Rust stub-first TDD". Plan for two `@make` dispatches per such task: stub pass, then body pass.
  - **`src/tests/<feature>.rs` is not a valid path** — it would be a regular submodule needing `mod tests;` in production code. Use one of the two forms above.
 - **Polyglot Nix flake**
  - Match the host language of the code under change (Python or Rust rules above), wrapping commands in `nix develop -c …` per the agents' devshell rule.
@ -232,73 +233,44 @@ For each task from Phase 5, dispatch `@test` with:
 `@test` writes failing tests and verifies RED with structured failure codes.
 **Post-step file gate (MANDATORY):**
 Before dispatching `@test`, snapshot every modified, staged, *and untracked* file. `git diff --name-only` alone misses untracked files, which is precisely the state of any new test file `@test` creates (it cannot `git add`). Use `git status --porcelain` so the gate sees them:
 ```bash
 git status --porcelain | sed 's/^...//' | sort -u > /tmp/pre_test_baseline.txt
 ```
 After `@test` completes, list NEW changes (in the post-snapshot but not the pre-snapshot):
 ```bash
 git status --porcelain | sed 's/^...//' | sort -u | comm -23 - /tmp/pre_test_baseline.txt > /tmp/test_new_files.txt
 ```
 Each line in `/tmp/test_new_files.txt` is a file path that did not exist (or was unmodified) before `@test` ran. The gate validates each one against the patterns below.
 All new files must match the project's test patterns:
 - Python: `**/test_*.py`, `**/*_test.py`, `**/conftest.py` (new only), `**/test_data/**`, `**/test_fixtures/**`
 - Rust: `tests/**/*.rs`, `**/tests/**/*.rs` (workspace-style `<crate>/tests/...`), `**/test_data/**`, `**/test_fixtures/**`
 **Anti-patterns — discard the output even if the glob matches:**
 - Anything under `src/` for Rust (e.g. `src/tests/foo.rs`, `src/**/tests/...`). `src/tests/` is a regular module path under `src/`, not a Rust test location, and `@test` cannot wire it up via `mod` declarations in production source. Such paths indicate the task spec gave a wrong test path — escalate, don't accept the file.
 If any non-matching file appears, or any anti-pattern matches: discard `@test` output, report violation.
 **Decision table — handling `@test` results:**
 | Condition | Action |
 |-----------|--------|
 | `TESTS_READY` + `escalate_to_check: false` | Proceed to Phase 7 |
 | `TESTS_READY` + `escalate_to_check: true` | Route tests to `@check` for light review. `@check` diagnoses, caller routes fixes to `@test`. Then proceed. |
-| `NOT_TESTABLE` (general reasons) | Route to `@check` for sign-off on justification. If approved, task goes to `@make` without tests. |
+| `NOT_TESTABLE` | Route to `@check` for sign-off on justification. If `Missing testability seam`, dispatch `@make` to add the seam first, then re-run `@test`. Otherwise the task goes to `@make` without tests. |
 | `NOT_TESTABLE` reason `Rust unit-only` | See "Rust unit-only routing" below. **Do not** include test code in the `@make` spec; pass test specs only. |
 | `BLOCKED` | Investigate. May need to revise task spec or plan. |
 | Test passes immediately | Investigate — behavior may already exist. Task spec may be wrong. |
 | Stub-first run: tests pass with zero `todo!()` panics | **Structural-only tests.** Every test is asserting type/struct/enum facts without calling any stubbed symbol. Reject the test output and route back to `@test` with a "must exercise the stubbed symbols by calling them" note. Do not let these tests gate Phase 7 — they cannot RED→GREEN, so the body-pass `@make` would commit code with false-green coverage. |
-### Rust unit-only routing
+### Rust stub-first TDD (mandatory for new symbols)
-When `@test` returns `NOT_TESTABLE: Rust unit-only` (the implementation needs in-source `#[cfg(test)] mod tests` blocks that `@test` is forbidden from writing), the orchestrator must:
+Whenever `@test` will write tests (module or integration) that reference functions / methods / types **that do not yet exist**, the test cannot RED meaningfully against absent code:
-1. Get `@check`'s sign-off on the justification (no integration-test seam exists).
+- *Module tests inside `src/<module>.rs`* — without the function, the `#[cfg(test)] mod tests` block fails to compile (`error[E0425]`), masking assertion diagnostics.
-2. Build the `@make` spec with **test specifications**, not test code:
+- *Integration tests inside `tests/<feature>.rs`* — same, but mediated through `lib.rs` re-exports.
   - "Add `#[cfg(test)] mod foo_tests` at the bottom of `<file>` exercising:"
   - For each behavior, a one-line description: input → expected output, edge case to cover, error path to assert.
   - Where applicable, name the function under test and the assertion type (`assert_eq!`, `assert!`, panic on invalid input).
 3. **Forbidden in the `@make` spec:** complete `#[test] fn …` bodies, full module blocks, or any `@test`-style RED-verified test code. `@make` writes the inline tests itself based on the spec.
 4. After `@make` completes, the orchestrator runs the test suite once to confirm RED→GREEN evidence and includes it in the workflow summary.
-This keeps the agents in their lanes: `@test` never writes inside `src/`, `@make` writes both the tests and the production code in a single coherent change, and the orchestrator sees explicit test pass evidence.
+To get a clean runtime RED, dispatch a **stub-first `@make` pass** *before* `@test` runs:
 ### Rust integration TDD: stub-first (mandatory)
 Rust integration tests live in a separate test crate (`tests/<feature>.rs`) that imports from `lib.rs`. Any test referencing not-yet-existing public API can only RED at *build* time, which masks assertion diagnostics. To avoid this, **for every Rust task whose `@test` step writes an integration test against public API that does not yet exist**, dispatch a stub-first `@make` pass *before* `@test` runs:
 **Stub pass (split from Phase 7's body pass):**
 1. Dispatch `@make` in **standard mode** (no tests exist yet) with this exact scope:
-   - **Goal:** add the planned public API as `todo!()`-bodied stubs so the integration test will compile.
+   - **Goal:** add the planned API as `todo!()`-bodied stubs so the test will compile.
-   - **Files to modify:** `src/lib.rs` (add `pub mod …;` declarations) plus any new `src/<module>.rs` files containing the stub functions/structs.
+   - **Files to modify:** the relevant `src/<module>.rs` for module tests, or `src/lib.rs` plus any new `src/<module>.rs` for integration tests (the latter need `pub mod …;` declarations so the test crate can import).
-   - **Stubs only:** every function body is exactly `todo!()`. Every method body is exactly `todo!()`. Public structs may use `pub struct Foo;` or `pub struct Foo { /* fields TBD */ }` — but no logic.
+   - **Stubs only:** every function body is exactly `todo!()`. Every method body is exactly `todo!()`. Structs may use `pub struct Foo;` or `pub struct Foo { /* fields TBD */ }` — but no logic.
-   - **Signatures must match the planned final API exactly** (return types, lifetimes, generics) — otherwise the integration test will mismatch later. Lift signatures from the Phase 3 plan / Phase 5 task spec.
+   - **Signatures must match the planned final API exactly** (return types, lifetimes, generics, visibility). Lift signatures from the Phase 3 plan / Phase 5 task spec.
   - **Acceptance criteria:** `cargo check` (wrapped in `nix develop -c …` if the project has a devshell) passes; no test command is run.
   - **Dispatch Hygiene still applies:** the stub pass is small and finalized — no draft bodies, no contradictory signatures.
 2. Verify `cargo check` passed in `@make`'s output. If not, fix and re-dispatch the stub pass before continuing.
-3. Dispatch `@test` as normal. The integration test now compiles; running it panics on `todo!()` at runtime, which is a clean `MISSING_BEHAVIOR` RED with a stack trace — far better than the build-error-RED form.
+3. Dispatch `@test`. The test now compiles; running it panics on `todo!()` at runtime, which is a clean `MISSING_BEHAVIOR` RED with a stack trace — far better than the build-error-RED form.
 4. **Panic-coverage check (MANDATORY).** After `@test` returns, re-run the test command in the orchestrator and verify that **every test in the new file panics on `todo!()`** (i.e. every test exercises at least one of the stubbed symbols). The decision rule:
   - If the test output shows N panics for N tests → proceed to body pass.
   - If any test passes without a `todo!()` panic → that test is structural-only (asserting type / variant-count / field facts without calling the stubbed code). **Reject** `@test`'s output and route back with the "Stub-first run: tests pass with zero `todo!()` panics" decision-table verdict. Require `@test` to rewrite each non-panicking test so it actually invokes the stubbed function/method.
   - This check is the only thing standing between false-green coverage and the body-pass commit. Skipping it has produced regressions like a system that compiles, "passes" tests, and silently no-ops in production.
 5. Continue to Phase 7's body pass (`@make` in TDD mode), where the same files are revisited and the `todo!()` bodies are replaced.
-**This routing is mandatory** for new public API in Rust. It is **not** required when the integration test exercises an existing public API (e.g. a behavior fix where the function already exists) — in that case `@test` runs directly and `@make` modifies the body in Phase 7.
+**This routing is mandatory** whenever new symbols are introduced in Rust (module or integration). It is **not** required when the test exercises an *existing* function/method (e.g. a behavior fix) — in that case `@test` runs directly and `@make` modifies the body in Phase 7.
 The stub pass and the body pass each produce their own atomic commit (per Phase 9 rules): `feat(<scope>): scaffold <thing> with todo!() stubs` followed by `feat(<scope>): implement <thing>` (or whichever conventional type fits).