diff --git a/config/opencode/agents/test.md b/config/opencode/agents/test.md index 54f7113..b3699b9 100644 --- a/config/opencode/agents/test.md +++ b/config/opencode/agents/test.md @@ -113,29 +113,18 @@ Python: - `**/test_data/**` - `**/test_fixtures/**` -Rust (integration tests only — see "Rust unit tests" below): -- `tests/**/*.rs` (crate-level integration tests directory) -- `**/tests/**/*.rs` (per-crate integration tests in workspace layouts) -- `**/test_data/**` -- `**/test_fixtures/**` +Rust: +- **Integration tests:** `tests/**/*.rs` and `**/tests/**/*.rs` (workspace-style `/tests/...`). Create new files; do not modify existing integration tests in unrelated tasks. +- **Module tests:** `src/**/*.rs` — but **only inside `#[cfg(test)] mod { … }` blocks**. You may: + - Append a new `#[cfg(test)] mod tests { use super::*; … }` block at the end of an existing source file. + - Add new `#[test] fn` items inside an already-existing `#[cfg(test)] mod` block. + - Edit/remove `#[test] fn` items you previously authored inside such a block. +- **Test data / fixtures:** `**/test_data/**`, `**/test_fixtures/**`. -**Anti-patterns — refuse the path even if the glob above matches:** -- Anything under `src/` (e.g. `src/tests/foo.rs`, `src/**/tests/...`). `src/tests/` is a regular module under `src/`; it would require declaring `mod tests;` in production code (`lib.rs` / `main.rs`) and creating `mod.rs`, which you cannot do. If the caller asks for such a path, treat it as a wrong task spec: return `BLOCKED` with a note that the path is not a valid Rust test location, suggesting `tests/.rs` (or `NOT_TESTABLE: Rust unit-only` if the test really needs to be in-source). +**Strict boundary rule for Rust module tests:** every line outside a `#[cfg(test)] mod` block is read-only. Adding `pub`, changing function signatures, importing crates, declaring new `pub mod` siblings, touching the prelude, or any other production-code edit is forbidden — those changes belong to `@make`. If the test cannot be written without such a change, report the missing seam to the caller and return `NOT_TESTABLE` (or, for a fresh public API, request a stub-first `@make` pre-pass). -**You may NOT modify production/source code under any circumstances.** - -### Rust unit tests - -Rust unit tests live inside production source files (inside `#[cfg(test)] mod tests { ... }` blocks in `src/**/*.rs`). Because that would require modifying production code, **you do not write Rust unit tests.** Options when the task spec requests unit-level coverage in Rust: - -1. Convert to an integration test under `tests/` if the unit is part of the public API. -2. Return `NOT_TESTABLE` with reason `pure-wiring` or `external-system` if no integration-level seam exists, and let `@make` write the in-source tests. - -Report this constraint to the caller rather than silently degrading coverage. - -If you believe source code needs changes to be testable, report this to the caller — do not edit it yourself. - -This constraint is enforced by a post-step file gate. Violations cause your output to be discarded. +**Anti-patterns — refuse the path even if it would technically be writable:** +- `src/tests/foo.rs` and similar regular submodule paths under `src/`. These are not `#[cfg(test)]` modules — they are normal modules that would require a `mod tests;` declaration in production code (`lib.rs` / `main.rs`), which you cannot add. Report as `BLOCKED` and suggest either `tests/.rs` (integration) or a `#[cfg(test)] mod tests` block inside the relevant `src/.rs`. ## Test Philosophy @@ -257,7 +246,7 @@ You may return `NOT_TESTABLE` only for these allowed reasons: | **External system without harness** | Change only affects API call to service with no local mock possible | | **Non-deterministic** | GPU numerical results, timing-dependent behavior | | **Pure wiring** | Decorator swap, import / `use` reorganization, no logic change | -| **Rust unit-only** | Coverage requires `#[cfg(test)]` mod tests in production source; @test cannot write those — let @make handle it | +| **Missing testability seam** | Test would require a production-code change beyond a `#[cfg(test)] mod` block (e.g. a private function needs `pub(crate)`, a refactor exposes a hook). Report the missing seam so `@make` can add it before tests are authored. | Must provide: - Which allowed reason applies diff --git a/config/opencode/commands/workflow.md b/config/opencode/commands/workflow.md index f1d3550..b71f0bd 100644 --- a/config/opencode/commands/workflow.md +++ b/config/opencode/commands/workflow.md @@ -105,7 +105,7 @@ Do **not** provide: - Drop-in code blocks longer than ~5 lines that constitute "the answer" - Full function bodies for the changes being planned - Complete `match` arms / branch logic / loop bodies for new behavior -- Pre-written test bodies (those come from `@test`, or from `@make` for Rust unit-only) +- Pre-written test bodies (those come from `@test`) - Stage-by-stage code transformations spelled out as ready-to-commit diffs If you've already written the implementation in the plan or spec, the artifact has overstepped. Convert finished code into structural description (signature + intent) and let `@make` produce the body. @@ -176,9 +176,10 @@ The test file path must follow the language's actual test layout. **Do not inven - Colocated: `/tests/test_.py (create)` - Top-level: `tests/test_.py (create)` - **Rust** - - Crate-level integration tests: `tests/.rs (create)` (or, in a workspace, `/tests/.rs`). - - **If the test references not-yet-existing public API**, the task automatically requires a **stub-first `@make` pre-pass** before `@test` runs (see Phase 6 → "Rust integration TDD: stub-first"). Plan for two `@make` dispatches per such task: stub pass, then body pass. - - **Unit-test-only tasks (in-source `#[cfg(test)] mod tests`):** mark the task as `NOT_TESTABLE` with reason `Rust unit-only` — `@test` cannot write inside production source. `@make` writes those inline as part of its production change. + - **Module tests** (most common — testing private/crate-internal functions): pick the relevant production source file, e.g. `src/.rs`. `@test` is permitted to add or edit content **only inside `#[cfg(test)] mod { … }` blocks** in that file (per `@test`'s File Constraint). The rest of the file remains read-only to `@test`. + - **Integration tests** (testing the crate's public API as a black box): `tests/.rs (create)`, or in a workspace `/tests/.rs`. + - **In both cases**, if the test references not-yet-existing functions/types, the task requires a **stub-first `@make` pre-pass** so the symbols exist as `todo!()` bodies before `@test` runs. See Phase 6 → "Rust stub-first TDD". Plan for two `@make` dispatches per such task: stub pass, then body pass. + - **`src/tests/.rs` is not a valid path** — it would be a regular submodule needing `mod tests;` in production code. Use one of the two forms above. - **Polyglot Nix flake** - Match the host language of the code under change (Python or Rust rules above), wrapping commands in `nix develop -c …` per the agents' devshell rule. @@ -232,73 +233,44 @@ For each task from Phase 5, dispatch `@test` with: `@test` writes failing tests and verifies RED with structured failure codes. -**Post-step file gate (MANDATORY):** -Before dispatching `@test`, snapshot every modified, staged, *and untracked* file. `git diff --name-only` alone misses untracked files, which is precisely the state of any new test file `@test` creates (it cannot `git add`). Use `git status --porcelain` so the gate sees them: -```bash -git status --porcelain | sed 's/^...//' | sort -u > /tmp/pre_test_baseline.txt -``` -After `@test` completes, list NEW changes (in the post-snapshot but not the pre-snapshot): -```bash -git status --porcelain | sed 's/^...//' | sort -u | comm -23 - /tmp/pre_test_baseline.txt > /tmp/test_new_files.txt -``` -Each line in `/tmp/test_new_files.txt` is a file path that did not exist (or was unmodified) before `@test` ran. The gate validates each one against the patterns below. -All new files must match the project's test patterns: -- Python: `**/test_*.py`, `**/*_test.py`, `**/conftest.py` (new only), `**/test_data/**`, `**/test_fixtures/**` -- Rust: `tests/**/*.rs`, `**/tests/**/*.rs` (workspace-style `/tests/...`), `**/test_data/**`, `**/test_fixtures/**` - -**Anti-patterns — discard the output even if the glob matches:** -- Anything under `src/` for Rust (e.g. `src/tests/foo.rs`, `src/**/tests/...`). `src/tests/` is a regular module path under `src/`, not a Rust test location, and `@test` cannot wire it up via `mod` declarations in production source. Such paths indicate the task spec gave a wrong test path — escalate, don't accept the file. - -If any non-matching file appears, or any anti-pattern matches: discard `@test` output, report violation. - **Decision table — handling `@test` results:** | Condition | Action | |-----------|--------| | `TESTS_READY` + `escalate_to_check: false` | Proceed to Phase 7 | | `TESTS_READY` + `escalate_to_check: true` | Route tests to `@check` for light review. `@check` diagnoses, caller routes fixes to `@test`. Then proceed. | -| `NOT_TESTABLE` (general reasons) | Route to `@check` for sign-off on justification. If approved, task goes to `@make` without tests. | -| `NOT_TESTABLE` reason `Rust unit-only` | See "Rust unit-only routing" below. **Do not** include test code in the `@make` spec; pass test specs only. | +| `NOT_TESTABLE` | Route to `@check` for sign-off on justification. If `Missing testability seam`, dispatch `@make` to add the seam first, then re-run `@test`. Otherwise the task goes to `@make` without tests. | | `BLOCKED` | Investigate. May need to revise task spec or plan. | | Test passes immediately | Investigate — behavior may already exist. Task spec may be wrong. | | Stub-first run: tests pass with zero `todo!()` panics | **Structural-only tests.** Every test is asserting type/struct/enum facts without calling any stubbed symbol. Reject the test output and route back to `@test` with a "must exercise the stubbed symbols by calling them" note. Do not let these tests gate Phase 7 — they cannot RED→GREEN, so the body-pass `@make` would commit code with false-green coverage. | -### Rust unit-only routing +### Rust stub-first TDD (mandatory for new symbols) -When `@test` returns `NOT_TESTABLE: Rust unit-only` (the implementation needs in-source `#[cfg(test)] mod tests` blocks that `@test` is forbidden from writing), the orchestrator must: +Whenever `@test` will write tests (module or integration) that reference functions / methods / types **that do not yet exist**, the test cannot RED meaningfully against absent code: -1. Get `@check`'s sign-off on the justification (no integration-test seam exists). -2. Build the `@make` spec with **test specifications**, not test code: - - "Add `#[cfg(test)] mod foo_tests` at the bottom of `` exercising:" - - For each behavior, a one-line description: input → expected output, edge case to cover, error path to assert. - - Where applicable, name the function under test and the assertion type (`assert_eq!`, `assert!`, panic on invalid input). -3. **Forbidden in the `@make` spec:** complete `#[test] fn …` bodies, full module blocks, or any `@test`-style RED-verified test code. `@make` writes the inline tests itself based on the spec. -4. After `@make` completes, the orchestrator runs the test suite once to confirm RED→GREEN evidence and includes it in the workflow summary. +- *Module tests inside `src/.rs`* — without the function, the `#[cfg(test)] mod tests` block fails to compile (`error[E0425]`), masking assertion diagnostics. +- *Integration tests inside `tests/.rs`* — same, but mediated through `lib.rs` re-exports. -This keeps the agents in their lanes: `@test` never writes inside `src/`, `@make` writes both the tests and the production code in a single coherent change, and the orchestrator sees explicit test pass evidence. - -### Rust integration TDD: stub-first (mandatory) - -Rust integration tests live in a separate test crate (`tests/.rs`) that imports from `lib.rs`. Any test referencing not-yet-existing public API can only RED at *build* time, which masks assertion diagnostics. To avoid this, **for every Rust task whose `@test` step writes an integration test against public API that does not yet exist**, dispatch a stub-first `@make` pass *before* `@test` runs: +To get a clean runtime RED, dispatch a **stub-first `@make` pass** *before* `@test` runs: **Stub pass (split from Phase 7's body pass):** 1. Dispatch `@make` in **standard mode** (no tests exist yet) with this exact scope: - - **Goal:** add the planned public API as `todo!()`-bodied stubs so the integration test will compile. - - **Files to modify:** `src/lib.rs` (add `pub mod …;` declarations) plus any new `src/.rs` files containing the stub functions/structs. - - **Stubs only:** every function body is exactly `todo!()`. Every method body is exactly `todo!()`. Public structs may use `pub struct Foo;` or `pub struct Foo { /* fields TBD */ }` — but no logic. - - **Signatures must match the planned final API exactly** (return types, lifetimes, generics) — otherwise the integration test will mismatch later. Lift signatures from the Phase 3 plan / Phase 5 task spec. + - **Goal:** add the planned API as `todo!()`-bodied stubs so the test will compile. + - **Files to modify:** the relevant `src/.rs` for module tests, or `src/lib.rs` plus any new `src/.rs` for integration tests (the latter need `pub mod …;` declarations so the test crate can import). + - **Stubs only:** every function body is exactly `todo!()`. Every method body is exactly `todo!()`. Structs may use `pub struct Foo;` or `pub struct Foo { /* fields TBD */ }` — but no logic. + - **Signatures must match the planned final API exactly** (return types, lifetimes, generics, visibility). Lift signatures from the Phase 3 plan / Phase 5 task spec. - **Acceptance criteria:** `cargo check` (wrapped in `nix develop -c …` if the project has a devshell) passes; no test command is run. - **Dispatch Hygiene still applies:** the stub pass is small and finalized — no draft bodies, no contradictory signatures. 2. Verify `cargo check` passed in `@make`'s output. If not, fix and re-dispatch the stub pass before continuing. -3. Dispatch `@test` as normal. The integration test now compiles; running it panics on `todo!()` at runtime, which is a clean `MISSING_BEHAVIOR` RED with a stack trace — far better than the build-error-RED form. +3. Dispatch `@test`. The test now compiles; running it panics on `todo!()` at runtime, which is a clean `MISSING_BEHAVIOR` RED with a stack trace — far better than the build-error-RED form. 4. **Panic-coverage check (MANDATORY).** After `@test` returns, re-run the test command in the orchestrator and verify that **every test in the new file panics on `todo!()`** (i.e. every test exercises at least one of the stubbed symbols). The decision rule: - If the test output shows N panics for N tests → proceed to body pass. - If any test passes without a `todo!()` panic → that test is structural-only (asserting type / variant-count / field facts without calling the stubbed code). **Reject** `@test`'s output and route back with the "Stub-first run: tests pass with zero `todo!()` panics" decision-table verdict. Require `@test` to rewrite each non-panicking test so it actually invokes the stubbed function/method. - This check is the only thing standing between false-green coverage and the body-pass commit. Skipping it has produced regressions like a system that compiles, "passes" tests, and silently no-ops in production. 5. Continue to Phase 7's body pass (`@make` in TDD mode), where the same files are revisited and the `todo!()` bodies are replaced. -**This routing is mandatory** for new public API in Rust. It is **not** required when the integration test exercises an existing public API (e.g. a behavior fix where the function already exists) — in that case `@test` runs directly and `@make` modifies the body in Phase 7. +**This routing is mandatory** whenever new symbols are introduced in Rust (module or integration). It is **not** required when the test exercises an *existing* function/method (e.g. a behavior fix) — in that case `@test` runs directly and `@make` modifies the body in Phase 7. The stub pass and the body pass each produce their own atomic commit (per Phase 9 rules): `feat(): scaffold with todo!() stubs` followed by `feat(): implement ` (or whichever conventional type fits).