feat(opencode): one-task-per-run model + 9 routing fixes (ADRs 13-21)

Captures the design grilling outcome. Adds ADRs 13-21 covering: - run-level plan_rework_remaining counter to bound P3<->P5.5/P7/P8 thrash - non-resumable workflow with throwaway-worktree recovery procedure - @simplify advisory at every gate (not just Phase 8) - Phase 8 fix specs go to disk as task-fix-N.md (preserves ADR-6) - Phase 5.5 BLOCK protocol: orchestrator edits plan, decrements counter, re-enters P4 - Phase 8 NOT_TESTABLE manifest in reviewer prompt - unified Implementation Incomplete diagnosis (test_design / production_logic / split_needed) - Phase 1 working-tree cleanliness + depends-on enforcement - one-task-per-run pivot: Phase 5 still splits N tasks, only task-1 runs; tasks 2..N filed as sub-issues with rich seed bodies; split_needed at P7 aborts to Failure Handler (one-task-per-run = no salvageable prior work) Auto-resolves big-diff Phase 8 reviews, cross-task regression-within-run, and mid-flight task-split routing. Rewrites routing matrix and three Mermaid diagrams; updates @pm (depends-on frontmatter, split-time filing), @check (third diagnosis verdict), @make (escalate: split_needed flag).
2026-05-08 13:02:54 +02:00 · 2026-05-08 13:02:54 +02:00 · af6481a5a7
commit af6481a5a7
parent 0b15944d1c
5 changed files with 342 additions and 130 deletions
--- a/config/opencode/agents/check.md
+++ b/config/opencode/agents/check.md
@ -145,6 +145,14 @@ High-level:
 - No excessive mocking (>2 mocks is a yellow flag)?
 - Diagnose issues and report findings. Do NOT edit test files — the caller routes fixes back to `@test`.

+**When diagnosing `Implementation Incomplete` from `@make`** (the `/workflow` Phase 7 unified diagnosis path, per ADR-19): you receive `@make`'s self-diagnosis hint (`escalate: test_design`, `escalate: split_needed`, or no flag), the test files, the in-progress production diff, and the task spec. Return one of three verdicts in your output:
+
+- **`test_design`** — the test demands production code that's impossible, internally-inconsistent, or testing the wrong observable. The fix is in the tests. (Caller routes to `@test` for redesign.)
+- **`production_logic`** — the test is sound; `@make`'s implementation is wrong or incomplete. The fix is in the production code. (Caller re-dispatches `@make` with your notes.)
+- **`split_needed`** — the task itself is over-scoped: no realistic implementation can satisfy the AC within the task's stated files-to-modify. Either the AC require touching files not listed, or the AC mix multiple concerns that should have been split at Phase 5 (per the workflow's Split Heuristic). (Caller aborts to the Failure Handler; the user re-plans from scratch.)
+
+State the verdict explicitly — e.g. "Diagnosis: `split_needed` — the AC implies modifying both `src/foo.rs` and the EventLoop registration in `src/main.rs`, but the task spec lists only `src/foo.rs`. This is a Phase 5 split error, not a code or test error." Calibrate confidence honestly: `split_needed` is the heaviest verdict (it kills the run); reserve it for cases where neither test redesign nor code-fix would plausibly converge.
+
 **When reviewing NOT_TESTABLE verdicts:**
 - Does the reason match an allowed category (config-only, external-system, non-deterministic, pure-wiring)?
 - Was a test approach genuinely attempted?