feat(opencode): extend Phase 7 escalation to mid-implementation test-design errors

Phase 7's escalation rule was gated on @make flagging concerns "during
entry validation" only. When @make got past entry validation, started
implementing, and ground for 2-3 attempts because the test demanded
impossible production code, the orchestrator had no documented route
— it would re-dispatch @make with marginal context tweaks instead of
recognizing the failure as test-architecture.

Splits the escalation into two clearly-named paths (entry-validation
vs mid-implementation) that both route through @check (test diagnosis)
→ @test (redesign) → fresh @make. Bounded at max 2 escalation cycles
before reverting to a Phase 3 plan revisit, to prevent thrashing when
the actual problem is upstream.

@make.md gains a new Iteration Limits red-flag class — "Test-design
suspicion" — instructing @make to stop and report with an explicit
`escalate: test_design` flag in the Blocking Issue section. The flag
is the routing signal the orchestrator switches on.
This commit is contained in:
Harald Hoyer 2026-05-08 10:13:29 +02:00
parent aac4d44a49
commit 534361f1b5
2 changed files with 16 additions and 7 deletions

View file

@ -406,12 +406,20 @@ Do **not** quote the task spec inline.
4. Refactor while keeping green
5. Report RED→GREEN evidence
**Escalation:** If `@make` flags test quality concerns during entry validation:
1. `@make` reports the issue to caller
2. Caller routes to `@check` for diagnosis
3. `@check` reports findings
4. Caller routes to `@test` for fixes
5. Fixed tests return to `@make`
**Escalation — two paths route through `@check``@test` → back to `@make`:**
1. **Entry-validation escalation.** Before implementing, `@make`'s entry check (run tests, verify RED, compare against handoff) reveals test-quality concerns — wrong assertion target, mixed failure codes, mocks of internal boundaries, etc. `@make` reports without writing any production code.
2. **Mid-implementation escalation.** After implementing, `@make` hits its iteration limit (23 attempts) because the test demands production code that's impossible or contradicts the spec. `@make` returns `Implementation Incomplete` with the flag `escalate: test_design`. **Do not** re-dispatch `@make` with marginal context tweaks — that just burns cycles on a test that needs redesign, not better implementation.
In both cases:
1. `@make` returns its report (entry-time concern or mid-impl `escalate: test_design`).
2. Orchestrator routes the report to `@check` for diagnosis (light review of the *tests*, not the implementation).
3. `@check` confirms or rejects the test-design suspicion.
4. **If confirmed:** orchestrator routes to `@test` to redesign the tests. Apply Dispatch Hygiene. Fixed tests return to `@make` for fresh entry validation and a clean implementation attempt.
5. **If rejected:** the issue is in the production code; orchestrator re-dispatches `@make` with `@check`'s diagnostic notes attached.
**Iteration limit on this loop: max 2 cycles.** If a test-design suspicion keeps surfacing but `@check` never confirms it, the design problem is upstream — revisit the Phase 3 plan rather than thrashing between `@test` and `@make`.
For NOT_TESTABLE tasks, `@make` runs in standard mode.