Real-world observation: @make struggles when @test sets up tests
incorrectly, because the existing `escalate: test_design` trigger is
described only abstractly ("test seems to demand wrong thing"). When
@make hits an unfamiliar smell, it tends to attempt implementation,
fail, attempt again, and only escalate after burning 2-3 cycles. The
protocol exists; the recognition criteria don't.
Restructure Entry Validation step 5 into a named "Test triage" step
with a concrete checklist that fires *before* any implementation
attempt. Four categories of smells:
- **Mocking smells:** mocks the SUT, >2 mocks, mock-call-as-primary
assertion, internal-boundary mocking
- **Structural-only smells:** variant counts, type ascriptions,
function-pointer coercion, struct-literal-with-field-reads,
stub-first no-panics (mirrors @test.md's anti-patterns)
- **Wrong-target smells:** asserts on private state / log strings,
demands contradicting spec, physically impossible demands
- **Setup smells:** fixtures bypassing production validation,
wrong-module imports, references to nonexistent infrastructure
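To make the structural-only category concrete, here is a minimal Rust sketch (hypothetical `Invoice` and `total_cents` names, not from any actual codebase) contrasting a struct-literal-with-field-reads smell with a behavioral assertion on the same value:

```rust
// Hypothetical SUT for illustration only.
struct Invoice {
    subtotal_cents: u64,
    tax_cents: u64,
}

fn total_cents(inv: &Invoice) -> u64 {
    inv.subtotal_cents + inv.tax_cents
}

fn main() {
    let inv = Invoice { subtotal_cents: 1000, tax_cents: 80 };

    // Smell (structural-only): reads back a field the test itself just
    // wrote. This can only ever verify the language's struct semantics,
    // never the SUT, so it should fire the triage checklist.
    assert_eq!(inv.subtotal_cents, 1000);

    // Behavioral: exercises the SUT's contract. This is what the
    // checklist is meant to leave alone.
    assert_eq!(total_cents(&inv), 1080);
}
```

The distinguishing question the checklist encodes: does the assertion pass through any production logic, or only through state the test constructed itself?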
Iteration Limits step 5 now cross-references the same checklist
instead of restating abstract criteria, so both gates apply the same
recognition rules with a single source of truth.
A "NOT for" caveat prevents over-eager escalation: when the test is
fine but the implementation is just hard, that's not a smell; that's
the test doing its job.
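For contrast, a sketch of the "NOT for" case (hypothetical `percent_decode`, assumed purely for illustration): the assertions are behavioral and spec-driven, so none of the four smell categories apply, even though a fully correct implementation is nontrivial:

```rust
// Assumed SUT; getting every edge case right is genuinely hard, but the
// test below has no mocking, structural-only, wrong-target, or setup
// smells, so @make should keep implementing rather than escalate.
fn percent_decode(s: &str) -> Option<String> {
    let bytes = s.as_bytes();
    let mut out = Vec::new();
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            // Two hex digits must follow; anything else is invalid input.
            let hex = s.get(i + 1..i + 3)?;
            out.push(u8::from_str_radix(hex, 16).ok()?);
            i += 3;
        } else {
            out.push(bytes[i]);
            i += 1;
        }
    }
    String::from_utf8(out).ok()
}

fn main() {
    // Behavioral, public-API, spec-aligned: a hard-but-fair test.
    assert_eq!(percent_decode("a%20b").as_deref(), Some("a b"));
    assert_eq!(percent_decode("%zz"), None);
}
```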
The checklist is inlined (not pulled from @test.md at runtime) because
subagents have separate contexts. Periodic manual sync between
@make.md's checklist and @test.md's anti-patterns is acceptable —
they shouldn't drift much in practice.
Refs: config/opencode/agents/test.md (anti-patterns + structural-only
list it mirrors), config/opencode/workflow-design.md ADR-19 (unified
Implementation Incomplete diagnosis path)