Real-world observation: @make struggles when @test sets up tests
incorrectly because the existing `escalate: test_design` trigger is
described abstractly ("test seems to demand wrong thing"). When @make
sees an unfamiliar smell, it tends to attempt implementation, fail,
attempt again, and only escalate after burning 2-3 cycles. The
protocol exists; the recognition criteria don't.
Restructure Entry Validation step 5 into a named "Test triage" step
with a concrete checklist that fires *before* any implementation
attempt. Four categories of smells:
- **Mocking smells:** mocks the SUT, >2 mocks, mock-call-as-primary
assertion, internal-boundary mocking
- **Structural-only smells:** variant counts, type ascriptions,
function-pointer coercion, struct-literal-with-field-reads,
stub-first no-panics (mirrors @test.md's anti-patterns)
- **Wrong-target smells:** asserts on private state / log strings,
demands contradicting spec, physically impossible demands
- **Setup smells:** fixtures bypassing production validation,
wrong-module imports, references to nonexistent infrastructure
Iteration Limits step 5 now cross-references the same checklist
instead of restating abstract criteria, so both gates apply the same
recognition rules with a single source of truth.
A "NOT for" caveat prevents over-eager escalation: when the test is
fine but the implementation is just hard, that's not a smell, that's
the test doing its job.
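
The triage step can be pictured as a lookup that runs before any
implementation attempt. This is only a minimal sketch, not the actual
@make.md wording: the four category names come from the checklist
above, while the short smell identifiers, `TriageResult`, and `triage`
are hypothetical names made up for illustration.

```python
from dataclasses import dataclass

# Categories follow the checklist; the per-smell identifiers are
# hypothetical labels invented for this sketch.
CHECKLIST = {
    "mocking": {
        "mocks_sut",
        "more_than_two_mocks",
        "mock_call_as_primary_assertion",
        "internal_boundary_mocking",
    },
    "structural_only": {
        "variant_count",
        "type_ascription",
        "function_pointer_coercion",
        "struct_literal_with_field_reads",
        "stub_first_no_panics",
    },
    "wrong_target": {
        "asserts_private_state",
        "asserts_log_strings",
        "contradicts_spec",
        "physically_impossible",
    },
    "setup": {
        "fixture_bypasses_validation",
        "wrong_module_import",
        "nonexistent_infrastructure",
    },
}


@dataclass
class TriageResult:
    escalate: bool
    categories: list[str]


def triage(observed_smells: set[str]) -> TriageResult:
    """Escalate only when an observation matches a known smell."""
    hits = sorted(
        category
        for category, smells in CHECKLIST.items()
        if smells & observed_smells
    )
    return TriageResult(escalate=bool(hits), categories=hits)
```

Under these assumptions, an observation such as
"implementation_is_hard" matches no category, so `triage` does not
escalate, which is the behaviour the "NOT for" caveat asks for.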
The checklist is inlined (not pulled from @test.md at runtime) because
subagents have separate contexts. Periodic manual sync between
@make.md's checklist and @test.md's anti-patterns is acceptable —
they shouldn't drift much in practice.
Refs: config/opencode/agents/test.md (anti-patterns + structural-only
list it mirrors), config/opencode/workflow-design.md ADR-19 (unified
Implementation Incomplete diagnosis path)
@make, @test, and @check often need to inspect dependency source (trait
definitions, impl details, test patterns) to inform implementation or
verify findings. Opencode applies a CWD check on tool access, so every
read outside the worktree previously triggered a permission prompt.
- Add permission.read/grep/glob path allowlists for the three locations
where cargo deps live: ~/.cargo/registry/src/, ~/.cargo/git/checkouts/,
and /nix/store/*-vendor-*/ (for crane / buildRustPackage projects).
- Document the discovery pattern in each agent: `cargo metadata
--format-version 1` returns absolute paths via packages[].manifest_path.
- Cross-reference the registry paths from the permission.bash allowlist
comment so future readers see the bash inspection commands (rg/ls)
intentionally accept paths outside CWD.
- @check gets its first permission block (was tools-only before).
Path-pattern syntax for read/grep/glob isn't fully documented; if
opencode rejects it, fall back to `permission: { external_directory:
allow }` at the project config level.
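
The discovery pattern mentioned above can be sketched as follows. The
`packages[].manifest_path`, `name`, and `version` fields are part of
the `cargo metadata --format-version 1` output; the helper names
`dependency_source_dirs` and `load_cargo_metadata` are hypothetical and
not taken from the agent files.

```python
import json
import subprocess
from pathlib import Path


def dependency_source_dirs(metadata_json: str) -> dict[str, Path]:
    """Map "name version" to the directory holding that crate's source."""
    metadata = json.loads(metadata_json)
    return {
        # Key on name and version because one crate can appear at
        # several versions in the same dependency graph.
        f'{pkg["name"]} {pkg["version"]}': Path(pkg["manifest_path"]).parent
        for pkg in metadata["packages"]
    }


def load_cargo_metadata(cwd: str = ".") -> str:
    """Run cargo metadata in cwd and return its JSON output as text."""
    result = subprocess.run(
        ["cargo", "metadata", "--format-version", "1"],
        cwd=cwd,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling `dependency_source_dirs(load_cargo_metadata())` inside a cargo
project should yield absolute directories that fall under one of the
allowlisted locations, which is what lets the agents read them without
a prompt.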
Phase 7's escalation rule was gated on @make flagging concerns "during
entry validation" only. When @make got past entry validation, started
implementing, and ground through 2-3 attempts because the test demanded
impossible production code, the orchestrator had no documented route:
it would re-dispatch @make with marginal context tweaks instead of
recognizing the failure as a test-architecture problem.
Splits the escalation into two clearly-named paths (entry-validation
vs mid-implementation) that both route through @check (test diagnosis)
→ @test (redesign) → fresh @make. Bounded at max 2 escalation cycles
before reverting to a Phase 3 plan revisit, to prevent thrashing when
the actual problem is upstream.
@make.md gains a new Iteration Limits red-flag class — "Test-design
suspicion" — instructing @make to stop and report with an explicit
`escalate: test_design` flag in the Blocking Issue section. The flag
is the routing signal the orchestrator switches on.
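
The routing described above can be sketched as a small decision
function. This is a hedged illustration rather than the orchestrator's
actual logic: the report is assumed to be a dict with an optional
`escalate` key, and `next_steps`, `"default_handling"`, and
`"phase_3_plan_revisit"` are hypothetical names.

```python
MAX_ESCALATION_CYCLES = 2  # bound stated in the commit message


def next_steps(blocking_issue: dict, cycles_used: int) -> list[str]:
    """Decide which agents to dispatch after @make reports a blocker."""
    if blocking_issue.get("escalate") != "test_design":
        # No test-design flag: fall through to whatever the
        # orchestrator normally does (placeholder label).
        return ["default_handling"]
    if cycles_used >= MAX_ESCALATION_CYCLES:
        # The real problem is probably upstream of the test.
        return ["phase_3_plan_revisit"]
    # Test diagnosis, then redesign, then a fresh implementer.
    return ["@check", "@test", "@make"]
```

Both the entry-validation and mid-implementation paths would call the
same function, so the cycle bound applies regardless of when the flag
was raised.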
Both agents previously hardcoded the Python/uv toolchain. They now
detect the language from marker files (pyproject.toml, Cargo.toml,
flake.nix) and run the appropriate test/lint/format/type-check commands
for Python, Rust, or both. When a flake.nix devshell is present, every
toolchain command is wrapped in `nix develop -c …`.
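
A minimal sketch of the marker-file detection, under stated
assumptions: the concrete test commands (`uv run pytest`, `cargo test`)
are illustrative guesses rather than the exact commands in the agent
files, the presence of flake.nix is treated as sufficient evidence of a
devshell, and the function names are hypothetical.

```python
import os

# Marker file -> language, as listed in the commit message.
MARKERS = {"pyproject.toml": "python", "Cargo.toml": "rust"}

# Assumed test commands; the real agent files may use different ones.
TEST_COMMANDS = {
    "python": ["uv", "run", "pytest"],
    "rust": ["cargo", "test"],
}


def plan_test_commands(filenames) -> list[list[str]]:
    """Return one test command per detected language."""
    present = set(filenames)
    # Simplification: any flake.nix is assumed to provide a devshell.
    use_devshell = "flake.nix" in present
    commands = []
    for marker, language in MARKERS.items():
        if marker not in present:
            continue
        command = list(TEST_COMMANDS[language])
        if use_devshell:
            command = ["nix", "develop", "-c"] + command
        commands.append(command)
    return commands


def plan_for_directory(path: str = ".") -> list[list[str]]:
    """Detect toolchains from the files at the top level of path."""
    return plan_test_commands(os.listdir(path))
```

A mixed repository with both marker files gets both commands, which
corresponds to the "Python, Rust, or both" case.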
@make's permission allowlist gains `cargo *` and `nix develop -c *`,
plus matching denies for cargo add/remove/install/publish. The
Verification Tiers and Baseline Verification sections are rewritten as
per-language bullets, and output/TDD-evidence examples are now
language-neutral. The "no Kubernetes deployments" constraint is
generalised to cover any deploy/publish action.
@test gains the same devshell + cargo allows (scoped to test, check,
clippy, fmt only — no build/run/install). Its file constraint adds
`tests/**/*.rs` for Rust integration tests, with an explicit note that
Rust unit tests stay with @make because they live inside production
source files. Failure-classification hints add Rust compiler-error
mappings, and the NOT_TESTABLE table gets a "Rust unit-only" row.
Adds @check, @simplify, @test, @make, @pm subagents and the /workflow
and /review slash commands from the autonomous multi-agent workflow
gist by ppries.
@pm is rewritten to manage issues in a local ./TODO.md file instead of
Linear (file-only access, documented schema, structured JSON output).
/workflow is adapted: TODO.md-based issue context, generic worktree
paths (no hardcoded ~/repos/veo/sunstone), generic branch examples,
and a Phase 1 guard that verifies origin is on GitHub before any
work begins.
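
The Phase 1 guard could look roughly like the sketch below. It assumes
the check is done by matching the URL reported by
`git remote get-url origin` against the common github.com forms; the
actual /workflow command may verify this differently, and
`is_github_remote` and `origin_url` are hypothetical names.

```python
import re
import subprocess

# Accepts https, scp-style ssh, and ssh:// URLs pointing at github.com.
_GITHUB_REMOTE = re.compile(
    r"^(?:https://github\.com/|git@github\.com:|ssh://git@github\.com/)"
    r"[^/\s]+/[^/\s]+?(?:\.git)?/?$"
)


def is_github_remote(url: str) -> bool:
    """Return True when url looks like an owner/repo remote on GitHub."""
    return bool(_GITHUB_REMOTE.match(url.strip()))


def origin_url(repo_dir: str = ".") -> str:
    """Return the URL configured for the origin remote."""
    result = subprocess.run(
        ["git", "remote", "get-url", "origin"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()
```

A guard built on this would stop before any work begins when
`is_github_remote(origin_url())` is false.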