The orchestrator was running `git add ./TODO/` and `git commit -m
chore(todo): ...` itself in Phase 9, baking filesystem-tracker
specifics into commands/workflow.md. The point of @pm as an
abstraction is that it should be swappable — a Linear-backed @pm or a
Notion-backed @pm should drop in without touching the workflow
command. With API-backed trackers, "commit the TODO updates" is a
no-op and `git add ./TODO/` is wrong.
Push persistence shape behind the @pm boundary:
- New @pm capability `Commit pending changes` accepts a commit message
and returns {ok, sha, message}. Filesystem @pm runs `git add ./TODO/`
+ `git commit -m <msg>` and returns the SHA. Tracker-backed
implementations no-op and return sha: null.
- @pm gains tightly-scoped bash access: `git add ./TODO/*`,
`git commit -m *`, `git status --porcelain ./TODO/*` only. Push,
reset, rebase, checkout, branch, tag are explicit denies. Everything
else falls through to the default deny.
- Phase 9 "Commit TODO Changes" replaces orchestrator-side git with a
@pm dispatch; orchestrator constructs the message from run context
and captures the returned SHA for the summary.
- Failure Handler gains a step 5 (commit pending after the failure
comment add). Today the comment is left uncommitted in the working
tree and gets discarded with the throwaway worktree (ADR-14) —
forensic loss. With this change the failure note lands as its own
commit on the failed branch.
- Routing matrix Phase 9 rows updated; ADR-22's superseded wording
about orchestrator-side staging removed.
Stub-pass / body-pass / wip code commits remain orchestrator-owned —
those are code, not tracker-specific.
Refs: config/opencode/workflow-design.md ADR-23
In recent runs the orchestrator skipped @pm and edited TODO/ files
itself, despite the workflow.md anti-pattern warning. Root cause: the
workflow doc literally taught the orchestrator the path layout
(`./TODO/<ID>.md`), making self-help a discoverable shortcut.
Fix: remove the recipe. The orchestrator now never constructs or reads
any per-issue TODO path. All TODO operations go through @pm dispatches;
@pm returns the absolute file path of every issue it touches, and the
orchestrator captures and reuses those paths downstream.
- Phase 1 loses the TODO-existence and depends-on checks (former steps
3 and 9 of the recent edit) — Phase 1 is now git/worktree-only.
- Phase 2 expands @pm's existing dispatch into a `Validate run
prerequisites` operation that returns either {ok: true,
issue_file_path, issue: {...}} or {ok: false, error_code, message}
with error_code in {tracker_missing, issue_not_found,
dependency_unmet, dependency_missing}. depends-on enforcement moves
here.
- Phase 7 split_needed exit, Phase 9 TODO Update, Phase 9 Commit TODO
Changes, and Failure Handler all reference @pm-returned paths or use
`git add ./TODO/` blanketly (safe because Phase 1 verified clean tree
and only @pm writes there during a run).
- pm.md gains a path-return rule: every read returns issue_file_path,
every write returns the modified paths. Run-Prerequisite Output
format documented with all four error codes.
- ADR-22 captures the rationale; routing matrix updates Phase 1/2 rows;
pipeline diagram labels updated.
The fix is discoverability-only — no permission deny on TODO/, per
explicit user direction. The schema lives in agents/pm.md, which the
orchestrator does not load.
Refs: config/opencode/workflow-design.md ADR-22
@make, @test, @check often need to inspect dependency source (trait
definitions, impl details, test patterns) to inform implementation or
verify findings. Opencode applies a CWD check on tool access, so reads
outside the worktree previously prompted for each access.
- Add permission.read/grep/glob path allowlists for the three locations
cargo deps live: ~/.cargo/registry/src/, ~/.cargo/git/checkouts/, and
/nix/store/*-vendor-*/ for crane / buildRustPackage projects.
- Document the discovery pattern in each agent: `cargo metadata
--format-version 1` returns absolute paths via packages[].manifest_path.
- Cross-reference the registry paths from the permission.bash allowlist
comment so future readers see the bash inspection commands (rg/ls)
intentionally accept paths outside CWD.
- @check gets its first permission block (was tools-only before).
Path-pattern syntax for read/grep/glob isn't fully documented; if
opencode rejects it, fall back to `permission: { external_directory:
allow }` at the project config level.
Phase 7's escalation rule was gated on @make flagging concerns "during
entry validation" only. When @make got past entry validation, started
implementing, and ground for 2-3 attempts because the test demanded
impossible production code, the orchestrator had no documented route
— it would re-dispatch @make with marginal context tweaks instead of
recognizing the failure as test-architecture.
Splits the escalation into two clearly-named paths (entry-validation
vs mid-implementation) that both route through @check (test diagnosis)
→ @test (redesign) → fresh @make. Bounded at max 2 escalation cycles
before reverting to a Phase 3 plan revisit, to prevent thrashing when
the actual problem is upstream.
@make.md gains a new Iteration Limits red-flag class — "Test-design
suspicion" — instructing @make to stop and report with an explicit
`escalate: test_design` flag in the Blocking Issue section. The flag
is the routing signal the orchestrator switches on.
@pm originally had two read modes — git-ref (via `git show <ref>:TODO.md`)
and filesystem. Git-ref existed because the workflow once ran in a bare
repo with no working tree. Once the workflow was simplified to assume
opencode is launched in the worktree, every dispatch (Phase 2 read,
Phase 9 update, Failure handler) uses filesystem mode. Git-ref mode
became dead weight: it added bash permissions, an allowlist, a "Bash
Discipline" section, and a dual-mode "How to Read" section, but the
workflow never invoked it. A reviewer correctly flagged the resulting
inconsistency between the two-mode docs and the single-mode usage.
@pm is now single-mode. Bash access is removed (bash: false, no
permission allowlist). The "How to Read" section collapses to "you
operate on TODO/ via the filesystem only" with one explicit pointer
that ad-hoc historical reads (`git show main:TODO/GAL-39.md`) are
out of scope — the user can run that themselves.
The workflow drops the now-redundant "(live filesystem mode)"
qualifier from Phase 2 / Phase 9 / Failure handler dispatches and
the Roles & Dispatch table updates @pm's constraint to "No bash."
The previous design routed Rust unit tests to NOT_TESTABLE: Rust
unit-only because @test was forbidden from touching src/, which
forced @make to write both the production code and the inline
#[cfg(test)] mod tests in one dispatch — losing TDD's RED→GREEN
separation. But Rust module tests inside #[cfg(test)] mod tests
{ ... } are the canonical unit-testing idiom, not an edge case.
@test's File Constraint now allows modifying src/**/*.rs, but
strictly inside #[cfg(test)] mod <name> { ... } blocks. Every line
outside such a block stays read-only — adding pub, importing crates,
declaring siblings, or any other production change is forbidden.
Integration tests at tests/**/*.rs continue to work as before.
The Phase 6 post-step file gate (git status snapshot + comm -23
diff against test-pattern globs) is removed. With @test legitimately
writing inside src/, a path-based gate proves nothing — production
edits and cfg(test) edits live in the same files. The boundary is
enforced by the prompt rule and Phase 8 reviewer scrutiny.
Phase 5 test-file guidance updated to distinguish module vs
integration tests for Rust, with stub-first TDD applying to both
when symbols don't yet exist. The "Rust integration TDD: stub-first"
section is renamed to "Rust stub-first TDD" and now covers module
tests too. NOT_TESTABLE's "Rust unit-only" reason is replaced with
"Missing testability seam" for cases where the production code
needs a small change before tests can be authored.
A workflow run produced test names like move_enemies_following_path_
panics_on_todo, path_types_randomly_assigned, and spawn_enemies_
special_stage_panics_on_todo. The first and third leak the stub-first
RED mechanic into the test name; once @make's body pass turns them
GREEN, the name lies. The middle one is too vague to describe a
contract.
Adds a Test Naming subsection to @test's Test Philosophy stating the
TDD survival principle — the name describes the contract under test,
not the current state, and must remain accurate after the body pass.
Bans ..._panics_on_todo / ..._fails_red / ..._stub_works / generic
placeholders / vague verbs / implementation-detail leakage. Requires
action + observable outcome and shows bad-to-good rewrites of the
three names from this run.
The single TODO.md schema is replaced by a Linear-style folder layout
matching the user's existing setup at /home/harald/git/bglga/TODO:
TODO/
├── README.md # category-grouped index (top-level only)
├── GAL-1.md
├── GAL-2.md
└── …
Each issue file has YAML frontmatter (id, title, status, parent,
labels) and a body with optional sections (Sub-issues, Acceptance
criteria, Integration test hints, Comments). The status set shrinks
to Todo / In Progress / Done; Branch / PR / Priority / Assignee
fields are gone. Comments are date-only.
@pm gains directory-walking semantics (still scoped to TODO/), bash
allowlist additions for git ls-tree and ls, and a propagation rule:
status flips to/from Done update the dependent index — README.md for
top-level issues, or the parent file's Sub-issues line for sub-issues.
The workflow's Phase 1 sanity check now verifies TODO/, TODO/README.md,
and TODO/<ID>.md all exist. Phase 2 reads the issue file and flips Todo
to In Progress with index propagation. Phase 9 stages everything under
TODO/ as a separate atomic chore(todo) commit, sets the status to Done
(or leaves In Progress for incomplete runs), and adds a date + branch +
commit comment. Failure handler routes through the same directory.
A workflow run on a Bevy weaving feature exposed two compounding
failures:
1. @test wrote 8 structural-only Rust tests that never invoked
weave_enemies or trigger_weaving. Every test passed against the
stub-first @make pre-pass because none of them called the
stubbed symbols, so todo!() never fired. The body-pass committed
code that "passed" the suite and silently broke trigger_weaving
in special stages.
2. @check found the trigger_weaving regression at Phase 8 (final
review) and the orchestrator decided to "fix them directly"
rather than dispatching @make — taking the license offered by
the existing review-loop wording.
Test-quality fixes:
- Phase 3 Test Design now requires each behavior to be expressed as
an action + observable outcome. Structural facts ("enum has 3
variants", "struct has these fields") are explicitly disqualified.
- Phase 6 stub-first flow gains a mandatory Panic-coverage check:
after @test returns, the orchestrator re-runs the test command and
rejects the output unless every test panics on todo!() (i.e. every
test exercises at least one stubbed symbol). Any passing test is
structural-only and routes back to @test.
- Phase 6 decision table gets a "Stub-first run: tests pass with zero
todo!() panics" row covering the same case.
- @test's Test Philosophy gains an explicit Do-NOT-write list of
structural-only patterns (variant_count, type ascriptions,
Box::new(my_fn), struct-literal-only flows, all-pass-on-stubs)
plus a positive rule: every test must call a function and assert
on observable outcome, or return NOT_TESTABLE rather than pad the
suite.
Orchestrator boundary fix:
- Phase 8 review loop replaces "fix them directly (no need to
re-dispatch @make for small fixes)" with the principle "the
orchestrator does not write production code; @make does". BLOCK,
behavioral, correctness, and test-quality findings round-trip
through @make. Only AST-preserving cosmetic edits (typos in
comments, trailing newlines) may be applied directly. Compiler-
detected issues (unused imports, dead code) go through @make.
Rust integration tests live in a separate test crate that imports from
lib.rs, so any test referencing not-yet-existing public API can only
RED at build time. The build error masks assertion diagnostics and
makes the RED state opaque — no stack trace, no left/right values.
For Rust tasks whose @test step writes an integration test against
public API that does not yet exist, the orchestrator now dispatches a
stub-first @make pass before @test runs:
1. @make adds the planned public API as todo!()-bodied stubs in
lib.rs and any new src/<module>.rs. Signatures lifted verbatim
from the Phase 5 task spec. Acceptance criterion is cargo check
only — no test command runs.
2. @test writes the integration test, which now compiles and panics
at todo!() with a stack trace — a clean MISSING_BEHAVIOR RED.
3. Phase 7 dispatches @make again to replace the todo!() bodies with
real implementations. Two atomic commits per task: scaffold then
implement.
Phase 5's Rust test-path guidance now flags the two-dispatch
requirement up front. test.md's Rust failure-classification hints
recognize todo!() / unimplemented!() panics as MISSING_BEHAVIOR with
a pointer to the workflow's stub-first section.
A workflow run on a Bevy/Rust project produced the test-file path
`src/tests/test_<feature>.rs`, which @test correctly flagged as
contradictory: it isn't a valid Rust test location (would require
declaring `mod tests;` in production source, which @test cannot do)
yet the file-gate glob `**/tests/**/*.rs` accidentally matched it.
Phase 5 now gives language-aware Test File guidance: Python uses
colocated or top-level `tests/`, Rust uses crate-level `tests/<feature>.rs`,
and Rust unit-only tasks are routed to NOT_TESTABLE for @make to
handle inline. Phase 6's file gate gains an explicit anti-pattern
clause discarding any new file under `src/` even when the glob matches.
@test's own File Constraint mirrors the anti-pattern so the agent
rejects the bad path with BLOCKED before the orchestrator's gate
even runs — defense in depth on both sides of the dispatch boundary.
Both agents previously hardcoded the Python/uv toolchain. They now
detect the language from marker files (pyproject.toml, Cargo.toml,
flake.nix) and run the appropriate test/lint/format/type-check commands
for Python, Rust, or both. When a flake.nix devshell is present, every
toolchain command is wrapped in `nix develop -c …`.
@make's permission allowlist gains `cargo *` and `nix develop -c *`,
plus matching denies for cargo add/remove/install/publish. The
Verification Tiers and Baseline Verification sections are rewritten as
per-language bullets, and output/TDD-evidence examples are now
language-neutral. Generalised the "no Kubernetes deployments"
constraint to cover any deploy/publish.
@test gains the same devshell + cargo allows (scoped to test, check,
clippy, fmt only — no build/run/install). Its file constraint adds
`tests/**/*.rs` for Rust integration tests, with an explicit note that
Rust unit tests stay with @make because they live inside production
source files. Failure-classification hints add Rust compiler-error
mappings, and the NOT_TESTABLE table gets a "Rust unit-only" row.
Gives @pm narrowly-scoped bash access (git show *, git rev-parse *) so
it can read TODO.md directly from any git ref. The workflow no longer
needs to mktemp + redirect the file before invoking the agent; Phase 2
just tells @pm the bare repo path and default branch and lets it run
git show "$DEFAULT_BRANCH:TODO.md" itself. Cleanup steps for the temp
snapshot are removed from Phase 10 and the failure handler.
Drops all GitHub-specific tooling (gh CLI, draft PR creation) so the
workflow stops at git commit and leaves push/PR/MR to the user.
TODO.md is now expected to be a tracked file on the default branch.
Phase 1 verifies the repo is bare via `git rev-parse --is-bare-repository`,
resolves the default branch from HEAD / init.defaultBranch, and snapshots
TODO.md via `git show "$DEFAULT_BRANCH:TODO.md"` to a tempfile that @pm
reads in Phase 2. Phase 10 updates the live TODO.md inside the worktree
and commits the change separately. The /review command drops its PR
mode for the same reason; @pm documents the read-only-snapshot vs.
live-worktree path distinction.
Adds @check, @simplify, @test, @make, @pm subagents and the /workflow
and /review slash commands from the autonomous multi-agent workflow
gist by ppries.
@pm is rewritten to manage issues in a local ./TODO.md file instead of
Linear (file-only access, documented schema, structured JSON output).
/workflow is adapted: TODO.md-based issue context, generic worktree
paths (no hardcoded ~/repos/veo/sunstone), generic branch examples,
and a Phase 1 guard that verifies origin is on GitHub before any
work begins.