nixcfg

Author	SHA1	Message	Date
Harald Hoyer	534361f1b5	feat(opencode): extend Phase 7 escalation to mid-implementation test-design errors Phase 7's escalation rule was gated on @make flagging concerns "during entry validation" only. When @make got past entry validation, started implementing, and ground for 2-3 attempts because the test demanded impossible production code, the orchestrator had no documented route — it would re-dispatch @make with marginal context tweaks instead of recognizing the failure as test-architecture. Splits the escalation into two clearly-named paths (entry-validation vs mid-implementation) that both route through @check (test diagnosis) → @test (redesign) → fresh @make. Bounded at max 2 escalation cycles before reverting to a Phase 3 plan revisit, to prevent thrashing when the actual problem is upstream. @make.md gains a new Iteration Limits red-flag class — "Test-design suspicion" — instructing @make to stop and report with an explicit `escalate: test_design` flag in the Blocking Issue section. The flag is the routing signal the orchestrator switches on.	2026-05-08 10:20:16 +02:00
Harald Hoyer	aac4d44a49	feat(opencode): file unresolved bugs/blockers as TODO sub-issues in Phase 9 A workflow run wrapped up with "Unresolved: Score not resetting on game restart (pre-existing bug, out of scope)" — a real bug discovered while implementing GAL-39. Buried in summary.md, which is per-run, untracked, overwritten on the next run, and read by nobody (the user has walked away by design). Adds a File Follow-ups subsection to Phase 9, after the TODO Update. Tracked-worthy items are routed through @pm as sub-issues of the current issue (parent: $ISSUE_ID), so they auto-show in the parent's Sub-issues list and don't need a README.md category at unattended runtime. Three categories file an issue: - Pre-existing bugs found out of scope → label `bug` - Unresolved review-loop blockers (Phase 4 or 8 cycle exhaustion) → label `followup` - @test NOT_TESTABLE "future seam" notes → label `tech-debt` Things explicitly NOT filed: @simplify advisories the orchestrator chose not to act on (records, not missing work), cosmetic nits, duplicates of existing issues. Those live in the run summary's new "Advisory notes (not filed)" section. Renames "Commit TODO Changes" subsection so the worked issue update plus any filed follow-ups commit together as one atomic chore(todo) commit. The Run Summary's old "Unresolved items" bullet is replaced with two sharper bullets: "Filed follow-ups" (lists IDs of created sub-issues) and "Advisory notes (not filed)".	2026-05-08 10:20:16 +02:00
Harald Hoyer	c3407c9c98	refactor(opencode): drop @pm git-ref read mode, no longer used by workflow @pm originally had two read modes — git-ref (via `git show <ref>:TODO.md`) and filesystem. Git-ref existed because the workflow once ran in a bare repo with no working tree. Once the workflow was simplified to assume opencode is launched in the worktree, every dispatch (Phase 2 read, Phase 9 update, Failure handler) uses filesystem mode. Git-ref mode became dead weight: it added bash permissions, an allowlist, a "Bash Discipline" section, and a dual-mode "How to Read" section, but the workflow never invoked it. A reviewer correctly flagged the resulting inconsistency between the two-mode docs and the single-mode usage. @pm is now single-mode. Bash access is removed (bash: false, no permission allowlist). The "How to Read" section collapses to "you operate on TODO/ via the filesystem only" with one explicit pointer that ad-hoc historical reads (`git show main:TODO/GAL-39.md`) are out of scope — the user can run that themselves. The workflow drops the now-redundant "(live filesystem mode)" qualifier from Phase 2 / Phase 9 / Failure handler dispatches and the Roles & Dispatch table updates @pm's constraint to "No bash."	2026-05-08 10:20:16 +02:00
Harald Hoyer	cc971b80e0	feat(opencode): add Phase 5.5 task-split review by @check ppries' README mentioned "@check reviews task split for completeness and coverage" as a workflow step but the gist's actual workflow.md never implemented it, and neither did ours. Without a split-review gate between Phase 5 and Phase 6, an over- or under-split task surfaces only at Phase 8 final review — after expensive @test and @make dispatches have already run on a broken split. Adds Phase 5.5: a short, focused review of the task split as a set, dispatched only to @check (split is structural / coverage, not complexity, so @simplify is not involved). The dispatch passes the absolute paths to plan.md and every task-N.md and asks @check to evaluate the split against five questions: coverage, no overlap, single-purpose, integration contracts, testable AC. Loop limited to 2 cycles (less than the plan-review's 3), with a BLOCK verdict routing back to Phase 4 when the plan itself does not decompose cleanly. The phase is explicitly framed as "a quick gate, not a deep review" — no line-by-line code feedback (there's no code yet), no design re-litigation (that was Phase 4) — to keep it from expanding into a second plan review. No phase renumbering downstream — 5.5 fits between 5 and 6 without disturbing existing cross-references.	2026-05-08 10:20:16 +02:00
Harald Hoyer	236b4d2470	fix(opencode): teach orchestrator about subagents and enforce on-disk artifacts Two related orchestration failures from recent runs: 1. An orchestrator missed the multi-agent concept entirely and produced reviews / implementations itself instead of dispatching @check / @make. The workflow described phases as "Dispatch @<name>" everywhere but never explained who the cast was, what "dispatch" meant, or that the orchestrator (agent: build) is distinct from the subagents. 2. Another orchestrator dispatched @test pointing at a $RUN_DIR/task-N.md that it never wrote — the file-write instruction in Phase 5 was a single bolded sentence inside a paragraph, easy to skim past, and nothing checked artifact existence before dispatching. Adds a top-level "Roles & Dispatch" section between the parse line and Run Artifacts. It establishes the multi-agent model, lists the cast (@check / @simplify / @test / @make / @pm) with one-line role and permission notes, defines "Dispatch" as a tool call (not a role-play instruction), and lists three anti-patterns the orchestrator must avoid (acting as a subagent, skipping a dispatch, paraphrasing artifacts instead of letting subagents read them from disk). Restructures Phase 5 as five explicit numbered steps. Step 4 mandates writing each task to $RUN_DIR/task-<N>.md and verifying with test -f; step 5 requires dropping inline copies once the file is the source of truth. The phase is "not done" until every task file exists on disk. Adds a row to Dispatch Hygiene's Pre-Dispatch Validation table that requires test -f verification of any artifact path the dispatch references; missing files route back to the producing phase.	2026-05-08 10:20:16 +02:00
Harald Hoyer	25f4c6f179	feat(opencode): write plan and task specs to .workflow/run-<id>/ on disk Plans and task specs were previously re-emitted as inline prompt text on every dispatch. That meant @check and @simplify might receive paraphrased versions of the same plan, mid-loop revisions could leak as "actually let me reconsider" passes, and the same content rode through orchestrator context many times across review/test/make dispatches. The orchestrator now writes finalized artifacts to a per-run directory: .workflow/run-<ISSUE-ID>/ plan.md # Phase 3 output task-1.md # Phase 5 output, one file per task task-2.md summary.md # Phase 9 output (was .workflow/workflow-summary.md) Subagents read these by absolute path; the dispatch prompt body shrinks to agent role, artifact path, and short per-dispatch context. Mid-loop revisions (Phase 4 review cycles, etc.) edit the file in place so every subsequent dispatch sees the same byte-for-byte source of truth — the Finalized-Text Rule has a physical anchor. Phase 1 captures WORKTREE_PATH, ISSUE_ID, and RUN_DIR. Phase 3 mkdirs the run directory and writes plan.md. Phase 4 dispatches reviewers against plan.md by path. Phase 5 writes one task-N.md per task. Phase 6/7 dispatch @test/@make against task-N.md by path; the @test→@make TDD handoff stays inline. Phase 8 reviewers re-read plan.md from disk. Phase 9 renames "Local Summary" to "Run Summary" and writes to $RUN_DIR/summary.md. The staging exclusion broadens from a single file to the whole .workflow/ tree, and Failure Handling follows suit.	2026-05-08 10:20:16 +02:00
Harald Hoyer	4dc3cffba6	refactor(opencode): allow @test inside #[cfg(test)] mod blocks, drop file gate The previous design routed Rust unit tests to NOT_TESTABLE: Rust unit-only because @test was forbidden from touching src/, which forced @make to write both the production code and the inline #[cfg(test)] mod tests in one dispatch — losing TDD's RED→GREEN separation. But Rust module tests inside #[cfg(test)] mod tests { ... } are the canonical unit-testing idiom, not an edge case. @test's File Constraint now allows modifying src/**/*.rs, but strictly inside #[cfg(test)] mod <name> { ... } blocks. Every line outside such a block stays read-only — adding pub, importing crates, declaring siblings, or any other production change is forbidden. Integration tests at tests/**/*.rs continue to work as before. The Phase 6 post-step file gate (git status snapshot + comm -23 diff against test-pattern globs) is removed. With @test legitimately writing inside src/, a path-based gate proves nothing — production edits and cfg(test) edits live in the same files. The boundary is enforced by the prompt rule and Phase 8 reviewer scrutiny. Phase 5 test-file guidance updated to distinguish module vs integration tests for Rust, with stub-first TDD applying to both when symbols don't yet exist. The "Rust integration TDD: stub-first" section is renamed to "Rust stub-first TDD" and now covers module tests too. NOT_TESTABLE's "Rust unit-only" reason is replaced with "Missing testability seam" for cases where the production code needs a small change before tests can be authored.	2026-05-08 10:20:16 +02:00
Harald Hoyer	5a5cf269dc	refactor(opencode): migrate @pm and workflow to per-issue TODO/ folder The single TODO.md schema is replaced by a Linear-style folder layout matching the user's existing setup at /home/harald/git/bglga/TODO: TODO/ ├── README.md # category-grouped index (top-level only) ├── GAL-1.md ├── GAL-2.md └── … Each issue file has YAML frontmatter (id, title, status, parent, labels) and a body with optional sections (Sub-issues, Acceptance criteria, Integration test hints, Comments). The status set shrinks to Todo / In Progress / Done; Branch / PR / Priority / Assignee fields are gone. Comments are date-only. @pm gains directory-walking semantics (still scoped to TODO/), bash allowlist additions for git ls-tree and ls, and a propagation rule: status flips to/from Done update the dependent index — README.md for top-level issues, or the parent file's Sub-issues line for sub-issues. The workflow's Phase 1 sanity check now verifies TODO/, TODO/README.md, and TODO/<ID>.md all exist. Phase 2 reads the issue file and flips Todo to In Progress with index propagation. Phase 9 stages everything under TODO/ as a separate atomic chore(todo) commit, sets the status to Done (or leaves In Progress for incomplete runs), and adds a date + branch + commit comment. Failure handler routes through the same directory.	2026-05-08 10:20:16 +02:00
Harald Hoyer	91ba5bd272	fix(opencode): close two false-green test loopholes and the orchestrator-as-implementer escape hatch A workflow run on a Bevy weaving feature exposed two compounding failures: 1. @test wrote 8 structural-only Rust tests that never invoked weave_enemies or trigger_weaving. Every test passed against the stub-first @make pre-pass because none of them called the stubbed symbols, so todo!() never fired. The body-pass committed code that "passed" the suite and silently broke trigger_weaving in special stages. 2. @check found the trigger_weaving regression at Phase 8 (final review) and the orchestrator decided to "fix them directly" rather than dispatching @make — taking the license offered by the existing review-loop wording. Test-quality fixes: - Phase 3 Test Design now requires each behavior to be expressed as an action + observable outcome. Structural facts ("enum has 3 variants", "struct has these fields") are explicitly disqualified. - Phase 6 stub-first flow gains a mandatory Panic-coverage check: after @test returns, the orchestrator re-runs the test command and rejects the output unless every test panics on todo!() (i.e. every test exercises at least one stubbed symbol). Any passing test is structural-only and routes back to @test. - Phase 6 decision table gets a "Stub-first run: tests pass with zero todo!() panics" row covering the same case. - @test's Test Philosophy gains an explicit Do-NOT-write list of structural-only patterns (variant_count, type ascriptions, Box::new(my_fn), struct-literal-only flows, all-pass-on-stubs) plus a positive rule: every test must call a function and assert on observable outcome, or return NOT_TESTABLE rather than pad the suite. Orchestrator boundary fix: - Phase 8 review loop replaces "fix them directly (no need to re-dispatch @make for small fixes)" with the principle "the orchestrator does not write production code; @make does". BLOCK, behavioral, correctness, and test-quality findings round-trip through @make. Only AST-preserving cosmetic edits (typos in comments, trailing newlines) may be applied directly. Compiler-detected issues (unused imports, dead code) go through @make.	2026-05-08 10:20:16 +02:00
Harald Hoyer	91e8aab383	fix(opencode): require sequential @make dispatches, tighten @test parallelism A workflow run dispatched two @make agents in parallel. Both agents write source files, run cargo verification commands, and may both target the same file (e.g. src/lib.rs for a new `pub mod` plus a later registration) — concurrent edits corrupt each other and Cargo's target/ lock serialises the builds anyway, so parallelism only adds risk without giving speedup. Phase 7 now states explicitly that @make dispatches are SEQUENTIAL — never in parallel — and lists the reasons inline. The rule covers all @make invocations: standard mode, TDD mode, the Rust stub-pass and body-pass, and integration-fix dispatches. Stub-pass/body-pass ordering within a task is strict so @test always RED-verifies against a deterministic crate state. Phase 6's parallelism rule splits per language: Python parallel @test is still allowed for disjoint test files, but Rust @test runs sequentially since cargo serialises the build and shared crate-level helper files race.	2026-05-08 10:20:16 +02:00
Harald Hoyer	f0cc300358	fix(opencode): make Phase 6 file gate see untracked files `git diff --name-only` only shows tracked files with unstaged modifications. It does not show untracked files — which is precisely the state of any new test file @test creates, since @test's sandbox denies `git add`. The pre/post snapshots therefore both missed new files entirely and `comm -23 post pre` returned nothing, letting the gate cheerfully conclude nothing changed even when @test had just created tests/foo.rs (or, worse, src/lib.rs). Switch both snapshots to `git status --porcelain | sed 's/^...//' | sort -u`, which captures modified, staged, and untracked files in a single pass. Inline rationale notes the untracked blind spot so the orchestrator does not fall back to git diff.	2026-05-08 10:20:16 +02:00
Harald Hoyer	17ad3ba6ef	refactor(opencode): hoist dispatch rules into a top-level Dispatch Hygiene section A workflow run on GAL-38 dispatched a plan to @check that contained a self-contradicting "Wait, the movement should be direct position assignment, not delta… Let me reconsider…" passage with two versions of the same move_enemies code, plus drop-in cargo-pasted match arms / function bodies (plan-as-implementation). The rules added in `832306c` caught these patterns when @make was the recipient but did not cover the plan itself, plan-review dispatches, test-author dispatches, or final-review dispatches. Hoists the Finalized-Text Rule and Pre-Dispatch Validation table out of Phase 5/7 into a new top-level "Dispatch Hygiene" section between Phase 3 and Phase 4, and adds an explicit "No-Implementation-in-Plan-or-Spec Rule" that bans drop-in code blocks > ~5 lines, full function bodies, and stage-by-stage transformations from plans and specs alike. Phases 3, 4, 5, 6, 7, 8 each gain a one-line pointer requiring the orchestrator to apply Dispatch Hygiene before sending. Phase 5's former "Code Context Anti-patterns" becomes "Code Context — what to include" with positive framing, deferring the negative list to the hoisted rules. The Phase 6 stub-first section's stale anti-pattern reference is updated to point at Dispatch Hygiene as well.	2026-05-08 10:20:16 +02:00
Harald Hoyer	1aa98a8051	fix(opencode): require real shell timestamp in workflow summary A workflow run wrote the timestamp as `2026-05-06T???:???:?? (session date)` because the agent had no time-of-day source and inserted a placeholder. Phase 9 now mandates capturing the timestamp from the shell at write time via `date -Iseconds` and forbids placeholders — omit the field rather than fabricate one.	2026-05-07 05:45:35 +02:00
Harald Hoyer	5b5c59aa84	feat(opencode): mandate stub-first @make pre-pass for Rust integration TDD Rust integration tests live in a separate test crate that imports from lib.rs, so any test referencing not-yet-existing public API can only RED at build time. The build error masks assertion diagnostics and makes the RED state opaque — no stack trace, no left/right values. For Rust tasks whose @test step writes an integration test against public API that does not yet exist, the orchestrator now dispatches a stub-first @make pass before @test runs: 1. @make adds the planned public API as todo!()-bodied stubs in lib.rs and any new src/<module>.rs. Signatures lifted verbatim from the Phase 5 task spec. Acceptance criterion is cargo check only — no test command runs. 2. @test writes the integration test, which now compiles and panics at todo!() with a stack trace — a clean MISSING_BEHAVIOR RED. 3. Phase 7 dispatches @make again to replace the todo!() bodies with real implementations. Two atomic commits per task: scaffold then implement. Phase 5's Rust test-path guidance now flags the two-dispatch requirement up front. test.md's Rust failure-classification hints recognize todo!() / unimplemented!() panics as MISSING_BEHAVIOR with a pointer to the workflow's stub-first section.	2026-05-07 05:42:16 +02:00
Harald Hoyer	832306c817	fix(opencode): harden workflow against multi-task spec dumps A workflow run on a Rust/Bevy task produced a single @make dispatch covering six tasks (~2 hours of work), with the orchestrator drafting the full replacement code, including a self-contradicting "actually that's wrong, let me correct…" revision pass and a `nix develop --command bash -c "cargo check"` invocation that @make's sandbox denies. None of the failure modes were caught before dispatch. Phase 5 gains three new subsections: - Split Heuristic — explicit rules for when a task must be split (>2 concerns, >50 lines / 2 files, structural+runtime+wiring mix); prescribes the foundations / implementation / wiring split. - Code Context Anti-patterns — the field is for seam-revealing snippets, not finished answers; max ~5-line snippets, no full replacement bodies. - Finalized-Text Rule — task specs must be single-author finalized text, no "actually, that's wrong" revision passes, no two-version code blocks, no unresolved questions. Phase 6 promotes the Rust unit-only NOT_TESTABLE case out of the decision table into a dedicated routing subsection. The orchestrator must pass test specifications (one-line behavior descriptions, target functions, assertion types) to @make — never test code — and run the suite once after @make to capture RED→GREEN evidence. Phase 7 gains a mandatory Pre-Dispatch Validation table that rejects specs containing `bash -c` / `sh -c` (any nesting), `nix develop -c bash`, `cd <path> &&`, oversized Code Context blocks, contradictory revisions, or duplicated test bodies. Repeated trips signal a Phase 5 split problem and route back to splitting.	2026-05-06 20:25:40 +02:00
Harald Hoyer	d5d90d8b9f	fix(opencode): reject Rust src/tests/ paths as a wrong task spec A workflow run on a Bevy/Rust project produced the test-file path `src/tests/test_<feature>.rs`, which @test correctly flagged as contradictory: it isn't a valid Rust test location (would require declaring `mod tests;` in production source, which @test cannot do) yet the file-gate glob `**/tests/**/*.rs` accidentally matched it. Phase 5 now gives language-aware Test File guidance: Python uses colocated or top-level `tests/`, Rust uses crate-level `tests/<feature>.rs`, and Rust unit-only tasks are routed to NOT_TESTABLE for @make to handle inline. Phase 6's file gate gains an explicit anti-pattern clause discarding any new file under `src/` even when the glob matches. @test's own File Constraint mirrors the anti-pattern so the agent rejects the bad path with BLOCKED before the orchestrator's gate even runs — defense in depth on both sides of the dispatch boundary.	2026-05-06 18:31:14 +02:00
Harald Hoyer	e2e35acdae	refactor(opencode): assume opencode runs in the worktree, drop bare-repo plumbing The workflow previously created a worktree itself (Phase 3) and worked around opencode's lack of per-subagent CWD by capturing absolute paths and threading them through every dispatch (the "Subagent Dispatch Convention"). That ceremony exists only because the orchestrator's CWD differed from where subagents were rooted. Now the workflow assumes the user has already created the worktree and launched opencode inside it. Subagents inherit that as their project root, so all the absolute-path plumbing goes away. Phase 3 is removed, phases renumber to 1-9, and the Subagent Dispatch Convention section is dropped. Phase 1 is a sanity check (non-bare worktree, TODO.md present, HEAD not detached, current branch != base branch) that resolves the base branch from an optional second argument or by trying main then master. @pm now uses live filesystem mode against ./TODO.md throughout (the git-ref read mode stays available for ad-hoc use). Phase 8's diff uses git diff "$BASE_BRANCH"...HEAD without git -C wrapping.	2026-05-06 17:31:56 +02:00
Harald Hoyer	f750c76877	fix(opencode): keep workflow-summary.md local, never commit it A per-branch artifact written by every run causes merge conflicts when multiple workflow branches are merged together. The summary is now documented as an intentionally untracked local file: not staged in the main commit, not committed in its own commit, and not staged in the failure-path WIP commit. Recommends the user add `.opencode/` to `.gitignore`.	2026-05-06 16:51:19 +02:00
Harald Hoyer	28c7785816	fix(opencode): pass absolute worktree path to every subagent dispatch Subagents do not inherit the orchestrator's `cd`, so dispatched prompts that referred to files relative to the worktree were resolved against the bare repo root and failed with "file not found" (observed when @check tried to read src/main.rs after Phase 3). Phase 3 now captures `WORKTREE_PATH="$(pwd)"` after entering the worktree. A new "Subagent Dispatch Convention" section requires every dispatch in phases 5, 7, 8, 9, and 10 to open with `Worktree: <path>` and pass file references as absolute paths under `$WORKTREE_PATH/`. Phase 9's diff command uses `git -C "$WORKTREE_PATH"` rather than relying on shell CWD, and @pm updates receive the explicit absolute path to `$WORKTREE_PATH/TODO.md`.	2026-05-06 15:56:45 +02:00
Harald Hoyer	d22acf6906	refactor(opencode): let @pm read TODO.md via git show, drop tempfile Gives @pm narrowly-scoped bash access (git show , git rev-parse ) so it can read TODO.md directly from any git ref. The workflow no longer needs to mktemp + redirect the file before invoking the agent; Phase 2 just tells @pm the bare repo path and default branch and lets it run git show "$DEFAULT_BRANCH:TODO.md" itself. Cleanup steps for the temp snapshot are removed from Phase 10 and the failure handler.	2026-05-06 15:42:17 +02:00
Harald Hoyer	2941faa822	refactor(opencode): make workflow forge-agnostic and read TODO.md from bare repo Drops all GitHub-specific tooling (gh CLI, draft PR creation) so the workflow stops at git commit and leaves push/PR/MR to the user. TODO.md is now expected to be a tracked file on the default branch. Phase 1 verifies the repo is bare via `git rev-parse --is-bare-repository`, resolves the default branch from HEAD / init.defaultBranch, and snapshots TODO.md via `git show "$DEFAULT_BRANCH:TODO.md"` to a tempfile that @pm reads in Phase 2. Phase 10 updates the live TODO.md inside the worktree and commits the change separately. The /review command drops its PR mode for the same reason; @pm documents the read-only-snapshot vs. live-worktree path distinction.	2026-05-06 15:28:08 +02:00
Harald Hoyer	4ec1561af4	feat(opencode): add multi-agent workflow agents and commands Adds @check, @simplify, @test, @make, @pm subagents and the /workflow and /review slash commands from the autonomous multi-agent workflow gist by ppries. @pm is rewritten to manage issues in a local ./TODO.md file instead of Linear (file-only access, documented schema, structured JSON output). /workflow is adapted: TODO.md-based issue context, generic worktree paths (no hardcoded ~/repos/veo/sunstone), generic branch examples, and a Phase 1 guard that verifies origin is on GitHub before any work begins.	2026-05-06 14:56:42 +02:00

22 commits
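
The untracked-file blind spot described in commit f0cc300358 can be reproduced with a short shell sketch. It is an illustration, not the workflow's actual gate: the throwaway repository, the file names, and the `snapshot` helper are all hypothetical. Only the `git status --porcelain | sed 's/^...//' | sort -u` pipeline and the `comm -23 post pre` comparison come from the commit message. A tracked file is placed in `tests/` first because git otherwise reports a wholly untracked directory as a single `tests/` entry.

```shell
#!/bin/sh
# Sketch only: repository layout and names are assumptions for illustration.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "sketch@example.invalid"
git config user.name "Sketch"

mkdir -p src tests
echo 'pub fn f() {}' > src/lib.rs
echo '// placeholder' > tests/existing.rs
git add src/lib.rs tests/existing.rs
git -c commit.gpgsign=false commit -q -m "initial"

# Snapshot pipeline quoted in the commit message: strips the two status
# characters plus the following space, leaving one sorted path per line.
snapshot() { git status --porcelain | sed 's/^...//' | sort -u; }

pre=$(mktemp)
post=$(mktemp)
snapshot > "$pre"

# Simulate @test creating a new test file without `git add`.
echo '#[test] fn t() {}' > tests/foo.rs

snapshot > "$post"

diff_view=$(git diff --name-only)      # tracked, unstaged modifications only
new_files=$(comm -23 "$post" "$pre")   # paths present only in the post snapshot

echo "git diff sees: [$diff_view]"
echo "snapshot sees: [$new_files]"
```

Under these assumptions `git diff --name-only` reports nothing, while the porcelain snapshot reports `tests/foo.rs`.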