nixcfg

Author	SHA1	Message	Date
Harald Hoyer	56713cd7b8	feat(opencode): @pm owns the TODO commit (ADR-23) The orchestrator was running `git add ./TODO/` and `git commit -m chore(todo): ...` itself in Phase 9, baking filesystem-tracker specifics into commands/workflow.md. The point of @pm as an abstraction is that it should be swappable — a Linear-backed @pm or a Notion-backed @pm should drop in without touching the workflow command. With API-backed trackers, "commit the TODO updates" is a no-op and `git add ./TODO/` is wrong. Push persistence shape behind the @pm boundary: - New @pm capability `Commit pending changes` accepts a commit message and returns {ok, sha, message}. Filesystem @pm runs `git add ./TODO/` + `git commit -m <msg>` and returns the SHA. Tracker-backed implementations no-op and return sha: null. - @pm gains tightly-scoped bash access: `git add ./TODO/`, `git commit -m `, `git status --porcelain ./TODO/*` only. Push, reset, rebase, checkout, branch, tag are explicit denies. Everything else falls through to the default deny. - Phase 9 "Commit TODO Changes" replaces orchestrator-side git with a @pm dispatch; orchestrator constructs the message from run context and captures the returned SHA for the summary. - Failure Handler gains a step 5 (commit pending after the failure comment add). Today the comment is left uncommitted in the working tree and gets discarded with the throwaway worktree (ADR-14) — forensic loss. With this change the failure note lands as its own commit on the failed branch. - Routing matrix Phase 9 rows updated; ADR-22's superseded wording about orchestrator-side staging removed. Stub-pass / body-pass / wip code commits remain orchestrator-owned — those are code, not tracker-specific. Refs: config/opencode/workflow-design.md ADR-23	2026-05-08 14:04:47 +02:00
Harald Hoyer	a3e0de6d04	feat(opencode): hide TODO paths from orchestrator (ADR-22) In recent runs the orchestrator skipped @pm and edited TODO/ files itself, despite the workflow.md anti-pattern warning. Root cause: the workflow doc literally taught the orchestrator the path layout (`./TODO/<ID>.md`), making self-help a discoverable shortcut. Fix: remove the recipe. The orchestrator now never constructs or reads any per-issue TODO path. All TODO operations go through @pm dispatches; @pm returns the absolute file path of every issue it touches, and the orchestrator captures and reuses those paths downstream. - Phase 1 loses the TODO-existence and depends-on checks (former steps 3 and 9 of the recent edit) — Phase 1 is now git/worktree-only. - Phase 2 expands @pm's existing dispatch into a `Validate run prerequisites` operation that returns either {ok: true, issue_file_path, issue: {...}} or {ok: false, error_code, message} with error_code in {tracker_missing, issue_not_found, dependency_unmet, dependency_missing}. depends-on enforcement moves here. - Phase 7 split_needed exit, Phase 9 TODO Update, Phase 9 Commit TODO Changes, and Failure Handler all reference @pm-returned paths or use `git add ./TODO/` blanketly (safe because Phase 1 verified clean tree and only @pm writes there during a run). - pm.md gains a path-return rule: every read returns issue_file_path, every write returns the modified paths. Run-Prerequisite Output format documented with all four error codes. - ADR-22 captures the rationale; routing matrix updates Phase 1/2 rows; pipeline diagram labels updated. The fix is discoverability-only — no permission deny on TODO/, per explicit user direction. The schema lives in agents/pm.md, which the orchestrator does not load. Refs: config/opencode/workflow-design.md ADR-22	2026-05-08 13:45:51 +02:00
Harald Hoyer	3e515d54eb	feat(opencode): allow agents to read external Rust crate source @make, @test, @check often need to inspect dependency source (trait definitions, impl details, test patterns) to inform implementation or verify findings. Opencode applies a CWD check on tool access, so reads outside the worktree previously prompted for each access. - Add permission.read/grep/glob path allowlists for the three locations cargo deps live: ~/.cargo/registry/src/, ~/.cargo/git/checkouts/, and /nix/store/-vendor-/ for crane / buildRustPackage projects. - Document the discovery pattern in each agent: `cargo metadata --format-version 1` returns absolute paths via packages[].manifest_path. - Cross-reference the registry paths from the permission.bash allowlist comment so future readers see the bash inspection commands (rg/ls) intentionally accept paths outside CWD. - @check gets its first permission block (was tools-only before). Path-pattern syntax for read/grep/glob isn't fully documented; if opencode rejects it, fall back to `permission: { external_directory: allow }` at the project config level.	2026-05-08 13:24:30 +02:00
Harald Hoyer	af6481a5a7	feat(opencode): one-task-per-run model + 9 routing fixes (ADRs 13-21) Captures the design grilling outcome. Adds ADRs 13-21 covering: - run-level plan_rework_remaining counter to bound P3<->P5.5/P7/P8 thrash - non-resumable workflow with throwaway-worktree recovery procedure - @simplify advisory at every gate (not just Phase 8) - Phase 8 fix specs go to disk as task-fix-N.md (preserves ADR-6) - Phase 5.5 BLOCK protocol: orchestrator edits plan, decrements counter, re-enters P4 - Phase 8 NOT_TESTABLE manifest in reviewer prompt - unified Implementation Incomplete diagnosis (test_design / production_logic / split_needed) - Phase 1 working-tree cleanliness + depends-on enforcement - one-task-per-run pivot: Phase 5 still splits N tasks, only task-1 runs; tasks 2..N filed as sub-issues with rich seed bodies; split_needed at P7 aborts to Failure Handler (one-task-per-run = no salvageable prior work) Auto-resolves big-diff Phase 8 reviews, cross-task regression-within-run, and mid-flight task-split routing. Rewrites routing matrix and three Mermaid diagrams; updates @pm (depends-on frontmatter, split-time filing), @check (third diagnosis verdict), @make (escalate: split_needed flag).	2026-05-08 13:02:54 +02:00
Harald Hoyer	534361f1b5	feat(opencode): extend Phase 7 escalation to mid-implementation test-design errors Phase 7's escalation rule was gated on @make flagging concerns "during entry validation" only. When @make got past entry validation, started implementing, and ground for 2-3 attempts because the test demanded impossible production code, the orchestrator had no documented route — it would re-dispatch @make with marginal context tweaks instead of recognizing the failure as test-architecture. Splits the escalation into two clearly-named paths (entry-validation vs mid-implementation) that both route through @check (test diagnosis) → @test (redesign) → fresh @make. Bounded at max 2 escalation cycles before reverting to a Phase 3 plan revisit, to prevent thrashing when the actual problem is upstream. @make.md gains a new Iteration Limits red-flag class — "Test-design suspicion" — instructing @make to stop and report with an explicit `escalate: test_design` flag in the Blocking Issue section. The flag is the routing signal the orchestrator switches on.	2026-05-08 10:20:16 +02:00
Harald Hoyer	c3407c9c98	refactor(opencode): drop @pm git-ref read mode, no longer used by workflow @pm originally had two read modes — git-ref (via `git show <ref>:TODO.md`) and filesystem. Git-ref existed because the workflow once ran in a bare repo with no working tree. Once the workflow was simplified to assume opencode is launched in the worktree, every dispatch (Phase 2 read, Phase 9 update, Failure handler) uses filesystem mode. Git-ref mode became dead weight: it added bash permissions, an allowlist, a "Bash Discipline" section, and a dual-mode "How to Read" section, but the workflow never invoked it. A reviewer correctly flagged the resulting inconsistency between the two-mode docs and the single-mode usage. @pm is now single-mode. Bash access is removed (bash: false, no permission allowlist). The "How to Read" section collapses to "you operate on TODO/ via the filesystem only" with one explicit pointer that ad-hoc historical reads (`git show main:TODO/GAL-39.md`) are out of scope — the user can run that themselves. The workflow drops the now-redundant "(live filesystem mode)" qualifier from Phase 2 / Phase 9 / Failure handler dispatches and the Roles & Dispatch table updates @pm's constraint to "No bash."	2026-05-08 10:20:16 +02:00
Harald Hoyer	4dc3cffba6	refactor(opencode): allow @test inside #[cfg(test)] mod blocks, drop file gate The previous design routed Rust unit tests to NOT_TESTABLE: Rust unit-only because @test was forbidden from touching src/, which forced @make to write both the production code and the inline #[cfg(test)] mod tests in one dispatch — losing TDD's RED→GREEN separation. But Rust module tests inside #[cfg(test)] mod tests { ... } are the canonical unit-testing idiom, not an edge case. @test's File Constraint now allows modifying src/*/.rs, but strictly inside #[cfg(test)] mod <name> { ... } blocks. Every line outside such a block stays read-only — adding pub, importing crates, declaring siblings, or any other production change is forbidden. Integration tests at tests/*/.rs continue to work as before. The Phase 6 post-step file gate (git status snapshot + comm -23 diff against test-pattern globs) is removed. With @test legitimately writing inside src/, a path-based gate proves nothing — production edits and cfg(test) edits live in the same files. The boundary is enforced by the prompt rule and Phase 8 reviewer scrutiny. Phase 5 test-file guidance updated to distinguish module vs integration tests for Rust, with stub-first TDD applying to both when symbols don't yet exist. The "Rust integration TDD: stub-first" section is renamed to "Rust stub-first TDD" and now covers module tests too. NOT_TESTABLE's "Rust unit-only" reason is replaced with "Missing testability seam" for cases where the production code needs a small change before tests can be authored.	2026-05-08 10:20:16 +02:00
Harald Hoyer	8373e32f34	fix(opencode): forbid RED-state references in test names A workflow run produced test names like move_enemies_following_path_panics_on_todo, path_types_randomly_assigned, and spawn_enemies_special_stage_panics_on_todo. The first and third leak the stub-first RED mechanic into the test name; once @make's body pass turns them GREEN, the name lies. The middle one is too vague to describe a contract. Adds a Test Naming subsection to @test's Test Philosophy stating the TDD survival principle — the name describes the contract under test, not the current state, and must remain accurate after the body pass. Bans ..._panics_on_todo / ..._fails_red / ..._stub_works / generic placeholders / vague verbs / implementation-detail leakage. Requires action + observable outcome and shows bad-to-good rewrites of the three names from this run.	2026-05-08 10:20:16 +02:00
Harald Hoyer	5a5cf269dc	refactor(opencode): migrate @pm and workflow to per-issue TODO/ folder The single TODO.md schema is replaced by a Linear-style folder layout matching the user's existing setup at /home/harald/git/bglga/TODO: TODO/ ├── README.md # category-grouped index (top-level only) ├── GAL-1.md ├── GAL-2.md └── … Each issue file has YAML frontmatter (id, title, status, parent, labels) and a body with optional sections (Sub-issues, Acceptance criteria, Integration test hints, Comments). The status set shrinks to Todo / In Progress / Done; Branch / PR / Priority / Assignee fields are gone. Comments are date-only. @pm gains directory-walking semantics (still scoped to TODO/), bash allowlist additions for git ls-tree and ls, and a propagation rule: status flips to/from Done update the dependent index — README.md for top-level issues, or the parent file's Sub-issues line for sub-issues. The workflow's Phase 1 sanity check now verifies TODO/, TODO/README.md, and TODO/<ID>.md all exist. Phase 2 reads the issue file and flips Todo to In Progress with index propagation. Phase 9 stages everything under TODO/ as a separate atomic chore(todo) commit, sets the status to Done (or leaves In Progress for incomplete runs), and adds a date + branch + commit comment. Failure handler routes through the same directory.	2026-05-08 10:20:16 +02:00
Harald Hoyer	91ba5bd272	fix(opencode): close two false-green test loopholes and the orchestrator-as-implementer escape hatch A workflow run on a Bevy weaving feature exposed two compounding failures: 1. @test wrote 8 structural-only Rust tests that never invoked weave_enemies or trigger_weaving. Every test passed against the stub-first @make pre-pass because none of them called the stubbed symbols, so todo!() never fired. The body-pass committed code that "passed" the suite and silently broke trigger_weaving in special stages. 2. @check found the trigger_weaving regression at Phase 8 (final review) and the orchestrator decided to "fix them directly" rather than dispatching @make — taking the license offered by the existing review-loop wording. Test-quality fixes: - Phase 3 Test Design now requires each behavior to be expressed as an action + observable outcome. Structural facts ("enum has 3 variants", "struct has these fields") are explicitly disqualified. - Phase 6 stub-first flow gains a mandatory Panic-coverage check: after @test returns, the orchestrator re-runs the test command and rejects the output unless every test panics on todo!() (i.e. every test exercises at least one stubbed symbol). Any passing test is structural-only and routes back to @test. - Phase 6 decision table gets a "Stub-first run: tests pass with zero todo!() panics" row covering the same case. - @test's Test Philosophy gains an explicit Do-NOT-write list of structural-only patterns (variant_count, type ascriptions, Box::new(my_fn), struct-literal-only flows, all-pass-on-stubs) plus a positive rule: every test must call a function and assert on observable outcome, or return NOT_TESTABLE rather than pad the suite. Orchestrator boundary fix: - Phase 8 review loop replaces "fix them directly (no need to re-dispatch @make for small fixes)" with the principle "the orchestrator does not write production code; @make does". BLOCK, behavioral, correctness, and test-quality findings round-trip through @make. Only AST-preserving cosmetic edits (typos in comments, trailing newlines) may be applied directly. Compiler-detected issues (unused imports, dead code) go through @make.	2026-05-08 10:20:16 +02:00
Harald Hoyer	5b5c59aa84	feat(opencode): mandate stub-first @make pre-pass for Rust integration TDD Rust integration tests live in a separate test crate that imports from lib.rs, so any test referencing not-yet-existing public API can only RED at build time. The build error masks assertion diagnostics and makes the RED state opaque — no stack trace, no left/right values. For Rust tasks whose @test step writes an integration test against public API that does not yet exist, the orchestrator now dispatches a stub-first @make pass before @test runs: 1. @make adds the planned public API as todo!()-bodied stubs in lib.rs and any new src/<module>.rs. Signatures lifted verbatim from the Phase 5 task spec. Acceptance criterion is cargo check only — no test command runs. 2. @test writes the integration test, which now compiles and panics at todo!() with a stack trace — a clean MISSING_BEHAVIOR RED. 3. Phase 7 dispatches @make again to replace the todo!() bodies with real implementations. Two atomic commits per task: scaffold then implement. Phase 5's Rust test-path guidance now flags the two-dispatch requirement up front. test.md's Rust failure-classification hints recognize todo!() / unimplemented!() panics as MISSING_BEHAVIOR with a pointer to the workflow's stub-first section.	2026-05-07 05:42:16 +02:00
Harald Hoyer	d5d90d8b9f	fix(opencode): reject Rust src/tests/ paths as a wrong task spec A workflow run on a Bevy/Rust project produced the test-file path `src/tests/test_<feature>.rs`, which @test correctly flagged as contradictory: it isn't a valid Rust test location (would require declaring `mod tests;` in production source, which @test cannot do) yet the file-gate glob `/tests//*.rs` accidentally matched it. Phase 5 now gives language-aware Test File guidance: Python uses colocated or top-level `tests/`, Rust uses crate-level `tests/<feature>.rs`, and Rust unit-only tasks are routed to NOT_TESTABLE for @make to handle inline. Phase 6's file gate gains an explicit anti-pattern clause discarding any new file under `src/` even when the glob matches. @test's own File Constraint mirrors the anti-pattern so the agent rejects the bad path with BLOCKED before the orchestrator's gate even runs — defense in depth on both sides of the dispatch boundary.	2026-05-06 18:31:14 +02:00
Harald Hoyer	8fcf7e5d34	feat(opencode): make @make and @test polyglot (Python, Rust, nix devshell) Both agents previously hardcoded the Python/uv toolchain. They now detect the language from marker files (pyproject.toml, Cargo.toml, flake.nix) and run the appropriate test/lint/format/type-check commands for Python, Rust, or both. When a flake.nix devshell is present, every toolchain command is wrapped in `nix develop -c …`. @make's permission allowlist gains `cargo ` and `nix develop -c `, plus matching denies for cargo add/remove/install/publish. The Verification Tiers and Baseline Verification sections are rewritten as per-language bullets, and output/TDD-evidence examples are now language-neutral. Generalised the "no Kubernetes deployments" constraint to cover any deploy/publish. @test gains the same devshell + cargo allows (scoped to test, check, clippy, fmt only — no build/run/install). Its file constraint adds `tests/*/.rs` for Rust integration tests, with an explicit note that Rust unit tests stay with @make because they live inside production source files. Failure-classification hints add Rust compiler-error mappings, and the NOT_TESTABLE table gets a "Rust unit-only" row.	2026-05-06 17:09:34 +02:00
Harald Hoyer	c879870ccf	fix(opencode): remove temperature	2026-05-06 16:43:35 +02:00
Harald Hoyer	d22acf6906	refactor(opencode): let @pm read TODO.md via git show, drop tempfile Gives @pm narrowly-scoped bash access (git show , git rev-parse ) so it can read TODO.md directly from any git ref. The workflow no longer needs to mktemp + redirect the file before invoking the agent; Phase 2 just tells @pm the bare repo path and default branch and lets it run git show "$DEFAULT_BRANCH:TODO.md" itself. Cleanup steps for the temp snapshot are removed from Phase 10 and the failure handler.	2026-05-06 15:42:17 +02:00
Harald Hoyer	37be2d9505	fix(opencode): remove agent models and temperature	2026-05-06 15:33:11 +02:00
Harald Hoyer	2941faa822	refactor(opencode): make workflow forge-agnostic and read TODO.md from bare repo Drops all GitHub-specific tooling (gh CLI, draft PR creation) so the workflow stops at git commit and leaves push/PR/MR to the user. TODO.md is now expected to be a tracked file on the default branch. Phase 1 verifies the repo is bare via `git rev-parse --is-bare-repository`, resolves the default branch from HEAD / init.defaultBranch, and snapshots TODO.md via `git show "$DEFAULT_BRANCH:TODO.md"` to a tempfile that @pm reads in Phase 2. Phase 10 updates the live TODO.md inside the worktree and commits the change separately. The /review command drops its PR mode for the same reason; @pm documents the read-only-snapshot vs. live-worktree path distinction.	2026-05-06 15:28:08 +02:00
Harald Hoyer	4ec1561af4	feat(opencode): add multi-agent workflow agents and commands Adds @check, @simplify, @test, @make, @pm subagents and the /workflow and /review slash commands from the autonomous multi-agent workflow gist by ppries. @pm is rewritten to manage issues in a local ./TODO.md file instead of Linear (file-only access, documented schema, structured JSON output). /workflow is adapted: TODO.md-based issue context, generic worktree paths (no hardcoded ~/repos/veo/sunstone), generic branch examples, and a Phase 1 guard that verifies origin is on GitHub before any work begins.	2026-05-06 14:56:42 +02:00

18 commits
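
Commit 8fcf7e5d34 describes how @make and @test detect the project language from marker files (pyproject.toml, Cargo.toml, flake.nix) and wrap every toolchain command in `nix develop -c …` when a flake.nix devshell is present. The agents do this through prompt instructions, not code, so the following is only a minimal Python sketch of that rule. The function names, the MARKERS and TEST_COMMANDS tables, and the concrete test commands (`uv run pytest`, `cargo test`) are assumptions made for illustration and do not come from the repository.

```python
from pathlib import Path

# Marker files are taken from the commit message; the mapping itself is hypothetical.
MARKERS = {"pyproject.toml": "python", "Cargo.toml": "rust"}

# Illustrative test commands only; the real agents may run different ones.
TEST_COMMANDS = {
    "python": ["uv", "run", "pytest"],
    "rust": ["cargo", "test"],
}


def detect_languages(root: Path) -> list[str]:
    """Return the languages whose marker file exists directly under root."""
    return [lang for marker, lang in MARKERS.items() if (root / marker).is_file()]


def build_test_commands(root: Path) -> list[list[str]]:
    """Build one test command per detected language.

    If flake.nix is present, each command is prefixed with `nix develop -c`,
    mirroring the devshell wrapping described in the commit message.
    """
    wrap = ["nix", "develop", "-c"] if (root / "flake.nix").is_file() else []
    return [wrap + TEST_COMMANDS[lang] for lang in detect_languages(root)]
```

A project containing both pyproject.toml and Cargo.toml yields one command per language, which matches the "Python, Rust, or both" wording of the commit. A project with neither marker yields an empty list.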