Commit graph

842 commits

Author SHA1 Message Date
8bd096ff8d feat(halo): inc. mtp to 6 2026-05-19 06:40:13 +02:00
46cdf2f6f6 Revert "feat(halo): new MTP version"
This reverts commit 53c11a120c.
2026-05-19 06:40:13 +02:00
Harald Hoyer
b185a6159f feat(darwin): enable launchd ssh-agent with FIDO/SK support
Apple's built-in ssh-agent has no sk-api/libfido2 support and refuses
signing operations for ed25519-sk / ecdsa-sk hardware keys. Enable the
existing metacfg.security.ssh module (which runs pkgs.openssh's
ssh-agent under launchd) via the common darwin suite, and export
SSH_AUTH_SOCK from environment.shellInit so bash, zsh, and fish (via
/etc/fish/foreign-env/shellInit) all point at the nix-managed socket.
2026-05-18 12:18:22 +02:00
0990389464 feat(m4): install defuddle-cli 2026-05-16 14:13:31 +02:00
a29301179b feat(opencode): install kepano/obsidian-skills into ~/.agents/skills
Add obsidian-skills as a flake input (flake = false) and map each
skill subdirectory into ~/.agents/skills/<skill>, alongside the
existing local skills. Updates flow through `nix flake update
obsidian-skills`.
2026-05-16 14:13:31 +02:00
b0fc627d0a feat: add config/claude/statusline-command.sh 2026-05-16 13:41:09 +02:00
492362fa31 feat(amd): enable Wake-on-LAN on enp7s0 2026-05-16 13:40:25 +02:00
4da2eed356 chore(homes): remove broken x86_64-linux home configs
- harald@sgx-nixos: orphan, no matching NixOS system and no
  home.stateVersion set, so it failed standalone evaluation
- harald@sgx-azure: referenced metacfg.tools.direnv.enable but no
  modules/home/tools/direnv exists, causing eval failure
2026-05-16 11:26:31 +02:00
a2b7dc510b fix(pi): PI_OFFLINE 1 2026-05-16 08:51:46 +02:00
4d32148059 chore: use ~/.agent/skills 2026-05-15 21:26:56 +02:00
c65c0d8756 chore(opencode): remove domain from config 2026-05-15 20:41:44 +02:00
53c11a120c feat(halo): new MTP version 2026-05-15 19:32:51 +02:00
Harald Hoyer
b5ae777a4a feat(home/js): deploy ~/.npmrc and ~/.bunfig.toml everywhere
New metacfg.cli-apps.js module (enabled by default) pins minimum
release ages for npm and bun across all home configurations, so the
mitigation against newly published malicious packages applies
uniformly rather than living as untracked dotfiles on one machine.
2026-05-15 08:41:56 +02:00
38d2d4f4ae fix(halo): q6_k with mtp 2 2026-05-15 07:47:43 +02:00
baaab26eb7 chore: flake.lock 2026-05-14 08:04:19 +02:00
1e3b2fc9a7 feat(halo): unsloth MTP 2026-05-13 19:42:54 +02:00
42c52bd87f refactor(mx): drive opencode bot via direct chat-completions API
The bot no longer shells out to `opencode run`. Instead it POSTs to the
OpenAI-compatible /chat/completions endpoint exposed by llama-server on
halo.hoyer.tail:8000 directly. This removes the Bun/sqlite cold-start
overhead per request, drops the pkgs.opencode runtime dependency, and
eliminates the ExecStartPre dance that materialized config.json into the
service's $HOME.

Conversation history is now stored as a proper OpenAI `messages` list
with system/user/assistant roles, instead of the XML blob that was
inlined into a single `opencode run` argument. The interactive opencode
setup (config/opencode/config.json) is unchanged — only the bot stops
depending on it.

The module gains a `modelBaseUrl` option; `model` is now the bare model
name (`halo-8000`) without the provider/ prefix that the opencode CLI
required.
2026-05-13 16:38:58 +02:00
aa3bc3c457 feat(pi): package @earendil-works/pi-coding-agent as pi
Vendors the npm tarball + lockfile and wraps the `pi` binary with `fd` and
`ripgrep` on PATH. Also installs it on the m4 darwin host.

`buildNpmPackage` is pulled from `inputs.unstable` because nixos-25.11's
`prefetch-npm-deps-0.1.0` panics on cacache index entries that contain
either multiple lines or JSON values with embedded spaces (npm's
`accept: application/...; q=1.0, ...` headers). For this lockfile,
`@esbuild/netbsd-arm64` and `@rollup/rollup-linux-x64-musl` trigger
both conditions and `--map-cache` fails with `EOF while parsing a
string at line 1 column 369`. Fixed upstream in nixos-unstable, which
now uses `lines()` + `split_once('\t')`.
2026-05-13 16:34:38 +02:00
9e692f45ba chore: bot pw 2026-05-13 15:31:26 +02:00
d8e8293c0e feat(mx): add Nextcloud Talk opencode bot pointing at halo.hoyer.tail:8000
Mirrors the existing nextcloud-claude-bot setup but invokes `opencode run`
against the local `halo-8000` provider/model. The bot listens on
127.0.0.1:8086, is exposed via the `/_opencode-bot/` location on
nc.hoyer.xyz, and uses `@Halo` as its mention trigger in group chats.

The opencode config (config/opencode/config.json) is installed into the
service's $HOME/.config/opencode/ on each start, so the bot picks up the
same provider definition the user uses interactively. The model map keys
are renamed to `halo-8000` / `halo-8001` so the canonical
`provider/model` reference works without an alias indirection.
2026-05-13 15:08:18 +02:00
dadfb07914 fix(halo): set --alias halo-8000 2026-05-13 14:52:49 +02:00
83b2ed3b57 fix(opencode): full halo URL 2026-05-13 14:42:25 +02:00
fb32301831 chore(opencode): fix URLS 2026-05-13 11:40:04 +02:00
f9a2e0d301 chore(x1,amd): disable cratedocs-mcp service
Keep it enabled only on sgx.
2026-05-13 11:35:59 +02:00
4ce7bcf354 fix(mx): make tailscale exit-node advertisement actually apply
tailscale set is strict about boolean flags and silently ignores
--advertise-exit-node without =true. Result: the tailscaled-set unit
ran cleanly but AdvertiseRoutes stayed null. Spell the value out so the
flag takes effect.
2026-05-13 09:28:20 +02:00
b9cfdc99a7 feat(base): blacklist unused network kernel modules
Disable rxrpc, kafs, af_key, esp4, esp6 across all systems that enable
metacfg.base. None of them are used on these hosts, and they have a
history of CVEs — blacklisting reduces kernel attack surface.
2026-05-13 09:16:21 +02:00
67b7c3a9fd feat(headscale): add ACL policy, isolate mx, make mx an exit node
Introduces a headscale ACL policy (file-mode) plus matching client config:

- New systems/x86_64-linux/attic/headscale-policy.hujson:
  * tag:llm restricts a node to talking only to halo:8000
  * all other harald@ nodes have full mesh access to each other
  * harald@ nodes can route internet traffic via approved exit nodes
  * autoApprovers.exitNode = [tag:llm] auto-approves the exit route
    advertised by any tag:llm node (currently mx)

- attic headscale.nix: wire policy.mode = "file" / policy.path to
  the .hujson above.

- mx default.nix: enable useRoutingFeatures = "server" (needed for IP
  forwarding) and add extraSetFlags = ["--advertise-exit-node"] so the
  flag is reapplied on every activation, not just initial login.

Operational steps after deploy:
  headscale nodes tag -i 10 -t tag:llm
2026-05-13 09:06:40 +02:00
87bdaf15da fix(attic): keep headscale domain as headscale.hoyer.xyz
Avoid breaking existing clients and the registered OIDC redirect URI by
keeping the original domain. Only the host backing it changes (mx -> attic);
DNS just needs to be repointed.
2026-05-13 08:48:37 +02:00
12c25bcde8 refactor(attic): move headscale from mx to attic
Headscale is moving off the mx mailserver onto the attic cache host.
The new public URL is https://headscale.hoyer.world.

- Switch from useACMEHost = "hoyer.xyz" (mx wildcard DNS-01) to
  enableACME = true, since attic only has HTTP-01 configured.
- Move headscale port to 8081 to avoid clashing with atticd on 8080.
- Drop the 192.168.178.254 LAN nameserver from dns.nameservers.global,
  which isn't reachable from the Hetzner instance.

Operational steps still required on attic:
- Provision /var/lib/headscale/client_secret
- Migrate the headscale state DB from mx
- Point headscale.hoyer.world DNS at attic
- Update the Nextcloud OIDC client's redirect URI
2026-05-13 08:42:46 +02:00
1094facb1e refactor(opencode): use $1 substitution, drop BASE_BRANCH and arg parsing
The workflow command was asking the model to parse $ARGUMENTS into
positional tokens (issue ID, optional base branch). Opencode supports
$1, $2, $3, ... for direct positional substitution at template-load
time — the model never needs to see or parse a joined argument string.
Two simplifications:

1. Replace $ARGUMENTS-parsing with $1. Every reference to the issue ID
   in commands/workflow.md is now $1, which opencode substitutes
   literally before the prompt loads. Eliminates a class of parsing
   errors (whitespace edge cases, mis-splits, hallucinated extra args)
   and removes the orchestrator's need to "remember" an ISSUE_ID
   variable across phases.

2. Drop BASE_BRANCH entirely. It was used in three places:
   - Phase 1 "branch != base" check — actual concern is "don't run on
     a protected branch." Replace with refusal on main/master/develop/
     any matching protected name.
   - Phase 8 `git diff "$BASE_BRANCH"...HEAD` — anchor the diff to
     START_SHA captured at Phase 1 instead. With one-task-per-run
     (ADR-21), the run produces a small bounded diff from a known
     starting point; START_SHA is more accurate than diffing against
     a separate branch tip that may have moved.
   - Failure Handler recovery procedure — user-facing instructions;
     name "your usual integration branch" instead of $BASE_BRANCH.

The command signature collapses from `/workflow <ISSUE-ID> [base-branch]`
to just `/workflow <ISSUE-ID>` — single positional, zero parsing.

Routing matrix Phase 1 row updated for the protected-branch refusal;
ADR-14's recovery-procedure paragraph no longer names BASE_BRANCH.

Refs: config/opencode/workflow-design.md ADR-14, ADR-21
2026-05-13 07:20:41 +02:00
e440bf39fd feat(halo): llama-server-27B-MTP.nix 2026-05-12 16:16:15 +02:00
33690dcc98 feat(halo): new mtp llama.cpp 2026-05-12 16:14:13 +02:00
ca4ee90828 feat(halo): coder next 2026-05-11 12:22:34 +02:00
7b04b55ce8 feat(halo): cache-ram 0 2026-05-10 20:50:08 +02:00
04342222a2 fix(halo): 27b 2026-05-10 20:46:12 +02:00
689cdec28d feat(halo): activate qwen 27b 2026-05-10 20:44:38 +02:00
bef528e26a feat(halo): use qwen-35b-a3b 2026-05-10 20:44:38 +02:00
267c05b107 feat(opencode): give @make a concrete test-smell checklist
Real-world observation: @make struggles when @test sets up tests
incorrectly because the existing escalate: test_design trigger is
described abstractly ("test seems to demand wrong thing"). When @make
sees an unfamiliar smell, it tends to attempt implementation, fail,
attempt again, and only escalate after burning 2-3 cycles. The
protocol exists; the recognition criteria don't.

Restructure Entry Validation step 5 into a named "Test triage" step
with a concrete checklist that fires *before* any implementation
attempt. Four categories of smells:

- **Mocking smells:** mocks the SUT, >2 mocks, mock-call-as-primary
  assertion, internal-boundary mocking
- **Structural-only smells:** variant counts, type ascriptions,
  function-pointer coercion, struct-literal-with-field-reads,
  stub-first no-panics (mirrors @test.md's anti-patterns)
- **Wrong-target smells:** asserts on private state / log strings,
  demands contradicting spec, physically impossible demands
- **Setup smells:** fixtures bypassing production validation,
  wrong-module imports, references to nonexistent infrastructure

Iteration Limits step 5 now cross-references the same checklist
instead of restating abstract criteria, so both gates apply the same
recognition rules with a single source of truth.

A "NOT for" caveat prevents over-eager escalation: when the test is
fine but the implementation is just hard, that's not a smell, that's
the test doing its job.

The checklist is inlined (not pulled from @test.md at runtime) because
subagents have separate contexts. Periodic manual sync between
@make.md's checklist and @test.md's anti-patterns is acceptable —
they shouldn't drift much in practice.

Refs: config/opencode/agents/test.md (anti-patterns + structural-only
list it mirrors), config/opencode/workflow-design.md ADR-19 (unified
Implementation Incomplete diagnosis path)
2026-05-08 15:16:33 +02:00
56713cd7b8 feat(opencode): @pm owns the TODO commit (ADR-23)
The orchestrator was running `git add ./TODO/` and `git commit -m
chore(todo): ...` itself in Phase 9, baking filesystem-tracker
specifics into commands/workflow.md. The point of @pm as an
abstraction is that it should be swappable — a Linear-backed @pm or a
Notion-backed @pm should drop in without touching the workflow
command. With API-backed trackers, "commit the TODO updates" is a
no-op and `git add ./TODO/` is wrong.

Push persistence shape behind the @pm boundary:

- New @pm capability `Commit pending changes` accepts a commit message
  and returns {ok, sha, message}. Filesystem @pm runs `git add ./TODO/`
  + `git commit -m <msg>` and returns the SHA. Tracker-backed
  implementations no-op and return sha: null.
- @pm gains tightly-scoped bash access: `git add ./TODO/*`,
  `git commit -m *`, `git status --porcelain ./TODO/*` only. Push,
  reset, rebase, checkout, branch, tag are explicit denies. Everything
  else falls through to the default deny.
- Phase 9 "Commit TODO Changes" replaces orchestrator-side git with a
  @pm dispatch; orchestrator constructs the message from run context
  and captures the returned SHA for the summary.
- Failure Handler gains a step 5 (commit pending after the failure
  comment add). Today the comment is left uncommitted in the working
  tree and gets discarded with the throwaway worktree (ADR-14) —
  forensic loss. With this change the failure note lands as its own
  commit on the failed branch.
- Routing matrix Phase 9 rows updated; ADR-22's superseded wording
  about orchestrator-side staging removed.

Stub-pass / body-pass / wip code commits remain orchestrator-owned —
those are code, not tracker-specific.

Refs: config/opencode/workflow-design.md ADR-23
2026-05-08 14:04:47 +02:00
a3e0de6d04 feat(opencode): hide TODO paths from orchestrator (ADR-22)
In recent runs the orchestrator skipped @pm and edited TODO/ files
itself, despite the workflow.md anti-pattern warning. Root cause: the
workflow doc literally taught the orchestrator the path layout
(`./TODO/<ID>.md`), making self-help a discoverable shortcut.

Fix: remove the recipe. The orchestrator now never constructs or reads
any per-issue TODO path. All TODO operations go through @pm dispatches;
@pm returns the absolute file path of every issue it touches, and the
orchestrator captures and reuses those paths downstream.

- Phase 1 loses the TODO-existence and depends-on checks (former steps
  3 and 9 of the recent edit) — Phase 1 is now git/worktree-only.
- Phase 2 expands @pm's existing dispatch into a `Validate run
  prerequisites` operation that returns either {ok: true,
  issue_file_path, issue: {...}} or {ok: false, error_code, message}
  with error_code in {tracker_missing, issue_not_found,
  dependency_unmet, dependency_missing}. depends-on enforcement moves
  here.
- Phase 7 split_needed exit, Phase 9 TODO Update, Phase 9 Commit TODO
  Changes, and Failure Handler all reference @pm-returned paths or use
  `git add ./TODO/` blanketly (safe because Phase 1 verified clean tree
  and only @pm writes there during a run).
- pm.md gains a path-return rule: every read returns issue_file_path,
  every write returns the modified paths. Run-Prerequisite Output
  format documented with all four error codes.
- ADR-22 captures the rationale; routing matrix updates Phase 1/2 rows;
  pipeline diagram labels updated.

The fix is discoverability-only — no permission deny on TODO/, per
explicit user direction. The schema lives in agents/pm.md, which the
orchestrator does not load.

Refs: config/opencode/workflow-design.md ADR-22
2026-05-08 13:45:51 +02:00
3e515d54eb feat(opencode): allow agents to read external Rust crate source
@make, @test, @check often need to inspect dependency source (trait
definitions, impl details, test patterns) to inform implementation or
verify findings. Opencode applies a CWD check on tool access, so reads
outside the worktree previously prompted for each access.

- Add permission.read/grep/glob path allowlists for the three locations
  cargo deps live: ~/.cargo/registry/src/, ~/.cargo/git/checkouts/, and
  /nix/store/*-vendor-*/ for crane / buildRustPackage projects.
- Document the discovery pattern in each agent: `cargo metadata
  --format-version 1` returns absolute paths via packages[].manifest_path.
- Cross-reference the registry paths from the permission.bash allowlist
  comment so future readers see the bash inspection commands (rg/ls)
  intentionally accept paths outside CWD.
- @check gets its first permission block (was tools-only before).

Path-pattern syntax for read/grep/glob isn't fully documented; if
opencode rejects it, fall back to `permission: { external_directory:
allow }` at the project config level.
2026-05-08 13:24:30 +02:00
af6481a5a7 feat(opencode): one-task-per-run model + 9 routing fixes (ADRs 13-21)
Captures the design grilling outcome. Adds ADRs 13-21 covering:
- run-level plan_rework_remaining counter to bound P3<->P5.5/P7/P8 thrash
- non-resumable workflow with throwaway-worktree recovery procedure
- @simplify advisory at every gate (not just Phase 8)
- Phase 8 fix specs go to disk as task-fix-N.md (preserves ADR-6)
- Phase 5.5 BLOCK protocol: orchestrator edits plan, decrements counter, re-enters P4
- Phase 8 NOT_TESTABLE manifest in reviewer prompt
- unified Implementation Incomplete diagnosis (test_design / production_logic / split_needed)
- Phase 1 working-tree cleanliness + depends-on enforcement
- one-task-per-run pivot: Phase 5 still splits N tasks, only task-1 runs;
  tasks 2..N filed as sub-issues with rich seed bodies; split_needed at P7
  aborts to Failure Handler (one-task-per-run = no salvageable prior work)

Auto-resolves big-diff Phase 8 reviews, cross-task regression-within-run, and
mid-flight task-split routing. Rewrites routing matrix and three Mermaid
diagrams; updates @pm (depends-on frontmatter, split-time filing), @check
(third diagnosis verdict), @make (escalate: split_needed flag).
2026-05-08 13:02:54 +02:00
0b15944d1c docs(opencode): make workflow-design Mermaid diagrams Forgejo-compatible
Forgejo's Mermaid parser is stricter than GitHub's and rejected two
diagrams in workflow-design.md:

1. Flowchart 3.1 — `@check`, `@test`, `@make` in pipe-delimited edge
   labels were tokenised as LINK_ID (newer Mermaid uses `@{...}` for
   edge IDs), e.g. `P7E -->|@check → @test → @make| P7` failed at
   the first @.
2. State diagram 3.2 — the second colon inside transition labels
   (`escalate: test_design`) collided with the `:` field separator
   that splits transition from label.

Drops the @-prefix from labels in all three diagrams (`@check` → `check`
in prose-of-the-label only; ADRs and prose elsewhere keep `@check`
backticked, which is just markdown). Replaces second colons with
descriptive text. Drops parentheses from state-diagram transition
labels. Drops the Unicode arrow `→` in favour of plain words.
Quotes the flowchart node-label strings to keep `<br/>` safe.

The ADR text and prose continue to use `@<name>` references — those
live in markdown, not Mermaid, and render the same.
2026-05-08 10:24:57 +02:00
af0c1d6ea5 docs(opencode): add workflow-design.md as design rationale + decision log
Operational rules in commands/workflow.md and the agent files have
been accreting through repeated patches, with the rationale scattered
across commit messages and conversations. New gaps kept surfacing
after the fact (Phase 7 mid-impl escalation, Phase 8 routing for
test-design findings, Phase 5.5 entirely missing) because there was
no single place to audit the flow.

Adds config/opencode/workflow-design.md as a sibling to commands/
and agents/. It is the design rationale and decision log; operational
rules stay in the command and agent files. The intended flow is:
discuss new ideas / failure modes here → reach a decision → update
the operational files → record the decision in the ADR log.

Pre-populated with: cast & responsibilities table; three Mermaid
diagrams (phase pipeline, Phase 7 escalation state machine, issue
lifecycle); a routing matrix that lists every observed (phase,
signal) → action pair so gaps are visible at a glance; 12 ADRs
covering decisions made over the past several days (forge-agnostic,
TODO/ folder, worktree-only, polyglot agents, absolute-path dispatch,
run artifacts on disk, stub-first Rust TDD, @test inside cfg(test)
mod, Phase 5.5, single-mode @pm, file follow-ups, Phase 7 mid-impl
escalation); and 5 open questions teed up for future discussion.
2026-05-08 10:20:16 +02:00
534361f1b5 feat(opencode): extend Phase 7 escalation to mid-implementation test-design errors
Phase 7's escalation rule was gated on @make flagging concerns "during
entry validation" only. When @make got past entry validation, started
implementing, and ground for 2-3 attempts because the test demanded
impossible production code, the orchestrator had no documented route
— it would re-dispatch @make with marginal context tweaks instead of
recognizing the failure as test-architecture.

Splits the escalation into two clearly-named paths (entry-validation
vs mid-implementation) that both route through @check (test diagnosis)
→ @test (redesign) → fresh @make. Bounded at max 2 escalation cycles
before reverting to a Phase 3 plan revisit, to prevent thrashing when
the actual problem is upstream.

@make.md gains a new Iteration Limits red-flag class — "Test-design
suspicion" — instructing @make to stop and report with an explicit
`escalate: test_design` flag in the Blocking Issue section. The flag
is the routing signal the orchestrator switches on.
2026-05-08 10:20:16 +02:00
aac4d44a49 feat(opencode): file unresolved bugs/blockers as TODO sub-issues in Phase 9
A workflow run wrapped up with "Unresolved: Score not resetting on game
restart (pre-existing bug, out of scope)" — a real bug discovered while
implementing GAL-39. Buried in summary.md, which is per-run, untracked,
overwritten on the next run, and read by nobody (the user has walked
away by design).

Adds a File Follow-ups subsection to Phase 9, after the TODO Update.
Tracked-worthy items are routed through @pm as sub-issues of the
current issue (parent: $ISSUE_ID), so they auto-show in the parent's
Sub-issues list and don't need a README.md category at unattended
runtime. Three categories file an issue:

- Pre-existing bugs found out of scope → label `bug`
- Unresolved review-loop blockers (Phase 4 or 8 cycle exhaustion)
  → label `followup`
- @test NOT_TESTABLE "future seam" notes → label `tech-debt`

Things explicitly NOT filed: @simplify advisories the orchestrator
chose not to act on (records, not missing work), cosmetic nits,
duplicates of existing issues. Those live in the run summary's new
"Advisory notes (not filed)" section.

Renames "Commit TODO Changes" subsection so the worked issue update
plus any filed follow-ups commit together as one atomic chore(todo)
commit. The Run Summary's old "Unresolved items" bullet is replaced
with two sharper bullets: "Filed follow-ups" (lists IDs of created
sub-issues) and "Advisory notes (not filed)".
2026-05-08 10:20:16 +02:00
c3407c9c98 refactor(opencode): drop @pm git-ref read mode, no longer used by workflow
@pm originally had two read modes — git-ref (via `git show <ref>:TODO.md`)
and filesystem. Git-ref existed because the workflow once ran in a bare
repo with no working tree. Once the workflow was simplified to assume
opencode is launched in the worktree, every dispatch (Phase 2 read,
Phase 9 update, Failure handler) uses filesystem mode. Git-ref mode
became dead weight: it added bash permissions, an allowlist, a "Bash
Discipline" section, and a dual-mode "How to Read" section, but the
workflow never invoked it. A reviewer correctly flagged the resulting
inconsistency between the two-mode docs and the single-mode usage.

@pm is now single-mode. Bash access is removed (bash: false, no
permission allowlist). The "How to Read" section collapses to "you
operate on TODO/ via the filesystem only" with one explicit pointer
that ad-hoc historical reads (`git show main:TODO/GAL-39.md`) are
out of scope — the user can run that themselves.

The workflow drops the now-redundant "(live filesystem mode)"
qualifier from Phase 2 / Phase 9 / Failure handler dispatches and
the Roles & Dispatch table updates @pm's constraint to "No bash."
2026-05-08 10:20:16 +02:00
cc971b80e0 feat(opencode): add Phase 5.5 task-split review by @check
ppries' README mentioned "@check reviews task split for completeness
and coverage" as a workflow step but the gist's actual workflow.md
never implemented it, and neither did ours. Without a split-review
gate between Phase 5 and Phase 6, an over- or under-split task
surfaces only at Phase 8 final review — after expensive @test and
@make dispatches have already run on a broken split.

Adds Phase 5.5: a short, focused review of the task split as a set,
dispatched only to @check (split is structural / coverage, not
complexity, so @simplify is not involved). The dispatch passes the
absolute paths to plan.md and every task-N.md and asks @check to
evaluate the split against five questions: coverage, no overlap,
single-purpose, integration contracts, testable AC.

Loop limited to 2 cycles (less than the plan-review's 3), with a
BLOCK verdict routing back to Phase 4 when the plan itself does not
decompose cleanly. The phase is explicitly framed as "a quick gate,
not a deep review" — no line-by-line code feedback (there's no code
yet), no design re-litigation (that was Phase 4) — to keep it from
expanding into a second plan review.

No phase renumbering downstream — 5.5 fits between 5 and 6 without
disturbing existing cross-references.
2026-05-08 10:20:16 +02:00
236b4d2470 fix(opencode): teach orchestrator about subagents and enforce on-disk artifacts
Two related orchestration failures from recent runs:

1. An orchestrator missed the multi-agent concept entirely and produced
   reviews / implementations itself instead of dispatching @check / @make.
   The workflow described phases as "Dispatch @<name>" everywhere but
   never explained who the cast was, what "dispatch" meant, or that the
   orchestrator (agent: build) is distinct from the subagents.
2. Another orchestrator dispatched @test pointing at a $RUN_DIR/task-N.md
   that it never wrote — the file-write instruction in Phase 5 was a
   single bolded sentence inside a paragraph, easy to skim past, and
   nothing checked artifact existence before dispatching.

Adds a top-level "Roles & Dispatch" section between the parse line and
Run Artifacts. It establishes the multi-agent model, lists the cast
(@check / @simplify / @test / @make / @pm) with one-line role and
permission notes, defines "Dispatch" as a tool call (not a role-play
instruction), and lists three anti-patterns the orchestrator must
avoid (acting as a subagent, skipping a dispatch, paraphrasing
artifacts instead of letting subagents read them from disk).

Restructures Phase 5 as five explicit numbered steps. Step 4 mandates
writing each task to $RUN_DIR/task-<N>.md and verifying with test -f;
step 5 requires dropping inline copies once the file is the source of
truth. The phase is "not done" until every task file exists on disk.

Adds a row to Dispatch Hygiene's Pre-Dispatch Validation table that
requires test -f verification of any artifact path the dispatch
references; missing files route back to the producing phase.
2026-05-08 10:20:16 +02:00
25f4c6f179 feat(opencode): write plan and task specs to .workflow/run-<id>/ on disk
Plans and task specs were previously re-emitted as inline prompt text on
every dispatch. That meant @check and @simplify might receive paraphrased
versions of the same plan, mid-loop revisions could leak as "actually let
me reconsider" passes, and the same content rode through orchestrator
context many times across review/test/make dispatches.

The orchestrator now writes finalized artifacts to a per-run directory:

  .workflow/run-<ISSUE-ID>/
    plan.md         # Phase 3 output
    task-1.md       # Phase 5 output, one file per task
    task-2.md
    summary.md      # Phase 9 output (was .workflow/workflow-summary.md)

Subagents read these by absolute path; the dispatch prompt body shrinks
to agent role, artifact path, and short per-dispatch context. Mid-loop
revisions (Phase 4 review cycles, etc.) edit the file in place so every
subsequent dispatch sees the same byte-for-byte source of truth — the
Finalized-Text Rule has a physical anchor.

Phase 1 captures WORKTREE_PATH, ISSUE_ID, and RUN_DIR. Phase 3 mkdirs
the run directory and writes plan.md. Phase 4 dispatches reviewers
against plan.md by path. Phase 5 writes one task-N.md per task. Phase
6/7 dispatch @test/@make against task-N.md by path; the @test→@make
TDD handoff stays inline. Phase 8 reviewers re-read plan.md from disk.
Phase 9 renames "Local Summary" to "Run Summary" and writes to
$RUN_DIR/summary.md. The staging exclusion broadens from a single
file to the whole .workflow/ tree, and Failure Handling follows suit.
2026-05-08 10:20:16 +02:00