nixcfg

Author	SHA1	Message	Date
Harald Hoyer	267c05b107	feat(opencode): give @make a concrete test-smell checklist Real-world observation: @make struggles when @test sets up tests incorrectly because the existing escalate: test_design trigger is described abstractly ("test seems to demand wrong thing"). When @make sees an unfamiliar smell, it tends to attempt implementation, fail, attempt again, and only escalate after burning 2-3 cycles. The protocol exists; the recognition criteria don't. Restructure Entry Validation step 5 into a named "Test triage" step with a concrete checklist that fires before any implementation attempt. Four categories of smells: - Mocking smells: mocks the SUT, >2 mocks, mock-call-as-primary assertion, internal-boundary mocking - Structural-only smells: variant counts, type ascriptions, function-pointer coercion, struct-literal-with-field-reads, stub-first no-panics (mirrors @test.md's anti-patterns) - Wrong-target smells: asserts on private state / log strings, demands contradicting spec, physically impossible demands - Setup smells: fixtures bypassing production validation, wrong-module imports, references to nonexistent infrastructure Iteration Limits step 5 now cross-references the same checklist instead of restating abstract criteria, so both gates apply the same recognition rules with a single source of truth. A "NOT for" caveat prevents over-eager escalation: when the test is fine but the implementation is just hard, that's not a smell, that's the test doing its job. The checklist is inlined (not pulled from @test.md at runtime) because subagents have separate contexts. Periodic manual sync between @make.md's checklist and @test.md's anti-patterns is acceptable — they shouldn't drift much in practice. Refs: config/opencode/agents/test.md (anti-patterns + structural-only list it mirrors), config/opencode/workflow-design.md ADR-19 (unified Implementation Incomplete diagnosis path)	2026-05-08 15:16:33 +02:00
Harald Hoyer	3e515d54eb	feat(opencode): allow agents to read external Rust crate source @make, @test, @check often need to inspect dependency source (trait definitions, impl details, test patterns) to inform implementation or verify findings. Opencode applies a CWD check on tool access, so reads outside the worktree previously prompted for each access. - Add permission.read/grep/glob path allowlists for the three locations cargo deps live: ~/.cargo/registry/src/, ~/.cargo/git/checkouts/, and /nix/store/-vendor-/ for crane / buildRustPackage projects. - Document the discovery pattern in each agent: `cargo metadata --format-version 1` returns absolute paths via packages[].manifest_path. - Cross-reference the registry paths from the permission.bash allowlist comment so future readers see the bash inspection commands (rg/ls) intentionally accept paths outside CWD. - @check gets its first permission block (was tools-only before). Path-pattern syntax for read/grep/glob isn't fully documented; if opencode rejects it, fall back to `permission: { external_directory: allow }` at the project config level.	2026-05-08 13:24:30 +02:00
Harald Hoyer	af6481a5a7	feat(opencode): one-task-per-run model + 9 routing fixes (ADRs 13-21) Captures the design grilling outcome. Adds ADRs 13-21 covering: - run-level plan_rework_remaining counter to bound P3<->P5.5/P7/P8 thrash - non-resumable workflow with throwaway-worktree recovery procedure - @simplify advisory at every gate (not just Phase 8) - Phase 8 fix specs go to disk as task-fix-N.md (preserves ADR-6) - Phase 5.5 BLOCK protocol: orchestrator edits plan, decrements counter, re-enters P4 - Phase 8 NOT_TESTABLE manifest in reviewer prompt - unified Implementation Incomplete diagnosis (test_design / production_logic / split_needed) - Phase 1 working-tree cleanliness + depends-on enforcement - one-task-per-run pivot: Phase 5 still splits N tasks, only task-1 runs; tasks 2..N filed as sub-issues with rich seed bodies; split_needed at P7 aborts to Failure Handler (one-task-per-run = no salvageable prior work) Auto-resolves big-diff Phase 8 reviews, cross-task regression-within-run, and mid-flight task-split routing. Rewrites routing matrix and three Mermaid diagrams; updates @pm (depends-on frontmatter, split-time filing), @check (third diagnosis verdict), @make (escalate: split_needed flag).	2026-05-08 13:02:54 +02:00
Harald Hoyer	534361f1b5	feat(opencode): extend Phase 7 escalation to mid-implementation test-design errors Phase 7's escalation rule was gated on @make flagging concerns "during entry validation" only. When @make got past entry validation, started implementing, and ground for 2-3 attempts because the test demanded impossible production code, the orchestrator had no documented route — it would re-dispatch @make with marginal context tweaks instead of recognizing the failure as test-architecture. Splits the escalation into two clearly-named paths (entry-validation vs mid-implementation) that both route through @check (test diagnosis) → @test (redesign) → fresh @make. Bounded at max 2 escalation cycles before reverting to a Phase 3 plan revisit, to prevent thrashing when the actual problem is upstream. @make.md gains a new Iteration Limits red-flag class — "Test-design suspicion" — instructing @make to stop and report with an explicit `escalate: test_design` flag in the Blocking Issue section. The flag is the routing signal the orchestrator switches on.	2026-05-08 10:20:16 +02:00
Harald Hoyer	8fcf7e5d34	feat(opencode): make @make and @test polyglot (Python, Rust, nix devshell) Both agents previously hardcoded the Python/uv toolchain. They now detect the language from marker files (pyproject.toml, Cargo.toml, flake.nix) and run the appropriate test/lint/format/type-check commands for Python, Rust, or both. When a flake.nix devshell is present, every toolchain command is wrapped in `nix develop -c …`. @make's permission allowlist gains `cargo ` and `nix develop -c `, plus matching denies for cargo add/remove/install/publish. The Verification Tiers and Baseline Verification sections are rewritten as per-language bullets, and output/TDD-evidence examples are now language-neutral. Generalised the "no Kubernetes deployments" constraint to cover any deploy/publish. @test gains the same devshell + cargo allows (scoped to test, check, clippy, fmt only — no build/run/install). Its file constraint adds `tests/*/.rs` for Rust integration tests, with an explicit note that Rust unit tests stay with @make because they live inside production source files. Failure-classification hints add Rust compiler-error mappings, and the NOT_TESTABLE table gets a "Rust unit-only" row.	2026-05-06 17:09:34 +02:00
Harald Hoyer	37be2d9505	fix(opencode): remove agent models and temperature	2026-05-06 15:33:11 +02:00
Harald Hoyer	4ec1561af4	feat(opencode): add multi-agent workflow agents and commands Adds @check, @simplify, @test, @make, @pm subagents and the /workflow and /review slash commands from the autonomous multi-agent workflow gist by ppries. @pm is rewritten to manage issues in a local ./TODO.md file instead of Linear (file-only access, documented schema, structured JSON output). /workflow is adapted: TODO.md-based issue context, generic worktree paths (no hardcoded ~/repos/veo/sunstone), generic branch examples, and a Phase 1 guard that verifies origin is on GitHub before any work begins.	2026-05-06 14:56:42 +02:00
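
Commit 8fcf7e5d34 describes detecting the language from marker files and wrapping toolchain commands in `nix develop -c` when a flake.nix is present. A minimal sketch of that decision follows, assuming the agents only need a test command per language. The function names and the exact `uv run pytest` / `cargo test` invocations are illustrative assumptions; the commit does not spell out the commands the agents actually run.

```python
from pathlib import Path

# Marker files named in the commit message, mapped to the toolchain they signal.
LANGUAGE_MARKERS = {"pyproject.toml": "python", "Cargo.toml": "rust"}
DEVSHELL_MARKER = "flake.nix"

# Hypothetical test commands (assumption: not taken from the agent files).
TEST_COMMANDS = {
    "python": ["uv", "run", "pytest"],
    "rust": ["cargo", "test"],
}


def detect_languages(root: Path) -> list[str]:
    """Return every language whose marker file exists directly under root."""
    return [
        lang
        for marker, lang in LANGUAGE_MARKERS.items()
        if (root / marker).is_file()
    ]


def plan_test_commands(root: Path) -> list[list[str]]:
    """Build one test command per detected language.

    When a flake.nix is present, each command is prefixed with
    `nix develop -c` so it runs inside the devshell.
    """
    prefix = ["nix", "develop", "-c"] if (root / DEVSHELL_MARKER).is_file() else []
    return [prefix + TEST_COMMANDS[lang] for lang in detect_languages(root)]
```

A project containing both pyproject.toml and Cargo.toml yields two commands, which matches the "Python, Rust, or both" wording in the commit message.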

7 commits
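
Commit 3e515d54eb mentions that `cargo metadata --format-version 1` returns absolute paths via packages[].manifest_path. The sketch below shows one way to turn that JSON into source directories an agent could read. The helper names and the trimmed sample document (including its path) are made up for illustration; real output carries many more fields.

```python
import json
import subprocess
from pathlib import Path


def read_cargo_metadata() -> str:
    """Run cargo in the current directory and return its JSON output."""
    result = subprocess.run(
        ["cargo", "metadata", "--format-version", "1"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def dependency_source_dirs(metadata_json: str) -> dict[str, Path]:
    """Map "name version" to the directory holding that package's Cargo.toml."""
    metadata = json.loads(metadata_json)
    return {
        f"{pkg['name']} {pkg['version']}": Path(pkg["manifest_path"]).parent
        for pkg in metadata["packages"]
    }


# Hypothetical, heavily trimmed sample; the registry path is invented.
SAMPLE = json.dumps(
    {
        "packages": [
            {
                "name": "serde",
                "version": "1.0.0",
                "manifest_path": "/home/user/.cargo/registry/src/example-index/serde-1.0.0/Cargo.toml",
            }
        ]
    }
)

if __name__ == "__main__":
    # Uses the sample so the sketch runs without a Rust project at hand;
    # swap in read_cargo_metadata() inside a real cargo workspace.
    for crate, source_dir in dependency_source_dirs(SAMPLE).items():
        print(crate, source_dir)
```

The resulting directories fall under the locations the commit allowlists (~/.cargo/registry/src/, ~/.cargo/git/checkouts/, or the nix store vendor directory), which is why the read/grep/glob permissions need to cover them.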