Commit graph

872 commits

Author SHA1 Message Date
a935e77f83 feat(skills): add rust-recent-features reference skill
Documents Rust language, stdlib, and Cargo features stabilized after the
~2025 training cutoff (1.85–1.95, 2024 edition) so agents generate
current syntax instead of relying on a stale mental model.
2026-05-21 23:34:28 +02:00
9986d286b1 refactor(openwebui): drop stale backend env vars now managed via UI
The Ollama/OpenAI connection env vars are PersistentConfig: read only on
first launch and thereafter owned by Open WebUI's DB. They no longer
reflected the live backend, so remove them and document that connections
are configured through the admin UI.
2026-05-21 23:15:47 +02:00
fdefdf31b2 feat(litellm): add LiteLLM gateway on sgx fronting halo's llama-server
Exposes an OpenAI-compatible endpoint on sgx:4000 (LAN-reachable) that
routes the `coder` model to halo's llama-server, so clients get a stable
gateway with per-key auth instead of hardcoding halo's address. Master
key is sourced from a sops-encrypted env file.
2026-05-21 23:15:47 +02:00
ccd8750899 chore(halo): set spec-draft-p-min for coder model
Add a 0.74 confidence threshold so speculative drafting stops early
once the draft model's predicted token probability drops below it,
favoring shorter, higher-acceptance draft sequences.
2026-05-21 23:15:09 +02:00
3a070413e4 chore(halo): upgrade coder model to Q8 quant and bump spec draft
Switch the coder model from Q6_K to the UD-Q8_K_XL quant for better
output quality, and raise spec-draft-n-max from 4 to 5 to allow longer
speculative draft sequences.
2026-05-21 23:11:00 +02:00
689389ebf8 chore(halo): rename model to coder and add ngram-simple speculation
Rename the Qwen3.6-27B model section to "coder" so it matches the
opencode provider config, and add ngram-simple to the speculative
decoding chain alongside draft-mtp.
2026-05-21 22:07:57 +02:00
e9bedc0455 feat(opencode): also expose formatters on PATH
Re-add home.packages so nixfmt, prettier, shfmt, ruff, taplo and stylua
are available for interactive use, alongside the store-path-pinned
references in the generated config.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 22:07:33 +02:00
d10570a0d8 refactor(opencode): generate config.json from the home module
Build opencode's config.json with pkgs.formats.json instead of shipping
a static file, pinning each formatter command to its store-path binary
via lib.getExe. Drops the standalone config/opencode/config.json.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 22:06:35 +02:00
d0c58a5c9d feat(opencode): configure formatters and provide them on PATH
Add formatter entries for nix, prettier (md/yaml/json/web), shell,
python, toml and lua, and install the matching tools via the opencode
home module so they are available wherever opencode runs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 22:02:53 +02:00
7641bab17f fix(opencode): rename halo model to coder and drop trailing comma
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 21:56:59 +02:00
9ad34ade8f chore: flake update 2026-05-21 20:54:51 +02:00
ee396ffd42 chore(halo): more parallel 2026-05-21 20:54:08 +02:00
70da67555f chore(halo): llama.cpp update 2026-05-21 20:46:06 +02:00
1376ab0ba0 chore(halo): reduce ubatch size 2026-05-21 08:47:39 +02:00
6c5ce8742c fix(halo): only one model 2026-05-20 14:23:42 +02:00
5ee2f65337 chore(halo): tune llama models.ini and drop 35B-A3B model
Serve only Qwen3.6-27B; remove the unused 35B-A3B preset.

Tuning:
- Move model-specific keys (spec-type, sampling temp/top-p/top-k/min-p)
  out of the [*] defaults into [Qwen3.6-27B] so they no longer leak onto
  other models; draft-mtp in particular only works on MTP-weighted models.
- Drop the duplicate parallel key from [*].
- Bump ubatch-size 256 -> 512 for faster iGPU prefill on Strix Halo.
- Add threads-batch = 16 to use all cores for prefill while keeping
  generation at threads = 8 under full GPU offload.
2026-05-20 14:23:42 +02:00
2e5fb2bf83 chore: enable LSP servers in opencode config
Auto-start language servers when matching file extensions are detected.
2026-05-20 14:09:21 +02:00
30b83e520c chore: reconfig opencode 2026-05-20 13:39:28 +02:00
Harald Hoyer
5b44e037a1 feat(halo): add song <URL> command to convert via song.link
Resolves the URL through the Odesli public API (api.song.link) and
replies with the canonical song.link page plus per-platform deep links
(Spotify, Apple Music, YouTube/YT Music, Tidal, Deezer, Amazon Music,
SoundCloud). Country is pinned to DE.
2026-05-20 09:42:11 +02:00
ac70c57c15 chore(halo): preload both llama models and tune preset
Preload Qwen3.6-27B and Qwen3.6-35B-A3B at startup (load-on-startup)
so both are warm immediately under --models-max 2, set parallel = 1
as the [*] fallback for any other model, and adjust per-model context
size and draft depth.
2026-05-20 07:14:26 +02:00
31e491e314 Revert "fix(halo): 27 only"
This reverts commit 72e7bf613f.
2026-05-20 07:05:27 +02:00
72e7bf613f fix(halo): 27 only 2026-05-20 02:14:08 +02:00
807a3d0d8e fix(halo): context 2026-05-20 01:21:10 +02:00
0edf975c30 feat(halo): serve multiple llama models via models.ini preset
Replace the per-model llama-server units with a single service that
uses llama-server's --models-preset (models.ini) and --models-max 2,
so the 35B-A3B and 27B models are loaded on demand from one config.

Drop the now-redundant 27B / 27B-MTP / coder-next variant files and
the unused CacheDirectory + slot-save-path KV-slot handling.
2026-05-20 00:23:50 +02:00
ae068cfd84 feat(mx): increase halo bot timeout 2026-05-19 23:52:46 +02:00
b4063fda66 feat(halo): MTP --parallel 2 2026-05-19 23:48:53 +02:00
Harald Hoyer
f07af7f5da feat(skills): add adversarial-review
A skeptical PR review skill that defaults to REJECT. Encodes the
staff-engineer adversarial stance: lead with problems, assume bugs
exist, require severity+location+fix+test per finding, mandate an
execution trace, and end with an explicit verdict.

Includes base-branch detection (gh pr view → upstream → heuristic →
ask) so the review never silently diffs against the wrong base.
2026-05-19 15:03:28 +02:00
Harald Hoyer
5b92ed1850 feat(rialo): add pi 2026-05-19 14:27:50 +02:00
3631e2fe81 chore: flake.lock update 2026-05-19 06:40:44 +02:00
bbca21240f feat(halo): new llama-cpp-rocm 2026-05-19 06:40:13 +02:00
8bd096ff8d feat(halo): inc. mtp to 6 2026-05-19 06:40:13 +02:00
46cdf2f6f6 Revert "feat(halo): new MTP version"
This reverts commit 53c11a120c.
2026-05-19 06:40:13 +02:00
Harald Hoyer
b185a6159f feat(darwin): enable launchd ssh-agent with FIDO/SK support
Apple's built-in ssh-agent has no sk-api/libfido2 support and refuses
signing operations for ed25519-sk / ecdsa-sk hardware keys. Enable the
existing metacfg.security.ssh module (which runs pkgs.openssh's
ssh-agent under launchd) via the common darwin suite, and export
SSH_AUTH_SOCK from environment.shellInit so bash, zsh, and fish (via
/etc/fish/foreign-env/shellInit) all point at the nix-managed socket.
2026-05-18 12:18:22 +02:00
0990389464 feat(m4): install defuddle-cli 2026-05-16 14:13:31 +02:00
a29301179b feat(opencode): install kepano/obsidian-skills into ~/.agents/skills
Add obsidian-skills as a flake input (flake = false) and map each
skill subdirectory into ~/.agents/skills/<skill>, alongside the
existing local skills. Updates flow through `nix flake update
obsidian-skills`.
2026-05-16 14:13:31 +02:00
b0fc627d0a feat: add config/claude/statusline-command.sh 2026-05-16 13:41:09 +02:00
492362fa31 feat(amd): enable Wake-on-LAN on enp7s0 2026-05-16 13:40:25 +02:00
4da2eed356 chore(homes): remove broken x86_64-linux home configs
- harald@sgx-nixos: orphan, no matching NixOS system and no
  home.stateVersion set, so it failed standalone evaluation
- harald@sgx-azure: referenced metacfg.tools.direnv.enable but no
  modules/home/tools/direnv exists, causing eval failure
2026-05-16 11:26:31 +02:00
a2b7dc510b fix(pi): PI_OFFLINE 1 2026-05-16 08:51:46 +02:00
4d32148059 chore: use ~/.agent/skills 2026-05-15 21:26:56 +02:00
c65c0d8756 chore(opencode): remove domain from config 2026-05-15 20:41:44 +02:00
53c11a120c feat(halo): new MTP version 2026-05-15 19:32:51 +02:00
Harald Hoyer
b5ae777a4a feat(home/js): deploy ~/.npmrc and ~/.bunfig.toml everywhere
New metacfg.cli-apps.js module (enabled by default) pins minimum
release ages for npm and bun across all home configurations, so the
mitigation against newly published malicious packages applies
uniformly rather than living as untracked dotfiles on one machine.
2026-05-15 08:41:56 +02:00
38d2d4f4ae fix(halo): q6_k with mtp 2 2026-05-15 07:47:43 +02:00
baaab26eb7 chore: flake.lock 2026-05-14 08:04:19 +02:00
1e3b2fc9a7 feat(halo): unsloth MTP 2026-05-13 19:42:54 +02:00
42c52bd87f refactor(mx): drive opencode bot via direct chat-completions API
The bot no longer shells out to `opencode run`. Instead it POSTs to the
OpenAI-compatible /chat/completions endpoint exposed by llama-server on
halo.hoyer.tail:8000 directly. This removes the Bun/sqlite cold-start
overhead per request, drops the pkgs.opencode runtime dependency, and
eliminates the ExecStartPre dance that materialized config.json into the
service's $HOME.

Conversation history is now stored as a proper OpenAI `messages` list
with system/user/assistant roles, instead of the XML blob that was
inlined into a single `opencode run` argument. The interactive opencode
setup (config/opencode/config.json) is unchanged — only the bot stops
depending on it.

The module gains a `modelBaseUrl` option; `model` is now the bare model
name (`halo-8000`) without the provider/ prefix that the opencode CLI
required.
2026-05-13 16:38:58 +02:00
aa3bc3c457 feat(pi): package @earendil-works/pi-coding-agent as pi
Vendors the npm tarball + lockfile and wraps the `pi` binary with `fd` and
`ripgrep` on PATH. Also installs it on the m4 darwin host.

`buildNpmPackage` is pulled from `inputs.unstable` because nixos-25.11's
`prefetch-npm-deps-0.1.0` panics on cacache index entries that contain
either multiple lines or JSON values with embedded spaces (npm's
`accept: application/...; q=1.0, ...` headers). For this lockfile,
`@esbuild/netbsd-arm64` and `@rollup/rollup-linux-x64-musl` trigger
both conditions and `--map-cache` fails with `EOF while parsing a
string at line 1 column 369`. Fixed upstream in nixos-unstable, which
now uses `lines()` + `split_once('\t')`.
2026-05-13 16:34:38 +02:00
9e692f45ba chore: bot pw 2026-05-13 15:31:26 +02:00
d8e8293c0e feat(mx): add Nextcloud Talk opencode bot pointing at halo.hoyer.tail:8000
Mirrors the existing nextcloud-claude-bot setup but invokes `opencode run`
against the local `halo-8000` provider/model. The bot listens on
127.0.0.1:8086, is exposed via the `/_opencode-bot/` location on
nc.hoyer.xyz, and uses `@Halo` as its mention trigger in group chats.

The opencode config (config/opencode/config.json) is installed into the
service's $HOME/.config/opencode/ on each start, so the bot picks up the
same provider definition the user uses interactively. The model map keys
are renamed to `halo-8000` / `halo-8001` so the canonical
`provider/model` reference works without an alias indirection.
2026-05-13 15:08:18 +02:00