Commit graph

491 commits

Author SHA1 Message Date
Harald Hoyer
5b44e037a1 feat(halo): add song <URL> command to convert via song.link
Resolves the URL through the Odesli public API (api.song.link) and
replies with the canonical song.link page plus per-platform deep links
(Spotify, Apple Music, YouTube/YT Music, Tidal, Deezer, Amazon Music,
SoundCloud). Country is pinned to DE.
2026-05-20 09:42:11 +02:00
ac70c57c15 chore(halo): preload both llama models and tune preset
Preload Qwen3.6-27B and Qwen3.6-35B-A3B at startup (load-on-startup)
so both are warm immediately under --models-max 2, set parallel = 1
as the [*] fallback for any other model, and adjust per-model context
size and draft depth.
2026-05-20 07:14:26 +02:00
31e491e314 Revert "fix(halo): 27 only"
This reverts commit 72e7bf613f.
2026-05-20 07:05:27 +02:00
72e7bf613f fix(halo): 27 only 2026-05-20 02:14:08 +02:00
807a3d0d8e fix(halo): context 2026-05-20 01:21:10 +02:00
0edf975c30 feat(halo): serve multiple llama models via models.ini preset
Replace the per-model llama-server units with a single service that
uses llama-server's --models-preset (models.ini) and --models-max 2,
so the 35B-A3B and 27B models are loaded on demand from one config.

Drop the now-redundant 27B / 27B-MTP / coder-next variant files and
the unused CacheDirectory + slot-save-path KV-slot handling.
2026-05-20 00:23:50 +02:00
ae068cfd84 feat(mx): increase halo bot timeout 2026-05-19 23:52:46 +02:00
b4063fda66 feat(halo): MTP --parallel 2 2026-05-19 23:48:53 +02:00
Harald Hoyer
5b92ed1850 feat(rialo): add pi 2026-05-19 14:27:50 +02:00
8bd096ff8d feat(halo): inc. mtp to 6 2026-05-19 06:40:13 +02:00
0990389464 feat(m4): install defuddle-cli 2026-05-16 14:13:31 +02:00
492362fa31 feat(amd): enable Wake-on-LAN on enp7s0 2026-05-16 13:40:25 +02:00
38d2d4f4ae fix(halo): q6_k with mtp 2 2026-05-15 07:47:43 +02:00
1e3b2fc9a7 feat(halo): unsloth MTP 2026-05-13 19:42:54 +02:00
42c52bd87f refactor(mx): drive opencode bot via direct chat-completions API
The bot no longer shells out to `opencode run`. Instead it POSTs to the
OpenAI-compatible /chat/completions endpoint exposed by llama-server on
halo.hoyer.tail:8000 directly. This removes the Bun/sqlite cold-start
overhead per request, drops the pkgs.opencode runtime dependency, and
eliminates the ExecStartPre dance that materialized config.json into the
service's $HOME.

Conversation history is now stored as a proper OpenAI `messages` list
with system/user/assistant roles, instead of the XML blob that was
inlined into a single `opencode run` argument. The interactive opencode
setup (config/opencode/config.json) is unchanged — only the bot stops
depending on it.

The module gains a `modelBaseUrl` option; `model` is now the bare model
name (`halo-8000`) without the provider/ prefix that the opencode CLI
required.
2026-05-13 16:38:58 +02:00
aa3bc3c457 feat(pi): package @earendil-works/pi-coding-agent as pi
Vendors the npm tarball + lockfile and wraps the `pi` binary with `fd` and
`ripgrep` on PATH. Also installs it on the m4 darwin host.

`buildNpmPackage` is pulled from `inputs.unstable` because nixos-25.11's
`prefetch-npm-deps-0.1.0` panics on cacache index entries that contain
either multiple lines or JSON values with embedded spaces (npm's
`accept: application/...; q=1.0, ...` headers). For this lockfile,
`@esbuild/netbsd-arm64` and `@rollup/rollup-linux-x64-musl` trigger
both conditions and `--map-cache` fails with `EOF while parsing a
string at line 1 column 369`. Fixed upstream in nixos-unstable, which
now uses `lines()` + `split_once('\t')`.
2026-05-13 16:34:38 +02:00
d8e8293c0e feat(mx): add Nextcloud Talk opencode bot pointing at halo.hoyer.tail:8000
Mirrors the existing nextcloud-claude-bot setup but invokes `opencode run`
against the local `halo-8000` provider/model. The bot listens on
127.0.0.1:8086, is exposed via the `/_opencode-bot/` location on
nc.hoyer.xyz, and uses `@Halo` as its mention trigger in group chats.

The opencode config (config/opencode/config.json) is installed into the
service's $HOME/.config/opencode/ on each start, so the bot picks up the
same provider definition the user uses interactively. The model map keys
are renamed to `halo-8000` / `halo-8001` so the canonical
`provider/model` reference works without an alias indirection.
2026-05-13 15:08:18 +02:00
dadfb07914 fix(halo): set --alias halo-8000 2026-05-13 14:52:49 +02:00
f9a2e0d301 chore(x1,amd): disable cratedocs-mcp service
Keep it enabled only on sgx.
2026-05-13 11:35:59 +02:00
4ce7bcf354 fix(mx): make tailscale exit-node advertisement actually apply
tailscale set is strict about boolean flags and silently ignores
--advertise-exit-node without =true. Result: the tailscaled-set unit
ran cleanly but AdvertiseRoutes stayed null. Spell the value out so the
flag takes effect.
2026-05-13 09:28:20 +02:00
67b7c3a9fd feat(headscale): add ACL policy, isolate mx, make mx an exit node
Introduces a headscale ACL policy (file-mode) plus matching client config:

- New systems/x86_64-linux/attic/headscale-policy.hujson:
  * tag:llm restricts a node to talking only to halo:8000
  * all other harald@ nodes have full mesh access to each other
  * harald@ nodes can route internet traffic via approved exit nodes
  * autoApprovers.exitNode = [tag:llm] auto-approves the exit route
    advertised by any tag:llm node (currently mx)

- attic headscale.nix: wire policy.mode = "file" / policy.path to
  the .hujson above.

- mx default.nix: enable useRoutingFeatures = "server" (needed for IP
  forwarding) and add extraSetFlags = ["--advertise-exit-node"] so the
  flag is reapplied on every activation, not just initial login.

Operational steps after deploy:
  headscale nodes tag -i 10 -t tag:llm
2026-05-13 09:06:40 +02:00
87bdaf15da fix(attic): keep headscale domain as headscale.hoyer.xyz
Avoid breaking existing clients and the registered OIDC redirect URI by
keeping the original domain. Only the host backing it changes (mx -> attic);
DNS just needs to be repointed.
2026-05-13 08:48:37 +02:00
12c25bcde8 refactor(attic): move headscale from mx to attic
Headscale is moving off the mx mailserver onto the attic cache host.
The new public URL is https://headscale.hoyer.world.

- Switch from useACMEHost = "hoyer.xyz" (mx wildcard DNS-01) to
  enableACME = true, since attic only has HTTP-01 configured.
- Move headscale port to 8081 to avoid clashing with atticd on 8080.
- Drop the 192.168.178.254 LAN nameserver from dns.nameservers.global,
  which isn't reachable from the Hetzner instance.

Operational steps still required on attic:
- Provision /var/lib/headscale/client_secret
- Migrate the headscale state DB from mx
- Point headscale.hoyer.world DNS at attic
- Update the Nextcloud OIDC client's redirect URI
2026-05-13 08:42:46 +02:00
e440bf39fd feat(halo): llama-server-27B-MTP.nix 2026-05-12 16:16:15 +02:00
ca4ee90828 feat(halo): coder next 2026-05-11 12:22:34 +02:00
7b04b55ce8 feat(halo): cache-ram 0 2026-05-10 20:50:08 +02:00
04342222a2 fix(halo): 27b 2026-05-10 20:46:12 +02:00
689cdec28d feat(halo): activate qwen 27b 2026-05-10 20:44:38 +02:00
bef528e26a feat(halo): use qwen-35b-a3b 2026-05-10 20:44:38 +02:00
d47bb6e15b feat(halo): add different llama servers 2026-05-07 14:54:48 +02:00
b548126fb8 fix(halo): fix systemd description for llama 2026-05-07 14:40:18 +02:00
02b3c73376 fix(halo): fix systemd description for llama 2026-05-06 14:03:28 +02:00
7ebd97629d feat(halo): use am17an/Qwen3.6-27B-MTP-GGUF:Q8_0 with MTP spec 2026-05-06 14:01:31 +02:00
a95417da8b feat(halo): use unsloth/Qwen3.6-27B-GGUF:UD-Q8_K_XL 2026-05-06 13:02:20 +02:00
3a1cb7487a refactor(opencode): extract serve service into shared NixOS module
New `metacfg.services.opencode` module under modules/nixos/services/opencode/
with options for port, user, homeDir, sopsFile, and extraPackages. User and
homeDir default off `metacfg.user`. Host configs for amd and sgx reduce to
enabling the module and pointing at their respective sops file.

Service PATH gains jq, yq-go, python3, gh, gnutar, gzip, unzip, wget,
diffutils, patch, file, tree, bun, uv, ast-grep, claude-code, and tmux for
agent ergonomics.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 13:43:27 +02:00
da88a9b2d6 fix(halo): drop speculative HSA_OVERRIDE_GFX_VERSION from llama-server
Was set defensively without knowing the actual GPU arch; if ROCm
supports the card natively, the override is at best a no-op and at
worst masks the real arch. Add it back with the right value if the
service actually fails to detect the GPU.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:42:17 +02:00
b11e5c8356 feat(halo): add llama-server systemd unit for Qwen3.6-35B-A3B
Runs llama.cpp's ROCm build under DynamicUser, with the HF model cache
in StateDirectory (survives systemctl clean) and KV slot saves in
CacheDirectory. Listens on :8000.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 11:42:17 +02:00
624a72737c fix(opencode): narrow LD_LIBRARY_PATH to libstdc++ only
The full nix-ld library list shadowed nix's own curl, breaking
libnixstore.so with "CURL_OPENSSL_4 not found". The prebuilt node
watcher binding only needs libstdc++/libgcc_s, so use stdenv.cc.cc.lib
and let nix-built tools resolve their own deps via RUNPATH.
2026-05-04 08:58:37 +02:00
0d5fb73022 fix(amd): opencode 2026-05-03 16:31:02 +02:00
5693009488 fix(opencode): set LD_LIBRARY_PATH for prebuilt node bindings
The file watcher binding (and other node-precompiled .node modules
loaded via dlopen) failed with "libstdc++.so.6: cannot open shared
object file" because systemd services don't inherit the user shell's
LD path. Reuse the nix-ld library list so the service sees the same
common libraries unwrapped binaries get globally.
2026-05-03 16:29:24 +02:00
441df05d86 fix(opencode): add git and dev tools to service PATH
The opencode-serve unit ran with systemd's minimal default PATH, so
shell commands invoked by the agent (git, make, nix, node, rg, etc.)
were not found. Set systemd.services.opencode-serve.path on both sgx
and amd to a common dev toolset.
2026-05-03 16:09:31 +02:00
0e723e2da8 feat(amd): add opencode web server at opencode.amd.hoyer.world
Mirror of the sgx opencode setup: systemd service on port 4196 fronted
by nginx with a per-host ACME cert (DNS-01 via internetbs). Adds amd
key + path rule to .sops.yaml so secrets under .secrets/amd/ encrypt
for the host.
2026-05-03 15:55:15 +02:00
01f42c0851 feat(sops): trigger service restarts on secret rotation
Wire up restartUnits on secrets whose consumers cache them in memory
(daemons read at startup), so sops-nix restarts the affected unit on
activation when the decrypted content changes:

- firefly: app_key → phpfpm-firefly-iii;
  auto_import_secret + access_token → phpfpm-firefly-iii-data-importer
- searx: secret_key → uwsgi
- opencode: web password → opencode-serve
- mail: sasl_passwd → postfix
- forgejo: gitea_dbpass → forgejo; runner-token → gitea-runner-default

Secrets read on demand by oneshots/timers (firefly sparda_pin, ntfy
token, restic backup creds, acme dns creds, wg conf) are left as-is.
2026-05-03 15:23:40 +02:00
0989b8ae46 feat(sgx): add opencode web server 2026-05-03 14:57:49 +02:00
f74928ce5f chore: nix fmt 2026-05-03 14:57:49 +02:00
c99ea665d4 feat(sgx): add opencode 2026-05-03 13:47:39 +02:00
b2027bd283 sgx/network: open TCP 8000-8999 in firewall
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 13:47:39 +02:00
e96bf83dfd feat(halo): add python313Packages.huggingface-hub 2026-05-03 09:00:13 +02:00
73bf52dbaf sgx/firefly: bump fastcgi_read_timeout + PHP max_execution_time on both vhosts
Bulk imports of 100+ transactions per chunk hit the default 60s
fastcgi timeout on the main Firefly III vhost too — not just the
importer endpoint. The importer's per-transaction API call to Firefly's
/api/v1/transactions can take 20+s on a fresh DB without ANALYZE,
which compounds with the 30s PHP max_execution_time cap.

- nginx fastcgi_read_timeout=600s on both `firefly` and `firefly-import`
  vhosts
- php_admin_value[max_execution_time]=600 + memory_limit=512M on both
  PHP-FPM pools
- VANITY_URL on the importer now points to the main Firefly III URL
  (was wrongly pointing at the importer's own domain, breaking
  clickable transaction-show links in importer log messages)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 16:44:20 +02:00
491a7b38e4 sgx/firefly: switch Firefly III backend from sqlite to postgres
SQLite was slow under btrfs CoW, and the no-CoW migration path turned
out to be fragile (WAL deletion without checkpoint = data loss). Move
to PostgreSQL on Unix-socket peer auth — no password needed for the
local-host setup, NixOS provisions the database+user declaratively.

Drop the now-unused +C tmpfiles rule on the sqlite directory; the
leftover database.sqlite* files at /var/lib/firefly-iii/storage/database/
are harmless and can be removed manually after switch is verified.

Migration of existing Firefly III data is not preserved by this
commit — fresh-start path: re-register admin, re-issue PAT, re-POST
the bulk CSV through the importer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 21:49:08 +02:00