nixcfg/systems
Harald Hoyer 5ee2f65337 chore(halo): tune llama models.ini and drop 35B-A3B model
Serve only Qwen3.6-27B; remove the unused 35B-A3B preset.

Tuning:
- Move model-specific keys (spec-type, sampling temp/top-p/top-k/min-p)
  out of the [*] defaults into [Qwen3.6-27B] so they no longer leak onto
  other models; draft-mtp in particular only works on MTP-weighted models.
- Drop the duplicate parallel key from [*].
- Bump ubatch-size 256 -> 512 for faster iGPU prefill on Strix Halo.
- Add threads-batch = 16 to use all cores for prefill while keeping
  generation at threads = 8 under full GPU offload.
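
As a rough sketch, the layout of models.ini after this change might look
like the fragment below. Only the key names and the values draft-mtp, 512,
16 and 8 come from the notes above. Which section holds the batch and thread
keys, and the comment syntax, are assumptions. Sampling values are left out
because they are not stated here.

```ini
; Hypothetical sketch, not the actual file.
[*]
; Shared defaults only. Model-specific keys no longer live here,
; and parallel is set once instead of twice (value not shown).
; Assumption: the batch and thread keys sit in the shared section.
; Raised from 256 for faster iGPU prefill.
ubatch-size = 512
; Generation threads under full GPU offload.
threads = 8
; Prefill threads, using all cores.
threads-batch = 16

[Qwen3.6-27B]
; Only valid for models that carry MTP weights.
spec-type = draft-mtp
; temp, top-p, top-k and min-p are also set in this section.
```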
2026-05-20 14:23:42 +02:00
aarch64-darwin      feat(rialo): add pi                                          2026-05-19 14:27:50 +02:00
aarch64-linux       nix fmt                                                      2026-02-24 13:25:42 +01:00
x86_64-darwin/mpro  nix fmt                                                      2024-11-19 10:31:29 +01:00
x86_64-linux        chore(halo): tune llama models.ini and drop 35B-A3B model    2026-05-20 14:23:42 +02:00
nixbuild.nix        chore: nix fmt                                               2026-05-03 14:57:49 +02:00