nixcfg

Harald Hoyer 5ee2f65337 chore(halo): tune llama models.ini and drop 35B-A3B model	2026-05-20 14:23:42 +02:00

Serve only Qwen3.6-27B; remove the unused 35B-A3B preset.

Tuning:
- Move model-specific keys (spec-type, sampling temp/top-p/top-k/min-p) out of the [] defaults into [Qwen3.6-27B] so they no longer leak onto other models; draft-mtp in particular only works on MTP-weighted models.
- Drop the duplicate parallel key from [].
- Bump ubatch-size 256 -> 512 for faster iGPU prefill on Strix Halo.
- Add threads-batch = 16 to use all cores for prefill while keeping generation at threads = 8 under full GPU offload.
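
The tuning described in this commit message might leave models.ini looking roughly like the sketch below. It is reconstructed from the commit message alone, not copied from the file. Which section holds threads, threads-batch, and ubatch-size is an assumption, and the sampling values are left out because the message does not state them.

```ini
; Hypothetical sketch based only on the commit message; the real models.ini may differ.

[]
; Shared defaults. Assumed to hold the thread and batch settings;
; the commit message does not say which section they live in.
threads = 8          ; generation threads under full GPU offload
threads-batch = 16   ; all cores for prefill
ubatch-size = 512    ; bumped from 256 for faster iGPU prefill

[Qwen3.6-27B]
; Model-specific keys moved here so they no longer leak onto other models.
spec-type = draft-mtp   ; only works on MTP-weighted models
; temp, top-p, top-k and min-p also belong in this section;
; their values are not given in the commit message.
```

Keeping speculative-decoding and sampling keys in the per-model section means a preset added later starts from neutral defaults instead of inheriting settings tuned for Qwen3.6-27B.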
default.nix	feat(halo): serve multiple llama models via models.ini preset	2026-05-20 00:23:50 +02:00
hardware-configuration.nix	feat(halo): verbose boot	2026-02-17 09:17:24 +01:00
llama-server.nix	feat(halo): serve multiple llama models via models.ini preset	2026-05-20 00:23:50 +02:00
models.ini	chore(halo): tune llama models.ini and drop 35B-A3B model	2026-05-20 14:23:42 +02:00
sound.nix	chore: nix fmt	2026-05-03 14:57:49 +02:00
wyoming.nix	nix fmt	2026-02-24 13:25:42 +01:00
xremap.nix	chore: nix fmt	2026-05-03 14:57:49 +02:00