nixcfg

harald/nixcfg

Commit graph

Author	SHA1	Message	Date
Harald Hoyer	a95417da8b	feat(halo): use unsloth/Qwen3.6-27B-GGUF:UD-Q8_K_XL	2026-05-06 13:02:20 +02:00
Harald Hoyer	da88a9b2d6	fix(halo): drop speculative HSA_OVERRIDE_GFX_VERSION from llama-server Was set defensively without knowing the actual GPU arch; if ROCm supports the card natively, the override is at best a no-op and at worst masks the real arch. Add it back with the right value if the service actually fails to detect the GPU. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 11:42:17 +02:00
Harald Hoyer	b11e5c8356	feat(halo): add llama-server systemd unit for Qwen3.6-35B-A3B Runs llama.cpp's ROCm build under DynamicUser, with the HF model cache in StateDirectory (survives systemctl clean) and KV slot saves in CacheDirectory. Listens on :8000. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 11:42:17 +02:00

3 commits
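
The three commit messages above describe a llama-server systemd unit without showing it. What follows is a minimal sketch of what such a NixOS module could look like, not the contents of the actual commits. The unit name `llama-server`, the `rocmSupport` override, the `LLAMA_CACHE` setting, the supplementary groups and the restart policy are all assumptions made for illustration. Only the model reference, the port, and the use of DynamicUser, StateDirectory and CacheDirectory come from the log.

```nix
# Hypothetical sketch reconstructed from the commit messages only.
# Everything marked "assumption" is illustrative and not taken from the repository.
{ lib, pkgs, ... }:
let
  # Assumption: the nixpkgs llama-cpp package built with ROCm enabled.
  llama = pkgs.llama-cpp.override { rocmSupport = true; };
in
{
  systemd.services.llama-server = {
    description = "llama.cpp server (ROCm build)";
    wantedBy = [ "multi-user.target" ];
    wants = [ "network-online.target" ];
    after = [ "network-online.target" ];

    # Assumption: point llama.cpp's download cache at the state directory
    # so the Hugging Face model files land in /var/lib/llama-server.
    environment.LLAMA_CACHE = "/var/lib/llama-server";

    # HSA_OVERRIDE_GFX_VERSION is deliberately not set (see da88a9b2d6);
    # add it back with the correct value only if GPU detection fails.

    serviceConfig = {
      DynamicUser = true;
      # State is not removed by a plain `systemctl clean`, which by default
      # only clears cache and runtime directories.
      StateDirectory = "llama-server"; # /var/lib/llama-server
      CacheDirectory = "llama-server"; # /var/cache/llama-server
      # Assumption: group membership needed to reach /dev/kfd and /dev/dri.
      SupplementaryGroups = [ "video" "render" ];
      ExecStart = lib.concatStringsSep " " [
        "${llama}/bin/llama-server"
        "-hf unsloth/Qwen3.6-27B-GGUF:UD-Q8_K_XL" # model from a95417da8b
        "--port 8000"
        "--slot-save-path /var/cache/llama-server" # KV slot saves
      ];
      Restart = "on-failure"; # assumption
    };
  };
}
```

Keeping the model cache in the state directory and the KV slot saves in the cache directory means the disposable data can be cleared without forcing a re-download of the model.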