nixcfg/systems/x86_64-linux/halo
Harald Hoyer ab729a0720 feat(halo): serve bge-m3 embeddings alongside coder
Add a multilingual bge-m3 embedding model to the llama-server preset and
raise --models-max to 2 so it stays co-resident with the coder model.
This gives the RAG stack a local embeddings endpoint without a second
service, keeping all inference on halo. Embedding-specific overrides
(ubatch-size, context, pooling) are pinned since the global defaults
would truncate or misconfigure embedding requests.
2026-05-22 00:35:54 +02:00
File                        Last commit                                                    Date
default.nix                 feat(halo): serve multiple llama models via models.ini preset  2026-05-20 00:23:50 +02:00
hardware-configuration.nix  feat(halo): verbose boot                                       2026-02-17 09:17:24 +01:00
llama-server.nix            feat(halo): serve bge-m3 embeddings alongside coder            2026-05-22 00:35:54 +02:00
models.ini                  feat(halo): serve bge-m3 embeddings alongside coder            2026-05-22 00:35:54 +02:00
sound.nix                   chore: nix fmt                                                 2026-05-03 14:57:49 +02:00
wyoming.nix                 nix fmt                                                        2026-02-24 13:25:42 +01:00
xremap.nix                  chore: nix fmt                                                 2026-05-03 14:57:49 +02:00
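
The latest commit message describes pinning embedding-specific overrides (ubatch-size, context, pooling) for bge-m3 in models.ini. The fragment below is only a sketch of what such an entry might look like, not the contents of the actual file. It assumes the preset uses one INI section per model, with keys named after llama-server's long command-line options. The section name, model path, and every value are hypothetical placeholders.

```ini
; Hypothetical preset entry for the embedding model (illustration only).
[bge-m3]
; Placeholder path; the real location is not shown in this listing.
model = /path/to/bge-m3.gguf
; Assumed: serve this model as an embeddings endpoint.
embeddings = true
; Assumed: bge-m3 dense embeddings use the CLS token.
pooling = cls
; Assumed: match the model's 8192-token maximum sequence length.
ctx-size = 8192
; Assumed: embedding inputs are processed in a single physical batch,
; so the batch sizes are raised to the context size to avoid
; rejecting or truncating long inputs.
batch-size = 8192
ubatch-size = 8192
```

Per the commit message, --models-max is raised to 2 on the server side (presumably in llama-server.nix) so that this model and the coder model can stay loaded at the same time.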