halo's llama-server now runs in router mode where the model field selects a preset (coder/fast/bge-m3); the old "halo-8000" name is no longer valid. Use the fast MoE model for the Talk bot's responses. |
||
|---|---|---|
| .. | ||
| aarch64-darwin | ||
| aarch64-linux | ||
| x86_64-darwin/mpro | ||
| x86_64-linux | ||
| nixbuild.nix | ||