halo's llama-server now runs in router mode where the model field selects a preset (coder/fast/bge-m3); the old "halo-8000" name is no longer valid. Use the fast MoE model for the Talk bot's responses. |
||
|---|---|---|
| .. | ||
| bot.py | ||
| default.nix | ||
| module.nix | ||
halo's llama-server now runs in router mode where the model field selects a preset (coder/fast/bge-m3); the old "halo-8000" name is no longer valid. Use the fast MoE model for the Talk bot's responses. |
||
|---|---|---|
| .. | ||
| bot.py | ||
| default.nix | ||
| module.nix | ||