Expose halo's [fast] MoE preset through the LiteLLM gateway and make it the rag CLI's default chat model (overridable via RAG_CHAT_MODEL), so query synthesis runs faster than with the larger coder model.
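
The override described above can be sketched as follows. This is a minimal illustration, not the rag CLI's actual code: the function name `resolve_chat_model` and the default alias `"fast"` are assumptions; only the `RAG_CHAT_MODEL` variable comes from the commit message.

```python
import os

# Hypothetical default: the gateway alias assumed for halo's fast MoE preset.
DEFAULT_CHAT_MODEL = "fast"


def resolve_chat_model(env=None):
    """Return the chat model name, preferring RAG_CHAT_MODEL when it is set."""
    env = os.environ if env is None else env
    # Treat an empty value as unset so a blank export falls back to the default.
    return env.get("RAG_CHAT_MODEL") or DEFAULT_CHAT_MODEL
```

With this shape, setting `RAG_CHAT_MODEL` to the larger coder model's alias would switch back to it for a single invocation without touching the configuration.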