Expose halo's [fast] MoE preset through the LiteLLM gateway and make it the rag CLI's default chat model (overridable via RAG_CHAT_MODEL), so query synthesis is quicker than with the larger coder model.
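The override described above can be sketched roughly as follows. This is a minimal illustration, not the rag CLI's actual code: the function name `resolve_chat_model` and the default identifier `"fast"` are assumptions (the commit message names the preset only as `[fast]`), and only the `RAG_CHAT_MODEL` variable comes from the source.

```python
import os

# Hypothetical default identifier; the commit message only calls the preset "[fast]".
DEFAULT_CHAT_MODEL = "fast"


def resolve_chat_model(env=None):
    """Return the chat model name, preferring RAG_CHAT_MODEL when it is set and non-empty."""
    # Accept an explicit mapping so the lookup can be exercised without touching os.environ.
    env = os.environ if env is None else env
    override = env.get("RAG_CHAT_MODEL", "").strip()
    # An unset or blank override falls back to the default preset.
    return override or DEFAULT_CHAT_MODEL
```

Treating a blank value the same as an unset variable is a design choice in this sketch; the real CLI may behave differently.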
| Name |
|---|
| dcpl2530dw-cups |
| geekbench_6 |
| nixos-hosts |
| nixos-revision |
| pi |
| rag |
| rot8000 |
| zeroclaw |