fix(rag): send explicit encoding_format to avoid llama.cpp null error

When encoding_format is unset, LiteLLM forwards it to the backend as JSON null, and llama.cpp's embeddings endpoint rejects it with a 500 ("type must be string, but is null"). Pin encoding_format="float" so the gateway always relays a string.
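
For illustration, a toy model of the failure mode described above. This is a sketch under stated assumptions, not LiteLLM's or llama.cpp's actual code: relay_payload and strict_backend are hypothetical stand-ins, and "embed-model" is a placeholder name.

```python
import json


def relay_payload(model, texts, encoding_format=None):
    # Hypothetical gateway: forwards every field it knows about, so an
    # unset encoding_format is serialized as JSON null.
    return json.dumps(
        {"model": model, "input": texts, "encoding_format": encoding_format}
    )


def strict_backend(raw):
    # Hypothetical strict backend: a missing key is fine, but an explicit
    # null is rejected because the field must be a string.
    body = json.loads(raw)
    if "encoding_format" in body and body["encoding_format"] is None:
        return 500, "type must be string, but is null"
    return 200, "ok"


# Unset value is relayed as null and rejected; pinned value passes.
unset = strict_backend(relay_payload("embed-model", ["hello"]))
pinned = strict_backend(relay_payload("embed-model", ["hello"], "float"))
print(unset)
print(pinned)
```

Pinning the value on the client side keeps the fix local to rag and does not depend on how the gateway treats unset fields.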
2026-05-22 08:34:42 +02:00 · 6fd6060dd7
commit 6fd6060dd7
parent f0fe1d5b27
1 changed file with 3 additions and 1 deletion
--- a/packages/rag/default.nix
+++ b/packages/rag/default.nix
@@ -34,7 +34,9 @@ writers.writePython3Bin "rag"


    def embed(texts):
-        resp = client.embeddings.create(model=EMBED_MODEL, input=texts)
+        # encoding_format is explicit: llama.cpp rejects a null value, and
+        # LiteLLM forwards an unset one as JSON null.
+        resp = client.embeddings.create(model=EMBED_MODEL, input=texts, encoding_format="float")
        return [d.embedding for d in resp.data]