fix(rag): send explicit encoding_format to avoid llama.cpp null error

When encoding_format is unset, LiteLLM forwards it to the backend as JSON
null, and llama.cpp's embeddings endpoint rejects it with a 500
("type must be string, but is null"). Pin encoding_format="float" so the
gateway always relays a string.
Harald Hoyer 2026-05-22 08:34:42 +02:00
parent f0fe1d5b27
commit 6fd6060dd7

@@ -34,7 +34,9 @@ writers.writePython3Bin "rag"
 def embed(texts):
-    resp = client.embeddings.create(model=EMBED_MODEL, input=texts)
+    # encoding_format is explicit: llama.cpp rejects a null value, and
+    # LiteLLM forwards an unset one as JSON null.
+    resp = client.embeddings.create(model=EMBED_MODEL, input=texts, encoding_format="float")
     return [d.embedding for d in resp.data]
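
The failure mode from the commit message can be illustrated with a minimal standard-library sketch. It does not reproduce LiteLLM's actual serialization code. The helper `build_embedding_payload` and the model name `example-embed-model` are hypothetical, chosen only to show how an unset Python value becomes JSON null while a pinned string stays a string.

```python
import json


def build_embedding_payload(model, texts, encoding_format=None):
    """Hypothetical helper: serialize an embeddings request body.

    Assumption for illustration: the gateway keeps the encoding_format
    key even when the caller left it unset, so None is serialized as
    JSON null instead of being dropped.
    """
    payload = {
        "model": model,
        "input": texts,
        "encoding_format": encoding_format,
    }
    return json.dumps(payload)


# Unset: the key is present and its value is JSON null.
unset_body = build_embedding_payload("example-embed-model", ["hello"])

# Pinned: the key carries a string, which a strict backend can parse.
pinned_body = build_embedding_payload(
    "example-embed-model", ["hello"], encoding_format="float"
)

print(unset_body)
print(pinned_body)
```

A backend that requires `encoding_format` to be a string would reject the first body and accept the second, which is why the fix pins the value on the client side rather than relying on the gateway to omit the key.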