nixcfg

harald/nixcfg

Fork 0

Commit graph

Author	SHA1	Message	Date
Harald Hoyer	95668b71a7	feat(sgx): add CLI RAG stack (Qdrant + embeddings gateway + rag tool) Stand up document retrieval as shared, client-agnostic primitives rather than locking it inside Open WebUI: - Qdrant as the LAN-reachable vector store - LiteLLM gains a bge-m3 route so sgx:4000 also serves /v1/embeddings - a thin `rag` CLI (ingest/query, optional coder synthesis) usable from any machine and from scripts Embeddings and synthesis run on halo via the gateway; the CLI is configured entirely through RAG_* env vars.	2026-05-22 00:35:54 +02:00
Harald Hoyer	fdefdf31b2	feat(litellm): add LiteLLM gateway on sgx fronting halo's llama-server Exposes an OpenAI-compatible endpoint on sgx:4000 (LAN-reachable) that routes the `coder` model to halo's llama-server, so clients get a stable gateway with per-key auth instead of hardcoding halo's address. Master key is sourced from a sops-encrypted env file.	2026-05-21 23:15:47 +02:00

Author

SHA1

Message

Date

Harald Hoyer

95668b71a7

feat(sgx): add CLI RAG stack (Qdrant + embeddings gateway + rag tool)

Stand up document retrieval as shared, client-agnostic primitives rather
than locking it inside Open WebUI:

- Qdrant as the LAN-reachable vector store
- LiteLLM gains a bge-m3 route so sgx:4000 also serves /v1/embeddings
- a thin `rag` CLI (ingest/query, optional coder synthesis) usable from
  any machine and from scripts

Embeddings and synthesis run on halo via the gateway; the CLI is
configured entirely through RAG_* env vars.

2026-05-22 00:35:54 +02:00

Harald Hoyer

fdefdf31b2

feat(litellm): add LiteLLM gateway on sgx fronting halo's llama-server

Exposes an OpenAI-compatible endpoint on sgx:4000 (LAN-reachable) that
routes the `coder` model to halo's llama-server, so clients get a stable
gateway with per-key auth instead of hardcoding halo's address. Master
key is sourced from a sops-encrypted env file.

2026-05-21 23:15:47 +02:00

2 commits