Commit graph

3 commits

Author SHA1 Message Date
argenis de la rosa
1fd51f1984 fix: resolve all clippy --all-targets warnings across 15 files
- gateway/mod.rs: move send_json before test module (items_after_test_module)
- memory/vector.rs: fix float_cmp, cast_precision_loss, approx_constant
- memory/chunker.rs: fix format_collect, format_push_string, write_with_newline
- memory/sqlite.rs: fix useless_vec
- heartbeat/engine.rs: fix format_collect, write_with_newline
- config/schema.rs: fix needless_raw_string_hashes
- tools/composio.rs: fix needless_raw_string_hashes
- integrations/registry.rs: fix uninlined_format_args, unused import
- tunnel/mod.rs: fix doc_markdown
- skills/mod.rs: allow similar_names in test module
- channels/cli.rs: fix unreadable_literal
- observability/mod.rs: fix manual_string_new
- runtime/mod.rs: fix manual_string_new
- examples/custom_memory.rs: add Default impl (new_without_default)
- examples/custom_channel.rs: fix needless_borrows_for_generic_args
2026-02-14 03:52:57 -05:00
argenis de la rosa
ce4f36a3ab test: 130 edge case tests + fix NaN/Infinity bug in cosine_similarity
Edge cases found 2 real bugs:
- cosine_similarity(NaN, ...) returned NaN instead of 0.0
- cosine_similarity(Infinity, ...) returned NaN instead of 0.0
Fix: added is_finite() guards on denom and raw ratio.

New edge case tests by module:
- vector.rs (18): NaN, Infinity, negative vectors, opposite vectors clamped,
  high-dimensional (1536), single element, both-zero, non-aligned bytes,
  3-byte input, special float values, NaN roundtrip, limit=0, zero weights,
  negative BM25 scores, duplicate IDs, large normalization, single item
- embeddings.rs (8): noop embed_one error, empty batch, multiple texts,
  empty/unknown provider, custom empty URL, no API key, trailing slash, dims
- chunker.rs (11): headings-only, deeply nested ####, long single line,
  whitespace-only, max_tokens=0, max_tokens=1, unicode/emoji, FTS5 special
  chars, multiple blank lines, trailing heading, no content loss
- sqlite.rs (23): FTS5 quotes/asterisks/parens, SQL injection, empty
  content/key, 100KB content, unicode+emoji, newlines+tabs, single char
  query, limit=0/1, key matching, unicode query, schema idempotency,
  triple open, ghost results after forget, forget+re-store cycle,
  reindex empty/twice, content_hash empty/unicode/long, category
  roundtrip with spaces/empty, list custom category, list empty DB

869 tests passing, 0 clippy warnings, cargo-deny clean
2026-02-14 00:28:55 -05:00
argenis de la rosa
0e7f501fd6 feat: full-stack search engine — FTS5, vector search, hybrid merge, embedding cache, chunker
The Full Stack (All Custom):
- Vector DB: embeddings stored as BLOB, cosine similarity in pure Rust
- Keyword Search: FTS5 virtual tables with BM25 scoring + auto-sync triggers
- Hybrid Merge: weighted fusion of vector + keyword results (configurable weights)
- Embeddings: provider abstraction (OpenAI, custom URL, noop fallback)
- Chunking: line-based markdown chunker with heading preservation
- Caching: embedding_cache table with LRU eviction
- Safe Reindex: rebuild FTS5 + re-embed missing vectors

New modules:
- src/memory/embeddings.rs — EmbeddingProvider trait + OpenAI + Noop + factory
- src/memory/vector.rs — cosine similarity, vec↔bytes, ScoredResult, hybrid_merge
- src/memory/chunker.rs — markdown-aware document splitting

Upgraded:
- src/memory/sqlite.rs — FTS5 schema, embedding column, hybrid recall, cache, reindex
- src/config/schema.rs — MemoryConfig expanded with embedding/search settings
- All callers updated to pass api_key for embedding provider

739 tests passing, 0 clippy warnings (Rust 1.93.1), cargo-deny clean
2026-02-14 00:00:23 -05:00