test: 130 edge case tests + fix NaN/Infinity bug in cosine_similarity

Edge cases found 2 real bugs:
- cosine_similarity(NaN, ...) returned NaN instead of 0.0
- cosine_similarity(Infinity, ...) returned NaN instead of 0.0
Fix: added is_finite() guards on denom and raw ratio.
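
For illustration, the guard pattern described above might look like the following. This is a minimal sketch and not the code from this commit: the signature, the f64 accumulation, the early return for mismatched lengths, and the clamp range [0.0, 1.0] are all assumptions.

```rust
/// Hypothetical sketch of a cosine similarity with `is_finite()` guards.
/// Not the repository's implementation; the signature and the clamp range
/// are assumptions made for illustration.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }

    // Accumulate in f64 to reduce rounding error on long vectors.
    let (mut dot, mut norm_a, mut norm_b) = (0.0_f64, 0.0_f64, 0.0_f64);
    for (x, y) in a.iter().zip(b) {
        let (x, y) = (f64::from(*x), f64::from(*y));
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }

    // Guard 1: a NaN or infinite input makes the denominator non-finite,
    // and an all-zero vector makes it zero. Both cases return 0.0.
    let denom = norm_a.sqrt() * norm_b.sqrt();
    if !denom.is_finite() || denom < f64::EPSILON {
        return 0.0;
    }

    // Guard 2: defensive check on the ratio itself.
    let raw = dot / denom;
    if !raw.is_finite() {
        return 0.0;
    }

    // Assumed clamp range: opposite vectors map to 0.0 rather than -1.0.
    raw.clamp(0.0, 1.0) as f32
}
```

With these guards, a NaN or Infinity component in either vector yields 0.0 instead of propagating NaN into downstream ranking.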

New edge case tests by module:
- vector.rs (18): NaN, Infinity, negative vectors, opposite vectors clamped,
  high-dimensional (1536), single element, both-zero, non-aligned bytes,
  3-byte input, special float values, NaN roundtrip, limit=0, zero weights,
  negative BM25 scores, duplicate IDs, large normalization, single item
- embeddings.rs (9): noop embed_one error, empty batch, multiple texts,
  empty/unknown provider, custom empty URL, no API key, trailing slash, dims
- chunker.rs (11): headings-only, deeply nested ####, long single line,
  whitespace-only, max_tokens=0, max_tokens=1, unicode/emoji, FTS5 special
  chars, multiple blank lines, trailing heading, no content loss
- sqlite.rs (23): FTS5 quotes/asterisks/parens, SQL injection, empty
  content/key, 100KB content, unicode+emoji, newlines+tabs, single char
  query, limit=0/1, key matching, unicode query, schema idempotency,
  triple open, ghost results after forget, forget+re-store cycle,
  reindex empty/twice, content_hash empty/unicode/long, category
  roundtrip with spaces/empty, list custom category, list empty DB

869 tests passing, 0 clippy warnings, cargo-deny clean
argenis de la rosa 2026-02-14 00:28:55 -05:00
parent 0e7f501fd6
commit ce4f36a3ab
4 changed files with 649 additions and 2 deletions

@@ -187,4 +187,66 @@ mod tests {
        assert_eq!(p.name(), "openai"); // uses OpenAiEmbedding internally
        assert_eq!(p.dimensions(), 768);
    }

    // ── Edge cases ───────────────────────────────────────────────

    #[tokio::test]
    async fn noop_embed_one_returns_error() {
        let p = NoopEmbedding;
        // embed returns empty vec → pop() returns None → error
        let result = p.embed_one("hello").await;
        assert!(result.is_err());
    }

    #[tokio::test]
    async fn noop_embed_empty_batch() {
        let p = NoopEmbedding;
        let result = p.embed(&[]).await.unwrap();
        assert!(result.is_empty());
    }

    #[tokio::test]
    async fn noop_embed_multiple_texts() {
        let p = NoopEmbedding;
        let result = p.embed(&["a", "b", "c"]).await.unwrap();
        assert!(result.is_empty());
    }

    #[test]
    fn factory_empty_string_returns_noop() {
        let p = create_embedding_provider("", None, "model", 1536);
        assert_eq!(p.name(), "none");
    }

    #[test]
    fn factory_unknown_provider_returns_noop() {
        let p = create_embedding_provider("cohere", None, "model", 1536);
        assert_eq!(p.name(), "none");
    }

    #[test]
    fn factory_custom_empty_url() {
        // "custom:" with no URL: should still construct without panic
        let p = create_embedding_provider("custom:", None, "model", 768);
        assert_eq!(p.name(), "openai");
    }

    #[test]
    fn factory_openai_no_api_key() {
        let p = create_embedding_provider("openai", None, "text-embedding-3-small", 1536);
        assert_eq!(p.name(), "openai");
        assert_eq!(p.dimensions(), 1536);
    }

    #[test]
    fn openai_trailing_slash_stripped() {
        let p = OpenAiEmbedding::new("https://api.openai.com/", "key", "model", 1536);
        assert_eq!(p.base_url, "https://api.openai.com");
    }

    #[test]
    fn openai_dimensions_custom() {
        let p = OpenAiEmbedding::new("http://localhost", "k", "m", 384);
        assert_eq!(p.dimensions(), 384);
    }
}
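
Two behaviours asserted by the tests above can be sketched in isolation: stripping a trailing slash from the base URL, and turning an empty embedding batch into an error via `pop()`. The helper names `normalize_base_url` and `take_single_embedding` are hypothetical and do not appear in the diff, and the `String` error type is a placeholder. The actual `OpenAiEmbedding::new` and `embed_one` may be written differently.

```rust
/// Hypothetical helper: drop trailing slashes so that joining a path onto
/// the base URL later does not produce a double slash.
/// Note that `trim_end_matches` removes every trailing '/', not just one.
fn normalize_base_url(url: &str) -> String {
    url.trim_end_matches('/').to_string()
}

/// Hypothetical helper mirroring the comment in `noop_embed_one_returns_error`:
/// an empty batch makes `pop()` return `None`, which is mapped to an error.
/// The `String` error type is an assumption for this sketch.
fn take_single_embedding(mut batch: Vec<Vec<f32>>) -> Result<Vec<f32>, String> {
    batch
        .pop()
        .ok_or_else(|| "embedding provider returned no vectors".to_string())
}
```

Under this sketch, a provider that always returns an empty batch, as `NoopEmbedding` does in the tests, would fail single-text embedding with an error rather than a panic.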