Dense embeddings capture semantic relationships, so they can retrieve conceptually similar content even when it uses different terminology, while sparse methods (like BM25) prioritize exact keyword matches for precise term retrieval. Combining them in hybrid search draws on both strengths and can improve recall and precision in domains like legal or medical search, where queries demand both broad understanding and specific term matching.
Dense embeddings (e.g., from BERT-style encoders or OpenAI embedding models) transform text into high-dimensional vectors that capture semantic meaning, enabling retrieval of documents that share concepts but use different wording. Sparse methods (BM25, TF-IDF) score documents by the query terms they contain, so they excel at exact term matches but miss semantic relationships such as synonyms. Because each approach covers the other's weakness, hybrid search runs both retrieval pipelines and combines their results.
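The contrast between the two scoring styles can be illustrated with a toy sketch. It is not a production setup: `bm25_scores` is a hand-written, simplified BM25 (whitespace tokenization, common default parameters `k1=1.5`, `b=0.75`), and the three-dimensional `doc_vecs` and `query_vec` are made-up numbers standing in for the output of a real embedding model.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Simplified BM25 over whitespace tokens (illustrative, not a library API)."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # no exact match, no contribution
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


docs = [
    "myocardial infarction treatment guidelines",
    "heart attack recovery advice",
    "contract termination clause",
]
query = "heart attack"

# Sparse: only the document sharing the literal query terms gets a score.
sparse = bm25_scores(query, docs)

# Dense: hypothetical hand-made vectors; a real system would call an embedding model.
doc_vecs = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1], [0.0, 0.1, 0.9]]
query_vec = [0.85, 0.15, 0.05]
dense = [cosine(query_vec, v) for v in doc_vecs]

for doc, s, d in zip(docs, sparse, dense):
    print(f"{doc!r}: bm25={s:.3f} cosine={d:.3f}")
```

In this toy data the sparse scorer gives "myocardial infarction treatment guidelines" a score of zero because it shares no tokens with the query, while the (assumed) dense vectors place it close to the query. That gap is what hybrid search is meant to close.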
Improved retrieval: Combines semantic understanding (dense) with exact term precision (sparse)[reference:7]
Infrastructure complexity: Requires two separate retrieval pipelines (vector DB for dense, inverted index for sparse)[reference:8]
Ranking fusion: The two result lists must be merged, typically with a weighted sum of scores or Reciprocal Rank Fusion (RRF); tuning the fusion parameters adds overhead[reference:9]
Latency increase: Two retrieval steps run in parallel or in sequence; optimizations like caching can mitigate the added cost[reference:10]
Use case fit: Legal document retrieval, medical literature search, enterprise search with domain-specific terminology[reference:11]
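The ranking fusion step listed above can be sketched in a few lines. This is a minimal illustration rather than any particular search engine's implementation: the function names, the document IDs, and the example scores are hypothetical. `k=60` is a commonly used RRF constant, and the min-max normalization in the weighted-sum variant is one assumed way to put dense and sparse scores on a comparable scale.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc IDs: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def weighted_sum_fusion(dense_scores, sparse_scores, alpha=0.5):
    """Combine min-max normalized scores: alpha * dense + (1 - alpha) * sparse."""

    def normalize(scores):
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        return {d: (s - lo) / span if span else 1.0 for d, s in scores.items()}

    dn, sn = normalize(dense_scores), normalize(sparse_scores)
    # A document missing from one retriever contributes 0 from that side.
    combined = {
        d: alpha * dn.get(d, 0.0) + (1 - alpha) * sn.get(d, 0.0)
        for d in set(dn) | set(sn)
    }
    return sorted(combined, key=lambda d: (-combined[d], d))


# Hypothetical outputs of the two retrieval pipelines.
dense_ranking = ["d2", "d1", "d3"]
sparse_ranking = ["d1", "d4", "d2"]
rrf_order = reciprocal_rank_fusion([dense_ranking, sparse_ranking])

dense_scores = {"d1": 0.8, "d2": 0.9, "d3": 0.1}    # e.g. cosine similarities
sparse_scores = {"d1": 12.0, "d4": 7.0, "d2": 2.0}  # e.g. BM25 scores
ws_order = weighted_sum_fusion(dense_scores, sparse_scores, alpha=0.5)

print(rrf_order)
print(ws_order)
```

RRF uses only rank positions, so it avoids score normalization altogether; the weighted sum exposes `alpha` as a tuning knob, which is where the tuning overhead mentioned above comes from.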