A similarity score threshold filters retrieved documents by relevance score: only documents scoring above the threshold are kept, and low-confidence results are discarded.
In vector retrieval, each document is assigned a similarity score (e.g., cosine similarity) indicating how close it is to the query vector. Setting a threshold establishes a minimum score that a document must reach to be returned, which filters out low-confidence results that are likely irrelevant and would add noise to the LLM’s context.
The similarity_score_threshold search type works together with the score_threshold parameter, which sets the cutoff: documents whose similarity score falls below it are discarded. Because fewer, more relevant documents are returned, this reduces noise in the retrieved context and helps control token usage when passing results to an LLM.
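The filtering step can be illustrated with a minimal, library-independent sketch in plain Python. It is not the implementation of any particular vector store: the helper names (cosine_similarity, filter_by_score), the k cap, the toy three-dimensional vectors, and the inclusive comparison at the threshold are all assumptions made for illustration. Only the score_threshold name is borrowed from the text above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def filter_by_score(query_vec, docs, score_threshold, k=4):
    """Return up to k (text, score) pairs that meet the threshold.

    docs is a list of (text, embedding) pairs. Hypothetical helper:
    real vector stores expose their own retrieval APIs.
    """
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in docs]
    # Assumption: a score exactly equal to the threshold is kept.
    kept = [(text, score) for text, score in scored if score >= score_threshold]
    # Highest-scoring documents first, then cap the result count.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:k]


# Toy embeddings, made up for this example.
docs = [
    ("refund policy", [0.9, 0.1, 0.0]),
    ("shipping times", [0.6, 0.8, 0.0]),
    ("office party photos", [0.0, 0.1, 0.9]),
]
query = [1.0, 0.0, 0.0]

results = filter_by_score(query, docs, score_threshold=0.5)
for text, score in results:
    print(f"{score:.2f}  {text}")
```

With a threshold of 0.5, the third document (whose vector is orthogonal to the query) is dropped, so it never reaches the LLM's context. Raising the threshold trades recall for precision: fewer documents pass, which also means fewer tokens in the prompt.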