MultiQueryRetriever uses an LLM to generate multiple query variants from a user question, increasing retrieval recall and covering different interpretations.
Traditional single-query retrieval may miss relevant documents if the user’s phrasing doesn’t match the indexed text. MultiQueryRetriever addresses this by prompting an LLM to rewrite the query into several different phrasings or perspectives, retrieving results for each, and returning the unique set of documents.
Increases recall by covering multiple query formulations and perspectives.
Reduces dependence on the exact wording of the original question.
Adds latency and token cost: one extra LLM call to generate the variants, plus one retrieval call per variant.
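
The rewrite, retrieve, and deduplicate flow described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the library's implementation: `multi_query_retrieve`, `fake_generate_variants`, `keyword_retrieve`, and the `DOCS` list are all hypothetical names invented for the example, the LLM call is replaced by a function that returns fixed strings, and the retriever is a toy keyword matcher rather than a vector store.

```python
from typing import Callable, List

# Toy corpus standing in for an indexed document store (assumption).
DOCS: List[str] = [
    "Resetting a forgotten password from the login page",
    "How to change your account email address",
    "Recovering access when you are locked out of your account",
]


def fake_generate_variants(question: str) -> List[str]:
    """Hypothetical stand-in for the LLM call that rewrites the question.

    A real implementation would prompt a model; here the variants are fixed.
    """
    return [
        "reset forgotten password",
        "locked out access",
        "forgotten password login",
    ]


def keyword_retrieve(query: str) -> List[str]:
    """Toy retriever: return documents sharing at least one word with the query."""
    terms = set(query.lower().split())
    return [doc for doc in DOCS if terms & set(doc.lower().split())]


def multi_query_retrieve(
    question: str,
    generate_variants: Callable[[str], List[str]],
    retrieve: Callable[[str], List[str]],
) -> List[str]:
    """Retrieve once per query variant and return the unique documents."""
    seen = set()
    unique_docs: List[str] = []
    for variant in generate_variants(question):
        for doc in retrieve(variant):
            # Deduplicate while keeping first-seen order.
            if doc not in seen:
                seen.add(doc)
                unique_docs.append(doc)
    return unique_docs


if __name__ == "__main__":
    question = "I can't get into my profile"
    print("single query:", keyword_retrieve(question))
    print(
        "multi query: ",
        multi_query_retrieve(question, fake_generate_variants, keyword_retrieve),
    )
```

In this toy setup the original phrasing shares no words with any document, so the single-query call finds nothing, while the variants reach two of the three documents; the third variant matches a document that was already returned and is dropped by the deduplication step. In LangChain, the class of the same name is typically constructed with `MultiQueryRetriever.from_llm(retriever=..., llm=...)`, which wraps an existing retriever and a chat model in place of the two stand-in functions used here.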