RAG
1. What is RAG and why is it preferred over fine-tuning for domain-specific knowledge in production applications?
Level: Expert | Frequency: High
2. What are the core components of a RAG pipeline in LangChain — Document Loaders, Text Splitters, Embeddings, Vector Stores, Retrievers, and Chains?
Level: Expert | Frequency: High
3. What is the difference between semantic search and keyword search and why does RAG rely on semantic similarity?
Level: Expert | Frequency: High
4. What is an Embedding in the context of RAG — what does it represent and why is cosine similarity used to compare them?
Level: Expert | Frequency: High
5. What is a Vector Store and how does it differ from a traditional relational or document database?
Level: Expert | Frequency: High
6. What is the difference between a Retriever and a Vector Store in LangChain — why is the abstraction separation important?
Level: Expert | Frequency: High
7. What is a Document object in LangChain — what are pageContent and metadata fields and why does metadata matter in RAG?
Level: Expert | Frequency: High
8. What are Document Loaders in LangChain and how do you choose the right loader for PDFs, web pages, Notion, Google Drive, or SQL databases?
Level: Expert | Frequency: High
9. What is the difference between RecursiveCharacterTextSplitter and CharacterTextSplitter — when would you use one over the other?
Level: Expert | Frequency: High
10. What is chunk size and chunk overlap in text splitting — how do you decide the right values for your use case?
Level: Expert | Frequency: High
11. How do you handle structured documents like tables, code blocks, or markdown files during the splitting phase to avoid breaking semantic meaning?
Level: Expert | Frequency: High
12. How do you load and split documents lazily (streaming) to handle very large files without running out of memory?
Level: Expert | Frequency: High
13. What is a SemanticChunker and how does it differ from fixed-size character-based splitting?
Level: Expert | Frequency: High
14. How do you preserve and propagate source metadata (filename, page number, URL, timestamp) through the loading and splitting pipeline?
Level: Expert | Frequency: High
15. How do you choose the right embedding model — what tradeoffs exist between OpenAI embeddings, Cohere, HuggingFace, and local models like nomic-embed?
Level: Expert | Frequency: High
16. What is the difference between dense embeddings and sparse embeddings (BM25) — when would you combine both in a hybrid search?
Level: Expert | Frequency: High
17. How do you handle embedding model upgrades in production — what happens to your existing vectors when you switch models?
Level: Expert | Frequency: High
18. How do you efficiently batch embed a large corpus of documents without hitting rate limits or memory constraints?
Level: Expert | Frequency: High
19. What are the tradeoffs between vector stores like Pinecone, Weaviate, Chroma, pgvector, and FAISS — how do you choose for production?
Level: Expert | Frequency: High
20. How do you implement namespace or tenant isolation in a vector store for a multi-tenant RAG application?
Level: Expert | Frequency: High
21. How do you handle incremental updates to a vector store — adding, updating, and deleting documents without full re-indexing?
Level: Expert | Frequency: High
22. What is HNSW indexing and why does it make approximate nearest neighbor search fast at scale?
Level: Expert | Frequency: High
23. What is a similarity score threshold in retrieval and how do you use it to filter out low-confidence results?
Level: Expert | Frequency: High
24. What is MMR (Maximal Marginal Relevance) retrieval and how does it balance relevance with diversity of results?
Level: Expert | Frequency: High
25. What is a MultiQueryRetriever and how does it improve recall by generating multiple phrasings of the same question?
Level: Expert | Frequency: High
26. What is Contextual Compression in LangChain retrieval and how does it reduce noise in retrieved chunks?
Level: Expert | Frequency: High
27. What is a ParentDocumentRetriever — how does it index small chunks but return larger parent chunks to the LLM?
Level: Expert | Frequency: High
28. What is HyDE (Hypothetical Document Embeddings) and how does it improve retrieval for vague or abstract queries?
Level: Expert | Frequency: High
29. What is Self-Query Retrieval and how does it allow the LLM to generate structured metadata filters alongside the semantic query?
Level: Expert | Frequency: High
30. How do you implement hybrid search combining dense vector search with BM25 keyword search using EnsembleRetriever?
Level: Expert | Frequency: High
31. What is a Re-ranker (cross-encoder) and where does it fit in the RAG pipeline after initial retrieval?
Level: Expert | Frequency: High
32. What is the difference between Stuff, MapReduce, Refine, and MapRerank document chain strategies — when do you use each?
Level: Expert | Frequency: High
33. How do you build a Conversational RAG chain that maintains chat history and reformulates follow-up questions into standalone queries?
Level: Expert | Frequency: High
34. What is query decomposition and how do you break a complex multi-part question into sub-queries for better retrieval?
Level: Expert | Frequency: High
35. How do you implement Step-Back Prompting in a RAG pipeline to improve retrieval for highly specific questions?
Level: Expert | Frequency: High
36. What is CRAG (Corrective RAG) and how does it add a grading step to decide whether retrieved docs are relevant before answering?
Level: Expert | Frequency: High
37. What is Self-RAG and how does the LLM decide when to retrieve, whether retrieved docs are relevant, and whether the answer is grounded?
Level: Expert | Frequency: High
38. How do you implement a fallback strategy when retrieval returns no relevant documents — how do you avoid hallucination in this case?
Level: Expert | Frequency: High
39. How do you implement RAG evaluation — what metrics like faithfulness, answer relevancy, and context recall do you measure using RAGAS?
Level: Expert | Frequency: High
40. How do you detect and mitigate hallucination in RAG outputs — what role does citation and source grounding play?
Level: Expert | Frequency: High
41. How do you build a citation system that maps each sentence in the LLM's answer back to the exact source chunk it came from?
Level: Expert | Frequency: High
42. How do you handle multilingual RAG — embedding and retrieving documents in multiple languages for a global user base?
Level: Expert | Frequency: High
43. How do you optimize retrieval latency in production — what caching, pre-fetching, or index optimization strategies do you apply?
Level: Expert | Frequency: High
44. How do you implement access control at the retrieval layer — ensuring users only retrieve documents they are authorized to see?
Level: Expert | Frequency: High
45. How do you handle long context RAG — when retrieved chunks exceed the LLM's context window, what strategies do you apply?
Level: Expert | Frequency: High
46. How do you design a RAG pipeline with LangGraph — turning retrieval, grading, and generation into discrete stateful graph nodes?
Level: Expert | Frequency: High
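
As a warm-up for questions 21 and 24, the sketch below shows one way cosine similarity and MMR selection can be written in plain Python. It is a minimal illustration under stated assumptions, not LangChain's implementation: the function names `cosine_similarity` and `mmr_select`, the `lambda_mult` parameter name, and the toy 2-D vectors are all hypothetical, and real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_select(query_vec, doc_vecs, k=2, lambda_mult=0.5):
    """Return indices of up to k documents chosen greedily by MMR.

    lambda_mult (assumed name) weights relevance against redundancy:
    1.0 means pure relevance ranking, lower values favor diversity.
    """
    selected = []
    candidates = list(range(len(doc_vecs)))
    while candidates and len(selected) < k:
        best_idx, best_score = None, -math.inf
        for i in candidates:
            # Relevance: similarity between the candidate and the query.
            relevance = cosine_similarity(query_vec, doc_vecs[i])
            # Redundancy: highest similarity to anything already picked.
            redundancy = max(
                (cosine_similarity(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        candidates.remove(best_idx)
    return selected


# Toy data: documents 0 and 1 are near-duplicates, document 2 is different.
query = [1.0, 0.0]
docs = [[1.0, 0.1], [1.0, 0.12], [0.7, -0.7]]

diverse = mmr_select(query, docs, k=2, lambda_mult=0.5)
relevance_only = mmr_select(query, docs, k=2, lambda_mult=1.0)
```

With the toy vectors, the balanced setting skips the near-duplicate in favor of the less similar document, while `lambda_mult=1.0` reduces to ordinary similarity ranking. That contrast is the core of a good answer to question 24.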