Implement a hybrid strategy combining sliding window memory (keep last N turns) with conversation summarization of older messages, always preserving the system prompt and any critical instructions.
The optimal strategy is to use a sliding window that keeps the most recent 8-10 turns (the most relevant context) while summarizing earlier messages into a concise SystemMessage. This preserves high-level context without exceeding token limits. LangGraph provides built-in support for both approaches. You can implement a conditional node that triggers summarization when the message count exceeds a threshold, replacing messages older than the window with a summary generated by a cheap model (e.g., GPT-4o-mini). This ensures that the agent retains memory of the conversation's key points without keeping every raw message.