Implement conversation summarization with a node that runs when the message count exceeds a threshold: it uses an LLM to compress the older messages into a summary, then replaces them with a single SystemMessage containing that summary.
Summarization is a common technique for keeping long conversations manageable without losing critical context. Add a conditional edge to your LangGraph graph that checks whether the number of messages has exceeded a limit (e.g., 20 messages). If it has, the graph routes to a summarization node. This node takes all messages except the most recent few (e.g., the last 5), sends them to an LLM with a summarization prompt, and replaces them with a new SystemMessage containing the summary. The recent messages are kept verbatim. This reduces token usage while preserving high-level context, at the cost of fine-grained detail from the older turns.
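The flow described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not LangGraph's own API: `Message`, `should_summarize`, `summarize_node`, and `fake_llm` are hypothetical names, the thresholds are the example values from the text, and the LLM call is replaced by a stub so the snippet runs without any model or library.

```python
from dataclasses import dataclass
from typing import Callable, List

MAX_MESSAGES = 20  # example threshold from the text (assumption, tune as needed)
KEEP_RECENT = 5    # number of most recent messages kept verbatim


@dataclass
class Message:
    """Hypothetical stand-in for a chat message type."""
    role: str      # "system", "human", or "ai"
    content: str


def should_summarize(messages: List[Message]) -> str:
    """Routing function: names the next step based on message count."""
    return "summarize" if len(messages) > MAX_MESSAGES else "continue"


def summarize_node(
    messages: List[Message], summarize: Callable[[str], str]
) -> List[Message]:
    """Compress all but the last KEEP_RECENT messages into one system message."""
    older, recent = messages[:-KEEP_RECENT], messages[-KEEP_RECENT:]
    transcript = "\n".join(f"{m.role}: {m.content}" for m in older)
    prompt = (
        "Summarize the conversation so far, keeping names, decisions, "
        "and open questions:\n" + transcript
    )
    summary = summarize(prompt)  # in practice, a call to your LLM
    header = Message("system", f"Summary of earlier conversation: {summary}")
    return [header] + recent


def fake_llm(prompt: str) -> str:
    """Stub used instead of a real model call (assumption for this sketch)."""
    # The prompt has one instruction line followed by one line per message.
    return f"({len(prompt.splitlines()) - 1} earlier messages condensed)"


# Usage: 22 messages exceed the threshold, so the older 17 are summarized.
history = [
    Message("human" if i % 2 == 0 else "ai", f"message {i}") for i in range(22)
]
if should_summarize(history) == "summarize":
    history = summarize_node(history, fake_llm)
```

In an actual LangGraph graph, a function like `should_summarize` would typically be passed to `add_conditional_edges`, and the node would return a state update rather than a new list. If the messages field uses the `add_messages` reducer, the older messages would likely need to be removed explicitly (for example with `RemoveMessage`), since that reducer merges new messages into the existing list instead of replacing it.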