Model
📋 Token budget sections
⚡ Optimization tips
- Truncation: Trim older messages or long documents to fit.
- Summarization: Replace lengthy context with condensed summaries.
- Sliding window: Keep only recent N messages for conversation.
- RAG chunking: Retrieve only relevant document chunks.
- System prompt: Keep concise; move examples to few-shot.