Claude's API is stateless, so every turn re-sends the whole conversation. This calculator models that compounding context and shows what a back-and-forth chat actually costs in input and output tokens.
| Turn | Input tokens | Cumulative cost |
|---|---|---|
Because the Messages API holds no server-side memory, your application must resend the full transcript on every request. The prompt therefore grows each turn, and you pay for the same earlier tokens again and again. Total input cost is not linear in the number of turns: it grows quadratically.
Let S be the system-prompt tokens, U the tokens in each user message, O the tokens in each assistant reply, and N the number of turns. At turn k the request carries the system prompt, every prior exchange, and the new user message:
input(k) = S + (k − 1)(U + O) + U
Summing turn 1 through N gives the closed form this tool evaluates:
total input = N·S + N·U + (U + O)·N(N − 1)/2
total output = N·O
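The per-turn formula and the closed form above can be checked against each other with a short sketch. The function names and the sample values for S, U, O, and N are illustrative assumptions, not taken from the tool itself.

```python
def input_tokens_at_turn(k: int, S: int, U: int, O: int) -> int:
    """Input tokens sent on turn k (1-indexed): the system prompt,
    k - 1 prior exchanges, and the new user message."""
    return S + (k - 1) * (U + O) + U


def total_input_tokens(N: int, S: int, U: int, O: int) -> int:
    """Closed form: N*S + N*U + (U + O) * N(N - 1)/2.
    N * (N - 1) is always even, so integer division is exact."""
    return N * S + N * U + (U + O) * N * (N - 1) // 2


def total_output_tokens(N: int, O: int) -> int:
    """Output tokens grow linearly: one reply of O tokens per turn."""
    return N * O


if __name__ == "__main__":
    S, U, O, N = 500, 100, 300, 20  # illustrative values only
    summed = sum(input_tokens_at_turn(k, S, U, O) for k in range(1, N + 1))
    # The turn-by-turn sum and the closed form should agree.
    assert summed == total_input_tokens(N, S, U, O)
    print(summed, total_output_tokens(N, O))
```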
Each total is divided by 1,000,000 and multiplied by the corresponding per-million rate for the model (input or output), and the two products are summed. The N(N − 1)/2 term is the compounding history. It is why a 40-turn chat can cost about four times as much as a 20-turn chat rather than twice as much. The per-turn table makes that curve visible: watch the input column climb every row even though your message size never changes.
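As a rough sketch of that pricing step, the code below turns the token totals into dollars and generates rows shaped like the per-turn table. The rates of 3.00 and 15.00 dollars per million tokens are placeholders chosen for illustration, not quoted prices, and the function names are hypothetical.

```python
def conversation_cost(N, S, U, O, input_rate, output_rate):
    """Uncached cost in dollars; rates are dollars per million tokens."""
    total_in = N * S + N * U + (U + O) * N * (N - 1) // 2
    total_out = N * O
    return total_in / 1_000_000 * input_rate + total_out / 1_000_000 * output_rate


def per_turn_rows(N, S, U, O, input_rate, output_rate):
    """Yield (turn, input tokens, cumulative cost) for each turn."""
    cumulative = 0.0
    for k in range(1, N + 1):
        turn_in = S + (k - 1) * (U + O) + U
        cumulative += turn_in / 1_000_000 * input_rate
        cumulative += O / 1_000_000 * output_rate
        yield k, turn_in, cumulative


if __name__ == "__main__":
    IN_RATE, OUT_RATE = 3.00, 15.00  # placeholder rates, not real prices
    for turn, tokens, cost in per_turn_rows(5, 500, 100, 300, IN_RATE, OUT_RATE):
        print(f"{turn:>4} {tokens:>8} {cost:>10.4f}")
    ratio = (conversation_cost(40, 500, 100, 300, IN_RATE, OUT_RATE)
             / conversation_cost(20, 500, 100, 300, IN_RATE, OUT_RATE))
    print(f"40-turn vs 20-turn cost ratio: {ratio:.2f}")
```

With S = 0 and U = O, the input total reduces to U·N², so doubling the turn count exactly quadruples input tokens. A large system prompt or the linear output charge pulls the overall ratio below that.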
Two levers shrink the quadratic term. Prompt caching bills the repeated prefix at roughly one-tenth the input rate after the first write, so the resent system prompt and earlier history cost far less; the growth is still quadratic, but with a much smaller coefficient. Context trimming or compaction caps how much history rides along, flattening the curve to linear once the conversation passes a threshold. This calculator shows the uncached baseline so you can size that gap before deciding whether caching is worth wiring in.
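To get a feel for the size of that gap, here is a simplified model of both levers. It assumes each turn reuses the entire previous request as a cached prefix, that cache reads bill at 0.10 of the input rate, and that newly cached tokens carry a 1.25 write multiplier. Both multipliers are assumptions to verify against current pricing, and the model ignores cache expiry and minimum cacheable lengths. The `window` parameter for trimming is likewise hypothetical.

```python
CACHE_READ_MULT = 0.10   # assumed: cached prefix billed at ~1/10 the input rate
CACHE_WRITE_MULT = 1.25  # assumed write premium; check current pricing


def uncached_input_tokens(N, S, U, O):
    """Baseline closed form from the text."""
    return N * S + N * U + (U + O) * N * (N - 1) // 2


def cached_input_equiv(N, S, U, O):
    """Input cost in base-rate token equivalents when each turn reads
    the previous request from cache and writes only the new tokens."""
    total = 0.0
    for k in range(1, N + 1):
        sent = S + (k - 1) * (U + O) + U
        # The previous request is the reusable prefix (none on turn 1).
        prefix = 0 if k == 1 else S + (k - 2) * (U + O) + U
        new = sent - prefix
        total += prefix * CACHE_READ_MULT + new * CACHE_WRITE_MULT
    return total


def trimmed_input_tokens(N, S, U, O, window):
    """Total input tokens when only the last `window` exchanges are resent."""
    return sum(S + min(k - 1, window) * (U + O) + U for k in range(1, N + 1))


if __name__ == "__main__":
    N, S, U, O = 20, 500, 100, 300  # illustrative values only
    print("uncached:", uncached_input_tokens(N, S, U, O))
    print("cached (token equivalents):", cached_input_equiv(N, S, U, O))
    print("trimmed to 5 exchanges:", trimmed_input_tokens(N, S, U, O, window=5))
```

The two levers are modeled separately here. In practice trimming changes the prefix, which can invalidate cached content, so combining them needs more care than this sketch shows.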