Work Out the Real Cost of a Multi-Turn Claude Chat

Claude's API is stateless, so every turn re-sends the whole conversation. This calculator models that compounding context and shows what a back-and-forth chat actually costs in input and output tokens.

Model & pricing: prices are quoted as input / output rates per 1,000,000 tokens.

System prompt tokens (sent every turn): instructions, tools, and persona, resent on each request.

Tokens per user message

Tokens per assistant reply

Number of turns (exchanges)

One turn = one user message plus one assistant reply.

Results: total conversation cost, input cost, output cost, total input tokens, total output tokens, last-turn input tokens, and average cost per turn.

Per-turn table columns: Turn, Input tokens, Cumulative cost.

How the calculation works

Because the Messages API holds no server-side memory, your application must resend the full transcript on every request. That means the prompt grows each turn and you pay for the same earlier tokens again and again. Input cost is therefore not linear in the number of turns; it grows quadratically.

Let S be the system-prompt tokens, U the tokens in each user message, O the tokens in each assistant reply, and N the number of turns. At turn k the request carries the system prompt, every prior exchange, and the new user message:

input(k) = S + (k − 1)(U + O) + U

Summing turn 1 through N gives the closed form this tool evaluates:

total input = N·S + N·U + (U + O)·N(N − 1)/2
total output = N·O
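As a quick sanity check, the per-turn formula and the closed form above can be sketched in Python and compared against a brute-force sum. The function names and the sample sizes (S = 500, U = 100, O = 300, N = 10) are illustrative assumptions, not the calculator's own code or defaults.

```python
def input_tokens_at_turn(k: int, s: int, u: int, o: int) -> int:
    """Input tokens sent on turn k (1-indexed):
    system prompt + every prior exchange + the new user message."""
    return s + (k - 1) * (u + o) + u


def total_input_tokens(n: int, s: int, u: int, o: int) -> int:
    """Closed-form sum of input_tokens_at_turn over turns 1..n."""
    # n * (n - 1) is always even, so the integer division is exact.
    return n * s + n * u + (u + o) * n * (n - 1) // 2


def total_output_tokens(n: int, o: int) -> int:
    """Output tokens are generated once per turn and never re-billed as output."""
    return n * o


# Illustrative sizes only (hypothetical, not the tool's defaults).
S, U, O, N = 500, 100, 300, 10

brute_force = sum(input_tokens_at_turn(k, S, U, O) for k in range(1, N + 1))
assert brute_force == total_input_tokens(N, S, U, O)

print(total_input_tokens(N, S, U, O), total_output_tokens(N, O))  # 24000 3000
```

With these sample values the history term (U + O)·N(N − 1)/2 contributes 18,000 of the 24,000 input tokens, even at only ten turns.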

Each total is divided by 1,000,000 and multiplied by the matching rate (total input by the model's per-million input rate, total output by its per-million output rate), and the two products are summed. The N(N − 1)/2 term is the compounding history: it is why a 40-turn chat can cost close to four times a 20-turn chat rather than twice, once history dominates the prompt. The per-turn table makes that curve visible: watch the input column climb every row even though your message size never changes.
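The pricing step and the per-turn table can be approximated with a short Python sketch. The rates ($3.00 input, $15.00 output per million tokens) and token sizes below are placeholders chosen for illustration, and the cumulative column is assumed to include both input and output cost; the calculator's actual implementation may differ.

```python
def conversation_cost(n, s, u, o, input_rate, output_rate):
    """Return (input_cost, output_cost, total_cost) in dollars for an n-turn chat.

    input_rate and output_rate are dollars per 1,000,000 tokens.
    """
    total_in = n * s + n * u + (u + o) * n * (n - 1) // 2
    total_out = n * o
    input_cost = total_in / 1_000_000 * input_rate
    output_cost = total_out / 1_000_000 * output_rate
    return input_cost, output_cost, input_cost + output_cost


def per_turn_table(n, s, u, o, input_rate, output_rate):
    """Yield (turn, input_tokens, cumulative_cost) rows.

    Assumption: cumulative cost counts both the turn's input and its output.
    """
    cumulative = 0.0
    for k in range(1, n + 1):
        turn_input = s + (k - 1) * (u + o) + u
        cumulative += turn_input / 1_000_000 * input_rate
        cumulative += o / 1_000_000 * output_rate
        yield k, turn_input, cumulative


# Placeholder rates and sizes (hypothetical, for illustration only).
IN_RATE, OUT_RATE = 3.00, 15.00
S, U, O = 500, 100, 300

for turn, tokens, cum in per_turn_table(5, S, U, O, IN_RATE, OUT_RATE):
    print(f"{turn:>4}  {tokens:>8}  ${cum:.4f}")

cost_20 = conversation_cost(20, S, U, O, IN_RATE, OUT_RATE)[2]
cost_40 = conversation_cost(40, S, U, O, IN_RATE, OUT_RATE)[2]
print(f"40 turns vs 20 turns: {cost_40 / cost_20:.2f}x")
```

With these placeholder numbers the 40-to-20 ratio lands between two and four, because the system prompt and output terms still grow linearly. It moves toward four as the history term takes over the bill.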

Two levers shrink the quadratic term. Prompt caching bills reads of the repeated prefix at roughly one-tenth the input rate after the first write, which cuts the cost of the stable system prompt and earlier history by about an order of magnitude rather than removing it; cache writes are billed at a premium over the base input rate. Context trimming or compaction caps how much history rides along, flattening the curve once the conversation passes a threshold. This calculator shows the uncached baseline so you can size that gap before deciding whether caching is worth wiring in.
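To size that gap, both levers can be modelled roughly in Python. This is a simplified sketch under stated assumptions: the cache never expires between turns, every token not already cached is billed as a cache write, and the multipliers (`read_mult=0.1`, `write_mult=1.25`) are assumed parameters that should be checked against current pricing. The function names and the `max_exchanges` cap are hypothetical.

```python
def cached_input_cost(n, s, u, o, input_rate, read_mult=0.1, write_mult=1.25):
    """Input cost in dollars when each turn reuses the previous prompt from cache.

    Simplified model: the whole previous request is a cache hit, and the
    new tokens (previous reply + new user message) are billed as writes.
    """
    cost = 0.0
    cached = 0  # tokens already cached by the previous request
    for k in range(1, n + 1):
        turn_input = s + (k - 1) * (u + o) + u
        fresh = turn_input - cached
        cost += (cached * read_mult + fresh * write_mult) * input_rate / 1_000_000
        cached = turn_input
    return cost


def trimmed_input_tokens(n, s, u, o, max_exchanges):
    """Total input tokens when only the most recent max_exchanges
    prior exchanges are resent alongside the system prompt."""
    total = 0
    for k in range(1, n + 1):
        kept = min(k - 1, max_exchanges)
        total += s + kept * (u + o) + u
    return total


# Placeholder sizes and rate (hypothetical, for illustration only).
S, U, O, N, IN_RATE = 500, 100, 300, 40, 3.00

baseline_tokens = N * S + N * U + (U + O) * N * (N - 1) // 2
baseline_cost = baseline_tokens / 1_000_000 * IN_RATE

print(f"uncached input cost: ${baseline_cost:.4f}")
print(f"cached input cost:   ${cached_input_cost(N, S, U, O, IN_RATE):.4f}")
print(f"input tokens, last 5 exchanges kept: {trimmed_input_tokens(N, S, U, O, 5)}")
```

In this model trimming makes input growth linear once the cap is reached, while caching keeps the quadratic shape and scales it down. The sketch treats the two levers separately; combining them needs care, since dropping early history changes the prompt prefix that a cache would otherwise reuse.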
