📋 Cost breakdown per turn
| Turn | Input cached? | Input tokens | Output tokens | Cost (USD) |
💻 Enable prompt caching (API example)
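// Claude API request with cache control on the system prompt
const response = await fetch("https://api.anthropic.com/v1/messages", {
  method: "POST",
  headers: {
    "x-api-key": "sk-...",
    "anthropic-version": "2023-06-01",
    "content-type": "application/json" // the body is JSON
  },
  body: JSON.stringify({
    model: "claude-sonnet-4-20250514",
    max_tokens: 1024, // required by the Messages API
    system: [
      // cache_control marks the end of the prefix to cache.
      // Note: a prompt this short is below the minimum cacheable size,
      // so a real system prompt would need to be much longer.
      { type: "text", text: "You are a helpful assistant.", cache_control: { type: "ephemeral" } }
    ],
    messages: [{ role: "user", content: "Hello" }]
  })
});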
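To check whether a request actually hit the cache, the `usage` object in the response body can be inspected. The sketch below assumes the usage fields `input_tokens`, `cache_creation_input_tokens`, and `cache_read_input_tokens`; the helper name `summarizeCacheUsage` and the sample numbers are hypothetical and only illustrate the shape of the data.

```javascript
// Summarize the cache-related counters from a parsed Messages API response.
// `usage` is the `usage` object of the JSON body.
function summarizeCacheUsage(usage) {
  const written = usage.cache_creation_input_tokens ?? 0; // tokens written to the cache this turn
  const read = usage.cache_read_input_tokens ?? 0;        // tokens served from the cache
  const uncached = usage.input_tokens ?? 0;               // input tokens billed at the normal rate
  return {
    written,
    read,
    uncached,
    totalInput: written + read + uncached,
    cacheHit: read > 0,
  };
}

// With a real request (not executed here):
//   const data = await response.json();
//   console.log(summarizeCacheUsage(data.usage));

// Hypothetical usage object for illustration only; real values come from the API.
const example = summarizeCacheUsage({
  input_tokens: 12,
  cache_creation_input_tokens: 0,
  cache_read_input_tokens: 2048,
  output_tokens: 50,
});
console.log(example);
```

A non-zero `read` count on the second and later turns is the quickest sign that the cached system prompt is being reused.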
⏳ Cache TTL: 5 minutes of inactivity
📏 Min. cacheable size: 1024 tokens (system prompt)
⚡ First turn: full system prompt cost (the cache is written)
🔁 Subsequent turns: cached system prompt (reduced cost)
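The rules above can be turned into a rough per-turn cost estimate. This is a minimal sketch, not a billing calculator: every price and multiplier in `ASSUMED` is a placeholder assumption, `simulateTurns` is a hypothetical helper, and the model ignores conversation history growing between turns. Only the 5-minute TTL and the 1024-token minimum come from the figures listed above.

```javascript
// All prices and multipliers here are assumptions for illustration, not quoted rates.
const ASSUMED = {
  inputPerMTok: 3.0,          // USD per million uncached input tokens (hypothetical)
  outputPerMTok: 15.0,        // USD per million output tokens (hypothetical)
  cacheWriteMultiplier: 1.25, // assumed surcharge when the cache is written
  cacheReadMultiplier: 0.1,   // assumed discount when the cache is read
  ttlMs: 5 * 60 * 1000,       // 5 minutes of inactivity
  minCacheableTokens: 1024,   // minimum cacheable system prompt size
};

// turns: [{ atMs, userTokens, outputTokens }], ordered by time.
function simulateTurns(systemTokens, turns, cfg = ASSUMED) {
  const cacheable = systemTokens >= cfg.minCacheableTokens;
  let lastUsedMs = null; // time of the last request that wrote or read the cache
  return turns.map((turn, i) => {
    // The cache is warm only if it was touched within the TTL window.
    const warm = cacheable && lastUsedMs !== null && turn.atMs - lastUsedMs <= cfg.ttlMs;
    let systemRate = 1; // prompts below the minimum size are billed normally
    if (cacheable) systemRate = warm ? cfg.cacheReadMultiplier : cfg.cacheWriteMultiplier;
    const inputCost = ((systemTokens * systemRate + turn.userTokens) * cfg.inputPerMTok) / 1e6;
    const outputCost = (turn.outputTokens * cfg.outputPerMTok) / 1e6;
    if (cacheable) lastUsedMs = turn.atMs; // each use restarts the inactivity timer
    return {
      turn: i + 1,
      inputCached: warm,
      inputTokens: systemTokens + turn.userTokens,
      outputTokens: turn.outputTokens,
      costUsd: inputCost + outputCost,
    };
  });
}

// Three turns: a cold start, a hit one minute later, then a miss after a 6-minute gap.
const rows = simulateTurns(2000, [
  { atMs: 0, userTokens: 20, outputTokens: 100 },
  { atMs: 60_000, userTokens: 20, outputTokens: 100 },
  { atMs: 420_000, userTokens: 20, outputTokens: 100 },
]);
console.table(rows);
```

The returned rows use the same columns as the cost breakdown table (turn, whether the input was cached, input tokens, output tokens, cost), so they can be rendered into it directly.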