Claude Code Snippets Library

Searchable collection of production-ready Claude API patterns. Filter by category, switch between Python and JavaScript, and copy any snippet with one click. Every example uses the latest SDK and API conventions.

By Michael Lip · May 25, 2026

What This Library Provides

The Claude Code Snippets Library is a curated collection of copy-paste code patterns that cover every major feature of the Anthropic Messages API. Each snippet is a self-contained, production-ready example that you can drop directly into your Python or JavaScript application. Rather than searching through documentation pages and piecing together code from different sections, this library gives you the complete pattern in one place, including imports, client setup, API call, and response handling.

Every snippet in this library has been tested against the live Anthropic API and updated to use the latest SDK version and model identifiers. The examples use claude-sonnet-4-20250514 as the default model, which provides an excellent balance of capability and speed for most use cases. You can substitute any model identifier based on your requirements. The Python snippets use the official anthropic package, and the JavaScript snippets use the official @anthropic-ai/sdk package.

How to Use the Snippets

Start by typing a keyword in the search box to filter snippets instantly. The search matches against snippet titles, descriptions, and code content, so you can search for specific function names, parameters, or concepts. Use the category pills to narrow results to a specific API feature area. The language toggle switches all visible snippets between Python and JavaScript simultaneously, so you can find the pattern in your preferred language without scrolling.

Click any snippet header to expand it and reveal the full code. The copy button in the top-right corner of each code block copies the entire snippet to your clipboard. Every snippet includes inline comments explaining the key parts of the pattern, so you understand what each line does and can modify it for your specific use case. For more complex patterns, the description above the code block explains the concept, when to use it, and any important caveats.

Core Message Patterns

The foundation of every Claude integration is the messages endpoint. The basic message pattern sends a single user message and receives a text response. From there, you build up to multi-turn conversations by maintaining a messages array with alternating user and assistant roles. Each message in the conversation must include the role and content fields. The content can be a simple string or an array of content blocks for richer inputs that combine text, images, and tool results.

Multi-turn conversations require you to include the full conversation history in every API call. Claude does not maintain server-side state between requests. This means your application is responsible for storing and sending the complete message history. For long conversations, you need to manage the context window by truncating older messages or summarizing previous exchanges. The max_tokens parameter controls the maximum length of Claude's response, not the total conversation length.

Streaming Patterns

Streaming delivers Claude's response as a series of server-sent events instead of waiting for the complete response. This dramatically reduces perceived latency because the user sees the first words within milliseconds instead of waiting seconds for the full response. The streaming API uses the same parameters as the non-streaming API, but returns events instead of a complete response object. The Python SDK provides a convenient stream() method that handles the event parsing for you.

The streaming event types include message_start (initial metadata), content_block_start (beginning of a text or tool_use block), content_block_delta (incremental text chunks), content_block_stop (end of a block), and message_stop (end of the response). For most applications, you only need to handle the text_delta events to display streaming text. The SDK's text_stream iterator simplifies this to a single for loop. For tool use with streaming, you also need to handle input_json_delta events to accumulate the tool input JSON.

Tool Use Patterns

Tool use allows Claude to call functions defined in your application. You provide tool definitions with a name, description, and JSON Schema for the input parameters. Claude decides when to call a tool based on the user's message and the tool descriptions. The response includes a tool_use content block with the function name and arguments. Your application executes the function and returns the result in a tool_result content block, then Claude generates the final response incorporating the tool output.

Advanced tool use patterns include parallel tool calls (Claude calling multiple tools in a single response), forced tool use (requiring Claude to use a specific tool), and multi-step tool chains where Claude calls tools iteratively to solve complex problems. The tool_choice parameter controls whether Claude can choose to use tools, must use tools, or must use a specific tool. For building tool definitions visually, the ClaudKit Tool Builder generates the JSON schema for you.

Vision and Image Patterns

Claude's vision capabilities allow you to send images alongside text prompts. Images can be provided as base64-encoded data or as URLs. The content array can contain multiple image blocks and text blocks in any order, allowing you to send multiple images with instructions about each one. Claude can analyze, describe, compare, extract text from, and reason about image contents. Supported formats include JPEG, PNG, GIF, and WebP with a maximum file size of 20MB per image.

For base64 encoding, read the image file and encode it to a base64 string, then include it in the content array with the correct media_type. For URL-based images, Claude fetches the image at request time, so the URL must be publicly accessible. When sending multiple images, place them before your text prompt to give Claude the visual context before reading your instructions. Vision is particularly useful for document analysis, chart reading, screenshot debugging, and any task that requires understanding visual content.

Error Handling and Retry Patterns

Production applications must handle API errors gracefully. The most common errors are 429 (rate limited), 529 (API overloaded), and 400 (bad request). Rate limit errors should be retried with exponential backoff, starting with a 1-second delay and doubling up to a maximum of 60 seconds. Overloaded errors indicate temporary capacity issues and should be retried with a longer initial delay of 5 to 10 seconds. Bad request errors indicate a problem with your request parameters and should not be retried. Instead, log the error details and fix the request.

The official SDKs provide built-in retry logic with configurable max retries, but you may want custom retry behavior for specific error types. Implement a wrapper function that catches exceptions, classifies the error type, and applies the appropriate retry strategy. Always set a maximum retry count to avoid infinite loops. For monitoring API errors across your application, KickLLM provides error tracking dashboards that categorize failures by type and frequency.

Advanced Patterns

Advanced patterns include prompt caching (reducing costs for repeated system prompts), batch processing (sending multiple requests efficiently), conversation branching (exploring different response paths), and JSON mode (forcing structured output). Prompt caching uses cache_control blocks to tell the API to cache specific parts of your request, reducing processing time and cost for subsequent requests that share the same cached content. This is particularly effective for long system prompts used across many requests.

Batch processing allows you to submit up to 10,000 requests at once with a 50% cost reduction. The batch API accepts JSONL files with individual requests and returns results asynchronously. Use this for bulk classification, content generation, and data processing tasks where latency is not critical. JSON mode uses response_format with type "json_object" to ensure Claude returns valid JSON, which is essential for applications that parse structured output programmatically. For testing these advanced patterns, use the API Request Builder.

Frequently Asked Questions

How do I send a basic message to the Claude API in Python?

Install the SDK with pip install anthropic, then create a client with anthropic.Anthropic(). Call client.messages.create() with model, max_tokens, and a messages array containing a dict with role "user" and your content string. The response object has a content array where the first element's text property contains Claude's reply. Set your ANTHROPIC_API_KEY environment variable before running.

How do I stream responses from the Claude API?

Use client.messages.stream() in Python with a with block, then iterate over stream.text_stream to get text chunks as they arrive. In JavaScript, use client.messages.stream() and listen for "text" events or use the async iterator. Streaming reduces time-to-first-token and improves perceived latency for long responses.

How do I use Claude's vision capabilities to analyze images?

Send images in the content array as objects with type "image" and a source object. For base64, set source.type to "base64", include the media_type (image/png, image/jpeg, etc.), and the base64-encoded data string. For URLs, set source.type to "url" and provide the url. Place image blocks before your text prompt in the content array.

How do I handle Claude API errors and implement retries?

Catch anthropic.APIError and check the status_code. 429 means rate limited — wait and retry with exponential backoff. 529 means the API is overloaded — retry after a longer delay. 400 is a bad request that should not be retried. 401 means invalid API key. The SDK provides specific exception classes like RateLimitError and APIStatusError for granular handling.

How do I use system prompts with the Claude API?

Pass a system parameter as a string or array of content blocks in your messages.create() call. System prompts set Claude's behavior, persona, and instructions. They are separate from the messages array and are placed before the conversation. System prompts support text and cache_control blocks for prompt caching. Keep system prompts concise but specific for best results.

Developer and creator of the Zovo Tools network. Building free, privacy-first developer tools that run entirely in the browser. No tracking, no sign-ups, no server-side processing. Open source on GitHub.