tools array, so anything you’ve written against OpenAI, Anthropic, or other compat SDKs works unchanged. It’s available across all 300+ models.
The round trip
A tool call always follows the same shape:- You send a chat completion request with a
toolslist. - The model responds with a
tool_use(it wants to call one). - You run the tool in your code.
- You send the result back to the model as a
tool_resultmessage. - The model uses the result to write its final response.
A working example
A weather assistant.Control when tools fire
Thetool_choice parameter tells the model how aggressively to use tools.
| Value | Behavior |
|---|---|
"auto" (default) | The model decides. Calls a tool when it makes sense, otherwise answers in text. |
"none" | Tools are ignored. The model has to answer from what it knows. |
"required" | The model must call one of the tools. Use when the answer can only come from a tool. |
{"type": "function", "function": {"name": "X"}} | Force a specific tool. |
Parallel calls
The model can ask for several tool calls in one turn. Thetool_calls array on the response can have more than one entry. Run them all (in parallel if they’re independent), then send back one tool message per call, each with its tool_call_id.
Streaming tool arguments
When you stream a tool call withstream: true, the arguments arrive as JSON fragments in delta.tool_calls[].function.arguments. Concatenate them as they come in, then parse once the call is complete. Useful for showing “I’m calling search…” UI as the call assembles. See Streaming.
Tools vs structured output
Tools and structured output look similar but do different things.| Tools | Structured output | |
|---|---|---|
| Goal | Have the model trigger an action | Get a single typed JSON object back |
| Round-trips | At least two (call, result, follow-up) | One |
| Multiple results per turn | Yes, parallel calls allowed | No, one object |
| Best for | Agents, search, data lookups, side effects | Extraction, classification, parsing |
What’s next
Server-side tools
Provider-run tools — Anthropic web_search, OpenAI code_interpreter, Google grounding — no round-trip needed.
Conversations
Multi-turn chat with message history.
Streaming
Stream tokens and tool arguments as they arrive.
Structured output
Get JSON back without the tool round-trip.