The simplest Chat API call: a list of messages goes in, a single response comes out. Same shape every model supports. If you haven’t set up your SDK yet, start with Drop-in SDKs.Documentation Index
Fetch the complete documentation index at: https://docs.opper.ai/llms.txt
Use this file to discover all available pages before exploring further.
A minimal call
Pick a model
Themodel field is always provider-prefixed: openai/gpt-5.5, anthropic/claude-sonnet-4-6, gemini/gemini-2.5-pro. Browse the full set on the Models page.
You can call any model from any SDK. The Anthropic SDK can talk to a Google model, the OpenAI SDK can talk to a Claude model. The provider prefix decides where the call routes. The SDK only sets the request shape.
If you don’t pass model, the call falls through to your Route rule (if any), then to model preference hints.
Common parameters
| Parameter | What it does |
|---|---|
temperature | 0 to 2. Lower is more deterministic. |
top_p | 0 to 1. Nucleus sampling. Don’t use with temperature at the same time. |
max_tokens | Cap the response length. |
stop | A string or array of strings. The model stops as soon as it sees one. |
frequency_penalty / presence_penalty | -2 to 2. Penalize repeated tokens. |
n | How many response choices to generate. Default 1. |
reasoning_effort: "low" | "medium" | "high" to control how much the model “thinks” before answering.
The response shape
Response
choices[0].message.contentis the assistant’s replychoices[0].finish_reasonis why the model stopped (stop,length,tool_calls,content_filter)
What’s next
Conversations
Multi-turn chat with message history.
Tool calling
Let the model call your code.
Streaming
Stream tokens as they’re generated.
Structured output
Get JSON back instead of free text.