A conversation is just a list of messages. You keep adding to it, send the whole list every time, and the model has full context of everything that came before.Documentation Index
Fetch the complete documentation index at: https://docs.opper.ai/llms.txt
Use this file to discover all available pages before exploring further.
The message shape
Every message has arole and content.
| Role | What it is |
|---|---|
system | Instructions for the model. Usually one message at the start of the conversation. |
user | What the user typed. |
assistant | What the model said. You append these from previous responses. |
tool | The result of a tool call (when using Tool calling). |
A working multi-turn
System prompt
The firstsystem message sets the model’s persona and ground rules. It stays in the conversation, and you only write it once.
Python
Stop reasons
The response tells you why the model stopped:finish_reason | Meaning |
|---|---|
stop | Model finished its answer naturally. |
length | Hit max_tokens. Response may be truncated. |
tool_calls | Model wants to call a tool (see Tool calling). |
content_filter | A Guard rule blocked or modified the output. |
Keeping history under control
Messages add up fast. A few options:- Trim older turns when the conversation gets long. Keep the system message and recent N turns.
- Summarize older turns into a single assistant message (“earlier you discussed: X, Y, Z”).
- Use a separate JSON API call to extract just the bits worth remembering, then drop the rest. See JSON API.
What’s next
Tool calling
Let the model invoke your tools mid-conversation.
Streaming
Stream the response token-by-token.
Vision & PDFs
Send images and documents as message content.
Drop-in SDKs
The auth and base URL setup.