Conversations

A conversation is just a list of messages. Nothing is remembered between calls: you append to the list, send the whole list on every request, and the model sees everything that came before.

The message shape

Every message has a role and content.

| Role | What it is |
| --- | --- |
| `system` | Instructions for the model. Usually one message at the start of the conversation. |
| `user` | What the user typed. |
| `assistant` | What the model said. You append these from previous responses. |
| `tool` | The result of a tool call (when using Tool calling). |
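To make the four roles concrete, here is a sketch of a message list that uses each of them once. It assumes the OpenAI Chat Completions message format, which the compat endpoint is expected to accept; the tool name `get_weather`, the call id, and all content strings are made up for illustration.

```python
import json

# Hypothetical id linking the assistant's tool request to the tool's result.
call_id = "call_001"

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What's the weather in Oslo?"},
    {
        # An assistant turn that requests a tool instead of answering.
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "id": call_id,
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical tool
                    "arguments": json.dumps({"city": "Oslo"}),
                },
            }
        ],
    },
    # The tool result refers back to the request by id.
    {"role": "tool", "tool_call_id": call_id, "content": "4°C, light rain"},
]
```

Plain chat without tools only ever needs the first three roles, with `content` as a string.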

A working multi-turn

import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.opper.ai/v3/compat",
    api_key=os.environ["OPPER_API_KEY"],
)

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What's a vector?"},
]

# Turn 1
r = client.chat.completions.create(model="openai/gpt-5-mini", messages=messages)
reply = r.choices[0].message.content
messages.append({"role": "assistant", "content": reply})
print("Assistant:", reply)

# Turn 2 — the model remembers turn 1 because we sent the whole list
messages.append({"role": "user", "content": "Give me an example in 3D."})
r = client.chat.completions.create(model="openai/gpt-5-mini", messages=messages)
print("Assistant:", r.choices[0].message.content)
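The append-then-send pattern above can be wrapped in a small helper so that each turn is a single call. This is a minimal sketch, not part of any SDK: `send_turn` and its `complete` parameter are hypothetical names, and `complete` stands in for whatever function actually calls the API. It is passed in so the bookkeeping can be exercised without a network call.

```python
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]


def send_turn(
    messages: List[Message],
    user_text: str,
    complete: Callable[[List[Message]], str],
) -> str:
    """Append the user's message, get a reply, and record it in the history."""
    messages.append({"role": "user", "content": user_text})
    reply = complete(messages)  # the full history goes out on every call
    messages.append({"role": "assistant", "content": reply})
    return reply
```

With the `client` from the example above, `complete` could be `lambda msgs: client.chat.completions.create(model="openai/gpt-5-mini", messages=msgs).choices[0].message.content`.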

System prompt

The first system message sets the model’s persona and ground rules. It stays in the conversation, and you only write it once.

Python

{"role": "system", "content": "You are a customer support agent for Acme Corp. Always cite a ticket number when referring to a past issue."}

Keep it short. The system prompt is resent, and billed, on every turn, and the longer it gets, the more likely the model is to overlook individual instructions in it.

Stop reasons

The response tells you why the model stopped:

| `finish_reason` | Meaning |
| --- | --- |
| `stop` | Model finished its answer naturally. |
| `length` | Hit `max_tokens`. Response may be truncated. |
| `tool_calls` | Model wants to call a tool (see Tool calling). |
| `content_filter` | A Guard rule blocked or modified the output. |
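In the OpenAI Python SDK the value is available as `r.choices[0].finish_reason`. One way to act on it is a small dispatch function like the sketch below. The function name `next_action` and the action labels it returns are hypothetical; they only illustrate that each stop reason usually calls for a different follow-up.

```python
def next_action(finish_reason: str) -> str:
    """Map a finish_reason to what the calling code should do next."""
    if finish_reason == "stop":
        return "use_reply"  # complete answer: append it and carry on
    if finish_reason == "length":
        return "handle_truncation"  # e.g. raise max_tokens or ask the model to continue
    if finish_reason == "tool_calls":
        return "run_tools"  # run the requested tools, then send back `tool` messages
    if finish_reason == "content_filter":
        return "handle_blocked"  # output was blocked or modified
    return "unknown"  # be defensive about values not listed in the table
```

Checking for `length` before appending a reply to the history avoids carrying a truncated answer into later turns unnoticed.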

Keeping history under control

Because the whole list is sent on every call, token usage grows with each turn. A few options:

Trim older turns when the conversation gets long. Keep the system message and the most recent N turns.
Summarize older turns into a single assistant message (“earlier you discussed: X, Y, Z”).
Use a separate JSON API call to extract just the bits worth remembering, then drop the rest. See JSON API.
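The first option, trimming, can be sketched as follows. `trim_history` is a hypothetical helper, and it assumes a "turn" starts at a `user` message and runs up to the next one. Cutting only at `user` messages keeps each assistant reply, and any tool results, together with the question that prompted them.

```python
from typing import Any, Dict, List

Message = Dict[str, Any]


def trim_history(messages: List[Message], max_turns: int) -> List[Message]:
    """Return system messages plus the most recent `max_turns` turns.

    The input list is not modified. System messages are moved to the front,
    which matches the usual case of a single system message at the start.
    """
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    if max_turns <= 0:
        return system

    # Positions in `rest` where each turn begins.
    starts = [i for i, m in enumerate(rest) if m["role"] == "user"]
    if len(starts) <= max_turns:
        return system + rest  # nothing to trim yet

    cut = starts[-max_turns]  # start of the oldest turn we keep
    return system + rest[cut:]
```

Calling `messages = trim_history(messages, 10)` before each request would cap the history at ten turns. Counting turns is a rough proxy for size, so a budget based on tokens may fit better when message lengths vary widely.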

What’s next

Tool calling

Let the model invoke your tools mid-conversation.

Streaming

Stream the response token-by-token.

Vision & PDFs

Send images and documents as message content.

Drop-in SDKs

The auth and base URL setup.