This is the shortest path to a working voice loop. Pick the tab that matches where your code runs, copy the snippet, and swap in your API key. For the full protocol (events, config fields, per-provider notes, billing, transcription, tool flow), see the Realtime protocol.
Run your first session
Get an API key
Create a project-scoped runtime API key in the Opper dashboard.
Open a session
- Server (Node.js)
- Browser (ephemeral ticket)
Server-side clients connect directly with a bearer token. This is the quickest way to check that the endpoint works.
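As a rough illustration of the bearer-token connection described above, here is a minimal sketch. It assumes a WebSocket client that accepts custom headers, such as the `ws` package. `buildConnectArgs`, the `REALTIME_URL` and `OPPER_API_KEY` environment variable names, and the idea of passing the endpoint URL in as a parameter are all hypothetical. This page does not state the endpoint URL, so take it from the Realtime protocol reference.

```javascript
// Hypothetical helper: builds the arguments for a WebSocket client that
// accepts custom headers (for example, the `ws` package's constructor).
// The endpoint URL is NOT taken from this page; pass the real one in.
function buildConnectArgs(endpointUrl, apiKey) {
  if (!apiKey) {
    throw new Error("Missing API key");
  }
  return [
    endpointUrl,
    // Standard HTTP bearer-token header, sent with the WebSocket handshake.
    { headers: { Authorization: `Bearer ${apiKey}` } },
  ];
}

// Usage with the `ws` package (assumed to be installed; env var names are
// placeholders):
//   const WebSocket = require("ws");
//   const [url, options] = buildConnectArgs(
//     process.env.REALTIME_URL,
//     process.env.OPPER_API_KEY
//   );
//   const socket = new WebSocket(url, options);
//   socket.on("open", () => console.log("connected"));
//   socket.on("message", (data) => console.log(data.toString()));
```

Keeping the key in an environment variable rather than in source makes it harder to leak a server-side key into a repository.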
Watch the events
On a successful open:
From here, stream audio in with audio.append (base64-encoded PCM16 at input_sample_rate), and the assistant’s audio comes back in audio.delta frames at output_sample_rate. If a tool fires you get a tool.call and reply with tool.result. When the session ends you get session.terminating followed by a clean WebSocket close.
Switch providers
The protocol is the same across providers. Change one string:
| Provider | Model id | Notes |
|---|---|---|
| OpenAI | openai/gpt-realtime-2 | Reasoning effort supported. 24 kHz symmetric. |
| xAI | xai/grok-voice-latest | Per-minute billing. 24 kHz symmetric. |
| Gemini | gemini/gemini-3.1-flash-live-preview | Asymmetric sample rates: 16 kHz in, 24 kHz out. |
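The sample rates in the table decide how microphone audio has to be encoded before it goes out in audio.append. The sketch below shows one way to handle that in Node.js. The rates are copied from the table. Everything else is an assumption: the helper names are hypothetical, the audio is taken to be mono little-endian PCM16, and the `audio` field name in the message is a guess at the payload shape. Check the Realtime protocol reference for the actual schema.

```javascript
// Sample rates per model id, copied from the table above (Hz).
const SAMPLE_RATES = {
  "openai/gpt-realtime-2": { input: 24000, output: 24000 },
  "xai/grok-voice-latest": { input: 24000, output: 24000 },
  "gemini/gemini-3.1-flash-live-preview": { input: 16000, output: 24000 },
};

// Convert Float32 samples in [-1, 1] to little-endian PCM16, base64-encoded.
// Assumption: mono audio, little-endian byte order.
function floatToPcm16Base64(samples) {
  const buf = Buffer.alloc(samples.length * 2);
  for (let i = 0; i < samples.length; i++) {
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    // Scale asymmetrically so -1 maps to -32768 and +1 maps to 32767.
    const value = clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff;
    buf.writeInt16LE(Math.round(value), i * 2);
  }
  return buf.toString("base64");
}

// Hypothetical message shape: the `type` value comes from this page, but the
// `audio` field name is an assumption, not a documented schema.
function buildAudioAppend(samples) {
  return JSON.stringify({
    type: "audio.append",
    audio: floatToPcm16Base64(samples),
  });
}

// Number of samples in a chunk of `ms` milliseconds at the model's input rate.
function samplesPerChunk(modelId, ms) {
  const rates = SAMPLE_RATES[modelId];
  if (!rates) {
    throw new Error(`Unknown model id: ${modelId}`);
  }
  return Math.round((rates.input * ms) / 1000);
}
```

With these rates, a 20 ms chunk is 320 samples for the Gemini model and 480 samples for the OpenAI and xAI models, so capture code that hard-codes a chunk size for one provider needs adjusting when the model id changes.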
What’s next
Realtime protocol
Every config field, event, and per-provider note.
Models
The full list of supported realtime model IDs.
Mint endpoint
POST /v3/realtime-sessions request and response schema.
Cookbook example
A complete browser voice app with microphone capture and tool calls.