
# Realtime quickstart

> Open your first voice-to-voice session in under five minutes.

This is the shortest path to a working voice loop. Pick the tab that matches where your code runs, copy the snippet, and swap in your API key.

For the full protocol (events, config fields, per-provider notes, billing, transcription, tool flow), see the [Realtime protocol](/build/realtime/protocol).

## Run your first session

<Steps>
  <Step title="Get an API key">
    Create a project-scoped runtime API key in the [Opper dashboard](https://platform.opper.ai).
  </Step>

  <Step title="Open a session">
    <Tabs>
      <Tab title="Server (Node.js)">
        Server-side clients connect directly with a bearer token. This is the quickest way to check that the endpoint works.

        ```bash theme={null}
        npm install ws
        ```

        ```typescript theme={null}
        import WebSocket from "ws";

        const ws = new WebSocket("wss://api.opper.ai/v3/realtime", {
          headers: { Authorization: `Bearer ${process.env.OPPER_API_KEY}` },
        });

        ws.on("open", () => {
          ws.send(JSON.stringify({
            type: "session.start",
            config: {
              model: "openai/gpt-realtime-2",
              voice: "marin",
              instructions: "You are a concise voice assistant.",
            },
          }));
        });

        ws.on("message", (raw) => {
          const ev = JSON.parse(raw.toString());
          if (ev.type === "session.started") {
            console.log(`live: ${ev.session_id} @ ${ev.output_sample_rate}Hz`);
          }
          if (ev.type === "audio.delta") {
            // ev.audio is base64-encoded PCM16. Pipe to your audio player.
          }
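        });

        // Without an "error" listener, a failed connection throws and crashes the process.
        ws.on("error", (err) => console.error("realtime socket error:", err));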
        ```
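
        The `audio.delta` branch above leaves playback to you. As a minimal sketch of the decoding step, the hypothetical helpers below turn `ev.audio` into 16-bit samples and compute a chunk's duration from the rate reported in `session.started`. The little-endian byte order is an assumption; this page only states that the payload is base64-encoded PCM16.

        ```typescript
        // Hypothetical helper: decode a base64 PCM16 payload into 16-bit samples.
        // Assumes little-endian byte order, which this page does not state.
        function decodePcm16(base64Audio: string): Int16Array {
          const bytes = Buffer.from(base64Audio, "base64");
          const samples = new Int16Array(Math.floor(bytes.length / 2));
          for (let i = 0; i < samples.length; i++) {
            samples[i] = bytes.readInt16LE(i * 2);
          }
          return samples;
        }

        // Duration of a decoded chunk in milliseconds at the given sample rate.
        function chunkDurationMs(samples: Int16Array, sampleRate: number): number {
          return (samples.length * 1000) / sampleRate;
        }
        ```

        Inside the message handler you could call `decodePcm16(ev.audio)` and hand the resulting samples to whichever audio sink your server uses.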
      </Tab>

      <Tab title="Browser (ephemeral ticket)">
        Browsers can't set an `Authorization` header on the native `WebSocket` constructor. Mint a single-use ticket from your backend, then redeem it from the browser.

        **Step A. Your backend mints the ticket:**

        ```typescript theme={null}
        // POST from your trusted server (Node, Python, anything)
        const resp = await fetch("https://api.opper.ai/v3/realtime-sessions", {
          method: "POST",
          headers: {
            Authorization: `Bearer ${process.env.OPPER_API_KEY}`,
            "Content-Type": "application/json",
          },
          body: JSON.stringify({
            config: {
              model: "openai/gpt-realtime-2",
              voice: "marin",
              instructions: "You are a concise voice assistant.",
            },
          }),
        });
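        if (!resp.ok) throw new Error(`Ticket mint failed: HTTP ${resp.status}`);
        const { client_secret } = await resp.json();
        // Return client_secret to the browser. The API key itself stays on the server.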
        ```

        **Step B. The browser redeems it through the WebSocket subprotocol:**

        ```typescript theme={null}
        // clientSecret holds the client_secret value your backend returned in Step A.
        const ws = new WebSocket(
          "wss://api.opper.ai/v3/realtime",
          [`opper-ticket.${clientSecret}`],
        );

        ws.onopen = () => {
          // Bound fields (model, voice, instructions) are already locked in
          // by the ticket. Send any remaining session.start fields here.
          ws.send(JSON.stringify({ type: "session.start", config: {} }));
        };

        ws.onmessage = (e) => {
          const ev = JSON.parse(e.data);
          // Handle session.started, audio.delta, tool.call, etc.
        };
        ```

        Tickets are single-use and expire in 60 seconds by default. Whatever fields you populate in the mint request are **locked**, so the browser can't override them. See [Pre-binding for security](/build/realtime/protocol#pre-binding-for-security).
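
        Because a ticket is single-use and short-lived, the browser may want to check its age before opening the socket. The sketch below assumes the 60-second default stated above and relies on the local clock, so the server remains the authority on expiry. The `Ticket` shape, the helper names, and the 5-second safety margin are all hypothetical.

        ```typescript
        // Hypothetical client-side record of a minted ticket.
        interface Ticket {
          clientSecret: string;
          mintedAtMs: number; // local timestamp taken when the backend response arrived
        }

        const DEFAULT_TICKET_TTL_MS = 60_000; // 60-second default from this page

        // True while the ticket is younger than the TTL minus a safety margin.
        function isTicketUsable(
          ticket: Ticket,
          nowMs: number,
          ttlMs: number = DEFAULT_TICKET_TTL_MS,
          safetyMarginMs: number = 5_000,
        ): boolean {
          return nowMs - ticket.mintedAtMs < ttlMs - safetyMarginMs;
        }

        // Build the subprotocol entry used to redeem a ticket, as in Step B.
        function ticketSubprotocol(clientSecret: string): string {
          return `opper-ticket.${clientSecret}`;
        }
        ```

        If `isTicketUsable` returns `false`, request a fresh ticket from your backend instead of retrying the old one.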
      </Tab>
    </Tabs>
  </Step>

  <Step title="Watch the events">
    Once the session starts, the server replies with `session.started`:

    ```json theme={null}
    { "type": "session.started", "session_id": "sess_...", "input_sample_rate": 24000, "output_sample_rate": 24000, "audio_format": "pcm16" }
    ```

    From here, stream audio in with `audio.append` (base64-encoded PCM16 at `input_sample_rate`). The assistant's audio comes back in `audio.delta` frames at `output_sample_rate`. If a tool fires, you receive a `tool.call` and reply with `tool.result`. When the session ends, you receive `session.terminating`, followed by a clean WebSocket close.
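
    As a rough sketch of the input side in Node.js, the helpers below convert floating-point samples to PCM16 and wrap them in an `audio.append` message. The `audio` field name mirrors `audio.delta` and is an assumption, as are the helper names; the protocol page has the exact message shape.

    ```typescript
    // Convert Float32 samples in [-1, 1] to signed 16-bit PCM, clamping out-of-range values.
    function floatToPcm16(input: Float32Array): Int16Array {
      const out = new Int16Array(input.length);
      for (let i = 0; i < input.length; i++) {
        const s = Math.max(-1, Math.min(1, input[i]));
        out[i] = s < 0 ? Math.round(s * 0x8000) : Math.round(s * 0x7fff);
      }
      return out;
    }

    // Build an audio.append message as a JSON string.
    // ASSUMPTION: the payload field is called `audio`, mirroring audio.delta.
    // Int16Array uses the platform byte order (little-endian on common hardware).
    function audioAppendMessage(samples: Int16Array): string {
      const bytes = Buffer.from(samples.buffer, samples.byteOffset, samples.byteLength);
      return JSON.stringify({ type: "audio.append", audio: bytes.toString("base64") });
    }
    ```

    A capture loop would then call `ws.send(audioAppendMessage(floatToPcm16(chunk)))` for each chunk, after resampling to `input_sample_rate` if the capture rate differs.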
  </Step>
</Steps>
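
The session lifecycle described in the last step can be tracked with a small state function. This is only an illustrative sketch: the `SessionState` names and `nextState` are hypothetical, and it uses nothing beyond the event types named on this page.

```typescript
// Hypothetical phases of a realtime session as seen by the client.
type SessionState = "connecting" | "live" | "terminating";

// Advance the phase based on an incoming event's `type`.
function nextState(state: SessionState, eventType: string): SessionState {
  switch (eventType) {
    case "session.started":
      // Only the first session.started moves the session to "live".
      return state === "connecting" ? "live" : state;
    case "session.terminating":
      return "terminating";
    default:
      // audio.delta, tool.call, and other events do not change the phase.
      return state;
  }
}
```

Keeping the phase in one place makes it easier to stop sending `audio.append` frames once `session.terminating` has arrived.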

## Switch providers

The protocol is the same across providers. To switch, change the `model` string in your session config:

| Provider | Model id                               | Notes                                               |
| -------- | -------------------------------------- | --------------------------------------------------- |
| OpenAI   | `openai/gpt-realtime-2`                | Reasoning effort supported. 24 kHz symmetric.       |
| xAI      | `xai/grok-voice-latest`                | Per-minute billing. 24 kHz symmetric.               |
| Gemini   | `gemini/gemini-3.1-flash-live-preview` | **Asymmetric** sample rates: 16 kHz in, 24 kHz out. |

See [Per-provider notes](/build/realtime/protocol#per-provider-notes) for voice lists and quirks.
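
One way to keep the switch in a single place is a lookup keyed by provider. The sketch below copies the model ids and sample rates from the table above; the names `REALTIME_MODELS`, `sessionStart`, and `needsResampling` are hypothetical. It omits `voice` because voice lists differ per provider, and at runtime the rates reported in `session.started` should take precedence over these constants.

```typescript
interface RealtimeModel {
  id: string;
  inputHz: number;
  outputHz: number;
}

// Values copied from the provider table on this page.
const REALTIME_MODELS: Record<"openai" | "xai" | "gemini", RealtimeModel> = {
  openai: { id: "openai/gpt-realtime-2", inputHz: 24000, outputHz: 24000 },
  xai: { id: "xai/grok-voice-latest", inputHz: 24000, outputHz: 24000 },
  gemini: { id: "gemini/gemini-3.1-flash-live-preview", inputHz: 16000, outputHz: 24000 },
};

type Provider = keyof typeof REALTIME_MODELS;

// Build a session.start message for the chosen provider.
function sessionStart(provider: Provider, instructions: string): string {
  return JSON.stringify({
    type: "session.start",
    config: { model: REALTIME_MODELS[provider].id, instructions },
  });
}

// True when captured audio must be resampled before it is sent.
function needsResampling(provider: Provider, captureHz: number): boolean {
  return REALTIME_MODELS[provider].inputHz !== captureHz;
}
```

For example, audio captured at 24 kHz can go to OpenAI or xAI unchanged, while `needsResampling("gemini", 24000)` flags that Gemini expects 16 kHz input.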

## What's next

<CardGroup cols={2}>
  <Card title="Realtime protocol" icon="diagram-project" href="/build/realtime/protocol">
    Every config field, event, and per-provider note.
  </Card>

  <Card title="Models" icon="brain" href="https://opper.ai/models">
    The full list of supported realtime model IDs.
  </Card>

  <Card title="Mint endpoint" icon="ticket" href="/v3-api-reference/realtime/create-realtime-session">
    `POST /v3/realtime-sessions` request and response schema.
  </Card>

  <Card title="Cookbook example" icon="github" href="https://github.com/opper-ai/opper-cookbook/tree/main/examples/brainstorm-time">
    A complete browser voice app with microphone capture and tool calls.
  </Card>
</CardGroup>
