Skip to main content
The AI Gateway is the request path. Your app sends a call, the gateway picks a model, runs it, and returns the result. One endpoint, one API key, 300+ models behind it, all hosted in the EU. It replaces wiring up each provider yourself, so there are no per-provider SDKs, keys, billing, or rate limits to manage. To swap a model, change a string.

What the gateway gives you

One API, every model

Reach OpenAI, Anthropic, Google, Mistral, and 300+ more through a single endpoint and key.

Every modality

Not just text — generate images, speech, and video, transcribe audio, and run realtime voice.

Drop-in SDKs

Keep the OpenAI, Anthropic, or Google AI SDK you already use. Change the base URL, nothing else.

Switch models without code

Set a default model per project with a Route rule. Change it from the platform, no deploy.

Bring your own keys

Register your own provider deployments and API keys. They work like any other model.

EU by default

Restrict calls to EU-only or zero-retention providers. Enforced at the gateway.

One bill, every provider

Every response carries its cost. Spend across all providers in one place.

Observability built in

Every call is metered: cost, latency, and tokens. Turn on retention for full traces.

One API, every model

Address any model as provider/model. The same call works whether you point it at OpenAI, Anthropic, or a model hosted in the EU.
openai/gpt-5            # OpenAI
anthropic/claude-opus-4-7   # Anthropic
mistral/mistral-large   # Mistral, EU-hosted
Some models are served by more than one provider (for example the same Claude model via Anthropic, Azure, or Bedrock). Opper balances across them for you, so a single name keeps working even when one provider is busy. To call a model, point your SDK at https://api.opper.ai/v3/compat — see Drop-in SDKs. The same endpoint covers text, structured output, tool calling, and multimodal input, with the same routing, governance, and tracing on every call. The gateway isn’t text-only. Dedicated endpoints generate images (POST /v3/images), speech and transcripts (POST /v3/audio/*), and video (POST /v3/videos), and realtime voice runs over a WebSocket — all behind the same key and governance. See Multimodality.

Pin and switch models without touching code

Keep model names out of your application. Set a default once and change it from the platform:
  • Route pins a default model per organization or project. Edit one rule to move a project to a different model.
  • Custom models and aliases let you register your own deployments and point a stable name (like production/main) at whatever model you choose.
Callers can still pass an explicit model for per-call trade-offs, as long as Comply allows it.

Bring your own keys

You can go beyond Opper’s hosted models. Register your own provider deployments and API keys (for example a private Azure OpenAI deployment), and they show up in the catalog alongside built-in models with the same routing, governance, and tracing.
curl -X POST https://api.opper.ai/v2/custom-models \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPPER_API_KEY" \
  -d '{
    "name": "example/my-gpt",
    "provider": "azure",
    "model_id": "gpt-production",
    "api_key": "your-provider-key",
    "params": {"api_base": "https://your-deployment.openai.azure.com/", "api_version": "2024-06-01"}
  }'
Call it like any other model: example/my-gpt. See Custom models for the full setup.

Keep data in the EU

The gateway is hosted exclusively in the EU. The only thing that leaves is the model call itself, and you control where that goes. With a Comply rule you can restrict every call to:
  • EU-only providers (Mistral, Azure EU, and others)
  • Specific regions or countries
  • Zero-retention providers for the strictest workloads
If you need to store nothing at all, a Zero Data Retention rule keeps request and response content off disk entirely. Only the usage counters needed for billing are kept. Anything that breaks a rule is rejected at the gateway before it reaches a provider. See Security for more.

One bill, full visibility

Every response includes its cost, computed the same way across all 300+ models. The gateway aggregates spend and usage across providers, so you get one bill and one place to watch it. Every call is metered (cost, latency, tokens) and shows up in Analytics for spend over time. Turn on a retention rule and you also get the full trace for each call, with inputs, outputs, and every step.

How it fits with the control plane

The gateway runs your requests. The Control Plane sets the rules it follows.
AI GatewayControl Plane
RoleRuns every requestSets the rules
You use it byMaking callsWriting rules in the platform
ExamplesModel access, routing, EU enforcement, costObserve, Route, Guard, Comply, Steer
In short, the Control Plane is where you configure, and the Gateway enforces those rules on every call.

Security

The gateway also enforces data residency. The next page covers how Opper is hosted and protected.

Security

EU hosting, sub-processors, encryption, and deletion.

Models

The full catalog. EU-hosted models marked.

Build on Opper

Ready to build? Text, multimodal, and voice.

Drop-in SDKs

Use the OpenAI, Anthropic, or Google AI SDK you already have.