Web search - Opper

Server-side web search comes in two flavors at Opper. Provider-native shapes are forwarded verbatim to the routed provider: web_search_20250305 on Anthropic, {type:"web_search"} on OpenAI Responses, {googleSearch:{}} on Google. The opper:web_search tool documented here is the canonical, portable cross-provider shape. It gives you one tool entry, identical response artifacts, and no per-provider branching in your code.

Pick the canonical shape when you want a single tool entry that works regardless of which model the request lands on. Pick a native shape when you want the model’s provider-specific search behavior and are happy to author per-provider request bodies.

Quick start

curl https://api.opper.ai/v3/compat/v1/messages \
  -H "Authorization: Bearer $OPPER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "What were the top tech headlines this week?"}],
    "tools": [{
      "type": "opper:web_search",
      "freshness": "w",
      "max_uses": 3
    }]
  }'

curl https://api.opper.ai/v3/compat/chat/completions \
  -H "Authorization: Bearer $OPPER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5.1",
    "messages": [{"role": "user", "content": "What were the top tech headlines this week?"}],
    "tools": [{
      "type": "opper:web_search",
      "freshness": "w",
      "max_uses": 3
    }]
  }'

curl https://api.opper.ai/v3/compat/responses \
  -H "Authorization: Bearer $OPPER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5.1",
    "input": "What were the top tech headlines this week?",
    "tools": [{
      "type": "opper:web_search",
      "freshness": "w",
      "max_uses": 3
    }]
  }'

curl "https://api.opper.ai/v3/compat/v1beta/models/gemini/gemini-2.5-flash:generateContent" \
  -H "Authorization: Bearer $OPPER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"role": "user", "parts": [{"text": "What were the top tech headlines this week?"}]}],
    "tools": [{
      "type": "opper:web_search",
      "freshness": "w",
      "max_uses": 3
    }]
  }'
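The four requests above (Anthropic Messages, Chat Completions, Responses, and Gemini generateContent) share one tool entry; only the URL and the prompt wrapper differ. A minimal Python sketch of that idea, using only the standard library, might look like the following. The `build_request` helper and the `ENDPOINTS` table are illustrative names, not part of any Opper SDK; the URLs, models, and bodies are copied from the curl examples.

```python
import json
import os
import urllib.request

BASE = "https://api.opper.ai/v3/compat"
PROMPT = "What were the top tech headlines this week?"

# One tool entry, reused unchanged on every endpoint.
WEB_SEARCH_TOOL = {"type": "opper:web_search", "freshness": "w", "max_uses": 3}

# Hypothetical lookup table: endpoint kind -> (path, request body without tools).
ENDPOINTS = {
    "messages": ("/v1/messages", {
        "model": "anthropic/claude-sonnet-4-6",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": PROMPT}],
    }),
    "chat": ("/chat/completions", {
        "model": "openai/gpt-5.1",
        "messages": [{"role": "user", "content": PROMPT}],
    }),
    "responses": ("/responses", {
        "model": "openai/gpt-5.1",
        "input": PROMPT,
    }),
    "gemini": ("/v1beta/models/gemini/gemini-2.5-flash:generateContent", {
        "contents": [{"role": "user", "parts": [{"text": PROMPT}]}],
    }),
}


def build_request(kind, api_key=None):
    """Build (but do not send) a POST request carrying the shared tool entry."""
    path, body = ENDPOINTS[kind]
    payload = dict(body, tools=[WEB_SEARCH_TOOL])
    key = api_key or os.environ.get("OPPER_API_KEY", "")
    return urllib.request.Request(
        BASE + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected without network access.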

Engine selection

The engine field controls how Opper routes the search.

Value	Behavior
`auto` (default)	Use the model’s native server tool if the routed model+endpoint supports one. Fall back to Opper’s engine otherwise.
`native`	Require native. Returns `400` if the routed model has no native web search on the called endpoint.
`opper`	Always use Opper’s engine, regardless of native availability. Opper picks the backend. Useful when you want a uniform response shape across every model.
`jina`	Pin Opper’s engine to the Jina backend.
`exa`	Pin Opper’s engine to the Exa backend. Exa supports a true publish-date `freshness` window.

jina and exa are backend pins within Opper’s engine: like opper, they always run the server-side search (never native), but they force a specific backend. Availability depends on server-side configuration; pinning a backend that isn’t configured returns a tool-call error.

Today, native web search is available on:

Anthropic Claude on /v3/compat/v1/messages (both direct anthropic/ and Vertex gcp/claude-* routes)
OpenAI on /v3/compat/responses for Responses-API models (gpt-5* family, gpt-5-search-api, etc.)
Google Gemini on /v3/compat/v1beta/models/{model}:generateContent

engine: "auto" will route to native on those combinations and to Opper’s engine for everything else (Mistral, DeepSeek, Alibaba Qwen, Groq, etc.).
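The routing rules above can be mirrored client-side to predict which route a request will take. This is a rough sketch of the documented behavior, not Opper's implementation: `has_native_search` and `resolve_engine` are hypothetical helpers, and the model-prefix checks (`anthropic/`, `gcp/claude-`, `openai/gpt-5`, `gemini/`) are assumptions derived from the list and the quick start examples.

```python
OPPER_ENGINES = {"opper", "jina", "exa"}


def has_native_search(model: str, endpoint: str) -> bool:
    """Approximate the documented model+endpoint combinations (prefixes are assumed)."""
    if endpoint == "/v3/compat/v1/messages":
        return model.startswith("anthropic/") or model.startswith("gcp/claude-")
    if endpoint == "/v3/compat/responses":
        return model.startswith("openai/gpt-5")
    if endpoint.startswith("/v3/compat/v1beta/models/") and endpoint.endswith(":generateContent"):
        return model.startswith("gemini/")
    return False


def resolve_engine(engine: str, model: str, endpoint: str) -> str:
    """Return the route a request would take under the documented rules."""
    if engine in OPPER_ENGINES:
        return engine  # always Opper's engine, never native
    native = has_native_search(model, endpoint)
    if engine == "native":
        if not native:
            # The API is documented to answer 400 in this case.
            raise ValueError("400: no native web search for this model on this endpoint")
        return "native"
    if engine == "auto":
        return "native" if native else "opper"
    raise ValueError(f"unknown engine: {engine!r}")
```

For example, `resolve_engine("auto", "openai/gpt-5.1", "/v3/compat/chat/completions")` yields `"opper"` in this sketch, because native OpenAI search is listed only for the Responses endpoint.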

Parameters

Parameter	Type	Description
`type`	string	Required. Must be `"opper:web_search"`.
`engine`	string	`auto` \| `native` \| `opper` \| `jina` \| `exa`. Defaults to `auto`.
`freshness`	string	Recency filter. `d` (24h), `w` (7d), `m` (30d), `y` (1y). Empty = no filter.
`max_results`	int	Results per individual search. 0 uses the engine default.
`max_total_results`	int	Cumulative cap across every search in the request. 0 disables.
`max_uses`	int	Maximum number of searches the model may run in this request.
`search_context_size`	string	Per-result snippet length in characters. `very_low` (~1k) \| `low` (~5k) \| `medium` (~10k, default) \| `high` (~30k) \| `full` (full extracted page, capped ~50k).
`allowed_domains`	string[]	Restrict results to these domains.
`excluded_domains`	string[]	Exclude results from these domains.
`country`	string	Two-letter ISO-3166 alpha-2 code (e.g. `"se"`, `"us"`) to bias results regionally.
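Because only `type` is required and most other fields accept a small fixed set of values, a client-side check can catch typos before the request is sent. The sketch below is one way to do that; `web_search_tool` is a hypothetical helper, and its validation rules are read off the table above rather than taken from the OpenAPI spec, which remains authoritative.

```python
ENGINES = {"auto", "native", "opper", "jina", "exa"}
FRESHNESS = {"", "d", "w", "m", "y"}  # empty string means no filter
CONTEXT_SIZES = {"very_low", "low", "medium", "high", "full"}
INT_FIELDS = ("max_results", "max_total_results", "max_uses")
LIST_FIELDS = ("allowed_domains", "excluded_domains")


def web_search_tool(**options):
    """Return an opper:web_search tool entry after basic client-side checks."""
    known = {"engine", "freshness", "search_context_size", "country",
             *INT_FIELDS, *LIST_FIELDS}
    unknown = set(options) - known
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    if options.get("engine", "auto") not in ENGINES:
        raise ValueError("engine must be one of " + ", ".join(sorted(ENGINES)))
    if options.get("freshness", "") not in FRESHNESS:
        raise ValueError("freshness must be d, w, m, y, or empty")
    if options.get("search_context_size", "medium") not in CONTEXT_SIZES:
        raise ValueError("invalid search_context_size")
    country = options.get("country")
    if country is not None and not (len(country) == 2 and country.isalpha()):
        raise ValueError("country must be a two-letter ISO-3166 alpha-2 code")
    for name in INT_FIELDS:
        value = options.get(name, 0)
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer")
    for name in LIST_FIELDS:
        if not all(isinstance(d, str) for d in options.get(name, [])):
            raise ValueError(f"{name} must be a list of strings")
    return {"type": "opper:web_search", **options}
```

Calling `web_search_tool(freshness="w", max_uses=3)` reproduces the tool entry used in the quick start.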

Citations and where they land

The response shape matches whatever endpoint you called, so your code reads citations from the same place regardless of engine.

Anthropic-shape: content[] contains server_tool_use, web_search_tool_result (URLs + snippets), and text blocks whose citations[] (web_search_result_location) reference the sources. See server-side tools for the field-by-field shape.
OpenAI-shape (Responses): output[] contains web_search_call items with the search queries and a message whose content[].output_text.annotations[] carry url_citation entries.
OpenAI-shape (Chat Completions): choices[0].message.annotations[] carry the url_citation entries, mirroring the Responses shape.
Google-shape: candidates[0].groundingMetadata carries webSearchQueries, groundingChunks[].web (URI + title), and groundingSupports.

On the native route (engine: "auto" landing on a model with a native server tool, or engine: "native"), citations are anchored per statement: each one points at the exact span of prose the model grounded on, exactly as the provider emits them.

On Opper’s engine route (engine: "opper", "jina", or "exa", or engine: "auto" falling back to Opper’s engine), citations are block-level: every source the search returned is attached to the answer text. You get the same fields in the same place, but without per-sentence character offsets, and the Google searchEntryPoint (provider-hosted “Search Suggestions” HTML) is not emitted.
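To illustrate where the citations sit in each shape, here is a small sketch that collects source URLs from a parsed response. The containers (`content[].citations[]`, `output[]...annotations[]`, `choices[0].message.annotations[]`, `groundingChunks[].web`) follow the list above; the leaf keys `url` and `uri`, and the nested `url_citation` object handled for Chat Completions, are assumptions based on the providers' usual formats and should be checked against a real response.

```python
def citation_urls(response: dict) -> list:
    """Collect cited URLs from any of the four response shapes, in order, deduplicated."""
    urls = []

    # Anthropic-shape: text blocks carry citations[].
    for block in response.get("content", []):
        if block.get("type") == "text":
            for cite in block.get("citations") or []:
                if cite.get("type") == "web_search_result_location":
                    urls.append(cite.get("url"))

    # OpenAI Responses shape: message items carry annotations[] on their content parts.
    for item in response.get("output", []):
        if item.get("type") == "message":
            for part in item.get("content", []):
                for ann in part.get("annotations") or []:
                    if ann.get("type") == "url_citation":
                        urls.append(ann.get("url"))

    # OpenAI Chat Completions shape: annotations[] on the first choice's message.
    for choice in response.get("choices", [])[:1]:
        for ann in choice.get("message", {}).get("annotations") or []:
            if ann.get("type") == "url_citation":
                # Accept either a flat url or a nested url_citation object (assumed).
                urls.append(ann.get("url") or ann.get("url_citation", {}).get("url"))

    # Google-shape: groundingChunks[].web on the first candidate.
    for cand in response.get("candidates", [])[:1]:
        chunks = cand.get("groundingMetadata", {}).get("groundingChunks", [])
        for chunk in chunks:
            web = chunk.get("web")
            if web:
                urls.append(web.get("uri"))

    # Drop missing values and duplicates while keeping first-seen order.
    return list(dict.fromkeys(u for u in urls if u))
```

The same function works on either route, since block-level citations from Opper’s engine use the same fields and only lack the per-sentence offsets.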

Cost and usage

Every response includes a tool-cost breakdown under usage.opper.cost.tools.web_search:

"usage": {
  "input_tokens": 2224,
  "output_tokens": 511,
  "server_tool_use": { "web_search_requests": 3 },
  "opper": {
    "cost": {
      "tokens": 0.0255,
      "tools": {
        "total": 0.03,
        "web_search": { "count": 3, "cost": 0.03, "unit": 0.01 }
      },
      "total": 0.0555
    }
  }
}

count is the number of searches the model actually ran, unit is the per-search cost, and cost is the total for those searches. engine is also surfaced when the request routed through Opper’s engine.
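The numbers in the sample are internally consistent: 3 searches at 0.01 each give 0.03, and 0.0255 in tokens plus 0.03 in tools gives 0.0555. A short sketch for reading the breakdown and checking those two relationships follows. `check_web_search_cost` is a hypothetical helper, and whether both relationships hold for every response (for instance when other tools contribute to `tools.total`) is an assumption.

```python
import math


def check_web_search_cost(usage: dict) -> dict:
    """Read the web_search cost breakdown and compare it with the totals."""
    cost = usage["opper"]["cost"]
    search = cost["tools"]["web_search"]
    return {
        "searches": search["count"],
        "search_cost": search["cost"],
        # count * unit should reproduce the web_search cost.
        "per_search_matches": math.isclose(search["count"] * search["unit"], search["cost"]),
        # tokens + tools.total should reproduce the overall total.
        "total_matches": math.isclose(cost["tokens"] + cost["tools"]["total"], cost["total"]),
    }


# The usage object from the example above.
sample_usage = {
    "input_tokens": 2224,
    "output_tokens": 511,
    "server_tool_use": {"web_search_requests": 3},
    "opper": {
        "cost": {
            "tokens": 0.0255,
            "tools": {
                "total": 0.03,
                "web_search": {"count": 3, "cost": 0.03, "unit": 0.01},
            },
            "total": 0.0555,
        }
    },
}
```

Floating-point sums are compared with `math.isclose` rather than `==`, since values such as 0.0255 + 0.03 are not guaranteed to be bit-identical to 0.0555.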

Pricing

Opper’s engine is billed per search. Native searches (engine: "auto" landing on a model with a native tool, or engine: "native") are billed at the routed provider’s rate and passed through — the exact cost still shows up under usage.opper.cost.tools.web_search.

Engine	Search index	Data residency	Price
`opper` / `jina`	Jina	EU — search processing stays in-region	$0.01 / search
`exa`	Exa	Global (US-based)	$0.007 / search
`native` → Anthropic	Brave ¹	Provider infrastructure ²	provider rate, ≈ $10 / 1,000 searches — see Anthropic
`native` → OpenAI	Bing ¹	Provider infrastructure ²	provider rate, ≈ $10 / 1,000 calls — see OpenAI
`native` → Gemini	Google Search	Provider infrastructure ²	provider rate, ≈ $14 / 1,000 queries (Gemini 3) — see Google

¹ The underlying index for the native providers is as publicly reported, not officially confirmed by the provider. Native provider prices change independently of Opper; the linked pages are authoritative. More Opper engine backends will be added over time.

² Data residency reflects where the search runs. Native searches execute on the provider’s own infrastructure: Opper doesn’t control their region, and only the Jina backend offers EU residency. This is separate from result localization: use the country parameter (above) to bias results toward a geography, which does not change where the search runs.
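Since Opper’s engine is billed per search and max_uses caps the number of searches in a request, the worst-case search cost of a request is simply the unit price times max_uses. A small sketch of that arithmetic, with the prices copied from the table above (they may change, and native rates are deliberately left out because they are set by the provider); `max_search_cost` is a hypothetical helper:

```python
# Per-search prices in USD for Opper's engine, as listed in the pricing table.
OPPER_ENGINE_PRICE_USD = {"opper": 0.01, "jina": 0.01, "exa": 0.007}


def max_search_cost(engine: str, max_uses: int) -> float:
    """Upper bound on search cost for one request: unit price times max_uses."""
    if engine not in OPPER_ENGINE_PRICE_USD:
        raise ValueError("native searches are billed at the provider's rate")
    return OPPER_ENGINE_PRICE_USD[engine] * max_uses
```

With max_uses set to 3, the bound is about $0.03 on the Jina backend and about $0.021 on Exa; the actual charge depends on how many searches the model ran and is reported in the usage block.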

In the trace

Each server-side search appears in the trace view as a server_side_tool step nested under the turn that requested it, showing the query, the results it returned, and the per-search cost — so the search is never a black box.

Full example

The quick start keeps the tool entry minimal. This example sets every parameter on one request so you can see them in one place. Only type is required; everything else is optional and falls back to the engine or handler default.

curl https://api.opper.ai/v3/compat/v1/messages \
  -H "Authorization: Bearer $OPPER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4-5",
    "max_tokens": 1024,
    "messages": [{ "role": "user", "content": "What shipped in the latest Go release?" }],
    "tools": [
      {
        "type": "opper:web_search",
        "engine": "auto",
        "freshness": "m",
        "max_results": 5,
        "max_total_results": 20,
        "max_uses": 3,
        "search_context_size": "medium",
        "allowed_domains": ["go.dev"],
        "excluded_domains": ["reddit.com"],
        "country": "us"
      }
    ]
  }'

The OpenAPI spec (CanonicalWebSearchTool schema) is the authoritative definition of every field, its type, and its default.
