OCR

POST /v3/ocr turns a document into per-page markdown synchronously. It’s provider-agnostic: model selects the engine, the document source is normalized for you, and anything model-specific goes in parameters.

Pass a model and a document. The document takes one of four shapes, selected by its type field:

`document.type`	Fields	Use for
`document_url`	`document_url`	A PDF or image at an https URL
`image_url`	`image_url`	An image at an https URL
`base64`	`content`, `document_name`	Inline bytes you already hold
`file`	`file_id`	A `file_<id>` from Files

A file_id lets you OCR a document you already have on Opper without re-sending the bytes. Upload it to Files with purpose: ocr_input (PDFs and images), then reference it here. Files respect your project’s retention and storage quota.

curl -sX POST https://api.opper.ai/v3/ocr \
  -H "Authorization: Bearer $OPPER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral/mistral-ocr-latest",
    "document": { "type": "document_url", "document_url": "https://example.com/report.pdf" }
  }'

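import os

import requests

# json= serializes the body and sets Content-Type: application/json.
r = requests.post(
    "https://api.opper.ai/v3/ocr",
    headers={"Authorization": f"Bearer {os.environ['OPPER_API_KEY']}"},
    json={
        "model": "mistral/mistral-ocr-latest",
        "document": {"type": "document_url", "document_url": "https://example.com/report.pdf"},
    },
    timeout=60,  # the call is synchronous, so bound the wait (60 s is an arbitrary choice)
)
r.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
pages = r.json()["pages"]
print(pages[0]["markdown"])  # markdown of the first page (index 0)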

curl -sX POST https://api.opper.ai/v3/ocr \
  -H "Authorization: Bearer $OPPER_API_KEY" -H "Content-Type: application/json" \
  -d '{ "model": "docling/docling-latest", "document": { "type": "file", "file_id": "file_abc123" } }'
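
The four document shapes in the table above can be built with small helpers. The following is a minimal sketch, not part of any Opper SDK: the function names are hypothetical, and it assumes that content for the base64 shape is plain base64 text (no data: URI prefix), which the table does not spell out. The https check is likewise a local convenience based on the table's wording.

```python
import base64


def document_from_url(url: str, *, image: bool = False) -> dict:
    """Build a `document_url` (PDF or image) or `image_url` (image only) document."""
    if not url.startswith("https://"):
        # Local check only; the table above describes https URLs.
        raise ValueError("expected an https URL")
    key = "image_url" if image else "document_url"
    return {"type": key, key: url}


def document_from_bytes(data: bytes, name: str) -> dict:
    """Build a `base64` document from bytes already held in memory.

    Assumption: `content` is standard base64 text without a data: URI prefix.
    """
    return {
        "type": "base64",
        "content": base64.b64encode(data).decode("ascii"),
        "document_name": name,
    }


def document_from_file_id(file_id: str) -> dict:
    """Build a `file` document that references an upload in Files."""
    if not file_id.startswith("file_"):
        raise ValueError("expected an id of the form file_<id>")
    return {"type": "file", "file_id": file_id}


# Example: a request body for inline bytes (the bytes here are a stand-in).
payload = {
    "model": "mistral/mistral-ocr-latest",
    "document": document_from_bytes(b"hello", "report.pdf"),
}
```

Each helper returns only the fields listed for its type, so the result can be dropped straight into the document key of the request body.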

Response

{
  "id": "ocr_...",
  "model": "mistral/mistral-ocr-latest",
  "pages": [
    { "index": 0, "markdown": "# Report\n\nThe quarter...", "dimensions": { "width": 612, "height": 792 } }
  ],
  "usage": { "cost": 0.002, "pages_processed": 1 }
}

Field	What it does
`model`	Required. The OCR model, e.g. `mistral/mistral-ocr-latest` or `docling/docling-latest`.
`document`	Required. The source to read — a `document_url`, `image_url`, `base64` content, or a `file_id`.
`pages`	0-based page indices to process; omit it to process every page.
`include_image_base64`	Return images embedded in the document as base64.
`parameters`	Opaque per-provider passthrough — e.g. Docling’s `ocr_engine`, `lang`, `table_mode`, `do_formula_enrichment`, or Mistral OCR 4’s `include_blocks` and `confidence_scores_granularity` (see Structured blocks).
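
Because pages is 0-based, page numbers as a person would read them need shifting by one. The sketch below assembles a request body from the fields in the table; build_ocr_request is a hypothetical helper, and it assumes pages is sent as a JSON array of integers and include_image_base64 as a boolean.

```python
from typing import Iterable, Optional


def build_ocr_request(
    model: str,
    document: dict,
    *,
    page_numbers: Optional[Iterable[int]] = None,
    include_images: bool = False,
    parameters: Optional[dict] = None,
) -> dict:
    """Assemble a POST /v3/ocr body; optional fields are left out unless set."""
    body = {"model": model, "document": document}
    if page_numbers is not None:
        numbers = list(page_numbers)
        if any(n < 1 for n in numbers):
            raise ValueError("page numbers are 1-based and must be >= 1")
        # Convert 1-based page numbers to the 0-based indices the API expects,
        # dropping duplicates and sorting for a stable request.
        body["pages"] = sorted({n - 1 for n in numbers})
    if include_images:
        body["include_image_base64"] = True  # assumed to be a boolean flag
    if parameters:
        body["parameters"] = dict(parameters)  # passed through untouched
    return body


request_body = build_ocr_request(
    "docling/docling-latest",
    {"type": "file", "file_id": "file_abc123"},
    page_numbers=[1, 3],  # first and third page
)
```

Leaving page_numbers unset omits pages entirely, which per the table means every page is read.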

OCR is billed per page processed. The extracted markdown preserves structure — headings, tables, and lists.
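
To work with the result as one document, the per-page markdown can be joined in page order. This is a small sketch over the response shape shown above; combine_pages and cost_per_page are hypothetical helper names, and the sample dict simply mirrors the example response.

```python
def combine_pages(response: dict, separator: str = "\n\n") -> str:
    """Join per-page markdown into one string, ordered by page index."""
    pages = sorted(response["pages"], key=lambda page: page["index"])
    return separator.join(page["markdown"] for page in pages)


def cost_per_page(response: dict) -> float:
    """Average cost per processed page, read from the usage object."""
    usage = response["usage"]
    if usage["pages_processed"] == 0:
        return 0.0
    return usage["cost"] / usage["pages_processed"]


# Mirrors the example response above.
sample_response = {
    "id": "ocr_...",
    "model": "mistral/mistral-ocr-latest",
    "pages": [
        {
            "index": 0,
            "markdown": "# Report\n\nThe quarter...",
            "dimensions": {"width": 612, "height": 792},
        }
    ],
    "usage": {"cost": 0.002, "pages_processed": 1},
}

text = combine_pages(sample_response)
```

Sorting by index is defensive; the sample response already lists pages in order.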

Languages (Docling)

Docling accepts canonical ISO 639-1 language codes in parameters.lang (e.g. ["sv", "en"]) with parameters.ocr_engine set to tesseract or easyocr; Opper maps them to each engine’s own codes:

curl -sX POST https://api.opper.ai/v3/ocr \
  -H "Authorization: Bearer $OPPER_API_KEY" -H "Content-Type: application/json" \
  -d '{
    "model": "docling/docling-latest",
    "document": { "type": "file", "file_id": "file_abc123" },
    "parameters": { "ocr_engine": "tesseract", "lang": ["sv", "en"] }
  }'
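
The same parameters object can be assembled in Python with a light sanity check before sending. This is a sketch under assumptions: docling_language_parameters is a hypothetical helper, the regular expression only checks that a code looks like a two-letter ISO 639-1 code (it does not verify the code exists), and the engine check covers only the two engines named above.

```python
import re

# Two lowercase ASCII letters: the shape of an ISO 639-1 code, not a registry lookup.
_ISO_639_1_SHAPE = re.compile(r"^[a-z]{2}$")

# Engines the text above names for use with parameters.lang.
LANGUAGE_ENGINES = ("tesseract", "easyocr")


def docling_language_parameters(languages, ocr_engine: str = "tesseract") -> dict:
    """Build the `parameters` object for Docling language selection."""
    if ocr_engine not in LANGUAGE_ENGINES:
        raise ValueError(f"ocr_engine must be one of {LANGUAGE_ENGINES}")
    codes = []
    for language in languages:
        code = language.strip().lower()
        if not _ISO_639_1_SHAPE.match(code):
            raise ValueError(f"not a two-letter language code: {language!r}")
        if code not in codes:  # keep first occurrence, preserve order
            codes.append(code)
    if not codes:
        raise ValueError("at least one language code is required")
    return {"ocr_engine": ocr_engine, "lang": codes}


params = docling_language_parameters(["sv", "en"])
```

Catching a three-letter code such as "swe" locally avoids spending a request on input that does not match the documented format.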

Structured blocks (Mistral OCR 4)

mistral/mistral-ocr-4-0 (also reachable as mistral/mistral-ocr-latest) can return the page’s layout in reading order, not just markdown. Opt in through parameters:

`parameters` key	Effect
`include_blocks`	Return a `blocks` array per page — each block has a `type` (`title`, `table`, `equation`, `signature`, `text`, …), a bounding box (`top_left_x`, `top_left_y`, `bottom_right_x`, `bottom_right_y`), and its `content`.
`confidence_scores_granularity`	`"page"` for an aggregate page score, `"word"` for per-word scores. Adds a `confidence_scores` object to each page.

curl -sX POST https://api.opper.ai/v3/ocr \
  -H "Authorization: Bearer $OPPER_API_KEY" -H "Content-Type: application/json" \
  -d '{
    "model": "mistral/mistral-ocr-4-0",
    "document": { "type": "document_url", "document_url": "https://example.com/report.pdf" },
    "parameters": { "include_blocks": true, "confidence_scores_granularity": "page" }
  }'

Response (excerpt)

{
  "pages": [
    {
      "index": 0,
      "markdown": "# Report\n\n...",
      "blocks": [
        { "type": "title", "top_left_x": 64, "top_left_y": 41, "bottom_right_x": 166, "bottom_right_y": 56, "content": "Report" }
      ],
      "confidence_scores": { "average_page_confidence_score": 0.97, "minimum_page_confidence_score": 0.32 }
    }
  ]
}

blocks and confidence_scores are only present when requested. Earlier Mistral models (mistral/mistral-ocr-2512) and Docling ignore these keys and respond as before.
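
Since both keys are optional, client code should read them defensively. The sketch below filters blocks by type, derives a block's width and height from its bounding box, and flags pages whose minimum confidence falls under a threshold. The helper names and the 0.5 threshold are illustrative choices, and the confidence keys are the ones shown in the "page" granularity excerpt; the "word" granularity may use a different shape.

```python
def blocks_of_type(page: dict, block_type: str) -> list:
    """Return the page's blocks of one type; empty if blocks were not requested."""
    return [block for block in page.get("blocks", []) if block["type"] == block_type]


def block_size(block: dict) -> tuple:
    """Width and height of a block, in the same units as its bounding box."""
    width = block["bottom_right_x"] - block["top_left_x"]
    height = block["bottom_right_y"] - block["top_left_y"]
    return (width, height)


def pages_needing_review(response: dict, threshold: float = 0.5) -> list:
    """Indices of pages whose minimum page confidence is below `threshold`."""
    flagged = []
    for page in response["pages"]:
        scores = page.get("confidence_scores")  # absent unless requested
        if scores and scores["minimum_page_confidence_score"] < threshold:
            flagged.append(page["index"])
    return flagged


# Mirrors the response excerpt above.
excerpt = {
    "pages": [
        {
            "index": 0,
            "markdown": "# Report\n\n...",
            "blocks": [
                {
                    "type": "title",
                    "top_left_x": 64,
                    "top_left_y": 41,
                    "bottom_right_x": 166,
                    "bottom_right_y": 56,
                    "content": "Report",
                }
            ],
            "confidence_scores": {
                "average_page_confidence_score": 0.97,
                "minimum_page_confidence_score": 0.32,
            },
        }
    ]
}

titles = blocks_of_type(excerpt["pages"][0], "title")
```

Using dict.get keeps the same code working against Docling or earlier Mistral models, where these keys are absent.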

Discover models

GET /v3/ocr/models lists the OCR models available, each with its price_per_page:

curl -s "https://api.opper.ai/v3/ocr/models" \
  -H "Authorization: Bearer $OPPER_API_KEY"
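
With price_per_page in hand, a job's cost can be estimated before running it. The page does not show the response body of this endpoint, so the sketch below assumes the models have already been parsed into a list of dicts and that each entry carries an id field next to price_per_page. The id key, the helper names, and the sample entries (placeholder names and made-up prices) are all hypothetical.

```python
def cheapest_model(models: list) -> dict:
    """Pick the entry with the lowest price_per_page."""
    if not models:
        raise ValueError("no models to choose from")
    return min(models, key=lambda model: model["price_per_page"])


def estimate_cost(model: dict, page_count: int) -> float:
    """Estimated cost of a job, given that OCR is billed per page processed."""
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    return model["price_per_page"] * page_count


# Placeholder entries: names and prices are invented for illustration only.
example_models = [
    {"id": "example/model-a", "price_per_page": 0.004},
    {"id": "example/model-b", "price_per_page": 0.001},
]

choice = cheapest_model(example_models)
estimate = estimate_cost(choice, 250)
```

Price is only one criterion; features such as structured blocks are tied to specific models, so a real selection would filter on capability first.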

What’s next

Files

Upload once with purpose: ocr_input, reuse by file_id.

Vision & PDFs

Reason over a document with an LLM instead of extracting it.

Models

Which models do OCR.

Control Plane

Govern providers, regions, and spend on every call.
