POST /v3/ocr turns a document into per-page markdown synchronously. It’s provider-agnostic: model selects the engine, the document source is normalized for you, and anything model-specific goes in parameters.
Pass a model and a document. The document is one of four shapes, selected by type:
| document.type | Fields | Use for |
|---|---|---|
| document_url | document_url | A PDF or image at an https URL |
| image_url | image_url | An image at an https URL |
| base64 | content, document_name | Inline bytes you already hold |
| file | file_id | A file_<id> from Files |
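As a rough sketch, the four shapes in the table can be built with small Python helpers. The field names come from the table. The helper names are hypothetical, and the sketch assumes that type sits alongside the other fields inside document and that content carries standard base64 text.

```python
import base64


def from_document_url(url: str) -> dict:
    """A PDF or image reachable at an https URL."""
    return {"type": "document_url", "document_url": url}


def from_image_url(url: str) -> dict:
    """An image reachable at an https URL."""
    return {"type": "image_url", "image_url": url}


def from_bytes(data: bytes, document_name: str) -> dict:
    """Inline bytes you already hold.

    Assumption: content is the raw bytes encoded as standard base64 text.
    """
    return {
        "type": "base64",
        "content": base64.b64encode(data).decode("ascii"),
        "document_name": document_name,
    }


def from_file_id(file_id: str) -> dict:
    """A file_<id> previously uploaded to Files."""
    return {"type": "file", "file_id": file_id}
```

Each helper returns a plain dict, so the result can be dropped straight into the document field of a request body.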
A file_id lets you OCR a document you already have on Opper without re-sending the bytes. Upload it to Files with purpose: ocr_input (PDFs and images), then reference it here. Files respect your project’s retention and storage quota.
Response
| Field | What it does |
|---|---|
| model | Required. The OCR model, e.g. mistral/mistral-ocr-latest or docling/docling-latest. |
| document | Required. The source to read: a document_url, image_url, base64 content, or a file_id. |
| pages | 0-based page indices to process; omit to read every page. |
| include_image_base64 | Return images embedded in the document as base64. |
| parameters | Opaque per-provider passthrough, e.g. Docling’s ocr_engine, lang, table_mode, do_formula_enrichment. |
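A minimal sketch of assembling these fields into a request, using only the Python standard library. The field names and the /v3/ocr path are from this page. The base URL, the Bearer authorization header, and the function names are assumptions to replace with your own values. The response is returned as parsed JSON without assuming its shape.

```python
import json
import urllib.request

# Assumption: substitute the base URL of your Opper deployment.
BASE_URL = "https://api.opper.ai"


def build_ocr_body(model, document, pages=None,
                   include_image_base64=False, parameters=None):
    """Assemble the JSON body; optional fields are left out unless set."""
    body = {"model": model, "document": document}
    if pages is not None:
        body["pages"] = list(pages)  # 0-based page indices
    if include_image_base64:
        body["include_image_base64"] = True
    if parameters:
        body["parameters"] = dict(parameters)  # passed through to the provider
    return body


def build_ocr_request(api_key, body):
    """Wrap the body in a POST request (Bearer auth is an assumption)."""
    return urllib.request.Request(
        BASE_URL + "/v3/ocr",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_ocr(api_key, body, timeout=120):
    """Send the request and return the parsed JSON response as-is."""
    request = build_ocr_request(api_key, body)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

For example, build_ocr_body("mistral/mistral-ocr-latest", {"type": "file", "file_id": "file_123"}, pages=[0, 2]) produces a body that reads only the first and third pages; file_123 is a placeholder id.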
Languages (Docling)
Docling accepts canonical ISO 639-1 language codes in parameters.lang (e.g. ["sv", "en"]) with parameters.ocr_engine set to tesseract or easyocr; Opper maps them to each engine’s own codes.
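A small sketch of building that parameters object on the client. The ocr_engine and lang keys and the two engine names are from this page. The helper name and the client-side checks are hypothetical conveniences, and the translation to engine-specific codes still happens on Opper’s side.

```python
SUPPORTED_ENGINES = {"tesseract", "easyocr"}


def docling_lang_parameters(ocr_engine, langs):
    """Build Docling parameters from two-letter ISO 639-1 codes.

    The checks here are a local sanity test only; they do not verify that
    a code is a registered ISO 639-1 language.
    """
    if ocr_engine not in SUPPORTED_ENGINES:
        raise ValueError(f"unsupported ocr_engine: {ocr_engine!r}")
    codes = [code.strip().lower() for code in langs]
    bad = [c for c in codes if len(c) != 2 or not (c.isascii() and c.isalpha())]
    if bad:
        raise ValueError(f"expected two-letter ISO 639-1 codes, got: {bad}")
    return {"ocr_engine": ocr_engine, "lang": codes}
```

The returned dict goes in the request’s parameters field alongside a Docling model such as docling/docling-latest.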
Discover models
GET /v3/ocr/models lists the OCR models available, each with its price_per_page:
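A hedged sketch of fetching that list and using price_per_page to compare models. Only the path and the price_per_page field are from this page. The base URL, the Bearer header, and the function names are assumptions. The helpers expect a list of model entries, so if the response wraps the list in an object, extract it first.

```python
import json
import urllib.request

# Assumption: substitute the base URL of your Opper deployment.
BASE_URL = "https://api.opper.ai"


def list_ocr_models(api_key, timeout=30):
    """GET /v3/ocr/models and return the parsed JSON as-is."""
    request = urllib.request.Request(
        BASE_URL + "/v3/ocr/models",
        headers={"Authorization": f"Bearer {api_key}"},  # assumed auth scheme
        method="GET",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def estimate_cost(model_entry, page_count):
    """Cost of reading page_count pages with one model entry."""
    return model_entry["price_per_page"] * page_count


def cheapest(models):
    """Pick the entry with the lowest price_per_page from a list of entries."""
    return min(models, key=lambda entry: entry["price_per_page"])
```

Combined with the pages field, estimate_cost gives a quick upper bound on spend before you send a large document.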
What’s next
Files
Upload once with purpose: ocr_input, reuse by file_id.
Vision & PDFs
Reason over a document with an LLM instead of extracting it.
Models
Which models do OCR.
Control Plane
Govern providers, regions, and spend on every call.