POST /v3/ocr turns a document into per-page markdown synchronously. It’s provider-agnostic: model selects the engine, the document source is normalized for you, and anything model-specific goes in parameters. Pass a model and a document. The document is one of four shapes, selected by type:
| document.type | Fields | Use for |
|---|---|---|
| document_url | document_url | A PDF or image at an https URL |
| image_url | image_url | An image at an https URL |
| base64 | content, document_name | Inline bytes you already hold |
| file | file_id | A file_<id> from Files |
A file_id lets you OCR a document you already have on Opper without re-sending the bytes. Upload it to Files with purpose: ocr_input (PDFs and images), then reference it here. Files respect your project’s retention and storage quota.
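The four document shapes can be sketched as small Python helpers that return the `document` object for a request body. The helper names are hypothetical, and the base64 variant assumes that `content` carries plain base64 text of the raw bytes (no data-URI prefix), which the description above does not spell out.

```python
import base64


def document_from_url(url: str) -> dict:
    """A PDF or image at an https URL."""
    return {"type": "document_url", "document_url": url}


def document_from_image_url(url: str) -> dict:
    """An image at an https URL."""
    return {"type": "image_url", "image_url": url}


def document_from_bytes(data: bytes, name: str) -> dict:
    """Inline bytes you already hold.

    Assumption: `content` is standard base64 text without a data-URI prefix.
    """
    return {
        "type": "base64",
        "content": base64.b64encode(data).decode("ascii"),
        "document_name": name,
    }


def document_from_file_id(file_id: str) -> dict:
    """A file previously uploaded to Files with purpose: ocr_input."""
    return {"type": "file", "file_id": file_id}
```

Each helper returns a plain dictionary, so the result can be dropped into the `document` field of the request body and serialized with `json.dumps`.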
curl -sX POST https://api.opper.ai/v3/ocr \
  -H "Authorization: Bearer $OPPER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral/mistral-ocr-latest",
    "document": { "type": "document_url", "document_url": "https://example.com/report.pdf" }
  }'
Response
{
  "id": "ocr_...",
  "model": "mistral/mistral-ocr-latest",
  "pages": [
    { "index": 0, "markdown": "# Report\n\nThe quarter...", "dimensions": { "width": 612, "height": 792 } }
  ],
  "usage": { "cost": 0.002, "pages_processed": 1 }
}
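Because the result arrives as per-page markdown, a common follow-up is to stitch the pages back into one document. The sketch below reads only the fields shown in the sample response (`pages[].index`, `pages[].markdown`, `usage`); the function name and the choice to sort by `index` before joining are assumptions for illustration.

```python
import json

# The sample response from above, kept as raw JSON text.
SAMPLE_RESPONSE = r'''
{
  "id": "ocr_...",
  "model": "mistral/mistral-ocr-latest",
  "pages": [
    { "index": 0, "markdown": "# Report\n\nThe quarter...", "dimensions": { "width": 612, "height": 792 } }
  ],
  "usage": { "cost": 0.002, "pages_processed": 1 }
}
'''


def join_pages(response: dict, separator: str = "\n\n") -> str:
    """Concatenate page markdown in ascending page-index order."""
    pages = sorted(response.get("pages", []), key=lambda page: page["index"])
    return separator.join(page["markdown"] for page in pages)


response = json.loads(SAMPLE_RESPONSE)
full_markdown = join_pages(response)
pages_processed = response["usage"]["pages_processed"]
```

Sorting by `index` is defensive: it keeps the output in document order even if only a subset of pages was requested.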
| Field | What it does |
|---|---|
| model | Required. The OCR model, e.g. mistral/mistral-ocr-latest or docling/docling-latest. |
| document | Required. The source to read: a document_url, image_url, base64 content, or a file_id. |
| pages | 0-based page indices to process; if omitted, every page is read. |
| include_image_base64 | Return images embedded in the document as base64. |
| parameters | Opaque per-provider passthrough, e.g. Docling’s ocr_engine, lang, table_mode, do_formula_enrichment. |
OCR is billed per page processed. The extracted markdown preserves structure — headings, tables, and lists.
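The request fields can be assembled in Python with the standard library alone. This is a minimal sketch rather than an official client: `build_ocr_request` is a hypothetical helper, optional fields are left out of the body when not supplied, and treating `include_image_base64` as a boolean flag is an assumption based on its description.

```python
import json
import urllib.request

OCR_URL = "https://api.opper.ai/v3/ocr"


def build_ocr_request(api_key, model, document, pages=None,
                      include_image_base64=None, parameters=None):
    """Build (but do not send) a POST /v3/ocr request."""
    body = {"model": model, "document": document}
    if pages is not None:
        body["pages"] = list(pages)  # 0-based page indices
    if include_image_base64 is not None:
        # Assumption: the field is a boolean flag.
        body["include_image_base64"] = bool(include_image_base64)
    if parameters:
        body["parameters"] = parameters  # per-provider passthrough
    return urllib.request.Request(
        OCR_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the response body with `json.load`. Building the request separately from sending it keeps the payload easy to inspect and test.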

Languages (Docling)

Docling accepts canonical ISO 639-1 language codes in parameters.lang (e.g. ["sv", "en"]) with parameters.ocr_engine set to tesseract or easyocr; Opper maps them to each engine’s own codes:
curl -sX POST https://api.opper.ai/v3/ocr \
  -H "Authorization: Bearer $OPPER_API_KEY" -H "Content-Type: application/json" \
  -d '{
    "model": "docling/docling-latest",
    "document": { "type": "file", "file_id": "file_abc123" },
    "parameters": { "ocr_engine": "tesseract", "lang": ["sv", "en"] }
  }'
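Since Opper expects ISO 639-1 codes here rather than engine-specific ones, a small client-side check can catch a code such as `swe` before the request is sent. The helper below is a hypothetical convenience, not part of the API: it only checks that each code is two lowercase letters and that the engine is one of the two named above, and it does not verify that a code is a registered language.

```python
import re

# Engines that the text above pairs with parameters.lang.
LANG_ENGINES = {"tesseract", "easyocr"}
_TWO_LETTER = re.compile(r"[a-z]{2}")


def docling_lang_parameters(ocr_engine, langs):
    """Return a `parameters` dict after a shallow shape check."""
    if ocr_engine not in LANG_ENGINES:
        raise ValueError(f"unsupported ocr_engine for lang: {ocr_engine!r}")
    codes = [code.strip().lower() for code in langs]
    bad = [code for code in codes if not _TWO_LETTER.fullmatch(code)]
    if bad:
        raise ValueError(f"not two-letter ISO 639-1 codes: {bad}")
    return {"ocr_engine": ocr_engine, "lang": codes}
```

The returned dictionary is meant to be used as the `parameters` field of a request with a Docling model.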

Discover models

GET /v3/ocr/models lists the OCR models available, each with its price_per_page:
curl -s "https://api.opper.ai/v3/ocr/models" \
  -H "Authorization: Bearer $OPPER_API_KEY"
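Because OCR is billed per page processed, a model's `price_per_page` gives a rough cost estimate before a call. The sketch below assumes that cost is simply `price_per_page` times the number of pages read, and that duplicate indices in `pages` are counted once; neither detail is confirmed here, and the authoritative figure is the `usage.cost` value in the response. The function name and arguments are hypothetical.

```python
from decimal import Decimal


def estimate_cost(price_per_page, total_pages, pages=None):
    """Estimate the cost of one OCR call.

    price_per_page: value reported for the model (string or number).
    total_pages:    page count of the document.
    pages:          optional 0-based indices, as in the request's `pages` field.
    """
    # Assumption: omitting `pages` reads every page; duplicates count once.
    count = total_pages if pages is None else len(set(pages))
    # Decimal avoids binary floating-point rounding in money arithmetic.
    return Decimal(str(price_per_page)) * count
```

For example, `estimate_cost("0.002", 12)` evaluates to `Decimal("0.024")` under these assumptions.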

What’s next

Files

Upload once with purpose: ocr_input, reuse by file_id.

Vision & PDFs

Reason over a document with an LLM instead of extracting it.

Models

Which models do OCR.

Control Plane

Govern providers, regions, and spend on every call.