POST /ocr
Python
import requests

# Process a PDF document from URL
response = requests.post(
    "https://api.opper.ai/v2/ocr",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "mistral/mistral-ocr-latest",
        "document": {
            "type": "document_url",
            "document_url": "https://example.com/sample.pdf",
        },
    },
)

# Raise on 4xx/5xx so an error body is not parsed as an OCR result
response.raise_for_status()
result = response.json()

print(f"Model: {result['model']}")
print(f"Pages processed: {result['usage_info']['pages_processed']}")

for page in result["pages"]:
    print(f"\n--- Page {page['index']} ---")
    print(page["markdown"])

{
  "id": "<string>",
  "pages": [
    {
      "index": 123,
      "markdown": "<string>",
      "dimensions": {
        "dpi": 123,
        "height": 123,
        "width": 123
      },
      "images": [
        {
          "id": "<string>",
          "top_left_x": 123,
          "top_left_y": 123,
          "bottom_right_x": 123,
          "bottom_right_y": 123,
          "image_base64": "<string>"
        }
      ]
    }
  ],
  "model": "<string>",
  "usage_info": {
    "pages_processed": 123,
    "doc_size_bytes": 123
  },
  "cost": {
    "generation": 123,
    "platform": 123,
    "total": 123
  }
}

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
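
A minimal sketch of building this header in Python. It assumes the token is kept in an environment variable named OPPER_API_KEY; that name and the build_headers helper are hypothetical, chosen for this example only.

```python
from __future__ import annotations

import os
from typing import Optional


def build_headers(token: Optional[str] = None) -> dict[str, str]:
    """Return request headers carrying a Bearer token."""
    # OPPER_API_KEY is an assumed variable name, not one defined by the API.
    token = token or os.environ.get("OPPER_API_KEY")
    if not token:
        raise ValueError("No API token provided")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as the headers argument of requests.post.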

Body

application/json

Request model for OCR processing.

model
string
required

The OCR model to use

Example:

"mistral/mistral-ocr-latest"

document
OCRDocument · object
required

The document to process, for example an object with type "document_url" and the URL of the document

pages
integer[] | null

Specific page indices to process (0-based). If not specified, all pages are processed.

include_image_base64
boolean
default: false

Whether to include base64-encoded images in the response

image_limit
integer | null

Maximum number of images to extract per page

Required range: x >= 1

image_min_size
integer | null

Minimum size (width or height in pixels) for images to be included

Required range: x >= 1

mistral_extra
MistralOCRExtra · object

Mistral-specific OCR parameters
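
The body fields above can be assembled as in the following sketch. The build_ocr_body helper is hypothetical. It assumes that omitting an optional field is equivalent to leaving it at its default, and it checks the documented constraints (0-based page indices, x >= 1 for the image options) on the client before sending. mistral_extra is left out because its fields are not described here.

```python
from typing import Optional, Sequence


def build_ocr_body(
    document_url: str,
    model: str = "mistral/mistral-ocr-latest",
    pages: Optional[Sequence[int]] = None,
    include_image_base64: bool = False,
    image_limit: Optional[int] = None,
    image_min_size: Optional[int] = None,
) -> dict:
    """Build a JSON-serializable request body for POST /ocr."""
    if pages is not None and any(p < 0 for p in pages):
        raise ValueError("page indices are 0-based and cannot be negative")
    for name, value in (
        ("image_limit", image_limit),
        ("image_min_size", image_min_size),
    ):
        if value is not None and value < 1:
            raise ValueError(f"{name} must be >= 1")

    body: dict = {
        "model": model,
        "document": {"type": "document_url", "document_url": document_url},
    }
    # Only send optional fields that were explicitly set (an assumption:
    # the server applies its defaults to anything omitted).
    if pages is not None:
        body["pages"] = list(pages)
    if include_image_base64:
        body["include_image_base64"] = True
    if image_limit is not None:
        body["image_limit"] = image_limit
    if image_min_size is not None:
        body["image_min_size"] = image_min_size
    return body
```

For example, build_ocr_body("https://example.com/sample.pdf", pages=[0, 2]) requests only the first and third pages; the result can be passed as the json argument of requests.post.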

Response

Successful Response

Response model for OCR processing.

id
string
required

Unique identifier for this OCR request

pages
OCRPageResult · object[]
required

Per-page OCR results, each with the page index, extracted Markdown, page dimensions, and any extracted images

model
string
required

The model used for OCR

usage_info
OCRUsageInfo · object
required

Usage information for this request: pages processed and document size in bytes

cost
OCRCost · object

Cost information for this OCR request: generation, platform, and total
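
A sketch of reading the response fields described above from an already-parsed JSON dictionary. Both helper names are hypothetical. The code treats cost and images as possibly absent, since cost is not marked required, and it assumes image_base64 may arrive either as raw base64 or as a data URI (data:...;base64,...); that second form is an assumption, not something this page states.

```python
from __future__ import annotations

import base64


def summarize_ocr_result(result: dict) -> dict:
    """Collect the main fields of an OCR response into a flat summary."""
    cost = result.get("cost") or {}
    return {
        "id": result["id"],
        "model": result["model"],
        "pages_processed": result["usage_info"]["pages_processed"],
        # Join per-page Markdown in the order the pages were returned.
        "markdown": "\n\n".join(p["markdown"] for p in result["pages"]),
        "total_cost": cost.get("total"),
    }


def decode_images(result: dict) -> dict[str, bytes]:
    """Map image id to decoded bytes for every image that carries data."""
    images: dict[str, bytes] = {}
    for page in result["pages"]:
        for image in page.get("images") or []:
            data = image.get("image_base64")
            if not data:
                continue  # no payload, e.g. include_image_base64 was false
            # Assumed: strip a data-URI prefix if one is present.
            if data.startswith("data:") and "," in data:
                data = data.split(",", 1)[1]
            images[image["id"]] = base64.b64decode(data)
    return images
```

Image bytes are only available when the request sets include_image_base64 to true; otherwise decode_images returns an empty dictionary.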