messages array, with a richer content field.
Not every model supports every input type. See the models catalog for which models accept images and which accept PDFs.
Images
Two ways to send an image: a hosted URL or inline base64.PDFs
PDFs work the same way. The model reads both the text and any embedded images (charts, diagrams, scanned pages).Python
Chat vs JSON for media
| Need | Reach for |
|---|---|
| Show an image and ask a free-text question about it | Chat API (this page) |
| Extract structured fields from an image or PDF (a receipt, an invoice, a form) | JSON API with output_schema |
| Run a multi-turn conversation about an uploaded document | Chat API (this page) |
| Batch process documents into a database | JSON API |
What’s next
JSON API: schemas
Multimodal input with typed JSON output.
Conversations
Multi-turn chat. Works with image and PDF messages too.
Models
Which models accept which input types.