Every stored file is identified by a file_<id>, a reusable handle you can pass as input to later calls instead of re-uploading or re-encoding bytes. One uploaded image can seed a video; one generated image can be edited by the next call; a generated audio clip can be fed straight to transcription.
Two ways files appear
- You upload them. POST /v3/files (multipart) returns a file_<id> for reference media: an image to animate, a source video to edit, an audio clip to transcribe.
- Generations store them. The image, audio, and video endpoints save their output to Files by default (store: true) and return a file_id alongside the result. Set store: false to opt out.
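As a rough illustration of the upload path, the sketch below builds (but does not send) a multipart POST /v3/files request using only the Python standard library. The host `https://api.example.com`, the bearer-token header, and the form field name `file` are assumptions made for the example; none of them are confirmed by this page.

```python
import uuid
import urllib.request

BASE_URL = "https://api.example.com"  # placeholder host, not from the docs


def build_upload_request(filename: str, data: bytes, content_type: str,
                         api_key: str) -> urllib.request.Request:
    """Build (but do not send) a multipart POST /v3/files request."""
    boundary = uuid.uuid4().hex
    # ASSUMPTION: the multipart form field is called "file".
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v3/files",
        data=head + data + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # ASSUMPTION: bearer auth
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


req = build_upload_request("cat.png", b"\x89PNG...", "image/png", "YOUR_API_KEY")
# urllib.request.urlopen(req) would send it; per the text above, the
# response carries a file_<id> that later calls can reference.
```

Building the request separately from sending it keeps the example inspectable offline; a real client would more likely use an HTTP library with built-in multipart support.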
Using a file_id as input
A file_id is accepted anywhere a media source is — next to an http(s) URL or a data-URI:
| Endpoint | Fields that accept a file_id |
|---|---|
| POST /v3/images | image, mask, reference_images |
| POST /v3/audio/transcriptions | audio |
| POST /v3/videos | image, video, reference_images |
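The table above can be read as a small validation rule: each endpoint accepts media sources only in certain fields, and each source is a file_id, an http(s) URL, or a data-URI. The sketch below encodes that rule. The file_id pattern and the idea that reference_images takes a list are assumptions; the page only shows the shape file_<id>.

```python
import re

# Fields that accept a media source, copied from the table above.
MEDIA_FIELDS = {
    "/v3/images": {"image", "mask", "reference_images"},
    "/v3/audio/transcriptions": {"audio"},
    "/v3/videos": {"image", "video", "reference_images"},
}

# ASSUMPTION: ids look like "file_" plus URL-safe characters.
FILE_ID = re.compile(r"^file_[A-Za-z0-9_-]+$")


def media_source_kind(value: str) -> str:
    """Classify a media source as file_id, url, or data_uri."""
    if FILE_ID.match(value):
        return "file_id"
    if value.startswith(("http://", "https://")):
        return "url"
    if value.startswith("data:"):
        return "data_uri"
    raise ValueError(f"not a recognised media source: {value!r}")


def build_media_body(endpoint: str, **media) -> dict:
    """Check field names and sources, then return the JSON body fragment."""
    allowed = MEDIA_FIELDS[endpoint]
    for field, value in media.items():
        if field not in allowed:
            raise ValueError(f"{endpoint} does not accept {field!r}")
        # ASSUMPTION: list-valued fields (e.g. reference_images) hold sources.
        for source in value if isinstance(value, list) else [value]:
            media_source_kind(source)
    return dict(media)


body = build_media_body("/v3/videos", image="file_abc123")
```

Here `file_abc123` is a made-up id; other request fields such as a prompt are left out because this page does not describe them.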
Lifecycle and retention
Files expire after a default TTL, capped by your project’s retention. On zero-data-retention projects, storing is skipped: uploads and stored outputs degrade gracefully, and generation responses tell you when an output wasn’t persisted. Delete files you no longer need with DELETE /v3/files/{id}.
Quotas
Each organization has a storage quota: a total byte budget (it can vary by plan) and a cap on the number of files. Uploads and stored generation outputs draw from the same budget.
- Uploads that would exceed the quota are rejected with 413.
- Generated outputs (store: true) degrade gracefully when the quota is full: the call still succeeds and returns the result inline, it just isn’t persisted. The response signals the skip rather than failing.
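The two quota behaviours call for different client handling: an upload over quota is an error to surface, while a generation over quota still returns a usable result. A minimal sketch of that split follows. How the response signals a skipped store is not specified on this page, so treating a missing or null file_id as "not persisted" is purely an assumption.

```python
from typing import Optional


def classify_upload_status(status: int) -> str:
    """Map an upload's HTTP status to a coarse outcome."""
    if status == 413:
        # Per the text above: the upload would exceed the storage quota.
        return "quota_exceeded"
    if 200 <= status < 300:
        return "stored"
    return "error"


def persisted_file_id(generation_response: dict) -> Optional[str]:
    """Return the file_id if the output was stored, else None.

    ASSUMPTION: a skipped store shows up as a missing or null file_id.
    The real response may use a different, explicit signal.
    """
    return generation_response.get("file_id")


# Hypothetical response bodies, for illustration only.
stored = {"file_id": "file_abc123", "data": "..."}
skipped = {"data": "..."}

for response in (stored, skipped):
    file_id = persisted_file_id(response)
    if file_id is None:
        # Keep the inline result; it cannot be referenced later by id.
        print("output not persisted; use the inline result")
    else:
        print("stored as", file_id)
```

A caller that depends on reusing the output would, under this assumption, free up quota with DELETE /v3/files/{id} and then upload the inline result itself.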
Operations
| Operation | Endpoint |
|---|---|
| Upload a file | POST /v3/files |
| List files | GET /v3/files |
| Get metadata | GET /v3/files/{id} |
| Get a download URL | GET /v3/files/{id}/content |
| Delete a file | DELETE /v3/files/{id} |
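The operations table maps directly onto a small routing helper. The sketch below builds requests for the id-based and list operations without sending them. The host and the bearer-token header are placeholders, and the operation names (`list`, `metadata`, and so on) are invented for the example.

```python
import urllib.parse
import urllib.request
from typing import Optional

BASE_URL = "https://api.example.com"  # placeholder host, not from the docs

# Method and path for each row of the table above.
OPERATIONS = {
    "list": ("GET", "/v3/files"),
    "metadata": ("GET", "/v3/files/{id}"),
    "download_url": ("GET", "/v3/files/{id}/content"),
    "delete": ("DELETE", "/v3/files/{id}"),
}


def build_files_request(operation: str, api_key: str,
                        file_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a request for one Files operation."""
    method, path = OPERATIONS[operation]
    if "{id}" in path:
        if not file_id:
            raise ValueError(f"{operation!r} needs a file_id")
        path = path.replace("{id}", urllib.parse.quote(file_id, safe=""))
    return urllib.request.Request(
        BASE_URL + path,
        method=method,
        headers={"Authorization": f"Bearer {api_key}"},  # ASSUMPTION: bearer auth
    )


delete_req = build_files_request("delete", "YOUR_API_KEY", file_id="file_abc123")
print(delete_req.get_method(), delete_req.full_url)
# prints: DELETE https://api.example.com/v3/files/file_abc123
```

Upload is left out of the map because it also needs a multipart body, which does not fit the same bodiless shape.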
What’s next
- Images: Generate and edit images; feed a file_id for image-to-image.
- Video: Seed a video from an uploaded or generated image.
- Audio: Transcribe an audio file_id.
- Multimodality: How the modalities fit together.