How does OpenAPI handle binary data and file downloads?
Not all API responses are JSON objects. Many APIs need to serve binary data: downloadable files, images, PDFs, exports, archives, and raw byte streams. Similarly, many APIs accept file uploads as part of their input. OpenAPI provides specific conventions for describing these cases, and understanding them is essential for accurately documenting file-centric API operations.
Response Content Types for Binary Data
The key to describing binary responses in OpenAPI is the combination of the content media type and the schema format. For binary file downloads, the media type describes what kind of file is being returned, and the schema uses type: string with format: binary:
paths:
  /reports/{reportId}/download:
    get:
      summary: Download a report as a PDF
      parameters:
        - name: reportId
          in: path
          required: true
          schema:
            type: string
      responses:
        "200":
          description: The report PDF file
          headers:
            Content-Disposition:
              schema:
                type: string
              description: Attachment disposition with filename
              example: 'attachment; filename="report-2026.pdf"'
            Content-Length:
              schema:
                type: integer
              description: Size of the file in bytes
          content:
            application/pdf:
              schema:
                type: string
                format: binary
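The format: binary indicates that the string represents raw binary data (a byte sequence), not a text string. Tools read this to mean that the response body is the raw file content, not a JSON-encoded value. This is the OpenAPI 3.0 convention; in OpenAPI 3.1 the format keyword no longer affects how content is encoded, and a raw binary body is usually described by the media type alone, with the schema omitted.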
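On the client side, the practical consequence is that the body must be handled as bytes from start to finish. The following is a minimal sketch, not a definitive implementation: save_download is a hypothetical helper, and an in-memory io.BytesIO stands in for the HTTP response stream that a real client library would provide.

```python
import io
import shutil
import tempfile
from pathlib import Path
from typing import BinaryIO


def save_download(body: BinaryIO, dest: Path, chunk_size: int = 64 * 1024) -> int:
    """Copy a raw binary response body to dest; return the number of bytes on disk."""
    # "wb" matters: text mode would apply an encoding and newline translation,
    # which can corrupt PDFs, images, and archives.
    with open(dest, "wb") as out:
        # Copy in chunks so large files are never held in memory all at once.
        shutil.copyfileobj(body, out, chunk_size)
    return dest.stat().st_size


# Stand-in for a response body (assumption: a real client exposes a similar stream).
fake_pdf = io.BytesIO(b"%PDF-1.7\n" + bytes(range(256)))
target = Path(tempfile.mkdtemp()) / "report-2026.pdf"
written = save_download(fake_pdf, target)
```

Copying in fixed-size chunks keeps memory use flat regardless of file size, which is usually the reason to prefer a stream over reading the whole body into one bytes object.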
Common Media Types for File Downloads
Different file types use different media types in the content key:
responses:
  "200":
    description: File download
    content:
      application/pdf: # PDF documents
        schema:
          type: string
          format: binary
      application/zip: # ZIP archives
        schema:
          type: string
          format: binary
      image/png: # PNG images
        schema:
          type: string
          format: binary
      image/jpeg: # JPEG images
        schema:
          type: string
          format: binary
      application/octet-stream: # Generic binary (unknown type)
        schema:
          type: string
          format: binary
      text/csv: # CSV exports (text-based)
        schema:
          type: string
application/octet-stream is the catch-all binary media type used when the content type is not known in advance or when a generic binary download is intended.
Multiple Content Types (Content Negotiation)
Some endpoints can return a file in multiple formats depending on the client’s request. This is expressed using multiple entries under content:
paths:
  /users/export:
    get:
      summary: Export the user list
      responses:
        "200":
          description: User list export
          content:
            text/csv:
              schema:
                type: string
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/User'
            application/vnd.openxmlformats-officedocument.spreadsheetml.sheet:
              schema:
                type: string
                format: binary
The client uses the Accept request header to indicate which format it prefers. The server responds with the matching content type.
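How a server might pick among the documented media types can be sketched as follows. This is a deliberately simplified illustration, not the full HTTP negotiation algorithm: it honors q-values and wildcards but ignores the rule that more specific media ranges take precedence. The names parse_accept, range_matches, choose_media_type, and OFFERED are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def parse_accept(header: str) -> List[Tuple[str, float]]:
    """Parse an Accept header into (media_range, q) pairs, highest q first."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        q = 1.0  # q defaults to 1 when the client does not specify it
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        ranges.append((media_range, q))
    # sorted() is stable, so ranges with equal q keep the client's order.
    return sorted(ranges, key=lambda r: r[1], reverse=True)


def range_matches(media_range: str, media_type: str) -> bool:
    """True if media_type falls under media_range (exact, type/*, or */*)."""
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type


def choose_media_type(accept: str, offered: Sequence[str]) -> Optional[str]:
    """Return the first offered type matching the client's best range, or None."""
    for media_range, q in parse_accept(accept):
        if q <= 0:
            continue
        for media_type in offered:
            if range_matches(media_range, media_type):
                return media_type
    return None


# The keys under `content` in the export operation.
OFFERED = [
    "text/csv",
    "application/json",
    "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
]
chosen = choose_media_type("application/json;q=0.5, text/csv", OFFERED)
```

When choose_media_type returns None, the request asked for a format the operation does not document; how the server responds in that case is a design decision the spec should also describe.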
Documenting the Content-Disposition Header
File download endpoints typically set the Content-Disposition header to tell the browser (or HTTP client) to treat the response as a file attachment with a specific filename. This header should be documented in the OpenAPI response:
responses:
  "200":
    description: Downloaded file
    headers:
      Content-Disposition:
        description: Instructs the client to download the response as a named file
        schema:
          type: string
        example: 'attachment; filename="export-2026-04-30.csv"'
    # The MIME type of the file is declared by the content key below.
    # OpenAPI ignores a response header named Content-Type.
    content:
      text/csv:
        schema:
          type: string
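A client that saves the download needs to pull the filename out of that header value. The sketch below uses Python's standard email.message.Message parser to do so; filename_from_content_disposition is a hypothetical helper name, and the sanitizing step is an added precaution rather than anything OpenAPI requires.

```python
from email.message import Message
from pathlib import PurePosixPath
from typing import Optional


def filename_from_content_disposition(value: str) -> Optional[str]:
    """Extract a safe filename from a Content-Disposition header value."""
    msg = Message()
    msg["Content-Disposition"] = value
    name = msg.get_filename()  # handles quoting and parameter parsing
    if not name:
        return None
    # Keep only the last path component so a hostile header such as
    # filename="../../etc/passwd" cannot point outside the download directory.
    safe = PurePosixPath(name.replace("\\", "/")).name
    if safe in ("", ".", ".."):
        return None
    return safe


suggested = filename_from_content_disposition(
    'attachment; filename="export-2026-04-30.csv"'
)
```

Treating the header as untrusted input is the safer default, because the filename comes from the server and ends up in a local file path.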
File Uploads with Multipart Form Data
File upload endpoints typically use multipart/form-data as the request body content type. In OpenAPI, this is expressed using the requestBody object with a multipart/form-data media type:
paths:
  /documents:
    post:
      summary: Upload a document
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              type: object
              required: [file]
              properties:
                file:
                  type: string
                  format: binary
                  description: The file to upload
                name:
                  type: string
                  description: Optional display name for the document
                  example: "Q4 Report"
                tags:
                  type: array
                  items:
                    type: string
                  description: Classification tags
      responses:
        "201":
          description: Document uploaded successfully
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Document'
The file property uses type: string with format: binary. This signals to tools and code generators that this form field represents a binary file, not a text value.
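To make the wire format concrete, here is a rough sketch of how a client could assemble a body matching this schema by hand. build_multipart is a hypothetical helper; it assumes array fields such as tags are sent as one part per item (a common convention that should be confirmed against the server), and it does not escape quotes in field names or filenames. Real clients would normally leave this to an HTTP library.

```python
import uuid
from typing import Dict, List, Tuple


def build_multipart(
    fields: Dict[str, List[str]],
    files: Dict[str, Tuple[str, bytes, str]],
) -> Tuple[bytes, str]:
    """Return (body, Content-Type header value) for a multipart/form-data request."""
    boundary = uuid.uuid4().hex
    lines: List[bytes] = []
    for name, values in fields.items():
        for value in values:  # an array field becomes one part per item
            lines += [
                f"--{boundary}".encode("ascii"),
                f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"),
                b"",
                value.encode("utf-8"),
            ]
    for name, (filename, content, content_type) in files.items():
        lines += [
            f"--{boundary}".encode("ascii"),
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"'.encode("utf-8"),
            f"Content-Type: {content_type}".encode("ascii"),
            b"",
            content,  # raw bytes, exactly what format: binary describes
        ]
    lines += [f"--{boundary}--".encode("ascii"), b""]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"


body, content_type = build_multipart(
    {"name": ["Q4 Report"], "tags": ["finance", "quarterly"]},
    {"file": ("q4.pdf", b"%PDF-1.7\n\x00\x01", "application/pdf")},
)
```

The file part carries its bytes unmodified between the boundaries, which is why the schema marks it as binary rather than as text.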
Encoding Metadata for Multipart Fields
For finer control over multipart encoding — such as setting the content type of a specific part or specifying that an array should send separate parts — use the encoding object on the media type:
requestBody:
  content:
    multipart/form-data:
      schema:
        type: object
        properties:
          profileImage:
            type: string
            format: binary
          metadata:
            type: object
            properties:
              description:
                type: string
      encoding:
        profileImage:
          contentType: image/png, image/jpeg
        metadata:
          contentType: application/json
The encoding object lets the spec explicitly declare what content types are valid for individual parts of a multipart request.
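A server or validator could enforce that declaration roughly as sketched below. part_type_allowed is a hypothetical helper that takes the contentType string from the encoding object (a single type, a wildcard such as image/*, or a comma-separated list) and checks the Content-Type of an incoming part against it.

```python
def part_type_allowed(encoding_content_type: str, actual: str) -> bool:
    """Check a part's Content-Type against an encoding object's contentType value."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    actual = actual.split(";", 1)[0].strip().lower()
    if not actual:
        return False
    for allowed in encoding_content_type.split(","):
        allowed = allowed.strip().lower()
        if allowed == actual:
            return True
        # A wildcard like image/* accepts any subtype of that top-level type.
        if allowed.endswith("/*") and actual.startswith(allowed[:-1]):
            return True
    return False


png_ok = part_type_allowed("image/png, image/jpeg", "image/png")
gif_ok = part_type_allowed("image/png, image/jpeg", "image/gif")
```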
Base64-Encoded Binary
Some APIs embed binary data in JSON responses using base64 encoding. In OpenAPI 3.0, this is expressed with type: string and format: byte (not format: binary); OpenAPI 3.1 expresses the same thing with contentEncoding: base64:
components:
  schemas:
    Document:
      type: object
      properties:
        id:
          type: string
        content:
          type: string
          format: byte
          description: Base64-encoded file content
        mimeType:
          type: string
          example: application/pdf
The distinction between format: binary and format: byte is important:
- format: binary means raw binary data in the HTTP body (file download, file upload)
- format: byte means base64-encoded binary data embedded inside a text format (e.g., JSON)
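The base64 case can be illustrated with a short round trip using Python's standard library. The field names mirror the Document schema; the sample bytes and the doc-1 identifier are made up for the example.

```python
import base64
import json

raw = b"%PDF-1.7\n\x00\xff"  # arbitrary bytes, including ones that are not valid UTF-8

# Producer side: binary content must become text before it can live inside JSON.
document = {
    "id": "doc-1",
    "content": base64.b64encode(raw).decode("ascii"),
    "mimeType": "application/pdf",
}
payload = json.dumps(document)

# Consumer side: parse the JSON, then decode the string back to bytes.
decoded = base64.b64decode(json.loads(payload)["content"], validate=True)
```

Base64 emits four characters for every three input bytes, so embedded content is about a third larger than the raw file. That overhead is one reason large files are usually served as raw binary bodies instead.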
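Documenting File Size Constraints
When a file upload endpoint enforces size limits, state them in the property description. OpenAPI has no dedicated keyword for the size of an uploaded file: maxLength is defined in terms of string length, and tools do not apply it consistently to binary payloads, while contentEncoding in OpenAPI 3.1 describes how content is encoded, not how large it may be. The description is therefore the reliable place for the limit: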
properties:
  file:
    type: string
    format: binary
    description: "The file to upload. Maximum size: 10 MB. Accepted formats: PDF, DOCX, XLSX."
Explicit size and format constraints in descriptions help developers understand limits without needing to discover them through errors.
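Documented limits also let clients check a file before sending it. The sketch below is one possible pre-flight check; upload_problems and both constants are hypothetical, it assumes "10 MB" means 10 × 1024 × 1024 bytes, and it judges the format by file extension only, which a server would not rely on.

```python
from pathlib import PurePath
from typing import List

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" read as 10 MiB
ACCEPTED_SUFFIXES = {".pdf", ".docx", ".xlsx"}


def upload_problems(filename: str, size_bytes: int) -> List[str]:
    """Return human-readable reasons the upload would be rejected (empty if none)."""
    problems = []
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append(
            f"{filename} is {size_bytes} bytes; the limit is {MAX_UPLOAD_BYTES}"
        )
    suffix = PurePath(filename).suffix.lower()
    if suffix not in ACCEPTED_SUFFIXES:
        problems.append(f"{filename} has unsupported extension {suffix!r}")
    return problems


ok = upload_problems("q4.pdf", 1024)
rejected = upload_problems("q4.exe", 11 * 1024 * 1024)
```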
Code Generation Considerations
Code generators treat format: binary fields specially:
- In generated TypeScript clients, binary upload fields become File | Blob types
- In generated Python clients, they become IO[bytes] or bytes types
- In generated Java clients, they become java.io.File or byte[] types
- Download operations return bytes, Buffer, or Stream objects depending on the language
A well-documented binary endpoint produces usable generated client code. A missing format: binary annotation results in a generated client that treats the field as a plain string, breaking upload and download functionality.
Conclusion
OpenAPI handles binary data through the combination of appropriate media types (application/pdf, image/jpeg, application/octet-stream, multipart/form-data) and schema formats (format: binary for raw binary, format: byte for base64-encoded content). Documenting Content-Disposition headers, file size limits, accepted formats, and encoding details produces accurate, usable API documentation and high-quality generated clients that handle file uploads and downloads correctly from the first compile.
Last updated on April 30, 2026.