An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
The model response indicates that the PDF isn’t actually being passed into the model’s input content, even though the file upload itself succeeds. For the Responses API, files must be both uploaded correctly and referenced in the request in a way that vision-capable models can consume.
Key points to verify:
- Use a vision-capable model
- Only models that support both text and image inputs can accept PDF files as input.
- Ensure the deployment is one of the vision-enabled models listed under the Responses API (for example,
gpt-4oor other models explicitly documented as supporting image/PDF input). - If a non-vision model is used, the PDF will not be interpreted and the model will behave as if no file was attached.
- Respect PDF size and request limits
- Each PDF must be under 50 MB.
- The combined size of all files in a single request must also be under 50 MB.
- If limits are exceeded, the file will not be processed even if upload appears successful.
- Use the correct
purposewhen uploading files
- For PDF input with the Responses API, the file must be uploaded with
purpose="assistants". - A
purposeofuser_datais currently not supported for PDFs; usinguser_datawill prevent the file from being usable by the Responses API.
- Ensure the file is actually referenced in the Response request
- For inline Base64: the PDF must be included in the request content as a base64 data URI so that the model receives both extracted text and rendered page images.
- For uploaded files: the file ID returned from the Files API must be referenced in the Responses API request in the way shown in the documentation (for example, as a file input in the content/tool configuration). If the file is only uploaded but never referenced by ID in the Response creation call, the model will not see it.
- Combine file input with text
- When using file inputs, include a text prompt in the same request (for example, “Summarize the attached PDF”). Some flows require that there is text content along with the file reference so the model has both the instruction and the file in a single call.
- Confirm the model supports PDFs (not just images)
- PDF support is specifically called out for models with vision capabilities. If using a model or version that does not yet support PDF input, the file will be ignored.
If all of the above are satisfied (vision-enabled model, purpose="assistants", size limits respected, file ID or base64 correctly embedded in the Response request, and a text instruction included), the model should receive both the extracted text and rendered images of each PDF page and be able to answer questions about the file instead of replying that it cannot see it.
References: