Responses API
POST /v1/responses is OpenAI's newer endpoint that unifies chat,
reasoning, and multimodal inputs. It is the only Lazu endpoint that
dereferences file_id automatically.
When to use Responses vs Chat
| Need | Use |
|---|---|
| Quick chat, function calling | /v1/chat/completions |
| PDF / document with file_id | /v1/responses |
| Image with file_id (uploaded via Files) | /v1/responses |
| Reasoning models (o1, o3, gpt-5) | /v1/responses (preferred) |
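The table can be read as a simple routing rule. The sketch below is one way to encode it in Python; the function name `pick_endpoint` and its two flags are hypothetical helpers for illustration, not part of any Lazu SDK.

```python
def pick_endpoint(uses_file_id: bool = False, reasoning_model: bool = False) -> str:
    """Return the endpoint path suggested by the table above.

    Both parameters are illustrative flags, not Lazu API fields.
    """
    # Anything that references a file_id, or targets a reasoning model,
    # goes to the Responses endpoint.
    if uses_file_id or reasoning_model:
        return "/v1/responses"
    # Quick chat and function calling stay on chat completions.
    return "/v1/chat/completions"
```

Note that the table lists Responses as "preferred" rather than required for reasoning models, so the second flag encodes a recommendation, not a hard constraint.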
Basic call
curl https://api.lazu.ai/v1/responses \
-H "Authorization: Bearer $LAZU_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-4o",
"input": [{
"role": "user",
"content": [
{"type": "input_text", "text": "Say hi"}
]
}]
}'
Response (abridged):
{
"id": "resp_...",
"object": "response",
"output": [
{
"type": "message",
"role": "assistant",
"content": [{ "type": "output_text", "text": "Hi! How can I help?" }]
}
],
"usage": {
"input_tokens": 10,
"output_tokens": 8,
"total_tokens": 18
}
}
With a file
Upload the file via the Files API, then reference its file_id:
{
"model": "gpt-4o",
"input": [
{
"role": "user",
"content": [
{ "type": "input_text", "text": "Summarize this PDF" },
{ "type": "input_file", "file_id": "file-lazu-..." }
]
}
]
}
For images uploaded with purpose=vision, use "type": "input_image"
instead of input_file.
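The request bodies shown so far can also be assembled programmatically. The following is a minimal Python sketch, assuming the request shape from the examples above and the response shape from the abridged response under Basic call. The helper names (`text_part`, `file_part`, `build_request`, `extract_text`) are hypothetical, and the sketch assumes that `input_image` parts carry a `file_id` field in the same way `input_file` parts do.

```python
import json


def text_part(text: str) -> dict:
    """Build an input_text content part."""
    return {"type": "input_text", "text": text}


def file_part(file_id: str, is_image: bool = False) -> dict:
    """Build a content part that references an uploaded file.

    Images uploaded with purpose=vision use input_image; everything
    else uses input_file. Reusing file_id for images is an assumption.
    """
    return {"type": "input_image" if is_image else "input_file", "file_id": file_id}


def build_request(model: str, parts: list) -> dict:
    """Wrap content parts in a single user message."""
    return {"model": model, "input": [{"role": "user", "content": parts}]}


def extract_text(response: dict) -> str:
    """Concatenate every output_text part found in message items."""
    chunks = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                chunks.append(part.get("text", ""))
    return "".join(chunks)


if __name__ == "__main__":
    # "file-lazu-..." is the placeholder used in the example above.
    body = build_request(
        "gpt-4o",
        [text_part("Summarize this PDF"), file_part("file-lazu-...")],
    )
    print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized with `json.dumps` and sent as the body of the POST request, with the same headers as the curl example.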
Reasoning models
The o1, o3, and gpt-5 families use Responses natively, with reasoning
effort controls:
{
"model": "o3-mini",
"reasoning": {"effort": "medium"},
"input": [...]
}
Tool calling
The request shape matches chat completions: pass tools and get
tool_calls back. Reasoning models can invoke tools mid-reasoning.
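As an illustration, a request that combines tools with a reasoning model might be built as follows. This is a sketch under assumptions: it takes "same shape as chat completions" to mean the chat-completions function-tool schema, and `get_weather`, `GET_WEATHER_TOOL`, and `build_tool_request` are hypothetical names that do not come from the Lazu docs.

```python
# Hypothetical tool definition in the chat-completions function schema.
GET_WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_tool_request(model, prompt, tools, effort=None):
    """Build a Responses request body that offers tools to the model.

    effort is optional; when given, it is placed under "reasoning"
    as in the reasoning-model example earlier in this page.
    """
    body = {
        "model": model,
        "input": [
            {"role": "user", "content": [{"type": "input_text", "text": prompt}]}
        ],
        "tools": tools,
    }
    if effort is not None:
        body["reasoning"] = {"effort": effort}
    return body


if __name__ == "__main__":
    import json

    request = build_tool_request(
        "o3-mini", "What is the weather in Oslo?", [GET_WEATHER_TOOL], effort="medium"
    )
    print(json.dumps(request, indent=2))
```

The sketch stops at building the request, since this page does not show where tool_calls sit in the response.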
Limits
- Single file_id content ≤ Files API purpose limit (512 MB user_data, 20 MB vision)
- Total dereferenced files in one Responses call ≤ 64 MB
If you reference too many large files at once, Lazu returns 400 with
file_too_large.
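A client can check these limits before sending a request and avoid the 400 round trip. The sketch below is one possible pre-flight check, not Lazu's own validation logic. It assumes that 1 MB means 2^20 bytes (the docs do not specify) and that the caller already knows each file's purpose and size; `check_file_limits` is a hypothetical helper.

```python
# Assumption: MB is taken as 2**20 bytes; the docs do not say which unit applies.
MB = 1024 * 1024

# Per-file limits by Files API purpose, from the list above.
PER_FILE_LIMIT = {"user_data": 512 * MB, "vision": 20 * MB}
# Limit on all dereferenced files in one Responses call.
TOTAL_LIMIT = 64 * MB


def check_file_limits(files):
    """Return a list of human-readable problems; empty means within limits.

    files is a list of (purpose, size_in_bytes) tuples. Purposes not
    listed in PER_FILE_LIMIT are only counted toward the total.
    """
    problems = []
    total = 0
    for index, (purpose, size) in enumerate(files):
        limit = PER_FILE_LIMIT.get(purpose)
        if limit is not None and size > limit:
            problems.append(
                f"file {index}: {size} bytes exceeds the {purpose} limit of {limit}"
            )
        total += size
    if total > TOTAL_LIMIT:
        problems.append(
            f"total {total} bytes exceeds the per-call limit of {TOTAL_LIMIT}"
        )
    return problems
```

For example, `check_file_limits([("user_data", 40 * MB), ("user_data", 30 * MB)])` reports one problem: each file is under its per-file limit, but together they exceed the 64 MB total.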