Embeddings
POST /v1/embeddings — OpenAI-compatible embeddings endpoint.
Basic call
curl https://api.lazu.ai/v1/embeddings \
  -H "Authorization: Bearer $LAZU_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "BAAI/bge-m3",
    "input": "Hello world"
  }'

Response:
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [-0.032, 0.023, -0.041, ...],
      "index": 0
    }
  ],
  "model": "BAAI/bge-m3",
  "usage": { "prompt_tokens": 4, "total_tokens": 4 }
}

Python (OpenAI SDK)
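import os
from openai import OpenAI

# Read the key from the environment, as in the curl example above.
client = OpenAI(base_url="https://api.lazu.ai/v1", api_key=os.environ["LAZU_API_KEY"])
resp = client.embeddings.create(
    model="text-embedding-3-small",
    input=["doc one", "doc two"],
)
# resp.data holds one item per input string.
for item in resp.data:
    print(item.embedding[:5], "...")

Batch inputs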
Pass an array of strings; the response data array preserves order.
{
  "model": "BAAI/bge-m3",
  "input": ["first doc", "second doc", "third doc"]
}

For very large batches, chunk client-side: a single request body must stay under the size limit (typically 1 MB).
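One way to do the client-side chunking is to group inputs by their estimated JSON size. The sketch below is illustrative only: the helper name `chunk_inputs` and the `MAX_BODY_BYTES` constant are assumptions, the 1 MB figure is the "typical" limit quoted above rather than a confirmed hard limit, and the size estimate is deliberately conservative.

```python
import json
from typing import Iterator, List

# Assumption: the "typically 1 MB" limit from the text; tune to the real limit.
MAX_BODY_BYTES = 1_000_000


def chunk_inputs(model: str, inputs: List[str],
                 max_bytes: int = MAX_BODY_BYTES) -> Iterator[List[str]]:
    """Yield consecutive batches whose estimated JSON body fits in max_bytes.

    Input order is kept, so results can be concatenated batch by batch.
    A single string that exceeds the limit on its own is yielded alone.
    """
    # Size of the request body with an empty input array.
    base = len(json.dumps({"model": model, "input": []}).encode("utf-8"))
    batch: List[str] = []
    size = base
    for text in inputs:
        # Encoded string plus 2 bytes for the ", " separator (slight overestimate).
        cost = len(json.dumps(text).encode("utf-8")) + 2
        if batch and size + cost > max_bytes:
            yield batch
            batch, size = [], base
        batch.append(text)
        size += cost
    if batch:
        yield batch
```

Each yielded batch can then be passed as `input` to `client.embeddings.create(...)`, and the per-batch `data` arrays appended in the order the batches were produced.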
Available models
Common embedding models on Lazu:
| Model | Dim | Notes |
|---|---|---|
| BAAI/bge-m3 | 1024 | Multilingual, runs on Cloudflare Workers AI |
| text-embedding-3-small | 1536 | OpenAI's cheap default |
| text-embedding-3-large | 3072 | OpenAI's high-quality |
| text-embedding-ada-002 | 1536 | Older OpenAI; for legacy compat |
| gemini-embedding-001 | 768 | Google's default |
Read the catalog at runtime: any model you can call here lists "embeddings" in its supported_endpoint_types field.
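Filtering the catalog for embedding-capable models might look like the sketch below. Only the `supported_endpoint_types` field name comes from this page; the entry shape (a list of dicts with an `id` key), the helper name `embedding_model_ids`, and the sample entries are assumptions made for illustration, and how the catalog is fetched is left out.

```python
from typing import Any, Dict, Iterable, List


def embedding_model_ids(catalog: Iterable[Dict[str, Any]]) -> List[str]:
    """Return the ids of catalog entries that list "embeddings" as an endpoint type."""
    return [
        entry["id"]
        for entry in catalog
        # Treat a missing or null field as "no endpoint types".
        if "embeddings" in (entry.get("supported_endpoint_types") or [])
    ]


# Made-up entries; the real catalog schema may differ.
sample_catalog = [
    {"id": "BAAI/bge-m3", "supported_endpoint_types": ["embeddings"]},
    {"id": "some-chat-model", "supported_endpoint_types": ["chat"]},
    {"id": "no-types-listed"},
]
print(embedding_model_ids(sample_catalog))  # ['BAAI/bge-m3']
```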
Dimensions
For models that support truncation (OpenAI text-embedding-3-*), you can
request fewer dimensions:
{
  "model": "text-embedding-3-large",
  "input": "...",
  "dimensions": 256
}

Lower dimensions mean smaller vectors and lower storage and search cost, at slightly lower quality. Not all models support this; check the catalog.
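To put the storage saving in numbers, here is a rough back-of-the-envelope calculation. It assumes vectors are stored as raw 32-bit floats with no index overhead, which will not match every vector store; the helper name `index_size_bytes` and the corpus size of one million vectors are illustrative.

```python
def index_size_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw storage for num_vectors embeddings of dim values each (float32 by default)."""
    return num_vectors * dim * bytes_per_value


# One million documents at the full 3072 dimensions vs. truncated to 256.
full = index_size_bytes(1_000_000, 3072)   # 12_288_000_000 bytes (about 12.3 GB)
small = index_size_bytes(1_000_000, 256)   # 1_024_000_000 bytes (about 1.0 GB)
print(full / small)  # 12.0
```

Under these assumptions, truncating text-embedding-3-large from 3072 to 256 dimensions cuts raw vector storage by a factor of 12.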
See also
- Pricing & lanes — embeddings are typically direct-only
- Chat completions
- Errors