
Embeddings

POST /v1/embeddings — OpenAI-compatible embeddings endpoint.

Basic call

curl https://api.lazu.ai/v1/embeddings \
  -H "Authorization: Bearer $LAZU_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "BAAI/bge-m3",
    "input": "Hello world"
  }'

Response:

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "embedding": [-0.032, 0.023, -0.041, ...],
      "index": 0
    }
  ],
  "model": "BAAI/bge-m3",
  "usage": { "prompt_tokens": 4, "total_tokens": 4 }
}
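
To make the response shape concrete, here is a minimal sketch that pulls the vectors out of a parsed response body and sorts them by index so each vector lines up with its input. The helper name vectors_in_order is illustrative and not part of any SDK, and the sample values are made up.

```python
import json

def vectors_in_order(response: dict) -> list[list[float]]:
    """Return the embedding vectors sorted by their `index` field."""
    items = sorted(response["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]

# A made-up response body in the shape shown above (vectors shortened,
# items deliberately out of order to show the sort).
sample = json.loads("""
{
  "object": "list",
  "data": [
    {"object": "embedding", "embedding": [0.5, 0.25], "index": 1},
    {"object": "embedding", "embedding": [-0.5, 0.75], "index": 0}
  ],
  "model": "BAAI/bge-m3",
  "usage": {"prompt_tokens": 4, "total_tokens": 4}
}
""")

vectors = vectors_in_order(sample)
print(len(vectors), vectors[0])
```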

Python (OpenAI SDK)

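import os

from openai import OpenAI

# Point the OpenAI SDK at Lazu; the key is read from the same
# LAZU_API_KEY variable used in the curl example.
client = OpenAI(
    base_url="https://api.lazu.ai/v1",
    api_key=os.environ["LAZU_API_KEY"],
)

resp = client.embeddings.create(
    model="text-embedding-3-small",
    input=["doc one", "doc two"],
)
# One item per input, in the same order as the inputs.
for item in resp.data:
    print(item.embedding[:5], "...")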

Batch inputs

Pass an array of strings; the response data array preserves order.

{
  "model": "BAAI/bge-m3",
  "input": ["first doc", "second doc", "third doc"]
}

For very large batches, chunk client-side: keep each request body to a reasonable size (typically under 1 MB).
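
One way to chunk client-side is to group inputs by their estimated serialized size. The sketch below assumes the 1 MB figure as the limit and estimates size with the standard library's default JSON encoding, which may differ slightly from what your HTTP client actually sends. chunk_inputs and MAX_BODY_BYTES are hypothetical names, not part of the Lazu API.

```python
import json

MAX_BODY_BYTES = 1_000_000  # assumption: the "typically 1 MB" guideline

def chunk_inputs(texts, model, max_bytes=MAX_BODY_BYTES):
    """Yield lists of texts whose estimated JSON body stays under max_bytes.

    A single text that is larger than max_bytes is still yielded alone,
    so the caller should handle that case separately.
    """
    # Size of the request body with an empty input array.
    base = len(json.dumps({"model": model, "input": []}))
    batch, size = [], base
    for text in texts:
        # json.dumps escapes non-ASCII by default, so len() counts bytes.
        item = len(json.dumps(text)) + 2  # +2 for the ", " separator
        if batch and size + item > max_bytes:
            yield batch
            batch, size = [], base
        batch.append(text)
        size += item
    if batch:
        yield batch

# Small limit so the split is visible without megabytes of input.
docs = ["x" * 40] * 10
batches = list(chunk_inputs(docs, "BAAI/bge-m3", max_bytes=200))
print([len(b) for b in batches])
```

Each batch can then be sent as its own request. Because the response preserves input order, concatenating the results batch by batch gives vectors in the original order.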

Available models

Common embedding models on Lazu:

Model                    Dim    Notes
BAAI/bge-m3              1024   Multilingual, runs on Cloudflare Workers AI
text-embedding-3-small   1536   OpenAI's cheap default
text-embedding-3-large   3072   OpenAI's high-quality
text-embedding-ada-002   1536   Older OpenAI; for legacy compat
gemini-embedding-001     768    Google's default

Read the catalog at runtime: any model you can call on this endpoint lists embeddings in its supported_endpoint_types.
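
As a rough sketch of that check, the snippet below filters catalog entries that have already been fetched. The entry shape is an assumption: a list of dicts with an id field and a supported_endpoint_types list. Only the supported_endpoint_types name comes from this page, so adapt the rest to the real catalog response.

```python
def embedding_models(catalog):
    """Return ids of catalog entries that list "embeddings" as an endpoint type."""
    return [
        entry["id"]
        for entry in catalog
        if "embeddings" in entry.get("supported_endpoint_types", [])
    ]

# Hypothetical catalog entries; real entries may carry other fields.
catalog = [
    {"id": "BAAI/bge-m3", "supported_endpoint_types": ["embeddings"]},
    {"id": "example-chat-model", "supported_endpoint_types": ["chat"]},
    {"id": "example-no-types"},
]
print(embedding_models(catalog))
```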

Dimensions

For models that support truncation (OpenAI text-embedding-3-*), you can request fewer dimensions:

{
  "model": "text-embedding-3-large",
  "input": "...",
  "dimensions": 256
}

Fewer dimensions mean smaller vectors and lower storage and search cost, at the price of slightly lower quality. Not all models support this, so check the catalog.
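
To put the storage side of that trade-off in numbers, here is a back-of-the-envelope sketch. It assumes vectors are stored as raw float32 values (4 bytes each) and ignores any index overhead. index_size_bytes is an illustrative helper, not an API call.

```python
def index_size_bytes(num_vectors, dim, bytes_per_value=4):
    """Raw storage for num_vectors vectors of dim values each (float32 by default)."""
    return num_vectors * dim * bytes_per_value

# Compare raw storage for one million vectors at a few dimension counts.
for dim in (3072, 1024, 256):
    mib = index_size_bytes(1_000_000, dim) / 2**20
    print(f"{dim:>5} dims: {mib:,.0f} MiB")
```

Under these assumptions, a million text-embedding-3-large vectors take about 12.3 GB at the full 3072 dimensions and about 1 GB at 256.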
