How pricing works
Lazu's billing is prepaid and per-token (or per-call for certain image/audio models). You top up balance, every API call deducts at the model's lane price, and your dashboard shows usage in real time.
The basic formula
    final_charge_microUSD =
        (input_tokens × input_price_per_mtok)
      + (output_tokens × output_price_per_mtok)
      + (cache_read_tokens × cache_read_price_per_mtok)
      + (audio_tokens × audio_price_per_mtok)
      + (image_input_tokens × image_input_price_per_mtok)
      + (per_call_charge if any)

All amounts are stored internally in microUSD (1 USD = 1,000,000 microUSD), so charges stay precise even when a token costs a fraction of a cent. Your dashboard displays USD.
Prices come from model_sell_prices[model_name, channel_group] — see
Pricing & lanes for the per-lane breakdown.
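As a rough illustration, the formula above can be written as a small Python function. This is a sketch under stated assumptions, not Lazu's implementation: the `LanePrice` and `Usage` types and their field names are hypothetical, and it assumes per-mtok prices are quoted in USD per million tokens. Under that assumption, tokens × price already yields microUSD, because dividing by 1,000,000 tokens and multiplying by 1,000,000 microUSD cancel out. The prices in the usage lines are made up.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LanePrice:
    """Hypothetical price row. Per-mtok values are assumed to be USD per million tokens."""
    input_per_mtok: Decimal
    output_per_mtok: Decimal
    cache_read_per_mtok: Decimal = Decimal(0)
    audio_per_mtok: Decimal = Decimal(0)
    image_input_per_mtok: Decimal = Decimal(0)
    per_call_micro_usd: int = 0  # flat per-call charge, already in microUSD


@dataclass(frozen=True)
class Usage:
    """Hypothetical usage record with one counter per term of the formula."""
    input_tokens: int = 0
    output_tokens: int = 0
    cache_read_tokens: int = 0
    audio_tokens: int = 0
    image_input_tokens: int = 0


def charge_micro_usd(usage: Usage, price: LanePrice) -> Decimal:
    """Sum every term of the formula; the result is in microUSD."""
    return (
        usage.input_tokens * price.input_per_mtok
        + usage.output_tokens * price.output_per_mtok
        + usage.cache_read_tokens * price.cache_read_per_mtok
        + usage.audio_tokens * price.audio_per_mtok
        + usage.image_input_tokens * price.image_input_per_mtok
        + price.per_call_micro_usd
    )


def to_usd(micro_usd: Decimal) -> Decimal:
    """Convert microUSD to USD, as a dashboard would for display."""
    return micro_usd / Decimal(1_000_000)


# Made-up prices: $3 per million input tokens, $15 per million output tokens.
price = LanePrice(input_per_mtok=Decimal("3"), output_per_mtok=Decimal("15"))
usage = Usage(input_tokens=1200, output_tokens=350)
total = charge_micro_usd(usage, price)
print(total, to_usd(total))  # 8850 0.00885
```

`Decimal` is used here so that fractional per-mtok prices do not pick up floating-point rounding error; whether Lazu rounds or truncates fractional microUSD is not stated in this page.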
Refund rules
| Upstream outcome | What you pay |
|---|---|
| 2xx success | Full price by upstream-reported usage |
| Streamed, then disconnect, with usage in trailer | Pay for the tokens actually streamed |
| Streamed, then disconnect, no usage trailer | Refund: you pay 0 |
| 5xx / timeout / network error | Refund: you pay 0 |
| 4xx (content policy / bad request) | Refund: you pay 0 even if upstream charged us internally |
| Per-call image / audio model error | Charged at full per-call rate (see "Known edge case" below) |
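As a rule of thumb: if Lazu returned 200 to you, you pay; if Lazu
returned 4xx, 5xx or a timeout, you don't. The exceptions are the ones listed in the table: a stream that disconnects without a usage trailer is refunded, and a per-call image or audio model error is still charged.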
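The refund table can be read as a simple lookup from outcome to billed amount. The sketch below is only an illustration of that reading: the `Outcome` enum, its member names, and the `billed_micro_usd` function are hypothetical and do not correspond to any documented Lazu API.

```python
from enum import Enum, auto


class Outcome(Enum):
    """Hypothetical labels for the rows of the refund table."""
    SUCCESS_2XX = auto()
    STREAM_DISCONNECT_WITH_USAGE = auto()
    STREAM_DISCONNECT_NO_USAGE = auto()
    UPSTREAM_5XX_OR_TIMEOUT = auto()
    UPSTREAM_4XX = auto()
    PER_CALL_MODEL_ERROR = auto()


def billed_micro_usd(
    outcome: Outcome,
    usage_charge_micro_usd: int = 0,
    per_call_rate_micro_usd: int = 0,
) -> int:
    """Return what the caller pays, in microUSD, for a given outcome.

    usage_charge_micro_usd: charge computed from upstream-reported usage.
    per_call_rate_micro_usd: flat rate of a per-call image/audio model.
    """
    if outcome in (Outcome.SUCCESS_2XX, Outcome.STREAM_DISCONNECT_WITH_USAGE):
        # Billed by the usage upstream reported (full response or trailer).
        return usage_charge_micro_usd
    if outcome is Outcome.PER_CALL_MODEL_ERROR:
        # The table lists this case as charged at the full per-call rate.
        return per_call_rate_micro_usd
    # Remaining rows (no usage trailer, 5xx/timeout/network, 4xx) are refunded.
    return 0


print(billed_micro_usd(Outcome.UPSTREAM_4XX, usage_charge_micro_usd=8850))  # 0
```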
Top-up
Top up with a card via Stripe in the console. Funds appear instantly in your balance.
- Minimum top-up: $5
- No expiration on credits
- Refunds: open a ticket within 7 days for failed-but-charged calls
Funded accounts are verified and immediately move from Unverified
(5 RPM cap) to the tier matching their lifetime top-up total. See
Rate limits.
Free credits
New accounts get a small free trial credit ($X, see console for current amount), enough for roughly 100 basic chat calls. Free credit:
- Counts as balance — you can use it on any model in any lane
- Does not verify the account — to escape the 5 RPM cap, complete a real top-up
- Does not expire, but if you cap out without topping up, the account stays rate-limited
Where to see usage
- Console → Usage: per-day, per-model, per-key breakdown
- Console → Billing: invoices, top-up history, current balance
- API: GET /api/usage/... (see API reference)
What's NOT layered on top
Lazu's bill is just input × price + output × price (etc., per the formula above). There is no:
- "Premium tier discount" stacking on top of lane price
- "Loyalty multiplier" that reduces price over time
- Hidden margin per cache read or per audio token beyond the listed per-mtok rate
- Surcharge on weekends, regions, or model size
If you see a charge that doesn't match tokens × listed_price, that's a
bug — open a ticket.
Streaming partial usage
When you call with stream: true:
- Token charges are deducted from your balance in real time as the tokens are generated.
- If the client disconnects mid-stream, Lazu still bills for what was delivered (provided upstream reported it in a final usage trailer).
- If upstream errors before any tokens reach you, you're refunded.
This means an aborted stream of 1,000 tokens after the user clicked "Cancel" still costs roughly 1,000 × output_price. The model already did the work; the client just stopped reading.
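To put a number on the cancelled-stream case, here is a minimal worked example. The function name and the $15 per million output tokens price are made up for illustration, and the price is assumed to be quoted in USD per million tokens so that tokens × price gives microUSD directly. Input tokens are billed separately and are left out, which is why the prose above says "roughly".

```python
def aborted_stream_charge_micro_usd(
    delivered_output_tokens: int,
    output_price_usd_per_mtok: int,
    usage_trailer_received: bool,
) -> int:
    """Output-token charge, in microUSD, for a stream the client abandoned."""
    if not usage_trailer_received:
        # Without a final usage trailer from upstream, the call is refunded.
        return 0
    return delivered_output_tokens * output_price_usd_per_mtok


charge = aborted_stream_charge_micro_usd(1_000, 15, usage_trailer_received=True)
print(charge, charge / 1_000_000)  # 15000 0.015
```

So at this hypothetical price, cancelling after 1,000 output tokens costs 15,000 microUSD, or $0.015, plus whatever the input tokens cost.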
Enterprise / volume contracts
For workloads sustained above $1,000/month, contact sales via lazu.ai. Volume terms are negotiated case by case.