TRUST & AUDIT
How billing works — computed, inspectable, idempotent
Our billing is post-settlement: we forward the request, **read the upstream's actual usage**, then deduct credits. The charged amount is always tied to real upstream usage — never an estimate. Here's every step.
1. Formula
Every model has a context_tiers config in cc_models, in which each up_tokens threshold maps to a credit cost. BillingHandler reads the upstream's prompt_tokens and completion_tokens, classifies the request by input_tokens alone, and deducts the credits for the matching tier:
// internal/service/model_registry.go
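// CreditsForTokens returns the credit cost of the first tier whose up_tokens
// threshold covers inputTokens; up_tokens == 0 marks the terminal tier.
func (m ModelConfig) CreditsForTokens(inputTokens int) int {
	for _, tier := range m.ContextTiers {
		if tier.UpTokens == 0 || inputTokens <= tier.UpTokens {
			return tier.Credits
		}
	}
	// No tier matched and no terminal tier is configured: use the last tier.
	return m.ContextTiers[len(m.ContextTiers)-1].Credits
}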
// Example — claude-sonnet-4-6 with 18,000 input tokens:
// tier 1: up_tokens=32000 credits=12 ← matches
// tier 2: up_tokens=200000 credits=36
// tier 3: up_tokens=0 credits=84 (terminal "anything bigger")
// → 12 credits deducted, regardless of completion length.
Completion tokens don't change which tier you land in, so a long answer doesn't surprise-bill you.
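To make the tier boundaries concrete, here is a small self-contained sketch that rebuilds the claude-sonnet-4-6 example and can be probed at the edges. The ContextTier struct layout and the sonnetExample helper are assumptions made for illustration; only CreditsForTokens and the three tier values come from the text above.

```go
package main

// ContextTier is an assumed shape for one context_tiers entry.
// UpTokens is the inclusive upper bound on input tokens; 0 marks the
// terminal "anything bigger" tier.
type ContextTier struct {
	UpTokens int
	Credits  int
}

// ModelConfig holds the tiers in ascending UpTokens order.
type ModelConfig struct {
	ContextTiers []ContextTier
}

// CreditsForTokens returns the credits of the first tier that fits.
func (m ModelConfig) CreditsForTokens(inputTokens int) int {
	for _, tier := range m.ContextTiers {
		if tier.UpTokens == 0 || inputTokens <= tier.UpTokens {
			return tier.Credits
		}
	}
	return m.ContextTiers[len(m.ContextTiers)-1].Credits
}

// sonnetExample rebuilds the claude-sonnet-4-6 tiers quoted above
// (hypothetical helper, not part of the real registry).
func sonnetExample() ModelConfig {
	return ModelConfig{ContextTiers: []ContextTier{
		{UpTokens: 32000, Credits: 12},
		{UpTokens: 200000, Credits: 36},
		{UpTokens: 0, Credits: 84},
	}}
}
```

Because the comparison is `<=`, a threshold is inclusive: with these tiers, 32,000 input tokens still cost 12 credits, 32,001 cost 36, and anything above 200,000 costs 84.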
2. Full request lifecycle
- 01 Your client hits api.clawfeeder.ai with Authorization: Bearer cf-sk-...
- 02 After JWT/API Key middleware, BillingHandler reads the request body to extract the model field
- 03 Looks up cc_models registry; rejects with 400 unsupported_model if not present
- 04 Balance + trial gate check: insufficient or unauthorized → 402 immediately
- 05 Forward to a real upstream via the channel chain; 5xx auto-falls-back to the next channel
- 06 **After upstream responds**, the streaming usage scanner extracts the usage field on the fly
- 07 Once usage is known, CreditsForTokens computes the deduction
- 08 Write one row to cc_credits_ledger (reason=api_use, delta=negative, model, latency_ms, status_code, ref_id=trace_id)
- 09 ref_id UNIQUE constraint guarantees idempotency: the same trace_id can't be settled twice
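Step 05 is the least obvious part of the lifecycle, so here is a minimal sketch of what "5xx falls back to the next channel" could look like. The Channel type, the forward function, and ErrAllChannelsFailed are hypothetical names, and treating a transport error like a 5xx is an assumption; the text above only states the 5xx rule.

```go
package main

import "errors"

// Channel is a hypothetical upstream route. Send performs the request and
// reports the upstream HTTP status.
type Channel struct {
	Name string
	Send func() (status int, err error)
}

// ErrAllChannelsFailed is returned when every channel in the chain failed.
var ErrAllChannelsFailed = errors.New("all channels failed")

// forward tries the channels in order and returns the name and status of
// the first one that does not answer with a 5xx.
func forward(chain []Channel) (string, int, error) {
	for _, ch := range chain {
		status, err := ch.Send()
		if err != nil || status >= 500 {
			// Assumption: transport errors fall back just like 5xx responses.
			continue
		}
		return ch.Name, status, nil
	}
	return "", 0, ErrAllChannelsFailed
}
```

In this sketch a 4xx is returned to the caller as-is rather than retried, since a client error would most likely repeat on the next channel.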
3. Where to inspect each charge
Every request can be cross-checked across 4 independent surfaces:
- ✓ X-Request-ID response header: Server returns a UUID per request; same value as the ledger ref_id and the log trace_id. If anything goes wrong, send us this ID and we can find it.
- ✓ X-Clawfeeder-Model response header: The actual settled model. With model=auto, this header shows which concrete model was picked. Billing follows this header, never the literal 'auto'.
- ✓ Dashboard usage page: /dashboard/usage shows every charge (model, tokens, credits, ref_id, status). Full 30-day history.
- ✓ GET /api/credits/history: Programmatic access to the same data, as JSON, for your reconciliation scripts.
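If you want to capture the two audit headers on the client side, something along these lines should work. The ChargeRef type and chargeRefFromHeaders function are hypothetical names for illustration; only the header names come from the list above.

```go
package main

import "net/http"

// ChargeRef collects the two audit headers of one response.
type ChargeRef struct {
	RequestID string // X-Request-ID: same value as the ledger ref_id
	Model     string // X-Clawfeeder-Model: the concrete model that was settled
}

// chargeRefFromHeaders reads the audit headers from a response header set.
// The bool is false when no request ID was returned.
func chargeRefFromHeaders(h http.Header) (ChargeRef, bool) {
	ref := ChargeRef{
		RequestID: h.Get("X-Request-ID"),
		Model:     h.Get("X-Clawfeeder-Model"),
	}
	return ref, ref.RequestID != ""
}
```

Call it with `resp.Header` after any API request and log the result next to your own request record; the RequestID is what you would later match against ref_id on the usage page or in the history endpoint.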
4. Idempotency
cc_credits_ledger.ref_id has a UNIQUE partial index. Each request has one trace_id; if BillingHandler retries (graceful shutdown mid-flight, SSE reconnect, etc.) a second settle write hits the DB and gets rejected. You cannot be double-charged.
-- migration 013 (deployed 2026-04-20)
CREATE UNIQUE INDEX cc_credits_ledger_ref_id_idx
  ON cc_credits_ledger(ref_id)
  WHERE ref_id IS NOT NULL;
Async video billing uses the same mechanism — pre-deduct ref_id=video:<task_id>, settle ref_id=video-settle:<task_id>, refund ref_id=video-refund:<task_id>. A task can't be settled twice.
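The effect of the unique index can be modelled in memory. The sketch below is not the real settlement code: Ledger, Settle, ErrDuplicateRef, and videoRefIDs are hypothetical names, and a map stands in for the database index. It only illustrates why a retried settle with the same ref_id cannot move the balance twice.

```go
package main

import (
	"errors"
	"sync"
)

// ErrDuplicateRef stands in for the unique-index violation the database
// would raise on a second write with the same ref_id.
var ErrDuplicateRef = errors.New("ref_id already settled")

// Ledger is an in-memory stand-in for cc_credits_ledger.
type Ledger struct {
	mu      sync.Mutex
	seen    map[string]struct{}
	balance int
}

// NewLedger returns a ledger with the given starting balance.
func NewLedger(balance int) *Ledger {
	return &Ledger{seen: make(map[string]struct{}), balance: balance}
}

// Settle applies delta at most once per refID. An empty refID models a
// NULL ref_id, which the partial index does not constrain.
func (l *Ledger) Settle(refID string, delta int) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	if refID != "" {
		if _, dup := l.seen[refID]; dup {
			return ErrDuplicateRef
		}
		l.seen[refID] = struct{}{}
	}
	l.balance += delta
	return nil
}

// Balance returns the current balance.
func (l *Ledger) Balance() int {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.balance
}

// videoRefIDs builds the three ref_ids used for one async video task.
func videoRefIDs(taskID string) (pre, settle, refund string) {
	return "video:" + taskID, "video-settle:" + taskID, "video-refund:" + taskID
}
```

Starting from 100 credits, `Settle("t1", -12)` succeeds once and leaves 88; a second `Settle("t1", -12)` returns ErrDuplicateRef and the balance stays at 88. Because the three video ref_ids differ only in prefix, each phase of a task gets its own one-time slot.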
5. What we store and what we don't
✓ We store
- · Request time, model, token counts
- · Credits charged, ref_id, upstream HTTP status
- · Upstream latency (latency_ms)
- · User ID and API key fingerprint
✗ We do not store
- · Prompt content
- · Response content
- · Any request / response body excerpt
- · (Except Playground threads, which are saved only if you opt in)
This is a technical invariant, not a promise. BillingHandler has no code path that writes bodies to DB — it streams through a usage scanner that finds the usage field and discards the rest.
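A usage scanner of this kind can be sketched as a pass-through that remembers nothing but the token counts. The code below is an illustration, not the BillingHandler source. It assumes an SSE stream whose `data:` lines carry JSON and whose final chunk includes a `usage` object with prompt_tokens and completion_tokens; the names Usage, scanUsage, and scanUsageString are hypothetical.

```go
package main

import (
	"bufio"
	"encoding/json"
	"io"
	"strings"
)

// Usage holds the only fields the scanner keeps.
type Usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
}

// scanUsage forwards src to dst line by line and returns the last usage
// object seen. Nothing else from the stream is retained.
// Simplification: lines are re-emitted with "\n" endings.
func scanUsage(src io.Reader, dst io.Writer) (Usage, bool, error) {
	var (
		last  Usage
		found bool
	)
	sc := bufio.NewScanner(src)
	sc.Buffer(make([]byte, 0, 64*1024), 1<<20) // allow lines up to 1 MiB
	for sc.Scan() {
		line := sc.Text()
		if _, err := io.WriteString(dst, line+"\n"); err != nil {
			return last, found, err
		}
		if !strings.HasPrefix(line, "data:") {
			continue
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		var chunk struct {
			Usage *Usage `json:"usage"`
		}
		// Non-JSON payloads such as "[DONE]" fail to parse and are skipped.
		if json.Unmarshal([]byte(payload), &chunk) == nil && chunk.Usage != nil {
			last, found = *chunk.Usage, true
		}
	}
	return last, found, sc.Err()
}

// scanUsageString is a convenience wrapper for trying the scanner on a
// string; it returns the forwarded text alongside the usage.
func scanUsageString(stream string) (string, Usage, bool) {
	var out strings.Builder
	u, ok, _ := scanUsage(strings.NewReader(stream), &out)
	return out.String(), u, ok
}
```

The design point is that the only value leaving the function besides the forwarded stream is a pair of integers, so there is nothing body-shaped available to persist.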
6. What happens at zero balance
Post-settlement billing means deduction happens after the request. We allow a soft floor of -100 credits so that the usage of a request that lands at exactly zero balance is still recorded and charged rather than dropped. Once the balance is ≤ -100, new requests return 402 until you top up.
Floor of -100 credits ≈ $0.30 — the worst-case overrun is always tiny.
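As a quick illustration of the gate, the rule above reduces to one comparison before the request and one subtraction after it. The names softFloor, admit, and settle are hypothetical; the -100 threshold and the 402 behaviour are taken from the text.

```go
package main

// softFloor is the -100 credit floor described above.
const softFloor = -100

// admit reports whether a new request passes the balance gate.
// A false result corresponds to the 402 response.
func admit(balance int) bool {
	return balance > softFloor
}

// settle applies the post-settlement deduction and returns the new balance.
func settle(balance, credits int) int {
	return balance - credits
}
```

For example, a request arriving at a balance of 0 is admitted and, if it lands in a 12-credit tier, leaves the balance at -12; once the balance has reached -100, admit returns false until a top-up.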
7. Credit validity & renewal
Credits granted with a subscription (plan top-ups, redeem codes) live as long as the subscription period — at expiry, whatever's unspent in that batch is cleared. That's standard for subscriptions; what matters is that renewal works in your favour.
- · Renew early, days stack — the new period is added onto your current expiry, not restarted from today. Renewing early never costs you the days you have left.
- · Unused credits roll forward — on renewal, any credits you haven't spent are extended to the new expiry date. Nothing is lost.
- · Gifted credits never expire — invite, referral, and promotional credits carry no expiry and stay valid indefinitely.
Example: your membership expires on July 15 with 3,200 credits left. You renew for 30 days on July 1, so validity stacks to August 14, and those 3,200 credits roll forward to August 14 too.
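The stacking rule is simple date arithmetic, sketched below. Subscription, renew, and day are hypothetical names, the year 2026 used when trying it out is arbitrary, and the lapsed-subscription branch (restart from today with the old batch cleared) is an assumption based on the expiry rule above. Credits granted by the new period itself are left out.

```go
package main

import "time"

// day builds a midnight-UTC date from plain integers.
func day(year, month, dayOfMonth int) time.Time {
	return time.Date(year, time.Month(month), dayOfMonth, 0, 0, 0, 0, time.UTC)
}

// Subscription tracks the current expiry and the unspent credits tied to it.
type Subscription struct {
	Expires time.Time
	Credits int
}

// renew adds the new period onto the current expiry and carries unspent
// credits forward to the new expiry date.
func renew(s Subscription, now time.Time, days int) Subscription {
	base, credits := s.Expires, s.Credits
	if now.After(base) {
		// Assumption: a lapsed subscription restarts from today, and its
		// old batch was already cleared at expiry.
		base, credits = now, 0
	}
	return Subscription{Expires: base.AddDate(0, 0, days), Credits: credits}
}
```

Renewing a subscription that expires on `day(2026, 7, 15)` with 3,200 credits, on `day(2026, 7, 1)` for 30 days, yields an expiry of `day(2026, 8, 14)` with the 3,200 credits intact, matching the example above.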
8. Data consistency checks
A reconcile job runs daily at 03:17 UTC with 4 read-only checks:
- · duplicate_ref_ids: scan the ledger for duplicate ref_ids
- · no_usage_trend: check the zero-charge ledger trend (possible usage parse failure)
- · tier_distribution: watch for unusual concentration in a tier
- · balance_drift: user balance must equal SUM(delta)
Results are retained in Redis for 30 days. Any anomaly writes WARN/ERROR logs that downstream ops pipelines can alert on.
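Two of these checks are easy to reproduce against your own export of the ledger. The sketch below runs over in-memory rows; LedgerRow, duplicateRefIDs, and balanceDrift are hypothetical names and do not reflect the real job or the actual JSON schema of the history endpoint.

```go
package main

import "sort"

// LedgerRow is an assumed minimal view of one ledger entry.
// An empty RefID models a NULL ref_id.
type LedgerRow struct {
	UserID string
	RefID  string
	Delta  int
}

// duplicateRefIDs returns every non-empty ref_id that occurs more than
// once, sorted for stable output.
func duplicateRefIDs(rows []LedgerRow) []string {
	counts := make(map[string]int)
	for _, r := range rows {
		if r.RefID != "" {
			counts[r.RefID]++
		}
	}
	var dups []string
	for ref, n := range counts {
		if n > 1 {
			dups = append(dups, ref)
		}
	}
	sort.Strings(dups)
	return dups
}

// balanceDrift compares each stored balance with SUM(delta) for that user
// and returns the non-zero differences (stored minus summed).
func balanceDrift(rows []LedgerRow, balances map[string]int) map[string]int {
	sums := make(map[string]int)
	for _, r := range rows {
		sums[r.UserID] += r.Delta
	}
	drift := make(map[string]int)
	for user, stored := range balances {
		if d := stored - sums[user]; d != 0 {
			drift[user] = d
		}
	}
	return drift
}
```

An empty result from both functions is the healthy case; any entry is the kind of anomaly the daily job would log.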
Want to inspect a charge yourself?
Sign in, then visit Dashboard usage for every ledger row in the last 30 days.