Agent Ergonomics Upgrade: Cost Estimates, Auto-Routing, Structured Errors, and /docs
Ten new patterns shipped across the Sats4AI API — every one of them designed to shave a decision or a round-trip off an agent's loop. Here's what changed and why each pattern exists.
April 13, 2026
TL;DR
- New /docs page — L402, MCP, and WebLN quickstarts side-by-side, plus every cross-cutting feature in one scroll.
- GET /api/estimate-cost — exact sat quote before you pay.
- { "model": "auto" } — let the router pick; check X-Route-Model.
- GET /api/error-codes — machine-readable catalog for error branching.
- Standardized async 202 shape: { status, job_id, poll_url, poll_interval_ms }.
- Opt-in HMAC-signed webhooks via callback_url + callback_id — skip the polling loop.
- Request dedup via X-Dedup header (best-effort, 30s window).
- CORS now exposes every protocol header for browser-side agents.
Why This Upgrade
When an agent hits a paid API for the first time, it has to answer four questions before it spends a sat: how much will this cost, which model should I use, what do I do if it fails, and how do I track the job if it's async? Historically, answering those required reading docs, inspecting response bodies, or just guessing.
This ship resolves all four at the protocol level. The patterns below are inspired partly by llm402.ai and partly by what we've seen orchestrators trip over in the wild. Every pattern is optional; no existing client breaks.
The 10 Patterns
1. /docs — one long page, everything in it
A single scroll with a sticky active-section sidebar. Quickstart shows L402, MCP, and WebLN flows side by side. Every other section on this list is documented there. The per-service /l402/* pages are still around for SEO; /docs is the “see everything at once” surface.
2. GET /api/estimate-cost — quote before you buy
No auth, no invoice, no side effects. Pass a service id and the relevant params, get an exact sat amount back:
$ curl 'https://sats4ai.com/api/estimate-cost?service=text-to-speech&model=omnivoice-global&chars=1500'
{
"service": "text-to-speech",
"amount_sats": 15,
"currency": "BTC",
"breakdown": { "type": "per-character", "chars": 1500, "chars_per_sat": 100, "model": "omnivoice-global" },
"error_code": "L402_ESTIMATE_ONLY"
}

Hit GET /api/estimate-cost with no params for the full catalog of supported services.
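Because the breakdown is structured, an agent can sanity-check a quote locally before paying. A minimal sketch in Python, assuming per-character pricing rounds up to the next whole sat (the API's exact rounding rule isn't stated in this post, so treat this as a heuristic):

```python
import math

def expected_sats(breakdown: dict) -> int:
    """Cross-check a per-character quote locally.

    Assumes any partial block of characters is billed as a full sat
    (ceiling division) -- an assumption, not documented behavior.
    """
    if breakdown.get("type") != "per-character":
        raise ValueError("only per-character breakdowns are handled here")
    return math.ceil(breakdown["chars"] / breakdown["chars_per_sat"])

# The quote from the example above: 1500 chars at 100 chars/sat -> 15 sats.
quote = {
    "amount_sats": 15,
    "breakdown": {"type": "per-character", "chars": 1500,
                  "chars_per_sat": 100, "model": "omnivoice-global"},
}
assert expected_sats(quote["breakdown"]) == quote["amount_sats"]
```

If the local figure and amount_sats disagree, the safe move is to trust the server's quote and log the mismatch.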
3. Auto-routing with "model": "auto"
Don't want to pick a model? Send "model": "auto" and the server picks the best default for the category (text / image / audio). The 402 response echoes the choice in X-Route-Model.
Gotcha: on the paid retry, send the concrete model id from X-Route-Model — not "auto" — to avoid price mismatches if the default changes between your quote and your retry.
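That retry rule fits in a tiny helper. A sketch (the helper name retry_body is ours, and the header value shown is illustrative):

```python
def retry_body(original_body: dict, response_headers: dict) -> dict:
    """Build the paid-retry body after a 402 on model: "auto".

    Pins the concrete model echoed in X-Route-Model so the retry is
    priced against the same model the quote was computed for.
    """
    routed = response_headers.get("X-Route-Model")
    if original_body.get("model") == "auto" and routed:
        return {**original_body, "model": routed}
    return original_body

body = {"model": "auto", "prompt": "hello"}
headers_402 = {"X-Route-Model": "gpt-oss-120b"}  # echoed by the 402 response
assert retry_body(body, headers_402)["model"] == "gpt-oss-120b"
```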
4. URL-path model selection
For clients that prefer URLs over JSON bodies:
POST /api/m/text-generation/gpt-oss-120b
Content-Type: application/json
{ "prompt": "..." }

Forwards to the normal service handler with the model injected. If both are present, the model in the request body wins.
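Building that path client-side is one line; a sketch (the helper is ours, not part of any SDK), with percent-encoding as a precaution for unusual identifiers:

```python
from urllib.parse import quote

def model_path(service: str, model: str) -> str:
    """Path for URL-based model selection: /api/m/<service>/<model>."""
    return f"/api/m/{quote(service, safe='')}/{quote(model, safe='')}"

assert model_path("text-generation", "gpt-oss-120b") == "/api/m/text-generation/gpt-oss-120b"
```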
5. Machine-readable error codes
Every post-payment error now carries an error_code agents can branch on, and the full catalog is public:
$ curl https://sats4ai.com/api/error-codes
{
"TIMEOUT": "Upstream provider timed out. Automatic refund issued.",
"CONTENT_FILTERED": "Content filtered by provider (e.g. NSFW, copyright).",
"RATE_LIMITED": "Provider rate limit hit. Retry later.",
"L402_REFUND_ISSUED": "Refund LNURL-withdraw link issued in response payload.",
"L402_AUTO_ROUTED": "Request was routed via model:\"auto\".",
...
}

6. Standardized async responses
Every long-running job (audiobook, video, voice call, etc.) returns the same 202 shape. One polling loop in your agent works for all of them:
HTTP/1.1 202 Accepted
X-Job-Id: abc123
X-Poll-Url: https://sats4ai.com/api/models/epub-audiobook/status?id=abc123
X-Poll-Interval-Ms: 3000
{
"status": "queued",
"job_id": "abc123",
"poll_url": "...",
"poll_interval_ms": 3000,
"estimated_completion_ms": 60000
}

6b. Opt-in webhooks (HeyGen-style)
Attach callback_url + callback_id to any async job and we'll POST your endpoint when it finishes — no polling loop required. Body is HMAC-signed (X-Sats4AI-Signature: sha256=...) with a per-job secret returned in the 202. URLs are SSRF-validated (HTTPS only, no loopback / RFC1918). callback_id is opaque — treat it as a correlation string, never as a place to put PII. Polling keeps working either way.
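On the receiving end, verify the signature against the raw request body before trusting the payload. A Python sketch, assuming the scheme is "sha256=" followed by the hex HMAC-SHA256 of the raw body under the per-job secret (the post names the header and hash but not the exact signing input, so confirm against /docs):

```python
import hashlib
import hmac

def verify_webhook(raw_body: bytes, signature_header: str, secret: bytes) -> bool:
    """Check X-Sats4AI-Signature ("sha256=<hex>") against the raw body.

    Assumes hex(HMAC-SHA256(secret, raw_body)) as the signing scheme.
    """
    if not signature_header.startswith("sha256="):
        return False
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid leaking match position via timing.
    return hmac.compare_digest(signature_header[len("sha256="):], expected)
```

Verify before parsing the JSON, and use the raw bytes as received; re-serializing the parsed body will generally produce a different byte string and a false mismatch.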
# Submit with a webhook
curl -X POST https://sats4ai.com/api/models/epub-audiobook \
-H "Authorization: L402 ..." \
-F "file=@book.epub" -F "voice=Ashley" \
-F "callback_url=https://your-app.example.com/hooks/sats4ai" \
-F "callback_id=user-42-job-7"
# Later, your endpoint receives:
POST /hooks/sats4ai
X-Sats4AI-Signature: sha256=<hex>
{
"job_id": "abc123",
"callback_id": "user-42-job-7",
"status": "completed",
"result_url": "https://sats4ai.com/uploads/book-audiobook.m4b",
"timestamp": "2026-04-13T..."
}

7. Request dedup (best-effort)
Identical POSTs from the same IP within 30 seconds hit an in-memory cache and return the prior response with X-Dedup: hit. Useful when an agent retries on a transient error. Send X-No-Cache: true to bypass. This is explicitly best-effort — don't rely on it for correctness, only for avoiding double-charges on quick retries.
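To reason about when dedup will and won't help, here is a sketch of the semantics described above: an in-memory cache with a 30-second TTL. The exact cache key is our assumption (client IP plus path plus a hash of the body); the post only guarantees "identical POSTs from the same IP within 30 seconds":

```python
import hashlib
import time

DEDUP_WINDOW_S = 30
_cache: dict[str, tuple[float, object]] = {}

def dedup_key(client_ip: str, path: str, body: bytes) -> str:
    # Hypothetical keying: the real implementation may differ.
    return f"{client_ip}:{path}:{hashlib.sha256(body).hexdigest()}"

def check_dedup(key: str, now=time.monotonic):
    """Return the cached response (a dedup hit) or None (a miss)."""
    entry = _cache.get(key)
    if entry and now() - entry[0] < DEDUP_WINDOW_S:
        return entry[1]
    return None

def store(key: str, response, now=time.monotonic) -> None:
    _cache[key] = (now(), response)
```

The cache being in-memory is exactly why this is best-effort: a restart or a second server instance empties or bypasses it, so a retrying agent should still be prepared for the occasional duplicate charge.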
8. CORS for browser agents
Every L402 response now sets Access-Control-Expose-Headers covering WWW-Authenticate, Payment-Receipt, X-Route-Model, X-Job-Id, X-Dedup, and X-Error-Code. Browser-side WebLN agents can now read everything they need.
9. Public macaroon caveat docs
The macaroon caveats section of /docs spells out exactly what we verify (RequestPath, ExpiresAt, model tier) — no more guessing whether your proof-of-payment is still valid for the route you're calling.
10. Discovery manifests advertise all of the above
/.well-known/l402-services bumped to v0.3.0 with error_codes_url, estimate_cost_url, docs_url, and a features object. /.well-known/mcp and /api/mcp/discovery mirror the same metadata. Agents discovering us for the first time see the full capability set upfront.
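First contact for an agent might look like: fetch the manifest, then feature-detect before choosing a flow. A sketch; the field names follow this post, but the internal shape of the features object is an assumption (treated here as a map of feature name to flag):

```python
def supports(manifest: dict, feature: str) -> bool:
    """Feature-detect against a /.well-known/l402-services manifest."""
    return bool(manifest.get("features", {}).get(feature))

manifest = {
    "version": "0.3.0",
    "estimate_cost_url": "https://sats4ai.com/api/estimate-cost",
    "error_codes_url": "https://sats4ai.com/api/error-codes",
    "docs_url": "https://sats4ai.com/docs",
    "features": {"auto_routing": True, "webhooks": True},  # hypothetical flags
}
assert supports(manifest, "webhooks")
assert not supports(manifest, "clairvoyance")
```

Branching on advertised features rather than the version string keeps an agent working against both older and newer manifests.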
Nothing breaks
Every pattern above is opt-in. Existing clients that don't send model: "auto", don't read X-Route-Model, don't hit /api/estimate-cost, and don't parse error_code keep working exactly as before. The response shapes for paid success paths are unchanged.
Read the full protocol docs
L402, MCP, WebLN quickstarts + every feature from this post in one scroll.