OCR & Text Extraction

How to Extract Text from PDFs with Bitcoin Lightning

Upload a scanned PDF or image, pay 10 sats per page, get clean Markdown back in seconds. No account, no credit card, files never stored.

10 sats/page · No account required · 30+ languages + handwriting · L402 / MCP / OpenClaw

For Humans

Use cases for individuals, researchers, and professionals

Digitize scanned documents

Turn scanned contracts, receipts, old papers, and forms into editable, searchable text. The AI reads printed text accurately even from low-quality scans — no need for a flatbed scanner or specialized software.

Sensitive documents, privately

Legal contracts, medical records, and financial statements shouldn't be uploaded to services tied to your identity. Here there's no account, no email, and files are never stored. You pay a Lightning invoice, so the job is not linked to a card or a registered user.

Research and study

Convert scanned textbooks, academic papers, and archived articles into text you can search, highlight, and copy. Then feed it straight into AI Chat to summarize, translate, or ask questions about the content.

Accessible documents

Extract text from a printed document, then convert it to spoken audio with text-to-speech. Turn any physical document — instructions, menus, letters — into a listenable format.

AI OCR Engine

AI OCR

10 sats/page: ~$0.01 per page
30+ languages: incl. Arabic, CJK, Cyrillic
Handwriting: notes, forms, letters
Tables: preserved as Markdown
6 formats: PDF, PNG, JPEG, TIFF, WebP, BMP

How It Works

1. Upload your document

Go to sats4ai.com/ocr. Upload a PDF or image file (PNG, JPEG, TIFF, WebP, BMP) by clicking the upload area or dragging and dropping.

2. Pay the Lightning invoice

Click Extract Text. A QR code appears with an invoice priced at 10 sats × number of pages. Scan it with any Lightning wallet, such as Phoenix, Breez, Wallet of Satoshi, or a WebLN browser wallet.

3. Get your Markdown

The AI processes your document and returns clean Markdown with headings, lists, tables, and paragraphs matching the original layout. Copy or download directly from the page.

For Agents

Use cases for AI agents, automation pipelines, and developers

No API key needed. The OCR endpoint uses L402 — pay-per-page via Bitcoin Lightning. Agents send a document, receive an invoice sized to the page count, pay it, and get Markdown back. Process one page or ten thousand, paying only for what you use.

Document ingestion pipelines

Automatically process incoming scanned documents (invoices, purchase orders, applications), extracting structured text for downstream database storage or workflow routing. Pay per page, scale with volume.

RAG knowledge base building

Extract text from large PDF libraries to build retrieval-augmented generation (RAG) corpora. Feed the Markdown output directly into a vector store for semantic search over document collections.
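
For the RAG use case above, a vector store usually works better with short passages than with one long file. The sketch below is one possible preprocessing step, not something the service provides: it splits a returned Markdown file into one chunk per heading using awk. The function name split_markdown and the chunk_NNNN.md file naming are hypothetical, and the heading test is deliberately naive (a line starting with "# " inside a code sample would also start a chunk).

```shell
# Split a Markdown file into one file per heading section.
# Assumption: headings are ATX style ("# Title", "## Title", ...).
# $1 = Markdown file returned by OCR, $2 = output directory for chunks
split_markdown() {
  local src="$1"
  local out="$2"
  mkdir -p "$out" || return 1
  awk -v out="$out" '
    BEGIN { file = sprintf("%s/chunk_%04d.md", out, 0) }
    /^#+ / {
      # A heading line closes the current chunk and starts the next one.
      close(file)
      n++
      file = sprintf("%s/chunk_%04d.md", out, n)
    }
    { print > file }
  ' "$src"
}

# Hypothetical usage:
#   split_markdown contract.md chunks/
#   ls chunks/
```

Text that appears before the first heading lands in chunk_0000.md. Each chunk file can then be embedded and stored with whatever vector store you use.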

Legal and compliance review

Extract text from scanned contracts, filings, and regulatory documents for automated review. Chain with AI Chat to flag clauses, extract key dates, or produce structured summaries without manual reading.

Archive digitization

Batch-process thousands of scanned historical documents or records. Because you pay only for the pages you process, the pay-per-page model can be more economical than a subscription for large one-time digitization projects.
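
Before starting a large batch, it can help to estimate the total you will be invoiced. The sketch below is illustrative only: it assumes the flat rate of 10 sats per page quoted on this page, and estimate_sats and estimate_batch_sats are hypothetical helper names, not part of any sats4ai tooling. The invoice returned by the server remains the authoritative price.

```shell
# Assumption: flat rate of 10 sats per page, as quoted on this page.
SATS_PER_PAGE=10

# Print the expected invoice amount in sats for one document.
# $1 = page count (whole number, no leading zeros)
estimate_sats() {
  local pages="$1"
  case "$pages" in
    ''|*[!0-9]*|0[0-9]*)
      echo "page count must be a whole number such as 0, 1, or 12" >&2
      return 1
      ;;
  esac
  echo $(( pages * SATS_PER_PAGE ))
}

# Sum the expected cost over several documents.
# Arguments: one page count per document.
estimate_batch_sats() {
  local total=0
  local pages
  local cost
  for pages in "$@"; do
    cost=$(estimate_sats "$pages") || return 1
    total=$(( total + cost ))
  done
  echo "$total"
}

# Example: three documents of 3, 40, and 7 pages.
#   estimate_batch_sats 3 40 7    # prints 500
```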

L402 Authentication Flow

1. Send the request without auth. POST the document to the endpoint. The server responds with HTTP 402 and a Lightning invoice priced by page count.
2. Pay the invoice. The agent pays the Lightning invoice using a connected wallet. Payment typically settles within seconds.
3. Resend with payment proof. Resend the same request with the token and the payment preimage in the Authorization header. The server returns the extracted text as Markdown.
terminal
# Step 1: Send request (returns 402 + invoice)
curl -X POST https://sats4ai.com/api/l402/extract-document \
  -H "Content-Type: application/json" \
  -d '{"document":"data:application/pdf;base64,<BASE64_PDF>"}'

# Step 2: Pay the Lightning invoice (10 sats × pages)

# Step 3: Resend with payment proof
curl -X POST https://sats4ai.com/api/l402/extract-document \
  -H "Content-Type: application/json" \
  -H "Authorization: L402 <token>:<preimage>" \
  -d '{"document":"data:application/pdf;base64,<BASE64_PDF>"}'
# Returns: clean Markdown text
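
The commands above leave two practical details open: producing the base64 data URI for a local file, and reading the token and invoice out of the 402 response. The helpers below are a minimal sketch of both. One loud assumption: they expect the challenge in a WWW-Authenticate header of the form L402 macaroon="...", invoice="...", which is the common L402 convention. This page does not confirm the header or field names for this endpoint, so inspect a real 402 response before relying on them. All function names are hypothetical.

```shell
# Build the JSON body for a local PDF:
#   {"document":"data:application/pdf;base64,<...>"}
# $1 = path to the PDF
build_payload() {
  local file="$1"
  # tr removes the line wraps that some base64 tools insert.
  printf '{"document":"data:application/pdf;base64,%s"}' \
    "$(base64 < "$file" | tr -d '\n')"
}

# Read one quoted field from the challenge header.
# ASSUMED header shape: WWW-Authenticate: L402 macaroon="...", invoice="..."
# $1 = field name; response headers are read from stdin.
challenge_field() {
  tr -d '\r' \
    | grep -i '^www-authenticate:' \
    | sed -n "s/.*[ ,]$1=\"\([^\"]*\)\".*/\1/p" \
    | head -n 1
}

# Format the header shown in step 3: Authorization: L402 <token>:<preimage>
# $1 = token, $2 = payment preimage
l402_auth_header() {
  printf 'Authorization: L402 %s:%s' "$1" "$2"
}

# Hypothetical usage (wallet step omitted, it depends on your wallet):
#   URL=https://sats4ai.com/api/l402/extract-document
#   build_payload scan.pdf > payload.json
#   headers=$(curl -s -o /dev/null -D - -X POST "$URL" \
#     -H "Content-Type: application/json" --data-binary @payload.json)
#   token=$(printf '%s\n' "$headers" | challenge_field macaroon)
#   invoice=$(printf '%s\n' "$headers" | challenge_field invoice)
#   # pay "$invoice", then set preimage to the value your wallet returns
#   curl -s -X POST "$URL" -H "Content-Type: application/json" \
#     -H "$(l402_auth_header "$token" "$preimage")" \
#     --data-binary @payload.json
```

Writing the payload to a file and sending it with --data-binary @payload.json keeps large base64 strings off the command line.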

MCP & OpenClaw

Chain with Other Services

Extract text from a document, then ask the AI to summarize it, translate it, extract key clauses, or answer questions about its content.

Extract text from a printed document, then convert it to audio. Turn any physical document into a listenable format for accessibility.

Convert an unsupported image format (such as HEIC) to JPEG or PDF first, then run OCR on the result.

Extract text from scanned pages, then combine with other PDF processing — merge, convert, or repackage the resulting documents.

Extract Text from PDFs — No Signup Required

10 sats per page. Upload any scanned document and get clean Markdown in seconds. Files are never stored. All you need is a Lightning wallet.