How to Extract Text from PDFs with Bitcoin Lightning (OCR)

The Problem with OCR Today

You have a scanned PDF, a photo of a document, or an old paper you need digitized. The options are not great. Free online OCR tools often butcher formatting and struggle with anything beyond simple text. ABBYY FineReader costs $200+ for a license. Adobe Acrobat runs $13-23/month. Tesseract is free but requires technical setup, and its results vary with scan quality.

Google Docs can do basic OCR, but it requires a Google account and uploads your files to Google's cloud. For sensitive documents — legal contracts, medical records, financial statements — you may not want them sitting on someone else's servers tied to your identity.

What Sats4AI Offers

Sats4AI provides AI-powered OCR for 10 sats per page (roughly $0.01) using Mistral OCR (mistral-ocr-latest). Upload any document — PDF, PNG, JPEG, TIFF, WebP, or BMP — and get clean Markdown output with tables, headings, and document structure preserved.

Files are processed and never stored. No account required. No credit card. Supports 30+ languages and handwriting recognition. Pay with Bitcoin over the Lightning Network and get your extracted text in seconds.

Who This Is For

  • Anyone with scanned documents — Digitize old papers, receipts, forms. Make them searchable and editable.
  • Legal and compliance teams — Extract text from scanned contracts, court filings, regulatory documents. Privacy-preserving: no cloud storage, no account linking your identity to the documents you process.
  • Students and researchers — Convert scanned textbooks or academic papers to text for note-taking, citation, or analysis.
  • Businesses — Digitize paper invoices, forms, handwritten notes. Feed extracted text into databases or workflows.
  • Developers — Build OCR into your pipeline. Process thousands of pages programmatically at $0.01/page via the L402 API.

What It Actually Costs

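|               | Sats4AI               | ABBYY FineReader | Adobe Acrobat  | Google Cloud Vision  |
|---------------|-----------------------|------------------|----------------|----------------------|
| Price         | 10 sats/page (~$0.01) | $200+ one-time   | $13-23/month   | $1.50/1000 pages     |
| Signup        | None                  | License key      | Account + CC   | Google Cloud account |
| Privacy       | Files never stored    | Local processing | Cloud stored   | Cloud stored         |
| Output format | Markdown              | Multiple         | PDF text layer | JSON                 |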
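
To see how these prices compare at a given volume, the arithmetic can be sketched in Python. The figures are copied from the comparison above. Taking the low end of "$200+" and "$13-23/month", and treating 10 sats as a flat $0.01, are simplifying assumptions. The function name is illustrative only.

```python
# Prices from the comparison above. Lower bounds are assumed where the
# source gives "+" or a range; 10 sats is treated as a flat $0.01.
SATS4AI_USD_PER_PAGE = 0.01
ABBYY_LICENSE_USD = 200.0          # "$200+" one-time
ADOBE_USD_PER_MONTH = 13.0         # low end of "$13-23/month"
GOOGLE_USD_PER_1000_PAGES = 1.50


def cost_comparison(pages: int, months: int = 1) -> dict[str, float]:
    """Rough USD cost of OCR-ing `pages` pages over `months` months."""
    return {
        "Sats4AI": round(pages * SATS4AI_USD_PER_PAGE, 2),
        "ABBYY FineReader": ABBYY_LICENSE_USD,
        "Adobe Acrobat": round(months * ADOBE_USD_PER_MONTH, 2),
        "Google Cloud Vision": round(pages / 1000 * GOOGLE_USD_PER_1000_PAGES, 2),
    }


if __name__ == "__main__":
    for name, usd in cost_comparison(pages=500).items():
        print(f"{name}: ${usd:.2f}")
```

The sketch compares price only. It ignores free tiers, signup requirements, and the privacy differences listed in the other rows.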

OCR Capabilities

  • Scanned documents — Extract text from scanned PDFs, photographed pages, and image files. The AI reads printed text with high accuracy even from low-quality scans.
  • Table preservation — Tables are extracted and formatted as Markdown tables, maintaining rows, columns, and alignment.
  • Handwriting recognition — The model reads handwritten text including notes, forms, and letters across a variety of handwriting styles.
  • 30+ languages — Supports text extraction in over 30 languages including English, Spanish, French, German, Chinese, Japanese, Arabic, and many more.
  • Clean Markdown output — Results are delivered as structured Markdown with headings, lists, tables, and paragraphs preserved from the original document layout.
  • Multiple formats — Accepts PDF, PNG, JPEG, TIFF, WebP, and BMP files.
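
Because tables come back as Markdown, they are straightforward to turn into rows for a database or spreadsheet. The sketch below assumes the common pipe-delimited table layout (header row, dashed separator row, then data rows). The exact shape of the service's output is not specified here, so treat the sample input as hypothetical.

```python
def parse_markdown_table(table_text: str) -> list[dict[str, str]]:
    """Turn one pipe-delimited Markdown table into a list of row dicts.

    Assumes every table line starts and ends with "|" and that cells do
    not contain escaped pipes.
    """
    rows: list[list[str]] = []
    for line in table_text.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")) or len(line) < 2:
            continue  # not part of the table
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        # Skip the |---|:--:| separator row under the header.
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue
        rows.append(cells)
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


if __name__ == "__main__":
    sample = "| Item | Qty |\n|---|---:|\n| Paper | 2 |\n| Ink | 1 |"
    for row in parse_markdown_table(sample):
        print(row)
```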

Step-by-Step Guide

  1. Go to sats4ai.com/ocr.
  2. Upload your PDF file or image by clicking the upload area or dragging and dropping your file.
  3. Click Extract Text. A Lightning invoice will appear as a QR code. The price is calculated based on the number of pages (10 sats per page).
  4. Scan the QR code with any Lightning wallet, such as Phoenix, Mutiny, Alby, Zeus, or Wallet of Satoshi, and pay the invoice.
  5. Once the payment settles, the AI processes your document and displays the extracted text in Markdown format. Tables, headings, and structure are preserved.
  6. Copy the extracted text or download it directly from the page. Use it in your documents, spreadsheets, or any other application.
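
The invoice amount in step 3 is simple arithmetic. A minimal sketch, assuming the invoice is exactly 10 sats times the page count with no minimum or rounding (only the per-page rate is stated), and using a BTC/USD rate you supply yourself:

```python
SATS_PER_PAGE = 10            # per-page rate stated in this guide
SATS_PER_BTC = 100_000_000    # definition of a satoshi


def ocr_price_sats(pages: int) -> int:
    """Expected invoice amount in sats for a document of `pages` pages."""
    if pages < 1:
        raise ValueError("a document has at least one page")
    return pages * SATS_PER_PAGE


def sats_to_usd(sats: int, btc_usd: float) -> float:
    """Convert sats to USD at the given BTC/USD exchange rate."""
    return sats / SATS_PER_BTC * btc_usd


if __name__ == "__main__":
    sats = ocr_price_sats(12)
    # The exchange rate is a placeholder; look up the current one.
    print(sats, "sats is about", round(sats_to_usd(sats, 100_000.0), 4), "USD")
```

The "roughly $0.01" figure holds when one bitcoin trades near $100,000. The dollar cost per page moves with the exchange rate while the price in sats stays fixed.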

Automate from Code

The web interface works for one-off documents, but if you need to process documents programmatically, Sats4AI offers three integration paths.

L402 API

The L402 API lets you extract text from documents using any HTTP client. Send a request, receive a Lightning invoice in the 402 response, pay it, and resubmit with the payment preimage. Standard HTTP flow, no SDK required. Works with curl, Python, Node.js, or any language that can make HTTP requests.
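
In the L402 protocol, the 402 response carries its challenge in a WWW-Authenticate header holding a macaroon and a Lightning invoice, and the retry sends the macaroon and payment preimage back in an Authorization header. The sketch below handles just those two headers. Whether Sats4AI's responses match this shape exactly is an assumption, and the sample values are made up, so check the service's API documentation before relying on it.

```python
import re


def parse_l402_challenge(header: str) -> dict[str, str]:
    """Extract the macaroon and invoice from a WWW-Authenticate value.

    Expected shape (per the L402 protocol):
        L402 macaroon="<base64>", invoice="<bolt11 invoice>"
    """
    scheme, _, params = header.partition(" ")
    if scheme not in ("L402", "LSAT"):  # LSAT is the protocol's older name
        raise ValueError(f"not an L402 challenge: {scheme!r}")
    fields = dict(re.findall(r'(\w+)="([^"]*)"', params))
    missing = {"macaroon", "invoice"} - fields.keys()
    if missing:
        raise ValueError(f"challenge is missing {sorted(missing)}")
    return fields


def build_l402_authorization(macaroon: str, preimage_hex: str) -> str:
    """Build the Authorization value for the retried request."""
    return f"L402 {macaroon}:{preimage_hex}"


if __name__ == "__main__":
    # Made-up header value for illustration only.
    challenge = 'L402 macaroon="AgEEbHNhdA==", invoice="lnbc100n1example"'
    fields = parse_l402_challenge(challenge)
    print(fields["invoice"])
    print(build_l402_authorization(fields["macaroon"], "00ff"))
```

The preimage comes from your Lightning wallet or node once the invoice is paid. It serves as proof of payment when you resubmit the request.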

MCP Server

The MCP server integrates directly with AI assistants like Claude. Your AI assistant can process documents on your behalf with automatic Lightning payments. Point it at a scanned PDF and it handles the upload, payment, and text extraction without manual intervention.

OpenClaw

OpenClaw is an open-source Python and TypeScript client for the L402 protocol. It manages the payment flow automatically — request, pay, retry — so you can call Sats4AI endpoints like a normal API. Useful for building batch OCR pipelines or integrating document processing into existing applications.
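
The request, pay, retry loop that such a client automates can be sketched generically. This is not OpenClaw's API. `call_with_l402`, `send`, and `pay_invoice` are hypothetical names, and the transport and the wallet are passed in as plain functions so the pattern stays independent of any particular HTTP library or Lightning backend.

```python
import re
from typing import Callable, Mapping

# (status code, response headers, response body)
Response = tuple[int, Mapping[str, str], bytes]


def call_with_l402(
    send: Callable[[dict[str, str]], Response],
    pay_invoice: Callable[[str], str],
) -> bytes:
    """Send a request; on HTTP 402, pay the invoice and retry once."""
    status, headers, body = send({})
    if status == 200:
        return body  # the endpoint did not ask for payment
    if status != 402:
        raise RuntimeError(f"unexpected HTTP {status}")

    # Header lookup is case-sensitive here; real HTTP libraries usually are not.
    challenge = headers.get("WWW-Authenticate", "")
    fields = dict(re.findall(r'(\w+)="([^"]*)"', challenge))
    if "macaroon" not in fields or "invoice" not in fields:
        raise RuntimeError("402 response carried no usable L402 challenge")

    # The caller's wallet pays the invoice and returns the preimage as hex.
    preimage_hex = pay_invoice(fields["invoice"])
    auth = f"L402 {fields['macaroon']}:{preimage_hex}"

    status, _, body = send({"Authorization": auth})
    if status != 200:
        raise RuntimeError(f"retry failed with HTTP {status}")
    return body
```

In real use, `send` would wrap an HTTP POST of the document to the OCR endpoint with the extra headers applied, and `pay_invoice` would call whatever wallet or node API you have available. A batch pipeline would call this once per document.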

Combine with Other Services

  • OCR + AI Chat — Extract text from a document, then ask AI questions about it, summarize it, or translate it. Useful for quickly digesting long legal or technical documents.
  • OCR + Text-to-Speech — Extract text from a document, then convert it to spoken audio. Accessibility use case: turn any printed document into a listenable format.
  • OCR + File Conversion — Convert an image format first if needed (e.g., HEIC to JPEG), then run OCR to extract the text.
  • OCR + PDF Convert — For PDFs that are partially text and partially scanned images, use PDF conversion alongside OCR to capture everything.
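
When chaining conversion and OCR, a quick pre-flight check can decide whether a file needs converting first. The sketch below maps the supported formats listed in this guide to common file extensions. That mapping is an assumption, since the service may inspect file contents instead of names.

```python
from pathlib import Path

# Extensions assumed for PDF, PNG, JPEG, TIFF, WebP, and BMP.
SUPPORTED_SUFFIXES = {
    ".pdf", ".png", ".jpg", ".jpeg", ".tif", ".tiff", ".webp", ".bmp",
}


def needs_conversion(filename: str) -> bool:
    """True if the file's extension is not one of the accepted formats."""
    return Path(filename).suffix.lower() not in SUPPORTED_SUFFIXES


if __name__ == "__main__":
    for name in ("IMG_0001.HEIC", "contract.pdf", "receipt.JPG"):
        action = "convert first" if needs_conversion(name) else "ready for OCR"
        print(f"{name}: {action}")
```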

Why Pay with Bitcoin

  • Privacy — No account, no email, no credit card. Especially important when processing sensitive legal, medical, or financial documents. Nothing ties you to the documents you process.
  • Micropayments — 10 sats per page is a natural fit for Lightning micropayments. Process one page or a thousand. Pay only for what you use, with no minimums or monthly fees.
  • No gatekeepers — No payment processor can decide your OCR request is “suspicious” and freeze your account. Bitcoin is permissionless.
  • Instant settlement — Lightning payments confirm in milliseconds. No pending charges, no holds, no chargebacks.

Quick Reference

| Model             | Mistral OCR (mistral-ocr-latest) |
| Price             | 10 sats/page (~$0.01)            |
| Supported formats | PDF, PNG, JPEG, TIFF, WebP, BMP  |
| Output            | Structured Markdown              |
| Languages         | 30+                              |
| Signup            | None                             |
| Payment           | Bitcoin Lightning                |

Try It Now — No Signup Required

Extract text from your first document in under a minute. All you need is a Lightning wallet.