Turn scans into searchable, machine-readable text
A hosted REST API that extracts full text from PDFs, TIFFs, and images and returns a searchable PDF, with auto-deskew, orientation correction, and Indic and Arabic script support built in.
Full text out, searchable PDF back
Send a document, get clean extracted text plus a fully searchable PDF copy of the original.
Extraction that handles real-world scans
Upload PDF, TIFF, JPG, PNG, or raw scanned images — single page or multi-page. Pre-processing cleans the page before recognition, so skewed, rotated, or low-contrast scans still read accurately.
- Inputs: PDF, TIFF, JPG, PNG, and scanned images, single or multi-page
- Auto-deskew, orientation correction, denoise, and contrast normalization
- Multi-language recognition including Hindi, Tamil, Telugu, and Arabic
- Searchable PDF output — text layer over the original page image
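As a rough sketch of what sending a document might look like, the Python below builds (but does not send) an upload request using only the standard library. The endpoint URL, the `languages` query parameter, the language codes, and the bearer-token header are all placeholders assumed for illustration; the real values come from the API reference, not from this page.

```python
import mimetypes
import urllib.parse
import urllib.request

# Placeholder endpoint: an assumption for illustration, not the documented URL.
API_URL = "https://api.example.com/v1/ocr/full-text"


def build_ocr_request(filename, data, api_key, languages=("hin", "ara")):
    """Build a POST request carrying one document as the raw request body.

    The `languages` parameter and its codes are hypothetical names.
    """
    # Guess the MIME type from the file extension, with a generic fallback.
    content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    query = urllib.parse.urlencode({"languages": ",".join(languages)})
    return urllib.request.Request(
        f"{API_URL}?{query}",
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": content_type,
        },
    )


if __name__ == "__main__":
    req = build_ocr_request("scan.pdf", b"%PDF-1.7 ...", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out here because the endpoint is a placeholder.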
Where OCR Full-Text fits
From archive digitization to the first step of a structured extraction pipeline.
Digitize, search, and feed downstream
Make decades of paper usable: convert physical archives into text you can index, search, and process. Use the searchable PDF as a drop-in replacement for image-only scans, then hand the extracted text to a structured-extraction step when you need named fields.
- Digitize paper archives into indexable, searchable records
- Add a text layer so existing scans become full-text searchable
- Feed clean text into Extract Basic for named-field extraction
Rs 0.25 per page
about $0.003 / page — flat, pay-as-you-go. No surge pricing, no tiered overages, no per-call ceiling.
Start with free credit
Every organisation gets a one-time Rs 500 credit on the free trial, with no card required: enough for 2,000 OCR pages at Rs 0.25 per page. You pay only for the pages you process, and the credit doesn't expire while the account is active. Pay in INR via Razorpay in India, or in USD via Stripe globally.
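The pricing arithmetic can be checked with a short Python sketch. It uses only the two figures quoted on this page (Rs 0.25 per page and the one-time Rs 500 credit); the function names are illustrative, and taxes or payment-gateway fees, if any, are not modelled.

```python
from decimal import Decimal

PRICE_PER_PAGE_INR = Decimal("0.25")  # flat per-page rate quoted above
FREE_CREDIT_INR = Decimal("500")      # one-time trial credit quoted above


def pages_covered_by_credit(credit=FREE_CREDIT_INR, price=PRICE_PER_PAGE_INR):
    """Number of whole pages the credit pays for at the flat rate."""
    return int(credit // price)


def amount_due(pages, credit=FREE_CREDIT_INR, price=PRICE_PER_PAGE_INR):
    """Rupees owed for `pages` pages once the credit is used up."""
    total = price * pages
    return max(total - credit, Decimal("0"))


if __name__ == "__main__":
    print(pages_covered_by_credit())  # 2000
    print(amount_due(10_000))         # 2000.00
```

`Decimal` is used instead of floats so the rupee amounts stay exact.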
Everything you need in one call
The same platform foundation ships with every Abscode Document AI API.
Broad input support
PDF, TIFF, JPG, PNG, and scanned images — single or multi-page, processed in one request.
Structured JSON
Responses return extracted text with bounding boxes and confidence scores, so they are easy to parse and audit.
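To illustrate how such a response could be audited, here is a minimal Python sketch. The JSON shape (`pages`, `words`, `text`, `bbox`, `confidence`) and the sample values are assumptions made up for this example; the actual field names may differ, so treat it as a pattern rather than the API's schema.

```python
import json

# Hypothetical response body: field names and values are illustrative only.
SAMPLE_RESPONSE = """
{
  "pages": [
    {
      "page": 1,
      "words": [
        {"text": "Invoice", "bbox": [72, 90, 160, 112], "confidence": 0.98},
        {"text": "N0.", "bbox": [168, 90, 200, 112], "confidence": 0.61},
        {"text": "4471", "bbox": [206, 90, 250, 112], "confidence": 0.95}
      ]
    }
  ]
}
"""


def page_text(response, page_number):
    """Join the recognized words of one page into a single string."""
    for page in response["pages"]:
        if page["page"] == page_number:
            return " ".join(word["text"] for word in page["words"])
    raise KeyError(f"page {page_number} not in response")


def low_confidence_words(response, threshold=0.8):
    """Return (page, text, bbox) for every word scored below the threshold."""
    flagged = []
    for page in response["pages"]:
        for word in page["words"]:
            if word["confidence"] < threshold:
                flagged.append((page["page"], word["text"], word["bbox"]))
    return flagged


if __name__ == "__main__":
    response = json.loads(SAMPLE_RESPONSE)
    print(page_text(response, 1))
    print(low_confidence_words(response))
```

Flagged words keep their bounding boxes, so a reviewer can jump straight to the region of the page that needs a second look.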
Sync or async
Synchronous responses for small docs, or async with a webhook callback for large batches.
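One way a client might choose between the two modes is sketched below in Python. The 10-page cutoff, the `mode` and `webhook_url` option names, and the example URL are all hypothetical; this page does not state where the sync/async boundary lies or how the callback is configured.

```python
SYNC_PAGE_LIMIT = 10  # hypothetical cutoff; the real limit is not stated here


def build_request_options(page_count, webhook_url=None,
                          sync_page_limit=SYNC_PAGE_LIMIT):
    """Pick sync for small documents and async-with-webhook for large ones.

    The returned keys ("mode", "webhook_url") are illustrative names only.
    """
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    if page_count <= sync_page_limit:
        return {"mode": "sync"}
    if webhook_url is None:
        raise ValueError("async requests need a webhook URL for the callback")
    return {"mode": "async", "webhook_url": webhook_url}


if __name__ == "__main__":
    print(build_request_options(3))
    print(build_request_options(250, "https://example.com/hooks/ocr"))
```

Failing early when a large job has no callback URL avoids submitting a batch whose results would have nowhere to go.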
Secure by default
TLS 1.3 in transit, AES-256 at rest, and documents auto-purged after processing. We never train on your data.