Image to Text (OCR)

Extract text from photos, screenshots, and scanned documents — 13 languages, runs entirely in your browser.

Drop or tap to extract text

PNG, JPG, WebP, BMP, GIF, TIFF

📄 Scanned documents
📸 Photos of text
🖥️ Screenshots
📋 Handwritten notes
📊 Whiteboards
Your image never leaves your device. OCR runs entirely in your browser, with no uploads, so your files stay private.

Run Tesseract OCR Entirely In The Browser

Image OCR runs Tesseract.js v5, a WebAssembly port of the open-source Tesseract engine that was originally developed at HP and later open-sourced and developed at Google. The library is loaded from the jsDelivr CDN on first use; the WebAssembly binary and the language data file for your selected language are cached by the browser, so subsequent runs start almost instantly.
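The load-once, cache-afterwards behaviour described above is a standard memoized-loader pattern. This is an illustrative sketch, not the tool's actual code; `loadEngine` stands in for the real jsDelivr fetch and WebAssembly instantiation.

```javascript
// Lazy-load-and-cache sketch: the engine is fetched only on the
// first call, and every later call reuses the same in-flight or
// resolved promise, so repeat runs start almost instantly.
function makeLazyLoader(loadEngine) {
  let cached = null; // memoized promise shared by all callers
  return function getEngine() {
    if (cached === null) {
      cached = loadEngine(); // first call triggers the real fetch
    }
    return cached; // later calls return the cached promise
  };
}
```

Because the promise itself is cached, concurrent callers during the initial download all await the same fetch rather than triggering duplicates.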
Tesseract uses an LSTM-based recurrent neural network as its core recognition model, with a separate page layout analyser that finds text blocks, lines, and word boundaries before character recognition begins. The combination handles multi-column text, mixed font sizes, and rotated lines reasonably well — better than many older template-matching OCR engines.
Recognition runs in a Web Worker so the UI stays responsive. Your image is decoded into a Canvas, the pixel buffer is handed to the Tesseract worker, and progress is reported back as a percentage from layout analysis through final recognition. A typical 1080p screenshot finishes in two to five seconds on a desktop; a full A4 scan can take ten to twenty seconds on a phone.
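Folding the worker's per-stage progress events into one percentage for the UI can be sketched as below. The stage names and weights here are illustrative assumptions, not the exact status strings or proportions Tesseract.js emits.

```javascript
// Assumed stages and weights for combining per-stage progress into
// a single 0-100 figure for the progress bar.
const STAGES = [
  { name: 'loading engine',   weight: 0.2 },
  { name: 'analysing layout', weight: 0.2 },
  { name: 'recognising text', weight: 0.6 },
];

function overallProgress(stageIndex, stageProgress) {
  // Weight of all completed stages, plus the fraction of the
  // current stage, scaled to a whole-number percentage.
  let done = 0;
  for (let i = 0; i < stageIndex; i++) done += STAGES[i].weight;
  done += STAGES[stageIndex].weight * stageProgress;
  return Math.round(done * 100);
}
```

Weighting recognition most heavily matches what users experience: the final recognition pass dominates total runtime.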
Thirteen languages are supported in this build: English, Spanish, French, German, Italian, Portuguese, Dutch, Russian, Simplified Chinese, Japanese, Korean, Arabic, and Hindi. Each language has its own data file (the English file is roughly 5 MB; CJK languages are larger). Select the right language before running OCR — using English on a French document drops accuracy noticeably because the language model influences character disambiguation.
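The thirteen languages map onto Tesseract's standard traineddata codes, which is what the worker needs when loading a language file. A minimal sketch of that mapping, with a helper to validate a selection before starting OCR:

```javascript
// The thirteen supported languages keyed by Tesseract's
// traineddata language codes.
const LANGUAGES = {
  eng: 'English',   spa: 'Spanish',  fra: 'French',
  deu: 'German',    ita: 'Italian',  por: 'Portuguese',
  nld: 'Dutch',     rus: 'Russian',  chi_sim: 'Simplified Chinese',
  jpn: 'Japanese',  kor: 'Korean',   ara: 'Arabic',
  hin: 'Hindi',
};

// Guard against passing an unsupported code to the worker.
function isSupported(code) {
  return Object.prototype.hasOwnProperty.call(LANGUAGES, code);
}
```

Validating the code up front avoids a failed (and wasted) language-data download for an unsupported selection.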
Output includes the recognised text and a confidence score per line and per word. Confidence above 80 percent is usually accurate; below 70 percent the result needs a human pass. For best accuracy, pre-process your image: crop to just the text region, increase contrast in Image Enhancer, and rotate the image upright before running OCR.
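The confidence thresholds above translate directly into a triage rule. The cut-offs follow the text; the tier names are our own labels, not part of the Tesseract output.

```javascript
// Triage a per-line confidence score: above 80 is treated as
// reliable, below 70 needs a human pass, and the band in between
// deserves a quick skim.
function triageLine(confidence) {
  if (confidence > 80) return 'reliable';
  if (confidence < 70) return 'needs-review';
  return 'skim';
}

// Given an array of { text, confidence } lines, return only the
// ones a human should look at.
function linesNeedingReview(lines) {
  return lines.filter((line) => triageLine(line.confidence) !== 'reliable');
}
```

Surfacing only the low-confidence lines turns proofreading a whole page into checking a short list.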
Handwriting support is limited. Tesseract was designed for printed text and handles neat block printing acceptably; it struggles with cursive, ornate scripts, and overlapping handwritten lines. For genuine handwriting recognition, a model trained specifically on handwriting is required.
All processing is local. The image is read into a Canvas, the pixel buffer goes to a Worker that runs Tesseract.js, and the recognised text is held in React state. There is no network call carrying image bytes — only the one-time CDN fetch for the engine and language data, which the browser then caches.

Common Use Cases

01

Digitising receipts and invoices

Photograph a paper receipt and extract the line items as editable text ready to paste into a spreadsheet or expense tracker.

02

Extracting text from screenshots

Pull text from UI screenshots, error messages, and social media images where copy-paste is not available, then drop it into a bug ticket or Slack message.

03

Scanning printed documents

Turn scanned book pages, contracts, or letters into searchable plain text without installing desktop OCR software.

04

Reading whiteboards

Snap a phone photo of a meeting whiteboard or planning board and convert the captured text into a typed transcript for the meeting notes.

Frequently Asked Questions

Which OCR engine does this tool use?
Tesseract.js v5, the WebAssembly port of the open-source Tesseract engine originally developed at HP and later developed at Google. The recognition core is an LSTM-based recurrent neural network.

Which image formats are supported?
PNG, JPG/JPEG, WebP, BMP, GIF, and TIFF. For best accuracy, use PNG or high-quality JPEG. Animated GIFs use the first frame only.

How accurate is the recognition?
Clean, high-contrast printed text usually scores 95 percent or higher. Stylised fonts, low-resolution scans, and handwriting score lower. The displayed per-line confidence gives a quick quality check on each result.

Can it read handwriting?
Only neat block-style handwriting, and only reasonably well. Cursive and ornate scripts are largely outside Tesseract's training distribution. Use a handwriting-specific model if cursive accuracy matters.

What gets downloaded on first use?
On the first run the browser fetches the Tesseract WebAssembly binary and the selected language's training data (about 5 MB for English, more for CJK). Both are then cached, so subsequent runs in the same browser start almost instantly.

Is my image uploaded anywhere?
No. Tesseract.js runs in your browser. The image is decoded into a Canvas locally, the pixel buffer is passed to a Web Worker, and the recognised text is rendered in your tab. Only the one-time CDN fetch for the engine touches the network.

How many languages are supported?
Thirteen: English, Spanish, French, German, Italian, Portuguese, Dutch, Russian, Simplified Chinese, Japanese, Korean, Arabic, and Hindi. Select the correct language before recognising; wrong-language data hurts accuracy.

How can I improve accuracy?
Crop to just the text, run Image Enhancer to add contrast, and make sure the page is rotated upright. Tesseract does not auto-rotate by default, so a sideways scan recognises poorly.

Can it create a searchable PDF?
Not in this tool. Use PDF OCR for searchable-PDF generation. Image OCR is single-image only.

What is the maximum image size?
About 50 MB or 50 megapixels per image; mobile Safari is tighter. Larger files exhaust the WebAssembly heap and recognition fails.

Step-by-step guide

How to extract text from an image (OCR)

Walk through every step with screenshots, format-specific tips, and the platform-by-platform limits you need to know.
