Question 1

Will repair always succeed?

Accepted Answer

No. Files where the content streams have been overwritten with zeros, where the file has been truncated below ~200 bytes, or where encryption keys are missing are unrecoverable. The engine reports which stage failed and why.
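
Two of these cases can be screened for before any stage runs. The sketch below is a hypothetical pre-check, not the engine's own code: `preCheck`, `PreCheck`, and `MIN_USEFUL_BYTES` are illustrative names, the 200-byte floor is the approximate figure from the answer above, and the zero test only catches the extreme case where the entire file (not just its content streams) has been zeroed.

```typescript
// Hypothetical pre-check; names and rules are illustrative assumptions.
type PreCheck =
  | { ok: true }
  | { ok: false; reason: "too-short" | "all-zero" };

// Approximate floor taken from the answer above (~200 bytes).
const MIN_USEFUL_BYTES = 200;

function preCheck(bytes: Uint8Array): PreCheck {
  if (bytes.length < MIN_USEFUL_BYTES) {
    return { ok: false, reason: "too-short" };
  }
  // A buffer with no non-zero byte has nothing left to recover.
  if (bytes.every((b) => b === 0)) {
    return { ok: false, reason: "all-zero" };
  }
  return { ok: true };
}
```

A file that passes this check can still fail later; it only rules out inputs that no stage could use.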

Question 2

Does text remain selectable after repair?

Accepted Answer

Yes, if recovery succeeds at Stage 1 or Stage 2: both preserve the original text, fonts, and hyperlinks. Stage 3 rasterizes each page to a JPEG, so text is no longer selectable in the output. Run the result through PDF OCR to restore searchability.
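
The fallback order behind this answer can be sketched as a loop that tries each stage in turn and records which one produced the output. Everything here is an assumption for illustration: the `Stage` and `RepairResult` types, the `keepsText` flag, and `repairWithFallback` are hypothetical, and the real stages (pdf-lib parsing, lenient reconstruction, pdfjs-dist rendering) would be passed in as the `run` functions.

```typescript
// Hypothetical shape of one repair stage: resolves with repaired bytes or throws.
type Stage = {
  name: string;
  keepsText: boolean; // true for Stage 1 and 2, false for the rasterizing Stage 3
  run: (input: Uint8Array) => Promise<Uint8Array>;
};

type RepairResult =
  | { ok: true; stage: string; keepsText: boolean; output: Uint8Array }
  | { ok: false; failures: { stage: string; reason: string }[] };

async function repairWithFallback(
  input: Uint8Array,
  stages: Stage[],
): Promise<RepairResult> {
  const failures: { stage: string; reason: string }[] = [];
  for (const stage of stages) {
    try {
      const output = await stage.run(input);
      // First stage that succeeds wins; later stages are never tried.
      return { ok: true, stage: stage.name, keepsText: stage.keepsText, output };
    } catch (err) {
      // Keep the reason so the caller can report which stage failed and why.
      const reason = err instanceof Error ? err.message : String(err);
      failures.push({ stage: stage.name, reason });
    }
  }
  return { ok: false, failures };
}
```

A caller could check `keepsText` on a successful result to decide whether to suggest an OCR pass.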

Question 3

Is my file uploaded anywhere?

Accepted Answer

No. All three stages (pdf-lib parsing, lenient reconstruction, and pdfjs-dist rendering) run in the browser. Files stay in tab memory and are never transmitted.

Question 4

Can it fix password-protected PDFs?

Accepted Answer

No — pdf-lib cannot decrypt protected PDFs, so encrypted content streams stay unreadable regardless of stage. Unlock the file first with the PDF Password tool, then run the unlocked copy through repair.
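
An encrypted PDF declares an `/Encrypt` entry in its trailer, so a quick byte scan can hint that a file needs unlocking before repair. The function below is a rough heuristic under that assumption, not how the tool detects encryption: `looksEncrypted` is a hypothetical name, and the scan can report a false positive if the same characters happen to appear elsewhere in the file.

```typescript
// Heuristic only: looks for the ASCII name "/Encrypt" anywhere in the file.
function looksEncrypted(bytes: Uint8Array): boolean {
  const needle = "/Encrypt";
  outer: for (let i = 0; i + needle.length <= bytes.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      // Mismatch at this offset: move on to the next starting position.
      if (bytes[i + j] !== needle.charCodeAt(j)) continue outer;
    }
    return true;
  }
  return false;
}
```

Treat a `true` result as a prompt to try the unlock step first, not as proof that the file is protected.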

Question 5

Why is the rasterized output much larger?

Accepted Answer

Stage 3 embeds every page as a JPEG image, which is far less compact than compressed PDF text and vector data. A 2 MB text PDF might balloon to 20–40 MB after Stage 3. Run the output through PDF Compressor to reduce size.

Question 6

Will my form fields and annotations survive?

Accepted Answer

Stage 1 and Stage 2 preserve form fields, annotations, and hyperlinks if the underlying objects are intact. Stage 3 flattens everything into the page image, losing interactivity.

Question 7

Why does Stage 1 succeed even though my viewer refused to open the PDF?

Accepted Answer

Many viewers have stricter validation than the actual PDF spec requires. Stage 1 uses pdf-lib's parser plus a clean re-serialization, which produces a standards-compliant copy even when the source had cosmetic non-compliance issues.

Question 8

How long does repair take?

Accepted Answer

Stages 1 and 2 are near-instant for most files. Stage 3 time scales with page count: about half a second per page on a modern laptop, so a 50-page rasterization run takes roughly 25–30 seconds.
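
As a back-of-the-envelope check, the Stage 3 figure is a straight multiplication. The helper below is a hypothetical sketch: `estimateStage3Seconds` is an illustrative name, and the 0.5 s default is the rough per-page figure quoted above, not a measured constant.

```typescript
// Rough estimate only; real timings depend on page complexity and hardware.
function estimateStage3Seconds(pageCount: number, secondsPerPage = 0.5): number {
  if (!Number.isInteger(pageCount) || pageCount < 0) {
    throw new RangeError("pageCount must be a non-negative integer");
  }
  return pageCount * secondsPerPage;
}

const fiftyPages = estimateStage3Seconds(50); // 25 seconds at 0.5 s per page
```

The result is a lower bound for the quoted 25–30 second range, which presumably includes per-run overhead.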

Question 9

Are there file-size limits?

Accepted Answer

There is no fixed limit in code, but everything runs in your tab's memory. Small and mid-size PDFs repair quickly; very large or image-heavy files can be slow and may run out of memory during Stage 3, since each page is rendered to a full-resolution canvas before encoding. If a big file struggles, try splitting it first.
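
To see why Stage 3 is the memory-hungry step, it helps to estimate the size of one rendered page. The sketch below assumes a 2D canvas stores roughly 4 bytes per pixel (RGBA) and that the page is rendered at some scale factor over its size in PDF points; `canvasBytes` is a hypothetical helper, and the scale of 2 in the example is an assumption, since the tool's actual render scale is not stated here.

```typescript
// Approximate pixel-buffer size for one rendered page.
function canvasBytes(widthPt: number, heightPt: number, scale: number): number {
  const widthPx = Math.ceil(widthPt * scale);
  const heightPx = Math.ceil(heightPt * scale);
  return widthPx * heightPx * 4; // 4 bytes per pixel: R, G, B, A
}

// US Letter (612 x 792 pt) at an assumed scale of 2:
// 1224 x 1584 px x 4 bytes = 7,755,264 bytes, or about 7.8 MB per page.
const letterAtScale2 = canvasBytes(612, 792, 2);
```

Large-format or high-scale pages grow quadratically with the scale factor, which is why splitting a big file first can help.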

Question 10

Does repair work on PDFs with custom font subsets?

Accepted Answer

Stage 1 and 2 preserve embedded font subsets exactly. Stage 3 rasterizes pages, so the font rendering is captured as pixels — the result looks identical but no longer carries the embedded font data.

PDF Repair

Three-stage repair engine

How PDF Repair Works — Three-Stage Recovery from xref to Rasterization

Common Use Cases

Recover a truncated download

Fix bit-rot on old archives

Salvage a partial save

Open non-compliant PDFs

Frequently Asked Questions