Question 1

Why is the extracted text empty?

Accepted Answer

Your PDF is likely image-only: a scan without an embedded text layer. The page looks like text to you, but it is actually a picture. Run OCR (optical character recognition) first to make the text extractable.
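If you already have the extracted text for each page, a quick check like the one below can show which pages probably need OCR. This is only a sketch: the function name `pages_needing_ocr`, the `min_chars` threshold of 20, and the sample strings are all assumptions made for illustration, not part of this tool.

```python
def pages_needing_ocr(page_texts, min_chars=20):
    """Return 1-based page numbers whose extracted text is empty or nearly empty.

    page_texts: one string per page, as produced by whatever extractor you use.
    min_chars:  assumed threshold; pages with fewer visible characters are flagged.
    """
    flagged = []
    for number, text in enumerate(page_texts, start=1):
        # Count only non-whitespace characters; a scanned page often yields
        # an empty string or a few stray spaces and newlines.
        visible = sum(1 for ch in (text or "") if not ch.isspace())
        if visible < min_chars:
            flagged.append(number)
    return flagged


# Hypothetical extractor output for a three-page PDF.
sample = ["Invoice 2024-001\nTotal due: 120.00 EUR", "", "  \n "]
print(pages_needing_ocr(sample))  # [2, 3]
```

A page with very little real text (a title page, for example) would also be flagged, so treat the result as a hint and not as proof that the page is a scan.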

Question 2

Will tables come out cleanly?

Accepted Answer

Tables are challenging because PDFs don't carry table structure, just positioned text. Simple grid tables often extract reasonably well; complex tables with merged cells or visual borders may need manual cleanup.
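To see why simple grids tend to work and complex tables don't, here is a rough sketch of how rows can be rebuilt from positioned text. It assumes each word arrives as a hypothetical `(x, y, text)` tuple with `y` increasing down the page, and that words on one row have nearly the same `y`. The function name and the `y_tolerance` value are illustrative; this is not how any particular extractor is known to work.

```python
def rows_from_words(words, y_tolerance=3.0):
    """Group (x, y, text) words into rows by similar y, then order each row by x."""
    rows = []  # each entry: [reference_y, [(x, text), ...]]
    for x, y, text in sorted(words, key=lambda w: (w[1], w[0])):
        # A word joins the current row if its y is close to that row's first word.
        if rows and abs(y - rows[-1][0]) <= y_tolerance:
            rows[-1][1].append((x, text))
        else:
            rows.append([y, [(x, text)]])
    # Sort the cells of each row left to right and keep only the text.
    return [[text for _, text in sorted(cells)] for _, cells in rows]


# Hypothetical word positions for a two-row, two-column table.
words = [
    (200, 100.5, "Qty"),
    (50, 100.0, "Item"),
    (50, 120.0, "Widget"),
    (200, 119.2, "4"),
]
print(rows_from_words(words))  # [['Item', 'Qty'], ['Widget', '4']]
```

A cell that spans several rows or columns has no single row or column to land in under this kind of grouping, which is one reason merged cells usually need manual cleanup.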

Question 3

Are images and figures included?

Accepted Answer

Only the alt-text or label, if present. For the actual images, use the Extract Images tool. For text inside images (charts, diagrams), OCR the PDF first.
