About this tool
Extract every hyperlink from a PDF — every URL, every cross-document reference, every clickable link — into a clean list. Useful for auditing what a PDF links to before publishing, archiving the references in a research paper, or migrating link structures from a PDF into a spreadsheet or database.
When to use it
- Auditing a contract or report for any external links before publishing
- Archiving the URL list from a research paper for reference checking
- Migrating link structures from a PDF report into a spreadsheet
- Building the URL list you need to verify that all links in a PDF still resolve (the checking itself needs a separate link checker)
- Producing a bibliography of online references from a research document
What to expect
Only PDF link annotations (the clickable areas) are extracted. Plain-text URLs that aren't formatted as actual hyperlinks won't be detected; to capture those, run the Extract Text tool and match URLs in its output with a regular expression. Internal cross-references (e.g., to figures) are included alongside external URLs.
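To make the internal/external distinction concrete, here is a minimal sketch of how a link annotation could be classified. It is not this tool's implementation. It assumes the annotation has already been read into a plain Python dict whose keys follow the PDF specification's names (`/Subtype`, `/A`, `/S`, `/URI`, `/Dest`, `/F`); the function name `classify_link` and the returned labels are hypothetical.

```python
from typing import Optional, Tuple


def classify_link(annotation: dict) -> Tuple[str, Optional[str]]:
    """Classify one PDF annotation dict as an external, internal,
    or cross-document link.

    The dict layout is an assumption for illustration: keys mirror the
    PDF specification, values are plain strings and nested dicts.
    """
    # Only annotations of subtype /Link are clickable links.
    if annotation.get("/Subtype") != "/Link":
        return ("not-a-link", None)

    action = annotation.get("/A") or {}
    kind = action.get("/S")

    if kind == "/URI":
        # URI action: the target is a URL.
        return ("external", action.get("/URI"))
    if kind == "/GoToR":
        # Remote go-to action: the target lives in another file.
        return ("cross-document", action.get("/F"))
    if kind == "/GoTo" or "/Dest" in annotation:
        # Go-to action or direct destination: a jump within this PDF.
        return ("internal", None)
    return ("other", None)


if __name__ == "__main__":
    sample = {"/Subtype": "/Link",
              "/A": {"/S": "/URI", "/URI": "https://example.com"}}
    print(classify_link(sample))
```

Text that merely looks like a URL never reaches a function like this, because no annotation exists for it. That is why plain-text URLs are not picked up by link extraction.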
Frequently asked questions
Will plain-text URLs in the document body be extracted?
Only if they're hyperlinked — i.e., clickable. URLs typed into the body without being made into actual links won't be detected by link extraction. For those, use Extract Text and then a URL regex.
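As a rough sketch of the regex step, the snippet below pulls `http` and `https` URLs out of text you have already extracted. The pattern and the helper name `find_urls` are illustrative assumptions, not part of either tool. The pattern is deliberately simple: it stops at whitespace, brackets, and parentheses, so it will truncate the occasional URL that legitimately contains parentheses.

```python
import re
from typing import List

# Match http(s) URLs up to the next whitespace, angle bracket,
# parenthesis, or square bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]]+")


def find_urls(text: str) -> List[str]:
    """Return the unique URLs found in text, in order of first appearance."""
    urls: List[str] = []
    for match in URL_PATTERN.findall(text):
        # Drop sentence punctuation that often trails a URL in prose.
        url = match.rstrip(".,;:!?")
        if url and url not in urls:
            urls.append(url)
    return urls


if __name__ == "__main__":
    body = "See https://example.com/a. Also (http://example.org/b?x=1)."
    for url in find_urls(body):
        print(url)
```

Expect some false positives and misses with any pattern like this, especially where a URL was wrapped across lines in the PDF, so review the results before relying on them.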
Does this include internal cross-references?
Yes. Internal links (e.g., a clickable chapter reference or figure cross-reference) are included alongside external URLs, with a label indicating they're internal.
Can I check whether the links are still alive?
Not from this tool — that requires HTTP requests to each URL, which we don't do. Once you have the list, use a link-checker tool or a quick script to verify.
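A quick script for that check might look like the sketch below, which uses only Python's standard library. The function names `fetch_status` and `check_links` are hypothetical, and counting any status below 400 as "alive" is an assumption you may want to tighten. It sends `HEAD` requests; some servers reject those, so a fallback to `GET` may be needed in practice.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable, Optional


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status (4xx or 5xx).
        return error.code
    except (OSError, ValueError):
        # DNS failure, refused connection, timeout, or malformed URL.
        return None


def check_links(
    urls: Iterable[str],
    fetch: Callable[[str], Optional[int]] = fetch_status,
) -> Dict[str, bool]:
    """Map each URL to True if it responded with a status below 400."""
    results: Dict[str, bool] = {}
    for url in urls:
        status = fetch(url)
        results[url] = status is not None and 200 <= status < 400
    return results


if __name__ == "__main__":
    for url, alive in check_links(["https://example.com"]).items():
        print(("OK  " if alive else "DEAD"), url)
```

The `fetch` parameter exists so the checking logic can be exercised without network access. For long lists, consider adding a delay between requests so you don't hammer any single host.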