
Extract Every URL From Any Text or HTML Instantly

Extract, deduplicate, and analyze every URL from any text or HTML instantly.

What This Does

URLs are embedded everywhere — in HTML source code, markdown documents, CSV exports, log files, and copied web content. Finding all the links in a block of content manually is slow and error-prone, and tools that do it automatically usually give you raw, unprocessed output that still requires significant cleanup.

The URL Extractor goes substantially beyond simple link extraction. It finds every URL in your pasted text — in href, src, and action attributes, CSS url() values, plain-text URLs with or without http/https, and bare domain references — then deduplicates them, normalizes them, and categorizes each link by type: internal links, external links, images, scripts, stylesheets, documents, API endpoints, social media links, and more. The domain breakdown shows which sites are linked to most frequently. The protocol breakdown separates HTTP from HTTPS links so you can find insecure references. And the full-text filter lets you search and narrow the extracted list before exporting.

This is dramatically more useful than basic URL extractors, which give you a raw deduplicated list with no categorization, no domain analysis, and no insight into the structure of the link set. Whether you're auditing a website's external link profile, extracting API endpoints from documentation, building a sitemap, or doing competitive link analysis, the URL Extractor gives you a complete picture rather than just a list. All processing runs entirely in your browser — your content never leaves your device.
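As a rough illustration of the extraction and deduplication pass, here is a minimal Python sketch. It is not the tool's actual implementation (which runs client-side in the browser and covers more sources, such as data-src, markdown syntax, and bare domains); the regexes are simplified for clarity:

```python
import re

# Simplified patterns for three of the sources the tool scans:
# href/src/action attributes, CSS url() values, and plain-text URLs.
ATTR_RE = re.compile(r'(?:href|src|action)\s*=\s*["\']([^"\']+)["\']', re.I)
CSS_URL_RE = re.compile(r'url\(\s*["\']?([^"\')\s]+)["\']?\s*\)', re.I)
PLAIN_RE = re.compile(r'https?://[^\s"\'<>()]+')

def extract_urls(text: str) -> list[str]:
    found = ATTR_RE.findall(text) + CSS_URL_RE.findall(text) + PLAIN_RE.findall(text)
    seen, unique = set(), []
    for url in found:
        key = url.strip()
        if key and key not in seen:  # simple exact-match dedup for the sketch
            seen.add(key)
            unique.append(key)
    return unique

html = '<a href="https://example.com/page">x</a> <img src="/logo.png"> see http://example.com/old'
print(extract_urls(html))
```

Note that the plain-text pattern also re-matches URLs inside attribute values, which is why the deduplication step matters even within a single document.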

Assumptions
  • Internal vs external classification is based on the first absolute domain detected in the input
  • Deduplication is case-insensitive for protocol and domain, case-sensitive for path
  • Categorization uses extension and path-pattern matching, not HTTP content-type headers
  • All processing is client-side — no data is transmitted to any server
When Should You Use This?
  • Extracting all links from HTML source to audit a page's internal and external link structure
  • Finding all image URLs, script sources, or stylesheet hrefs in a web page's source code
  • Pulling all URLs from a markdown document, README, or text file for link checking
  • Extracting API endpoint URLs from documentation, Postman exports, or code files
  • Building a list of external links from a piece of content for competitive or SEO analysis
  • Deduplicating and cleaning a URL list from multiple sources before importing to a tool
Example Scenario

Marcus is doing an SEO audit for a client's website. He copies the full HTML source of the client's homepage (12,000 lines) and pastes it into the URL Extractor. Result in under a second: 847 total URL matches → 312 unique URLs after deduplication. Breakdown: 189 internal links (61%), 91 external links (29%), 24 images (8%), 8 scripts (2%). Top external domain: google.com (analytics + tag manager, 12 links). 3 HTTP (non-HTTPS) external links flagged. 14 links to social media platforms. He exports the external links as a CSV and the image URLs as a JSON array for further processing.

🔒

100% private. All processing runs in your browser — your HTML or text never leaves your device.

Paste HTML, markdown, CSV, or any text with links

URL Extractor β€” Extract, Categorize & Export Links

Paste any HTML, markdown, CSV, or text content and instantly extract every URL. The extractor finds links in href/src/action attributes, CSS url() values, plain-text URLs, and markdown link syntax — then deduplicates, categorizes each URL (images, scripts, stylesheets, documents, API endpoints, social media, mailto, internal, external), and provides a full domain breakdown. Export as plain text, comma-separated, JSON array, or CSV with metadata. All processing is client-side.

What makes this better than basic URL extractors?

  • Smart categorization — automatically identifies images, scripts, stylesheets, documents, API endpoints, social links, and more
  • Domain breakdown — bar chart + pie chart showing which sites are linked to most, with percentage breakdown
  • HTTP/HTTPS detection — flags insecure HTTP links that may cause mixed-content issues
  • Category filter — click any category to instantly filter the list to that type only
  • CSV export with metadata — export URL, category, domain, and protocol columns for further analysis
  • Inline open — hover any URL to open it in a new tab or copy it individually
  • 100% private — your HTML never leaves your browser
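The HTTP/HTTPS detection in the list above reduces to a simple prefix check over the extracted URLs. A minimal sketch (the function name is illustrative):

```python
def flag_insecure(urls: list[str]) -> list[str]:
    # Plain-HTTP links can trigger mixed-content warnings when the
    # embedding page itself is served over HTTPS.
    return [u for u in urls if u.lower().startswith("http://")]

print(flag_insecure(["http://cdn.example.com/lib.js", "https://example.com/app.js"]))
```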

FAQs

What sources does this extract from?

href, src, action, data-src, data-href, poster attributes; CSS url() values; plain http/https URLs; protocol-relative // URLs; markdown [text](url) syntax; mailto: links.

How does it detect internal vs external links?

The first absolute domain found in your input is treated as the base domain. Relative URLs and URLs matching that domain are 'internal'; all others are 'external'.
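A minimal Python sketch of this classification rule (the function name is illustrative, not the tool's API):

```python
import re
from urllib.parse import urlsplit

def classify_urls(urls: list[str], source_text: str) -> dict[str, str]:
    # The first absolute http(s) URL found in the input supplies the base domain,
    # mirroring the rule described above.
    m = re.search(r'https?://([^/\s"\'<>]+)', source_text)
    base = m.group(1).lower() if m else None
    labels = {}
    for url in urls:
        host = urlsplit(url).netloc.lower()
        # Relative URLs (no host) and URLs on the base domain count as internal.
        labels[url] = "internal" if not host or host == base else "external"
    return labels

page = '<a href="https://example.com/about">About</a> <a href="https://other.com/x">Other</a>'
print(classify_urls(["https://example.com/about", "/contact", "https://other.com/x"], page))
```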

What is the CSV export format?

Four columns: url, category, domain, protocol (https/http/other). Useful for importing to spreadsheets or further automated processing.
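A sketch of producing that four-column format with Python's csv module; the domain and protocol columns are derived per URL as described above:

```python
import csv
import io
from urllib.parse import urlsplit

def to_csv(rows: list[tuple[str, str]]) -> str:
    """rows: (url, category) pairs. Emits url, category, domain, protocol
    columns, with protocol collapsed to https/http/other."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["url", "category", "domain", "protocol"])
    for url, category in rows:
        parts = urlsplit(url)
        protocol = parts.scheme if parts.scheme in ("http", "https") else "other"
        writer.writerow([url, category, parts.netloc, protocol])
    return buf.getvalue()

print(to_csv([("https://example.com/a.png", "image")]))
```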

Is there a size limit?

No — everything runs entirely in your browser. Inputs up to several MB process in under a second on modern devices.
