Academic researchers
Reference lists, supplementary materials, replication packages.
DOI rot and dead publisher links creep into manuscripts between drafts.
Verify that every URL in your bibliography, dataset, source list, or citation export actually resolves — and if it doesn't, find the closest archived snapshot. Built for academics, librarians, journalists, and UX researchers.
¹ Implements HEAD → GET fallback, follows up to 10 redirects, 8s timeout per URL, 8 concurrent workers. DOI, arXiv, PubMed, and Wayback Machine links are recognized automatically.
Librarians
Subject guides, journal references, institutional repositories.
Quarterly link audits across thousands of curated URLs.
UX researchers
Source lists, survey panels, competitor URLs, research repos.
Stakeholders open the report a quarter later, and half the citations 404.
Journalists
Source verification, archived evidence, investigation links.
Adversaries delete content; you need to catch and archive it fast.
What happens when you submit a batch — start to finish, no black boxes.
Paste URLs (one per line), upload a CSV/TXT, or drop in BibTeX/RIS — we extract the URL fields. Bare DOIs and arXiv IDs are auto-prefixed.
Each entry is tagged as URL, DOI, arXiv, PubMed, or Wayback via deterministic regex — no network calls, no LLM, no privacy footprint.
Server-side fetch: HEAD first, then GET with a Range header when a server rejects HEAD. Follows up to 10 redirects, with an 8s timeout per URL, a browser-like User-Agent, and 8 concurrent workers.
Every failing URL ships with a one-click Wayback Machine snapshot lookup. Export the broken-link report as CSV or Markdown footnotes for supplementary materials.
¹ HEAD requests are tried first because they're bandwidth-cheap. Many CDN-fronted servers (Cloudflare, Akamai) answer HEAD with 403 or 405, in which case we retry with a GET that carries a Range header for the first 1 KB. Soft-404 pages, which return 200 OK with an empty body or an error template, are not yet detected; that is a known limitation, documented in the FAQ.
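The HEAD-then-GET fallback in the footnote above can be sketched roughly as follows. This is a minimal Python illustration, not the service's actual code: `check_url`, the `Transport` callable, and the choice of 403 and 405 as the only retry triggers are assumptions. The transport is injected so the decision logic can be run without network access.

```python
from typing import Callable, Dict

# Hypothetical transport: (method, url, headers) -> final HTTP status code.
Transport = Callable[[str, str, Dict[str, str]], int]

# Statuses the footnote names as typical HEAD rejections.
HEAD_REJECTED = {403, 405}


def check_url(url: str, send: Transport) -> Dict[str, object]:
    """Try HEAD first; if the server rejects it, retry with a ranged GET."""
    method = "HEAD"
    status = send("HEAD", url, {})
    if status in HEAD_REJECTED:
        # Request only bytes 0-1023 (the first 1 KB) to keep the retry cheap.
        method = "GET"
        status = send("GET", url, {"Range": "bytes=0-1023"})
    # Any 2xx counts as reachable; a ranged GET usually answers 206.
    ok = 200 <= status < 300
    return {"url": url, "method": method, "status": status, "ok": ok}
```

A real transport would also need to follow redirects and enforce the timeout. Note that a soft-404 returning 200 would still be reported as reachable by this logic.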
Six stateless tools covering the full reference-integrity workflow — from extraction to verification to archival recovery.
Verify every link in a reference list
Paste or upload up to 500 URLs, DOIs, or arXiv IDs. Each one is fetched server-side and classified by HTTP status, with Wayback fallback links for anything broken.
Pull URLs from BibTeX or RIS
Drop a .bib or .ris file and extract every URL field into a clean, deduplicated list ready to feed into the URL Checker.
Catch DOIs that resolve to a 404
DOIs can resolve successfully but land on a missing publisher page. We follow the full chain and flag the silent failures.
Best archive.org snapshot per URL
For any URL, find the closest Wayback Machine snapshot to a target date — useful for replacing rotted links in manuscripts.
Compare two reference lists
Diff two bibliographies and see what was added, removed, or changed between manuscript versions or review rounds.
Map sources to archives & repos
Group source URLs by hosting platform (GitHub, OSF, Zenodo, Dataverse, journal) so you can audit reproducibility coverage at a glance.
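As a rough illustration of grouping by hosting platform, the sketch below maps hostnames to platform labels. The `PLATFORMS` table, the function names, and the "other" bucket are assumptions made for this example. Journal detection is left out because it would need a much longer host list.

```python
from collections import defaultdict
from typing import Dict, Iterable, List
from urllib.parse import urlparse

# Hypothetical host-to-platform table; a real one would be longer.
PLATFORMS = {
    "github.com": "GitHub",
    "osf.io": "OSF",
    "zenodo.org": "Zenodo",
    "dataverse.harvard.edu": "Dataverse",
}


def platform_for(url: str) -> str:
    """Return the platform label for a URL, or "other" if the host is unknown."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return PLATFORMS.get(host, "other")


def group_by_platform(urls: Iterable[str]) -> Dict[str, List[str]]:
    """Bucket URLs by platform, preserving input order within each bucket."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for url in urls:
        groups[platform_for(url)].append(url)
    return dict(groups)
```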
No. The URL Checker runs a stateless server function — your input is fetched, classified, and discarded once results are returned. Nothing is logged, persisted, or sent to third parties.
Bare DOIs (10.xxxx/...) and arXiv IDs (e.g. arXiv:2301.01234) are auto-prefixed to https://doi.org/ and https://arxiv.org/abs/ respectively before checking. The full doi.org redirect chain is followed, so you see the final publisher URL and its status.
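The auto-prefixing and regex tagging described here might look something like the sketch below. The patterns and the names `normalize` and `classify` are assumptions chosen for illustration; the checker's real expressions are not published in this page and may be stricter or looser.

```python
import re

# Assumed patterns for bare identifiers.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
ARXIV_RE = re.compile(r"^arxiv:(\d{4}\.\d{4,5}(?:v\d+)?)$", re.IGNORECASE)

# Assumed patterns for tagging full URLs; anything else is a plain URL.
KINDS = [
    ("Wayback", re.compile(r"^https?://web\.archive\.org/", re.IGNORECASE)),
    ("DOI", re.compile(r"^https?://(?:dx\.)?doi\.org/10\.", re.IGNORECASE)),
    ("arXiv", re.compile(r"^https?://arxiv\.org/", re.IGNORECASE)),
    ("PubMed", re.compile(r"^https?://pubmed\.ncbi\.nlm\.nih\.gov/", re.IGNORECASE)),
]


def normalize(entry: str) -> str:
    """Prefix bare DOIs and arXiv IDs; leave everything else unchanged."""
    entry = entry.strip()
    if DOI_RE.match(entry):
        return "https://doi.org/" + entry
    match = ARXIV_RE.match(entry)
    if match:
        return "https://arxiv.org/abs/" + match.group(1)
    return entry


def classify(url: str) -> str:
    """Tag a URL by pattern alone, with no network access."""
    for kind, pattern in KINDS:
        if pattern.match(url):
            return kind
    return "URL"
```

For example, `classify(normalize("arXiv:2301.01234"))` yields the tag `arXiv` for the URL `https://arxiv.org/abs/2301.01234`.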
Yes — every failing URL surfaces a one-click 'Find archive' link that opens the Wayback Machine's snapshot history for that URL in a new tab. We don't yet call the Wayback Availability API to embed a specific snapshot; that's planned.
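For readers who want to build the same links by hand, here is a small sketch. The first function builds a snapshot-history link using the Wayback Machine's `*` timestamp wildcard. The second builds a query for the Wayback Availability API, which this tool does not call yet, so it only shows what such an integration could look like. The function names are hypothetical.

```python
from urllib.parse import urlencode


def wayback_history_link(url: str) -> str:
    """Link to the calendar of all captures for a URL."""
    return "https://web.archive.org/web/*/" + url


def availability_query(url: str, timestamp: str) -> str:
    """Availability API query for the snapshot closest to a date.

    `timestamp` is a digit string such as YYYYMMDD.
    """
    params = urlencode({"url": url, "timestamp": timestamp})
    return "https://archive.org/wayback/available?" + params
```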
Not in a single request; 500 is the per-batch cap. For larger bibliographies, split the list into batches of up to 500 and run them sequentially. The 8-worker concurrency pool finishes a full 500 in well under a minute on average.
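Splitting a long list into capped batches and checking each batch with a small worker pool could be sketched like this. The constants mirror the numbers quoted above. `batches`, `run_all`, and the `check` callable are hypothetical names, and `check` stands in for whatever function verifies a single URL.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")

BATCH_CAP = 500  # per-batch cap quoted in the FAQ
WORKERS = 8      # worker count quoted in the FAQ


def batches(urls: Sequence[str], size: int = BATCH_CAP) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def run_all(urls: Sequence[str], check: Callable[[str], T]) -> List[T]:
    """Check batches one after another, each with a pool of 8 threads."""
    results: List[T] = []
    for batch in batches(urls):
        with ThreadPoolExecutor(max_workers=WORKERS) as pool:
            # pool.map returns results in input order.
            results.extend(pool.map(check, batch))
    return results
```

With 1,200 URLs this yields three batches of 500, 500, and 200.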
Two things. First, classification — we recognize DOIs, arXiv, PubMed, and Wayback URLs and treat them appropriately. Second, recovery — broken results come with archive lookups and Markdown-footnote exports ready to paste into a manuscript.
Not currently. We only inspect HTTP status codes and redirect chains. A page that returns 200 with the publisher's 'article not found' template will be marked OK. Detecting soft-404s requires content scraping and is on the roadmap.
BibTeX/RIS extraction is in the URL Checker today as a paste tab — it pulls URL fields with regex. A proper parser handling exotic entries (the dedicated Citation Extractor tool) is coming.
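A regex-based extraction of this kind could be sketched as below. The two patterns are assumptions for illustration: one matches a BibTeX `url = {...}` or `url = "..."` field, the other an RIS `UR  - ...` line. As the answer above notes, a regex approach will miss exotic entries that a proper parser would handle.

```python
import re
from typing import List

# Assumed patterns; real-world .bib and .ris files vary more than this.
BIBTEX_URL = re.compile(r'\burl\s*=\s*[{"]([^}"\s]+)[}"]', re.IGNORECASE)
RIS_URL = re.compile(r"^UR[ \t]+-[ \t]+(\S+)", re.MULTILINE)


def extract_urls(text: str) -> List[str]:
    """Collect URL fields from BibTeX or RIS text, deduplicated in order."""
    found = BIBTEX_URL.findall(text) + RIS_URL.findall(text)
    # dict.fromkeys keeps the first occurrence of each URL and its position.
    return list(dict.fromkeys(found))
```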
Not yet. If you have a reproducibility audit workflow that would benefit from one, get in touch via the about page.
Paste your list and get a structured report in under a minute, with an archive link for everything broken.
Open URL Checker