Scraping Broken Links with SEO Spider and Ahrefs

A step-by-step guide to quickly scraping broken links, so you can contact each site owner and ask to have your link listed instead.
Initial setup:
Finding seed broken links with Google:
Create a new Google Sheet template
Create a list of search terms
Google your first search term
Search each resulting URL in SEO Spider
Note all broken links in your sheet
Repeat until you have your target
Finding sites hosting these broken links with Ahrefs:
Search the link in Ahrefs Site Explorer
Check the backlinks pointing to the broken URL
Add these backlinks to 'Also linked from' columns
Repeat for every seed backlink you found in the first section
Finding the site owner's contact information:
Fill in the site owner's email addresses

Initial setup:

You’re going to need:

Create a new Google Sheet template

Use this template: https://docs.google.com/spreadsheets/d/1jQirKt3N1seiV-2KWHcTXI2iDywyg0U_IxggtQWe1E0/pub?output=csv

Make a copy in your own account.

You’ll notice it has 2 tabs, one for processed links and one for unprocessed links.

You will be working first in the unprocessed tab, dumping in the information quickly before moving it into the processed tab and completing it with email addresses and other specifics.

Feel free to edit your copy of the template so it includes more branched links (instances of a single broken URL found linked in multiple places on the web).

Attached below is the template in case you’d rather use it in Excel.

Create a list of search terms

Your search terms should be relevant to the content you want to earn backlinks for. If you’re trying to rank your productivity app, for example, you might want to look for lists of productivity tools and see if any of the tools have been discontinued. For this, you’ll build a list of potential queries like this:

  • intitle:"productivity apps"
  • "productivity apps" site:.org intitle:"other resources" -inurl:pdf -inurl:ppt -inurl:doc
  • "productivity apps" resources
  • "productivity apps" site:.edu -inurl:pdf -inurl:ppt -inurl:doc
  • "productivity apps" site:.org -inurl:pdf -inurl:ppt -inurl:doc
  • "productivity apps" site:.com -inurl:pdf -inurl:ppt -inurl:doc
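
If you have several keywords to cover, the query list above can be generated rather than typed out. The following is a minimal sketch, assuming you want the same patterns as the examples in this step; the function name `build_queries` and the default list of TLDs are illustrative choices, not part of the template.

```python
# File types excluded in the example queries above.
FILE_EXCLUSIONS = "-inurl:pdf -inurl:ppt -inurl:doc"


def build_queries(keyword, tlds=(".edu", ".org", ".com")):
    """Build Google queries for one keyword, following the patterns listed above.

    `build_queries` and its `tlds` default are hypothetical helpers for this
    guide; adjust the patterns to suit your own campaign.
    """
    phrase = f'"{keyword}"'
    queries = [
        f"intitle:{phrase}",
        f"{phrase} resources",
    ]
    # One site-restricted query per TLD, skipping common document formats.
    for tld in tlds:
        queries.append(f"{phrase} site:{tld} {FILE_EXCLUSIONS}")
    return queries


if __name__ == "__main__":
    for query in build_queries("productivity apps"):
        print(query)
```

Paste each printed line into Google (or your scraper of choice) exactly as you would a hand-written query.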

Read more about queries for broken link building here. They suggest queries such as:

  • keyword "resources"
  • keyword "suggested sites"
  • keyword "links"
  • keyword intitle:links
  • keyword intitle:resources
  • keyword intitle:recommended sites
  • site:.gov keyword "resources"
  • site:.edu keyword "links"
  • site:.co.uk keyword "suggested sites"
  • site:.com.au keyword "recommended sites"

Google your first search term

Broken link scraping is about covering a ton of ground, not about being careful. Get to it by shoving your first query into Google, opening the first page of results and copying the URLs into a text document.

For a quicker method, put your query in Ninja SEO Tools’ Google URL Scraper and get all of the results as a text file for less time spent copy-pasting.
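
However you collect the URLs, the text file tends to pick up blank lines and repeats once you run several queries. Here is a small sketch for tidying the list before you start crawling; the helper name `clean_url_list` and the file name `urls.txt` are assumptions made for illustration.

```python
from urllib.parse import urlsplit


def clean_url_list(lines):
    """Drop blanks, non-URLs and duplicates while keeping the original order.

    Hypothetical helper: two URLs count as duplicates when host, path
    (ignoring a trailing slash) and query string match, so http/https
    variants of the same page are merged.
    """
    seen = set()
    cleaned = []
    for line in lines:
        url = line.strip()
        if not url:
            continue
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue  # not a web URL, e.g. stray text from copy-pasting
        key = (parts.netloc.lower(), parts.path.rstrip("/"), parts.query)
        if key not in seen:
            seen.add(key)
            cleaned.append(url)
    return cleaned


if __name__ == "__main__":
    # "urls.txt" is an assumed file name for your copied search results.
    with open("urls.txt", encoding="utf-8") as handle:
        for url in clean_url_list(handle):
            print(url)
```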

Search each resulting URL in SEO Spider

Now you should have several Google results (99 in one go if you used the Ninja SEO Tool I mentioned).

Open up your list next to SEO Spider, and start pasting the URLs in.

Make sure you have SEO Spider configured in ‘Spider’ mode, and select the ‘4xx’ filter from the right-hand menu.
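
SEO Spider does the crawling for you, so no code is needed for this step. If you want to spot-check a single page by hand, though, the idea behind the 4xx filter can be sketched roughly as follows: collect the links on a page, request each one, and keep those that return a 400-499 status. This is an illustration using Python's standard library, not how SEO Spider works internally; all function names and the User-Agent string are assumptions.

```python
from html.parser import HTMLParser
from urllib.error import HTTPError
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class LinkCollector(HTMLParser):
    """Collects absolute http(s) targets from <a href="..."> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            absolute = urljoin(self.base_url, href)
            if absolute.startswith(("http://", "https://")):
                self.links.append(absolute)


def extract_links(html, base_url):
    """Return every web link found in the given HTML, resolved against base_url."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


def is_client_error(status):
    """True for 4xx responses, the range the '4xx' filter covers."""
    return status is not None and 400 <= status < 500


def fetch_status(url, timeout=10):
    """Return the HTTP status for url, or None if the request fails outright."""
    # The User-Agent value is an arbitrary placeholder.
    request = Request(url, method="HEAD", headers={"User-Agent": "broken-link-check"})
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code  # 4xx and 5xx responses arrive as exceptions
    except OSError:
        return None  # DNS failure, refused connection, timeout, etc.


def find_broken_links(html, base_url):
    """Return the links in html whose status is in the 4xx range."""
    return [link for link in extract_links(html, base_url)
            if is_client_error(fetch_status(link))]
```

Some servers reject HEAD requests, so a result from this sketch is worth confirming in a browser before it goes into your sheet.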

Repeat until you have your target

Around 100 seed links should do for a campaign.

Finding the site owner’s contact information:

Fill in the site owner’s email addresses

Now you’ll be left with a sheet containing hundreds of broken links. It’s time to move them over to the ‘Processed’ tab and find the contact details for every site owner with a broken link you want to replace.

For every broken link, make a new column in the Processed tab and fill it in completely. Then, use the process linked below to find their email addresses:

How to Find Almost Anyone’s Email Address

After you’ve filled in the emails, go ahead and progress to the next template:

Broken Link Building: Outreach
