Department of Engineering

IT Services

Link Checker

This facility lets you check a web page to see whether its links lead to valid targets. It checks one link per second, so output may be slow for pages with many links - I suggest you do something else while you're waiting! For help with interpreting the output, see the notes at the bottom of the page.
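The one-link-per-second pacing can be sketched as follows. This is only a minimal illustration in Python, not the checker's own code: the function name check_links, the check callback and the delay and sleep parameters are all assumptions made for the example.

```python
import time
from typing import Callable, Iterable, List, Tuple


def check_links(
    urls: Iterable[str],
    check: Callable[[str], object],
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Tuple[str, object]]:
    """Call check(url) for each URL, pausing `delay` seconds between links.

    `check` is a hypothetical callback that returns whatever result the
    caller wants recorded for a link (for example an HTTP status code).
    `sleep` is injectable so the pacing can be tested without waiting.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay)  # pause before every link except the first
        results.append((url, check(url)))
    return results
```

With the default delay, a page with 120 links would take roughly two minutes, which is why it's worth doing something else in the meantime.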


Notes

The link-checker depends on the web page being written in correct HTML - it will sometimes silently stop checking the rest of a file once it finds an HTML error. If the list of checked links is shorter than expected, use the HTML-checker that is offered (see the common problems listed on our HTML4 page).
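To judge whether the list of checked links is shorter than expected, it can help to count the links in the page yourself. The sketch below uses Python's standard html.parser module, which is fairly tolerant of faulty HTML. It is not how the link-checker itself reads pages, and the names LinkCollector and extract_links are made up for the example.

```python
from html.parser import HTMLParser
from typing import List


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


def extract_links(html_text: str) -> List[str]:
    """Return the link targets found in html_text, in document order."""
    collector = LinkCollector()
    collector.feed(html_text)
    collector.close()
    return collector.links
```

If len(extract_links(page_text)) is clearly larger than the number of links the checker reported, an HTML error part-way through the page is a likely cause.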

The checker reports several types of problems, some of which may be transient. It displays in red the problems it considers important. The numbers below refer to the HTTP codes associated with problem situations.

  • the server doesn't exist
  • 500 - the server machine exists, but doesn't respond. The machine might no longer be a web server.
  • 404 - the server exists, but the page doesn't. This is the standard "broken link" situation
  • 403 - the page exists, but you aren't allowed to read it
  • 301 - permanent forwarding (redirection). Permanent forwardings are sometimes installed prior to an old URL disappearing. The new page it leads to might be the new version of the page (so link directly to that) or a generic warning page (so remove the link).
  • 302 - temporary forwarding (redirection). No change required, probably
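The list above can be summarised as a small lookup. This is a sketch in Python rather than the checker's own logic. It uses the standard HTTP meanings, in which 301 is a permanent redirect and 302 a temporary one. The names ADVICE and describe are invented for the example, and None stands for "no server found", which has no HTTP code.

```python
from typing import Optional

# Advice for the codes in the list above (standard HTTP meanings).
ADVICE = {
    500: "server machine doesn't respond; it might no longer be a web server",
    404: "broken link; the server exists but the page doesn't",
    403: "page exists, but you aren't allowed to read it",
    301: "permanent redirect; consider linking directly to the new page",
    302: "temporary redirect; probably no change required",
}


def describe(status: Optional[int]) -> str:
    """Return a short explanation for a link-check result.

    status is an HTTP status code, or None when no server was found.
    """
    if status is None:
        return "the server doesn't exist"
    if status in ADVICE:
        return ADVICE[status]
    if 200 <= status < 300:
        return "link is fine"
    return "unlisted code %d; check the link by hand" % status
```

For example, describe(404) returns the "broken link" text, and any code not in the list is passed back for manual checking rather than guessed at.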
Note that
  • there are some problems with links that the link-checker can't warn you about. For example, a page that you link to might not disappear, but its subject matter might completely change. So beware.
  • the program doesn't check pages that are protected from automated reading by "robots exclusion rules". You'll need to check those links manually.
  • if you get a
       Error: 500 Can't locate object method "new" via package "LWP::Protocol::https::Socket"
    message you're probably trying to check a page that is in some way protected from general access - typically one served over https, which the checker can't fetch.
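To find out in advance whether a link is covered by robots exclusion rules, you can test it against the site's robots.txt file. The sketch below uses Python's standard urllib.robotparser module. The helper name allowed_by_robots and the user-agent string "linkchecker" are assumptions for the example, since the name this checker announces itself with isn't given here.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url.

    robots_txt is the text of the site's robots.txt file, fetched
    separately (for example by saving it from a browser).
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Example rules: every robot is kept out of /private/
EXAMPLE_RULES = "User-agent: *\nDisallow: /private/\n"
```

Links for which this returns False are the ones an automated checker is asked to skip, so those are the ones to check manually.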