Persistent Links, One Solution to a Common Problem

by David Bigwood

Avoiding the Potholes

Broken links on the Internet are a major problem, and the problem is getting worse. A recent survey found that 7% of links produced a 404 Document Not Found error; a few years ago the same survey found 5%. This is a problem for users, Web designers, and companies using the Internet. Consider the following scenarios.

A local band has a Web page. Over three years the site has moved three times. Any stationery and business cards the band has had printed would have to be reprinted after each move. Old cards and stationery already distributed now carry the wrong address. Worse, any links from other sites are broken. Business could be lost.

A popular Web site, as measured by the number of other sites linking to it, changes its URL (Uniform Resource Locator). Every site that links to it must now change its own page. Each one goes through the same process of editing the link. This is a waste of human resources.

A government agency publishes an item in print and also on the Web. It is cataloged by the Library of Congress and the Government Printing Office (GPO). The URL is included in the catalog record. The cataloging is distributed across the country, only to have the agency change the URL. How can anyone track where the record has gone and which libraries now have a broken link in their catalogs?

The solution to each of these problems is the same: use a Persistent URL (PURL). A PURL would allow the band to keep the same address on its stationery while changing the URL. The sites linked to the popular site could all be corrected with one change at the PURL resolver. The GPO has no need to track where the bibliographic record has gone; a change at its PURL resolver fixes all links. That is just what the GPO has been doing. This is current technology, established and in use.

The official definition of a PURL

A PURL is a Persistent Uniform Resource Locator. Functionally, a PURL is a URL. However, instead of pointing directly to the location of an Internet resource, a PURL points to an intermediate resolution service. The PURL resolution service associates the PURL with the actual URL and returns that URL to the client. The client can then complete the URL transaction in the normal fashion. In Web parlance, this is a standard HTTP redirect.

In less technical terms, this means a PURL points to a machine that redirects the request to the destination, using a standard HTTP redirect. The URL pointed to can change while the PURL remains stable. It is a simple solution to a common problem.
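The redirect described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the OCLC resolver software: the names PURL_TABLE and resolve, the example PURL path, and the target URLs are all hypothetical, and 302 is assumed here as one common status code for an HTTP redirect.

```python
# Hypothetical in-memory table: PURL path -> current URL of the resource.
PURL_TABLE = {
    "/NET/band": "http://www.example.com/~band/index.html",
}


def resolve(purl_path):
    """Return (status, headers) the way a resolver might answer a request."""
    target = PURL_TABLE.get(purl_path)
    if target is None:
        # Unknown PURL: there is nothing to redirect to.
        return 404, {}
    # HTTP redirect: the client follows the Location header on its own.
    return 302, {"Location": target}


# The band moves its site. One change in the table is enough;
# the PURL printed on the stationery stays the same.
before = resolve("/NET/band")
PURL_TABLE["/NET/band"] = "http://www.example.org/band/"
after = resolve("/NET/band")
```

The client never needs to know the table exists. It asks for the PURL, receives a Location header, and completes the request against whatever URL is current.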

One advantage of the PURL resolver is that multiple people can have access to change the URL. Whoever first detects a change can make the correction, and everyone else using the PURL is then redirected to the new destination.

It is necessary to register before having any access to the resolver. This is a useful feature: if someone else finds a PURL that no longer reaches its destination, they can let the creator know. After registration one has access to all the features. Access to change a PURL is limited to the creator and anyone else the creator authorizes. This security feature helps prevent malicious changing of links.

The ability of more than one person to have access to the PURL is important. It allows any member of a group who discovers a broken link to fix it for everybody. This distributed responsibility for maintaining links is reminiscent of OCLC's idea of distributed responsibility in cataloging.
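The access rule can be illustrated with a short Python sketch. It is not how the resolver software is written. The class PurlRecord, its method names, and the user names are hypothetical, and the sketch assumes that only the creator may authorize additional maintainers.

```python
class PurlRecord:
    """One PURL: its current target URL and the users allowed to change it."""

    def __init__(self, target, creator):
        self.target = target
        self.creator = creator
        self.maintainers = {creator}

    def authorize(self, acting_user, new_maintainer):
        # Assumption: only the creator may add maintainers.
        if acting_user != self.creator:
            raise PermissionError(f"{acting_user} cannot authorize maintainers")
        self.maintainers.add(new_maintainer)

    def update_target(self, acting_user, new_target):
        # Any authorized maintainer may repair the link for everybody.
        if acting_user not in self.maintainers:
            raise PermissionError(f"{acting_user} cannot modify this PURL")
        self.target = new_target


record = PurlRecord("http://old.example.com/report.html", creator="alice")
record.authorize("alice", "bob")

# Bob notices the broken link first and fixes it for all users of the PURL.
record.update_target("bob", "http://new.example.com/report.html")

# An unauthorized user is turned away, which guards against malicious changes.
try:
    record.update_target("mallory", "http://malicious.example.com/")
    rejected = False
except PermissionError:
    rejected = True
```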

Creating a PURL is very simple. After registration, go to the Create a PURL page. Enter your user name, password, the URL being pointed to and the last part of the PURL. Additional maintainers may be added. Once submitted the PURL and associated URL are displayed. The URL is an active link, so it is possible to check that it is the correct page. The PURL is created when the confirm button is hit.

The PURL resolver does more than just redirect requests and allow the creation of PURLs. The software allows searching for registered users and PURLs, validation of PURLs, and modification of existing PURLs. There is also a FAQ, instructions for joining the PURL-L mailing list, a selected list of other PURL resolvers, and detailed instructions.
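The article does not say how the resolver validates PURLs. One plausible reading is a check that each target URL still answers, which might look like the Python sketch below. The function names is_reachable and find_broken, the table layout, and the use of a HEAD request are all assumptions made for illustration.

```python
import urllib.request


def is_reachable(url, timeout=10):
    """Return True if the URL answers a HEAD request without an error."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout):
            return True
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses;
        # ValueError covers malformed URLs.
        return False


def find_broken(purl_table, check=is_reachable):
    """Return the PURLs whose target URL fails the check, sorted by name."""
    return sorted(purl for purl, target in purl_table.items() if not check(target))
```

Passing the check in as a parameter means the same function can be run against a stand-in check, with no network access, before it is pointed at real URLs.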

The software is available for download. If a site administrator needs the ability to create and manage many PURLs, this may be the preferred method. The software runs on UNIX and requires a basic knowledge of UNIX to install and configure.

This will not be the ultimate solution to the problem of broken links. That awaits the implementation of Uniform Resource Names (URNs). But until that time, this is an easily implemented solution that is available to us now.

Further Information

The Home PURL resolver is located at: purl.oclc.org

Gardner, Elizabeth. “Keeping Users Hot on Your Site’s Trail” WebWeek 2(6) (May 20, 1996):48. www.webweek.com/96May20/undercon/webweaver.html or purl.oclc.org/NET/Cite1

Payette, Sandra. “Persistent Identifiers on the Digital Terrain” RLG DigiNews 2(2) (1997). www.rlg.org/preserv/diginews/diginews22.html#Identifiers or purl.oclc.org/NET/Cite2

David Bigwood, Librarian

Bigwood@lpi.jsc.nasa.gov



Last modified: 1999:06:01