This page contains detailed hints for configuring Linklint for web sites that need to make use of most of Linklint's features. Some sections here may not be applicable to your site. These hints are intended as suggestions to help you quickly get started checking links. Your mileage may vary.
- Use the /@ linkset to check your entire site.
- Use -limit NNN to check more than 500 HTML files.
- Always use the -doc dir option.
- Consider using the -docbase option if you are doing a local site check.
- Read all the documentation.
Create a Command File |
Before checking your site, take the time to put some information about your site in a command file. This will avoid a lot of retyping (and possible typos) later on. It often makes sense to name the command file after your host name. The command file should look like:
# general command file for hostname
-host www.hostname.com
-root /absolute/path/to/your/htmlrootdirectory
-doc linkdoc
-http
-limit 1000
Resist the temptation to include any linksets in this command file. The reason will become clear when you start tracking down broken links. If you need to use a large list of linksets, another option is to include these in their own separate command file.
You can check your home page with linklint @hostname. You can check your entire site with linklint @hostname /@.
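As a minimal sketch, the command file above could be created from a shell with a here-document. The file name hostname and every value in it are the placeholders from the example, not real settings, so substitute your own.

```shell
# Write the example command file into the current directory.
# "hostname", the host name and the root path are placeholders from the text.
cat > hostname <<'EOF'
# general command file for hostname
-host www.hostname.com
-root /absolute/path/to/your/htmlrootdirectory
-doc linkdoc
-http
-limit 1000
EOF

# Print the two checks described above; run them once linklint is on your PATH.
echo "home page:   linklint @hostname"
echo "entire site: linklint @hostname /@"
```

Note that the linksets stay on the command line rather than in the file, as recommended above.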
Often the easiest way to understand some of the many features that Linklint has to offer is to try them out. Linklint is very fast and it is easy to play around with it on just a few pages. Start with a simple command file like the example above and then add features and options as needed. In the spirit of Perl, Linklint has been designed to "do the right thing".
Resolving Memory Problems |
Here are some things you can do to reduce the amount of memory that Linklint uses.
Check Your Site in Sections
If you have a very large site (thousands of pages), it might make sense to break your site up into several sections and check each section separately. One way to do this is to check all the files in the root directory and then check the files in each subdirectory.
Note: Linklint is designed so that all links between the sections will be checked correctly. Currently, the output files for each section will not be merged.
Create a command file named root for checking the root directory:
# root directory command file
@hostname
-doc rootdoc
/#
For each subdirectory (or group of subdirectories) create a command file named subdir:
# command file for subdir
@hostname
-doc subdirdoc
/subdir/@
Now you can check just your root directory with linklint @root and each subdirectory with linklint @subdir, and the results will be kept in separate output directories.
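If there are many subdirectories, the per-section command files could be generated with a short shell loop. This is only a sketch: the subdirectory names products, support and news are hypothetical, and it assumes a hostname command file already exists in the current directory.

```shell
# Hypothetical top-level subdirectories of the site; replace with your own.
subdirs="products support news"

# Command file for the root directory only (the /# linkset from the text).
cat > root <<'EOF'
# root directory command file
@hostname
-doc rootdoc
/#
EOF

# One command file per subdirectory, each with its own output directory.
for sub in $subdirs; do
    cat > "$sub" <<EOF
# command file for $sub
@hostname
-doc ${sub}doc
/$sub/@
EOF
done

# List the commands that check each section separately.
for name in root $subdirs; do
    echo "linklint @$name"
done
```

Each generated file reuses the shared settings through @hostname and differs only in its -doc directory and linkset.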
Run Linklint Twice
If you have a large site, don't use the -net option when you are checking your site. Instead, after you check your site (without the -net option), run Linklint again as:
linklint -doc doc_dir @@
You will end up with the same results as with a single pass of Linklint but the memory requirements will have been eased.
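The two passes could be wrapped in a small script such as the one below. It is a sketch, not part of Linklint: the run helper and the DRY_RUN variable are made up for illustration, and it assumes the example hostname command file, whose -doc directory is linkdoc and which does not contain -net. By default it only prints the commands.

```shell
# Hypothetical helper: print the command when DRY_RUN is 1 (the default),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

doc_dir=linkdoc   # must match the -doc directory used in the first pass

# Pass 1: check the site itself, without the -net option.
run linklint @hostname /@

# Pass 2: run Linklint again against the same output directory.
run linklint -doc "$doc_dir" @@
```

Set DRY_RUN=0 in the environment to actually execute both passes.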
Use the -no_anchors option
Since there are often many named anchors on a single page, the list of named anchors that Linklint generates and checks can be larger than the list of HTML pages. You can use the -no_anchors option to tell Linklint to ignore named anchors, which should reduce memory consumption.
Add Passwords |
If you get warning messages that say need password for "realm", you will have to provide Linklint with a username and password for each password-protected realm. Add these lines to your hostname file:
-password "realm1" username1:password1
-password "realm2" username2:password2
The realms are double quoted in the warning messages. You will have to use double quotes in the command file if the realm contains any space characters. You can also use the realm "DEFAULT" to provide a default username and password. The default will be tried only if a password for the specific realm was not given. Once you have made these changes to your command file, check the site again to make sure that you entered all the information correctly. You will get warning messages for invalid username/password combinations.
Note: The HTTP protocol uses a named realm to identify a set of pages that share a common set of username/password combinations. This system was created so that visitors only need to be prompted for their username and password once (per session) in order to browse any number of pages in a given realm. Realms are often used to protect all the files under a particular subdirectory, but they can be used in other configurations.
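For example, the password lines could be appended to the command file from a shell as sketched below. The realm names and credentials are hypothetical. Restricting the file's permissions is a general precaution for any file that stores clear-text passwords, not something Linklint is known to require.

```shell
# Append hypothetical credentials to the hostname command file.
# The quoted 'EOF' keeps the shell from altering the double quotes.
cat >> hostname <<'EOF'
-password "Members Area" alice:secret1
-password "DEFAULT" guest:secret2
EOF

# The file now holds clear-text passwords, so make it readable by its owner only.
chmod 600 hostname
```

The first realm contains a space, so its double quotes are required; the DEFAULT entry is the fallback described above.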
Add Server-Side Image Maps |
If your site makes use of server-side image maps, you may have to add a -map option to your command file so Linklint knows how to find your .map files. See Server-Side Image Maps for a detailed explanation. You may have to add one of the following lines to your hostname file:
-map /cgi-bin/imagemap
-map /cgi-bin/imagemap.exe
-map /cgi-bin/htimage
You will also need to have the -root directory specified so Linklint knows where to look for map files locally on your machine.
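To decide which of these lines applies, it can help to see which pages use server-side image maps at all. In HTML they are marked by the ISMAP attribute on an IMG tag, so a search like the sketch below lists the candidate pages; the HREF around each such image shows the program path to give to -map. The root path is the placeholder from the command file example, the find_ismap_pages helper is made up for illustration, and the script assumes pages end in .html.

```shell
# Placeholder: the same directory you pass to -root.
root="${1:-/absolute/path/to/your/htmlrootdirectory}"

# Hypothetical helper: list HTML files under a directory that mention ISMAP
# (case-insensitive), i.e. pages that likely use server-side image maps.
find_ismap_pages() {
    find "$1" -type f -name '*.html' -exec grep -il 'ismap' {} +
}

if [ -d "$root" ]; then
    find_ismap_pages "$root"
else
    echo "directory not found: $root" >&2
fi
```

This is a plain text search, so it will also list pages that merely mention the word in their text or comments.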
Tracking Down Errors |
Sometimes the error messages generated by Linklint do not provide sufficient information for figuring out why an error was reported. In these cases it can be useful to look at the HTML tags that caused the errors. One way to see these tags is to use the -db3 flag. This flag causes all HTML tags that contain links to be printed out, followed by the fully expanded links.
Here is one strategy for tracking down errors:
- Look in the errorF.txt or errorX.txt file to find the file that caused the error. Let's call its full (URL) path: /some/file.html
- Run linklint @hostname /some/file.html -db3 -doc dbdoc
  This will cause all the tags containing links in /some/file.html to be printed out in dbdoc/log.txt.
- Examine the dbdoc/log.txt file to see the HTML tags found by Linklint and the links that were extracted from these tags.
If you use this technique frequently, you can avoid repeated typing by making a debug command file:
# debug command file
@hostname
-db3
-doc dbdoc
You can use this file to debug an HTML page with the command:
linklint @debug /some/file.html
Server Redirection |
One of the worst causes of confusion in debugging broken links is server redirection. Some HTTP servers are programmed to deliver a different page than the one a visitor asks to see.
The most benign form of redirection is when the server program sends back a moved status code (301 or 302), telling the browser that the requested page has moved, along with a new URL. Linklint follows these links and reports all moved URLs in the file mapped.txt.
Sometimes a server is programmed to serve up the contents of a page that is different from the page requested without giving any hints to the browser (or to Linklint) that a switch has been made. Take a simple example where fileA.html is mapped to fileB.html. Linklint will tell you fileA.html is missing whenever fileB.html is missing even if fileA.html exists!
Since the server is not providing any clues that this switch has been made, there is nothing Linklint can do to alleviate the situation. I can only suggest that you minimize your use of this type of server redirection and familiarize yourself with which links on your site have been mapped this way.
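The announced kind of redirection (301 or 302) can also be inspected by hand. The sketch below uses curl, which is not part of Linklint, to print a URL's status code and the redirect target the server reports. The helper names classify_status and check_redirect and the example URL are hypothetical.

```shell
# Hypothetical helper: describe an HTTP status code in the terms used above.
classify_status() {
    case "$1" in
        301|302) echo "moved" ;;
        2??)     echo "ok" ;;
        *)       echo "other" ;;
    esac
}

# Hypothetical helper: request only the headers for a URL and report the
# status code plus the redirect target, if the server announces one.
check_redirect() {
    # -s silent, -I HEAD request, -o /dev/null discards the headers,
    # -w prints the status code and the URL a redirect would lead to.
    result=$(curl -s -I -o /dev/null -w '%{http_code} %{redirect_url}' "$1") || return 1
    code=${result%% *}
    target=${result#* }
    echo "$1: $code ($(classify_status "$code")) $target"
}

# Example (needs network access and curl); the URL is a placeholder:
# check_redirect http://www.hostname.com/fileA.html
```

A silent mapping still returns an ordinary success code, so it shows up here as "ok"; this check only reveals redirects that the server announces.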
© Copyright 1997 - 2001 James B. Bowlin |