Website Verification - Profile
The profile contains the settings for website verification and the results of the last verification. Select Action -> Edit profile from the main menu to open the profile window.
STARTING URLS
Starting URL:
Address of a web page that serves as a starting point for
the verification process.
Note: Starting URLs can be imported from a text file containing a list of URLs or from an HTML file. To import starting URLs from a file, click the Import button, select the source type (List of URLs or HTML file), and then select the source file in the window that appears.
Retrieve links from the specified page only:
If this option is selected, Web Link Validator will check the links within
the starting URL(s) only.
Verify external links:
This setting determines whether the program validates external links
(links located on other servers or above the starting page in
the directory tree). If this option is selected, the program will verify
external links.
Case sensitivity:
Select this option when case-sensitivity is important. Usually,
activating this option makes sense only when the website is hosted
on a Unix-like or similar operating system.
Spell checker:
Toggles the built-in spell checker to spell-check words within all found
internal pages. If any spelling errors are found, they will be reported on the link
properties pane and in the report.
INTERNAL LINKS
Internal links are the links located within the Starting URL's folder or its subfolders.
Example:
Starting URL:
http://www.mywebsite.com/folder/
The following links are considered external:
http://www.anothersite.com
http://www.google.com
And also:
http://www.mywebsite.com
http://www.mywebsite.com/page1.htm
These links are EXTERNAL since they are "outside" the starting URL's area, i.e. they are not contained within the http://www.mywebsite.com/folder/ folder or its subfolders.
The following links are considered internal:
http://www.mywebsite.com/folder/page1.htm
http://www.mywebsite.com/folder/page2.htm
...and so on.
These links are INTERNAL since they are "inside" the starting URL's area, i.e. they are contained within the http://www.mywebsite.com/folder/ folder or its subfolders.
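The folder rule above can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not Web Link Validator's actual code; the function name is_internal is hypothetical, and comparing host names without regard to case is an assumption.

```python
from urllib.parse import urlsplit

def is_internal(link: str, starting_url: str) -> bool:
    """Return True if link lies in the starting URL's folder or a subfolder."""
    start = urlsplit(starting_url)
    target = urlsplit(link)
    # Folder of the starting URL: everything up to and including the last "/".
    start_folder = start.path.rsplit("/", 1)[0] + "/"
    target_path = target.path or "/"
    # Assumption: host names are compared case-insensitively.
    same_host = target.netloc.lower() == start.netloc.lower()
    return same_host and target_path.startswith(start_folder)

START = "http://www.mywebsite.com/folder/"
for url in ("http://www.anothersite.com",
            "http://www.mywebsite.com/page1.htm",
            "http://www.mywebsite.com/folder/page1.htm"):
    print(url, "->", "internal" if is_internal(url, START) else "external")
```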
Internal Links in Web Link Validator
The Internal Links settings define links and masks to be considered as internal.
The Starting URLs data defines the starting point of the validation and the links to be validated. The program automatically processes links located within the Starting URL's folder or its subfolders (internal links) and skips full processing of links located outside them (external links).
To validate links that are outside the Starting URL's folder or its subfolders, these links or their masks should be listed in the internal links area, which is marked as "Treat external link as internal if it matches one of the following masks (one per line)".
To exclude specific links from the list of internal links that you have just specified above, list such links or their masks in the external links area, which is marked as "...but treat link as external anyway if it matches one of the following masks (one per line)".
Notes:
To specify masks, you may use special characters, such as:
'*' - stands for any text;
'?' - stands for any character.
Example:
Starting URL:
http://www.mywebsite.com/folder2/
Treat external link as internal if it matches one of the following masks (one per line):
*www.mywebsite.com/folder/*
...but treat link as external anyway if it matches one of the following masks (one per line):
*www.mywebsite.com/folder/temp/*
The program treats these links as internal:
http://www.mywebsite.com/folder/
http://www.mywebsite.com/folder2/
http://www.mywebsite.com/folder/page.htm
However, links whose URL contains www.mywebsite.com/folder/temp/ are considered external and therefore will not be fully processed by the program.
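The way the two mask lists interact can be sketched as follows. This is a minimal illustration, not the program's implementation: the helper names are hypothetical, masks are assumed to match the whole URL with case sensitivity, and the "treat as external anyway" list is assumed to override every other rule.

```python
import re

def mask_to_regex(mask: str) -> "re.Pattern[str]":
    """Translate a mask into a regex: '*' is any text, '?' is any one character."""
    parts = []
    for ch in mask:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # every other character is literal
    return re.compile("".join(parts), re.DOTALL)

def matches_any(url: str, masks) -> bool:
    return any(mask_to_regex(m).fullmatch(url) for m in masks)

def is_internal(url, start_folder, internal_masks, external_masks) -> bool:
    # Internal by location, or promoted to internal by a mask...
    internal = url.startswith(start_folder) or matches_any(url, internal_masks)
    # ...unless a "treat as external anyway" mask matches (assumed to win).
    return internal and not matches_any(url, external_masks)

START = "http://www.mywebsite.com/folder2/"
INTERNAL = ["*www.mywebsite.com/folder/*"]
EXTERNAL = ["*www.mywebsite.com/folder/temp/*"]
for url in ("http://www.mywebsite.com/folder/page.htm",
            "http://www.mywebsite.com/folder/temp/draft.htm"):
    print(url, is_internal(url, START, INTERNAL, EXTERNAL))
```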
AUTHENTICATION
If your site contains pages that require a user name and password, enter them here. When Web Link Validator encounters a page that requires authentication, it will automatically use the user name and password you provided.
These settings work only for web server-based authentication. If your website uses HTML-based authentication, you may use the POST Method in the Starting URLs section.
Note: Only one user name and password can be used for the entire site.
Related topics:
How to Check Password Protected Websites
How to Check Password Protected Websites with HTML-based Authentication
HTML ANALYSIS
Searching for links within the following tags - Only selected HTML tags will be analyzed. If some tags are not marked, links found in them will be ignored.
SCRIPT ANALYSIS
Lets you find links inside scripts and style sheets (such as JavaScript or CSS) and in non-HTML documents, such as Word, PDF, Excel, etc.
Search for links in JavaScript:
Attempts to find links in JavaScript inside each page's source code.
Analyze DHTML Events - Attempts to find links in event-handling commands like onClick, onMouseOver etc.
Analyze parameter "VALUE=..." - Attempts to find links in the value parameter. Applicable to HTML tags: option, input, param.
Analyze <SCRIPT>...</SCRIPT> sections - Attempts to find links in code enclosed in the <SCRIPT> and </SCRIPT> tags. By default, the application analyzes both code contained within the main web page and code in external JavaScript files. To disable analyzing external JavaScript files, clear the option Download .JS files and search for links.
Search for links in CSS, <STYLE>...</STYLE> section:
Attempts to find links in cascading style sheet code and style formatting
enclosed in the <STYLE> and </STYLE> tags. Please note: By default, the
application analyzes both style code contained within the main web page
and code in external style sheet files. To disable analyzing external
CSS files, clear the option Download .CSS files and search for links.
Search for links in Adobe Flash files:
Attempts to find links in any Adobe Flash files it encounters.
Search for links in XML/RSS:
Attempts to find links in any XML/RSS files it encounters.
Search for links in DOC:
Attempts to find links in any Word documents it encounters.
Search for links in RTF:
Attempts to find links in any rich-text files it encounters.
Search for links in PDF:
Attempts to find links in any Adobe Acrobat files it encounters.
Search for links in XLS:
Attempts to find links in any Excel spreadsheets it encounters.
VERIFY LINKS
In this section of the profile settings you can set additional link verification parameters.
Verify external links:
This setting determines whether the program verifies external links
(links located on other servers or above the starting
page in the directory tree). If this option is selected, the program will verify
external links.
Mark links found in the <FORM> tag as "Unsupported":
This setting determines how the program handles links to
executable files specified in <FORM> tag parameters. Usually such a link points to
a script that processes a form. If this option is selected, the
program will mark these links as Unsupported.
Waiving verification for links matching one of the following masks:
This setting lets you specify masks for links that the program will not
verify. These links will still be added to the list, marked as Not verified.
Verify links matching one of the following masks:
Links matching these masks will be verified by the program.
Notes:
To specify masks, you may use special characters, such as:
'*' - stands for any text;
'?' - stands for any character.
Examples:
The mask "http://www.site.com/part/*.zip" excludes the verification of the link "http://www.site.com/part/one.zip", but doesn't exclude the verification of the link "http://www.site.com/part/one.pdf".
The mask "*/cgi-bin/enter.cgi" excludes the link "http://site.com/cgi-bin/enter.cgi", but doesn't exclude the link "http://site.com/enter.cgi".
EXCLUDE LINKS
Use the Exclude Links section to tell Web Link Validator which links or sections of your website are to be ignored during the verification.
Exclude all external links
Exclude all links that point outside the server specified in the starting page
or to higher-level directories of the same server.
Example:
With the starting URL http://www.example.com/products/, links to all folders other than "products" and other domains will be excluded from the verification.
Exclude links matching one of the following masks - Exclude these links from the verification.
Example:
*/images/* - will exclude all links that contain "/images/" in their URL.
Do not exclude links matching one of the following masks - This field allows you to include URLs that would normally be excluded from the verification as specified in the Exclude links matching... field.
Example:
The "/images/" folder and all of its subfolders, according to the rule above, are to be excluded from the verification. However, if you want to keep the "/images/new/" subfolder verified, you can enter the following mask in this field: */images/new/*.
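The interplay of the two fields can be sketched in Python. This is an illustration only: the function name is_excluded is hypothetical, and the standard fnmatch module stands in for the program's mask matching (note that fnmatch also treats "[" specially, which the masks described here do not).

```python
from fnmatch import fnmatchcase

def is_excluded(url, exclude_masks, keep_masks):
    """Return True if the URL would be left out of the verification."""
    excluded = any(fnmatchcase(url, m) for m in exclude_masks)
    # A "Do not exclude" mask overrides a matching "Exclude" mask.
    kept = any(fnmatchcase(url, m) for m in keep_masks)
    return excluded and not kept

EXCLUDE = ["*/images/*"]
KEEP = ["*/images/new/*"]
print(is_excluded("http://www.example.com/images/logo.gif", EXCLUDE, KEEP))
print(is_excluded("http://www.example.com/images/new/banner.gif", EXCLUDE, KEEP))
```

The first call reports the link as excluded; the second does not, because the "Do not exclude" mask takes precedence.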
DIRECTORY INDEX
Use settings in this section to avoid duplicate checking of an index file by removing its name from the URL. For example, the URLs "http://www.myhost.com/" and "http://www.myhost.com/index.html" have different syntax but link to one and the same file. So, we can remove the 'index.html' substring from the second URL without losing any information.
Note: This feature applies to internal links only.
Examples:
index.html
index.php
default.asp
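As a rough sketch of the idea, the index file name can be removed from a URL like this. The function name strip_directory_index is hypothetical, and the file names are assumed to be compared with case sensitivity.

```python
from urllib.parse import urlsplit, urlunsplit

INDEX_NAMES = {"index.html", "index.php", "default.asp"}

def strip_directory_index(url, index_names=INDEX_NAMES):
    """Drop a trailing index file name so equivalent URLs compare equal."""
    parts = urlsplit(url)
    folder, _, filename = parts.path.rpartition("/")
    if filename in index_names:
        # Keep the folder path, including its trailing slash.
        parts = parts._replace(path=folder + "/")
    return urlunsplit(parts)

print(strip_directory_index("http://www.myhost.com/index.html"))
# prints http://www.myhost.com/
```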
ERROR PAGES
Parameters in this section help identify broken links in cases where the server displays an error page (or redirects to such a page) but returns an incorrect error code. This may happen when custom error pages are used.
Note: This feature applies to internal links only.
Mark link as broken if it redirects to an URL matching one of these masks - In the field below, enter masks of files that are custom error pages. When the application encounters a link that points to one of these files, it will mark that link as broken.
Example:
*example.com/error/* - Mark link as broken if it points or redirects to any file in the error folder on this domain and its sub-domains.
Mark link as broken if its HTML source code matches any of these masks - If the HTML source code of the page contains any phrases from the field below, the link will be marked as broken.
Example:
*404 Not Found* - Mark link as broken if the HTML source code of the page contains this string.
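The two checks can be sketched together in Python. This is only an illustration of the logic described in this section: the function name is_custom_error is hypothetical, and the standard fnmatch module is assumed to be a close enough stand-in for the program's mask matching.

```python
from fnmatch import fnmatchcase

def is_custom_error(final_url, html, url_masks, html_masks):
    """Return True if a response looks like a custom error page."""
    # Check 1: the link points or redirects to a known error-page URL.
    if any(fnmatchcase(final_url, m) for m in url_masks):
        return True
    # Check 2: the page source matches one of the text masks.
    return any(fnmatchcase(html, m) for m in html_masks)

URL_MASKS = ["*example.com/error/*"]
HTML_MASKS = ["*404 Not Found*"]
print(is_custom_error("http://www.example.com/page.html",
                      "<h1>404 Not Found</h1>", URL_MASKS, HTML_MASKS))
```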
LIMITS
These settings let you limit the exploration range.
Maximum exploration depth - Number of levels below the starting page the application is allowed to explore. A value of zero means depth is not restricted.
Maximum number of links - Total number of links the application is allowed to process.
Non-HTML files - Total number of non-HTML files (e.g. external JavaScript, Cascading Style Sheets, Flash animation, etc.), each no larger than the size defined in the settings, that the application is allowed to download and analyze.
Note: To edit the value, click on the edit box and then enter the necessary number in it or, alternately, click on the up or down arrow by the edit box until the desired value is shown in it.
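To illustrate how a depth limit and a link limit bound the exploration, here is a small breadth-first sketch over an in-memory link graph. It is not the program's crawler: the function name explore and the graph are hypothetical, the starting page is assumed to count toward the link total, and None (rather than any particular number) is used to mean "no link limit".

```python
from collections import deque

def explore(graph, start, max_depth=0, max_links=None):
    """Return pages in the order visited, honoring both limits.

    max_depth=0 means the depth is not restricted.
    """
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if max_depth and depth >= max_depth:
            continue  # do not follow links below the allowed depth
        for link in graph.get(page, []):
            if link in seen:
                continue
            if max_links is not None and len(order) >= max_links:
                return order  # link budget used up
            seen.add(link)
            order.append(link)
            queue.append((link, depth + 1))
    return order

site = {"a": ["b", "c"], "b": ["d"], "d": ["e"]}
print(explore(site, "a", max_depth=1))
# prints ['a', 'b', 'c']
```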
PAGE OPTIMIZATION
This group of settings lets you define which pages are to be marked as slow, small, new, old or deep.
Slow pages - Pages that exceed this number of kilobytes (counting both HTML code and images) will be marked as slow.
Small pages - Pages that are smaller than this number of kilobytes in size will be marked as small.
New pages - Pages that were last modified within this number of days will be marked as new.
Old pages - Pages that were NOT modified within this number of days will be marked as old.
Deep pages - Pages with the number of clicks from starting page greater than this will be marked as deep.
Note: To edit the value, click on the edit box and then enter the necessary number in it or, alternately, click on the up or down arrow by the edit box until the desired value is shown in it.
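The five thresholds can be read as simple comparisons, sketched below. The names Thresholds and page_flags and all the sample values are hypothetical; the sketch only mirrors the wording of the settings above.

```python
from dataclasses import dataclass

@dataclass
class Thresholds:
    slow_kb: float      # Slow pages
    small_kb: float     # Small pages
    new_days: int       # New pages
    old_days: int       # Old pages
    deep_clicks: int    # Deep pages

def page_flags(size_kb, days_since_modified, clicks_from_start, t):
    """Return the marks a page would receive under the given thresholds."""
    flags = []
    if size_kb > t.slow_kb:
        flags.append("slow")
    if size_kb < t.small_kb:
        flags.append("small")
    if days_since_modified <= t.new_days:
        flags.append("new")
    if days_since_modified > t.old_days:
        flags.append("old")
    if clicks_from_start > t.deep_clicks:
        flags.append("deep")
    return flags

limits = Thresholds(slow_kb=100, small_kb=1, new_days=7, old_days=365, deep_clicks=3)
print(page_flags(150, 2, 5, limits))
# prints ['slow', 'new', 'deep']
```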
RESOURCE UTILIZATION
Parameters in this section help to decrease the system resources required by the program (especially RAM) and speed up program operation. These features become extremely valuable when checking very large sites or sites containing numerous reciprocal links.
These settings specify which data should be recorded in the process of website verification.
Save the information on bookmarks - If bookmark information is not required, disable this option to speed up operation and save RAM.
ORPHAN ANALYSIS
This section contains Orphan Analysis settings to identify files on your web server that are not in use. These files needlessly occupy valuable disk space and can be deleted.
The program compares files found in directories on the local/network computer or FTP server with the URLs traced during verification. Files that do not belong to any page are flagged as orphaned.
Files become orphaned when no working link points to them. An orphaned file can be brought back into use either by fixing the links directed at it or by creating links to it in other files.
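At its core, the comparison is a set difference between the files on the server and the files reached through links. The sketch below assumes, for illustration only, that the server's root folder maps directly to the site root and that file paths are given relative to it; the function name find_orphans is hypothetical.

```python
from urllib.parse import urlsplit

def find_orphans(server_files, crawled_urls):
    """Return server files that no crawled URL refers to."""
    # Assumption: URL path "/img/a.png" corresponds to file "img/a.png".
    referenced = {urlsplit(url).path.lstrip("/") for url in crawled_urls}
    return sorted(set(server_files) - referenced)

files = ["index.html", "about.html", "img/old-banner.png"]
urls = ["http://www.example.com/index.html",
        "http://www.example.com/about.html"]
print(find_orphans(files, urls))
# prints ['img/old-banner.png']
```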
Related topic:
How to Perform Orphan Analysis
PAGE RULES
Page Rules are designed for evaluating pages against certain conditions and confirming the absence or presence of specific display text, tag text, links, scripting, forms, etc. generated by your code. For example, you can test your website's pages to find out whether each one of them contains contact information.
Page Rule Groups let you apply individual sets of page rules to different sections of your website. For instance, you can verify whether different product page titles contain the corresponding product names, i.e. the text 'Product One' must appear on the pages whose URLs begin with http://website/prod1, and the text 'Product Two' must appear on the pages whose URLs begin with http://website/prod2.
By default, page rules within a group will apply to all pages of the website. To limit a certain page rule group's applicable area, use the 'Exclude...' and 'Do not exclude...' settings.
Notes:
To quickly disable a group without deleting it, just add the '*' sign to the 'Exclude...' setting.
To apply a group's page rules to a certain page or section of the website, first exclude all pages by adding '*' to the 'Exclude...' setting, and then add the pages to be evaluated to the 'Do not exclude...' setting, e.g. "http://www.relsoftware.com/wlv/*".
Related topics:
How to Find Specific Text on the Website
How to Use Web Link Validator as Reciprocal Link Checking Tool
HTML SYNTAX
This set of options enables analyzing HTML tags and finding those that contain errors.
Enable - enables verifying the validity of HTML tags. Other options on this page are only available when this option is enabled.
<IMG> tag without ALT attribute - finds all <IMG> tags without ALT attribute.
<IMG> tag with blank ALT attribute - finds all <IMG> tags with blank ALT attribute.
<IMG> tag without HEIGHT/WIDTH attribute - finds all <IMG> tags without HEIGHT/WIDTH attribute.
<INPUT TYPE=IMAGE> without ALT attribute - finds all <INPUT TYPE=IMAGE> tags without ALT attribute.
<INPUT TYPE=IMAGE> tag with blank ALT attribute - finds all <INPUT TYPE=IMAGE> tags with blank ALT attribute.
<INPUT TYPE=IMAGE> tag without HEIGHT/WIDTH attribute - finds all <INPUT TYPE=IMAGE> tags without HEIGHT/WIDTH attribute.
<A> tag without ALT attribute - finds all <A> tags without ALT attribute.
<A> tag with blank ALT attribute - finds all <A> tags with blank ALT attribute.
<A> tag without TITLE attribute - finds all <A> tags without TITLE attribute.
<A> tag with blank TITLE attribute - finds all <A> tags with blank TITLE attribute.
<A> tag with missing HREF/NAME attribute - finds all <A> tags with missing HREF/NAME attribute.
<A> tag with blank HREF/NAME attribute - finds all <A> tags with blank HREF/NAME attribute.
<A> tag with blank comment - finds all <A> tags with blank comment; e.g., <a href="http://www.example.com/"></a>.
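As an illustration of what such checks involve, the three <IMG> checks can be sketched with Python's standard html.parser module. This is not how Web Link Validator is implemented; the class name ImgChecker and the message strings are hypothetical.

```python
from html.parser import HTMLParser

class ImgChecker(HTMLParser):
    """Collect (src, problem) pairs for <IMG> tags with attribute issues."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":  # the parser lowercases tag and attribute names
            return
        attr_map = dict(attrs)
        src = attr_map.get("src")
        if "alt" not in attr_map:
            self.problems.append((src, "IMG without ALT"))
        elif not (attr_map["alt"] or "").strip():
            self.problems.append((src, "IMG with blank ALT"))
        if "height" not in attr_map or "width" not in attr_map:
            self.problems.append((src, "IMG without HEIGHT/WIDTH"))

checker = ImgChecker()
checker.feed('<img src="a.png"><IMG SRC="b.png" ALT="" WIDTH="1" HEIGHT="1">')
for src, problem in checker.problems:
    print(src, "-", problem)
```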
REPLACE LINKS
Use these settings to replace certain links before the verification. For
example, you may want to replace links like
"http://www.example.com/redirect.aspx?redirURL=http://www.example.com/news.aspx"
with
"http://www.example.com/news.aspx" to avoid the redirect when
validating the link.
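The effect of such a replacement on the example above can be sketched in Python. The sketch shows only the resulting transformation, not the syntax of the program's replace rules; the function name unwrap_redirect is hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

def unwrap_redirect(url, param="redirURL"):
    """Return the target carried in a redirect parameter, or the URL itself."""
    query = parse_qs(urlsplit(url).query)
    targets = query.get(param)
    return targets[0] if targets else url

wrapped = ("http://www.example.com/redirect.aspx"
           "?redirURL=http://www.example.com/news.aspx")
print(unwrap_redirect(wrapped))
# prints http://www.example.com/news.aspx
```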
AUTO
These settings are used to simplify working with the command line when the Auto mode is on. The Auto mode is activated when the "/auto" key is specified on the command line. The corresponding command line parameters have a higher priority than their counterparts in these subsections.
Related topic:
How to Get the Report Emailed Automatically on a Daily Basis
REPORT
These options define the report filename, its location and the information to be included in the report.
Report filename - Specify the name of the file where the report is to be saved. The file name along with the folder path can be typed in, selected from the drop-down menu or (easiest) browsed to by clicking the Browse button next to the report filename field.
Use personal reports list for this profile - Lets you select the information to be included in the report. This list can be adjusted on the report setup screen when generating a report.
Exclude
Exclude links from the report if their error descriptions match at least one mask
specified in this box.
To edit masks in the exclude box, simply click the box and edit the content
as necessary.
ADVANCED
Use these options to fine-tune the selected profile's settings.
Load limit - Defines how many links per second the application is allowed to process. Select the Use personal limit settings... option and then select the desired limit on the drop-down menu below.
Disable cookies - Select this option to emulate a browser with disabled cookies.
Session ID - If the website adds a session ID to the addresses of its links, you may end up with different URLs pointing to the same pages. To avoid that, enter the session identifier in this field, and Web Link Validator will ignore the session value when checking these pages.
Example:
For the URL http://www.example.com/page.asp?SID=12345 enter "SID" in the Session ID field, and the software will read this link as http://www.example.com/page.asp.
Setup URLs - URLs to be visited to obtain the credentials necessary for accessing the website to be checked, for example, a cookie required for enabling links on the "Starting URLs" pages. You may also need this feature if the site uses a complex HTML-based authentication scheme.
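The Session ID example above amounts to dropping one query parameter before URLs are compared. A minimal Python sketch follows; the function name strip_session_id is hypothetical, and the session ID is assumed to travel as an ordinary query parameter.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_session_id(url, session_param="SID"):
    """Remove the session parameter so equivalent URLs compare equal."""
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key != session_param]
    # Note: urlencode may re-encode the remaining parameter values.
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(strip_session_id("http://www.example.com/page.asp?SID=12345"))
# prints http://www.example.com/page.asp
```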
Copyright © 2001—2008 REL Software. All rights reserved.