Website Verification - Profile
A profile contains the settings for website verification and the results of the last verification. Select Action -> Edit profile from the main menu to open the profile window.

STARTING URLS
Starting URL:
Note: Starting URLs can be imported from a text file containing a list of URLs or from an HTML file. To import starting URLs from a file, click the Import button, select the source type (List of URLs or HTML file), and then select the source file in the window that appears.
Retrieve links from the specified page only:
Verify external links:
Case sensitivity:
Spell checker:

INTERNAL LINKS
Internal links are the links located within the Starting URL's folder or its subfolders.
Example:
Starting URL:
The following links are considered external:
And also:
These links are EXTERNAL since they are "outside" the starting URL's area, i.e. they are not contained within the http://www.mywebsite.com/folder/ folder or its subfolders.
The following links are considered internal:
These links are INTERNAL since they are "inside" the starting URL's area, i.e. they are contained within the http://www.mywebsite.com/folder/ folder or its subfolders.

Internal Links in Web Link Validator
The Internal Links settings define the links and masks to be treated as internal. The Starting URLs data defines the point at which validation starts and the links to be validated. The program automatically processes links located within the Starting URL's folder or its subfolders (internal links) and skips full processing of links located outside them (external links).
To validate links that are outside the Starting URL's folder or its subfolders, list these links or their masks in the internal links area, which is marked as "Treat external link as internal if it matches one of the following masks (one per line)".
To exclude specific links from the set of internal links specified above, list such links or their masks in the external links area, which is marked as "...but treat link as external anyway if it matches one of the following masks (one per line)".
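The classification rules above can be sketched in Python as follows. This is only an illustration, not Web Link Validator's actual implementation. The function names are hypothetical, "*" is assumed to match any run of characters, masks are assumed to match the whole URL, and the external masks are assumed to take precedence over both the folder rule and the internal masks.

```python
import re
from urllib.parse import urlsplit


def mask_matches(mask: str, url: str) -> bool:
    # Assumed mask semantics: '*' matches any run of characters,
    # every other character must match literally.
    pattern = ".*".join(re.escape(part) for part in mask.split("*"))
    return re.fullmatch(pattern, url, flags=re.DOTALL) is not None


def starting_folder(starting_url: str) -> str:
    # Cut the starting URL back to its folder, e.g.
    # http://host/folder/index.html -> http://host/folder/
    parts = urlsplit(starting_url)
    folder = parts.path[: parts.path.rfind("/") + 1] or "/"
    return f"{parts.scheme}://{parts.netloc}{folder}"


def is_internal(url, starting_url, internal_masks=(), external_masks=()):
    # Rule 1: links inside the starting URL's folder are internal.
    internal = url.startswith(starting_folder(starting_url))
    # Rule 2: outside links become internal if they match an internal mask.
    if not internal:
        internal = any(mask_matches(m, url) for m in internal_masks)
    # Rule 3 (assumed precedence): an external mask overrides rules 1 and 2.
    if internal and any(mask_matches(m, url) for m in external_masks):
        return False
    return internal
```

With the starting URL http://www.mywebsite.com/folder/ and the external mask *www.mywebsite.com/folder/temp/*, a call such as is_internal("http://www.mywebsite.com/folder/temp/a.html", ...) returns False under these assumptions, while other pages in the folder remain internal.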
Notes:
Example:
Starting URL:
Treat an external link as internal only if it matches one of the following masks (one per line):
...but treat link as external anyway if it matches one of the following masks (one per line):
The program treats these links as internal:
However, links whose URL contains www.mywebsite.com/folder/temp/ are considered external and therefore will not be fully processed by the program.

AUTHENTICATION
If your site contains pages that require a user name and password, enter them here. When Web Link Validator encounters a page that requires authentication, it automatically uses the user name and password you provided. These settings work only for web server-based authentication. If your website uses HTML-based authentication, you may use the POST Method in the Starting URLs section.
Note: Only one user name and password can be used for the entire site.
Related topics:

HTML ANALYSIS
Searching for links within the following tags - Only the selected HTML tags will be analyzed. If a tag is not selected, links found in it will be ignored.

SCRIPT ANALYSIS
Finds links inside scripts and style sheets, such as JavaScript or CSS, and in non-HTML documents, such as Word, PDF, Excel, etc.
Search for links in JavaScript:
Analyze DHTML Events - Attempts to find links in event-handling commands like onClick, onMouseOver, etc.
Analyze parameter "VALUE=..." - Attempts to find links in the value parameter. Applicable to the HTML tags option, input, and param.
Analyze <SCRIPT>...</SCRIPT> sections - Attempts to find links in code enclosed in the <SCRIPT> and </SCRIPT> tags. By default, the application analyzes both the code contained within the main web page and the code in external JavaScript files. To disable the analysis of external JavaScript files, clear the option Download .JS files and search for links.
Search for links in CSS, <STYLE>...</STYLE> section:
Search for links in Adobe Flash files:
Search for links in XML/RSS
Search for links in DOC
Search for links in RTF
Search for links in PDF
Search for links in XLS

VERIFY LINKS
In this section of the profile settings you can set additional link verification parameters.
Verify external links:
Mark links found in the <FORM> tag as "Unsupported":
Waiving verification for links matching one of the following masks:
Verify links matching one of the following masks:
Notes:
Examples:
The mask "http://www.site.com/part/*.zip" excludes the link "http://www.site.com/part/one.zip" from verification, but does not exclude the link "http://www.site.com/part/one.pdf".
The mask "*/cgi-bin/enter.cgi" excludes the link "http://site.com/cgi-bin/enter.cgi", but does not exclude the link "http://site.com/enter.cgi".

EXCLUDE LINKS
Use the Exclude Links section to tell Web Link Validator which links or sections of your website are to be ignored during the verification.
Exclude all external links
Example: With the starting URL http://www.example.com/products/, links to all folders other than "products" and links to other domains will be excluded from the verification.
Exclude links matching one of the following masks - Excludes these links from the verification.
Example: */images/* - excludes all links that contain "/images/".
Do not exclude links matching one of the following masks - This field lets you include URLs that would otherwise be excluded from the verification by the Exclude links matching... field.
Example: According to the rule above, the "/images/" folder and all of its subfolders are excluded from the verification. However, if you want the "/images/new/" subfolder to be verified, enter the following mask in this field: */images/new/*.

DIRECTORY INDEX
Use the settings in this section to avoid checking an index file twice by removing its name from the URL. For example, the URLs "http://www.myhost.com/" and "http://www.myhost.com/index.html" differ in syntax but point to the same file, so the 'index.html' substring can be removed from the second URL without losing any information.
Note: This feature applies to internal links only.
Examples:

ERROR PAGES
The parameters in this section help identify broken links in cases where the server displays an error page (or redirects to such a page) but returns an incorrect error code. This may happen when custom error pages are used.
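The idea behind this kind of check can be sketched in Python. The function names are hypothetical and the sketch is not the program's actual code. It assumes that a page is judged by two lists of masks, one tested against the final URL after redirects and one against the HTML source, and that "*" matches any run of characters.

```python
import re


def mask_matches(mask: str, text: str) -> bool:
    # Assumed mask semantics: '*' matches any run of characters
    # (including line breaks); all other characters are literal.
    pattern = ".*".join(re.escape(part) for part in mask.split("*"))
    return re.fullmatch(pattern, text, flags=re.DOTALL) is not None


def is_error_page(final_url: str, html: str, url_masks=(), html_masks=()) -> bool:
    # A link is treated as broken if the URL it ends up at looks like
    # a custom error page, even though the server reported success.
    if any(mask_matches(m, final_url) for m in url_masks):
        return True
    # Otherwise fall back to looking for tell-tale phrases in the page source.
    return any(mask_matches(m, html) for m in html_masks)
```

For instance, with the URL mask *example.com/error/* a page that redirects to a file in the error folder is reported as broken regardless of the status code the server returned.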
Note: This feature applies to internal links only.
Mark link as broken if it redirects to an URL matching one of these masks - In the field below, enter masks of files that are custom error pages. When the application encounters a link that points to one of these files, it marks that link as broken.
Example: *example.com/error/* - Mark link as broken if it points or redirects to any file in the error folder on this domain and its sub-domains.
Mark link as broken if its HTML source code matches any of these masks - If the HTML source code of the page contains any of the phrases from the field below, the link will be marked as broken.
Example: *404 Not Found* - Mark link as broken if the HTML source code of the page contains this string.

LIMITS
These settings limit the exploration range.
Maximum exploration depth - Number of levels below the starting page the application is allowed to explore. A value of zero means the depth is not restricted.
Maximum number of links - Total number of links the application is allowed to process.
Non-HTML files - Total number of non-HTML files (e.g. external JavaScript, Cascading Style Sheets, Flash animation, etc.), each no larger than the size defined in the settings, that the application is allowed to download and analyze.
Note: To edit a value, click the edit box and type the number, or click the up or down arrow next to the edit box until the desired value is shown.

PAGE OPTIMIZATION
This group of settings defines which pages are to be marked as slow, small, new, old or deep.
Slow pages - Pages that exceed this number of kilobytes (HTML code and images combined) will be marked as slow.
Small pages - Pages that are smaller than this number of kilobytes in size will be marked as small.
New pages - Pages that were last modified within this number of days will be marked as new.
Old pages - Pages that were NOT modified within this number of days will be marked as old.
Deep pages - Pages that are more than this number of clicks away from the starting page will be marked as deep.
Note: To edit a value, click the edit box and type the number, or click the up or down arrow next to the edit box until the desired value is shown.

RESOURCE UTILIZATION
The parameters in this section help reduce the system resources the program requires (especially RAM) and speed up its operation. These features are especially valuable when checking very large sites or sites containing numerous reciprocal links. These settings specify which data should be recorded during website verification.
Save the information on bookmarks - If bookmark information is not required, disable this option to speed up operation and save RAM.

ORPHAN ANALYSIS
This section contains the Orphan Analysis settings, which identify files on your web server that are not in use. These files needlessly occupy disk space and can be deleted. The program compares the files found in directories on the local/network computer or FTP server with the URLs traced during verification. Files that do not belong to any page are flagged as orphaned. Files become orphaned when links to their URLs are broken. Orphaned files can be repaired either by fixing the links directed at them or by creating such links in other files.
Related topic:

PAGE RULES
Page Rules are designed for evaluating pages against certain conditions and confirming the absence or presence of specific display text, tag text, links, scripting, forms, etc. generated by your code. For example, you can test your website's pages to find out whether each of them contains contact information.
Page Rule Groups let you apply individual sets of page rules to different sections of your website. For instance, you can verify whether different product page titles contain the corresponding product names: the text 'Product One' must appear on the pages whose URLs begin with http://website/prod1, and the text 'Product Two' must appear on the pages whose URLs begin with http://website/prod2.
By default, the page rules within a group apply to all pages of the website. To limit the area a page rule group applies to, use the 'Exclude...' and 'Do not exclude...' settings.
Notes:
To quickly disable a group without deleting it, add the '*' sign to the 'Exclude...' setting.
To apply a group's page rules to a certain page or section of the website, first exclude all pages by adding '*' to 'Exclude...', and then add the pages to be evaluated to 'Do not exclude...', e.g. "http://www.relsoftware.com/wlv/*".
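The way the 'Exclude...' and 'Do not exclude...' settings combine to scope a page rule group can be sketched in Python. This is a hypothetical illustration, not the program's code. It assumes that "*" matches any run of characters, that masks match the whole URL, and that a 'Do not exclude...' match always overrides an 'Exclude...' match.

```python
import re


def mask_matches(mask: str, url: str) -> bool:
    # Assumed mask semantics: '*' matches any run of characters.
    pattern = ".*".join(re.escape(part) for part in mask.split("*"))
    return re.fullmatch(pattern, url, flags=re.DOTALL) is not None


def group_applies(url: str, exclude=(), do_not_exclude=()) -> bool:
    # By default a group applies to every page of the website.
    excluded = any(mask_matches(m, url) for m in exclude)
    if not excluded:
        return True
    # An excluded page is brought back if a 'Do not exclude...' mask matches.
    return any(mask_matches(m, url) for m in do_not_exclude)
```

Under these assumptions, group_applies(url, exclude=["*"]) is False for every URL, which corresponds to disabling the group, and adding do_not_exclude=["http://www.relsoftware.com/wlv/*"] re-enables it for that section only.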
Related topic:

HTML SYNTAX
This set of options enables analyzing HTML tags and finding those that contain errors.
Enable - enables verifying the validity of HTML tags. The other options on this page are available only when this option is enabled.
<IMG> tag without ALT attribute - finds all <IMG> tags without ALT attribute.
<IMG> tag with blank ALT attribute - finds all <IMG> tags with blank ALT attribute.
<IMG> tag without HEIGHT/WIDTH attribute - finds all <IMG> tags without HEIGHT/WIDTH attribute.
<INPUT TYPE=IMAGE> without ALT attribute - finds all <INPUT TYPE=IMAGE> tags without ALT attribute.
<INPUT TYPE=IMAGE> tag with blank ALT attribute - finds all <INPUT TYPE=IMAGE> tags with blank ALT attribute.
<INPUT TYPE=IMAGE> tag without HEIGHT/WIDTH attribute - finds all <INPUT TYPE=IMAGE> tags without HEIGHT/WIDTH attribute.
<A> tag without ALT attribute - finds all <A> tags without ALT attribute.
<A> tag with blank ALT attribute - finds all <A> tags with blank ALT attribute.
<A> tag without TITLE attribute - finds all <A> tags without TITLE attribute.
<A> tag with blank TITLE attribute - finds all <A> tags with blank TITLE attribute.
<A> tag with missing HREF/NAME attribute - finds all <A> tags with missing HREF/NAME attribute.
<A> tag with blank HREF/NAME attribute - finds all <A> tags with blank HREF/NAME attribute.
<A> tag with blank comment - finds all <A> tags with blank comment; e.g., <a href="http://www.example.com/"></a>.

REPLACE LINKS
Use these settings to replace certain links before the verification. For example, you may want to replace links like

AUTO
These settings simplify working with the command line when the Auto mode is on. The Auto mode is activated when the "/auto" key is specified at the command prompt. The corresponding command line parameters have a higher priority than their counterparts in these subsections.
Related topic:

REPORT
These options define the report filename, its location, and the information to be included in the report.
Report filename - Specify the name of the file where the report is to be saved. The file name and folder path can be typed in, selected from the drop-down menu, or (most easily) browsed to by clicking the Browse button next to the report filename field.
Use personal reports list for this profile - Allows selecting the information to be included in the report. This list can be adjusted on the report setup screen when generating a report.
Exclude

ADVANCED
Use these options to fine-tune the selected profile's settings.
Load limit - Defines how many links per second the application is allowed to process. Select the Use personal limit settings... option and then select the desired limit from the drop-down menu below.
Disable cookies - Select this option to emulate a browser with cookies disabled.
Session ID - If the website adds a session ID to the addresses of its links, you may end up with different URLs pointing to the same pages. To avoid that, enter the session identifier in this field, and Validator will ignore the session value when checking these pages.
Example: For the URL http://www.example.com/page.asp?SID=12345 enter "SID" in the Session ID field, and the software will read this link as http://www.example.com/page.asp.
Setup URLs - URLs to be visited to obtain the credentials necessary for accessing the website to be checked, for example, if you need to get a cookie required for enabling links on the "Starting URLs" pages. You may also need this feature if the site uses a complex HTML-based authentication scheme.
Notes:
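The effect of the Session ID setting described above can be illustrated with a short Python sketch. The function name is hypothetical, and the sketch assumes the session identifier is passed as an ordinary query-string parameter and compared case-sensitively. The program's own handling may differ.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_session_id(url: str, session_param: str) -> str:
    # Split the URL, drop the session parameter from the query string,
    # and put the remaining pieces back together.
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name != session_param
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Two URLs that differ only in their SID value then reduce to the same address, so the page behind them needs to be checked only once. Note that urlencode may re-encode the remaining parameter values, which is acceptable for comparison purposes in this sketch.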
Copyright © 2001-2008 REL Software. All rights reserved.
