WWW::CheckSite::Manual

A description of the metrics used in this package

WWW::CheckSite::Manual Ranking & Summary

  • License: Perl Artistic License
  • Price: FREE
  • Publisher Name: Abe Timmerman
  • Publisher web site: http://search.cpan.org/~abeltje/WWW-CheckSite-0.018/lib/WWW/CheckSite/Spider.pm

WWW::CheckSite::Manual Description

WWW::CheckSite::Manual offers a description of the metrics used in this package.

The idea behind this package is to provide an analysis of the items contained in a web-site. We use the word kwalitee because it looks and sounds like quality but just isn't. The metrics used to assess kwalitee only give an indication of the technical state a web-site is in, and do not reflect the user's experience of that web-site's quality.

At the heart of the package is the spider that fetches all the pages referred to within the web-site. For each page that is fetched, a number of things are checked.

Here is an explanation of the kwalitee metrics:

* return status
  The most basic check for a web-page is to see if it can be fetched. The HTTP return-status should be 200 OK.
  SCORE: 0 for return status other than 200; 1 for return status 200

* title
  The next check is to see if the <title></title> tag-pair has content.
  SCORE: 0 for no content; 1 for content

* valid
  The next check is to see if the (X)HTML in the page validates. The default behaviour is to use the validator available on http://validator.w3.org
  SCORE: 0 for not valid; 1 for valid or validation disabled

* links
  The next check is to see that the web-page does not contain "dead links". All hyperlinks are checked with an HTTP HEAD request to see if they can be "followed". URLs that have the same origin as the primary url will also be put on the "to-fetch-list" of the spider.
  MAX SCORE: 1 (do not count urls excluded by robot-rules/exclude pattern)

* images
  The next check is to see that the web-page does not contain "dead images". All images are checked with an HTTP HEAD request to see if they exist on the server. If the Image::Info module is available, the image is fetched from the server and a basic sanity test on the image is done.
  MAX SCORE: 1 (do not count images excluded by robot-rules/exclude pattern)

* styles
  The next check is to see that the web-page does not contain "dead style references". All referenced styles are fetched and, if validation is switched on, they will be sent to the css-validator at: http://jigsaw.w3.org/validator
  TODO: Extract inline styles, and send them off for validation.
  MAX SCORE: 1

kwalitee

Every individual page can have a maximum of 6 kwalitee points, one per metric above, that lead to a kwalitee of 1.00. For the complete web-site the mean of the page scores is taken and presented as a fraction of 1.

Requirements:

  • Perl
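
The kwalitee arithmetic described above can be illustrated with a small sketch. It is written in Python purely to show the calculation and is not the package's own Perl code. The names PageResult, fraction_ok, page_kwalitee and site_kwalitee are hypothetical. Two points are assumptions, since the manual only states a MAX SCORE of 1 for links, images and styles: that these three metrics are scored as the fraction of checked items that are not dead, and that a page with nothing to check gets the full point.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class PageResult:
    """Hypothetical per-page check results (not the package's data model)."""
    status: int            # HTTP return status of the page
    title: str             # content of the title tag-pair
    valid: bool            # True if valid or validation disabled
    links_ok: int = 0      # checked links that could be followed
    links_total: int = 0   # checked links (excluded urls not counted)
    images_ok: int = 0
    images_total: int = 0
    styles_ok: int = 0
    styles_total: int = 0


def fraction_ok(ok: int, total: int) -> float:
    # Assumption: partial credit is the share of items that are not dead,
    # and a page with nothing to check keeps the full point.
    return 1.0 if total == 0 else ok / total


def page_kwalitee(page: PageResult) -> float:
    """Sum the six metric scores and scale them to a fraction of 1."""
    points = [
        1.0 if page.status == 200 else 0.0,
        1.0 if page.title.strip() else 0.0,
        1.0 if page.valid else 0.0,
        fraction_ok(page.links_ok, page.links_total),
        fraction_ok(page.images_ok, page.images_total),
        fraction_ok(page.styles_ok, page.styles_total),
    ]
    return sum(points) / len(points)


def site_kwalitee(pages: Sequence[PageResult]) -> float:
    """Mean of the page scores, presented as a fraction of 1."""
    if not pages:
        raise ValueError("no pages were fetched")
    return sum(page_kwalitee(p) for p in pages) / len(pages)
```

For example, a page that returns 404, has no title, counts as valid, and has one dead link out of two scores (0 + 0 + 1 + 0.5 + 1 + 1) / 6 under these assumptions; averaging that with a perfect page gives the site figure.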

