theArchivist

theArchivist - Web-crawler for downloading and archiving web content

theArchivist Ranking & Summary

  • License: Freeware
  • Price: FREE
  • Publisher Name: Brian Johnson
  • Publisher web site: http://quicksilver.caup.washington.edu/software/upload/
  • Operating Systems: Mac OS X 10.1 or later
  • File Size: 659 KB


theArchivist Description

theArchivist is a web-crawler for downloading and archiving web content, rewriting absolute links as it goes. Variable "stopping rules" make sure you don't try to archive the whole web!

Starting with a particular URL, it retrieves the web page, scans it for links, and then attempts to retrieve all files linked to the page. This behavior repeats for each file retrieved and continues until one of several stop criteria is reached.

If desired, the application will rewrite absolute URLs relative to the download hierarchy, producing a completely self-sufficient archive.

What's New in This Release:

  • Corrected parsing of JavaScript function references
  • Added php and asp to recognized "html" file extensions
  • Fixed bug that truncated crawls done without the "legal servers" restriction
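
The crawl-and-rewrite behavior in the description can be illustrated with a short Python sketch. This is not theArchivist's own code (it is a Mac OS X application whose internals are not shown here); it is a minimal model under stated assumptions. The names `crawl`, `fetch`, `local_path`, and `relative_link` are hypothetical, the depth and file-count limits are assumed examples of "stopping rules", and only the "legal servers" restriction is named in the release notes.

```python
import posixpath
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collects href/src attribute values found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def local_path(url):
    """Map a URL to host/path inside the archive (assumed layout)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if path.endswith("/"):
        path += "index.html"
    return parts.netloc + path


def relative_link(from_url, to_url):
    """Rewrite an absolute URL relative to the page that links to it."""
    start = posixpath.dirname(local_path(from_url))
    return posixpath.relpath(local_path(to_url), start)


def crawl(start_url, fetch, max_depth=2, max_files=50, legal_servers=None):
    """Breadth-first crawl returning {url: body}.

    `fetch(url)` is a caller-supplied function (hypothetical) that
    returns the body as text, or None if the file is unavailable.
    """
    archive = {}
    queue = deque([(start_url, 0)])
    seen = {start_url}
    while queue and len(archive) < max_files:   # assumed stop rule: file count
        url, depth = queue.popleft()
        body = fetch(url)
        if body is None:
            continue
        archive[url] = body
        if depth >= max_depth:                  # assumed stop rule: link depth
            continue
        parser = LinkCollector()
        parser.feed(body)
        for link in parser.links:
            target = urljoin(url, link).split("#")[0]
            host = urlsplit(target).netloc
            if legal_servers is not None and host not in legal_servers:
                continue                        # "legal servers" restriction
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return archive
```

Passing `fetch` in as a parameter keeps the sketch independent of any network library: a dictionary's `get` method is enough to try it against a handful of in-memory pages, and `relative_link` can then be applied to each collected link to make the saved copy self-contained.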

