WebLech URL Spider

Free and open source download and mirroring tool created in Java

WebLech URL Spider Ranking & Summary


  • License: MIT
  • Price: FREE
  • Publisher Name: WebLech URL Spider Team
  • Operating Systems: Mac OS X
  • File Size: 172 KB



WebLech URL Spider Description

WebLech is a free and open source, fully featured website download and mirroring tool written in Java. It supports many of the features needed to download websites and emulates standard web-browser behavior as closely as possible. WebLech is multithreaded and comes with a GUI console.

Here are some key features of "WebLech URL Spider":

  • Open source MIT licence, so it is free and you can do what you want with it
  • Pure Java code, so you can run it on any Java-enabled computer
  • Multi-threaded operation for downloading many files at once
  • Supports basic HTTP authentication for accessing password-protected sites
  • HTTP referrer support maintains link information between pages (needed to spider some websites)

Lots of configuration options:

  • Depth-first or breadth-first traversal of the site
  • Candidate URL filtering, so you can stick to one web server, one directory, or just spider the whole web
  • Configurable caching of downloaded files, which allows a restart without downloading everything again
  • URL prioritization, so you can get interesting files first and leave boring files till last (or ignore them completely)
  • Checkpointing, so you can snapshot the spider state in the middle of a run and restart without lots of processing

Requirements:

  • Java

What's New in This Release:

  • Added classification of URLs as "interesting" or "boring" by simple string matching. Interesting URLs are downloaded in preference to boring ones.
  • Separated Spider from the UI, which is now in ui/TextSpider.
  • Added checkpointing and resume functionality, so the spider can be killed and restarted without doing lots of processing.
  • Fixed URL retrieval so fragments (URLs with a # in them) are not treated as a new URL.
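The URL prioritization and fragment handling described above can be illustrated with a short Java sketch. This is not WebLech's actual code: the class name UrlQueueSketch, its methods, and the match strings are all hypothetical, chosen only to show how simple string matching could sort candidate URLs into "interesting" and "boring" and how stripping the # fragment keeps one page from being queued twice.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; names and match strings are assumptions, not WebLech's API.
public class UrlQueueSketch {

    // Declared in download order: interesting first, boring last.
    enum Interest { INTERESTING, NORMAL, BORING }

    // Assumed example patterns; a real spider would read these from its configuration.
    static final List<String> INTERESTING_MATCHES = List.of(".pdf", "/docs/");
    static final List<String> BORING_MATCHES = List.of(".css", "/ads/");

    // Drop everything from the first '#' so page.html#top and page.html are one URL.
    static String stripFragment(String url) {
        int hash = url.indexOf('#');
        return hash < 0 ? url : url.substring(0, hash);
    }

    // Classify a URL by simple substring matching.
    static Interest classify(String url) {
        for (String s : INTERESTING_MATCHES) {
            if (url.contains(s)) return Interest.INTERESTING;
        }
        for (String s : BORING_MATCHES) {
            if (url.contains(s)) return Interest.BORING;
        }
        return Interest.NORMAL;
    }

    // Deduplicate after fragment stripping, then order by interest.
    // List.sort is stable, so discovery order is kept within each class.
    static List<String> downloadOrder(List<String> candidates) {
        Set<String> unique = new LinkedHashSet<>();
        for (String candidate : candidates) {
            unique.add(stripFragment(candidate));
        }
        List<String> ordered = new ArrayList<>(unique);
        ordered.sort(Comparator.comparing(UrlQueueSketch::classify));
        return ordered;
    }

    public static void main(String[] args) {
        List<String> candidates = List.of(
            "http://example.com/style.css",
            "http://example.com/index.html#top",
            "http://example.com/index.html",
            "http://example.com/docs/guide.pdf");
        for (String url : downloadOrder(candidates)) {
            System.out.println(classify(url) + " " + url);
        }
    }
}
```

With these assumed patterns, the PDF under /docs/ is queued first, index.html appears once even though it was discovered twice, and the stylesheet is left till last. Ignoring boring files completely would amount to filtering them out before sorting.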

