Webglimpse

A web administration interface, remote link spider, and the powerful Glimpse file indexing and query system

Webglimpse Ranking & Summary


  • License: Freeware
  • Price: Free
  • Publisher Name: Internet WorkShop
  • Publisher web site: http://webglimpse.net/
  • Operating Systems: Mac OS X
  • File Size: 793 KB


Webglimpse Description

Webglimpse site search software includes a web administration interface, a remote link spider, and the powerful Glimpse file indexing and query system. Add sophisticated search capability to your site.

Webglimpse is scalable: index hundreds of remote sites, one small local site, or gigabytes of compressed documents. The code is open, widely used, mature, and actively supported.

Webglimpse requires a Unix server to run, but it can index documents on any server as long as they are accessible via the Web or a networked drive.

Aside from creating a searchable website, Webglimpse can be used for data mining applications, as part of a document management solution, or, in the form of Glimpse, within LXR and other integrated solutions.

Webglimpse is a feature-rich search engine that has been used on thousands of sites, on all different flavors of Unix, and in countries from Nigeria to Norway, Chile to Canada. Nearly everything about Webglimpse is configurable: how to select the files to index, how to search them, and how to present the results to the user. Yet we have tried to keep the install simple and quick so that you can get your search up and running today.

Glimpse is the powerful indexing and query system inside Webglimpse. It can also be used as a stand-alone program in a Unix environment. Glimpse and glimpseindex are written in C for speed, while most of the management and interface parts of Webglimpse are written in Perl.

NOTE: .EDU, .GOV, and most nonprofits and Open Source projects can use Webglimpse free of charge. You do have to link back to http://webglimpse.net/ from somewhere on your site if that is possible.

Here are some key features of "Webglimpse":

· Search From Any Page of Your Site: An optional search box can be added to all pages of your site, so that users can search from anywhere. They can also search just the links on the page they are looking at, so that the hits are more likely to be relevant.
· Fast Searching: Glimpse builds a keyword index in advance for very fast searching (though it can also access individual files for complex boolean queries). Uncommon words will be found rapidly even in a very large fileset, up to several gigabytes. Common words (with hundreds or thousands of matches) will take longer, but if the number of hits returned can be limited, even those will be very fast. The core index and search programs are written in C.
· Large Data Sets: To our knowledge, Glimpse is used to handle data sets of up to 9 gigabytes. Because of the two-level search design that localizes keywords to a 'block' of data, the index footprint is quite small, typically less than 5% of the total data set size. The speed of the search scales with the number of matches to the keyword and only secondarily with the total size of the indexed data.
· Index Local and Remote Pages: Webglimpse is not limited to searching only your own data! The Spider program has flexible rules for gathering pages from remote sites. It can gather all the pages under a specified domain, traverse a set number of 'hops' from a starting page regardless of domain, or apply a combination of these rules. You can even make a single archive, searchable from one form, that combines local data on your hard drive and multiple remote sites you specify.
· Boolean Expressions, Wildcards, Misspellings & More: The extremely powerful agrep engine allows users to specify partial or whole-word matches, use regular expressions and boolean combinations, specify the number of spelling errors allowed, and choose case-sensitive or case-insensitive matching. The administrator can modify the search form to pre-set these values for optimum searching, or allow the user to choose at the time of search.
· HTML, PDF, Word, Other Formats: Any filetype that can be converted to text can be indexed. On-the-fly conversion saves drive space, or you can create permanent text versions for speed. Third-party format converters are easily configured into the system.
· Dynamically Generated Pages: PHP and database-driven sites are hard for many indexing programs to handle. Webglimpse allows you to handle both dynamically created and static pages in a single index. Dynamically generated pages are passed through the web server before indexing, so Webglimpse indexes exactly what a browser would see, rather than the source code used to generate the page.
· "Neighborhood" Searching: Once your site is indexed, users can search the entire site, any subdirectory, or just the links on the current page. For large sites with many different areas, this is an essential tool to help users quickly find the most relevant results. As the site administrator, you control what options to provide.
· Index Any Single-Byte Language: Glimpse can index any language that is single-byte encoded, which includes all European languages such as Spanish, French, etc. Currently we cannot handle double-byte encoded languages such as Chinese, or languages with special rules for word breaks, such as Thai. By using the included filter program, special characters such as umlauts are correctly indexed whether they appear as HTML character entities or as actual upper-ASCII characters.

What's New in This Release:

· Webglimpse 2.18.8 times out slow sites much more aggressively in order to speed up the spider. (The timeout is now 5 seconds; modify it in /wglib/wgAgent.pm )
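
The spider rules listed under "Index Local and Remote Pages" (stay within one domain, follow a fixed number of 'hops' from a starting page, or combine the two) can be illustrated with a short sketch. This is not Webglimpse's actual spider, which is written in Perl and fetches real pages; it is a minimal Python illustration over a hypothetical in-memory link graph, and the names `LINKS`, `gather`, `max_hops`, and `domain` are all assumptions made for the example.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical link graph standing in for real HTTP fetches:
# each URL maps to the list of URLs it links to.
LINKS = {
    "http://a.example/": ["http://a.example/docs", "http://b.example/"],
    "http://a.example/docs": ["http://a.example/docs/faq"],
    "http://a.example/docs/faq": [],
    "http://b.example/": ["http://b.example/deep", "http://c.example/"],
    "http://b.example/deep": [],
    "http://c.example/": [],
}


def gather(start, max_hops=None, domain=None, links=LINKS):
    """Breadth-first walk from `start`, returning pages in visit order.

    max_hops: do not follow links from pages this many hops away
              (None means no hop limit).
    domain:   only accept linked pages whose host equals this value
              (None means any host). The start page is always kept.
    """
    seen = {start}
    queue = deque([(start, 0)])
    gathered = []
    while queue:
        url, hops = queue.popleft()
        gathered.append(url)
        # Hop rule: stop expanding once the limit is reached.
        if max_hops is not None and hops >= max_hops:
            continue
        for nxt in links.get(url, []):
            if nxt in seen:
                continue
            # Domain rule: skip links that leave the allowed host.
            if domain is not None and urlparse(nxt).hostname != domain:
                continue
            seen.add(nxt)
            queue.append((nxt, hops + 1))
    return gathered


if __name__ == "__main__":
    print(gather("http://a.example/", domain="a.example"))
    print(gather("http://a.example/", max_hops=1))
```

Passing only `domain` mimics gathering everything under one site, passing only `max_hops` mimics a hop-limited crawl that ignores domain boundaries, and passing both applies the combined rule.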

