Sherlock Holmes: A universal search engine.
Sherlock Holmes Ranking & Summary
- License: GPL
- Price: FREE
- Publisher Name: Martin Mares
- Publisher web site: http://mj.ucw.cz/linux.shtml
Sherlock Holmes Description
Sherlock Holmes is a universal search engine: a system for gathering and indexing textual data (text files, web pages, etc.), both locally and over the network.

Here are some key features of "Sherlock Holmes":
- Gathers files via HTTP or from local files.
- Parses text files, HTML, and PDF directly, and several other formats (such as MS Word and PostScript) using external parsers.
- The whole system is modular, so adding your own data sources or parsers is just a matter of plugging in the right module (well, usually also writing it).
- Works well in mixed-charset environments.
- Treats multiple occurrences of the same file (even with minor changes) as a single document with multiple URLs.
- Everything is highly configurable. You can write filtering rules in a special language that lets you tweak configuration variables depending on the document being processed.
- Searches for words, phrases, and boolean expressions, including in filenames and link texts.
- Proximity search and proximity weighting of regular searches.
- Language recognition and easy integration of stemmers and synonym dictionaries.
- A spelling checker based on word frequencies observed in the indexed data, which hints to the user that a query might be misspelled.
- Search results include context from each document.
- Scales well to tens of millions of documents on normal PC hardware.
- The user interface (the front-end) is completely separated from the rest of the system, making it easy to modify and to embed the search engine in existing applications.
- Downloaded files and indices are compressed to save space.
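To illustrate one of the features above, a spelling hint driven by word frequencies in the indexed data might look roughly like the following Python sketch. This is not Sherlock Holmes's own code or algorithm. The function names (`build_frequencies`, `edits1`, `suggest`), the single-edit candidate search, and the `ratio` threshold are all assumptions made for illustration.

```python
from collections import Counter
import re
import string


def build_frequencies(documents):
    """Count how often each lowercase ASCII word occurs across the documents."""
    counts = Counter()
    for text in documents:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts


def edits1(word):
    """Return all strings one deletion, transposition, replacement,
    or insertion away from `word`."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def suggest(word, counts, ratio=10):
    """Return a likely correction for `word`, or None if no hint is warranted.

    Assumption: a candidate is worth hinting only if it occurs at least
    `ratio` times more often in the indexed data than the query word.
    """
    word = word.lower()
    candidates = [w for w in edits1(word) if w in counts and w != word]
    if not candidates:
        return None
    # Prefer the most frequent candidate; break ties alphabetically.
    best = max(candidates, key=lambda w: (counts[w], w))
    # Counter returns 0 for unseen words, so clamp to 1 before comparing.
    if counts[best] >= ratio * max(counts[word], 1):
        return best
    return None
```

In this sketch, `build_frequencies` would be run once over the indexed text, and `suggest` would be called for each query word, so that the front-end can show a "did you mean" hint whenever it returns a word rather than `None`.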