rel

Rel is an application that determines the relevance of text documents to a set of keywords expressed in boolean infix notation.

rel Ranking & Summary

  • License: GPL
  • Price: FREE
  • Publisher Name: John Conover
  • Publisher web site: http://www.johncon.com/nformatix/rel.html


rel Description

Rel is an application that determines the relevance of text documents to a set of keywords expressed in boolean infix notation. The names of the relevant files are printed to the standard output, in order of relevance.

The boolean operators supported are logical or, logical and, and logical not, represented by the symbols "|", "&", and "!", respectively. Left and right parentheses, "(" and ")", are used as the grouping operators.

The paths given as arguments can be files, directories, or both. If a path is a directory, the program recursively descends into it, searching all files and directories it contains.

For example, the command:

    rel "(directory & listing)" /usr/share/man/cat1

(i.e., find the order of relevance of all files in the catman directory that contain both of the words "directory" and "listing") will list a few tens of files, out of the hundreds of catman files, of which "ls.1" is among the most relevant. This means that, to find the command that lists directories in a Unix system, the "literature search" was reduced, on average, by about 98%, a considerable saving compared with browsing through the files in the directory. Although this example is elementary, a similar saving can be demonstrated in searching for documents in email repositories and text archives.

Additional applications include information robots (i.e., "mailbots" or "infobots"), where the disposition (i.e., delivery, filing, or viewing) of text documents can be determined dynamically, based on the relevance of the document to a set of criteria framed in boolean infix notation.
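The query syntax described above ("|", "&", "!", and parentheses) can be illustrated with a small evaluator. The following Python sketch is not rel's implementation. The function names (tokenize, parse, matches, keywords, rank), the plain substring matching, and in particular the scoring rule (total occurrences of the non-negated keywords) are assumptions made for illustration, since the description does not say how rel computes relevance.

```python
import re

# Operators are single-character tokens; anything else is a keyword.
TOKEN = re.compile(r"[()|&!]|[^\s()|&!]+")


def tokenize(query):
    """Split a query into lowercase tokens (queries are case insensitive)."""
    return [tok.lower() for tok in TOKEN.findall(query)]


def parse(tokens):
    """Recursive-descent parser: '|' binds loosest, then '&', then '!'."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expr():
        node = term()
        while peek() == "|":
            take()
            node = ("or", node, term())
        return node

    def term():
        node = factor()
        while peek() == "&":
            take()
            node = ("and", node, factor())
        return node

    def factor():
        tok = take()
        if tok == "!":
            return ("not", factor())
        if tok == "(":
            node = expr()
            if take() != ")":
                raise ValueError("missing ')'")
            return node
        if tok is None or tok in ("|", "&", ")"):
            raise ValueError(f"unexpected token: {tok!r}")
        return ("word", tok)

    tree = expr()
    if peek() is not None:
        raise ValueError(f"unexpected token: {peek()!r}")
    return tree


def matches(node, text):
    """Evaluate the parsed query against lowercase document text."""
    kind = node[0]
    if kind == "word":
        return node[1] in text
    if kind == "not":
        return not matches(node[1], text)
    if kind == "and":
        return matches(node[1], text) and matches(node[2], text)
    return matches(node[1], text) or matches(node[2], text)


def keywords(node):
    """Collect non-negated keywords (used only by the assumed scoring rule)."""
    if node[0] == "word":
        return [node[1]]
    if node[0] == "not":
        return []
    return keywords(node[1]) + keywords(node[2])


def rank(query, documents):
    """Return names of matching documents, most relevant first.

    Assumed score: total number of occurrences of the query's keywords.
    """
    tree = parse(tokenize(query))
    scored = []
    for name, text in documents.items():
        lowered = text.lower()
        if matches(tree, lowered):
            score = sum(lowered.count(word) for word in keywords(tree))
            scored.append((score, name))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]
```

Calling rank("(directory & listing)", docs) with a dictionary that maps file names to their text returns only the names whose text contains both words, ordered by the assumed score. Reading files and descending into directories is left out of the sketch.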
In other words, the program can be used to order, or rank, text documents based on a "context" specified in a general mathematical language, similar to that used in calculators.

The words in the query are case insensitive; either upper or lower case can be used.

The operator symbols can be escaped with the "\" character to include the symbol in a search pattern. The "escape space" character sequence represents one or more space characters in search patterns: each instance matches one or more consecutive whitespace characters (as defined by isspace(3) in ctype.h and/or locale.h), which allows phrases to be searched for. This "many to one" whitespace translation occurs in both the keyword arguments and the text documents. Multiple consecutive instances of the "escape space" sequence should not be used in keyword search phrases, and single instances are appropriate only when necessary to specify a consecutive sequence of keywords. The logical and operator is the preferred construct when searching for documents that contain sets of keywords.

Hyphenation issues are addressed by deleting hyphens and any following sequence of whitespace characters (as defined by isspace(3)) in both the keyword arguments and the text documents.

Backspace character issues are addressed by overwriting the character before the backspace with the character after the backspace, so that only the last character of a run of consecutive backspace/character combinations remains. This is specifically for catman pages, which use underscore/backspace/character combinations for underlining, in addition to backspace/character combinations for bold (overstrike) representation. Note that for this process to be successful, a single underscore (used for underlining) must precede a single character in the sequence.
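The text clean-up rules above (backspace overstrikes, hyphen removal, whitespace folding, case folding) can be sketched in a few lines of Python. This is an illustration, not rel's code: the function names strip_overstrikes and normalize are hypothetical, the order in which the rules are applied is an assumption, and Python's "\s" stands in for isspace(3).

```python
import re


def strip_overstrikes(text):
    """Resolve backspaces: each backspace erases the character before it,
    so the character that follows takes its place."""
    out = []
    for ch in text:
        if ch == "\b":
            if out:
                out.pop()
        else:
            out.append(ch)
    return "".join(out)


def normalize(text):
    """Apply the described rules in an assumed order."""
    text = strip_overstrikes(text)      # underline and bold overstrikes
    text = re.sub(r"-\s*", "", text)    # hyphens plus any following whitespace
    text = re.sub(r"\s+", " ", text)    # many-to-one whitespace translation
    return text.lower()                 # case-insensitive matching
```

Applying normalize to both the document text and the search phrase lets a phrase such as "directory contents" match text that was underlined, hyphenated across a line break, or spread over several whitespace characters. Note that, following the rule as stated, every hyphen is removed, so "well-known" becomes "wellknown" on both sides of the comparison.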

