msort

Application for sorting files in sophisticated ways

msort Ranking & Summary

  • License: GPL
  • Price: FREE
  • Publisher Name: Bill Poser
  • Publisher web site: http://billposer.org/Software/ColorExplorer.html
  • Operating Systems: Mac OS X
  • File Size: 529 KB

msort Description

msort is a program for sorting files in sophisticated ways. It was originally developed for alphabetizing dictionaries of "exotic" languages, for which it has been extensively used, but it is useful for many other purposes. msort differs from typical sort utilities in providing greater flexibility in parsing the input into records and identifying key fields, and greater control over the sort order.

msort understands UTF-8 Unicode. Unicode may be used anywhere that text is entered: in the text to be sorted, as a field or record separator, in sort order and exclusion definitions, or as a field tag. Full Unicode case-folding is available.

Here are some key features of "msort":

· msort can be used as a command-line program or via a graphical user interface. The GUI helps not only those who find a complicated command line difficult to deal with but also those unfamiliar with the finer points of sorting.
· Records need not be single lines of text but may be delimited in a number of ways. Fixed-length records are also supported.
· Key fields may be selected by position in the record (counting from the beginning or the end), by character ranges (e.g. the key consists of the fourth through eighth characters), or by matching a regular expression to a tag.
· For each key an arbitrary sort order may be specified. msort also understands locales.
· For each key an effectively unlimited number of multigraphs of effectively unlimited length may be defined. A multigraph is a sequence of characters treated as a single unit for purposes of sorting (a "collating element" in Unicode parlance).
· In addition to the usual lexicographic and numerical comparisons, msort supports hybrid lexicographic-numeric comparison (for things like filenames and section headings, so that, e.g., 2a will precede 10b), random comparison, and ordering by angle, date, time, month name, domain name/email address, ISO8601 date-time, and string length.
· Numbers may be in just about any known number system, e.g. Chinese or Devanagari.
· For each key a distinct set of characters may be excluded from consideration when sorting, in any combination of initial, final, and medial position in the key field.
· For each key a distinct set of regular expression substitutions may be defined. These make it possible to sort names like McCarthy before MacCawley, as if McCarthy were spelled MacCarthy. They also handle the rare cases in which a single character is treated for purposes of sorting as a sequence, such as German "eszett" (ß), which is traditionally sorted as if it were ss.
· Lexicographic keys may be reversed, allowing the construction of reverse dictionaries.
· Any or all keys may be optional. For optional keys, the user may specify how records missing the key field should compare to records in which the key field is present.
· A choice of sorting algorithms with different properties is provided.

What's New in This Release:

· ISO8601 keys may now have an optional leading sign.
· If a key has comparison type "random", it is no longer stored since it won't be used. This saves a little time and possibly a good bit of storage.
· If one or more records have been discarded due to problems in key extraction but the run is otherwise successful, the exit code is now RECORDEXCLUDED (13) rather than BADRECORD (8).
· Cleaned up and improved the log.
· Made error-checking and reporting finer-grained in GetMonthNames.
· A few of the regression tests depend on the locale system, which may fail for reasons independent of msort. These tests have now been separated so that their failure will not suggest that msort itself is not working. Typing "make test" runs the main set of tests. Typing "make localetest" runs the locale-dependent tests, the results of which are written to LocaleTestResults.
· Split the time and ISO8601 date/time regression tests so that data with and without time zone offsets are no longer mixed, since mixing them causes tests to fail when executed in some time zones.
· Added a regression test for more complex substitution.
· Added information to the manual section on random comparison. If you don't know how random comparison can be useful other than for unsorting, you might want to check this out.
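To make the hybrid lexicographic-numeric comparison from the feature list concrete, here is a minimal Python sketch of the general idea. It is not msort's implementation and does not use msort's command-line syntax; the function name hybrid_key and the sample data are made up for illustration, and only ASCII digits are handled.

```python
import re

# Matches either a run of ASCII digits or a run of non-digits.
_CHUNK = re.compile(r"([0-9]+)|([^0-9]+)")

def hybrid_key(text):
    """Hypothetical helper: build a key in which digit runs compare as integers."""
    key = []
    for digits, other in _CHUNK.findall(text):
        if digits:
            # Digit run: tag 0, compared numerically.
            key.append((0, int(digits), ""))
        else:
            # Text run: tag 1, compared as a string.
            key.append((1, 0, other))
    return key

names = ["10b", "2a", "1c", "2b"]
print(sorted(names))                  # ['10b', '1c', '2a', '2b']
print(sorted(names, key=hybrid_key))  # ['1c', '2a', '2b', '10b']
```

The tags keep integers and strings from ever being compared directly, so the key works for any mix of digit and text runs.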
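The per-key regular expression substitutions from the feature list can be pictured with a similar Python sketch. Again, this only illustrates the concept of rewriting a key before comparing it; the SUBSTITUTIONS table and the sort_key function are assumptions for this example and say nothing about how msort expresses or applies its substitutions.

```python
import re

# Hypothetical substitution table, applied to the key only (the record is unchanged).
SUBSTITUTIONS = [
    (re.compile(r"^Mc"), "Mac"),  # sort McCarthy as if spelled MacCarthy
    (re.compile("ß"), "ss"),      # sort eszett as if it were ss
]

def sort_key(name):
    """Apply each substitution in order and return the rewritten key."""
    for pattern, replacement in SUBSTITUTIONS:
        name = pattern.sub(replacement, name)
    return name

names = ["MacCawley", "McCarthy", "Maß", "Mast"]
print(sorted(names))                # ['MacCawley', 'Mast', 'Maß', 'McCarthy']
print(sorted(names, key=sort_key))  # ['McCarthy', 'MacCawley', 'Maß', 'Mast']
```

Because the substitutions affect only the comparison key, the output still shows the names as originally spelled.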

