File::SortedSeek

File::SortedSeek is a Perl module providing fast access to large files.

File::SortedSeek Ranking & Summary

  • License: Perl Artistic License
  • Price: FREE
  • Publisher Name: Dr James Freeman
  • Publisher web site: http://search.cpan.org/~jfreeman/File-SortedSeek-0.012/SortedSeek/SortedSeek.pm

File::SortedSeek Description

File::SortedSeek is a Perl module providing fast access to large files.

SYNOPSIS

    use File::SortedSeek ':all';
    open BIG, $file or die $!;

    # find a number or the first number greater in a file (ascending order)
    $tell = numeric( *BIG, $number );
    # read a line in from where we matched in the file
    $line = <BIG>;
    print "Found exact match as $line" if File::SortedSeek::was_exact();

    # find a string or the first string greater in a file (alphabetical order)
    $tell = alphabetic( *BIG, $string );
    $line = <BIG>;

    # find a date in a logfile supplying a scalar localtime type string
    $tell = find_time( *BIG, "Thu Aug 23 22:59:16 2001" );
    # or supplying GMT epoch time
    $tell = find_time( *BIG, 998571554 );
    # get all the lines after our date
    @lines = <BIG>;

    # get the lines between two logfile dates
    $begin = find_time( *LOG, $start );
    $end   = find_time( *LOG, $finish );
    # get lines as an array
    @lines = get_between( *LOG, $begin, $end );
    # get lines as an array reference
    $lines = get_between( *LOG, $begin, $end );

    # use your own sub to munge the file line data before comparison
    $tell = numeric( *BIG, $number, \&epoch );
    $tell = alphabetic( *BIG, $string, \&munge_line );

    # use methods on files in reverse alphabetic or descending numerical order
    File::SortedSeek::set_descending();

    # for inexact matches set FH so first value read is before and second after
    File::SortedSeek::set_cuddle();

    # get last $n lines of any file as an array
    @lines = get_last( *BIG, $n );
    # or an array reference
    $lines = get_last( *BIG, $n );
    # change the input record separator from the OS default
    @lines = get_last( *BIG, $n, $rec_sep );

File::SortedSeek provides fast access to data from large files. Three methods, numeric(), alphabetic() and find_time(), depend on the file data being sorted in some way. Logfiles are a typical example of big files that are sorted (by date stamp).
The get_between() method can be used to get a chunk of lines efficiently from anywhere in the file. The required position(s) for the get_between() method are supplied by the previous methods. The get_last() method will efficiently get the last N lines of any file, sorted or not.

With sorted data a linear search is not required. Here is a typical linear search:

    while (<FILE>) {
        next unless /$some_cond/;
        # found cond, do stuff
    }

Remember that old game where you try to guess a number between, let's say, 0 and 128? Let's choose 101 and now try to guess it.

Using a linear search is the same as going:

    1 higher
    2 higher
    3 higher
    ...
    100 higher
    101 correct!

Consider the geometric approach:

    64 higher
    96 higher
    112 lower
    104 lower
    100 higher
    102 lower - ta da must be 101!

This is the halving-the-difference search method, and it can be applied to any data set where we can logically say higher or lower. In other words, any sorted data set can be searched like this. It is a far more efficient method: six guesses here instead of 101. See the SPEED section of the module documentation for a quick analysis.

Requirements:

  • Perl
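The halving search described above can be sketched in plain Perl against a file of sorted numbers, one per line. This is only an illustration of the technique, not the module's actual implementation: the helper names seek_numeric() and line_start_at_or_after() are hypothetical, and the sketch assumes newline-terminated records in ascending numeric order.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not part of File::SortedSeek): return the byte offset
# of the first line that starts at or after byte $pos.
sub line_start_at_or_after {
    my ($fh, $pos) = @_;
    return 0 if $pos == 0;
    seek($fh, $pos - 1, 0) or die "seek failed: $!";
    my $skipped = <$fh>;    # consume the rest of the line holding byte $pos - 1
    return tell($fh);
}

# Hypothetical helper: position $fh at the first line whose numeric value is
# >= $target and return that byte offset (the file size if no line qualifies).
sub seek_numeric {
    my ($fh, $target) = @_;
    my ($low, $high) = (0, -s $fh);
    while ($low < $high) {
        my $mid   = int(($low + $high) / 2);    # halve the remaining range
        my $start = line_start_at_or_after($fh, $mid);
        seek($fh, $start, 0) or die "seek failed: $!";
        my $line = <$fh>;
        chomp $line if defined $line;
        if (!defined $line || $line >= $target) {
            $high = $mid;        # "lower": the answer is at or before here
        } else {
            $low = $mid + 1;     # "higher": the answer is after here
        }
    }
    my $found = line_start_at_or_after($fh, $low);
    seek($fh, $found, 0) or die "seek failed: $!";
    return $found;
}

# Demo: write the numbers 0..128 to a temporary file, then look up 101.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "$_\n" for 0 .. 128;
close $out or die "close failed: $!";

open my $in, '<', $path or die "open failed: $!";
seek_numeric($in, 101);
my $line = <$in>;
print "Found: $line";    # prints "Found: 101"
```

Each pass through the loop discards half of the remaining byte range, so the number of seeks grows with the logarithm of the file size rather than with the number of lines.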

