Statistics::MaxEntropy

MaxEntropy is a Perl5 module for Maximum Entropy Modeling and Feature Induction.

Statistics::MaxEntropy Ranking & Summary

  • License: GPL
  • Price: FREE
  • Publisher Name: Hugo WL ter Doest
  • Publisher web site: http://search.cpan.org/~terdoest/Statistics-MaxEntropy-0.9/MaxEntropy.pm


Statistics::MaxEntropy Description

MaxEntropy is a Perl5 module for Maximum Entropy Modeling and Feature Induction.

SYNOPSIS

  use Statistics::MaxEntropy;

  # debugging messages; default 0
  $Statistics::MaxEntropy::debug = 0;

  # maximum number of iterations for IIS; default 100
  $Statistics::MaxEntropy::NEWTON_max_it = 100;

  # minimal distance between new and old x for Newton's method;
  # default 0.001
  $Statistics::MaxEntropy::NEWTON_min = 0.001;

  # maximum number of iterations for Newton's method; default 100
  $Statistics::MaxEntropy::KL_max_it = 100;

  # minimal distance between new and old x; default 0.001
  $Statistics::MaxEntropy::KL_min = 0.001;

  # the size of Monte Carlo samples; default 1000
  $Statistics::MaxEntropy::SAMPLE_size = 1000;

  # creation of a new event space from an events file
  $events = Statistics::MaxEntropy::new($file);

  # Generalised Iterative Scaling, "corpus" means no sampling
  $events->scale("corpus", "gis");

  # Improved Iterative Scaling, "mc" means Monte Carlo sampling
  $events->scale("mc", "iis");

  # Feature Induction algorithm, also see Statistics::Candidates POD
  $candidates = Statistics::Candidates->new($candidates_file);
  $events->fi("iis", $candidates, $nr_to_add, "mc");

  # writing new events, candidates, and parameters files
  $events->write($some_other_file);
  $events->write_parameters($file);
  $events->write_parameters_with_names($file);

  # dump/undump the event space to/from a file
  $events->dump($file);
  $events->undump($file);

This module is an implementation of the Generalised and Improved Iterative Scaling (GIS, IIS) algorithms and the Feature Induction (FI) algorithm as defined in (Darroch and Ratcliff 1972) and (Della Pietra et al. 1997).

The purpose of the scaling algorithms is to find the maximum entropy distribution given a set of events and (optionally) an initial distribution.

A set of candidate features may also be specified; the FI algorithm may then be applied to find and add the candidate feature(s) that give the largest `gain' in terms of Kullback-Leibler divergence when added to the current set of features.

Events are specified in terms of a set of feature functions (properties) f_1...f_k that map each event to {0,1}: an event is a string of bits. In addition, the frequency of each event is given. We assume the event space to have a probability distribution that can be described by ...

The module requires the Bit::SparseVector module by Steffen Beyer and the Data::Dumper module by Gurusamy Sarathy. Both can be obtained from CPAN, just like this module.

Requirements:
  • Perl
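
To make the scaling step more concrete, here is a minimal, self-contained sketch of Generalised Iterative Scaling over events given as bit strings with frequencies. It is not the module's own code and does not use Bit::SparseVector. The toy data, the sub names (features, model, expectations, gis) and the arguments of gis are hypothetical. The sketch assumes the usual log-linear form, p(x) proportional to exp(sum_i lambda_i * f_i(x)), and adds the customary slack feature so that every event's features sum to the same constant.

```perl
use strict;
use warnings;

# Hypothetical toy event space: bit string => observed frequency.
my %freq = ('00' => 10, '01' => 20, '10' => 30, '11' => 40);

my @events = sort keys %freq;
my $k      = length $events[0];   # number of binary features f_1..f_k
my $C      = $k;                  # constant feature sum that GIS relies on

# Feature vector of an event: its bits plus one slack feature, so that
# every vector sums to exactly $C.
sub features {
    my ($event) = @_;
    my @f   = split //, $event;
    my $sum = 0;
    $sum += $_ for @f;
    return (@f, $C - $sum);
}

# Assumed model: p(x) proportional to exp(sum_i lambda_i * f_i(x)).
sub model {
    my ($lambda) = @_;
    my %p;
    my $z = 0;
    for my $e (@events) {
        my @f     = features($e);
        my $score = 0;
        $score += $lambda->[$_] * $f[$_] for 0 .. $#f;
        $p{$e} = exp($score);
        $z += $p{$e};
    }
    $p{$_} /= $z for @events;
    return \%p;
}

# Expected value of each feature under a distribution over the events.
sub expectations {
    my ($p) = @_;
    my @exp = (0) x ($k + 1);
    for my $e (@events) {
        my @f = features($e);
        $exp[$_] += $p->{$e} * $f[$_] for 0 .. $#f;
    }
    return @exp;
}

# GIS: move each lambda_i by log(target_i / current_i) / C until the
# largest update falls below $min or $max_it iterations have run.
sub gis {
    my ($max_it, $min) = @_;
    my $total = 0;
    $total += $_ for values %freq;
    my %empirical = map { $_ => $freq{$_} / $total } @events;
    my @target    = expectations(\%empirical);
    my @lambda    = (0) x ($k + 1);
    for my $it (1 .. $max_it) {
        my @current = expectations(model(\@lambda));
        my $change  = 0;
        for my $i (0 .. $k) {
            # Skip features whose expectation is zero (log undefined).
            next if $target[$i] == 0 or $current[$i] == 0;
            my $delta = log($target[$i] / $current[$i]) / $C;
            $lambda[$i] += $delta;
            $change = abs($delta) if abs($delta) > $change;
        }
        last if $change < $min;
    }
    return model(\@lambda);
}

my $p = gis(100, 1e-6);
printf "%s %.3f\n", $_, $p->{$_} for @events;
```

With only the two bit features constrained, the fitted distribution for this toy data should approach the product of the two bit marginals (roughly 0.12, 0.18, 0.28 and 0.42) rather than the raw relative frequencies. Iterating over the listed events only corresponds loosely to the "corpus" (no sampling) mode mentioned in the synopsis.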

