genomics

Perl extension for various DNA sequence analysis tools

genomics Ranking & Summary

  • License: Perl Artistic License
  • Price: FREE
  • Publisher Name: Jesse Salisbury
  • Publisher web site: http://search.cpan.org/~ltboots/


genomics Description

genomics is a Perl module for various DNA sequence analysis tools.

SYNOPSIS

    use genomics::FilterSeq;

This module condenses a FASTA-formatted file to a 'unique' list of sequences. This is done recursively by Hash{key} lookups. A unique key is sampled from each sequence and listed in a %HASH, thereby making all sequences with identical keys equivalent. The sequences are then scanned within +- the scanning window for other keys. Duplicates are squashed based on key prevalence or 5'->3' directionality.

EXPORT

Usage: call the subroutine by sending, in order:

  1. \%SEQUENCE - a reference to a hash with $SEQUENCE{$name} = $sequence structure
  2. $filter_start - the starting position in the sequence at which to grab a key
  3. $filter_length - the length of the key (shorter keys produce more 'pruned' sets)
  4. $filter_window - the window (+-) to scan for keys
  5. $filter_type - "M" = leave ambiguous sequences, "T" = force ambiguous to most 3' position, "F" = force ambiguous to most 5' position

    my ( $RefKeyHash_R, $RefKeyHashSeq_R, $EST_PER_SITE_R, $SITES_CHOSEN_R, $STATS_R ) =
        genomics::FilterSeq( \%SEQUENCE, $filter_start, $filter_length, $filter_window, $filter_type );

The subroutine returns the following:

  1. $RefKeyHash_R - a reference to a hash containing references to arrays of sequence names, by key
  2. $RefKeyHashSeq_R - similar, but returns the condensed sequence by key
  3. $EST_PER_SITE_R - a reference to a hash containing the key count value (number of keys represented)
  4. $SITES_CHOSEN_R - a reference to a hash containing the key count value (number of sites represented)
  5. $STATS_R - a reference to a hash of various counts

    # Pull the summary counts out of the statistics hash
    my $seq_count               = $$STATS_R{"seq_count"};
    my $Refseq_ID_count         = $$STATS_R{"Refseq_ID_count"};
    my $position_squashed_count = $$STATS_R{"position_squashed_count"};
    my $key_count               = $$STATS_R{"key_count"};
    my $my_length_ave           = $$STATS_R{"length_ave"};

    print "Out of $seq_count sequences ($my_length_ave), $Refseq_ID_count IDs were placed into $position_squashed_count sites (exact key), further reduced to $key_count sites by positional iteration ";

    # For each key, print the key, the sequence names grouped under it,
    # and the condensed sequence
    foreach ( keys(%$RefKeyHash_R) ) {
        print "$_ ";
        my $my_name_arr = $$RefKeyHash_R{$_};
        print @$my_name_arr;
        print " ";
        print ${ $$RefKeyHashSeq_R{$_} };
        print " ";
    }

Requirements:

  • Perl
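The exact-key grouping step described above can be illustrated with a short, self-contained sketch. This is not the module's own code: the helper names read_fasta and group_by_key are hypothetical, the start position is assumed to be 0-based, and only the first phase (grouping sequences that share an identical key) is shown. The +- window scan and the "M"/"T"/"F" handling of ambiguous sequences are left out.

```perl
use strict;
use warnings;

# Hypothetical helper: read FASTA text from a filehandle into a
# name => sequence hash, the input shape the description asks for.
sub read_fasta {
    my ($fh) = @_;
    my ( %seq, $name );
    while ( my $line = <$fh> ) {
        chomp $line;
        next if $line =~ /^\s*$/;          # skip blank lines
        if ( $line =~ /^>(\S+)/ ) {        # header line: take the first word as the name
            $name = $1;
            $seq{$name} = '';
        }
        elsif ( defined $name ) {          # sequence line: append to the current record
            $line =~ s/\s+//g;
            $seq{$name} .= uc $line;
        }
    }
    return \%seq;
}

# Hypothetical helper: group sequence names by the key sampled at
# $start (assumed 0-based) with length $length. Sequences too short
# to yield a full key are skipped.
sub group_by_key {
    my ( $seq_ref, $start, $length ) = @_;
    my %names_by_key;
    for my $name ( sort keys %$seq_ref ) {
        my $s = $seq_ref->{$name};
        next if length($s) < $start + $length;
        my $key = substr( $s, $start, $length );
        push @{ $names_by_key{$key} }, $name;
    }
    return \%names_by_key;
}

# Small in-memory FASTA example (made-up data).
my $fasta = <<'END';
>est1
ACGTACGTAA
>est2
ACGTACGTCC
>est3
TTTTACGTAA
END

open my $fh, '<', \$fasta or die "cannot open in-memory FASTA: $!";
my $SEQUENCE = read_fasta($fh);
close $fh;

# Sample a 4-base key from the start of every sequence.
my $groups = group_by_key( $SEQUENCE, 0, 4 );
for my $key ( sort keys %$groups ) {
    print "$key\t@{ $groups->{$key} }\n";
}
```

With this made-up input, est1 and est2 share the key ACGT and fall into one group, while est3 forms its own group under TTTT. A shorter key length would merge more sequences, which matches the note that shorter keys produce more 'pruned' sets.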

