genomics: Perl extension for various DNA sequence analysis tools
genomics Ranking & Summary
- License: Perl Artistic License
- Price: FREE
- Publisher Name: Jesse Salisbury
- Publisher web site: http://search.cpan.org/~ltboots/
genomics Description
genomics is a Perl module that provides various DNA sequence analysis tools.

SYNOPSIS

    use genomics::FilterSeq;

This module condenses a FASTA-formatted file to a 'unique' list of sequences. This is done recursively by hash key lookups. A unique key is sampled from each sequence and stored in a %HASH, thereby making all sequences with identical keys equivalent. The sequences are scanned +/- the scanning window for other keys. Duplicates are squashed based on key prevalence or 5'->3' directionality.

EXPORT

Usage: Call the subroutine by passing, in order:

1. \%SEQUENCE - a reference to a hash with the structure $SEQUENCE{$name} = $sequence
2. $filter_start - the starting position in the sequence at which to grab a key
3. $filter_length - the length of the key (shorter keys produce more 'pruned' sets)
4. $filter_window - the window +/- to scan for keys
5. $filter_type - "M" = leave ambiguous sequences, "T" = force ambiguous sequences to the most 3' position, "F" = force ambiguous sequences to the most 5' position

    my ( $RefKeyHash_R, $RefKeyHashSeq_R, $EST_PER_SITE_R, $SITES_CHOSEN_R, $STATS_R ) =
        genomics::FilterSeq( \%SEQUENCE, $filter_start, $filter_length, $filter_window, $filter_type );

The subroutine returns the following:

1. $RefKeyHash_R - a reference to a hash containing references to arrays of sequence names, by key
2. $RefKeyHashSeq_R - similar, but returns the condensed sequence by key
3. $EST_PER_SITE_R - a reference to a hash containing the key count value (number of keys represented)
4. $SITES_CHOSEN_R - a reference to a hash containing the key count value (number of sites represented)
5. $STATS_R - a reference to a hash of various counts

    # Pull the individual counts out of the statistics hash.
    my $seq_count               = $$STATS_R{"seq_count"};
    my $Refseq_ID_count         = $$STATS_R{"Refseq_ID_count"};
    my $position_squashed_count = $$STATS_R{"position_squashed_count"};
    my $key_count               = $$STATS_R{"key_count"};
    my $my_length_ave           = $$STATS_R{"length_ave"};

    print "Out of $seq_count sequences ($my_length_ave), $Refseq_ID_count Id's were placed into $position_squashed_count sites (exact key), further reduced to $key_count sites by positional iteration ";

    # For each key, print the key, the sequence names grouped under it,
    # and the condensed sequence for that key.
    foreach ( keys(%$RefKeyHash_R) ) {
        print "$_ ";
        my $my_name_arr = $$RefKeyHash_R{$_};
        print @$my_name_arr;
        print " ";
        print ${ $$RefKeyHashSeq_R{$_} };
        print " ";
    }

Requirements:
· Perl
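The key-sampling idea described above can be illustrated with a short, self-contained sketch. This is not the module's implementation: the helpers parse_fasta and group_by_key are hypothetical names introduced here, the start position is assumed to be 0-based, and the window scan and the "M"/"T"/"F" handling are left out. It only shows how sampling a fixed-length key from each sequence makes sequences with identical keys fall into the same hash entry.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the genomics module): parse FASTA text
# into a hash of name => sequence, the structure described for %SEQUENCE.
sub parse_fasta {
    my ($text) = @_;
    my ( %seq, $name );
    for my $line ( split /\r?\n/, $text ) {
        next if $line =~ /^\s*$/;
        if ( $line =~ /^>(\S+)/ ) {
            $name = $1;
            $seq{$name} = '';
        }
        elsif ( defined $name ) {
            $line =~ s/\s+//g;
            $seq{$name} .= uc $line;
        }
    }
    return %seq;
}

# Hypothetical helper: group sequence names by the key sampled at $start
# (assumed 0-based) with length $length. Sequences too short to supply a
# full key are skipped.
sub group_by_key {
    my ( $seq_ref, $start, $length ) = @_;
    my %names_by_key;
    for my $name ( sort keys %$seq_ref ) {
        my $s = $seq_ref->{$name};
        next if length($s) < $start + $length;
        my $key = substr( $s, $start, $length );
        push @{ $names_by_key{$key} }, $name;
    }
    return \%names_by_key;
}

# Made-up example data, for illustration only.
my $fasta = <<'END_FASTA';
>est1
ACGTACGTTTGACCA
>est2
ACGTACGTTTGACCATTT
>est3
GGGTACGAAAGACCA
END_FASTA

my %SEQUENCE = parse_fasta($fasta);
my $groups   = group_by_key( \%SEQUENCE, 4, 6 );

for my $key ( sort keys %$groups ) {
    print "$key: @{ $groups->{$key} }\n";
}
# Prints:
# ACGAAA: est3
# ACGTTT: est1 est2
```

With a shorter key length more sequences share a key, which matches the note above that shorter keys produce more 'pruned' sets.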