But that does not compute the 'longest stretch'.

The attached perl script does, and will allow you to write:

> polyfind [-all] *.seq > polyfind.results

Enjoy,

Malcolm Cook

> -----Original Message-----
> From: Joseph Landman [mailto:landman at scalableinformatics.com]
> Sent: Tuesday, September 09, 2003 6:58 PM
> To: BiO BB
> Cc: biodevelopers
> Subject: Re: [BiO BB] Poly A tail length - script help please
>
>
> First one is free ...
>
> #!/usr/bin/perl
>
> use strict;
>
> my ($directory,$directory_handle,$file,@files,$sequence);
> my ($file_handle,$poly_a_tail,$rseq);
>
> $directory = "./"; # directory to open
> if (!(opendir $directory_handle,$directory))
>  {
>    die "FATAL ERROR: Unable to open directory = ".$directory."\n";
>  }
>
> # select only the .seq files
> @files = grep { /\.seq$/ } readdir($directory_handle);
>
> # loop over these selected files
> foreach $file (@files)
>  {
>    # try to open the file
>    if (!(open($file_handle,"< ".$file)))
>     {
>       # if we cannot open it, warn the user, and skip to the next file
>       warn "Warning: unable to open file = ".$file."\. Skipping\.\n";
>       next;
>     }
>    else
>     {
>       # assume one line per file, or we will have to modify this
>       chomp($sequence=<$file_handle>);
>       # now time to bring out the heavy artillery
>       $rseq=reverse $sequence;   # poly-a is now at the head
>       $rseq =~ /^([AN]+)\w+$/;   # match A's and/or N's at the front
>       $poly_a_tail = $1;         # return the match ...
>       printf "%i %s\n",length($poly_a_tail),$file; # tell the world ...
>       close($file_handle);
>     }
>  }
>
>
>
> On Tue, 2003-09-09 at 17:00, Tristan Fiedler wrote:
> > Thanks for the scripting tips! I have a 'counting' issue which I
> > need to quickly resolve. A typical sequence input file (5 - 700
> > bases) looks like :
> >
> > AGTAGTCGATCATNATANCTANTACNACTACTAACTATGCTAGNNAATATAAAAAAAAANAAA
> >
> > I have over 500 files, named *.seq. I would like to create a
> > script which :
> >
> > a. runs through all the files,
> > b. counts the length of the 'poly A' tail (defined as the longest
> >    stretch of A or N)
> > c. sends the output to a file, eg.
> >
> > 25 1.seq
> > 87 2.seq
> > 13 3.seq
> >
> > Example valid poly A tails :
> >
> > AAAANANANANAAANNAAAAAA
> >
> > AAAAAAAAAAAAAA
> >
> > NNNNNNNNNNNNN
> >
> > AAANNNNNNNNNNNAAAAAAAAA
> >
> > Thank you so much for your expertise!
> >
> > Tristan
> --
> Joseph Landman, Ph.D
> Scalable Informatics LLC
> email: landman at scalableinformatics.com
> web: http://scalableinformatics.com
> phone: +1 734 612 4615
>
>
> _______________________________________________
> BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: polyafind
Type: application/octet-stream
Size: 3438 bytes
Desc: polyafind
Url : http://bioinformatics.org/pipermail/biodevelopers/attachments/20030910/7473c37a/polyafind.obj
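
The attachment is not reproduced in the archive, so what follows is only a rough sketch of the 'longest stretch' idea and not the attached script. It assumes one uppercase sequence on the first line of each file and takes the file names from the command line. The helper name longest_an_run is hypothetical, and the -all option mentioned above is not modelled because its meaning is not described in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not taken from the attached script):
# return the length of the longest run of A and/or N anywhere in $seq.
sub longest_an_run {
    my ($seq) = @_;
    my $best = 0;
    # Visit every maximal run of A/N and remember the longest one.
    while ($seq =~ /([AN]+)/g) {
        my $len = length $1;
        $best = $len if $len > $best;
    }
    return $best;
}

# Print "<length> <file>" for each file named on the command line,
# assuming one sequence on the first line of each file.
foreach my $file (@ARGV) {
    my $fh;
    if (!open($fh, '<', $file)) {
        warn "Warning: unable to open file = $file. Skipping.\n";
        next;
    }
    my $sequence = <$fh>;
    close($fh);
    $sequence = '' unless defined $sequence;   # tolerate empty files
    chomp $sequence;
    printf "%d %s\n", longest_an_run($sequence), $file;
}
```

Saved under a name of your choosing (say longest_run.pl, also hypothetical), it could be run as "perl longest_run.pl *.seq > results.txt". Unlike the quoted script, which only measures the run at the very end of the sequence, this reports the longest A/N run wherever it occurs, so an internal stretch can win over the tail.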