Hi,

I am working on a paper on DNA computing. Can anyone help me if you have any information on this topic?

Thank you,
Alok Kumar
Pune (India)

On Wed, 10 Sep 2003 bio_bulletin_board-request@bioinformatics.org wrote:
>When replying, PLEASE edit your Subject line so it is more specific
>than "Re: BiO_Bulletin_Board digest, Vol..."
>
>
>Today's Topics:
>
>   1. BLASTing SCO (re: Linux IP, IBM suite, shred algorithm, etc) (Harry Mangalam)
>   2. NCBI Viewer (jinal jhaveri)
>   3. Poly A tail length - script help please (Tristan Fiedler)
>   4. Re: Poly A tail length - script help please (Joseph Landman)
>   5. Re: Poly A tail length - script help please (Dmitri I GOULIAEV)
>
>--__--__--
>
>Message: 1
>Date: Tue, 09 Sep 2003 09:13:26 -0700
>From: Harry Mangalam
>To: bio_bulletin_board@bioinformatics.org
>Subject: [BiO BB] BLASTing SCO (re: Linux IP, IBM suite, shred algorithm, etc)
>Reply-To: bio_bulletin_board@bioinformatics.org
>
>I've been watching the SCO vs IBM suit with some interest, and this caught
>my attention.
>
>Eric Raymond has apparently reworked some old 'shred' code which calculates
>MD5 hashes for long (from the molbio perspective) words (~3 lines at a time)
>and then sorts the hashes to identify sections of the Linux source code tree
>which are identical to those from the SCO-owned System V Unix base.
>
>This sounds a bit like the initial pass for BLAT, which generates hashes for
>much smaller words and uses the hashes in comparisons.
>
>http://www.eweek.com/article2/0,4149,1257617,00.asp
>
>Could BLAST not be used to identify, faster and much more sensitively, not
>only identical but also similar sections of code?
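
A rough sketch of the hash-and-compare idea described above, written in Python. Raymond's actual shred code is not shown in this thread, so every name below (window_hashes, shared_windows, the dictionary layout) is hypothetical. The sketch hashes each window of three consecutive lines with MD5. It looks the hashes up in a dictionary rather than sorting them, and it collapses whitespace before hashing; both choices are assumptions of this sketch, not something the message states.

```python
import hashlib
from collections import defaultdict


def window_hashes(lines, k=3):
    """Yield (1-based start line, MD5 hex digest) for each window of k lines."""
    # Assumption: collapse runs of whitespace so reindented code still matches.
    norm = [" ".join(line.split()) for line in lines]
    for i in range(len(norm) - k + 1):
        chunk = "\n".join(norm[i:i + k])
        yield i + 1, hashlib.md5(chunk.encode("utf-8")).hexdigest()


def shared_windows(tree_a, tree_b, k=3):
    """Compare two source trees given as {file name: list of lines}.

    Returns a sorted list of (file_a, start_a, file_b, start_b) tuples, one
    per pair of k-line windows whose hashes are identical.
    """
    # Index every window of tree_a by its digest.
    index = defaultdict(list)
    for name, lines in tree_a.items():
        for start, digest in window_hashes(lines, k):
            index[digest].append((name, start))

    # Look up every window of tree_b in that index.
    matches = []
    for name, lines in tree_b.items():
        for start, digest in window_hashes(lines, k):
            for a_name, a_start in index.get(digest, []):
                matches.append((a_name, a_start, name, start))
    return sorted(matches)
```

Exact hashing like this only finds identical windows (after whitespace normalisation). Finding merely similar sections, as the question about BLAST suggests, would need an alignment step on top of the hash hits.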
>
>It would have to be modified to do an 'all against all' comparison and would
>also have to take line numbers and file names into account, but here's a good
>undergrad programming project for someone, with the possibility of getting
>some good press and creating a tool that will undoubtedly be used again in
>litigation (read: it could be worth real money).
>
>Then again, Raymond's shred code approach is probably good enough.
>
>Comments?
>--
>Cheers, Harry
>Harry J Mangalam - 949 856 2847 (v&f) - hjm@tacgi.com
>
>
>--__--__--
>
>Message: 2
>Date: Tue, 09 Sep 2003 12:03:45 -0700
>From: jinal jhaveri
>To: bio_bulletin_board@bioinformatics.org
>Subject: [BiO BB] NCBI Viewer
>Reply-To: bio_bulletin_board@bioinformatics.org
>
>Hi there,
>
>I am developing a zoom viewer for chromosomes. Can anyone give tips on
>available software to use for that? I want this zoom viewer to be online,
>like the one NCBI has
>(http://www.ncbi.nlm.nih.gov/mapview/maps.cgi?org=arabid&chr=I),
>i.e. the Entrez one.
>
>thank you
>--Jinal
>
>
>--__--__--
>
>Message: 3
>Date: Tue, 9 Sep 2003 17:00:55 -0400 (EDT)
>From: "Tristan Fiedler"
>To: bio_bulletin_board@bioinformatics.org
>Cc: bio_bulletin_board@bioinformatics.org
>Subject: [BiO BB] Poly A tail length - script help please
>Reply-To: bio_bulletin_board@bioinformatics.org
>
>Thanks for the scripting tips!  I have a 'counting' issue which I need to
>quickly resolve.  A typical sequence input file (5 - 700 bases) looks like:
>
>AGTAGTCGATCATNATANCTANTACNACTACTAACTATGCTAGNNAATATAAAAAAAAANAAA
>
>I have over 500 files, named *.seq.  I would like to create a script which:
>
>a. runs through all the files,
>b. counts the length of the 'poly A' tail (defined as the longest stretch
>   of A or N),
>c. sends the output to a file, e.g.
>
>25 1.seq
>87 2.seq
>13 3.seq
>
>Example valid poly A tails:
>
>AAAANANANANAAANNAAAAAA
>
>AAAAAAAAAAAAAA
>
>NNNNNNNNNNNNN
>
>AAANNNNNNNNNNNAAAAAAAAA
>
>Thank you so much for your expertise!
>
>Tristan
>
>--
>Tristan J. Fiedler, Ph.D.
>Postdoctoral Research Fellow
>NIEHS Marine & Freshwater Biomedical Sciences Center
>Rosenstiel School of Marine & Atmospheric Sciences
>University of Miami
>
>tfiedler@rsmas.miami.edu
>t.fiedler@umiami.edu (alias)
>305-361-4626
>
>--__--__--
>
>Message: 4
>Subject: Re: [BiO BB] Poly A tail length - script help please
>From: Joseph Landman
>To: BiO BB
>Cc: biodevelopers
>Date: Tue, 09 Sep 2003 19:57:34 -0400
>Reply-To: bio_bulletin_board@bioinformatics.org
>
>First one is free ...
>
>    #!/usr/bin/perl
>
>    use strict;
>    use warnings;
>
>    my ($directory,$directory_handle,$file,@files,$sequence);
>    my ($file_handle,$poly_a_tail);
>
>    $directory = "./";  # directory to open
>    if (!(opendir $directory_handle,$directory))
>     {
>       die "FATAL ERROR: Unable to open directory = ".$directory."\n";
>     }
>
>    # select only the .seq files
>    @files = sort grep { /\.seq$/ } readdir($directory_handle);
>    closedir($directory_handle);
>
>    # loop over these selected files
>    foreach $file (@files)
>     {
>       # try to open the file
>       if (!(open($file_handle,"<",$directory.$file)))
>        {
>          # if we cannot open it, warn the user, and skip to the next file
>          warn "Warning: unable to open file = ".$file.". Skipping.\n";
>          next;
>        }
>       # assume one line per file, or we will have to modify this
>       $sequence = <$file_handle>;
>       close($file_handle);
>       $sequence = "" unless defined $sequence;  # empty file
>       $sequence =~ s/\s+$//;                    # strip newline and trailing whitespace
>       # now time to bring out the heavy artillery:
>       # capture the run of A's and/or N's at the very end (may be empty)
>       ($poly_a_tail) = $sequence =~ /([AN]*)$/;
>       printf "%i %s\n",length($poly_a_tail),$file;  # tell the world ...
>     }
>
>
>
>On Tue, 2003-09-09 at 17:00, Tristan Fiedler wrote:
> > Thanks for the scripting tips!  I have a 'counting' issue which I need to
> > quickly resolve.  A typical sequence input file (5 - 700 bases) looks like:
> >
> > AGTAGTCGATCATNATANCTANTACNACTACTAACTATGCTAGNNAATATAAAAAAAAANAAA
> >
> > I have over 500 files, named *.seq.
> > I would like to create a script which:
> >
> > a. runs through all the files,
> > b. counts the length of the 'poly A' tail (defined as the longest stretch
> >    of A or N),
> > c. sends the output to a file, e.g.
> >
> > 25 1.seq
> > 87 2.seq
> > 13 3.seq
> >
> > Example valid poly A tails:
> >
> > AAAANANANANAAANNAAAAAA
> >
> > AAAAAAAAAAAAAA
> >
> > NNNNNNNNNNNNN
> >
> > AAANNNNNNNNNNNAAAAAAAAA
> >
> > Thank you so much for your expertise!
> >
> > Tristan
>--
>Joseph Landman, Ph.D
>Scalable Informatics LLC
>email: landman@scalableinformatics.com
>  web: http://scalableinformatics.com
>phone: +1 734 612 4615
>
>
>
>--__--__--
>
>Message: 5
>Date: Wed, 10 Sep 2003 10:43:42 -0500
>From: Dmitri I GOULIAEV
>To: bio_bulletin_board@bioinformatics.org
>Subject: Re: [BiO BB] Poly A tail length - script help please
>Organization: DIG
>Reply-To: bio_bulletin_board@bioinformatics.org
>
>Hi, Tristan Fiedler!
>
>On Tue, Sep 09, 2003 at 05:00:55PM -0400, Tristan Fiedler wrote:
>
> > Thanks for the scripting tips!  I have a 'counting' issue which I need to
> > quickly resolve.  A typical sequence input file (5 - 700 bases) looks like:
> >
> > AGTAGTCGATCATNATANCTANTACNACTACTAACTATGCTAGNNAATATAAAAAAAAANAAA
> >
> > I have over 500 files, named *.seq.  I would like to create a script which:
> >
> > a. runs through all the files,
> > b. counts the length of the 'poly A' tail (defined as the longest stretch
> >    of A or N),
> > c. sends the output to a file, e.g.
> >
> > 25 1.seq
> > 87 2.seq
> > 13 3.seq
> >
> > Example valid poly A tails:
> >
> > AAAANANANANAAANNAAAAAA
> >
> > AAAAAAAAAAAAAA
> >
> > NNNNNNNNNNNNN
> >
> > AAANNNNNNNNNNNAAAAAAAAA
>
>If you have these sequences:
>
>  1.seq   CACATGACTGACTGACTGACTACGACTGCAAAANANANANAAANNAAAAAA
>  2.seq   CGTAGCTCTACGATGCTACGAGAAAAAAAAAAAAAA
>  3.seq   TGTACGTACGATCGATGCTAGCNNNNNNNNNNNN
>  4.seq   CATGTGCTACGACGATGCT
>  5.seq   CATGTGCTACGACGATGCTAAAANNNNNNNNNNNAAAAAAAAA
>
>and you run this command:
>
>  $ for s in *.seq ; do cat "$s" | grep -o '[AN]*[AN]$' \
>      | tr -d '\n' | wc -c | tr -d '\n' ; echo -e "\t$s" ; done
>
>then the output will be:
>
>  22      1.seq
>  14      2.seq
>  12      3.seq
>  0       4.seq
>  24      5.seq
>
>Exercise for the OP: redirect this output to a file.
>
> > Thank you so much for your expertise!
>
>You should really start learning the un*x text utilities and/or
>some scripting language (e.g. python, tcl, perl).
>
>
>Regards,
>
>--
>DIG (Dmitri I GOULIAEV)
>http://www.bioinformatics.org/~dig/
>1024D/63A6C649: 26A0 E4D5 AB3F C2D4 0112  66CD 4343 C0AF 63A6 C649
>
>
>--__--__--
>
>_______________________________________________
>BiO_Bulletin_Board maillist  -  BiO_Bulletin_Board@bioinformatics.org
>https://bioinformatics.org/mailman/listinfo/bio_bulletin_board
>
>
>End of BiO_Bulletin_Board Digest
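
For comparison with the Perl and shell answers in the digest, here is a minimal Python sketch of the same count. It is only an illustration: the function names poly_a_tail_length and tail_report are made up here, and it assumes that the "tail" means the unbroken run of uppercase A or N characters at the very end of the sequence, with all whitespace in the file ignored (so a sequence wrapped over several lines is treated as one).

```python
from pathlib import Path


def poly_a_tail_length(sequence):
    """Length of the trailing run of 'A'/'N' characters in `sequence`."""
    # Assumption: newlines and other whitespace are not part of the sequence.
    seq = "".join(sequence.split())
    # rstrip("AN") removes every trailing A or N; the difference is the tail.
    return len(seq) - len(seq.rstrip("AN"))


def tail_report(directory=".", pattern="*.seq"):
    """Return one 'length<TAB>file name' line per matching file, sorted by name."""
    rows = []
    for path in sorted(Path(directory).glob(pattern)):
        rows.append(f"{poly_a_tail_length(path.read_text())}\t{path.name}")
    return "\n".join(rows)
```

Calling print(tail_report()) in the directory that holds the *.seq files prints the table, and the shell can redirect it to a file. Lowercase a/n are not counted in this sketch; if the files may contain them, upper-casing the sequence first would be one way to handle that.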