Joseph Landman said:
> Hi Dan:
>
> I am assuming that your arrays contain not HSP's, but some sort of
> object representing the HSP ala BioPerl. Is this correct?
>

Nope, just a simple hash, key => value, i.e.

{
  'hsp_hit-from' => 5,
  'hsp_hit-to'   => 10,
  score          => 20,
}

> See
> http://doc.bioperl.org/releases/bioperl-1.2/Bio/Tools/BPlite/HSP.html

Ta. The problem is I find these implementations a bit confusing. I know
I am reinventing the wheel, but sometimes it pays off. For example, I have
my HSPs in an RDB, so why load them into objects? Does BioPerl have that
functionality? (I guess it does. I also like the htmlReport stuff with
BioPerl, but again I wish it was a bit more general, less self-specific.)

> for a way to handle some of the HSP processing. This might help you
> simplify the expression of what you are doing...

Yup, or at least make it a bit more readable by people who don't know
my specifics. But I seriously doubt that it will help my optimization
problem.

Any suggestions on the complexity of the code below, or on how to improve
it, would be greatly appreciated.

Cheers,
Dan.

>
> Joe
>
> On Wed, 2003-08-27 at 14:56, Dan Bolser wrote:
>> Hello, splice to see you etc.
>>
>> I am trying to write a *simple* "best HSP in family"
>> algorithm in perl.
>>
>> My raw materials are SCOP queries against target sequences.
>>
>> I get each set of hits for each protein in turn, sorted
>> by P_START (Hsp_query-from).
>>
>> I then go through the list and remove any pair of sequences
>> with more than $THRESH AA overlap (if they come from the same
>> scop family).
>>
>> This list removal involves lots of splicing, which is O(N) with
>> list size.
>>
>> I figure I could avoid all that splice if I just use pointers
>> to array positions, but I can't work out how to do this...
>>
>> Maybe splicing is the least of my optimization problems....
>>
>> __SKIP__
>>
>> preamble
>>
>> @hsps = array of HSP hashes, for a particular protein
>> each HSP can be from several SCOP sequences.
>>
>> __RESUME__
>>
>> my @result;                    # Final HSPs
>>
>> TOP: while (@hsps) {           # NB: ordered by Hsp_query-from
>>                                # (for optimization).
>>
>>   my $p = 0;                   # Current HSP pointer.
>>
>>   MID: for (my $j = $p + 1; $j < @hsps; $j++) {   # Overlap slider.
>>
>>     # Family overlap only! SCCS is a string, so compare with ne.
>>
>>     next MID if
>>       $hsps[$p]->{SCCS} ne $hsps[$j]->{SCCS};
>>
>>     # Optimization: the list is sorted by P_START, so once the
>>     # overlap drops below $THRESH no later HSP can overlap $p
>>     # enough. Leave the loop so that $p is kept below.
>>
>>     if ( $THRESH >
>>          $hsps[$p]->{P_END} - $hsps[$j]->{P_START} ) {
>>       last MID;
>>     }
>>
>>     # Pick best of pair (removing the other from the list).
>>
>>     if ( $hsps[$p]->{E_VALUE} > $hsps[$j]->{E_VALUE} ) {
>>       splice(@hsps, $p, 1);    # Drop $p; $j shifts down by one.
>>       $j--;
>>       $p = $j;
>>     }
>>     else {
>>       splice(@hsps, $j, 1);
>>       $j--;
>>     }
>>   }
>>   push @result, splice(@hsps, $p, 1);
>> }
>> print "OK\n\n";
>>
>> __END_ISH__
>>
>> Whaddya think?
>> Any better way?
>>
>> Cheers,
>>
>>
>>
>> On Wed, 27 Aug 2003, sekhar kavuru wrote:
>>
>> > Dear Joseph,
>> >
>> > I am a Perl Developer with BioInformatics Certification.
>> >
>> > Recently I developed a software package using BioPerl/EnsEmbl to create a
>> > Perl/HTML based database interface to access Genome data from EnsEMBL and
>> > SwissProt. The Browser I developed enables users to query the ENSEMBL
>> > database based on either CloneId or Chromosome Number.
>> >
>> > If you need any assistance or help please feel free to write to me.
>> >
>> > Regards
>> >
>> > Sekhar
>> >
>> > biodevelopers-request at bioinformatics.org wrote:
>> > Send Biodevelopers mailing list submissions to
>> > biodevelopers at bioinformatics.org
>> >
>> > To subscribe or unsubscribe via the World Wide Web, visit
>> > https://bioinformatics.org/mailman/listinfo/biodevelopers
>> > or, via email, send a message with subject or body 'help' to
>> > biodevelopers-request at bioinformatics.org
>> >
>> > You can reach the person managing the list at
>> > biodevelopers-admin at bioinformatics.org
>> >
>> > When replying, please edit your Subject line so it is more specific than
>> > "Re: Contents of Biodevelopers digest..."
>> >
>> >
>> > Today's Topics:
>> >
>> > 1. Re: [BiO BB] perl scripting assistance (Joseph Landman)
>> >
>> > --__--__--
>> >
>> > Message: 1
>> > From: Joseph Landman
>> > To: BiO BB
>> > Cc: biodevelopers
>> > Date: 26 Aug 2003 21:06:11 -0400
>> > Subject: [Biodevelopers] Re: [BiO BB] perl scripting assistance
>> > Reply-To: biodevelopers at bioinformatics.org
>> >
>> > Try the biodevelopers group on bioinformatics.org ...
>> >
>> > On Tue, 2003-08-26 at 13:06, Tristan J. Fiedler wrote:
>> > > Are any bulletin boards / discussion groups available for obtaining
>> > > tips in scripting with perl?
>> > >
>> > > Thank you.
>> > >
> --
> Joseph Landman, Ph.D
> Scalable Informatics LLC
> email: landman at scalableinformatics.com
> web: http://scalableinformatics.com
> phone: +1 734 612 4615
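
On the "pointers to array positions" idea in the quoted message: one way
to avoid splice is to leave @hsps untouched and keep a parallel array of
removal flags. What follows is only a minimal sketch of that idea, not a
tested replacement. The sub name best_hsps, the @dead flag array and the
sample data are made up for illustration; the hash keys (SCCS, P_START,
P_END, E_VALUE) and the P_START sort order are taken from the quoted code.
It assumes SCCS is a string and that a lower E_VALUE is better.

```perl
use strict;
use warnings;

# Keep the best (lowest E_VALUE) HSP among same-family HSPs that overlap
# by at least $thresh residues. Assumes @$hsps is sorted by P_START.
sub best_hsps {
    my ($hsps, $thresh) = @_;
    my @dead = (0) x @$hsps;    # removal flags used instead of splice
    my @result;

    for my $start (0 .. $#$hsps) {
        next if $dead[$start];  # already removed or already reported
        my $p = $start;         # index of the current best HSP

        for my $j ($start + 1 .. $#$hsps) {
            next if $dead[$j];
            # Family overlap only; SCCS is compared as a string.
            next if $hsps->[$p]{SCCS} ne $hsps->[$j]{SCCS};
            # Sorted by P_START, so no later HSP can overlap $p enough.
            last if $hsps->[$p]{P_END} - $hsps->[$j]{P_START} < $thresh;

            if ($hsps->[$p]{E_VALUE} > $hsps->[$j]{E_VALUE}) {
                $dead[$p] = 1;  # $j is better: drop $p, continue from $j
                $p = $j;
            }
            else {
                $dead[$j] = 1;  # $p is better: drop $j
            }
        }
        $dead[$p] = 1;          # consumed: never reported twice
        push @result, $hsps->[$p];
    }
    return \@result;
}

# Invented sample data, sorted by P_START.
my @sample = (
    { SCCS => 'a.1.1.1', P_START => 1,   P_END => 100, E_VALUE => 1e-5  },
    { SCCS => 'a.1.1.1', P_START => 20,  P_END => 120, E_VALUE => 1e-20 },
    { SCCS => 'b.2.1.1', P_START => 30,  P_END => 90,  E_VALUE => 1e-3  },
    { SCCS => 'a.1.1.1', P_START => 150, P_END => 250, E_VALUE => 1e-8  },
);
my $best = best_hsps(\@sample, 10);
print join(", ", map { "$_->{SCCS}:$_->{P_START}" } @$best), "\n";
```

Marking an entry costs O(1) where splice costs O(N), but the inner scan
still makes the worst case quadratic in the number of HSPs, so this only
helps if splicing really is the bottleneck. Grouping the HSPs by SCCS
first would probably shrink the inner scan further.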