Hi Dan:

  I am assuming that your arrays contain not HSPs, but some sort of
object representing the HSP, à la BioPerl.  Is this correct?

  See http://doc.bioperl.org/releases/bioperl-1.2/Bio/Tools/BPlite/HSP.html
for a way to handle some of the HSP processing.  This might help you
simplify the expression of what you are doing...

Joe

On Wed, 2003-08-27 at 14:56, Dan Bolser wrote:
> Hello, splice to see you etc.
> 
> I am trying to write a *simple* "best HSP in family"
> algorithm in perl.
> 
> My raw materials are SCOP queries against target sequences.
> 
> I get each set of hits for each protein in turn, sorted
> by P_START (Hsp_query-from).
> 
> I then go through the list and remove any pair of sequences
> with more than $THRESH AA overlap (if they come from the same
> SCOP family).
> 
> This list removal involves lots of splicing, which is O(N) with
> list size.
> 
> I figure I could avoid all that splicing if I just used pointers
> to array positions, but I can't work out how to do this...
> 
> Maybe splicing is the least of my optimization problems....
> 
> __SKIP__
> 
> preamble
> 
> @hsps = array of HSP hashes, for a particular protein
>         each HSP can be from several SCOP sequences.
> 
> __RESUME__
> 
> my @result;                             # Final HSPs
> 
> TOP:while (@hsps){                      # NB: Ordered by Hsp_query-from
>                                         # (for optimization).
> 
>   my $p = 0;                            # Current HSP pointer.
> 
>   MID:for (my $j=$p+1; $j<@hsps; $j++){ # Overlap slider.
> 
>     # Family overlap only! (SCCS is a string, so compare with 'ne'.)
> 
>     next MID if
>       $hsps[$p]->{SCCS} ne $hsps[$j]->{SCCS};
> 
>     # Optimization: the list is sorted by P_START, so once the
>     # overlap falls below $THRESH nothing later can overlap $p.
>     # Stop scanning and fall through so that $p is kept.
> 
>     if ( $THRESH >
>          $hsps[$p]->{P_END} - $hsps[$j]->{P_START} ){
> 
>       last MID;
>     }
> 
>     # Pick best of pair (removing the other from the list).
> 
>     if ( $hsps[$p]->{E_VALUE} > $hsps[$j]->{E_VALUE} ){
>       splice (@hsps, $p, 1);
>       $j--;
>       $p = $j;
>     }
>     else {
>       splice (@hsps, $j, 1);
>       $j--;
>     }
>   }
>   push @result, splice(@hsps, $p, 1);
> }
> print "OK\n\n";
> 
> __END_ISH__
> 
> Whaddya think?
> Any better way?
> 
> Cheers,
> 
> On Wed, 27 Aug 2003, sekhar kavuru wrote:
> 
> > Dear Joseph,
> > 
> > I am a Perl developer with bioinformatics certification.
> > 
> > Recently I developed a software package using BioPerl/Ensembl to
> > create a Perl/HTML-based database interface to access genome data
> > from Ensembl and SwissProt.
> > The browser I developed enables users to query the Ensembl database
> > by either clone ID or chromosome number.
> > 
> > If you need any assistance or help, please feel free to write to me.
> > 
> > Regards
> > 
> > Sekhar
> > 
> > biodevelopers-request at bioinformatics.org wrote:
> > 
> > Today's Topics:
> > 
> >    1. Re: [BiO BB] perl scripting assistance (Joseph Landman)
> > 
> > --__--__--
> > 
> > Message: 1
> > From: Joseph Landman
> > To: BiO BB
> > Cc: biodevelopers
> > Date: 26 Aug 2003 21:06:11 -0400
> > Subject: [Biodevelopers] Re: [BiO BB] perl scripting assistance
> > Reply-To: biodevelopers at bioinformatics.org
> > 
> > Try the biodevelopers group on bioinformatics.org ...
> > 
> > On Tue, 2003-08-26 at 13:06, Tristan J. Fiedler wrote:
> > > Are any bulletin boards / discussion groups available for obtaining tips
> > > in scripting with perl?
> > > 
> > > Thank you.
> > 
-- 
Joseph Landman, Ph.D
Scalable Informatics LLC
email: landman at scalableinformatics.com
web: http://scalableinformatics.com
phone: +1 734 612 4615
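
On the "pointers to array positions" idea in Dan's post: one possible way
to avoid splice altogether is to leave the array untouched and mark
discarded HSPs in a parallel flag array. What follows is only a rough
sketch under stated assumptions, not code from the thread. The sub name
best_hsps_per_family, the @dead array and the ID key are hypothetical;
coordinates are assumed to be inclusive, and a pair is treated as
conflicting when its overlap exceeds the threshold.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Hypothetical helper (not from the thread). Expects an array ref of HSP
# hash refs sorted by P_START, each with SCCS, P_START, P_END, E_VALUE.
# Assumes inclusive coordinates; a conflict means overlap > $thresh.
sub best_hsps_per_family {
    my ($hsps, $thresh) = @_;
    my @dead = (0) x @$hsps;    # 1 marks a discarded HSP; nothing is spliced

    for my $p (0 .. $#$hsps) {
        next if $dead[$p];
        my $cur = $hsps->[$p];

        for my $j ($p + 1 .. $#$hsps) {
            my $other = $hsps->[$j];

            # Sorted by P_START: once this upper bound on the overlap is
            # within the threshold, no later HSP can conflict with $cur.
            last if $cur->{P_END} - $other->{P_START} + 1 <= $thresh;

            next if $dead[$j];
            next if $cur->{SCCS} ne $other->{SCCS};    # family overlap only

            my $overlap = min($cur->{P_END}, $other->{P_END})
                        - $other->{P_START} + 1;
            next if $overlap <= $thresh;

            if ($cur->{E_VALUE} > $other->{E_VALUE}) {
                $dead[$p] = 1;  # $other wins; the outer loop visits it later
                last;
            }
            $dead[$j] = 1;      # $cur wins
        }
    }
    # Survivors, still in P_START order.
    return map { $hsps->[$_] } grep { !$dead[$_] } 0 .. $#$hsps;
}

# Made-up example data, threshold of 10 residues.
my @hsps = (
    { ID => 'h1', SCCS => 'a.1.1.1', P_START => 1,   P_END => 100, E_VALUE => 1e-5  },
    { ID => 'h2', SCCS => 'a.1.1.1', P_START => 10,  P_END => 110, E_VALUE => 1e-20 },
    { ID => 'h3', SCCS => 'b.2.1.1', P_START => 20,  P_END => 90,  E_VALUE => 1e-3  },
    { ID => 'h4', SCCS => 'a.1.1.1', P_START => 105, P_END => 200, E_VALUE => 1e-2  },
);
my @kept = best_hsps_per_family(\@hsps, 10);
print join(' ', map { $_->{ID} } @kept), "\n";    # prints: h2 h3 h4
```

Each discard is then a constant-time flag write rather than an O(N)
splice. The number of comparisons can still grow quadratically when many
HSPs pile up on the same region, so whether this helps in practice would
need measuring on real data.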