[Biodevelopers] Re: splice advice...

Joseph Landman landman at scalableinformatics.com
Wed Aug 27 15:43:47 EDT 2003


Hi Dan:

  I am assuming that your arrays contain not raw HSPs, but some sort of
object representing each HSP, a la BioPerl.  Is this correct?

  See
http://doc.bioperl.org/releases/bioperl-1.2/Bio/Tools/BPlite/HSP.html

  for a way to handle some of the HSP processing.  This might help you
simplify the expression of what you are doing...
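
  On the splicing question itself: one way to avoid splice altogether is
to leave @hsps untouched and remember, per family, which HSP you last
kept, replacing it in place when an overlapping hit has a better
E-value.  Below is a minimal sketch, not a drop-in replacement.  It
assumes the hash keys from your post (SCCS, P_START, P_END, E_VALUE) and
input sorted by P_START; the sub name best_hsps and the sample data are
made up for illustration.  It compares each hit only against the most
recently kept hit of its family, so it may not give the same answer as
your loop in every case.

```perl
use strict;
use warnings;

# Hypothetical helper: one pass over HSPs sorted by P_START, no splice.
# Hash keys (SCCS, P_START, P_END, E_VALUE) follow the original post.
sub best_hsps {
    my ($thresh, @hsps) = @_;
    my @result;    # kept HSPs
    my %last;      # family (SCCS) => index in @result of last kept HSP

    for my $hsp (@hsps) {
        my $fam = $hsp->{SCCS};
        if (exists $last{$fam}) {
            my $kept    = $result[ $last{$fam} ];
            my $overlap = $kept->{P_END} - $hsp->{P_START};
            if ($overlap > $thresh) {
                # Same family, too much overlap: keep the lower E-value.
                $result[ $last{$fam} ] = $hsp
                    if $hsp->{E_VALUE} < $kept->{E_VALUE};
                next;
            }
        }
        # No conflicting hit from this family: keep it and remember where.
        push @result, $hsp;
        $last{$fam} = $#result;
    }
    return @result;
}

# Made-up sample data, already sorted by P_START.
my @hsps = (
    { SCCS => 'a.1.1.1', P_START =>   1, P_END => 100, E_VALUE => 1e-5  },
    { SCCS => 'a.1.1.1', P_START =>  10, P_END => 110, E_VALUE => 1e-20 },
    { SCCS => 'b.2.1.1', P_START =>  20, P_END => 120, E_VALUE => 1e-3  },
    { SCCS => 'a.1.1.1', P_START => 200, P_END => 300, E_VALUE => 1e-8  },
);

my @best = best_hsps(10, @hsps);
print join(' ', $_->{SCCS}, $_->{P_START}, $_->{P_END}), "\n" for @best;
```

  With the sample data the first hit loses to the second (same family,
90 residues of overlap, worse E-value), so the second, third and fourth
hits survive.  Each HSP is looked at once and nothing is shifted inside
the array, so the pass is linear in the number of hits.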

Joe

On Wed, 2003-08-27 at 14:56, Dan Bolser wrote:
> Hello, splice to see you etc.
> 
> I am trying to write a *simple* "best HSP in family"
> algorithm in perl.
> 
> My raw materials are SCOP queries against target sequences.
> 
> I get each set of hits for each protein in turn, sorted
> by P_START (Hsp_query-from).
> 
> I then go through the list and remove any pair of sequences
> with more than $THRESH AA overlap (if they come from the same
> scop family).
> 
> This list removal involves lots of splicing, which is O(N) with
> list size. 
> 
> I figure I could avoid all that splice if I just use pointers
> to array positions, but I can't work out how to do this...
> 
> Maybe splicing is the least of my optimization problems....
> 
> __SKIP__
> 
> preamble
> 
> @hsps = array of HSP hashes, for a particular protein
> each HSP can be from several SCOP sequences.
> 
> __RESUME__
> 
>   my @result;                             # Final HSPs.
>   
>   TOP:while (@hsps){                      # NB: Ordered by Hsp_query-from
>                                           # (for optimization).
> 
>     my $p = 0;	                          # Current HSP pointer.
>     
>     MID:for (my $j=$p+1; $j<@hsps; $j++){ # Overlap slider.
>       
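>       # Family overlap only!  SCCS is presumably a string (e.g.
>       # "b.1.1.1"), so compare with 'ne'; '!=' would compare the
>       # numeric values, which are 0 for such strings.
> 
>       next MID if
>         $hsps[$p]->{SCCS} ne $hsps[$j]->{SCCS};
>       
>       # Optimization: the list is sorted by P_START, so once the
>       # overlap drops below $THRESH no later HSP can reach it.
>       # Leave the loop so that $hsps[$p] is saved below (a 'shift'
>       # plus 'next TOP' here would drop an HSP without saving it).
>       
>       last MID if
>         $THRESH > $hsps[$p]->{P_END} - $hsps[$j]->{P_START};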
> 
>       # Pick best of pair (removing the other from the list).
>       
>       if ( $hsps[$p]->{E_VALUE} > $hsps[$j]->{E_VALUE} ){
>         splice (@hsps, $p, 1);
>         $j--;
>         $p = $j;
>       }
>       else {
>         splice (@hsps, $j, 1);
>         $j--;
>       }
>     }
>     push @result, splice(@hsps, $p, 1);
>   }
>   print "OK\n\n";
> 
> __END_ISH__
> 
> Whaddya think?
> Any better way?
> 
> Cheers, 
> 
> 
> 
> On Wed, 27 Aug 2003, sekhar kavuru wrote:
> 
> > Dear Joseph,
> >  
> > I am a Perl developer with a bioinformatics certification.
> >  
> > I recently developed a software package using BioPerl/Ensembl that provides a Perl/HTML-based database interface to genome data from Ensembl and SwissProt.
> > The browser I developed lets users query the Ensembl database by either CloneId or chromosome number.
> >  
> > If you need any assistance or help please feel free to write to me.
> >  
> > Regards
> >  
> > Sekhar
> > 
> > biodevelopers-request at bioinformatics.org wrote:
> > 
> > Today's Topics:
> > 
> > 1. Re: [BiO BB] perl scripting assistance (Joseph Landman)
> > 
> > --__--__--
> > 
> > Message: 1
> > From: Joseph Landman 
> > To: BiO BB 
> > Cc: biodevelopers 
> > Date: 26 Aug 2003 21:06:11 -0400
> > Subject: [Biodevelopers] Re: [BiO BB] perl scripting assistance
> > Reply-To: biodevelopers at bioinformatics.org
> > 
> > Try the biodevelopers group on bioinformatics.org ...
> > 
> > On Tue, 2003-08-26 at 13:06, Tristan J. Fiedler wrote:
> > > Are any bulletin boards / discussion groups available for obtaining tips
> > > in scripting with perl?
> > > 
> > > Thank you.
> > 
-- 
Joseph Landman, Ph.D
Scalable Informatics LLC
email: landman at scalableinformatics.com
  web: http://scalableinformatics.com
phone: +1 734 612 4615




