[Bioclusters] ncbi blast

Justin Powell bioclusters@bioinformatics.org
Wed, 23 Jun 2004 18:52:01 +0100


I'm using the precompiled 2.2.9 from the NCBI, and the same for my 2.2.6. I
have also compiled 2.2.9 myself on both systems, with the same result.

justin

On Wed, 23 Jun 2004, Joe Landman wrote:

> On Wed, 23 Jun 2004 10:56:58 -0400, Susan Chacko wrote
> > I'm seeing exactly this same problem on our systems. It does not
>
> Hi Susan and Justin:
>
>   Ok.  Which binary are you using?  Did you build it yourself?  Is it one of the
> precompiled ones (including the Scalable Informatics LLC version)?  Would you
> mind sharing it?
>
> > appear to be related to hardware, as I see it on all of the following:
> > - 2.8 GHz Xeon, 2 GB RAM
> > - 2.8 GHz Xeon, 4 GB RAM
> > - 1.8 GHz Athlon, 2 GB RAM
> > - 1.4 GHz Athlon, 2 GB RAM
> > - 866 MHz Pentium III, 1 GB RAM
>
> This strongly suggests software, either application or OS.
>
> >
> > They're all running RH 7.1 with updated kernels. I'm using the NCBI
> > blastall and the NCBI nt db. There aren't any missing libraries,
> > according to ldd.
> >
> > Our databases sit on a Netapp 960 Filer, and at first I thought that
> > was the problem, but I still see the failures when I copy the db to
> > local scratch on the nodes.
>
> Ok.
>
> What happens if you boot one of the units with a "mem=1024M" option, which
> forces Linux to use only 1 GB of RAM?  There are some oddities that happen
> around the 4 GB region (really 3.8 GB or so, depending upon which kernel you
> are using).
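
After rebooting with "mem=1024M", it is worth confirming that the kernel
really sees the reduced memory before re-running the test. A minimal sketch,
assuming /proc/meminfo is available and carries the usual "MemTotal:" line in
kB; the helper name mem_total_kb is hypothetical:

```python
def mem_total_kb(meminfo_text):
    """Return the MemTotal value (in kB) from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemTotal:"):
            # Line looks like: "MemTotal:      1030000 kB"
            return int(line.split()[1])
    raise ValueError("no MemTotal line found")


# Example use on a node (not run here):
#   with open("/proc/meminfo") as f:
#       print(mem_total_kb(f.read()), "kB visible to the kernel")
```

With the option in effect, the reported value should sit a little under
1048576 kB, since the kernel reserves some memory for itself.
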
>
> > The problem appears _only_ with the combination of the nt database and
> > the '-a 2' flag on Blast. It is random, in that I get somewhere
> > between 5 and 20 failures out of 20 Blast runs with the same db+query.
> > I get no failures with other dbs (e.g. est) or if I don't use the -a flag.
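
Because the failure is intermittent, a repeatable count makes it easier to
compare binaries, kernels and flags. A rough sketch of such a harness, assuming
a POSIX system where a process killed by a signal reports a negative return
code; the function name count_segfaults and the query file name are
hypothetical:

```python
import signal
import subprocess


def count_segfaults(cmd, runs=20):
    """Run cmd `runs` times; return (segfaults, other_failures)."""
    segfaults = 0
    other_failures = 0
    for _ in range(runs):
        # Discard output; only the exit status matters for this count.
        result = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                                stderr=subprocess.DEVNULL)
        if result.returncode == -signal.SIGSEGV:
            # On POSIX, a negative return code means "killed by that signal".
            segfaults += 1
        elif result.returncode != 0:
            other_failures += 1
    return segfaults, other_failures


# Example use (not run here; query.fa is a placeholder file name):
#   cmd = ["blastall", "-p", "blastn", "-d", "nt", "-i", "query.fa", "-a", "2"]
#   print(count_segfaults(cmd, runs=20))
```

Running the same loop with and without "-a 2", and against est instead of nt,
gives directly comparable numbers for each combination.
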
>
> Ok, worth a test on my systems as well.  If I build a static binary of blastall,
> would you be willing to give it a try?  RH7.1 means probably i686 optimizations
> at best.  It also means that the 2.96 GCC was probably used.  This compiler had
> some problems generating good (and in some instances, correct) code.
>
> > I've also run Blast with the same db+query on an SGI Origin 3400 with
> > no failures, using -a 2, -a 3, -a 4.
>
> I presume this is with the native MIPSpro compilers (very good compilers BTW)
> and not gcc?
>
> >
> > I've emailed NCBI about this and am waiting for a response.
>
> Ok, please let me know if you would like help.  Thanks.
>
> >
> > Susan.
> >
> > On Jun 16, 2004, at 11:21 AM, Justin Powell wrote:
> >
> > > I'm experiencing trouble with blastall 2.2.9 running blastn on a linux
> > > cluster against a recently downloaded version of the 'nt' database from
> > > ncbi.  Intermittently I get a segmentation fault partway through the
> > > search.
> > >
> > > This happens both with precompiled blast and with blast I compile
> > > myself. It happens on two dual Xeon systems running Red Hat 9 and a
> > > dual Athlon system running Red Hat 7.1.  Both systems have 4 GB RAM.
> > > It happens with several different query sequences, but never with the
> > > est nucleotide database. It also happens if I use fastacmd to dump the
> > > ncbi nt database into fasta format and then formatdb it myself.
> > > Blastdbs are kept locally, so it's not a networking issue.
> > >
> > > Strangely this also happens with blastall 2.2.6 on the Athlon system;
> > > I've not tested it on the Xeon systems (or with other releases).
> > >
> > > So I would guess, given the variety of systems, that it's a bug which
> > > nt provokes specifically - but then I assume huge numbers of people
> > > must use blast to search nt on Linux boxes and would have noticed
> > > already if this were the case. Anyone have any ideas what might be
> > > going on?
> > >
> > >
> > > Justin
> > > jacp1@mole.bio.cam.ac.uk
> > >
> > > _______________________________________________
> > > Bioclusters maillist  -  Bioclusters@bioinformatics.org
> > > https://bioinformatics.org/mailman/listinfo/bioclusters
> >
>
>
> --
> Joseph Landman, Ph.D
> Scalable Informatics LLC,
> email: landman@scalableinformatics.com
> web  : http://scalableinformatics.com
> phone: +1 734 612 4615
>
>