Hi Malay:

Is there any documentation, or are there any papers, describing such a setup? I would assume there would be general interest in seeing how such a setup could be implemented.

I was thinking: instead of duplicating ALL the available databases to the local HD, could some file-staging utility be used to stage only the database to be BLASTed against? Obviously the file-staging utility has to work really quickly on the cluster for this method to be viable.

Thanks,

Bernard

> -----Original Message-----
> From: bioclusters-bounces at bioinformatics.org
> [mailto:bioclusters-bounces at bioinformatics.org] On Behalf Of Malay
> Sent: Wednesday, January 05, 2005 10:23
> To: Clustering, compute farming & distributed computing in
> life science informatics
> Subject: Re: [Bioclusters] Versions of Blast that run on a cluster?
>
> Bernard Li wrote:
> > Hi Malay:
> >
> >>Oops I forgot to mention the third option. This is for a production
> >>machine for very high-end scaling up and requires an ample amount of
> >>disc space on each node. This is to have each node keep its own local
> >>copy of the database, and use input spitting through SGE. This is the
> >>best way to scale up to ~1000 jobs at a time. But because of the
> >>database maintenance issue, this method is advisable only for a
> >>dedicated BLAST farm.
> >
> > You meant 'input splitting' right? And how would you accomplish that
> > using SGE? By scripting it in your job script?
>
> I meant submit each sequence as a separate job.
>
> There is one more way of doing it, which is called the "pull technique".
> Here you store each sequence in an RDBMS. A daemon runs on each node,
> pulls a sequence from the RDBMS, runs it against its own local BLAST
> database, stores the result in an accessible place, and marks the job
> in the RDBMS as "done". A designated node then queries the RDBMS for
> jobs marked done and pulls the results from that place. This method is
> the most efficient of them all, and is used in the BLAST server at NCBI.
>
> -Malay
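
For anyone wanting something concrete to experiment with, here is a rough sketch of the "pull technique" described in the quoted message. It is only an illustration under several assumptions, not how NCBI or anyone on this list actually implements it: sqlite3 stands in for the RDBMS (a real farm would presumably use a networked database server), the `jobs` table and the function names `claim_job`, `work_once` and `collect_done` are made up for this example, and the default `run_blast` assumes the legacy `blastall` binary is on the PATH with a local nucleotide database named "nt".

```python
import sqlite3
import subprocess
import tempfile
from pathlib import Path

# Hypothetical job table: one row per query sequence.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    id          INTEGER PRIMARY KEY,
    sequence    TEXT NOT NULL,
    status      TEXT NOT NULL DEFAULT 'pending',
    worker      TEXT,
    result_path TEXT
)
"""

def claim_job(conn, worker):
    """Claim one pending job for this worker; return (id, sequence) or None."""
    with conn:  # commits on success, rolls back on error
        row = conn.execute(
            "SELECT id, sequence FROM jobs "
            "WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status check in the UPDATE guards against two workers
        # claiming the same row.
        cur = conn.execute(
            "UPDATE jobs SET status = 'running', worker = ? "
            "WHERE id = ? AND status = 'pending'",
            (worker, row[0]),
        )
        if cur.rowcount != 1:
            return None  # another worker claimed it first
    return row

def run_blast(sequence, out_path, db="nt"):
    """Assumed runner: write the query to a temp FASTA file, call blastall."""
    with tempfile.NamedTemporaryFile("w", suffix=".fa", delete=False) as fh:
        fh.write(f">query\n{sequence}\n")
        query = fh.name
    try:
        subprocess.run(
            ["blastall", "-p", "blastn", "-d", db, "-i", query, "-o", out_path],
            check=True,
        )
    finally:
        Path(query).unlink()

def work_once(conn, worker, out_dir, runner=run_blast):
    """One iteration of the per-node daemon: pull a job, run it, mark it done."""
    job = claim_job(conn, worker)
    if job is None:
        return False
    job_id, sequence = job
    out_path = str(Path(out_dir) / f"job_{job_id}.out")
    runner(sequence, out_path)
    with conn:
        conn.execute(
            "UPDATE jobs SET status = 'done', result_path = ? WHERE id = ?",
            (out_path, job_id),
        )
    return True

def collect_done(conn):
    """Designated node: fetch finished jobs and mark them as collected."""
    with conn:
        rows = conn.execute(
            "SELECT id, result_path FROM jobs "
            "WHERE status = 'done' ORDER BY id"
        ).fetchall()
        conn.executemany(
            "UPDATE jobs SET status = 'collected' WHERE id = ?",
            [(job_id,) for job_id, _ in rows],
        )
    return rows
```

Each node would call `work_once` in a loop (sleeping when it returns False), and the designated node would call `collect_done` periodically. The runner is passed in as a parameter so the queue logic can be tried without BLAST installed.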