Bernard Li wrote:
> Hi Malay:
>
>> Oops, I forgot to mention the third option. This is for
>> production machines for very high-end scaling up and requires
>> an ample amount of disc space on each node. This is to have each
>> node keep its own local copy of the database, and use input spitting
>> through SGE. This is the best way to scale up to ~1000 jobs at a
>> time. But because of the database maintenance issue, this method
>> is advisable only for a dedicated BLAST farm.
>
> You meant 'input splitting' right? And how would you accomplish that
> using SGE? By scripting it in your job script?

Yes, input splitting. I meant submitting each sequence as a separate job.

There is one more way of doing it, which is called the "pull technique".
You store each sequence in an RDBMS. A daemon runs on each node, pulls a
sequence from the RDBMS, runs it against its own local BLAST database,
stores the result in an accessible place, and marks the job in the RDBMS
as "done". A designated node then queries the RDBMS for jobs marked done
and pulls the results from that place. This method is the most efficient
of them all, and is used in the BLAST server at NCBI.

-Malay
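
To make the pull technique a bit more concrete, here is a minimal sketch in Python. It is only an illustration under several assumptions, not how NCBI or anyone else actually implements it: SQLite stands in for the RDBMS, the table layout and the function names (split_fasta, enqueue, claim_job, run_worker, collect_done) are all made up for this example, results are kept in the same table rather than in a separate accessible place, and the `search` callable is a placeholder for running BLAST against the node's local database.

```python
import sqlite3

# Hypothetical job table; SQLite stands in for the shared RDBMS.
SCHEMA = """
CREATE TABLE IF NOT EXISTS jobs (
    id       INTEGER PRIMARY KEY,
    header   TEXT NOT NULL,
    sequence TEXT NOT NULL,
    status   TEXT NOT NULL DEFAULT 'pending',
    worker   TEXT,
    result   TEXT
)
"""


def split_fasta(text):
    """Yield (header, sequence) pairs, one per FASTA record."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def enqueue(conn, fasta_text):
    """Store each sequence as its own pending job; return how many were queued."""
    records = list(split_fasta(fasta_text))
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO jobs (header, sequence) VALUES (?, ?)", records
        )
    return len(records)


def claim_job(conn, worker):
    """Claim one pending job for this worker; return None when none are left."""
    while True:
        row = conn.execute(
            "SELECT id, header, sequence FROM jobs "
            "WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        with conn:
            # The status check in the WHERE clause makes the claim conditional,
            # so a job already taken by another worker is not claimed twice.
            cur = conn.execute(
                "UPDATE jobs SET status = 'running', worker = ? "
                "WHERE id = ? AND status = 'pending'",
                (worker, row[0]),
            )
        if cur.rowcount == 1:
            return row
        # Someone else claimed it first; look for the next pending job.


def run_worker(conn, worker, search):
    """Daemon side: pull jobs until the queue is empty; return jobs completed.

    `search` is a placeholder for a BLAST run against the local database.
    """
    completed = 0
    while True:
        job = claim_job(conn, worker)
        if job is None:
            return completed
        job_id, header, sequence = job
        result = search(header, sequence)
        with conn:
            conn.execute(
                "UPDATE jobs SET status = 'done', result = ? WHERE id = ?",
                (result, job_id),
            )
        completed += 1


def collect_done(conn):
    """Collector side: fetch results of jobs marked done, then mark them collected."""
    with conn:
        rows = conn.execute(
            "SELECT id, header, result FROM jobs "
            "WHERE status = 'done' ORDER BY id"
        ).fetchall()
        conn.executemany(
            "UPDATE jobs SET status = 'collected' WHERE id = ?",
            [(job_id,) for job_id, _, _ in rows],
        )
    return [(header, result) for _, header, result in rows]
```

On a real cluster each node would connect to a shared database server instead of a local SQLite file, and a long-running daemon would sleep and poll again when claim_job returns None rather than exiting. The enqueue step is also where the "one sequence per job" splitting happens, so the same split_fasta helper could feed separate SGE submissions instead of a job table.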