[BiO BB] Looking for researcher, to assist on blast-like invention

DT theoriste at gmail.com
Tue Feb 12 11:46:03 EST 2008


By the way, in case you didn't know: nr is a protein database, and it can
be downloaded by FTP from NCBI.

On Feb 12, 2008 11:44 AM, DT <theoriste at gmail.com> wrote:

>
> On Feb 11, 2008 6:56 PM, Theodore H. Smith <delete at elfdata.com> wrote:
>
> >
> > On 11 Feb 2008, at 22:28, Ryan Golhar wrote:
> >
> > > Why don't you write up a paper describing the algorithm in detail and
> > > submit it to a bioinformatics journal?  And, why not make the
> > > executable
> > > available with documentation so that people can download it and try it
> > > out for themselves.
> > >
> > > Do you have any test cases that show it runs faster/better than BLAST?
> > > Describe them and make them available.
> >
> > The first thing I'd need to do is make a good test. I'm not sure what
> > constitutes "a good test", in this case.
>
>
>
> NR ALL VS ALL: This will test speed and, to some extent, performance. The
> nr (non-redundant) database from NCBI is a good template database to start
> testing with. Run BLAST all-against-all on nr, that is, each entry in nr
> versus all of nr, then do the same with your algorithm and compare
> performance. You can generate a ROC plot for BLAST vs your algorithm
> against a known set of homologs and distant homologs, based on a p-value
> or significance-level cutoff.
>
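
A minimal sketch of the ROC comparison described above, in Python. It
assumes each tool's output has already been reduced to a list of
(significance, is_true_homolog) pairs judged against the known homolog set;
the names roc_points and auc are made up for illustration and are not part
of BLAST or any other package.

```python
def roc_points(hits):
    """Return ROC points (false positive rate, true positive rate).

    hits: list of (significance, is_true_homolog) pairs, where a smaller
    significance value (p-value or E-value) means a more confident hit.
    The cutoff is relaxed step by step from strictest to loosest.
    """
    pos = sum(1 for _, is_true in hits if is_true)
    neg = len(hits) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one true and one false hit")

    ordered = sorted(hits, key=lambda h: h[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ordered):
        cutoff = ordered[i][0]
        # Hits tied at the same significance enter at the same cutoff.
        while i < len(ordered) and ordered[i][0] == cutoff:
            if ordered[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

Computing roc_points once for the BLAST hits and once for the new
algorithm's hits, on the same query set, gives two curves that can be
plotted together or summarized by auc.
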
> A real randomization test of sensitivity and specificity would be this:
> take known sequences in nr (all or some of them) and scramble them by
> "homologous recombination", that is, create chimeras of known sequences
> using different randomization criteria: by domain (criteria based on
> domain annotation), or by individual sequence using a known randomization
> function. Then test the sensitivity and specificity of BLAST vs your
> algorithm at detecting the originating sequences that created the
> chimeras.
>
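
A rough sketch of the chimera test, in Python. It uses the simplest possible
randomization function (one random cut point per parent) rather than domain
annotation, and the names make_chimera and recovery_rate are hypothetical.
The sequences are assumed to be already loaded into a dict of ID to
sequence.

```python
import random


def make_chimera(parents, rng):
    """Splice the start of one parent onto the end of another.

    parents: dict mapping sequence ID -> sequence (each at least 2 long).
    rng: a random.Random instance, seeded so the test set is reproducible.
    Returns (chimera_sequence, (id_of_first_parent, id_of_second_parent)).
    """
    id_a, id_b = rng.sample(sorted(parents), 2)
    seq_a, seq_b = parents[id_a], parents[id_b]
    # Cut strictly inside each parent so both contribute at least one residue.
    cut_a = rng.randint(1, len(seq_a) - 1)
    cut_b = rng.randint(1, len(seq_b) - 1)
    return seq_a[:cut_a] + seq_b[cut_b:], (id_a, id_b)


def recovery_rate(truth, reported):
    """Fraction of true parent IDs that a search tool reported.

    truth: dict chimera ID -> tuple of its true parent IDs.
    reported: dict chimera ID -> set of IDs the tool returned as hits.
    """
    found = total = 0
    for chimera_id, parent_ids in truth.items():
        hits = reported.get(chimera_id, set())
        found += sum(1 for p in parent_ids if p in hits)
        total += len(parent_ids)
    return found / total
```

recovery_rate only measures sensitivity. Specificity would also need the
count of reported hits that are not true parents, measured at the same
significance cutoff for both tools.
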
> You will also need to check the performance of your algorithm against
> nucleotide sequences. There are already test cases in BLAST for
> mouse-vs-human; those would be a good place to start.
>
> Deanne Taylor
>
>
>


