[ssml] [Fwd: Re: Request to mailing list ssml-general rejected]

Tue Dec 16 04:20:15 EST 2003

Using current state-of-the-art bioinformatics tools/software, what is the
preferred method of *identifying EST sequences* for the subtraction procedure of a
cDNA library?

In order to decrease the abundant messages which dominate cDNA libraries, I hope
to identify the longest, most abundant, and annotatable (based on, e.g., swissprot)
ESTs. I would like to get expert opinions on how to most effectively go about it.
I have several thousand ESTs and would, for at least this first round, like to
identify the 96 clones which are the most abundant/longest/annotatable.
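
To make the selection criteria concrete, here is a minimal Python sketch of the
ranking step. It assumes each cluster of ESTs has already been summarised by its
member count, consensus length, and whether it has a protein database hit. The
Cluster record, the pick_clones name, the example values, and the priority order
(abundance first, then annotation, then length) are all illustrative
assumptions, not an established protocol.

```python
from typing import NamedTuple


class Cluster(NamedTuple):
    name: str        # contig/cluster identifier (hypothetical)
    n_ests: int      # number of member ESTs, used as the abundance measure
    length: int      # consensus length in bases
    annotated: bool  # True if the consensus has a protein database hit


def pick_clones(clusters, n=96):
    """Return up to n clusters ranked by abundance, then annotation, then length.

    The priority order is an assumption; reorder the key tuple to change it.
    """
    ranked = sorted(
        clusters,
        key=lambda c: (c.n_ests, c.annotated, c.length),
        reverse=True,
    )
    return ranked[:n]


# Made-up example values, for illustration only.
clusters = [
    Cluster("contig1", 40, 900, True),
    Cluster("contig2", 55, 600, False),
    Cluster("contig3", 40, 1200, True),
    Cluster("contig4", 3, 1500, True),
]

for c in pick_clones(clusters, n=3):
    print(c.name, c.n_ests, c.length, c.annotated)
```

One representative clone per selected cluster (for example the longest member
EST) would then go onto the 96-well plate.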

Approaches I have considered are :

1. Running the entire dataset through CAP3 to produce contigs, then taking the
consensus sequence for each contig and running a blastx (translated nucleotide
query) against swissprot to see if it is annotatable.
2. Running an all-against-all blast search using the ESTs as both the query and
the database. Additionally, one could make the database a combination of both the
ESTs and swissprot, thus indicating not only which sequences have
similar/identical matches within the EST database, but also whether they have a
homolog in swissprot.
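
For the all-against-all approach, one possible way to turn the hits into
abundance estimates is single-linkage clustering. The sketch below assumes the
BLAST results are in the standard 12-column tab-delimited format (query,
subject, percent identity, alignment length, ...). The cluster_hits name, the
95% identity and 100-base length cutoffs, and the example hit lines are
assumptions chosen for illustration; appropriate cutoffs depend on the data.

```python
from collections import defaultdict


def cluster_hits(lines, min_identity=95.0, min_length=100):
    """Single-linkage clustering of ESTs from tabular BLAST hit lines.

    Two ESTs are joined when a hit between them passes both cutoffs.
    Returns a list of sets of EST names, largest cluster first.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path compression.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        identity, aln_len = float(fields[2]), int(fields[3])
        # Register both names so unlinked ESTs still form singleton clusters.
        root_q, root_s = find(query), find(subject)
        if query != subject and identity >= min_identity and aln_len >= min_length:
            parent[root_q] = root_s

    groups = defaultdict(set)
    for est in list(parent):
        groups[find(est)].add(est)
    return sorted(groups.values(), key=len, reverse=True)


# Made-up hit lines, for illustration only.
hits = [
    "est1\test1\t100.0\t500\t0\t0\t1\t500\t1\t500\t0.0\t900",
    "est1\test2\t98.5\t400\t6\t0\t1\t400\t50\t449\t0.0\t700",
    "est2\test3\t97.0\t350\t10\t0\t1\t350\t1\t350\t0.0\t600",
    "est4\test4\t100.0\t450\t0\t0\t1\t450\t1\t450\t0.0\t800",
    "est4\test1\t88.0\t120\t14\t0\t1\t120\t1\t120\t1e-20\t100",
]

clusters = cluster_hits(hits)
for members in clusters:
    print(len(members), sorted(members))
```

Cluster size then serves as the abundance measure. Note that an EST appearing
in no hit line at all (not even a self-hit) is not reported by this sketch.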

Does anything exist in bioperl which performs the necessary sequence analysis for
subtraction of a cDNA library?

BTW, if these are not the correct listserv/bulletin boards for such a query,
please let me know the preferred location.

Thank you and Happy Holidays!

Tristan Fiedler

-- 
Tristan J. Fiedler, Ph.D.
Postdoctoral Research Fellow - Walsh Laboratory
NIEHS Marine & Freshwater Biomedical Sciences Center
Rosenstiel School of Marine & Atmospheric Sciences
University of Miami

tfiedler at rsmas.miami.edu
t.fiedler at umiami.edu (alias)
305-361-4626