[BiO BB] Efficient way to retrieve full length cDNA sequences from GenBank?
dale richardson
dalesan at gmail.com
Wed Apr 1 15:17:15 EDT 2009
Hello All,
Please forgive me if this post comes off as inexperienced, but if any
of you have the time I would like to hear your suggestions on the
following problem.
I've got a set of genomic DNA sequences for a number of species. What
I want to do is to obtain only full-length cDNA matches to these
genomic sequences from GenBank, excluding RefSeq sequences. What I've
been doing so far is blasting these genomic sequences against the nr
nucleotide database and manually evaluating which hits to keep or
discard, depending on how much of the subject sequence the alignment to
the query covers. While this method may be suitable for organisms with poorly
characterized expression data, when trying to do this for mouse or
human the task becomes entirely daunting.
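
To make the manual step concrete, here is a rough Python sketch of the kind of filter I have in mind. It is only an illustration under several assumptions, not a tested pipeline: it assumes tab-separated BLAST hits with the columns qseqid, sseqid, pident, length, sstart, send, slen (in BLAST+ this layout can be requested with -outfmt "6 qseqid sseqid pident length sstart send slen"), it treats any accession with a two-letter prefix and underscore (NM_, XM_, NR_ and so on) as RefSeq, and the 0.95 coverage cutoff and all function names are my own placeholders. High subject coverage only shows that most of the cDNA aligns to the genomic query; it does not by itself prove the cDNA is full length.

```python
import re
from collections import defaultdict

# RefSeq accessions carry a two-letter prefix plus underscore (NM_, XM_, NR_, ...);
# ordinary GenBank accessions contain no underscore.
REFSEQ_ACCESSION = re.compile(r"^[A-Z]{2}_\d+")


def is_refseq(sseqid):
    """True if any '|'-separated token of the subject ID looks like a RefSeq accession."""
    return any(REFSEQ_ACCESSION.match(token) for token in sseqid.split("|"))


def covered_length(intervals):
    """Number of subject bases covered by 1-based inclusive intervals, merging overlaps."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def filter_hits(lines, min_coverage=0.95):
    """Keep non-RefSeq subjects whose combined HSPs cover >= min_coverage of the subject.

    Assumed columns (hypothetical layout): qseqid sseqid pident length sstart send slen
    """
    hsps = defaultdict(list)
    subject_len = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        qseqid, sseqid, _pident, _length, sstart, send, slen = line.split("\t")
        if is_refseq(sseqid):
            continue
        a, b = int(sstart), int(send)
        # Minus-strand hits report sstart > send, so normalise the order.
        hsps[(qseqid, sseqid)].append((min(a, b), max(a, b)))
        subject_len[sseqid] = int(slen)

    kept = []
    for (qseqid, sseqid), intervals in sorted(hsps.items()):
        coverage = covered_length(intervals) / subject_len[sseqid]
        if coverage >= min_coverage:
            kept.append((qseqid, sseqid, coverage))
    return kept
```

One would call filter_hits on the lines of the tabular BLAST output and get back (query, subject, coverage) tuples for the hits worth a closer look. The cutoff probably needs loosening for cDNAs with long poly-A tails, since those bases will not align to genomic sequence.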
So my question is this:
What is the most efficient way to obtain a set of cDNA sequences that
match to a set of genomic DNA sequences while excluding spurious
hits, RefSeq sequences and "pseudo" full-length cDNAs?
As you can imagine, I am interested in looking for alternative splice
variants for a number of genes.
Any information or help that you could graciously muster would be very
much appreciated.
with sincere regards,
dale richardson