[BiO BB] Starting point EST annotation/grouping
Alex.Bossers at wur.nl
Tue Aug 15 01:20:11 EDT 2006
In our case the unigene list is the list we ended up with after
assembly (so basically the singletons and contigs). To make it a more
complete unigene list, we did a second round of clustering and
assembly after adding the entries that gave us a BLAST hit in the
nr/nt database (excluding BAC clones etc.). This also grouped
contigs/singletons that did not yet overlap in our dataset.
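
To make the effect of that second round concrete, here is a minimal Python sketch. It is a simplification and not what we actually ran: instead of re-clustering and re-assembling with the reference sequences added, it only groups contigs/singletons that share the same best BLAST hit. The function name, entry names and accessions (refA, refB) are made up for illustration.

```python
from collections import defaultdict

def group_by_reference(entries, best_hit):
    """Group contigs/singletons that share a best-hit reference.

    entries  : list of contig/singleton IDs (hypothetical names)
    best_hit : dict mapping an entry ID to the accession of its best hit
    Entries without a hit stay in a group of their own.
    """
    groups = defaultdict(list)
    for entry in entries:
        if entry in best_hit:
            key = ("ref", best_hit[entry])   # grouped via a shared reference
        else:
            key = ("self", entry)            # no hit: remains its own unigene
        groups[key].append(entry)
    return dict(groups)

entries = ["contig1", "contig2", "singleton7", "singleton9"]
best_hit = {"contig1": "refA", "singleton7": "refA", "contig2": "refB"}

unigenes = group_by_reference(entries, best_hit)
for key, members in unigenes.items():
    print(key, members)
```

Here contig1 and singleton7 do not overlap each other but land in one group because both hit refA, which is the kind of merging the second round gave us.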
From: bio_bulletin_board-bounces+alex.bossers=wur.nl at bioinformatics.org
[mailto:bio_bulletin_board-bounces+alex.bossers=wur.nl at bioinformatics.org]
On behalf of Ahmed Moustafa
Sent: Friday 11 August 2006 23:00
To: Bossers, Alex
CC: bio_bulletin_board at bioinformatics.org
Subject: Re: [BiO BB] Starting point EST annotation/grouping
Thank you so much for your reply.
How do you generate the unigene list after assembling the ESTs?
On 8/11/06, Bossers, Alex <Alex.Bossers at wur.nl > wrote:
No I did not use the PHRED/PHRAP/CONSED package for this (I did try it
but found the TGICL suite more appropriate for ESTs).
For basecalling I used a Windows-based caller, since PHRED at
that time did not support the dyes we used for sequencing.. :(
After that, TGICL basically uses megablast to cluster all sequences
into groups and then assembles them using the cap3 assembler.
Clustering and assembling is always difficult to explain. As far as I
understand, clustering large groups of sequences first speeds up the
second step, assembly. Basically you end up with contigs (having more
than one sequence) or singletons.
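
As a rough illustration of the clustering step only, here is a minimal Python sketch. It assumes you already have a list of pairwise hits (as a megablast-style all-against-all comparison would give) and groups sequences transitively with a small union-find. The function name and the EST IDs are hypothetical, and this is not the TGICL code itself.

```python
from collections import defaultdict

def cluster_by_hits(seq_ids, hits):
    """Transitive grouping: if A hits B and B hits C, all three share a cluster."""
    parent = {s: s for s in seq_ids}

    def find(x):
        # Follow parent links to the cluster representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a      # merge the two clusters

    groups = defaultdict(list)
    for s in seq_ids:
        groups[find(s)].append(s)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])

ests = ["est1", "est2", "est3", "est4", "est5"]
hits = [("est1", "est2"), ("est2", "est3")]   # made-up pairwise hits

clusters = cluster_by_hits(ests, hits)
to_assemble = [c for c in clusters if len(c) > 1]    # would go to the assembler
unclustered = [c[0] for c in clusters if len(c) == 1]
print(to_assemble, unclustered)
```

In a real run each multi-member cluster is then handed to the assembler separately, which is where the speed-up comes from. The assembler may still split a cluster into several contigs or leave some reads as singletons, so the clusters here are not yet the final contigs.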
By unigene list I mean a list of all the different genes present in my
sample of 13k, like the Gene Indices lists for species available at TIGR.
Now the next steps.