Hi,<br>
<br>
I have started work on the clustering and assembly of 3'-sequenced
ESTs. Because of the nature of the sequencing process, we can be
certain that each EST represents the extreme 3' end of an expressed
transcript.<br>
<br>
To allow for incorrect base calling and to determine which
transcripts are detected most frequently, I wish to cluster and
assemble these ESTs into contigs and derive consensus sequences.<br>
<br>
My problem is that clustering and assembly software does not take into
account that every EST under investigation is a confirmed extreme 3'
end. Where a single gene undergoes alternative polyadenylation, the
software assimilates the genuine 3' terminal sequence of the shorter
transcript into an upstream position within the longer transcript,
so the distinct 3' ends are merged into one contig.<br>
<br>
Does anyone have experience of similar problems or approaches? Any help or direction would be sincerely appreciated.<br>
<br>
Regards, <br>
<br>
Dr Hulk Norris<br>
Principal Bioinformatician<br>