Dear All,

I have a very basic problem and wonder how others have solved it. I want to build a unigene collection from a large EST database. We have chromat files in ABI format, and I use Linux on the Intel platform. I have phred and phrap running, but since phrap was originally designed for genomic sequences, we get lots of misassemblies on poly-A or poly-T stretches.

Therefore I installed the TIGR TGICL package, which is designed for EST databases and also runs very well on multi-node machines. However, it takes multi-FASTA files (and corresponding, optional, quality files) as input. I wanted to use the phred package to generate the required FASTA and quality files. This runs fine, but the ">name" line of the FASTA file contains additional information separated by spaces. These files are not accepted by TGICL.

Is there an easy Unix (Linux) utility to convert these multi-FASTA files and FASTA quality files into simple ">name {CRT} seq" files so that they can be used as input for TGICL? Or is a conversion utility available to convert/extract phred's phd files into FASTA sequence and FASTA quality files?

Any help would be appreciated,
Alex
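
P.S. To make the request concrete, here is a minimal sketch of the conversion I have in mind, written in Python. It assumes that the sequence name is the first whitespace-delimited token on each ">" line and that everything after it can be dropped. The function names and the example header in the comment are made up for illustration; they are not part of phred or TGICL. Since the quality file uses the same ">" header lines, the same routine should work for it too.

```python
import sys


def simplify_fasta(lines):
    """Yield FASTA lines with every header cut down to '>name'.

    Assumption: the name is the first whitespace-delimited token
    after '>'; any further fields on the header line are discarded.
    Sequence (or quality) lines are passed through unchanged.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            tokens = line[1:].split()
            # e.g. a hypothetical header ">read1.ab1 1234 0 1234 ABI"
            # becomes ">read1.ab1"
            yield ">" + (tokens[0] if tokens else "")
        else:
            yield line


def convert(in_path, out_path):
    """Rewrite the file at in_path to out_path with simplified headers."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for out_line in simplify_fasta(src):
            dst.write(out_line + "\n")


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (hypothetical script name): python simplify_fasta.py in.fasta out.fasta
    convert(sys.argv[1], sys.argv[2])
```

Run once on the sequence file and once on the quality file, this would give two files whose header lines hold only the read name.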