[Biococoa-dev] Even more on sequence formats

Tue Apr 11 06:08:27 EDT 2006

On Apr 11, 2006, at 5:44 AM, Alexander Griekspoor wrote:

> - I've added support for reading sequence files that weren't saved  
> as plain-text but as rtf, basically adding a check and converting  
> the file to plain text before continuing with the normal format  
> determination

That would be a great addition.
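
A minimal sketch of what such a check might look like, written in Python rather than the project's Objective-C. It assumes only that RTF files begin with the `{\rtf` signature; `looks_like_rtf` is a hypothetical name and not part of BioCocoa, and the actual conversion to plain text is deliberately left out.

```python
def looks_like_rtf(data: bytes) -> bool:
    """Return True if the raw file contents start with the RTF signature."""
    # RTF documents open with the group "{\rtf" followed by a version number.
    # Leading whitespace is skipped in case the file was padded.
    return data.lstrip().startswith(b"{\\rtf")
```

If the check succeeds, the reader would convert the contents to plain text first and only then run the normal format determination.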

> - I've changed the raw reading method such that it becomes more  
> greedy. Peter's variant reads in all lines as separate entries in  
> the matrix dictionary (which is probably what you want in aligned  
> phylogenetic sequence files, but not in EnzymeX, where people  
> usually read in a single sequence file). So I remove all return  
> characters first. I had one person complaining that EnzX only read  
> the first line when he tried to open his text file, nicely  
> formatted in 80-character columns.
>

BCSequenceReader now works very differently from the way BCReader does.  
I am not using a matrixDictionary anymore (see my comments in the  
previous mail, and the actual code). The current BCSequenceReader  
already reads a raw file in as a whole.
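
To illustrate the idea of reading a raw file as a whole, here is a small sketch in Python (not the actual BCSequenceReader code, which is Objective-C). `read_raw_sequence` is a hypothetical name, and stripping spaces and tabs along with the line breaks is an assumption on my part.

```python
def read_raw_sequence(text: str) -> str:
    """Join a raw, line-wrapped sequence into one continuous string."""
    # str.split() with no arguments splits on any run of whitespace,
    # so line breaks (\n, \r\n, \r), spaces and tabs are all dropped.
    return "".join(text.split())


# A file wrapped over several lines comes back as a single sequence.
print(read_raw_sequence("ACGT\nTTGA\n"))  # ACGTTTGA
```

This way a file wrapped at 80 columns yields one sequence instead of one entry per line.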

BTW, be careful with what you put in your README for EnzymeX. I  
noticed you wrote that it supports BEAST and TNT formats, but I don't  
see those in BCReader. Unless you have a local version that does, of  
course ;-) I'll have a look at the readseq code later.

cheers,

- Koen.