[Biococoa-dev] read Fasta File with gap symbols

Scott Christley schristley at mac.com
Wed Apr 1 14:21:32 EDT 2009


Hello Stephan,

Yes, this is an issue. BioCocoa attempts to determine the sequence
type by looking at the symbols. Once it makes a decision, it strips any
unknown symbols. Of course, it can make a wrong decision or, more likely
in your case, it treats the gap symbol as unknown.
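
To illustrate the failure mode, here is a rough sketch in Python. It is
not BioCocoa's code: the alphabets, the tie-breaking rule, and the names
guess_alphabet and strip_unknown are all assumptions made up for this
example. It only shows how "guess the type, then strip what does not
fit" ends up discarding gaps.

```python
# Hypothetical illustration only; this is NOT how BioCocoa is implemented.
DNA_SYMBOLS = set("ACGTN")
PROTEIN_SYMBOLS = set("ACDEFGHIKLMNPQRSTVWY")

def guess_alphabet(seq):
    """Pick the alphabet that matches the most symbols (ties go to DNA)."""
    s = seq.upper()
    dna_hits = sum(ch in DNA_SYMBOLS for ch in s)
    protein_hits = sum(ch in PROTEIN_SYMBOLS for ch in s)
    return DNA_SYMBOLS if dna_hits >= protein_hits else PROTEIN_SYMBOLS

def strip_unknown(seq):
    """Drop every symbol that is not in the guessed alphabet."""
    alphabet = guess_alphabet(seq)
    return "".join(ch for ch in seq if ch.upper() in alphabet)

aligned = "ACG--TTA-G"
print(strip_unknown(aligned))  # prints ACGTTAG: the three gaps are gone
```

Since "-" is in neither alphabet, it is dropped regardless of which
type is guessed, and the alignment columns are lost.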

Unfortunately, there isn't a direct workaround unless you are willing
to modify the BioCocoa source code. There are just a few lines in
BCSequence.m that you can comment out so that the stripping step is
skipped.

I think BioCocoa probably needs to be changed so that it doesn't  
modify the sequence data at all, and the user is responsible for  
initiating a sequence type check and/or filtering.
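
A minimal sketch of that design, again in Python rather than against
the BioCocoa API: the reader returns the sequence data exactly as it
appears in the file, and filtering happens only when the caller asks
for it. The names read_fasta and filter_symbols are hypothetical and
exist only for this example.

```python
import io

def read_fasta(handle):
    """Yield (header, sequence) pairs; sequence symbols are left untouched."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # no type guessing, no stripping
    if header is not None:
        yield header, "".join(chunks)

def filter_symbols(seq, allowed):
    """Opt-in filtering: keep only the symbols the caller explicitly allows."""
    return "".join(ch for ch in seq if ch in allowed)

example = io.StringIO(">seq1\nACG--TTA\n-G\n>seq2\nAC-GTTTAAG\n")
records = list(read_fasta(example))

# Gaps survive parsing; the caller decides whether to remove them.
with_gaps = records[0][1]
without_gaps = filter_symbols(with_gaps, set("ACGT"))
```

With this split, an alignment file keeps its "-" columns by default,
and a caller who wants ungapped sequences applies the filter with
whatever alphabet is appropriate.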

cheers
Scott

On Apr 1, 2009, at 3:05 AM, Stephan wrote:

> Hi,
>
> I am new to BioCocoa and was wondering whether there is a way of  
> parsing FASTA files that contain alignment-information, i.e. they  
> include sequences with the gap-symbol "-".
> Right now, if I parse the file, the gaps are filtered out.
>
> Any suggestions?
>
> Best,
> Stephan
