[Biococoa-dev] More on sequence formats

Koen van der Drift kvddrift at earthlink.net
Sun Apr 9 21:12:47 EDT 2006


Hi,

This weekend I did some more work on converting the format readers
to the new BCSequence structure. For many of the formats, I have
included a comment in the method describing the structure of the
format, along with a simple example. This is not only for
documentation, but also to help with the coding ;-)

I have a couple more questions before I can continue:

1. What's the difference between Nexus and Nexusfileandblocks?

2. How is the nona format defined? I couldn't find anything about it.

3. The MSF code currently uses the string "Pileup" as a selector.
However, when searching for the format definition, I found that the
format uses '!!NA' or '!!AA' instead. I may have found the wrong
info, so if anyone knows which is correct, please let me know.

4. I am thinking about adding a plist file to the framework that
contains the file extensions of all possible sequence file types.
This can then be used in openPanel (see the code that Alex supplied).
The nice thing about this is that we can keep the entries in sync
with the methods in BCSequenceReader. Any reason I should not do this?
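
To make the plist idea in item 4 concrete, here is a minimal sketch. It
is written in Python rather than Objective-C so it can be run on its
own. The format names, the extensions, and the helper function names
are all assumptions for illustration, not the contents of an actual
BioCocoa resource file.

```python
import plistlib

# Hypothetical plist contents: format name -> list of file extensions.
# These entries are illustrative only, not BioCocoa's actual list.
SEQUENCE_FORMATS = {
    "Fasta": ["fasta", "fa"],
    "MSF": ["msf"],
    "Nexus": ["nex", "nexus"],
}


def write_plist(formats):
    """Serialize the format table to XML plist bytes."""
    return plistlib.dumps(formats)


def allowed_extensions(plist_bytes):
    """Return a sorted, de-duplicated list of extensions for an open panel."""
    formats = plistlib.loads(plist_bytes)
    extensions = set()
    for ext_list in formats.values():
        extensions.update(ext.lower() for ext in ext_list)
    return sorted(extensions)


def format_for_extension(plist_bytes, extension):
    """Look up which format name (and so which reader) handles an extension."""
    formats = plistlib.loads(plist_bytes)
    for name, ext_list in formats.items():
        if extension.lower() in ext_list:
            return name
    return None
```

Keying the table by format name is one possible layout. It would let
the same file drive both the open panel filter and the choice of
reader method, which is what keeps the two in sync.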


For those who feel like helping out, the way to implement the code is:

- remove blank lines (optional)
- get each line
- extract annotations into a BCAnnotationsArray
- extract the sequence(s) into an NSString
- once done with all the sequences, create a BCSequence from each  
sequenceString
- add the annotations to each BCSequence
- add the new BCSequence(s) to the BCSequenceArray
- return the BCSequenceArray
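
The steps above can be sketched roughly as follows for a simple
'>'-headed format. This is Python, not the Objective-C used in the
framework, and the Sequence class, the plain list, and the "name"
annotation key are hypothetical stand-ins for BCSequence,
BCSequenceArray, and the annotations array.

```python
from dataclasses import dataclass, field


@dataclass
class Sequence:
    """Stand-in for BCSequence: a sequence string plus its annotations."""
    sequence_string: str
    annotations: dict = field(default_factory=dict)


def read_fasta_like(text):
    """Follow the steps above for a simple '>'-headed format."""
    # Remove blank lines and get each remaining line.
    lines = [line.strip() for line in text.splitlines() if line.strip()]

    records = []  # one (annotations, sequence-string parts) pair per entry
    for line in lines:
        if line.startswith(">"):
            # Extract annotations; here just the header text as a name.
            records.append(({"name": line[1:].strip()}, []))
        elif records:
            # Extract the sequence into a string, piece by piece.
            # Lines before the first header are ignored.
            records[-1][1].append(line)

    # Once all sequence strings are collected, create the sequences,
    # attach the annotations, and add each one to the result array.
    sequence_array = []
    for annotations, parts in records:
        sequence = Sequence("".join(parts))
        sequence.annotations.update(annotations)
        sequence_array.append(sequence)
    return sequence_array
```

Collecting the raw strings first and building the sequence objects
only at the end mirrors the order of the steps in the list.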


cheers,

- Koen.


