[Biococoa-dev] More on sequence formats
Koen van der Drift
kvddrift at earthlink.net
Sun Apr 9 21:12:47 EDT 2006
Hi,
This weekend I did some more work on converting the format readers
to the new BCSequence structure. For many of the formats, I have
included a comment in the method describing the structure of the
format, along with a simple example. This is not only for
documentation, but also to help with the coding ;-)
I have a couple more questions before I can continue:
1. What's the difference between Nexus and Nexusfileandblocks?
2. How is the nona format defined? I couldn't find anything about it.
3. The MSF format is currently detected using the string "Pileup" as
a selector. However, when searching for the format definition, I found
that this format uses '!!NA' or '!!AA' instead. I may have found the
wrong info, so if anyone knows which is correct, please let me know.
4. I am thinking about adding a plist file to the framework that
contains all the file extensions of possible sequence files. This can
then be used in openPanel (see the code that Alex supplied). The nice
thing about this is that we can keep the entries in sync with the
methods in BCSequenceReader. Is there any reason I should not do this?
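To make the plist idea in question 4 concrete, here is a minimal sketch,
written in Python rather than Objective-C so it can be run on its own.
The plist layout (format name mapped to a list of extensions), the
extensions listed, and the helper names load_extensions,
allowed_file_types and format_for_extension are all assumptions for
illustration, not existing BioCocoa code.

```python
import plistlib

# Hypothetical plist content: each key is a format name, each value is
# the list of file extensions for that format. The entries are examples.
PLIST_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Fasta</key>
    <array><string>fasta</string><string>fa</string></array>
    <key>MSF</key>
    <array><string>msf</string></array>
    <key>Nexus</key>
    <array><string>nex</string><string>nxs</string></array>
</dict>
</plist>
"""


def load_extensions(plist_bytes):
    """Parse the plist into a dict of format name -> list of extensions."""
    return plistlib.loads(plist_bytes)


def allowed_file_types(formats):
    """Flatten all extensions into one list, e.g. for an open panel filter."""
    result = []
    for name in sorted(formats):
        for extension in formats[name]:
            if extension not in result:
                result.append(extension)
    return result


def format_for_extension(formats, extension):
    """Return the format name for an extension, or None if it is unknown."""
    wanted = extension.lower().lstrip(".")
    for name, extensions in formats.items():
        if wanted in extensions:
            return name
    return None


formats = load_extensions(PLIST_XML)
file_types = allowed_file_types(formats)
```

Because the format names are the plist keys, the same file could drive
both the open panel filter and the choice of reader method, which is
what would keep the two in sync.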
For those who feel like helping out, the way to implement the code is:
- remove blank lines (optional)
- get each line
- extract annotations into a BCAnnotationsArray
- extract the sequence(s) into an NSString
- once done with all the sequences, create a BCSequence from each
sequenceString
- add the annotations to each BCSequence
- add the new BCSequence(s) to the BCSequenceArray
- return the BCSequenceArray
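The steps above can be sketched for a simple FASTA-style reader. This
is a rough illustration in Python, not the BioCocoa implementation: the
Sequence class is a hypothetical stand-in for BCSequence, a plain dict
stands in for BCAnnotationsArray, and a list stands in for
BCSequenceArray.

```python
from dataclasses import dataclass, field


@dataclass
class Sequence:
    """Hypothetical stand-in for BCSequence: residues plus annotations."""
    residues: str
    annotations: dict = field(default_factory=dict)


def read_fasta(text):
    """Follow the steps from the list above for FASTA-formatted text."""
    # Remove blank lines and get each remaining line.
    lines = [line.strip() for line in text.splitlines() if line.strip()]

    # Extract the annotations and collect the sequence lines per record.
    records = []  # list of (annotations, sequence chunks) pairs
    for line in lines:
        if line.startswith(">"):
            records.append(({"description": line[1:].strip()}, []))
        elif records:
            records[-1][1].append(line)

    # Once all records are read, create a sequence from each string,
    # add its annotations, and add it to the result.
    sequences = []
    for annotations, chunks in records:
        sequence = Sequence("".join(chunks))
        sequence.annotations.update(annotations)
        sequences.append(sequence)
    return sequences


example = ">seq1 test\nACGT\n\nTTGA\n>seq2\nMKV\n"
parsed = read_fasta(example)
```

Other formats would differ mainly in the middle step, where the
annotations and sequence lines are told apart.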
cheers,
- Koen.