[Biococoa-dev] BCSequence class cluster? [Was Re: Introducing myself]

Koen van der Drift koenvanderdrift at gmail.com
Sat Feb 28 13:38:53 EST 2009


On Feb 28, 2009, at 12:51 PM, Scott Christley wrote:
>
> * I'm not very enamored with the BCSequenceArray class.  I'm not  
> sure how it is any better than just using a standard NSArray, I  
> tried looking in the archives for discussion but didn't really find  
> anything.  However my guess is that BCSequenceArray would somehow  
> provide additional sequence specific functionality?  Personally, I  
> don't really want to treat BCSequence objects in any special way.  I  
> think it's best if users can include them in standard collections  
> (NSArray, NSDictionary, NSSet) instead of having to use specialized  
> collections.  Thoughts?

I think I added it to be a replacement for NSArray. I cannot really
think what additional functionality it could have had; if I remember,
I'll post it here.


>
> * Craig makes a good point about being able to do -sequenceWithId:  
> to lookup a sequence.  One issue to be aware of is that the id's in  
> the FASTA files are not necessarily unique.  In fact, the definition  
> of the sequences often lies outside of the FASTA file.  Now if you  
> download from  NCBI then you have a good chance of getting unique  
> id's, but take UCSC's goldenPath for example.  If you download the  
> human genome from there, the id just says chr1, chr2, etc.  Mix and  
> match with another organism and you can quickly forget which chr  
> goes with whom.
> So from this perspective, we need to be careful not to rely upon the  
> id's being unique.  Typically id's are unique within a file, but  
> this would really have to be a contract that the user enforces, it  
> is not part of the FASTA format.

This is why we added the BCAnnotation and BCFeature classes. Most data
formats have different labels for name, sequence, authors, etc. I
think the idea was to make our own definitions (BSSequenceName,
BCSequenceAuthor, etc.), and let BCSequenceReader take care of
combining the right annotation and/or feature with the actual
sequence.
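
As a rough illustration of that idea (written in Python for brevity
rather than Objective-C; every name below is hypothetical and none of
it is the BioCocoa API), a reader could translate format-specific
labels into a shared set of keys, and a collection could key sequences
by source plus id so that two files that both contain "chr1" do not
collide:

```python
# Hypothetical sketch; none of these names are BioCocoa API.
# Common annotation keys shared by all readers (assumed names).
SEQUENCE_NAME = "name"
SEQUENCE_AUTHOR = "author"

# Per-format label -> common key. The labels are illustrative only.
LABEL_MAP = {
    "genbank": {"LOCUS": SEQUENCE_NAME, "AUTHORS": SEQUENCE_AUTHOR},
    "swissprot": {"ID": SEQUENCE_NAME, "RA": SEQUENCE_AUTHOR},
}


def normalize(fmt, raw_fields):
    """Translate format-specific labels into the common annotation keys.

    Labels with no entry in the map are dropped.
    """
    mapping = LABEL_MAP[fmt]
    return {
        mapping[label]: value
        for label, value in raw_fields.items()
        if label in mapping
    }


def index_by_source(records):
    """Key sequences by (source, id) instead of id alone.

    `records` is an iterable of (source, seq_id, sequence) tuples, so
    equal ids coming from different files stay distinct.
    """
    return {(source, seq_id): seq for source, seq_id, seq in records}
```

Whether the source should be a file name, an organism, or something
the user supplies is left open here; the only point is that the id
alone is not assumed to be unique.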

- Koen.




