[Biococoa-dev] Even more on sequence formats
Alexander Griekspoor
a.griekspoor at nki.nl
Wed Apr 12 03:54:22 EDT 2006
Thanks Koen, an important file format added -cough-!
On a more relevant note, reading the binary file formats now works on
Intel as well. Here are the updated methods:
- (NSDictionary *)readStriderFile:(NSString *)textFile {
    /*
     Binary file format, read in header, determine features and sequence
     -> create dictionary.
     Header fields are big-endian, so swap them to host byte order before use.
     */
    STRIDER_HEADER *signature;
    NSMutableDictionary *matrixDictionary = [NSMutableDictionary dictionary];
    NSMutableDictionary *striderDictionary = [NSMutableDictionary dictionary];
    NSMutableArray *itemArray = [NSMutableArray arrayWithCapacity:10];
    NSData *data = [NSData dataWithContentsOfFile:textFile];

    // Unreadable file, or too short to contain a header
    if (data == nil || [data length] < sizeof(STRIDER_HEADER)) return nil;

    // Memory alloc and read in struct
    signature = malloc(sizeof(STRIDER_HEADER));
    if (signature == NULL) return nil;
    [data getBytes:signature length:sizeof(STRIDER_HEADER)];

    // Swap both length fields once, then clean up the header copy
    uint32_t seqLength = CFSwapInt32BigToHost(signature->nLength);
    uint32_t comLength = CFSwapInt32BigToHost(signature->com_length);
    free(signature);

    // Truncated file: neither block may be larger than what follows the header
    unsigned long remaining = [data length] - sizeof(STRIDER_HEADER);
    if (seqLength > remaining || comLength > remaining) return nil;

    // Sequence (starts right after the header)
    NSData *seqdata = [data subdataWithRange:NSMakeRange(sizeof(STRIDER_HEADER), seqLength)];
    NSString *sequence = [[NSString alloc] initWithBytes:[seqdata bytes]
                                                  length:[seqdata length]
                                                encoding:NSASCIIStringEncoding];
    NSString *filename = [[textFile lastPathComponent] stringByDeletingPathExtension];
    [matrixDictionary setObject:sequence forKey:filename];
    [itemArray addObject:filename];
    [sequence release];

    // Comments (stored at the very end of the file)
    if (comLength > 0) {
        NSData *comdata = [data subdataWithRange:NSMakeRange([data length] - comLength, comLength)];
        NSString *comments = [[NSString alloc] initWithBytes:[comdata bytes]
                                                      length:[comdata length]
                                                    encoding:NSASCIIStringEncoding];
        [striderDictionary setObject:comments forKey:@"comments"];
        [comments release];
    }

    [striderDictionary setObject:matrixDictionary forKey:@"matrix"];
    [striderDictionary setObject:itemArray forKey:@"items"];
    [striderDictionary setObject:@"DNAStrider" forKey:@"fileType"];
    return striderDictionary;
}
- (NSDictionary *)readGCKFile:(NSString *)textFile {
    /*
     Binary file format, read in header, determine features and sequence
     -> create dictionary.
     Same as DNA Strider, but comments are ignored.
     */
    GCK_HEADER *signature;
    NSMutableDictionary *matrixDictionary = [NSMutableDictionary dictionary];
    NSMutableDictionary *gckDictionary = [NSMutableDictionary dictionary];
    NSMutableArray *itemArray = [NSMutableArray arrayWithCapacity:10];
    NSData *data = [NSData dataWithContentsOfFile:textFile];

    // Unreadable file, or too short to contain a header
    if (data == nil || [data length] < sizeof(GCK_HEADER)) return nil;

    // Memory alloc and read in struct
    signature = malloc(sizeof(GCK_HEADER));
    if (signature == NULL) return nil;
    [data getBytes:signature length:sizeof(GCK_HEADER)];

    // The length field is big-endian: swap it once, then clean up the header copy
    uint32_t seqLength = CFSwapInt32BigToHost(signature->nLength);
    free(signature);

    // Truncated file: the sequence must fit in what follows the header
    if (seqLength > [data length] - sizeof(GCK_HEADER)) return nil;

    // Sequence (starts right after the header)
    NSData *seqdata = [data subdataWithRange:NSMakeRange(sizeof(GCK_HEADER), seqLength)];
    NSString *sequence = [[NSString alloc] initWithBytes:[seqdata bytes]
                                                  length:[seqdata length]
                                                encoding:NSASCIIStringEncoding];
    NSString *filename = [[textFile lastPathComponent] stringByDeletingPathExtension];
    [matrixDictionary setObject:sequence forKey:filename];
    [itemArray addObject:filename];
    [sequence release];

    [gckDictionary setObject:matrixDictionary forKey:@"matrix"];
    [gckDictionary setObject:itemArray forKey:@"items"];
    [gckDictionary setObject:@"Gene Construction Kit" forKey:@"fileType"];
    return gckDictionary;
}
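For anyone wondering what the Intel fix boils down to: the length fields in these headers are stored big-endian, so on a little-endian machine they have to be byte-swapped before they can be used as lengths. Below is a minimal plain-C sketch of the same idea (it also compiles as Objective-C) that assembles the value byte by byte, so it gives the same result on any host. Note that read_be32, find_sequence, and the header_size/length_offset parameters are made up for illustration; they are assumptions, not the real STRIDER_HEADER or GCK_HEADER layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit big-endian value from a byte buffer.
   Assembling it byte by byte works on any host byte order. */
uint32_t read_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* HYPOTHETICAL layout, for illustration only: a header of header_size bytes
   with a big-endian 32-bit sequence length at byte offset length_offset,
   followed directly by the sequence bytes.
   Returns a pointer to the sequence and stores its length in *seq_len,
   or returns NULL if the buffer is too short for what the header claims. */
const unsigned char *find_sequence(const unsigned char *file, size_t file_len,
                                   size_t header_size, size_t length_offset,
                                   uint32_t *seq_len)
{
    assert(file != NULL && seq_len != NULL);

    /* The length field must lie inside the header, and the header inside the file. */
    if (length_offset + 4 > header_size || header_size > file_len)
        return NULL;

    uint32_t n = read_be32(file + length_offset);

    /* Reject truncated files instead of reading past the end. */
    if (n > file_len - header_size)
        return NULL;

    *seq_len = n;
    return file + header_size;
}
```

With a 6-byte toy header whose length field sits at offset 2, find_sequence(buf, len, 6, 2, &n) returns a pointer to the first sequence byte and sets n to the sequence length. CFSwapInt32BigToHost does the equivalent swap on a value that has already been copied into a struct.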
Cheers,
Alex
On 12-apr-2006, at 1:02, Charles Parnot wrote:
> I got you on this one, Koen :-)
>
> btw, great work. I see all these entries in the BioCocoa svn RSS
> feed in NetNewsWire, and I am amazed!
>
> charles
>
>
>> On Apr 11, 2006, at 6:19 PM, Alexander Griekspoor wrote:
>>
>>> Hi Koen,
>>>
>>> The format is explained in detail on this page I happened to
>>> encounter: http://www.mekentosj.com/enzymex
>>> I copied the relevant part below:
>>>
>>> Another sequence format?
>>> Not really. The files EnzymeX creates look like normal files, but
>>> right-click and open their contents and you will see that they
>>> consist of a simple FASTA file and a file in which EnzymeX stores
>>> its preferences. Send a file to someone who doesn't have EnzymeX
>>> or to a Windows user and they will simply see a folder with a
>>> FASTA file. No problem! If you have any questions about the exDNA
>>> file "format", don't hesitate to contact us.
>>
>>
>> Hehehehehe :)
>>
>> - Koen.
>
> --
> Xgrid-at-Stanford
> Help science move fast forward:
> http://cmgm.stanford.edu/~cparnot/xgrid-stanford
>
> Charles Parnot
> charles.parnot at gmail.com
>
>
>
>
*********************************************************
** Alexander Griekspoor **
*********************************************************
The Netherlands Cancer Institute
Department of Tumorbiology (H4)
Plesmanlaan 121, 1066 CX, Amsterdam
Tel: + 31 20 - 512 2023
Fax: + 31 20 - 512 2029
AIM: mekentosj at mac.com
E-mail: a.griekspoor at nki.nl
Web: http://www.mekentosj.com
Windows is a 32-bit patch to a 16-bit shell for an 8-bit
operating system, written for a 4-bit processor by a
2-bit company without 1 bit of sense.
*********************************************************