[Biococoa-dev] strider and gck format

Alexander Griekspoor a.griekspoor at nki.nl
Sat Mar 11 04:44:39 EST 2006

Hi Koen,

Both the GCK and the Strider files are inherited from the good old
Mac OS 9 period. In those days you would identify files using a
type/creator code, and that's what I check for in this code. You will
never see these 4-character codes unless you use a program like
ResEdit, and they are not the same as file extensions. In fact, under
Mac OS 9 you would never use file extensions unless you wanted your
Windows buddies to open your Word file as well. So
NSHFSTypeOfFile(filepath) is not the same as [filepath pathExtension].
The problem is that both filetypes are binary formats, unlike the
other formats, which are ASCII based. Trying to read in the file
as a string with ASCII encoding creates garbage that is
impossible to interpret, let alone determine the filetype from. So
the options were either to read all files in as data, check whether
the data fits the header of either filetype, and try to make something
out of that (quite tricky!), or simply to do the filetype check. As a
result I changed the code to pass the filepath instead of the raw
text to the reader class. This isolates the code better (no more
reading at all in the delegate class), but we sacrifice the possibility
of working with remote files. I guess that could easily be added
through a readFileFromURL method.
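
For reference, the "read everything in as data and check the header"
option could look roughly like the sketch below. It is plain C (so it
also compiles inside an Objective-C file) and only an illustration, not
what the reader class does. FormatSignature and sniff_format are
made-up names, and since the actual header bytes of Strider and GCK
files are not given in this thread, the caller has to supply them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type: a format name plus the bytes a file of
 * that format is assumed to start with. The byte values for Strider
 * and GCK are NOT known here and must be supplied by the caller. */
typedef struct {
    const char *name;
    const unsigned char *magic;
    size_t magic_len;
} FormatSignature;

/* Returns the name of the first signature whose magic bytes match the
 * start of buf, or NULL if none match. buf holds the first len raw
 * bytes of the file, read as data rather than as a string. */
const char *sniff_format(const unsigned char *buf, size_t len,
                         const FormatSignature *sigs, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (sigs[i].magic_len > 0 &&
            sigs[i].magic_len <= len &&
            memcmp(buf, sigs[i].magic, sigs[i].magic_len) == 0) {
            return sigs[i].name;
        }
    }
    return NULL;
}
```

The text formats fit the same scheme (a leading ">" for FASTA, a
leading "HEADER" for PDB), so one table could in principle cover both
the binary and the ASCII formats, provided reliable header bytes exist
for the binary ones.
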
Now that I think of it, perhaps we should also check for the file
extensions that are added to GCK files if you create them on the
Windows platform. I'm not sure whether the type/creator codes are set
on those files; I could check that on Monday...
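
A fallback on the extension could be as small as the sketch below,
again in plain C and only as an illustration. has_extension is a
made-up name, and the extension to test for is left as a parameter
because I haven't confirmed which one the Windows version writes.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the last path component ends in ".<ext>", compared
 * case-insensitively, and 0 otherwise. ext is given without the dot. */
int has_extension(const char *path, const char *ext)
{
    const char *slash = strrchr(path, '/');
    const char *name = slash ? slash + 1 : path;
    const char *dot = strrchr(name, '.');

    /* No dot at all, or a leading dot only (e.g. ".hidden"). */
    if (dot == NULL || dot == name)
        return 0;

    dot++;  /* skip the '.' itself */
    if (strlen(dot) != strlen(ext))
        return 0;

    for (; *dot != '\0'; dot++, ext++) {
        if (tolower((unsigned char)*dot) != tolower((unsigned char)*ext))
            return 0;
    }
    return 1;
}
```

The type code check would stay first, with the extension tried only
when the file carries no usable type code.
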
Hope this clears things up?

On 11 Mar 2006, at 2:41, Koen van der Drift wrote:

> Hi,
> I am still a bit confused about the Strider and GCK formats. In his
> code, Alex uses the following snippet to determine whether a file is
> in either of these formats:
> - (NSDictionary *)readFile:(NSString *)textFile
> {
>     NSMutableDictionary *theContents;
>     NSString *lineBreak;
>     // BINARY
>     // Strider?
>     if ([NSHFSTypeOfFile(textFile) isEqualToString:@"'xDNA'"]) {
>         theContents = (NSMutableDictionary *)[self readStriderFile:textFile];
>     // GCK?
>     } else if ([NSHFSTypeOfFile(textFile) isEqualToString:@"'GCKc'"] ||
>                [NSHFSTypeOfFile(textFile) isEqualToString:@"'GCKs'"]) {
>         theContents = (NSMutableDictionary *)[self readGCKFile:textFile];
>     // TEXT
>     } else {
> So it's based on the file type. However, I looked on the net for some
> sample files to test the code, and found some, but almost none of
> them have the xDNA, GCKc, or GCKs extension. So those files will be
> skipped by the code. For all other formats we use a recognition
> string within the file, e.g. > for FASTA or HEADER for PDBs. I
> think that is a better approach, since we are then not dependent on
> file types but on file content.
> Is there a typical recognition string for these data formats that  
> we can use for file recognition? If not, is there another way we  
> can make sure we catch all Strider and/or GCK files?
> cheers,
> - Koen.

                     ** Alexander Griekspoor **
               The Netherlands Cancer Institute
               Department of Tumorbiology (H4)
          Plesmanlaan 121, 1066 CX, Amsterdam
                   Tel:  + 31 20 - 512 2023
                   Fax:  + 31 20 - 512 2029
                   AIM: mekentosj at mac.com
                   E-mail: a.griekspoor at nki.nl
               Web: http://www.mekentosj.com

                             iRNAi, do you?
