[Biococoa-dev] strider and gck format
Alexander Griekspoor
a.griekspoor at nki.nl
Sat Mar 11 04:44:39 EST 2006
Hi Koen,
Both the GCK and the Strider file formats are inherited from the good
old Mac OS 9 period. In those days you would identify files using a
type/creator code, and that's what I check for in this code. You will
never see these 4-character codes unless you use a program like
ResEdit, and they are not the same as file extensions. In fact, under
Mac OS 9 you would never use file extensions unless you wanted your
Windows buddies to be able to open your Word file as well. So
NSHFSTypeOfFile(filepath) is not the same as [filepath pathExtension].
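To make the difference concrete, here is a minimal sketch that reports both values for a path. BCDescribeFile is a made-up helper name for illustration, not part of BioCocoa. Note that NSHFSTypeOfFile returns the code wrapped in single quotes (for example 'xDNA'), which is why the comparisons in the reader include the quotes.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not BioCocoa API): shows how a file identifies itself.
static NSString *BCDescribeFile(NSString *path)
{
    // HFS type code stored in the file system metadata, e.g. @"'xDNA'".
    // May be nil or empty if the file has no type code or cannot be read.
    NSString *hfsType = NSHFSTypeOfFile(path);

    // Extension derived purely from the file name, e.g. @"txt";
    // an empty string if the name has no extension.
    NSString *extension = [path pathExtension];

    return [NSString stringWithFormat:@"type=%@ extension=%@", hfsType, extension];
}
```

A file saved by a classic Mac application can have a type code and no extension at all, while a file downloaded from the net may have an extension but no type code.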
The problem is that both file types are binary, whereas the other
formats are ASCII-based. Trying to read such a file as a string with
ASCII encoding produces garbage that is impossible to interpret, let
alone determine the file type from. So the options were either to read
all files in as data, check whether the bytes fit the header of either
file type, and try to make something out of that (quite tricky!), or
simply to do the file type check. As a result I changed the code to
pass the file path instead of the raw text to the reader class. That
isolates the code better (no more reading at all in the delegate
class), but we sacrifice the possibility of working with remote files.
I guess that could easily be added through a readFileFromURL method.
Now that I think of it, perhaps we should also check for the file
extensions that are added to GCK files if you create them on the
Windows platform. I'm not sure whether the creator/type codes are set
in those files; I could check that on Monday...
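A rough sketch of what such a combined check could look like is below. The function name BCIsGCKFile is hypothetical, and the "gck" extension is an assumption that would still need to be verified against real Windows-created files.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not BioCocoa API): type code first, extension as fallback.
static BOOL BCIsGCKFile(NSString *path)
{
    // 1. Classic Mac files: check the HFS type code (quotes are part of the string).
    NSString *hfsType = NSHFSTypeOfFile(path);
    if ([hfsType isEqualToString:@"'GCKc'"] ||
        [hfsType isEqualToString:@"'GCKs'"]) {
        return YES;
    }

    // 2. Files without a type code: fall back to the file name.
    //    ASSUMPTION: Windows-created GCK files end in ".gck".
    NSString *extension = [[path pathExtension] lowercaseString];
    return [extension isEqualToString:@"gck"];
}
```

Checking the type code first keeps the current behaviour for Mac-created files; the extension only comes into play when no matching type code is found.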
Hope this clears things up?
Alex
On 11 Mar 2006, at 2:41, Koen van der Drift wrote:
> Hi,
>
> I am still a bit confused about the Strider and GCK formats. In his
> code, Alex uses the following snippet to determine whether a file is
> in either of the two formats:
>
> - (NSDictionary *)readFile:(NSString *)textFile
> {
>     NSMutableDictionary *theContents;
>     NSString *lineBreak;
>
>     // BINARY
>     // Strider?
>     if ([NSHFSTypeOfFile(textFile) isEqualToString:@"'xDNA'"]) {
>
>         theContents = (NSMutableDictionary *)[self readStriderFile:textFile];
>
>     // GCK?
>     } else if ([NSHFSTypeOfFile(textFile) isEqualToString:@"'GCKc'"] ||
>                [NSHFSTypeOfFile(textFile) isEqualToString:@"'GCKs'"]) {
>
>         theContents = (NSMutableDictionary *)[self readGCKFile:textFile];
>
>     // TEXT
>     } else {
>
>
> So it's based on the file type. However, I looked on the net for
> some sample files to test the code and found some, but almost none
> of them have the xDNA, GCKc, or GCKs extension. So those files will
> be skipped by the code. For all other formats we use a recognition
> string within the file, e.g. > for FASTA or HEADER for PDB files. I
> think that is a better approach, since we then depend not on file
> types but on file content.
>
> Is there a typical recognition string for these data formats that
> we can use for file recognition? If not, is there another way we
> can make sure we catch all Strider and/or GCK files?
>
> cheers,
>
> - Koen.
> _______________________________________________
> Biococoa-dev mailing list
> Biococoa-dev at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/biococoa-dev
>
*********************************************************
** Alexander Griekspoor **
*********************************************************
The Netherlands Cancer Institute
Department of Tumorbiology (H4)
Plesmanlaan 121, 1066 CX, Amsterdam
Tel: + 31 20 - 512 2023
Fax: + 31 20 - 512 2029
AIM: mekentosj at mac.com
E-mail: a.griekspoor at nki.nl
Web: http://www.mekentosj.com
iRNAi, do you?
http://www.mekentosj.com/irnai
*********************************************************