[BiO BB] Genbank file conversion to GCG format

Peter Rice pmr at ebi.ac.uk
Fri Jul 20 12:22:32 EDT 2007


> On Jul 20, 2007, at 4:53 AM, Sterten at aol.com wrote:
> 
> no checksum needed these days, data storage is reliable.

Hah! It isn't reliable in this case. GCG added the checksum to catch 
users who edited their sequence files and deliberately (or accidentally 
while editing the annotation in the header) changed the sequence data :-)

If you edit a GCG file, you run the reformat program to calculate a new 
checksum line.

> Should be easy to write a short program to convert the formats...

Generating the checksum is ... ummm .... interesting. It helps if you 
have a friend with access to the GCG "reformat" program who can tell you 
if you got it right. Some years ago there was a thread in one of the 
bionet newsgroups (ah, those were the days) where it took 4 attempts
before someone could reliably agree with GCG's calculation (upper and
lower case, numeric characters, spaces, and other IUPAC "standard"
sequence characters such as "=" all have to be considered).
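
For anyone who wants to try, here is a minimal Python sketch of the checksum
in the form it is commonly described (and, as far as I know, the form used by
Biopython's Bio.SeqUtils.CheckSum.gcg): each character is upper-cased, its
character code is multiplied by a position weight that cycles from 1 to 57,
and the sum is taken modulo 10000. The function name gcg_checksum is mine,
and the sketch assumes the caller has already decided which characters
(spaces, digits, gap symbols) belong in the sequence. It has not been
checked against GCG's own "reformat", so treat it as a starting point only.

```python
def gcg_checksum(sequence: str) -> int:
    """Position-weighted checksum in the form commonly attributed to GCG.

    Assumption: every character passed in counts towards the checksum,
    so whitespace and digits must be stripped by the caller if unwanted.
    """
    total = 0
    for position, char in enumerate(sequence):
        weight = position % 57 + 1  # weights run 1..57, then start again at 1
        total += weight * ord(char.upper())  # case-insensitive
    return total % 10000


if __name__ == "__main__":
    print(gcg_checksum("ACGT"))
```

Because the weights depend on position, a stray space or digit left in the
sequence shifts every later weight, which is probably why it is so easy to
disagree with the original program.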

regards,

Peter


