[BiO BB] GI numbers

Robson Francisco de Souza rfsouza at citri.iq.usp.br
Fri Mar 26 12:48:18 EST 2004


I'm analyzing a set of sequences and comparing how they are classified
as homologs in two ortholog databases, COG and KEGG. Although both COG
and KEGG provide tables relating gene names to GI (PID) numbers, I have
so far been unable to map GIs from one dataset to the other in order to
check the classification of each gene in both catalogs.

GIs from COG appear to be from RefSeq and those from KEGG seem to be
from GenPept. How can I map GI numbers from KEGG to GI numbers from the
COG database? Is there any query I can make to download this information
for the 185904 proteins in COG and their equivalents in the KEGG
Orthologs database?

Here is an example:

Sequence 14600509 is the protein encoded by gene APE0180 from the
Aeropyrum pernix complete genome, as described in COG's table myva=gb.
The same sequence is identified by GI 5103570 in KEGG. In this case, I
was able to map COG's GI to KEGG's GI by using the gene identifier and
annotation, a procedure that is not easily automated.
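
For illustration, here is a minimal Python sketch of the gene-identifier
join described above, for the cases where both tables share a gene name.
The two-column, tab-separated layout (gene name, then GI) and the helper
names load_gene_to_gi and map_gis are assumptions made for this example;
they are not the actual format of myva=gb or of the KEGG tables, which
would need their own parsing. It does not cover genes whose names differ
between the two catalogs.

```python
import csv
import io


def load_gene_to_gi(handle, gene_col=0, gi_col=1, delimiter="\t"):
    """Read a delimited table and return {gene_name: gi}.

    The column positions are hypothetical; adjust them to the real files.
    """
    mapping = {}
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) <= max(gene_col, gi_col):
            continue  # skip blank or short lines
        gene = row[gene_col].strip().upper()  # normalize case before joining
        gi = row[gi_col].strip()
        if gene and gi:
            mapping[gene] = gi
    return mapping


def map_gis(cog_by_gene, kegg_by_gene):
    """Join the two tables on gene name.

    Returns ({cog_gi: kegg_gi}, [gene names found in COG but not in KEGG]).
    """
    mapped = {}
    unmatched = []
    for gene, cog_gi in cog_by_gene.items():
        kegg_gi = kegg_by_gene.get(gene)
        if kegg_gi is None:
            unmatched.append(gene)
        else:
            mapped[cog_gi] = kegg_gi
    return mapped, unmatched


# Toy input built from the single example in this message
# (assumed layout: gene name <TAB> GI).
cog_table = io.StringIO("APE0180\t14600509\n")
kegg_table = io.StringIO("APE0180\t5103570\n")

mapped, unmatched = map_gis(load_gene_to_gi(cog_table),
                            load_gene_to_gi(kegg_table))
print(mapped)
print(unmatched)
```

The unmatched list collects the genes that would still need manual
checking against the annotation.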

How can I retrieve equivalent IDs for the whole COG gene set?

Thanks in advance for any help.
