[BiO BB] KEGG vs GO

Samantha Fox bioinfosm at gmail.com
Thu Apr 6 14:54:25 EDT 2006


Thanks all for the replies.
It was nice reading Michael's explanation. So in effect, if I am using sets
of genes belonging to a particular pathway at the 3rd level of KO, it's quite
similar to using the sets of genes sharing GO terms (just going by
nomenclature for now, but we may later have a ko2go mapping).
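
Until such a file exists, here is a minimal sketch of how GO cross-references
could be pulled out of KO lines like the ones Michael quotes below. It assumes
the bracketed [DB:ids] layout shown in his examples, with balanced brackets;
parse_xrefs and ko2go are hypothetical helper names, not part of any KEGG or
GO tool.

```python
import re

# Assumption: cross-references appear as [DB:id id ...], as in the quoted KO lines.
XREF = re.compile(r"\[([A-Za-z]+):([^\]]+)\]")

def parse_xrefs(line):
    """Return {database: [ids]} for every bracketed cross-reference in a line."""
    xrefs = {}
    for db, ids in XREF.findall(line):
        xrefs.setdefault(db, []).extend(ids.split())
    return xrefs

def ko2go(lines):
    """Map the leading KO identifier of each line to its GO ids (hypothetical helper)."""
    mapping = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        go_ids = parse_xrefs(line).get("GO", [])
        if go_ids:
            # Entries without a GO cross-reference are simply left out.
            mapping[fields[0]] = ["GO:" + g for g in go_ids]
    return mapping

examples = [
    "00010 Glycolysis / Gluconeogenesis [PATH:ko00010] [GO:0006096 0006094]",
    "K00845 E2.7.1.2, glk; glucokinase [EC:2.7.1.2] [COG:COG0837] [GO:0004340]",
    "K06051 DLL; delta",
]

if __name__ == "__main__":
    for ko_id, go_ids in ko2go(examples).items():
        print(ko_id, " ".join(go_ids))
```

Note that K06051 gets no entry at all, which matches Michael's point that
terms like this one are harder to map.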

~S

On 4/6/06, Michael Ashburner (Genetics) <ma11 at gen.cam.ac.uk> wrote:
>
> I think that there is some confusion in this thread.
>
> 1. There is the Gene Ontology.  Its terms are used (primarily)
> for the annotation of gene products.  Both the Ontology and the
> annotations contributed by the members of the GO Consortium database
> are available from the GO site.
>
> 2. There is the KEGG Orthology, available from the KEGG site.
> This is _both_ an ontology, seen, for example, by
> opening KO up to its 3rd level:
> http://www.genome.ad.jp/dbget-bin/get_htext?KO+-s+F+-f+F+C
> _and_ annotations of classes of gene product, seen if it is opened up
> to level 4:
> http://www.genome.ad.jp/dbget-bin/get_htext?KO+-s+F+-f+F+D
>
>
> It would be easy for us to make a mapping between the Gene Ontology
> and KO (level 3), except that the KO includes domains outwith the GO
> (e.g.  01500 Human Diseases, and its child terms).  In fact we will
> do that and make it available as a ko2go mapping file on GO. We do not
> need the "SwissProt Relational Database" to do this. Indeed, KEGG already
> provide many of these mappings to the GO.
>
> Mapping to level 4 is more problematic.  The KO presents three levels:
>
> Ontology terms ("Levels 1-3")
>         e.g.: 00010 Glycolysis / Gluconeogenesis [PATH:ko00010] [GO:0006096 0006094]
> Families of proteins ("Level 4")
>         e.g.  K00845 E2.7.1.2, glk; glucokinase [EC:2.7.1.2] [COG:COG0837] [GO:0004340]
> Genes, whose products are members of this family
>         e.g. Genes HSA: 2645(GCK)
>
> While for those Level 4 terms that are enzymes a 'mapping' of KO to the GO
> would not be hard, it gets more difficult further down.  Consider the
> term:
> K06051 DLL; delta
> This is a child of (among others)
> Notch signaling pathway [PATH:ko04330] (which would map to the GO)
> and has children:
> HSA: 10683(DLL3) 28514(DLL1) 54567(DLL4)
> MMU: 13388(Dll1) 13389(Dll3) 54485(Dll4)
> RNO: 114125(Dll3) 311332(Dll4_predicted) 84010(Dll1)
> XLA: 379238(MGC52561)
> DRE: 30120(dlc) 30131(dla) 30138(dld) 30141(dlb)
> DME: CG3619-PA(Dmel_CG3619)
> Which are clearly individual gene products.
>
> Thus, I conclude, that KO's: K06051 DLL; delta  is a _genus_
> of gene products.  This is conceptually very different from the GO,
> despite what may seem to be superficial similarities.
>
> So, contra Lucy, the difference between the GO and KO has nothing to
> do with manual vs automatic annotation, or with the 'focus' of the KO;
> rather, they differ in their underlying structure.
>
> Michael
>
>
> lucifer at slimy.greenend.org.uk wrote:
> > "Samantha Fox" <bioinfosm at gmail.com> writes:
> >
> >> I was wondering how KEGG and GO differ from a broad perspective of
> >> grouping functionally related genes.  So a KEGG pathway lists all
> >> genes that kind of work together, and a similar GO term would also
> >> contain such a gene list.
> >
> >
> > IIRC, KEGG is manually created from the literature whilst GO also
> > contains automatic/electronic annotation based on sequence homology.
> > KEGG also focuses more on metabolic pathways, whilst GO covers a more
> > comprehensive set of cellular processes and molecular functions.
> >
> > Hope that helps,
>
> It should be possible to 'cross correlate' KEGG and GO in a number of
> different ways using one of the SWISSPROT relational databases. However
> you should know that generally 'ontology mapping' is an open problem :)
>
> Good luck!