GO2MSIG: a GO-based GSEA gene set generator

GO2MSIG generates collections of gene sets in MSigDB format based on the Gene Ontology (GO) project hierarchy and gene association data. These collections can be used directly with the Gene Set Enrichment Analysis (GSEA) implementation available at the Broad Institute. Gene set collections can be automatically created for a wide variety of species.
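As an illustration of what a gene set collection looks like on disk, the sketch below writes gene sets in the tab-delimited GMT layout that GSEA reads: one line per set, holding the set name, a description, and then the member genes. It is a minimal sketch and not GO2MSIG's own code. The function name `to_gmt_line`, the set names, the descriptions and the gene symbols are all hypothetical.

```python
def to_gmt_line(name, description, genes):
    """Format one gene set as a GMT record: name, description, then members."""
    members = sorted(set(genes))  # drop duplicates and give a stable order
    return "\t".join([name, description, *members])


# Hypothetical gene sets, invented purely for illustration.
gene_sets = {
    "EXAMPLE_SET_A": ("hypothetical description A", ["GENE2", "GENE1", "GENE2"]),
    "EXAMPLE_SET_B": ("hypothetical description B", ["GENE3"]),
}

# One GMT line per gene set, joined into a single text block.
gmt_text = "\n".join(
    to_gmt_line(name, desc, genes) for name, (desc, genes) in gene_sets.items()
)
print(gmt_text)
```

Each line of the result is one gene set, so a whole collection is simply a text file with as many lines as there are sets.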

The easiest way to use GO2MSIG is via the website:


The website is set up to use sensible defaults. NCBI gene2go is a good database to use if your organism is contained in that dataset (Appendix A). If it is not, the GO annotation database can be used instead. The default set of evidence codes is the same as that used for the prebuilt GO based gene set collections provided at MSigDB. For many less well characterised species, almost all GO term associations carry the automated electronic annotation code 'IEA'; in that case, including all evidence codes is more appropriate.
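The effect of the evidence code choice can be sketched as follows. The example assumes rows in the tab-delimited NCBI gene2go column order (tax_id, GeneID, GO_ID, Evidence, Qualifier, GO_term, PubMed, Category) and groups gene IDs by GO term, optionally keeping only selected evidence codes. It is a simplified sketch, not GO2MSIG's implementation: it ignores the Qualifier column and does not propagate genes through the GO hierarchy. The function name `build_gene_sets`, the sample rows and the single-code filter `{"IDA"}` are illustrative assumptions; the filter is not the MSigDB default set.

```python
from collections import defaultdict

# Hypothetical rows in gene2go column order; all IDs and terms are invented.
SAMPLE_ROWS = [
    "#tax_id\tGeneID\tGO_ID\tEvidence\tQualifier\tGO_term\tPubMed\tCategory",
    "9606\t1001\tGO:0000001\tIDA\t-\tterm one\t-\tProcess",
    "9606\t1002\tGO:0000001\tIEA\t-\tterm one\t-\tProcess",
    "9606\t1002\tGO:0000002\tIEA\t-\tterm two\t-\tFunction",
    "10090\t2001\tGO:0000001\tIDA\t-\tterm one\t-\tProcess",
]


def build_gene_sets(rows, tax_id, allowed_evidence=None):
    """Group gene IDs by GO term for one species.

    allowed_evidence=None keeps every evidence code; otherwise only rows
    whose evidence code is in the given set are used.
    """
    sets = defaultdict(set)
    for row in rows:
        fields = row.rstrip("\n").split("\t")
        if row.startswith("#") or len(fields) < 4:
            continue  # skip the header line and malformed rows
        row_tax, gene_id, go_id, evidence = fields[:4]
        if row_tax != tax_id:
            continue  # a different species
        if allowed_evidence is not None and evidence not in allowed_evidence:
            continue  # evidence code excluded by the filter
        sets[go_id].add(gene_id)
    return dict(sets)


filtered = build_gene_sets(SAMPLE_ROWS, "9606", allowed_evidence={"IDA"})
everything = build_gene_sets(SAMPLE_ROWS, "9606")
print(len(filtered), "gene sets with the filter;", len(everything), "without it")
```

In this invented sample, the term annotated only with 'IEA' disappears once the filter is applied, which mirrors why including all evidence codes can be the better choice for species whose annotations are mostly electronic.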

Prebuilt gene sets for human, mouse, rat and other species are available.

Detailed usage information and instructions for local installation are also available.


For publication of results please cite:

Powell, J. A. C. (2014). GO2MSIG, an automated GO based multi-species gene set generator for gene set enrichment analysis. BMC Bioinformatics, 15(1), 146. doi:10.1186/1471-2105-15-146

Website updated 21 February 2016. For questions about this project please contact jacp10 at bioinformatics.org.