
The GENCODE Project: Encyclopædia of genes and gene variants

Background

In September 2003, the National Human Genome Research Institute (NHGRI) launched ENCODE, the Encyclopedia Of DNA Elements, a public research consortium set up to identify all functional elements in the human genome sequence. After a successful pilot phase on 1% of the genome, the scale-up to the entire genome is now underway. The Wellcome Sanger Institute was awarded a grant to carry out a scale-up of the GENCODE project for integrated annotation of gene features.

Having been involved in successfully delivering the definitive annotation of functional elements in the human genome, the GENCODE group were awarded a second grant in 2013 in order to continue their human genome annotation work and expand GENCODE to include annotation of the mouse genome. A third grant was awarded in 2017 for the continued improvement of the annotation of the human and mouse genomes, and a fourth grant followed in 2021. Details of the grants awarded can be found here.

GENCODE gene annotation is used by a large number of international consortia.

Current GENCODE Goals

The current GENCODE phase, running from 2021 to 2025, aims to continue improving the coverage and accuracy of the GENCODE human and mouse gene sets. It does so by enhancing and extending the high-accuracy annotation of all evidence-based gene features in the human genome, including protein-coding loci with alternatively spliced variants, non-coding loci and pseudogenes.

The process to create this annotation involves manual curation, computational analysis and targeted experimental approaches.

The human and mouse GENCODE resources will continue to be available to the research community with regular releases of the Ensembl genome browser, and the UCSC genome browser will continue to present the current release of the GENCODE gene set.

Participants, PI & Co-PIs

Fergal Martin (Lead PI), EMBL European Bioinformatics Institute, Cambridge, UK
Roderic Guigo (PI), Centre de Regulació Genòmica (CRG), Barcelona, Catalonia, Spain
Manolis Kellis (PI), Massachusetts Institute of Technology (MIT), Boston, USA
Mark Gerstein (PI), Yale University, New Haven, USA
Benedict Paten (PI), University of California, Santa Cruz, California, USA
Michael Tress, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
Anshul Kundaje, Stanford University, California, USA

List of all participants

GENCODE collaborators

We are working in close collaboration with various other resources and research groups around the world. These include the NCBI (Terence Murphy, CCDS project), Dana Farber Cancer Institute (Marc Vidal and David Hill, CCSB), and others.

Please contact us if you would like to start a collaboration with the GENCODE project.

Acknowledgements

The GENCODE project is funded by the National Human Genome Research Institute (NHGRI) (HG007234) and the European Molecular Biology Laboratory.

When citing GENCODE, please use our current primary reference here.
