The GENCODE Project: Encyclopædia of genes and gene variants

Background

The National Human Genome Research Institute (NHGRI) launched a public research consortium named ENCODE, the Encyclopedia Of DNA Elements, in September 2003, to carry out a project to identify all functional elements in the human genome sequence. After a successful pilot phase on 1% of the genome, the scale-up to the entire genome is now underway. The Wellcome Sanger Institute was awarded a grant to carry out a scale-up of the GENCODE project for integrated annotation of gene features.

Having been involved in successfully delivering the definitive annotation of functional elements in the human genome, the GENCODE group were awarded a second grant in 2013 to continue their human genome annotation work and to expand GENCODE to include annotation of the mouse genome. A third grant was awarded in 2017 for the continued improvement of the annotation of the human and mouse genomes.

The GENCODE gene sets are used as reference gene sets by the entire ENCODE consortium and by many other projects (e.g. Genotype-Tissue Expression (GTEx), The Cancer Genome Atlas (TCGA), International Cancer Genome Consortium (ICGC), NIH Roadmap Epigenomics Mapping Consortium, Blueprint Epigenome Project, Exome Aggregation Consortium (ExAC), Genome Aggregation Database (gnomAD), 1000 Genomes Project and the Human Cell Atlas (HCA)).

Current GENCODE Goals

The aims of the current GENCODE phase running from 2017 to 2021 are:

The process to create this annotation involves manual curation, computational analysis and targeted experimental approaches.

The human and mouse GENCODE resources will continue to be available to the research community with regular releases of the Ensembl genome browser, and the UCSC genome browser will continue to present the current release of the GENCODE gene set.

Participants, PI & Co-PIs

List of all participants

GENCODE collaborators

We are working in close collaboration with various other research groups around the world. These include the NCBI (e.g. Terence Murphy, CCDS project), CSHL (Tom Gingeras group) and others.

Please contact us if you would like to start a collaboration with the GENCODE project.

Acknowledgements

The GENCODE project is funded by the National Human Genome Research Institute (NHGRI) (2U41HG007234) and the European Molecular Biology Laboratory.

When referencing, please use “Frankish A, et al. (2018) GENCODE reference annotation for the human and mouse genomes” (PubMed).