The GENCODE Project: Encyclopædia of genes and gene variants
Current Gencode version
The current version is Gencode 17, released on 18 June 2013.
** NEW ** This release has two extra annotation GTFs:
1) a GTF that has not only the annotation in the main chromosomes but also the annotation in patches/scaffolds/haplotypes: gencode.v17.chr_patch_hapl_scaff.annotation.gtf.
2) a GTF that has only the annotation in patches/scaffolds/haplotypes: gencode.v17.patch_hapl_scaff.annotation.gtf.
For more information about this release please see the README.txt file.
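As a rough illustration of how the two extra files relate to the main annotation, the sketch below splits GTF records by their first (sequence name) column. It assumes, without confirmation from the release notes, that main-chromosome records use names of the form chr1 to chr22, chrX, chrY and chrM, and that any other sequence name belongs to a patch, scaffold or haplotype. The function name and the example records are made up for illustration; check the README.txt for the actual naming used in the release.

```python
import re

# Assumption: main chromosomes are named chr1..chr22, chrX, chrY, chrM.
# Any other sequence name is treated as a patch/scaffold/haplotype.
MAIN_CHROMOSOME = re.compile(r"^chr([1-9]|1[0-9]|2[0-2]|X|Y|M)$")


def split_gtf_lines(lines):
    """Split GTF records into (main, other) lists by the seqname column.

    Hypothetical helper: blank lines and '#' comment lines are skipped.
    """
    main, other = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        # GTF is tab-separated; the first field is the sequence name.
        seqname = line.split("\t", 1)[0]
        if MAIN_CHROMOSOME.match(seqname):
            main.append(line)
        else:
            other.append(line)
    return main, other


# Made-up records, not taken from the real release files.
example = [
    "##description: made-up example records",
    'chr1\tHAVANA\tgene\t100\t200\t.\t+\t.\tgene_id "G1";',
    'HYPOTHETICAL_PATCH\tHAVANA\tgene\t10\t50\t.\t-\t.\tgene_id "G2";',
]
main, other = split_gtf_lines(example)
print(len(main), len(other))
```

To apply this to a downloaded file, the same function can be given an open file handle, for example `split_gtf_lines(open("gencode.v17.chr_patch_hapl_scaff.annotation.gtf"))`, since it only iterates over lines.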
Introduction
The National Human Genome Research Institute (NHGRI) launched a public research consortium named ENCODE, the Encyclopedia Of DNA Elements, in September 2003, to carry out a project to identify all functional elements in the human genome sequence. After a successful pilot phase on 1% of the genome, the scale-up to the entire genome is now underway. The Wellcome Trust Sanger Institute was awarded a grant to carry out a scale-up of the GENCODE project for integrated annotation of gene features.
Details
The aim of GENCODE, as a sub-project of the ENCODE scale-up project, is to annotate all evidence-based gene features in the entire human genome with high accuracy. The result will be a set of annotations including all protein-coding loci with alternatively transcribed variants, non-coding loci with transcript evidence, and pseudogenes. The process of creating this annotation involves manual curation, various computational analyses and targeted experimental approaches. Putative loci can be verified by wet-lab experiments, and computational predictions will be analysed manually.
The international team working on the GENCODE project is headed by Tim Hubbard at the Wellcome Trust Sanger Institute.
The Gencode gene sets are used by the entire ENCODE consortium and by many other projects (e.g. 1000 Genomes) as reference gene sets.
About
The Gencode project is funded through an NHGRI ENCODE grant with additional funding from the Wellcome Trust.
When referencing, please use "Harrow J, et al. (2012) GENCODE: The reference human genome annotation for The ENCODE Project" (PubMed). Before the release of this paper, authors were asked to reference "Harrow J, et al. (2006) GENCODE: producing a reference annotation for ENCODE" (PubMed).
The RNASeq Genome Annotation Assessment Project (RGASP) was launched to assess the current progress of automatic gene building that uses RNASeq as its primary dataset.
