Statistics about the current GENCODE Release (version M39)

The statistics are derived from the GTF file that contains only the annotation of the main chromosomes.

For details about the calculation of these statistics, please see the README_stats.txt file.
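As a rough illustration of how per-biotype counts can be tallied from a GTF file, here is a minimal Python sketch. It assumes that genes are counted by the gene_type attribute of "gene" records and transcripts by the transcript_type attribute of "transcript" records. The function name count_biotypes and the two demo records are hypothetical. README_stats.txt remains the authoritative description of the actual calculation.

```python
import re
from collections import Counter

# GENCODE GTF attributes are written as: key "value"; key "value"; ...
ATTRIBUTE = re.compile(r'(\S+) "([^"]*)"')


def count_biotypes(lines):
    """Return (genes, transcripts) Counters keyed by biotype.

    Assumption: genes are counted by the gene_type of `gene` records and
    transcripts by the transcript_type of `transcript` records.
    """
    genes, transcripts = Counter(), Counter()
    for line in lines:
        if line.startswith("#"):            # header/comment lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:                 # blank or malformed line
            continue
        feature = fields[2]
        attributes = dict(ATTRIBUTE.findall(fields[8]))
        if feature == "gene":
            genes[attributes.get("gene_type", "unknown")] += 1
        elif feature == "transcript":
            transcripts[attributes.get("transcript_type", "unknown")] += 1
    return genes, transcripts


# Two synthetic records (not real annotation) to show the call.
demo = [
    "\t".join(["chr1", "HAVANA", "gene", "1", "100", ".", "+", ".",
               'gene_id "G1"; gene_type "protein_coding";']),
    "\t".join(["chr1", "HAVANA", "transcript", "1", "100", ".", "+", ".",
               'gene_id "G1"; transcript_id "T1"; gene_type "protein_coding"; '
               'transcript_type "nonsense_mediated_decay";']),
]
genes, transcripts = count_biotypes(demo)
print(genes)
print(transcripts)
```

For a real release file, any iterable of text lines works, for example a handle from gzip.open(path, "rt"), where path is wherever the GTF was saved. Counting genes and transcripts by separate attributes would explain why transcript-level biotypes such as nonsense_mediated_decay are listed with 0 genes.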

General stats

Total No of Genes                                    78289
Protein-coding genes                                 21540
- readthrough genes (not included)                     230
Long non-coding RNA genes                            36103
Small non-coding RNA genes                            6105
Pseudogenes                                          13818
- processed pseudogenes                              10249
- unprocessed pseudogenes                             3153
- unitary pseudogenes                                  206
- pseudogenes                                            2
Immunoglobulin/T-cell receptor gene segments
- protein coding segments                              493
- pseudogenes                                          208
Total No of Transcripts                             481871
Protein-coding transcripts                          180331
- full length protein-coding                        166735
- partial length protein-coding                      13596
Nonsense mediated decay transcripts                  89109
Long non-coding RNA loci transcripts                155878

Total No of distinct translations                   123098
Genes that have more than one distinct translation   15589
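
The breakdowns in the list above can be checked with simple arithmetic. The short Python sketch below does so under two assumptions that come from the numbers themselves rather than from README_stats.txt: the readthrough genes are added on top of the protein-coding figure when forming the gene total, and the immunoglobulin/T-cell receptor pseudogenes are counted within the overall pseudogene total.

```python
# Figures copied from the "General stats" list: (reported total, parts).
breakdowns = {
    # protein-coding, readthrough, lncRNA, small ncRNA, pseudogenes, IG/TR segments
    "Total No of Genes": (78289, [21540, 230, 36103, 6105, 13818, 493]),
    # processed, unprocessed, unitary, pseudogenes, IG/TR pseudogenes
    "Pseudogenes": (13818, [10249, 3153, 206, 2, 208]),
    # full length, partial length
    "Protein-coding transcripts": (180331, [166735, 13596]),
}

# True where the parts add up to the reported total.
results = {label: sum(parts) == total
           for label, (total, parts) in breakdowns.items()}

for label, ok in results.items():
    print(f"{label}: {'consistent' if ok else 'inconsistent'}")
```

Under these assumptions all three breakdowns add up to the reported totals.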

Further details on this version's gene and transcript types

biotype                              genes  transcripts
IG_C_gene                               13           22
IG_C_pseudogene                          1            1
IG_D_gene                               19           19
IG_D_pseudogene                          4            4
IG_J_gene                               14           14
IG_LV_gene                               3            3
IG_pseudogene                            1            1
IG_V_gene                              218          270
IG_V_pseudogene                        158          158
lncRNA                               32884       152085
miRNA                                 2201         2201
misc_RNA                               562          566
Mt_rRNA                                  2            2
Mt_tRNA                                 22           22
non_stop_decay                           0           27
nonsense_mediated_decay                  0        89109
processed_pseudogene                  9311         9311
processed_transcript                     0           11
protein_coding                       21770       180331
protein_coding_CDS_not_defined           0        14295
protein_coding_LoF                       0          115
pseudogene                               2            2
retained_intron                          0        22074
ribozyme                                22           22
rRNA                                   354          354
scaRNA                                  51           51
scRNA                                    1            1
snoRNA                                1507         1507
snRNA                                 1381         1381
sRNA                                     2            2
TEC                                   3219         3302
TR_C_gene                                8            9
TR_D_gene                                4            4
TR_J_gene                               70           70
TR_J_pseudogene                         10           10
TR_V_gene                              144          170
TR_V_pseudogene                         34           34
transcribed_processed_pseudogene       938          938
transcribed_unitary_pseudogene          94           97
transcribed_unprocessed_pseudogene     978          986
translated_unprocessed_pseudogene        3            3
unitary_pseudogene                     112          112
unprocessed_pseudogene                2172         2175
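
The per-biotype gene counts in the table above can be rolled up into the broad categories of the general stats. The Python sketch below uses a grouping that is an assumption chosen because it reproduces the published figures (for example, TEC counted with lncRNA, and every biotype containing "pseudogene" counted as a pseudogene). It is not taken from README_stats.txt, and the category() helper is hypothetical.

```python
from collections import Counter

# Gene counts copied from the biotype table (rows with 0 genes omitted).
GENES = {
    "IG_C_gene": 13, "IG_C_pseudogene": 1, "IG_D_gene": 19,
    "IG_D_pseudogene": 4, "IG_J_gene": 14, "IG_LV_gene": 3,
    "IG_pseudogene": 1, "IG_V_gene": 218, "IG_V_pseudogene": 158,
    "lncRNA": 32884, "miRNA": 2201, "misc_RNA": 562, "Mt_rRNA": 2,
    "Mt_tRNA": 22, "processed_pseudogene": 9311, "protein_coding": 21770,
    "pseudogene": 2, "ribozyme": 22, "rRNA": 354, "scaRNA": 51, "scRNA": 1,
    "snoRNA": 1507, "snRNA": 1381, "sRNA": 2, "TEC": 3219, "TR_C_gene": 8,
    "TR_D_gene": 4, "TR_J_gene": 70, "TR_J_pseudogene": 10, "TR_V_gene": 144,
    "TR_V_pseudogene": 34, "transcribed_processed_pseudogene": 938,
    "transcribed_unitary_pseudogene": 94,
    "transcribed_unprocessed_pseudogene": 978,
    "translated_unprocessed_pseudogene": 3, "unitary_pseudogene": 112,
    "unprocessed_pseudogene": 2172,
}

SMALL_NCRNA = {"miRNA", "misc_RNA", "Mt_rRNA", "Mt_tRNA", "ribozyme", "rRNA",
               "scaRNA", "scRNA", "snoRNA", "snRNA", "sRNA"}


def category(biotype):
    """Map a biotype to a broad category (assumed grouping, see lead-in)."""
    if "pseudogene" in biotype:              # includes IG/TR pseudogenes
        return "pseudogenes"
    if biotype.startswith(("IG_", "TR_")):   # remaining IG/TR gene segments
        return "IG/TR gene segments"
    if biotype in SMALL_NCRNA:
        return "small non-coding RNA"
    if biotype in {"lncRNA", "TEC"}:
        return "long non-coding RNA"
    if biotype == "protein_coding":
        return "protein-coding"
    return "other"


totals = Counter()
for biotype, count in GENES.items():
    totals[category(biotype)] += count

# Figures from the general stats; protein-coding includes readthrough genes.
PUBLISHED = {
    "protein-coding": 21540 + 230,
    "long non-coding RNA": 36103,
    "small non-coding RNA": 6105,
    "pseudogenes": 13818,
    "IG/TR gene segments": 493,
}

for name, expected in PUBLISHED.items():
    status = "ok" if totals[name] == expected else "MISMATCH"
    print(f"{name:22s} {totals[name]:6d} {expected:6d} {status}")
print("all genes:", sum(GENES.values()))
```

With this grouping every category matches the general stats, and the gene counts sum to the reported total of 78289.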