Statistics about the GENCODE Release M4

The statistics are derived from the GTF file that contains only the annotation of the main chromosomes.

For details about the calculation of these statistics, please see the README_stats.txt file.
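
As a rough illustration of how per-biotype counts could be tallied from a GTF file, here is a minimal Python sketch. It is not the procedure described in README_stats.txt, which remains the authoritative reference. It assumes the usual GENCODE GTF layout: nine tab-separated columns, the feature type ("gene" or "transcript") in column 3, and quoted gene_type / transcript_type attributes in column 9. The function names count_biotypes and parse_attributes are hypothetical.

```python
import re
from collections import Counter

# Matches attribute pairs of the form: key "value"
# (assumption: the attributes of interest are quoted, as in GENCODE GTF files)
ATTRIBUTE_PATTERN = re.compile(r'(\w+) "([^"]*)"')


def parse_attributes(attribute_field):
    """Turn the 9th GTF column into a dict of quoted key/value pairs.

    Repeated keys (for example several "tag" entries) keep only the last value,
    which is sufficient for reading gene_type and transcript_type.
    """
    return dict(ATTRIBUTE_PATTERN.findall(attribute_field))


def count_biotypes(gtf_lines):
    """Count gene and transcript records per biotype.

    Returns two Counters: genes keyed by gene_type and
    transcripts keyed by transcript_type.
    """
    genes, transcripts = Counter(), Counter()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        feature, attrs = fields[2], parse_attributes(fields[8])
        if feature == "gene":
            genes[attrs.get("gene_type", "unknown")] += 1
        elif feature == "transcript":
            transcripts[attrs.get("transcript_type", "unknown")] += 1
    return genes, transcripts
```

With a real file, the function could be called as count_biotypes(open(path)), where path points to the main-chromosome GTF file; the iteration is line by line, so the whole file is never held in memory.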

General stats

Total number of genes 43346
Protein-coding genes 22032
Long non-coding RNA genes 6951
Small non-coding RNA genes 5853
Pseudogenes 8031
- processed pseudogenes 5560
- unprocessed pseudogenes 2171
- unitary pseudogenes 15
- polymorphic pseudogenes 18
- pseudogenes 178
Immunoglobulin/T-cell receptor gene segments
- protein coding segments 479
- pseudogenes 89
Total number of transcripts 103639
Protein-coding transcripts 48482
- full length protein-coding 38578
- partial length protein-coding 9904
Nonsense mediated decay transcripts 4558
Long non-coding RNA loci transcripts 9962
 
Total number of distinct translations 39260
Genes that have more than one distinct translation 8394
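
The general stats can be cross-checked with simple arithmetic. The Python sketch below uses only the figures listed above and tests three sums. The way the categories are combined is an assumption inferred from the numbers, not something stated in this file. In particular, it assumes that the 89 immunoglobulin/T-cell receptor pseudogenes are counted inside the pseudogene total, and that the 479 protein coding segments are counted as their own gene category. The helper name adds_up is hypothetical.

```python
def adds_up(parts, expected_total):
    """Return True when the listed parts sum exactly to the expected total."""
    return sum(parts.values()) == expected_total


# Assumed decomposition of "Total number of genes" (43346).
gene_categories = {
    "Protein-coding genes": 22032,
    "Long non-coding RNA genes": 6951,
    "Small non-coding RNA genes": 5853,
    "Pseudogenes": 8031,
    "IG/TR protein coding segments": 479,
}

# Assumed decomposition of "Pseudogenes" (8031); the last entry is the
# IG/TR pseudogene count, assumed to be included in this total.
pseudogene_breakdown = {
    "processed pseudogenes": 5560,
    "unprocessed pseudogenes": 2171,
    "unitary pseudogenes": 15,
    "polymorphic pseudogenes": 18,
    "pseudogenes": 178,
    "IG/TR pseudogenes": 89,
}

# Decomposition of "Protein-coding transcripts" (48482).
protein_coding_transcripts = {
    "full length protein-coding": 38578,
    "partial length protein-coding": 9904,
}

checks = {
    "genes": adds_up(gene_categories, 43346),
    "pseudogenes": adds_up(pseudogene_breakdown, 8031),
    "protein-coding transcripts": adds_up(protein_coding_transcripts, 48482),
}

for label, ok in checks.items():
    print(f"{label}: {'consistent' if ok else 'MISMATCH'}")
```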

Further details on this version's gene and transcript types

biotype                               genes  transcripts
3prime_overlapping_ncrna                  2            3
antisense                              1838         2666
IG_C_gene                                13           20
IG_C_pseudogene                           1            1
IG_D_gene                                21           22
IG_D_pseudogene                           4            4
IG_J_gene                                74           75
IG_LV_gene                              196          197
IG_V_gene                                91          132
IG_V_pseudogene                          69           69
lincRNA                                2998         4463
miRNA                                  1973         1973
misc_RNA                                590          590
Mt_rRNA                                   2            2
Mt_tRNA                                  22           22
non_stop_decay                            0            9
nonsense_mediated_decay                   0         4558
polymorphic_pseudogene                   18           23
processed_pseudogene                   5420         5421
processed_transcript                    756        13314
protein_coding                        22032        48482
pseudogene                              178          189
retained_intron                           0        14221
rRNA                                    353          353
sense_intronic                          149          161
sense_overlapping                        19           25
snoRNA                                 1530         1530
snRNA                                  1383         1383
TEC                                    1189         1249
TR_C_gene                                 2            3
TR_D_gene                                 2            2
TR_J_gene                                13           15
TR_J_pseudogene                           1            1
TR_V_gene                                67           90
TR_V_pseudogene                          14           14
transcribed_processed_pseudogene        139          142
transcribed_unprocessed_pseudogene      128          138
translated_processed_pseudogene           1           13
translated_unprocessed_pseudogene         1            1
unitary_pseudogene                       15           15
unprocessed_pseudogene                 2042         2048
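
The biotype table can be read programmatically and compared with the general stats. The Python sketch below parses the rows (whitespace-separated: biotype, genes, transcripts), sums both columns, and adds up one summary category. The grouping of biotypes into "small non-coding RNA" is an assumption inferred from the numbers, not a definition given in this file. The names BIOTYPE_TABLE, parse_table and SMALL_NCRNA are hypothetical.

```python
# Rows copied from the table above, without the header line.
BIOTYPE_TABLE = """\
3prime_overlapping_ncrna 2 3
antisense 1838 2666
IG_C_gene 13 20
IG_C_pseudogene 1 1
IG_D_gene 21 22
IG_D_pseudogene 4 4
IG_J_gene 74 75
IG_LV_gene 196 197
IG_V_gene 91 132
IG_V_pseudogene 69 69
lincRNA 2998 4463
miRNA 1973 1973
misc_RNA 590 590
Mt_rRNA 2 2
Mt_tRNA 22 22
non_stop_decay 0 9
nonsense_mediated_decay 0 4558
polymorphic_pseudogene 18 23
processed_pseudogene 5420 5421
processed_transcript 756 13314
protein_coding 22032 48482
pseudogene 178 189
retained_intron 0 14221
rRNA 353 353
sense_intronic 149 161
sense_overlapping 19 25
snoRNA 1530 1530
snRNA 1383 1383
TEC 1189 1249
TR_C_gene 2 3
TR_D_gene 2 2
TR_J_gene 13 15
TR_J_pseudogene 1 1
TR_V_gene 67 90
TR_V_pseudogene 14 14
transcribed_processed_pseudogene 139 142
transcribed_unprocessed_pseudogene 128 138
translated_processed_pseudogene 1 13
translated_unprocessed_pseudogene 1 1
unitary_pseudogene 15 15
unprocessed_pseudogene 2042 2048
"""


def parse_table(text):
    """Map each biotype to a (genes, transcripts) tuple of integers."""
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # ignore blank or malformed lines
        biotype, genes, transcripts = parts
        rows[biotype] = (int(genes), int(transcripts))
    return rows


rows = parse_table(BIOTYPE_TABLE)
gene_total = sum(genes for genes, _ in rows.values())
transcript_total = sum(transcripts for _, transcripts in rows.values())

# Assumed membership of the "Small non-coding RNA genes" summary category.
SMALL_NCRNA = ["miRNA", "misc_RNA", "Mt_rRNA", "Mt_tRNA", "rRNA", "snoRNA", "snRNA"]
small_ncrna_genes = sum(rows[biotype][0] for biotype in SMALL_NCRNA)

print("genes:", gene_total)
print("transcripts:", transcript_total)
print("small non-coding RNA genes:", small_ncrna_genes)
```

Under these assumptions the column sums match the totals in the general stats (43346 genes and 103639 transcripts), and the seven small RNA biotypes add up to the 5853 small non-coding RNA genes reported there.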