Statistics about the GENCODE Release M1

The statistics are derived from the GTF file that contains only the annotation of the main chromosomes.

For details about how these statistics are calculated, please see the README_stats.txt file.
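
As a rough illustration of how per-biotype counts of this kind could be tallied from a GTF file, here is a minimal Python sketch. It is not the procedure described in README_stats.txt. It assumes that gene and transcript records carry "gene_type" and "transcript_type" attributes in the ninth column, and the function names and the toy records are hypothetical, made up for the example.

```python
from collections import Counter


def parse_attributes(field):
    """Parse a GTF attribute column, e.g. 'gene_id "G1"; gene_type "miRNA";'.

    Simplification: values containing ';' inside quotes are not handled.
    """
    attrs = {}
    for item in field.strip().split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        # Keep the first value if a key (such as "tag") is repeated.
        attrs.setdefault(key, value.strip().strip('"'))
    return attrs


def count_biotypes(lines):
    """Count 'gene' and 'transcript' records per biotype.

    Assumption: the biotype is stored under the attribute names
    'gene_type' and 'transcript_type'.
    """
    genes, transcripts = Counter(), Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete GTF record
        feature, attrs = cols[2], parse_attributes(cols[8])
        if feature == "gene":
            genes[attrs.get("gene_type", "unknown")] += 1
        elif feature == "transcript":
            transcripts[attrs.get("transcript_type", "unknown")] += 1
    return genes, transcripts


# Toy input, invented for this example (not real GENCODE records).
example = [
    "##description: toy input\n",
    'chr1\tX\tgene\t1\t100\t.\t+\t.\tgene_id "G1"; gene_type "protein_coding";\n',
    'chr1\tX\ttranscript\t1\t100\t.\t+\t.\tgene_id "G1"; transcript_id "T1"; '
    'gene_type "protein_coding"; transcript_type "protein_coding";\n',
    'chr1\tX\ttranscript\t1\t90\t.\t+\t.\tgene_id "G1"; transcript_id "T2"; '
    'gene_type "protein_coding"; transcript_type "retained_intron";\n',
    'chr1\tX\texon\t1\t50\t.\t+\t.\tgene_id "G1"; transcript_id "T1";\n',
    'chr2\tX\tgene\t5\t80\t.\t-\t.\tgene_id "G2"; gene_type "miRNA";\n',
]
genes, transcripts = count_biotypes(example)
```

For a real file, the same function could be given an open file handle, for example count_biotypes(open(path)), where path points to the GTF file.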

General stats

Total No of Genes 37310
Protein-coding genes 22380
Long non-coding RNA genes 3845
Small non-coding RNA genes 5395
Pseudogenes 5209
- polymorphic pseudogenes 8
- pseudogenes 5201
Immunoglobulin/T-cell receptor gene segments
- protein coding segments 481
Total No of Transcripts 95495
Protein-coding transcripts 51733
- full length protein-coding 43329
- partial length protein-coding 8404
Nonsense mediated decay transcripts 3784
Long non-coding RNA loci transcripts 5669
 
Total No of distinct translations 43187
Genes that have more than one distinct translation 9943

Further details on this version's gene and transcript types

biotype                             genes  transcripts
ambiguous_orf                           0           30
antisense                               0         1876
disrupted_domain                        0            1
IG_C_gene                              13           13
IG_D_gene                              25           25
IG_J_gene                              88           88
IG_V_gene                             355          356
lincRNA                              1273         2072
miRNA                                1577         1577
misc_RNA                              487          487
Mt_rRNA                                 2            2
Mt_tRNA                                22           22
ncrna_host                              0            3
non_coding                              0           75
nonsense_mediated_decay                 0         3784
polymorphic_pseudogene                  8           12
processed_pseudogene                    0          147
processed_transcript                 2572        12683
protein_coding                      22380        51733
pseudogene                           5201          541
retained_intron                         0        11496
retrotransposed                         0          259
rRNA                                  332          332
sense_intronic                          0           87
snoRNA                               1552         1552
snRNA                                1423         1423
TEC                                     0            5
transcribed_processed_pseudogene        0         3761
transcribed_unprocessed_pseudogene      0          795
unitary_pseudogene                      0            6
unprocessed_pseudogene                  0          252
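
The figures in the biotype table can be cross-checked against the general stats. The Python sketch below copies the table values and sums them. The grouping of biotypes into the general categories (GENE_GROUPS) is an assumption made for this example, not a definition taken from this document or from README_stats.txt; it is simply one grouping whose sums agree with the totals listed under general stats.

```python
# (genes, transcripts) per biotype, copied from the table above.
BIOTYPES = {
    "ambiguous_orf": (0, 30), "antisense": (0, 1876),
    "disrupted_domain": (0, 1), "IG_C_gene": (13, 13),
    "IG_D_gene": (25, 25), "IG_J_gene": (88, 88),
    "IG_V_gene": (355, 356), "lincRNA": (1273, 2072),
    "miRNA": (1577, 1577), "misc_RNA": (487, 487),
    "Mt_rRNA": (2, 2), "Mt_tRNA": (22, 22),
    "ncrna_host": (0, 3), "non_coding": (0, 75),
    "nonsense_mediated_decay": (0, 3784),
    "polymorphic_pseudogene": (8, 12),
    "processed_pseudogene": (0, 147),
    "processed_transcript": (2572, 12683),
    "protein_coding": (22380, 51733), "pseudogene": (5201, 541),
    "retained_intron": (0, 11496), "retrotransposed": (0, 259),
    "rRNA": (332, 332), "sense_intronic": (0, 87),
    "snoRNA": (1552, 1552), "snRNA": (1423, 1423), "TEC": (0, 5),
    "transcribed_processed_pseudogene": (0, 3761),
    "transcribed_unprocessed_pseudogene": (0, 795),
    "unitary_pseudogene": (0, 6), "unprocessed_pseudogene": (0, 252),
}

# Hypothetical grouping (an assumption of this sketch, see the text above).
GENE_GROUPS = {
    "Protein-coding genes": ["protein_coding"],
    "Long non-coding RNA genes": ["lincRNA", "processed_transcript"],
    "Small non-coding RNA genes": [
        "miRNA", "misc_RNA", "Mt_rRNA", "Mt_tRNA", "rRNA", "snoRNA", "snRNA",
    ],
    "Pseudogenes": ["polymorphic_pseudogene", "pseudogene"],
    "Immunoglobulin/T-cell receptor gene segments": [
        "IG_C_gene", "IG_D_gene", "IG_J_gene", "IG_V_gene",
    ],
}

# Column totals over all biotypes.
total_genes = sum(g for g, _ in BIOTYPES.values())
total_transcripts = sum(t for _, t in BIOTYPES.values())

# Gene count per general category under the assumed grouping.
group_totals = {
    label: sum(BIOTYPES[name][0] for name in names)
    for label, names in GENE_GROUPS.items()
}

print(total_genes, total_transcripts)  # 37310 95495
for label, count in group_totals.items():
    print(label, count)
```

Under this grouping, the column sums equal the total numbers of genes and transcripts given under general stats, and each category sum equals the corresponding gene count there.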