Statistics about GENCODE Release M3

These statistics are derived from the GTF file that contains annotation for the main chromosomes only.

For details about the calculation of these statistics please see the README_stats.txt file.
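
As a rough illustration of how per-biotype gene and transcript counts could be tallied from a GENCODE-style GTF file, here is a minimal Python sketch. It assumes that column 3 holds the feature type (gene, transcript) and that the attribute column carries gene_type and transcript_type keys. The function name count_biotypes is hypothetical, and the procedure actually used for this page is the one described in README_stats.txt, which may differ.

```python
import re
from collections import Counter

# Matches key "value" pairs in the GTF attribute column (assumed layout).
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')

def count_biotypes(lines):
    """Tally genes and transcripts per biotype from an iterable of GTF lines."""
    genes, transcripts = Counter(), Counter()
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attrs = dict(ATTR_RE.findall(fields[8]))
        if fields[2] == "gene":
            # Assumption: the gene biotype is stored under "gene_type".
            genes[attrs.get("gene_type", "unknown")] += 1
        elif fields[2] == "transcript":
            # Assumption: the transcript biotype is stored under "transcript_type".
            transcripts[attrs.get("transcript_type", "unknown")] += 1
    return genes, transcripts
```

To run it on a real file, pass an open file handle, for example count_biotypes(open("annotation.gtf")), where the file name is a placeholder.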

General stats

Total number of genes 41128
Protein-coding genes 22026
Long non-coding RNA genes 5385
Small non-coding RNA genes 5853
Pseudogenes 7388
- polymorphic pseudogenes 13
- pseudogenes 7373
Immunoglobulin/T-cell receptor gene segments
- protein coding segments 476
- pseudogenes 2
Total number of transcripts 99839
Protein-coding transcripts 47979
- full length protein-coding 38350
- partial length protein-coding 9629
Nonsense mediated decay transcripts 4382
Long non-coding RNA loci transcripts 8170
 
Total number of distinct translations 38956
Genes that have more than one distinct translation 8259
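
The figures above appear to be internally consistent, which the following short Python sketch checks. The way the categories are assumed to nest (in particular, that the 2 immunoglobulin/T-cell receptor pseudogene segments are counted within the 7388 pseudogenes) is inferred from the arithmetic and is not stated on this page.

```python
# Figures copied from the general stats above.
gene_categories = {
    "protein_coding": 22026,
    "long_non_coding_rna": 5385,
    "small_non_coding_rna": 5853,
    "pseudogenes": 7388,
    "ig_tr_protein_coding_segments": 476,
}

# Assumption: IG/TR pseudogene segments are included in the pseudogene total.
pseudogene_parts = {
    "polymorphic_pseudogenes": 13,
    "pseudogenes": 7373,
    "ig_tr_pseudogene_segments": 2,
}

protein_coding_transcripts = {
    "full_length": 38350,
    "partial_length": 9629,
}

print(sum(gene_categories.values()))             # 41128, the total number of genes
print(sum(pseudogene_parts.values()))            # 7388, the pseudogene total
print(sum(protein_coding_transcripts.values()))  # 47979, protein-coding transcripts
```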

Further details on this version's gene and transcript types

biotype                             genes   transcripts
3prime_overlapping_ncrna                2             3
antisense                            1731          2511
IG_C_gene                              12            14
IG_D_gene                              25            25
IG_J_gene                              88            88
IG_LV_gene                            304           305
IG_V_gene                               2             3
IG_V_pseudogene                         1             1
lincRNA                              2769          4101
miRNA                                1973          1973
misc_RNA                              590           590
Mt_rRNA                                 2             2
Mt_tRNA                                22            22
non_stop_decay                          0             7
nonsense_mediated_decay                 0          4382
polymorphic_pseudogene                 13            14
processed_pseudogene                    0          5031
processed_transcript                  738         13343
protein_coding                      22026         47979
pseudogene                           7373           199
retained_intron                         0         13568
rRNA                                  353           353
sense_intronic                        129           139
sense_overlapping                      16            36
snoRNA                               1530          1530
snRNA                                1383          1383
TR_V_gene                              45            62
TR_V_pseudogene                         1             1
transcribed_processed_pseudogene        0           134
transcribed_unprocessed_pseudogene      0           131
translated_processed_pseudogene         0            13
translated_unprocessed_pseudogene       0             1
unitary_pseudogene                      0            16
unprocessed_pseudogene                  0          1879
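
The per-biotype rows can be related back to the general stats with a short Python sketch. The column totals below follow directly from the table. The grouping of biotypes into long non-coding, small non-coding and immunoglobulin/T-cell receptor segment categories is an assumption, chosen because it reproduces the general stats figures; the authoritative definitions are in README_stats.txt.

```python
# Rows copied from the table above: biotype, genes, transcripts.
TABLE = """
3prime_overlapping_ncrna 2 3
antisense 1731 2511
IG_C_gene 12 14
IG_D_gene 25 25
IG_J_gene 88 88
IG_LV_gene 304 305
IG_V_gene 2 3
IG_V_pseudogene 1 1
lincRNA 2769 4101
miRNA 1973 1973
misc_RNA 590 590
Mt_rRNA 2 2
Mt_tRNA 22 22
non_stop_decay 0 7
nonsense_mediated_decay 0 4382
polymorphic_pseudogene 13 14
processed_pseudogene 0 5031
processed_transcript 738 13343
protein_coding 22026 47979
pseudogene 7373 199
retained_intron 0 13568
rRNA 353 353
sense_intronic 129 139
sense_overlapping 16 36
snoRNA 1530 1530
snRNA 1383 1383
TR_V_gene 45 62
TR_V_pseudogene 1 1
transcribed_processed_pseudogene 0 134
transcribed_unprocessed_pseudogene 0 131
translated_processed_pseudogene 0 13
translated_unprocessed_pseudogene 0 1
unitary_pseudogene 0 16
unprocessed_pseudogene 0 1879
"""

# Parse each row into biotype -> (genes, transcripts).
rows = {}
for line in TABLE.strip().splitlines():
    biotype, genes, transcripts = line.split()
    rows[biotype] = (int(genes), int(transcripts))

total_genes = sum(g for g, _ in rows.values())
total_transcripts = sum(t for _, t in rows.values())

# Assumed groupings (not stated on this page).
LONG_NON_CODING = ["3prime_overlapping_ncrna", "antisense", "lincRNA",
                   "processed_transcript", "sense_intronic", "sense_overlapping"]
SMALL_NON_CODING = ["miRNA", "misc_RNA", "Mt_rRNA", "Mt_tRNA",
                    "rRNA", "snoRNA", "snRNA"]
IG_TR_SEGMENTS = ["IG_C_gene", "IG_D_gene", "IG_J_gene",
                  "IG_LV_gene", "IG_V_gene", "TR_V_gene"]

def genes_in(group):
    """Sum the gene counts of the listed biotypes."""
    return sum(rows[b][0] for b in group)

print(total_genes, total_transcripts)  # 41128 99839
print(genes_in(LONG_NON_CODING))       # 5385
print(genes_in(SMALL_NON_CODING))      # 5853
print(genes_in(IG_TR_SEGMENTS))        # 476
```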