What do the different biotypes in Ensembl mean?

The Ensembl automatic annotation system classifies genes and transcripts into biotypes including: protein_coding, pseudogene, processed_pseudogene, miRNA, rRNA, scRNA, snoRNA, snRNA.

For human, mouse and selected other species, we incorporate manual annotation from Havana. For genes and transcripts that include manual annotation, we display the manually assigned biotype. The full list of Havana biotypes can be found here.

The biotypes can be grouped into protein coding, pseudogene, long noncoding and short noncoding. Examples of biotypes in each group are as follows:

  1. Protein coding: IG_C_gene, IG_D_gene, IG_gene, IG_J_gene, IG_LV_gene, IG_M_gene, IG_V_gene, IG_Z_gene, nonsense_mediated_decay, nontranslating_CDS, non_stop_decay, polymorphic, polymorphic_pseudogene, protein_coding, TR_C_gene, TR_D_gene, TR_gene, TR_J_gene, TR_V_gene
  2. Pseudogene: disrupted_domain, IG_C_pseudogene, IG_J_pseudogene, IG_pseudogene, IG_V_pseudogene, processed_pseudogene, pseudogene, transcribed_processed_pseudogene, transcribed_unitary_pseudogene, transcribed_unprocessed_pseudogene, translated_processed_pseudogene, TR_J_pseudogene, TR_pseudogene, TR_V_pseudogene, unitary_pseudogene, unprocessed_pseudogene
  3. Long noncoding: 3prime_overlapping_ncrna, ambiguous_orf, antisense, antisense_RNA, lincRNA, ncrna_host, non_coding, processed_transcript, retained_intron, sense_intronic, sense_overlapping
  4. Short noncoding: miRNA, miRNA_pseudogene, misc_RNA, misc_RNA_pseudogene, Mt_rRNA, Mt_tRNA, Mt_tRNA_pseudogene, ncRNA, ncRNA_pseudogene, rRNA, rRNA_pseudogene, scRNA, scRNA_pseudogene, snlRNA, snoRNA, snoRNA_pseudogene, snRNA, snRNA_pseudogene, tRNA, tRNA_pseudogene
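
The grouping above can be sketched as a simple lookup table, for example when summarising the biotypes in a list of genes or transcripts. This is a minimal illustration, not part of any Ensembl API: the names BIOTYPE_GROUPS, BIOTYPE_TO_GROUP and biotype_group are hypothetical, and the table holds only the example biotypes listed above, so a biotype missing from it returns None.

```python
# Biotype groups copied from the examples listed above.
# Assumption: the list gives examples, so this table is illustrative
# and not an exhaustive catalogue of Ensembl/Havana biotypes.
BIOTYPE_GROUPS = {
    "Protein coding": [
        "IG_C_gene", "IG_D_gene", "IG_gene", "IG_J_gene", "IG_LV_gene",
        "IG_M_gene", "IG_V_gene", "IG_Z_gene", "nonsense_mediated_decay",
        "nontranslating_CDS", "non_stop_decay", "polymorphic",
        "polymorphic_pseudogene", "protein_coding", "TR_C_gene",
        "TR_D_gene", "TR_gene", "TR_J_gene", "TR_V_gene",
    ],
    "Pseudogene": [
        "disrupted_domain", "IG_C_pseudogene", "IG_J_pseudogene",
        "IG_pseudogene", "IG_V_pseudogene", "processed_pseudogene",
        "pseudogene", "transcribed_processed_pseudogene",
        "transcribed_unitary_pseudogene",
        "transcribed_unprocessed_pseudogene",
        "translated_processed_pseudogene", "TR_J_pseudogene",
        "TR_pseudogene", "TR_V_pseudogene", "unitary_pseudogene",
        "unprocessed_pseudogene",
    ],
    "Long noncoding": [
        "3prime_overlapping_ncrna", "ambiguous_orf", "antisense",
        "antisense_RNA", "lincRNA", "ncrna_host", "non_coding",
        "processed_transcript", "retained_intron", "sense_intronic",
        "sense_overlapping",
    ],
    "Short noncoding": [
        "miRNA", "miRNA_pseudogene", "misc_RNA", "misc_RNA_pseudogene",
        "Mt_rRNA", "Mt_tRNA", "Mt_tRNA_pseudogene", "ncRNA",
        "ncRNA_pseudogene", "rRNA", "rRNA_pseudogene", "scRNA",
        "scRNA_pseudogene", "snlRNA", "snoRNA", "snoRNA_pseudogene",
        "snRNA", "snRNA_pseudogene", "tRNA", "tRNA_pseudogene",
    ],
}

# Invert the table into a flat biotype -> group lookup.
BIOTYPE_TO_GROUP = {
    biotype: group
    for group, biotypes in BIOTYPE_GROUPS.items()
    for biotype in biotypes
}


def biotype_group(biotype):
    """Return the group name for a biotype, or None if it is not listed.

    Matching is exact and case-sensitive, because the biotype names
    themselves mix upper and lower case (e.g. Mt_rRNA, IG_C_gene).
    """
    return BIOTYPE_TO_GROUP.get(biotype)


if __name__ == "__main__":
    for name in ("protein_coding", "lincRNA", "snoRNA", "unknown_type"):
        print(name, "->", biotype_group(name))
```

Note that biotypes ending in _pseudogene do not all fall in the Pseudogene group: in the list above, polymorphic_pseudogene is grouped with protein coding and miRNA_pseudogene with short noncoding, which is why the sketch uses an explicit table rather than matching on the name suffix.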

If you have any other questions about Ensembl, please do not hesitate to contact our HelpDesk. You may also like to subscribe to the developers' mailing list.