Annotation views are separated into gene-based views and transcript-based views according to which level the information is more appropriately associated with. This view is a transcript level view. To flip between the two sets of views you can click on the Gene and Transcript tabs in the menu bar at the top of the page.
The table shows all splice variants for a gene, including non-coding transcripts. Each transcript ID includes a unique, stable 11-digit number. Human transcript IDs begin with ENST (for example, ENST00000369985). For other species, a three-letter code is inserted (for example, ENSMUST identifies a mouse transcript).
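As a rough illustration of the ID layout described above, the following Python sketch splits a transcript ID into its optional species code and its 11-digit number. The regular expression is inferred only from the examples on this page (ENST and ENSMUST), and the function name `parse_transcript_id` is hypothetical; it is not part of any Ensembl tool, and it does not handle version suffixes.

```python
import re

# Assumed layout, inferred from the examples on this page:
# "ENS" + optional three-letter species code + "T" + 11 digits.
TRANSCRIPT_ID = re.compile(r"^ENS([A-Z]{3})?T(\d{11})$")


def parse_transcript_id(stable_id):
    """Return the species code and number of a transcript ID, or None.

    The species code is None for IDs without a three-letter code,
    which this page describes as human transcripts.
    """
    match = TRANSCRIPT_ID.match(stable_id)
    if match is None:
        return None
    species_code, number = match.groups()
    return {"species_code": species_code, "number": number}


if __name__ == "__main__":
    print(parse_transcript_id("ENST00000369985"))
    print(parse_transcript_id("ENSMUST00000000001"))
```

An ID that does not fit the pattern, such as a gene ID, yields `None` rather than raising an error, which makes the function convenient for filtering mixed lists of identifiers.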
If the transcript is a member of the Consensus CoDing Sequence set, the CCDS ID is listed in the transcript table.
The transcript which you are viewing is highlighted in blue in the table.
Immediately below the transcript table, you will find additional information about the transcript you are viewing. This includes:
- Statistics - Number of exons, transcript length in basepairs, and translation length (number of amino acids in the protein)
- CCDS - If the transcript has a Consensus Coding Sequence, the CCDS ID will be listed.
- Type - Both the transcript status and the biotype are shown. More information about this is below.
- Prediction method - The original source of the transcript (the Ensembl annotation pipeline or the VEGA/Havana manual annotation project).
- Alternative transcripts - Matching transcripts, for example the original VEGA/Havana transcript identifier.
- Frameshift introns - Frameshift introns are short introns of 1, 2, 4, or 5 basepairs that the Ensembl genebuild introduces in order to fit the cDNA sequence to the genome. If any are present, they are indicated below the transcript diagram.
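To make the frameshift intron definition concrete, here is a minimal Python sketch that finds introns of 1, 2, 4, or 5 basepairs between consecutive exons. It assumes exons are given as (start, end) pairs with 1-based inclusive coordinates on a single strand; the function name and input format are hypothetical and are not how the Ensembl genebuild itself represents transcripts.

```python
# Intron lengths that this page describes as frameshift introns.
FRAMESHIFT_INTRON_LENGTHS = {1, 2, 4, 5}


def frameshift_introns(exons):
    """Return (intron_start, intron_end, length) for each frameshift intron.

    `exons` is a list of (start, end) pairs, 1-based and inclusive
    (an assumed input format for this illustration).
    """
    ordered = sorted(exons)
    found = []
    # Walk over each pair of neighbouring exons and measure the gap.
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        intron_start = prev_end + 1
        intron_end = next_start - 1
        length = intron_end - intron_start + 1
        if length in FRAMESHIFT_INTRON_LENGTHS:
            found.append((intron_start, intron_end, length))
    return found


if __name__ == "__main__":
    example_exons = [(100, 200), (203, 300), (1000, 1100), (1104, 1200)]
    print(frameshift_introns(example_exons))
```

In the example, only the 2 bp gap between the first two exons is reported. The 3 bp gap between the last two is not, because a gap of that length does not shift the reading frame.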
In the transcript diagram, boxes are exons and the lines connecting them are introns. Filled boxes are coding sequence, and unfilled boxes are UTR (UnTranslated Region).
TRANSCRIPT NAMES AND COLOURS (protein coding)
- A red transcript comes from either the Ensembl automatic annotation pipeline or manual curation by the VEGA/Havana project.
- A transcript from the Ensembl annotation pipeline has a number beginning with 2 (for example, MYO6-201) in the transcript name.
- A transcript with Vega/Havana manual curation has a number beginning with 0 (for example, MYO6-001) in the transcript name.
- A gold, or merged, transcript is identical between Ensembl automated annotation and VEGA/Havana manual curation. Only human, mouse, and zebrafish have gold transcripts. A gold transcript can be thought of as stable (unlikely to change). It is assigned a number beginning with 0.
- A blue, pink or grey transcript is non-coding. See the 'NON-CODING TRANSCRIPTS' section below for more.
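The naming convention in the list above can be sketched as a small Python helper that reads the number after the hyphen in a transcript name. This is only an illustration of the rule as stated on this page; the function name and the returned labels are hypothetical. Because manually curated and merged transcripts both receive a number beginning with 0, the name alone cannot tell them apart.

```python
def annotation_source(transcript_name):
    """Guess the annotation source from a name such as 'MYO6-201'.

    Follows the convention described on this page: a number beginning
    with 2 indicates the Ensembl annotation pipeline, and a number
    beginning with 0 indicates VEGA/Havana manual curation or a merged
    (gold) transcript.
    """
    # Take the text after the last hyphen, e.g. "201" from "MYO6-201".
    _, _, suffix = transcript_name.rpartition("-")
    if not suffix.isdigit():
        return "unknown"
    if suffix.startswith("2"):
        return "Ensembl annotation pipeline"
    if suffix.startswith("0"):
        return "VEGA/Havana manual curation or merged"
    return "unknown"


if __name__ == "__main__":
    for name in ("MYO6-201", "MYO6-001", "MYO6"):
        print(name, "->", annotation_source(name))
```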
WHICH TRANSCRIPT (protein coding)?
Gold (merged) transcripts and those with a CCDS ID are both reviewed, high-quality sets of transcripts in human and mouse.
Depending on factors such as cell type or tissue type, you may need to use one or more of the transcripts not in these 'reviewed' sets (i.e. transcripts with no CCDS ID that are not in the merged set). The general identifiers link at the left of the transcript tab shows matching IDs in other databases, and may help you decide on transcripts. ESTs and expression data from various projects can be turned on in the Location tab, Region in Detail view, which may help when determining which transcripts to use.
NON-CODING TRANSCRIPTS
Most of our non-coding transcripts (e.g. nonsense-mediated decay, processed transcript) are annotated by the VEGA/Havana project and are coloured blue, pink, or grey. Descriptions can be found on the VEGA/Havana website or in the Ensembl glossary.
- A processed transcript is a noncoding transcript that does not contain an open reading frame (ORF). This type of transcript is annotated by VEGA/Havana.
- Nonsense-mediated decay indicates that the transcript undergoes nonsense-mediated decay, a process which detects nonsense mutations and prevents the expression of truncated or erroneous proteins. This type of transcript is annotated by VEGA/Havana.
- Transcribed pseudogenes and other non-coding transcripts. These types are annotated by VEGA/Havana manual curation and the Ensembl annotation pipeline.
KNOWN AND NOVEL STATUS
- A known transcript has a sequence match in a sequence repository external to Ensembl for the same species.
- A novel transcript has a sequence match outside Ensembl in a different species (that is, the transcript is novel for this species).
For more detail on Ensembl annotation, see articles listed here.
If the transcript contains a variation whose alternative allele has a frequency of at least 10% in a HapMap population and causes the loss or gain of a stop codon, the variation and the affected populations are listed.
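The listing rule for stop codon variations can be sketched in Python as a simple filter. The record layout (the `id`, `consequence`, and `alt_allele_frequencies` keys), the example variant IDs, and the function name are all assumptions made for this illustration and do not reflect how Ensembl stores variation data. The consequence labels `stop_gained` and `stop_lost` are used here as stand-ins for gain and loss of a stop codon.

```python
# Assumed labels for variations that gain or lose a stop codon.
STOP_CONSEQUENCES = {"stop_gained", "stop_lost"}

# Threshold stated on this page: at least 10% in a population.
MIN_FREQUENCY = 0.10


def reportable_stop_variants(variants):
    """Return (variant_id, populations) pairs that meet the listing rule.

    Each variant is a dict with hypothetical keys:
      "id": identifier of the variation
      "consequence": consequence label for this transcript
      "alt_allele_frequencies": {population name: alternative allele frequency}
    """
    reported = []
    for variant in variants:
        if variant["consequence"] not in STOP_CONSEQUENCES:
            continue
        # Keep only populations where the alternative allele is common enough.
        populations = sorted(
            population
            for population, frequency in variant["alt_allele_frequencies"].items()
            if frequency >= MIN_FREQUENCY
        )
        if populations:
            reported.append((variant["id"], populations))
    return reported


if __name__ == "__main__":
    example = [
        {"id": "var1", "consequence": "stop_gained",
         "alt_allele_frequencies": {"CEU": 0.12, "YRI": 0.03}},
        {"id": "var2", "consequence": "missense_variant",
         "alt_allele_frequencies": {"CEU": 0.40}},
        {"id": "var3", "consequence": "stop_lost",
         "alt_allele_frequencies": {"CEU": 0.05}},
    ]
    print(reportable_stop_variants(example))
```

In this example only `var1` would be listed, and only for the population in which its alternative allele reaches the threshold.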