Ensembl Assembly and Genebuild News

Release 90

New rodent species (multiple species)

In this release we are adding 12 new rodent species to Ensembl (including three species that were formerly available on our Pre site), and updating three existing non-murine species (guinea pig, squirrel and kangaroo rat). The new genomes are:

  • Cavia aperea (Brazilian guinea pig)
  • Chinchilla lanigera (Long-tailed chinchilla)
  • Cricetulus griseus (Chinese hamster) - 2 cell lines
  • Fukomys damarensis (Damara mole rat)
  • Heterocephalus glaber (Naked mole rat) - male and female
  • Jaculus jaculus (Lesser Egyptian jerboa)
  • Mesocricetus auratus (Golden hamster)
  • Microtus ochrogaster (Prairie vole)
  • Mus caroli (Ryukyu mouse)
  • Mus pahari (Shrew mouse)
  • Nannospalax galili (Upper Galilee mountains blind mole rat)
  • Octodon degus (Degu)
  • Peromyscus maniculatus bairdii (North American deer mouse)

New genome annotation on the pig assembly Sscrofa11.1 (Pig)

New genome annotation of the new pig assembly Sscrofa11.1 (GCA_000003025.6), based on a comprehensive set of species-specific RNA-Seq data, PacBio long reads, cDNAs and vertebrate proteins.

Mouse: update to Ensembl-Havana GENCODE gene set (Mouse)

Updated Ensembl-Havana mouse gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

Update to Ensembl-Havana human GENCODE gene set (release 27) (Human)

Updated Ensembl-Havana gene set (GENCODE release 27). This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

The human GRCh38.p10 gene annotation is also included:

The patches for GRCh38.p10 were annotated using a combination of manual annotation, annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 32,512 transcript models as part of an updated version (April 2017) of CCDS.

Mouse: updated cDNA alignments (all species)

A new cdna database will be created for e89: The latest set of cDNAs for mouse (as of April 2017) from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

New external data for the pig genome (Pig)

A new database containing all the cDNA alignments and the PacBio long-read alignments.

New RNA-Seq database for the pig annotation (Pig)

New RNA-Seq database containing the models for all the different tissue samples of PRJEB19386

Mouse: updated cDNA alignments (Mouse)

A new cdna database will be created for e90: The latest set of cDNAs for mouse (as of June 2017) from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

Human: updated cDNA alignments (Human)

A new cdna database will be created for e90: The latest set of cDNAs for human (as of June 2017) from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

Add transcript models from new RNAseq to zebrafish core set (Zebrafish)

New transcript and gene models will be added to the zebrafish core gene set that were produced from new developmental RNAseq data from WTSI.

New zebrafish pri-miRNAs (Zebrafish)

Zebrafish pri-miRNA transcript models are being imported into the zebrafish otherfeatures database for e90 so that they can be viewed in the browser.

Annotation of Horizon Chinese hamster ovary cell line assembly by Eagle Genomics (Chinese hamster CHOK1GS)

Eagle Genomics have created a gene annotation for the Horizon Chinese hamster ovary cell line assembly using the Ensembl gene annotation pipeline.

Import annotations for Mus caroli and Mus pahari (Shrew mouse, Ryukyu mouse)

Annotations have been imported for two mouse species, Mus caroli and Mus pahari.

Annotation of Chinese hamster ovary cell line assembly (Chinese hamster CriGri)

We have produced a gene annotation for the Chinese hamster ovary cell line assembly.

Release 89

Mouse: update to Ensembl-Havana GENCODE gene set (Mouse)

Updated Ensembl-Havana mouse gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

Human: updated cDNA alignments (Human)

A new cdna database will be created for e89: The latest set of cDNAs for human (as of April 2017) from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

Mouse: updated cDNA alignments (Mouse)

A new cdna database will be created for e89: The latest set of cDNAs for mouse (as of April 2017) from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

Mouse lemur lincRNA (Mouse Lemur)

Removal of three lincRNAs.

Human protein features for mappings between Ensembl proteins and PDB structures with chains (Human)

Protein features that represent the mapping between human Ensembl proteins (ENSP) and PDB protein structures (including their corresponding PDB chains) have been added to the Ensembl human core database under the "sifts_import" logic name. This data has been imported from SIFTS, which is a resource for residue-level mapping between UniProt and PDB, and from GIFTS, which is a database containing alignments between UniProt and Ensembl proteins.

Remove duplicated Genscan prediction (Platyfish)

Genscan predictions were duplicated, which made it harder to process the ab initio file from the FTP site.

Release 88

Update to Ensembl-Havana human GENCODE gene set (release 26) (Human)

Updated Ensembl-Havana gene set (GENCODE release 26). This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

The human GRCh38.p10 gene annotation is also included:

The patches for GRCh38.p10 were annotated using a combination of manual annotation, annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.

Mouse: update to Ensembl-Havana GENCODE gene set (Mouse)

Updated Ensembl-Havana mouse gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

Rat: gene set update (Rat)

The rat gene set will be updated to include the latest manual annotation from the Havana team.

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 32,514 transcript models as part of an updated version (November 2016) of CCDS

Human: updated cDNA alignments (Human)

A new cdna database will be created for e88: The latest set of cDNAs for human (as of December 2016) from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

Mouse: updated cDNA alignments (Mouse)

A new cdna database will be created for e88: The latest set of cDNAs for mouse (as of January 2017) from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

Human and mouse: Ensembl-to-RefSeq comparison attributes (Human, Mouse)

For each Ensembl transcript present in the human and mouse core db, a comparison is carried out with all overlapping RefSeq transcripts from the otherfeatures db.

Up to five comparisons are carried out (depending on whether the models are non-coding or coding):

1) Check if all exon coordinates match (all transcripts)

2) Check if transcript sequences match (all transcripts)

3) Check if the CDS exon coordinates match (coding transcripts only)

4) Check if the CDS sequences match (coding transcripts only)

5) Check if the translation sequences match (coding transcripts only)

For non-coding models, if comparisons (1) and (2) are a match then the transcripts are considered to match on the whole transcript level and the Ensembl transcript is given an attribute to say there is a match on the whole transcript level.

For coding models, if all five comparisons are true then the Ensembl transcript is given an attribute to say there is a match on the whole transcript level. Failing that, if comparisons (3), (4) and (5) are true the Ensembl transcript is given an attribute to say there is a match on the CDS level only.

The stable ids of any matching RefSeq transcripts will be stored in the value field of the Ensembl transcript attribute.
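
The decision logic described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code used in the Ensembl pipeline: the `Transcript` class, its field names, and the labels `whole_transcript` and `cds_only` are hypothetical. The fallback label `cds_only` is an assumption that follows the CDS-level match described for the RefSeq-to-Ensembl comparison in Release 83.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Exon = Tuple[int, int]  # (start, end) genomic coordinates


@dataclass
class Transcript:
    # Hypothetical container; the field names are illustrative only.
    stable_id: str
    exons: List[Exon]
    sequence: str
    cds_exons: Optional[List[Exon]] = None  # None for non-coding models
    cds_sequence: Optional[str] = None
    translation: Optional[str] = None

    @property
    def is_coding(self) -> bool:
        return self.cds_exons is not None


def compare(ensembl: Transcript, refseq: Transcript) -> Optional[str]:
    """Return 'whole_transcript', 'cds_only', or None for one transcript pair."""
    # Comparisons (1) and (2): exon coordinates and transcript sequence.
    whole = ensembl.exons == refseq.exons and ensembl.sequence == refseq.sequence
    if not ensembl.is_coding:
        return "whole_transcript" if whole else None
    # Comparisons (3), (4) and (5): CDS exons, CDS sequence, translation.
    cds = (
        ensembl.cds_exons == refseq.cds_exons
        and ensembl.cds_sequence == refseq.cds_sequence
        and ensembl.translation == refseq.translation
    )
    if whole and cds:
        return "whole_transcript"
    if cds:
        return "cds_only"
    return None


def matching_refseq_ids(
    ensembl: Transcript, overlapping: List[Transcript]
) -> Dict[str, List[str]]:
    """Group the stable ids of matching RefSeq transcripts by match level."""
    matches: Dict[str, List[str]] = {}
    for refseq in overlapping:
        level = compare(ensembl, refseq)
        if level is not None:
            matches.setdefault(level, []).append(refseq.stable_id)
    return matches
```

In this sketch the dictionary returned by `matching_refseq_ids` stands in for the attribute value field that holds the matching RefSeq stable ids.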

Updated mouse otherfeatures db: New CCDS import (Mouse)

The latest CCDS mouse set will be imported.

Vega Mouse annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated and contains the data released in Vega 68

Vega Human annotation updated (Human)

Manual annotation of human from Havana has been updated and contains the data released in Vega 68

Vega Rat annotation updated (Rat)

Manual annotation of rat from Havana has been updated and contains the data released in Vega 68

Mouse lemur lincRNA (Mouse Lemur)

Adding lincRNA models to core db

Platypus: Assembly Synonyms added (Platypus)

Genbank accessions to Ensembl sequence names to be added as synonyms to the Platypus database.

Pika: assembly name change (Pika)

Owing to the fragmentary nature of the OchPri2.0 assembly, it was necessary to arrange some scaffolds into "gene-scaffold" super-structures in order to present complete genes. The pika assembly name and related values will be changed to 'OchPri2.0-Ens' so that the name accurately reflects the fact that the genome as presented by Ensembl differs from OchPri2.0.

Release 87

Chicken: Gene set update (Chicken)

This is an update to the chicken gene set to correct UTRs and remove duplicates

Mouse: update to Ensembl-Havana GENCODE gene set (Mouse)

Updated Ensembl-Havana mouse gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

The mouse GRCm38.p5 gene annotation is also included:

The patches for GRCm38.p5 were annotated using a combination of manual annotation, annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.

Zebrafish: update to Ensembl-Havana merged gene set (Zebrafish)

Updated Ensembl-Havana zebrafish gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. 

Tilapia lincRNA (Tilapia)

lincRNAs for Tilapia will be added to the core database

Platypus lincRNA (Platypus)

lincRNAs for Platypus will be added to the core database

Opossum lincRNA (Opossum)

lincRNAs for Opossum will be added to the core database

Spotted gar lincRNA (Spotted gar)

lincRNAs for Spotted gar will be added to the core database

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set also includes 24,816 transcript models as part of an updated version (September 2016) of CCDS.

Human: updated cDNA alignments (Human)

A new cdna database will be created for e87: The latest set of cDNAs for human (as of July 2016) from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

Mouse: updated cDNA alignments (Mouse)

A new cdna database will be created for e87: The latest set of cDNAs for mouse (as of July 2016) from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

UCSC synonyms for rat and mouse (Mouse, Rat)

Addition of the UCSC sequence names as synonyms, to allow the retrieval of data using the UCSC naming convention (chr1, ...).
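
As a rough illustration of what a synonym lookup makes possible, the sketch below resolves a requested sequence name either directly or through a synonym table. The function name `resolve_seq_region` and the small tables are hypothetical toy data, not the contents or API of the Ensembl databases.

```python
from typing import Dict, Optional, Set

# Toy data for illustration only (assumed, not taken from the database).
PRIMARY_NAMES: Set[str] = {"1", "2", "X", "MT"}
UCSC_SYNONYMS: Dict[str, str] = {"chr1": "1", "chr2": "2", "chrX": "X", "chrM": "MT"}


def resolve_seq_region(
    name: str, primary_names: Set[str], synonyms: Dict[str, str]
) -> Optional[str]:
    """Return the primary sequence name for `name`, or None if it is unknown."""
    if name in primary_names:
        return name
    # Fall back to the synonym table (for example UCSC-style 'chr1').
    return synonyms.get(name)
```

With the toy tables above, a request for `chr1` and a request for `1` both resolve to the same sequence.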

New zebrafish rnaseq (Zebrafish)

New zebrafish RNASeq is being aligned to GRCz10 to produce gene models that will be viewable as separate tracks in the browser. The RNASeq data come from 18 different stages of embryonic development.

Vega Mouse annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated and contains the data released in Vega 67

Vega Zebrafish annotation updated (Zebrafish)

Manual annotation of zebrafish from Havana has been updated and contains the data released in Vega 67

RNAseq db for mouse lemur (Mouse Lemur)

A new RNAseq db for mouse lemur will be created due to few "misc" models

Release 86

Mouse Strains (all species)

Annotation and assemblies for 16 mice, produced by the Mouse Genomes Project, have been added. De novo assemblies for each strain were built from a mixture of short- and long-range Illumina libraries, optical maps, and third generation sequencing. Genes were annotated primarily by projection of the GENCODE gene set from GRCm38 to each strain. The projected annotation was refined with strain-specific RNA-seq data. The RNA-seq data were also used to find novel annotations.

We have annotated additional genomic features including: repeats, CpG islands, predicted promoter regions and BLAST alignments of UniProt proteins. We have also annotated protein domains using InterProScan.

The collection comprises one outgroup mouse (Mus spretus), three Mus musculus subspecies (castaneus, musculus and domesticus) and twelve strains of Mus musculus. The full list can be seen below:

Mus musculus 129S1/SvImJ

Mus musculus A/J

Mus musculus AKR/J

Mus musculus BALB/cJ

Mus musculus C3H/HeJ

Mus musculus C57BL/6NJ

Mus musculus CBA/J

Mus musculus DBA/2J

Mus musculus FVB/NJ

Mus musculus LP/J

Mus musculus NOD/ShiLtJ

Mus musculus NZO/HlLtJ

Mus musculus castaneus CAST/EiJ

Mus musculus domesticus WSB/EiJ

Mus musculus musculus PWK/PhJ

Mus spretus SPRET/EiJ

You can access the strains here.

Chicken new assembly and gene set (Chicken)

A new genebuild on the chicken assembly, Gallus_gallus-5.0

Macaque new assembly and genebuild (Macaque)

A new gene set for the macaque assembly Mmul_8.0.1

Mouse lemur new assembly and genebuild (Mouse Lemur)

A new gene set for the mouse lemur assembly Mmur_2.0

Anolis lizard lincRNA (Anole lizard)

lincRNAs for anole lizard will be added to the core database

Flycatcher lincRNA (Flycatcher)

lincRNAs for flycatcher will be added to the core database

Cave fish lincRNA (Cave fish)

lincRNAs for cave fish will be added to the core database

Human: updated cDNA alignments (Human)

A new cdna database will be created for e86: The latest set of cDNAs for human (as of July 2016) from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

Mouse: updated cDNA alignments (Mouse)

A new cdna database will be created for e86: The latest set of cDNAs for mouse (as of July 2016) from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

Zebrafish: update to Ensembl-Havana merged gene set (Zebrafish)

Updated Ensembl-Havana zebrafish gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. 

New zebrafish otherfeatures database (Zebrafish)

Zebrafish-specific cDNA and ESTs have been aligned to GRCz10. These are made available through the website and otherfeatures database.

Zebrafish: updated RefSeq gene import (Zebrafish)

The imported RefSeq gene set was updated in the zebrafish otherfeatures database. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, meaning that when users choose to translate the RefSeq transcripts off the reference genome, the translations may contain stop codons.
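
To illustrate why this can happen, the sketch below translates a coding sequence with the standard genetic code and reports any stop codons that occur before the final codon. It is a minimal example under stated assumptions: the function names are hypothetical, the two short sequences are invented for the demonstration, and the real import does not necessarily work this way.

```python
from itertools import product
from typing import List

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate complete codons; unknown codons (e.g. containing N) become 'X'."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


def internal_stop_positions(cds: str) -> List[int]:
    """Return 0-based codon indices of stop codons other than a terminal stop."""
    protein = translate(cds)
    body = protein[:-1] if protein.endswith("*") else protein
    return [i for i, aa in enumerate(body) if aa == "*"]


# Invented example: the cDNA-derived CDS is clean, while the same region read
# off a genome carrying a single substitution (CAA to TAA) has an internal stop.
cdna_cds = "ATGCAATGGTAA"
genomic_cds = "ATGTAATGGTAA"
```

Here `internal_stop_positions(cdna_cds)` is empty, while the genomic version reports the second codon.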

Human: updated RefSeq gene import (Human)

The imported RefSeq gene set was updated in the human otherfeatures database. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, meaning that when users choose to translate the RefSeq transcripts off the reference genome, the translations may contain stop codons.

Mouse: updated RefSeq gene import (Mouse)

The imported RefSeq gene set was updated in the mouse otherfeatures database. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, meaning that when users choose to translate the RefSeq transcripts off the reference genome, the translations may contain stop codons.

Vega Mouse annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated and contains the data released in Vega 66

Vega Zebrafish annotation updated (Zebrafish)

Manual annotation of zebrafish from Havana has been updated and contains the data released in Vega 66

Human/mouse/zebrafish: Ensembl-to-RefSeq comparison attributes (Zebrafish, Human, Mouse)

For each Ensembl transcript present in the human, mouse and zebrafish core db, a comparison is carried out with all overlapping RefSeq transcripts from the otherfeatures db.

Up to five comparisons are carried out (depending on whether the models are non-coding or coding):

1) Check if all exon coordinates match (all transcripts)

2) Check if transcript sequences match (all transcripts)

3) Check if the CDS exon coordinates match (coding transcripts only)

4) Check if the CDS sequences match (coding transcripts only)

5) Check if the translation sequences match (coding transcripts only)

For non-coding models, if comparisons (1) and (2) are a match then the transcripts are considered to match on the whole transcript level and the Ensembl transcript is given an attribute to say there is a match on the whole transcript level.

For coding models, if all five comparisons are true then the Ensembl transcript is given an attribute to say there is a match on the whole transcript level. Failing that, if comparisons (3), (4) and (5) are true the Ensembl transcript is given an attribute to say there is a match on the CDS level only.

The stable ids of any matching RefSeq transcripts will be stored in the value field of the Ensembl transcript attribute.

Mouse lemur otherfeatures database (Mouse Lemur)

A new gene set for the mouse lemur assembly, Mmur_2.0, requires new core, rnaseq and otherfeatures databases. Species-specific ESTs and cDNAs were aligned to the genome and alignments are available through the website or otherfeatures database.

Chicken otherfeatures database (Chicken)

A new genebuild on the chicken assembly, Gallus_gallus-5.0, requires chicken core, rnaseq and otherfeatures databases. Chicken-specific cDNAs and ESTs were aligned to the chicken genome and are made available through the Ensembl website and the chicken otherfeatures database.

Chicken RNAseq and Bam files (Chicken)

In addition to the gene annotation for Galgal_5.0, an rnaseq database will be released where users can view BAM files and transcript models for different tissues.

Macaque otherfeatures database (Macaque)

A new genebuild on the macaque assembly, Mmul_8.0.1, requires macaque core, rnaseq and otherfeatures databases.  Macaque-specific cDNAs and ESTs were aligned to the macaque genome and are made available through the Ensembl website and the macaque otherfeatures database.

Macaque RNAseq and Bam files (Macaque)

In addition to the gene annotation for macaque, an rnaseq database will be released where users can view BAM files and transcript models for different tissues.

Mouse lemur RNASeq and Bam files (Mouse Lemur)

In addition to the gene annotation for mouse lemur, an rnaseq database will be released where users can view BAM files and transcript models for different tissues.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set also includes 24,826 transcript models as part of an updated version (July 2016) of CCDS

Mouse: update to Ensembl-Havana GENCODE gene set (Mouse)

Updated Ensembl-Havana mouse gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

Human: transcript attributes for Refseq-genomic-to-mRNA comparison (Human)

Transcript attributes will be added for the refseq_import geneset in the human otherfeatures db. Each refseq_import transcript will have an attribute to denote whether the genomic sequence that the transcript covers matches the mRNA sequence that the transcript is based on (the sequences present in the RefSeq mRNA file).

A perfect match is defined as an alignment across the entirety of both sequences that contains no mismatches or indels. If initially there is a mismatch, the RefSeq mRNA will go through polyA clipping and the sequences will be compared again to see if a perfect match is possible post polyA clipping.

Transcripts that do not have a perfect match between the mRNA and the genomic sequence will get additional attributes to define what regions (5' UTR, CDS, 3' UTR, or 'whole transcript' if there is no CDS defined) do not align perfectly, along with a summary of the information in the alignment (match, mismatch, indel count, total indel length).
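
The two-pass check described above can be sketched as follows. This is a simplified stand-in, not the pipeline's method: it uses plain string comparison in place of a real alignment, it treats polyA clipping as the removal of a trailing run of A bases from the mRNA, and the function names and result labels are hypothetical. The per-region attributes and the alignment summary are not covered.

```python
def matches_after_polya_clipping(genomic: str, mrna: str) -> bool:
    """True if the mRNA equals the genomic sequence plus a trailing run of A's."""
    if not mrna.startswith(genomic):
        return False
    tail = mrna[len(genomic):]
    return len(tail) > 0 and set(tail) == {"A"}


def classify_mrna_match(genomic: str, mrna: str) -> str:
    """Classify one transcript as a perfect, clipped-perfect, or imperfect match."""
    genomic, mrna = genomic.upper(), mrna.upper()
    # First pass: identical sequences, so no mismatches and no indels.
    if genomic == mrna:
        return "perfect"
    # Second pass: compare again after clipping the polyA tail.
    if matches_after_polya_clipping(genomic, mrna):
        return "perfect_after_polya_clipping"
    return "imperfect"
```

Checking for an all-A suffix, rather than stripping every trailing A, avoids removing A bases that are genuinely present at the end of the genomic sequence.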

Release 85

Update to Ensembl-Havana human GENCODE gene set (release 25) (Human)

Updated Ensembl-Havana gene set (GENCODE release 25). This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

The human GRCh38.p7 gene annotation is also included:

The patches for GRCh38.p7 were annotated using a combination of manual annotation, annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.

New zebrafish rnaseq (Zebrafish)

New zebrafish RNASeq is being aligned to GRCz10 to produce gene models that will be viewable as separate tracks in the browser.

Update to Rat Ensembl-Havana gene set (Rat)

Updated Ensembl-Havana rat gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation.

Mouse: update to Ensembl-Havana GENCODE gene set (Mouse)

Updated Ensembl-Havana mouse gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

Human: GRCh38.p7 Karyotype Bands (Human)

Karyotype bands were updated in regions overlapping patches

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 31,327 transcript models as part of an updated version (February 2016) of CCDS

Human: updated cDNA alignments (Human)

A new cdna database will be created for e85: The latest set of cDNAs for human (as of June 2016) from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

Mouse: updated cDNA alignments (Mouse)

A new cdna database will be created for e85: The latest set of cDNAs for mouse (as of June 2016) from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

Correct assembly.accession meta_key for horse, elephant and Orangutan (Horse, Elephant, Orangutan)

The meta_key 'assembly.accession' will be removed from orangutan as there is a discrepancy between the assembly shown in Ensembl and the assembly accessible through any INSDC database with the assembly accession.

Synonyms will be added for horse and elephant so sequences can be accessed using INSDC sequence accessions.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set also includes 24,826 transcript models as part of an updated version (February 2016) of CCDS

Human/mouse: Ensembl-to-RefSeq comparison attributes (Human, Mouse)

For each Ensembl transcript present in the human and mouse core db, a comparison is carried out with all overlapping RefSeq transcripts from the otherfeatures db.

Up to five comparisons are carried out (depending on whether the models are non-coding or coding):

1) Check if all exon coordinates match (all transcripts)

2) Check if transcript sequences match (all transcripts)

3) Check if the CDS exon coordinates match (coding transcripts only)

4) Check if the CDS sequences match (coding transcripts only)

5) Check if the translation sequences match (coding transcripts only)

For non-coding models, if comparisons (1) and (2) are a match then the transcripts are considered to match on the whole transcript level and the Ensembl transcript is given an attribute to say there is a match on the whole transcript level.

For coding models, if all five comparisons are true then the Ensembl transcript is given an attribute to say there is a match on the whole transcript level. Failing that, if comparisons (3), (4) and (5) are true the Ensembl transcript is given an attribute to say there is a match on the CDS level only.

The stable ids of any matching RefSeq transcripts will be stored in the value field of the Ensembl transcript attribute.

Vega Mouse annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated and contains the data released in Vega 65

Vega Rat annotation updated (Rat)

Manual annotation of rat from Havana has been updated and contains the data released in Vega 65

Vega Human annotation updated (Human)

Manual annotation of human from Havana has been updated and contains the data released in Vega 65

Release 84

Mouse: update to Ensembl-Havana GENCODE gene set (Mouse)

Updated Ensembl-Havana mouse gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

Vega Mouse annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated and contains the data released in Vega 64

Vega Zebrafish annotation updated (Zebrafish)

Manual annotation of zebrafish from Havana has been updated and contains the data released in Vega 64

Human: updated cDNA alignments (Human)

A new cdna database will be created for e84: The latest set of cDNAs for human (as of Jan 2016) from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

Mouse: updated cDNA alignments (Mouse)

A new cdna database will be created for e84: The latest set of cDNAs for mouse (as of Jan 2016) from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set also includes 24,831 transcript models as part of an updated version (December 2015) of CCDS

Baboon: add lincRNA models (Olive baboon)

The baboon gene set will be updated with lincRNA data using RNAseq data

Release 83

Update to Ensembl-Havana human GENCODE gene set (release 24) (Human)

Updated Ensembl-Havana gene set (GENCODE release 24). This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

The human GRCh38.p5 gene annotation is also included:

The patches for GRCh38.p5 were annotated using a combination of manual annotation, annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.

Mouse: update to Ensembl-Havana GENCODE gene set (Mouse)

Updated Ensembl-Havana mouse gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

Human: updated cDNA alignments (Human)

A new cdna database was created for e83: The latest set of cDNAs for human (as of October 2015) from the European Nucleotide Archive and NCBI RefSeq (release 70) were aligned to the current genome using Exonerate.

Mouse: updated cDNA alignments (Mouse)

A new cdna database was created for e83: The latest set of cDNAs for mouse (as of October 2015) from the European Nucleotide Archive and NCBI RefSeq were aligned to the current genome using Exonerate.

Rat: update to Ensembl-Havana gene set (Rat)

Updated Ensembl-Havana rat gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set also includes 24,831 transcript models as part of an updated version (September 2015) of CCDS

Human: GRCh38.p5 Karyotype Bands (Human)

Karyotype bands were updated in regions overlapping patches

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 31,357 transcript models as part of an updated version (September 2015) of CCDS

Human: RefSeq-to-Ensembl model comparison (Human)

For each refseq_import transcript model present in the human otherfeatures db, a comparison is carried out with all overlapping Ensembl transcript models from the core db.

Initially the models are compared on the whole transcript level, all exons are compared in terms of genomic coordinates and the transcript sequences of the two models are also compared.

For non-coding models, if both of these comparisons match then the models are considered to match on the whole transcript level and the RefSeq model is given an attribute to say there is a match on the whole transcript level. If no overlapping Ensembl model meets the criteria the RefSeq model is given a transcript attribute to denote this.

For models where a CDS is defined there is an extra level of comparison. The coding exon coordinates, CDS and translation sequences of both models are also compared.

If all exon coordinates (coding and non-coding) and the transcript, CDS and translation sequences all match then the RefSeq model is given an attribute to say there is a match on the whole transcript level.

Failing this, a comparison is done on the coding exon coordinates, CDS and translation sequences only. If the comparisons now match, the RefSeq transcript is given an attribute to denote that there is a match on the CDS level only.

If there are still no matching Ensembl transcripts at this point the RefSeq transcript is given an attribute to denote that there is no matching Ensembl model.

All matching Ensembl models have their stable ids listed in the value field of the corresponding transcript attribute for the RefSeq model.

Human: transcript attributes for Refseq-genomic-to-mRNA comparison (Human)

Transcript attributes will be added for the refseq_import geneset in the human otherfeatures db. Each refseq_import transcript will have an attribute to denote whether the genomic sequence that the transcript covers matches the mRNA sequence that the transcript is based on (the sequences present in the RefSeq mRNA file).

A perfect match is defined as an alignment across the entirety of both sequences that contains no mismatches or indels. If initially there is a mismatch, the RefSeq mRNA will go through polyA clipping and the sequences will be compared again to see if a perfect match is possible post polyA clipping.

Transcripts that do not have a perfect match between the mRNA and the genomic sequence will get additional attributes to define what regions (5' UTR, CDS, 3' UTR, or 'whole transcript' if there is no CDS defined) do not align perfectly, along with a summary of the information in the alignment (match, mismatch, indel count, total indel length).

Human: updated RefSeq gene import (Human)

The imported RefSeq gene set was updated in the human otherfeatures database. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, meaning that when users choose to translate the RefSeq transcripts off the reference genome, the translations may contain stop codons.

Mouse: updated RefSeq gene import (Mouse)

The imported RefSeq gene set was updated in the mouse otherfeatures database. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, meaning that when users choose to translate the RefSeq transcripts off the reference genome, the translations may contain stop codons.

Vega Mouse annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated and contains the data released in Vega 63

Vega Human annotation updated (Human)

Manual annotation of human from Havana has been updated and contains the data released in Vega 63

Vega Rat annotation updated (Rat)

Manual annotation of rat from Havana has been updated and contains the data released in Vega 63

Release 82

Zebrafish developmental stage RNASeq data set (Zebrafish)

Models were built from different developmental stages and tissue samples: 2 cells, 6 hours post fertilisation, 1 day post fertilisation (dpf), 2 dpf, 3 dpf, 5 dpf, ovary, male head, female head, male body, female body. We provide the alignment BAM files, the intron supporting evidence and the gene models.

Human: updated cDNA alignments (Human)

A new cdna database was created for e82: The latest set of cDNAs for human (as of June 2015) from the European Nucleotide Archive and NCBI RefSeq (release 70) were aligned to the current genome using Exonerate.

Mouse: updated cDNA alignments (Mouse)

A new cdna database was created for e82: The latest set of cDNAs for mouse (as of June 2015) from the European Nucleotide Archive and NCBI RefSeq (release 70) were aligned to the current genome using Exonerate.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set also includes 23,830 transcript models as part of an updated version (May 2015) of CCDS

Vega Mouse annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated and contains the data released in Vega 62

Mouse: update to Ensembl-Havana GENCODE gene set (Mouse)

Updated Ensembl-Havana mouse gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

Stable id mapping for GRCz10 (Zebrafish)

Stable id events table for zebrafish is missing entries. This needs to be fixed for gene, transcript and translation tables

Mouse XREFs cleanup (Mouse)

The XREFs for mouse need to be cleaned up. We only want to keep Ens%, OTT%, Vega%, HGNC and LRG XREFs.

Rat XREFs cleanup (Rat)

The XREFs for rat need to be cleaned up. We only want to keep Ens%, OTT%, Vega%, HGNC and LRG XREFs.
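The keep-list in the two cleanup items above reads like a set of SQL LIKE patterns, where % stands for any run of characters. Assuming that interpretation, and assuming the patterns apply to external database names (the text does not say which field they match), a Python sketch of the filter could look like this. The sample names are made up for illustration.

```python
import re

# Patterns quoted in the news item; '%' is treated as the SQL LIKE wildcard.
KEEP_PATTERNS = ["Ens%", "OTT%", "Vega%", "HGNC", "LRG"]


def like_to_regex(pattern: str) -> "re.Pattern":
    """Convert a SQL LIKE pattern ('%' and '_') to a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any run of characters, including none
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


KEEP_REGEXES = [like_to_regex(p) for p in KEEP_PATTERNS]


def keep_xref_source(name: str) -> bool:
    """True if the name matches one of the keep patterns (case-sensitive)."""
    return any(rx.fullmatch(name) for rx in KEEP_REGEXES)


# Hypothetical names, for illustration only.
sample = ["Ens_demo", "OTT_demo", "Vega_demo", "HGNC", "LRG", "Other_demo"]
print([name for name in sample if keep_xref_source(name)])
```

Matching here is case-sensitive, which is a choice made for the sketch; a database collation may compare LIKE patterns case-insensitively.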

Release 81

Update to Ensembl-Havana human GENCODE gene set (release 23) (Human)

Updated Ensembl-Havana gene set (GENCODE release 23). This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

The human GRCh38.p3 gene annotation is also included:

The patches for GRCh38.p3 were annotated using a combination of manual annotation, annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.

Mouse: assembly updated to GRCm38.p4 (Mouse)

The mouse genome assembly was updated to GRCm38.p4 and the assembly information in all mouse databases has been altered accordingly. This minor assembly update contains 30 assembly patches. The DNA sequence for the primary assembly (chromosomes, unlocalized scaffolds and unplaced scaffolds) remains unchanged.

Remove Amazon molly empty transcripts (Amazon molly)

There are some empty transcripts for Amazon Molly that need to be removed

Amazon molly: fix Genscan predictions (Amazon molly)

There are some problems with Amazon molly Genscan predictions that prevent the annotation from being dumped. These need to be fixed.

Human: GRCh38.p3 Karyotype Bands (Human)

Karyotype bands were updated in regions overlapping patches

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 31,359 transcript models as part of an updated version (June 2015) of CCDS

Zebrafish: import clone data (Zebrafish)

Zebrafish clones were imported from the NCBI clone database. The tracks for the clones can be found under "Clones and misc regions" in the configuration menu, while the coordinates for the BAC ends can be found as tracks under "Simple features", also in the configuration menu.

Mouse: GRCm38.p4 Karyotype Bands (Mouse)

Karyotype bands were updated in regions overlapping patches

Update to Ensembl-Havana mouse GENCODE gene set (Mouse)

Updated Ensembl-Havana mouse gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

The mouse GRCm38.p4 gene annotation is also included:

The patches for GRCm38.p4 were annotated using a combination of annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.

Mouse: updated cDNA alignments (Mouse)

A new cdna database was created for e80: The latest set of cDNAs for mouse (as of Month 2015) from the European Nucleotide Archive and NCBI RefSeq (release nn) were aligned to the current genome using Exonerate.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set also includes 23,830 transcript models as part of an updated version (May 2015) of CCDS

Correction of DMD transcript in Dog (Dog)

A transcript in the DMD gene is missing an exon. We will fix the transcript.

Vega Mouse annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated and contains the data released in Vega 61

Vega Human annotation updated (Human)

Manual annotation of human from Havana has been updated and contains the data released in Vega 61

Human: updated RefSeq gene import (Human)

The imported RefSeq gene set was updated in the human otherfeatures database. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, so when users translate the RefSeq transcripts from the reference genome, the translations may contain stop codons.

Mouse: updated RefSeq gene import (Mouse)

The imported RefSeq gene set was updated in the mouse otherfeatures database. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, so when users translate the RefSeq transcripts from the reference genome, the translations may contain stop codons.

Human: RefSeq-to-Ensembl model comparison (Human)

For each refseq_import transcript model present in the human otherfeatures db, a comparison is carried out with all overlapping Ensembl transcript models from the core db.

Initially the models are compared on the whole transcript level: all exons are compared in terms of genomic coordinates, and the transcript sequences of the two models are also compared.

For non-coding models, if both of these comparisons match then the models are considered to match on the whole transcript level and the RefSeq model is given an attribute to say there is a match on the whole transcript level. If no overlapping Ensembl model meets the criteria the RefSeq model is given a transcript attribute to denote this.

For models where a CDS is defined there is an extra level of comparison. The coding exon coordinates, CDS and translation sequences of both models are also compared.

If all exon coordinates (coding and non-coding) and the transcript, CDS and translation sequences match, then the RefSeq model is given an attribute to say there is a match on the whole transcript level.

Failing this, a comparison is done on the coding exon coordinates, CDS and translation sequences only. If these comparisons match, the RefSeq transcript is given an attribute to denote that there is a match on the CDS level only.

If there are still no matching Ensembl transcripts at this point the RefSeq transcript is given an attribute to denote that there is no matching Ensembl model.

All matching Ensembl models have their stable ids listed in the value field of the corresponding transcript attribute for the RefSeq model.
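The comparison cascade described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the production code: the Model class, the result labels and the function names are hypothetical, and the sketch assumes that whole-transcript matches take priority over CDS-only matches across all overlapping models.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Exon = Tuple[int, int]  # (genomic start, genomic end)


@dataclass
class Model:
    stable_id: str
    exons: List[Exon]
    transcript_seq: str
    coding_exons: Optional[List[Exon]] = None   # None for non-coding models
    cds_seq: Optional[str] = None
    translation_seq: Optional[str] = None

    @property
    def is_coding(self) -> bool:
        return self.cds_seq is not None


def whole_level_match(refseq: Model, ensembl: Model) -> bool:
    """All exon coordinates and the transcript sequence agree."""
    return (refseq.exons == ensembl.exons
            and refseq.transcript_seq == ensembl.transcript_seq)


def cds_level_match(refseq: Model, ensembl: Model) -> bool:
    """Coding exon coordinates, CDS and translation sequences agree."""
    return (ensembl.is_coding
            and refseq.coding_exons == ensembl.coding_exons
            and refseq.cds_seq == ensembl.cds_seq
            and refseq.translation_seq == ensembl.translation_seq)


def compare_models(refseq: Model, overlapping: List[Model]):
    """Return (label, matching Ensembl stable ids) for one RefSeq model."""
    if refseq.is_coding:
        whole = [e.stable_id for e in overlapping
                 if whole_level_match(refseq, e) and cds_level_match(refseq, e)]
        if whole:
            return "whole_transcript", whole
        cds_only = [e.stable_id for e in overlapping
                    if cds_level_match(refseq, e)]
        if cds_only:
            return "cds_only", cds_only
    else:
        whole = [e.stable_id for e in overlapping
                 if whole_level_match(refseq, e)]
        if whole:
            return "whole_transcript", whole
    return "no_match", []
```

The returned list of stable ids corresponds to what the text describes as the value field of the transcript attribute.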

Human: transcript attributes for RefSeq-genomic-to-mRNA comparison (Human)

Transcript attributes will be added for the refseq_import geneset in the human otherfeatures db. Each refseq_import transcript will have an attribute to denote whether the genomic sequence that the transcript covers matches the mRNA sequence that the transcript is based on (the sequences present in the RefSeq mRNA file).

A perfect match is defined as an alignment across the entirety of both sequences that contains no mismatches or indels. If there is initially a mismatch, the RefSeq mRNA goes through polyA clipping and the sequences are compared again to see whether a perfect match is possible after clipping.

Transcripts that do not have a perfect match between the mRNA and the genomic sequence will get additional attributes to define which regions (5' UTR, CDS, 3' UTR, or 'whole transcript' if there is no CDS defined) do not align perfectly, along with a summary of the information in the alignment (match, mismatch, indel count, total indel length).

Mouse clone import (Mouse)

Mouse clone libraries have been imported from the NCBI clone database to replace previous DAS tracks. The tracks for the clones can be found under "Clones and misc regions" in the configuration menu, while the coordinates for the BAC ends can be found as tracks under "Simple features", also in the configuration menu.

Human: updated cDNA alignments (Human)

A new cdna database was created for e80: The latest set of cDNAs for human (as of Month 2015) from the European Nucleotide Archive and NCBI RefSeq (release nn) were aligned to the current genome using Exonerate.

Upgrade remaining species to rnaseq matrix (all species)

For some species we have RNASeq data but have not yet displayed options in an RNASeq matrix for the users. This requires changes to the analysis_description, analysis_web_data and web_data tables in the ensembl_production database

Human: assembly updated to GRCh38.p3 (Human)

The human genome assembly was updated to GRCh38.p3 and the assembly information in all human databases has been altered accordingly. The DNA sequence for the primary assembly (chromosomes, unlocalized scaffolds and unplaced scaffolds) remains unchanged.

Release 80

Updated zebrafish gene annotation based on the GRCz10 assembly (Zebrafish)

This is the new gene set for the zebrafish based on the latest assembly, GRCz10.

The annotation is a merge of complete Ensembl gene models and the latest HAVANA gene annotation. The Ensembl gene set contains some models based on RNASeq data.

Updated rat gene annotation based on the Rnor_v6.0 assembly (Rat)

This is the new gene set for rat based on Rnor_v6.0. The Y chromosome has been added to the assembly. The gene set contains RNASeq-based models.

This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation.

Mouse: updated cDNA alignments (Mouse)

A new cdna database will be created for e80: The latest set of cDNAs for mouse from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

Human: updated cDNA alignments (Human)

A new cdna database will be created for e80: The latest set of cDNAs for human from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

Mouse: updated RefSeq gene import (Mouse)

The imported RefSeq gene set was updated in the mouse otherfeatures database. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, so when users translate the RefSeq transcripts from the reference genome, the translations may contain stop codons.

Human: updated RefSeq gene import (Human)

The imported RefSeq gene set was updated in the human otherfeatures database. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, so when users translate the RefSeq transcripts from the reference genome, the translations may contain stop codons.

Human: RefSeq-to-Ensembl model comparison attributes (Human)

For each refseq_import transcript model present in the human otherfeatures db, a comparison is carried out with all overlapping Ensembl transcript models from the core db.

Initially the models are compared on the whole transcript level: all exons are compared in terms of genomic coordinates, and the transcript sequences of the two models are also compared.

For non-coding models, if both of these comparisons match then the models are considered to match on the whole transcript level and the RefSeq model is given an attribute to say there is a match on the whole transcript level. If no overlapping Ensembl model meets the criteria the RefSeq model is given a transcript attribute to denote this.

For models where a CDS is defined there is an extra level of comparison. The coding exon coordinates, CDS and translation sequences of both models are also compared.

If all exon coordinates (coding and non-coding) and the transcript, CDS and translation sequences match, then the RefSeq model is given an attribute to say there is a match on the whole transcript level.

Failing this, a comparison is done on the coding exon coordinates, CDS and translation sequences only. If these comparisons match, the RefSeq transcript is given an attribute to denote that there is a match on the CDS level only.

If there are still no matching Ensembl transcripts at this point the RefSeq transcript is given an attribute to denote that there is no matching Ensembl model.

All matching Ensembl models have their stable ids listed in the value field of the corresponding transcript attribute for the RefSeq model.

Human: RefSeq-genomic-to-mRNA comparison attributes (Human)

Transcript attributes will be added for the refseq_import geneset in the human otherfeatures db. Each refseq_import transcript will have an attribute to denote whether the genomic sequence that the transcript covers matches the mRNA sequence that the transcript is based on (the sequences present in the RefSeq mRNA file).

A perfect match is defined as an alignment across the entirety of both sequences that contains no mismatches or indels. If there is initially a mismatch, the RefSeq mRNA goes through polyA clipping and the sequences are compared again to see whether a perfect match is possible after clipping.

Transcripts that do not have a perfect match between the mRNA and the genomic sequence will get additional attributes to define which regions (5' UTR, CDS, 3' UTR, or 'whole transcript' if there is no CDS defined) do not align perfectly, along with a summary of the information in the alignment (match, mismatch, indel count, total indel length).
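The four summary values named above can be derived from a pairwise alignment. The Python sketch below shows one way to do so, assuming the alignment is given as two equal-length gapped strings with '-' as the gap character. That input format, the function name and the dictionary keys are assumptions made for the example; the text does not describe how the alignment is stored.

```python
def summarise_alignment(genomic_aln: str, mrna_aln: str, gap: str = "-") -> dict:
    """Count matches, mismatches, indels and total indel length.

    An indel is a maximal run of gap columns on one side of the alignment.
    """
    if len(genomic_aln) != len(mrna_aln):
        raise ValueError("aligned sequences must have the same length")

    summary = {"match": 0, "mismatch": 0, "indel_count": 0, "total_indel_length": 0}
    previous_gap_side = None  # which sequence carried the gap in the last column

    for g, m in zip(genomic_aln.upper(), mrna_aln.upper()):
        if g == gap and m == gap:
            raise ValueError("column with gaps in both sequences")
        if g == gap or m == gap:
            side = "genomic" if g == gap else "mrna"
            if side != previous_gap_side:
                summary["indel_count"] += 1      # a new indel starts here
            summary["total_indel_length"] += 1
            previous_gap_side = side
        else:
            previous_gap_side = None
            summary["match" if g == m else "mismatch"] += 1
    return summary


# Toy alignment: one mismatch, a 2-base gap in the genomic row and a
# 1-base gap in the mRNA row.
print(summarise_alignment("ACGT--ACGTA", "ACCTGGAC-TA"))
```

Running the same function on the aligned columns of each region (5' UTR, CDS, 3' UTR) would give a per-region summary of the kind the attributes describe.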

Mouse: update to Ensembl-Havana GENCODE gene set (Mouse)

Updated Ensembl-Havana mouse gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

Vega Mouse annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated and contains the data released in Vega 60

Vega Rat annotation updated (Rat)

Manual annotation of rat from Havana has been updated and contains the data released in Vega 59

Vega Zebrafish annotation updated (all species)

Manual annotation of zebrafish from Havana has been updated and contains the data released in Vega 59

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set also includes nn,nnn transcript models as part of an updated version (March 2015) of CCDS

Release 79

Update to Ensembl-Havana GENCODE gene set (release 22) (Human)

Updated Ensembl-Havana gene set (GENCODE release 22). This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

The human GRCh38.p2 gene annotation is also included:

The patches for GRCh38.p2 were annotated using a combination of manual annotation, annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.

Human: updated RefSeq gene import (Human)

The imported RefSeq gene set was updated in the human otherfeatures database. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, so when users translate the RefSeq transcripts from the reference genome, the translations may contain stop codons.

RefSeq-to-Ensembl model comparison (Human)

For each refseq_import transcript model present in the human otherfeatures db, a comparison is carried out with all overlapping Ensembl transcript models from the core db.

Initially the models are compared on the whole transcript level: all exons are compared in terms of genomic coordinates, and the transcript sequences of the two models are also compared.

For non-coding models, if both of these comparisons match then the models are considered to match on the whole transcript level and the RefSeq model is given an attribute to say there is a match on the whole transcript level. If no overlapping Ensembl model meets the criteria the RefSeq model is given a transcript attribute to denote this.

For models where a CDS is defined there is an extra level of comparison. The coding exon coordinates, CDS and translation sequences of both models are also compared.

If all exon coordinates (coding and non-coding) and the transcript, CDS and translation sequences match, then the RefSeq model is given an attribute to say there is a match on the whole transcript level.

Failing this, a comparison is done on the coding exon coordinates, CDS and translation sequences only. If these comparisons match, the RefSeq transcript is given an attribute to denote that there is a match on the CDS level only.

If there are still no matching Ensembl transcripts at this point the RefSeq transcript is given an attribute to denote that there is no matching Ensembl model.

All matching Ensembl models have their stable ids listed in the value field of the corresponding transcript attribute for the RefSeq model.

Human: updated cDNA alignments (Human)

A new cdna database will be created for e79: The latest set of cDNAs for human from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

Mouse: updated cDNA alignments (Mouse)

A new cdna database will be created for e79: The latest set of cDNAs for mouse from the European Nucleotide Archive and NCBI RefSeq will be aligned to the current genome using Exonerate.

Human: GRCh38.p2 Karyotype Bands (Human)

Karyotype bands were updated in regions overlapping patches

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 30,446 transcript models as part of an updated version (Jan 2015) of CCDS

All species: updated RefSeq gene import (all species)

RefSeq GFF3 annotations were added to the otherfeatures databases. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, so when users translate the RefSeq transcripts from the reference genome, the translations may contain stop codons.

Vervet-AGM geneset update (Vervet-AGM)

The existing geneset will be enhanced by 3 selected non-coding vervet genes.

Transcript attributes for RefSeq-genomic-to-mRNA comparison (Human)

Transcript attributes will be added for the refseq_import geneset in the human otherfeatures db. Each refseq_import transcript will have an attribute to denote whether the genomic sequence that the transcript covers matches the mRNA sequence that the transcript is based on (the sequences present in the RefSeq mRNA file).

A perfect match is defined as an alignment across the entirety of both sequences that contains no mismatches or indels. If there is initially a mismatch, the RefSeq mRNA goes through polyA clipping and the sequences are compared again to see whether a perfect match is possible after clipping.

Transcripts that do not have a perfect match between the mRNA and the genomic sequence will get additional attributes to define which regions (5' UTR, CDS, 3' UTR, or 'whole transcript' if there is no CDS defined) do not align perfectly, along with a summary of the information in the alignment (match, mismatch, indel count, total indel length).

Vega Human annotation updated (all species)

Manual annotation of human from Havana has been updated and contains the data released in Vega 59

Tilepath for mouse (Mouse)

Clone tilepath for this assembly

Release 78

GRC alignments (Human)

GRC alignments between the primary assembly and the alternate loci added.

Update to Ensembl-Havana mouse merge (Mouse)

Updated Ensembl-Havana mouse gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

The mouse GRCm38.p3 gene annotation is also included:

The patches for GRCm38.p3 were annotated using a combination of annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.

Mouse: assembly updated to GRCm38.p3 (Mouse)

The mouse genome assembly was updated to GRCm38.p3 and the assembly information in all mouse databases has been altered accordingly. This minor assembly update contains 17 assembly patches. The DNA sequence for the primary assembly (chromosomes, unlocalized scaffolds and unplaced scaffolds) remains unchanged.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set also includes nn,nnn transcript models as part of an updated version (Month 2014) of CCDS

Mouse: GRCm38.p3 Karyotype Bands (Mouse)

Karyotype bands were updated in regions overlapping patches

Merge species: updated RefSeq gene import (all species)

RefSeq GFF3 annotations were added to the otherfeatures databases. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, so when users translate the RefSeq transcripts from the reference genome, the translations may contain stop codons. In this release we will also include immunoglobulin genes, tRNAscan-SE source data and MT genes.

Mouse: updated cDNA alignments (Mouse)

A new cdna database was created for e78: The latest set of cDNAs for mouse (as of Month 2014) from the European Nucleotide Archive and NCBI RefSeq (release nn) were aligned to the current genome using Exonerate.

Human: updated cDNA alignments (Human)

A new cdna database was created for e78: The latest set of cDNAs for human (as of Month 2014) from the European Nucleotide Archive and NCBI RefSeq (release nn) were aligned to the current genome using Exonerate.

Vega Mouse annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated and contains the data released in Vega 58

Vega pig annotation updated (Pig)

Manual annotation of pig from Havana has been updated and contains the data released in Vega 57

Vervet-AGM Maker gene annotation for Otherfeatures (Vervet-AGM)

A gene set made for Vervet-AGM by WashU using the Maker software package has been added to the database.

Release 77

Vervet Monkey assembly and genebuild (Vervet-AGM)

The Vervet Monkey assembly ChlSab1.1 was released. We have produced new gene annotation on this assembly.

Vervet Monkey RNASeq database and Bam files (Vervet-AGM)

In addition to the gene annotation for ChlSab1.1, an rnaseq database will be released where users can view BAM files and transcript models.

Update to Ensembl-Havana rat merge (Rat)

Updated Ensembl-Havana rat gene set; this is the first merge for rat. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation.

Human: updated RefSeq gene import (Human)

The imported RefSeq gene set was updated in the human otherfeatures database. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, so when users translate the RefSeq transcripts from the reference genome, the translations may contain stop codons.

Update to Ensembl-Havana GENCODE gene set (release 21) (Human)

Updated Ensembl-Havana gene set (GENCODE release 21). This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

Transcript Support Levels (Human, Mouse)

Transcript Support Levels (TSLs) were imported from UCSC. TSLs for human are based on the GENCODE 20 gene set. TSLs for mouse are based on the GENCODE M2 gene set.

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 30,493 transcript models as part of an updated version (August 2014) of CCDS.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set also includes 23,861 transcript models as part of an updated version (August 2014) of CCDS

Merge species: updated RefSeq gene import (Zebrafish, Human, Mouse, Rat, Pig)

RefSeq GFF3 annotations from human, mouse, rat, zebrafish and pig were added to their respective otherfeatures databases. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, so when users translate the RefSeq transcripts from the reference genome, the translations may contain stop codons.

Human: updated cDNA alignments (all species)

A new cdna database was created for e77: The latest set of cDNAs for human (as of August 2014) from the European Nucleotide Archive and NCBI RefSeq (release nn) were aligned to the current genome using Exonerate.

Mouse: updated cDNA alignments (all species)

A new cdna database was created for e77: The latest set of cDNAs for mouse (as of August 2014) from the European Nucleotide Archive and NCBI RefSeq (release nn) were aligned to the current genome using Exonerate.

Vervet Monkey otherfeatures database (Vervet-AGM)

EnsEMBL longest translations from human have been aligned to the Vervet Monkey assembly ChlSab1.1 to generate gene models, which are made available through the website and otherfeatures database.

Amazon Molly genebuild (Amazon molly)

We have made an improved geneset for the Amazon molly compared to the previous release (e76)

Vega Human annotation updated (Human)

Manual annotation of human from Havana has been updated and contains the data released in Vega 57

Vega Rat annotation included (Rat)

Manual annotation of rat from Havana is included. This represents the data released in Vega 57.

APPRIS tags (Human, Mouse)

APPRIS labels were imported for human and mouse. APPRIS is a system that deploys a range of computational methods to provide value to the annotations of the human and mouse genomes. APPRIS also selects one of the CDS for each gene as the principal isoform. APPRIS defines the principal variant by combining protein structural and functional information and information from the conservation of related species.

Clones for sheep (Sheep)

Clone track for the sheep assembly

New RNASeq data matrix configuration (all species)

The configuration panel for the RNASeq data is changed to a matrix to easily turn on/off tracks together.

Release 76

Update to Ensembl-Havana GENCODE gene set (release 20) (Human)

Updated Ensembl-Havana gene set (GENCODE release 20). This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

The human GRCh38 gene annotation is also included.

Human: assembly updated to GRCh38 (Human)

The human genome assembly was updated to GRCh38 and the assembly information in all human databases has been altered accordingly. It consists of 24 chromosomes (1-22, X and Y), 127 unplaced scaffolds and 42 unlocalized scaffolds. This major assembly update contains 261 alt loci scaffolds (including haplotypes for the MHC region on chromosome 6 and LRC region on chromosome 19), in 35 alternate assembly units. 72 of these alternate loci were previously available as NOVEL patches to GRCh37.

Olive baboon new assembly and genebuild (Olive baboon)

We have produced a new set of gene annotations for the Panu_2.0 assembly.

Human: updated RefSeq gene import (Human)

The imported RefSeq gene set was updated in the human otherfeatures database. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, so when users translate the RefSeq transcripts from the reference genome, the translations may contain stop codons.

Mouse: updated RefSeq gene import (Mouse)

The imported RefSeq gene set was updated in the mouse otherfeatures database. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, so when users translate the RefSeq transcripts from the reference genome, the translations may contain stop codons.

Update to Ensembl-Havana mouse merge (Mouse)

Updated Ensembl-Havana mouse gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

The mouse GRCm38.p2 gene annotation is also included:

The patches for GRCm38.p2 were annotated using a combination of annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.

Amazon molly new assembly and genebuild (Amazon molly)

We have produced a new set of gene annotations for the PoeFor_5.1.2 assembly

Amazon molly RNASeq database and Bam files (Amazon molly)

In addition to the gene annotation for PoeFor_5.1.2, an rnaseq database will be released where users can view BAM files and transcript models for 11 tissues.

Human: GRCh38 Karyotype Bands (Human)

Karyotype bands were updated in all regions.

Olive baboon Otherfeatures database (Olive baboon)

Baboon-specific cDNA and ESTs along with EnsEMBL longest translations from human have been aligned to Panu_2.0. These are made available through the website and otherfeatures database.

Olive baboon RNASeq database and Bam files (Olive baboon)

In addition to the gene annotation for Panu_2.0, an rnaseq database will be released where users can view BAM files and transcript models for 14 tissues including thymus, liver, lung, heart and pituitary.

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 29,033 transcript models as part of an updated version (April 2014) of CCDS which has been projected from assembly GRCh37.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set also includes 23,868 transcript models as part of an updated version (April 2014) of CCDS

All species: updated RefSeq sequence synonyms (all species)

The imported RefSeq sequence synonyms have been updated for all species.

Rat: Imported ncRNAs (Rat)

We have updated the set of ncRNAs for rat.

Human: updated cDNA alignments (all species)

A new cdna database was created for e75: The latest set of cDNAs for human (as of December 2013) from the European Nucleotide Archive and NCBI RefSeq (release nn) were aligned to the current genome using Exonerate.

Mouse: updated cDNA alignments (Mouse)

A new cdna database was created for e75: The latest set of cDNAs for mouse (as of January 2013) from the European Nucleotide Archive and NCBI RefSeq (release 62) were aligned to the current genome using Exonerate.

Amazon molly Otherfeatures database (Amazon molly)

EnsEMBL longest translations from human, zebrafish and stickleback have been aligned to PoeFor_5.1.2. These are made available through the website and otherfeatures database.

Remove sequences and associated genes not in the assembly (Turkey)

Remove 275 sequences from the turkey assembly as they are not part of the official assembly, GCA_000146605.1.

Four genes will be deleted: ENSMGAG00000016986, ENSMGAG00000017366, ENSMGAG00000016551, ENSMGAG00000016615

Updated MT for sea squirt (C.intestinalis)

The MT was updated for sea squirt: the mitochondrion annotation has been updated to NC_004447.2.

Vega Mouse annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated and contains the data released in Vega 56.

Markers from UniSTS (Sheep)

Import of the markers from SheepMap4.7 and CAB Ovine Linkage Map

Vega Human annotation updated (Human)

Manual annotation of human from Havana has been updated and contains the data released in Vega 56 (GRCh38).

Platyfish MT contig removal (Platyfish)

The Platyfish assembly has contigs from the mitochondrial chromosome labelled as toplevel sequence. The MT contigs do not form the MT scaffold in the platyfish assembly as stored in the database, and will be deleted.

Release 75

Human: updated RefSeq gene import (Human)

The imported RefSeq gene set was updated in the human otherfeatures database. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, so when users choose to translate the RefSeq transcripts off the reference genome, the translations may contain stop codons.
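The effect described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two sequences are made up, the helper names (`translate`, `internal_stops`) are hypothetical and not part of any Ensembl or RefSeq tool, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )


def internal_stops(protein):
    """Return 0-based positions of stops that occur before the final residue."""
    return [i for i, aa in enumerate(protein[:-1]) if aa == "*"]


# Made-up sequences: the CDS as annotated on the cDNA, and the same region
# read off a genome that differs from the cDNA at one base.
cdna_cds = "ATGGCCTGGTAA"
genomic_cds = "ATGGCCTGATAA"

cdna_protein = translate(cdna_cds)
genomic_protein = translate(genomic_cds)
```

With these made-up sequences, the translation derived from the cDNA has no internal stop, while the one read off the genome does, which is the situation the note warns about.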

Havana merge for Zebrafish (Zebrafish)

An updated set of Zebrafish genes is included in this release. There hasn't been a new Ensembl genebuild but the Havana annotations have been updated and we therefore re-ran the merge of the two gene sets.

This is also the first merge using a new merge pipeline that has been in development since the beginning of the year. The new software delivers slightly different results, but the bulk of the annotations are the same. The biggest difference is an increase in merged protein coding genes by about 960 (compared to running the older pipeline on the same input data).

Merged genes and transcripts can be fetched using 'source' column (Zebrafish, Human, Mouse, Pig)

From this release we will introduce a new use for the gene.source column and the new transcript.source column. These columns will now indicate whether genes have been annotated by both Ensembl and Havana ('ensembl_havana'), Ensembl only ('ensembl'), or Havana only ('havana'). This will feed into BioMart to make it easier for users to fetch genes and transcripts from only the annotation sources they are interested in. In release 74 and earlier releases, this information could be found using the analysis.logic_name. Note: An additional source, 'insdc', is used for genes and transcripts on the mitochondrial chromosome because they are imported from the MT GenBank file.
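As a rough illustration of how the source values might be used in a script, the sketch below filters gene records by source. The record layout, the identifiers and the `genes_from` helper are hypothetical and are not part of the Ensembl API or BioMart; only the four source values come from the note above.

```python
# Source values described in the release note above.
KNOWN_SOURCES = {"ensembl_havana", "ensembl", "havana", "insdc"}


def genes_from(genes, wanted):
    """Return the records whose 'source' is one of the wanted values.

    `genes` is an iterable of dicts with a 'source' key (a hypothetical
    layout chosen for this example).
    """
    wanted = set(wanted)
    unknown = wanted - KNOWN_SOURCES
    if unknown:
        raise ValueError(f"unknown source value(s): {sorted(unknown)}")
    return [g for g in genes if g["source"] in wanted]


# Made-up identifiers, for illustration only.
example_genes = [
    {"stable_id": "GENE_A", "source": "ensembl_havana"},
    {"stable_id": "GENE_B", "source": "ensembl"},
    {"stable_id": "GENE_C", "source": "havana"},
    {"stable_id": "GENE_MT", "source": "insdc"},
]

# Keep only genes that carry manual (Havana) annotation.
manual = genes_from(example_genes, ["ensembl_havana", "havana"])
```

Rejecting unknown values early is a design choice for the sketch: a misspelt source would otherwise silently return an empty list.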

Mouse: updated cDNA alignments (Mouse)

A new cdna database was created for e75: The latest set of cDNAs for mouse (as of January 2013) from the European Nucleotide Archive and NCBI RefSeq (release 62) were aligned to the current genome using Exonerate.

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 29,033 transcript models as part of an updated version (December 2013) of CCDS

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set also includes 23,084 transcript models as part of an updated version (Dec 2013) of CCDS

Human: updated cDNA alignments (Human)

A new cdna database was created for e75: The latest set of cDNAs for human (as of December 2013) from the European Nucleotide Archive and NCBI RefSeq (release nn) were aligned to the current genome using Exonerate.

New C.elegans core database (WS240) (Caenorhabditis elegans)

The C.elegans reference annotation will be updated to WormBase version WS240. The reference genome will remain the same (version WBcel235). 

Mouse RNAseq database (Mouse)

This is the RNAseq database for mouse. It contains models built using Sanger RNAseq data.

Vega Zebrafish annotation updated (Zebrafish)

Manual annotation of zebrafish from Havana has been updated and contains the data released in Vega 55.

Correcting spelling mistakes in sheep (Sheep)

Changing Merino to Gansu fine wool sheep

Correcting spelling mistakes in the tissue samples

Ghrelin gene added (Chinese softshell turtle)

New data has been provided by the P. sinensis community that has allowed us to annotate the ghrelin gene.

Deleted transcripts with stop codons (Marmoset)

We deleted 15 single-transcript genes and an additional 6 transcripts from multi-transcript genes where the translation had stop codons.

Deleted ENSRNOG00000042244 (Rat)

One gene, reported by a user, has been deleted from the rat gene set.

Deleted transcript ENSSSCT00000011005 (Pig)

Transcript ENSSSCT00000011005 was deleted. There remains an overlapping transcript within the same gene that has been manually annotated by Havana.

Release 74

Cave fish assembly and genebuild (Cave fish)

The Cave fish assembly AstMex102 was released. We have produced new gene annotation on this assembly.

Cave fish RNASeq database and Bam files (Cave fish)

In addition to the gene annotation for AstMex102, an rnaseq database will be released where users can view BAM files and transcript models.

Sheep new assembly and genebuild (Sheep)

Genome annotation of the new sheep assembly Oar_v3.1

Sheep RNASeq database and BAM files (Sheep)

Rnaseq database and alignment bam files for the new sheep assembly Oar_v3.1

Spotted gar assembly and genebuild (Spotted gar)

The Spotted Gar assembly LepOcu1 was released. We have produced new gene annotation on this assembly.

Spotted gar RNASeq database and Bam files (Spotted gar)

In addition to the gene annotation for LepOcu1, an rnaseq database will be released where users can view BAM files and transcript models.

Update to Ensembl-Havana mouse merge (Mouse)

Updated Ensembl-Havana mouse gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

The mouse GRCm38.p2 gene annotation is also included:

The patches for GRCm38.p2 were annotated using a combination of annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.

Mouse: assembly updated to GRCm38.p2 (Mouse)

The mouse genome assembly was updated to GRCm38.p2 and the assembly information in all mouse databases has been altered accordingly. This minor assembly update contains 14 assembly patches. The DNA sequence for the primary assembly (chromosomes, unlocalized scaffolds and unplaced scaffolds) remains unchanged.

Armadillo assembly and genebuild (Armadillo)

The Armadillo assembly Dasnov3.0 was released. We have produced new gene annotation on this assembly.

Armadillo RNASeq database and Bam files (Armadillo)

In addition to the gene annotation for Dasnov3.0, an rnaseq database will be released where users can view BAM files and transcript models for a number of tissues including colon, cerebellum, heart, kidney, liver, lung, quadriceps and spleen.

Update to Ensembl-Havana GENCODE gene set (release 19) (Human)

Updated Ensembl-Havana gene set (GENCODE release 19). This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

The human GRCh37.p13 gene annotation is also included:

The patches for GRCh37.p13 were annotated using a combination of manual annotation, annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.

Human: assembly updated to GRCh37.p13 (Human)

The human genome assembly was updated to GRCh37.p13 and the assembly information in all human databases has been altered accordingly. This minor assembly update contains 204 assembly patches. The DNA sequence for the primary assembly (chromosomes 1-22, X, Y, unlocalized scaffolds and unplaced scaffolds) remains unchanged.

Mouse: updated cDNA alignments (Mouse)

A new cdna database was created for e74: The latest set of cDNAs for mouse (as of October 2013) from the European Nucleotide Archive and NCBI RefSeq (release 61) were aligned to the current genome using Exonerate.

Human: updated cDNA alignments (Human)

A new cdna database was created for e74: The latest set of cDNAs for human (as of October 2013) from the European Nucleotide Archive and NCBI RefSeq (release 61) were aligned to the current genome using Exonerate.

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 27,732 transcript models as part of an updated version (Aug 2013) of CCDS

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set also includes 23,087 transcript models as part of an updated version (Aug 2013) of CCDS

Mouse: GRCm38.p2 Karyotype Bands (Mouse)

Karyotype bands were updated in regions overlapping patches

Human: GRCh37.p13 Karyotype Bands (Human)

Karyotype bands were updated in regions overlapping patches

Cave fish Otherfeatures database (Cave fish)

Cave fish EnsEMBL longest translations from zebrafish and stickleback have been aligned to AstMex102 to generate gene models and are made available through the website and otherfeatures database.

Sheep otherfeatures database (Sheep)

Other features database for the new sheep assembly Oar_v3.1 containing species specific cDNA and EST alignments

Spotted gar Otherfeatures database (Spotted gar)

Spotted gar EnsEMBL longest translations from zebrafish have been aligned to LepOcu1 to generate gene models and are made available through the website and otherfeatures database.

Armadillo Otherfeatures database (Armadillo)

EnsEMBL longest translations from human have been aligned to Armadillo (Dasnov3.0) to generate gene models and are made available through the website and otherfeatures database.

Vega Mouse annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated and contains the data released in Vega 54.

Vega Human annotation updated (Human)

Manual annotation of human from Havana has been updated and contains the data released in Vega release 54.

Vega Pig annotation updated (Pig)

Manual annotation of pig from Havana has been updated and contains the data released in Vega release 54.

Update to chromosome Z (Chicken)

The contig sequence AC186840.3 will be replaced with AC186840.2 for scaffold JH375087.1 for chromosome Z. This affects the gene annotation and alignments on the regions.

Update to chromosome Z (Chicken)

The contig sequence AC186840.3 will be replaced with AC186840.2 for scaffold JH375087.1 for chromosome Z. This affects the alignments on the regions.

Update to chromosome Z (Chicken)

BAM files updated due to the sequence change for chromosome Z

Name change to scaffold underlying MT (Guinea Pig)

The sequence for the guinea pig MT has not changed, however we have changed the name of its underlying scaffold from AY172335 to NC_000884.1

Release 73

Update to Ensembl-Havana GENCODE gene set (release 18) (Human)

Updated Ensembl-Havana gene set (GENCODE release 18). This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

The human GRCh37.p12 gene annotation is also included:

The patches for GRCh37.p12 were annotated using a combination of manual annotation, annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.

Human: assembly updated to GRCh37.p12 (Human)

The human genome assembly was updated to GRCh37.p12 and the assembly information in all human databases has been altered accordingly. This minor assembly update contains 194 assembly patches. The DNA sequence for the primary assembly (chromosomes 1-22, X, Y, unlocalized scaffolds and unplaced scaffolds) remains unchanged.

Havana merge for Zebrafish (Zebrafish)

An updated set of Zebrafish genes will be released. There hasn't been a new Ensembl genebuild but the Havana annotations have been updated and we therefore re-ran the merge of the two gene sets.

Human: updated RefSeq gene import (Human)

The imported RefSeq gene set was updated in the human otherfeatures database. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, so when users choose to translate the RefSeq transcripts off the reference genome, the translations may contain stop codons.

Duck genebuild (Duck)

Duck is a new species in Ensembl as of release 73. Here we provide gene annotation on the genome assembly BGI_duck_1.0. This assembly and annotation were available through our Pre! site for some time and they are now available through our main site.

Mouse: updated RefSeq gene import (Mouse)

The imported RefSeq gene set was updated in the mouse otherfeatures database. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, so when users choose to translate the RefSeq transcripts off the reference genome, the translations may contain stop codons.

Human: GRCh37.p12 Karyotype Bands (Human)

Karyotype bands were updated in regions overlapping patches

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 27,747 transcript models as part of an updated version (Jul 2013) of CCDS

Flycatcher new assembly and genebuild (Flycatcher)

The flycatcher assembly FicAlb_1.4 was released. We have produced new gene annotation on this assembly.

Flycatcher Otherfeatures database (Flycatcher)

Flycatcher EnsEMBL longest translations from chicken and zebra finch have been aligned to FicAlb_1.4 to generate gene models and are made available through the website and otherfeatures database.

Flycatcher RNASeq database and Bam files (Flycatcher)

In addition to the gene annotation for FicAlb_1.4, an rnaseq database will be released where users can view BAM files and transcript models for a number of tissues including embryo, liver, heart, kidney and brain.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set also includes 22,983 transcript models as part of an updated version (Jul 2013) of CCDS

Mouse: updated cDNA alignments (Mouse)

A new cdna database was created for e73: The latest set of cDNAs for mouse (as of July 2013) from the European Nucleotide Archive and NCBI RefSeq (release 59) were aligned to the current genome using Exonerate.

Human: updated cDNA alignments (Human)

A new cdna database was created for e73: The latest set of cDNAs for human (as of July 2013) from the European Nucleotide Archive and NCBI RefSeq (release 56) were aligned to the current genome using Exonerate.

Rabbit gene set updated using RNAseq data (Rabbit)

RNAseq data has been used to update the protein-coding gene set.

Vega Human annotation updated (Human)

Manual annotation of human from Havana has been updated and contains the data released in Vega release 53.

Vega Zebrafish annotation updated (Zebrafish)

Manual annotation of zebrafish from Havana has been updated and contains the data released in Vega 53.

Rabbit RNASeq database and Bam files (Rabbit)

In addition to the new core DB for Rabbit, an rnaseq database will be released where users can view BAM files and transcript models for a number of tissues including testis, liver, heart, kidney and brain.

Fixed RNASeq ftp broken links from the Location view (multiple species)

Fixed FTP broken links available from the analysis description for RNASeq alignments.

Release 72

Update to Ensembl-Havana GENCODE gene set (release 17) (Human)

Updated Ensembl-Havana gene set (GENCODE release 17). This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

The human GRCh37.p11 gene annotation is also included:

The patches for GRCh37.p11 were annotated using a combination of manual annotation, annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.

Human: assembly updated to GRCh37.p11 (Human)

The human genome assembly was updated to GRCh37.p11 and the assembly information in all human databases has been altered accordingly. This minor assembly update contains 187 assembly patches. The DNA sequence for the primary assembly (chromosomes 1-22, X, Y, unlocalized scaffolds and unplaced scaffolds) remains unchanged.

Human: GRCh37.p11 Karyotype Bands (Human)

Karyotype bands were updated in regions overlapping patches

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 27,440 transcript models as part of an updated version (Mar 2013) of CCDS

Mouse: updated cDNA alignments (Mouse)

A new cdna database was created for e72: The latest set of cDNAs for mouse (as of May 2013) from the European Nucleotide Archive and NCBI RefSeq (release 59) were aligned to the current genome using Exonerate.

Human: updated cDNA alignments (Human)

A new cdna database was created for e72: The latest set of cDNAs for human (as of May 2013) from the European Nucleotide Archive and NCBI RefSeq (release 59) were aligned to the current genome using Exonerate.

New rat otherfeatures DB (Rat)

The new rat otherfeatures database fixes an issue where the gene and transcript models for cDNAs and ESTs were missing. In addition, the EST genebuilder pipeline has been run to create 'estgenes'.

Vega Human annotation updated (Human)

Manual annotation of human from Havana has been updated and contains the data released in Vega release 52.

Vega Mouse annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated and contains the data released in Vega release 52.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set includes new transcript models as part of an updated version of CCDS

Mouse: gene set updated (Mouse)

The Ensembl/HAVANA merged gene set was updated to incorporate the latest manual annotation from the HAVANA team.

Fixing zebrafish RNASeq data (Zebrafish)

Adding the missing RNASeq alignment for zebrafish

Beta cell transcriptome (Human)

Anonymised beta cell transcriptome data have been made available.

Chicken RNASeq database and Bam files (Chicken)

An updated rnaseq database will be released where liver tissue transcript models for Gallus_gallus_4.0 will be added to the existing rnaseq models.

Added missing mitochondrion to several species (multiple species)

The mitochondrial sequence and annotation has been added for several species

Release 71

Update to Ensembl-Havana GENCODE gene set (release 16) (Human)

Updated Ensembl-Havana gene set (GENCODE release 16). This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

Added Kidney RNASeq models and intron supporting features (Human)

Due to computational problems, models and supporting evidence for the kidney were missing from the RNASeq database. They have now been added.

Havana merge for Zebrafish (Zebrafish)

An updated merged set of Zebrafish genes will be released.

New assembly and genebuild (Chicken) (Chicken)

The chicken assembly was updated to Gallus_gallus_4.0. We have produced new gene annotation on this assembly.

Chicken Otherfeatures database (Chicken)

Chicken cDNA and EST sequences have been aligned to the new Gallus_gallus_4.0 assembly and are made available through the website and otherfeatures database.

Chicken RNASeq database and Bam files (Chicken)

In addition to the gene annotation for Gallus_gallus_4.0, an rnaseq database will be released where users can view BAM files and transcript models for a number of tissues including embryo, liver, heart, breast and brain.

Mitochondrial sequence NC_001323.1 (Chicken)

Chicken mitochondrial sequence and annotation has been imported from the RefSeq record NC_001323.1.

Human: updated cDNA alignments (Human)

A new cdna database was created for e71: The latest set of cDNAs for human (as of Feb 2013) from the European Nucleotide Archive and NCBI RefSeq (release 57) were aligned to the current genome using Exonerate.

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 27,480 transcript models as part of an updated version (Jan 2013) of CCDS

Mouse: updated cDNA alignments (Mouse)

A new cdna database was created for e70: The latest set of cDNAs for mouse (as of Nov 2012) from the European Nucleotide Archive and NCBI RefSeq (release 57) were aligned to the current genome using Exonerate.

Anole gene set updated using RNAseq data (Anole lizard)

RNAseq provided to us by the Anolis Genome Project has been used to update the protein-coding gene set.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set includes new transcript models as part of an updated version of CCDS

Vega Zebrafish annotation updated (Zebrafish)

Manual annotation of zebrafish from Havana has been updated and contains the data released in Vega 51.

Vega human annotation updated (Human)

Manual annotation of human from Havana has been updated and contains the data released in Vega release 51.

Anole RNASeq database and Bam files (Anole lizard)

In addition to the new core DB for Anole, an rnaseq database will be released where users can view BAM files and transcript models for a number of tissues including embryo, liver, heart, dewlap and brain.

Adding Anole Genome Project annotation to the Otherfeatures Database (Anole lizard)

We will add Anole Genome Project annotation to the Otherfeatures database.

Added missing intron supporting feature (Gibbon)

The intron supporting features are now displayable on the web site

Fixing mitochondrion karyotype rank (Horse)

The mitochondrion is now visible on the horse's karyotype.

Add missing MT genes for Anole (Anole lizard)

Three MT genes that were missing from the Anole annotation have been added.

INSDC accession added (multiple species)

INSDC accessions were added as synonyms for chromosomes.

Release 70

Renamed biotype: retrotransposed to processed_pseudogene (all species)

For all species, genes with the biotype 'retrotransposed' will now have the biotype 'processed_pseudogene'.
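Scripts that still hold biotype labels from earlier releases may need to map the old name to the new one. A minimal sketch, assuming a plain dictionary lookup is sufficient; the `current_biotype` helper is hypothetical and not part of the Ensembl API:

```python
# Biotype rename described in the note above (old name -> new name).
BIOTYPE_RENAMES = {"retrotransposed": "processed_pseudogene"}


def current_biotype(biotype):
    """Return the current name for a biotype, leaving other names unchanged."""
    return BIOTYPE_RENAMES.get(biotype, biotype)


# Labels as they might appear in data exported from an older release.
old_labels = ["protein_coding", "retrotransposed", "processed_pseudogene"]
new_labels = [current_biotype(b) for b in old_labels]
```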

Update to Ensembl-Havana GENCODE gene set (release 15) (Human)

Updated Ensembl-Havana gene set (GENCODE release 15). This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

The human GRCh37.p10 gene annotation is also included:

The patches for GRCh37.p10 were annotated using a combination of manual annotation, annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.

Update to Ensembl-Havana mouse gene set (Mouse)

Updated Ensembl-Havana gene set including the first GRC minor release annotation (GRCm38.p1). This gene set is a merge of complete mouse Ensembl gene models and the latest mouse Havana gene annotation. All CCDS genes are included in this gene set.

Update to the Human BodyMap - RNASeq database with associated BAM files (Human)

The Human BodyMap will be reanalysed using the recently updated RNASeq pipeline. PLEASE NOTE: gene and transcript model identifiers are not stable identifiers and therefore have not been mapped between release 70 and previous releases.

Gene models and introns for kidney are not available for release 70 but will be added for a future release. Kidney data are included in the merged BAM file.

Human: assembly updated to GRCh37.p10 (Human)

The human genome assembly was updated to GRCh37.p10 and the assembly information in all human databases has been altered accordingly. This minor assembly update contains 182 assembly patches. The DNA sequence for the primary assembly (chromosomes 1-22, X, Y, unlocalized scaffolds and unplaced scaffolds) remains unchanged.

New assembly and genebuild (Rat)

The rat assembly was updated to Rnor_5.0. We have produced new gene annotation on this assembly. Please note that QTLs are not yet available on the new assembly. The QTLs will be added in a future Ensembl release.

Otherfeatures database (Rat)

Rat cDNA and EST sequences have been aligned to the new Rnor_5.0 assembly and are made available through the website and otherfeatures database.

Mitochondrial sequence NC_001665.2 (Rat)

Rat mitochondrial sequence and annotation has been imported from the RefSeq record NC_001665.2.

New assembly and genebuild (Cat) (Cat)

The cat assembly was updated to Felis_catus-6.2. We have produced new gene annotation on this assembly.

Added cat mitochondrial sequence and genes (Cat)

Added mitochondrion sequence NC_001700.1

Gibbon gene set updated using RNAseq data (Gibbon)

RNAseq provided to us by the Gibbon Consortium has been used to update the protein-coding gene set.

Human: GRCh37.p10 Karyotype Bands (Human)

Karyotype bands were updated in regions overlapping patches

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 26,387 transcript models as part of an updated version (Oct 2012) of CCDS

Vega mouse annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated and contains the data released in Vega release 50.

Vega human annotation updated (Human)

Manual annotation of human from Havana has been updated and contains the data released in Vega release 50.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set includes 23,011 transcript models as part of an updated version (Sep 2012) of CCDS.

Human: updated cDNA alignments (Human)

A new cdna database was created for e70: The latest set of cDNAs for human (as of Nov 2012) from the European Nucleotide Archive and NCBI RefSeq (release 56) were aligned to the current genome using Exonerate. A total of 228,542 cDNAs were aligned to the genome.

Mouse: updated cDNA alignments (Mouse)

A new cdna database was created for e70: The latest set of cDNAs for mouse (as of Nov 2012) from the European Nucleotide Archive and NCBI RefSeq (release 56) were aligned to the current genome using Exonerate. A total of 248,527 cDNAs were aligned to the genome.

Updated human otherfeatures database (Human)

Update due to GRCh37 patch 10

Updated mouse otherfeatures database (Mouse)

Update due to GRCm38 patch 1

RNAseq data now available (Chimpanzee)

Chimpanzee RNAseq data from the Nonhuman Primate Reference Transcriptome Resource has been aligned to the current assembly. RNAseq-based gene models and intron features are available in the updated rnaseq database.

Added horse mitochondrial sequence and genes (Horse)

The sequence added is NC_001640.1.

RNAseq data available (Gibbon)

RNASeq-based models and intron features were predicted using the lymphoblastoid transcriptome data provided by the Gibbon Consortium.

Updated chimpanzee mitochondrial sequence and genes (Chimpanzee)

The sequence was updated to NC_001643.1.

Vega pig gene names (Pig)

The 'DUROC-' prefix present on the names of the Vega genes in the chromosome 7 MHC region has been removed, so for example DUROC-MIC-1 is now named MIC-1.

Updated coelacanth mitochondrial genes (Coelacanth)

Added missing genes from the mitochondrion

Release 69

New species: Ferret other features DB (Ferret)

We are releasing the first ferret assembly, MusPutFur1.0 from the Broad Institute. In addition to the gene annotation, an rnaseq database will be released where users can view BAM files and transcript models for a number of tissues including: brain, heart, kidney, liver, lung, lymph, pancreas, skin, skeletal muscle, spleen, testis and trachea.

Update to Ensembl-Havana GENCODE gene set (release 14) (Human)

Updated Ensembl-Havana gene set (GENCODE release 14). This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

Human: updated cDNA alignments (Human)

A new cdna database was created for e69: The latest set of cDNAs for human (as of Sep 2012) from the European Nucleotide Archive and NCBI RefSeq (release 54) were aligned to the current genome using Exonerate. A total of 228,168 cDNAs were aligned to the genome.

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 26,387 transcript models as part of an updated version (Aug 2012) of CCDS

New species: Southern Platyfish (Platyfish)

The Southern Platyfish genome (Xiphophorus maculatus) has been added to Ensembl. The genebuild involved a combination of annotation approaches: the standard genebuild procedure was complemented with models generated by our RNASeq pipeline. In addition to the gene annotation, an rnaseq database has been released where users can view BAM files and transcript models for a number of samples including developmental stages. 

New species: Southern Platyfish other features DB (Platyfish)

The Southern Platyfish genome (Xiphophorus maculatus) has been added to Ensembl. The genebuild involved a combination of annotation approaches: the standard genebuild procedure was complemented with models generated by our RNASeq pipeline. In addition to the gene annotation, an rnaseq database has been released where users can view BAM files and transcript models for a number of samples including developmental stages. 

New species: Southern Platyfish RNASeq DB and BAM files (Platyfish)

The Southern Platyfish genome (Xiphophorus maculatus) has been added to Ensembl. The genebuild involved a combination of annotation approaches: the standard genebuild procedure was complemented with models generated by our RNASeq pipeline. In addition to the gene annotation, an rnaseq database has been released where users can view BAM files and transcript models for a number of samples including developmental stages. 

Pig: Ensembl-Havana gene set (Pig)

This is the first Ensembl-Havana merge for pig where Ensembl annotation is combined with  the manual annotation from the HAVANA team.

Mouse: updated cDNA alignments (Mouse)

A new cdna database was created for e69: The latest set of cDNAs for mouse (as of Mon YYYY) from the European Nucleotide Archive and NCBI RefSeq (release nn) were aligned to the current genome using Exonerate. A total of nnn,nnn cDNAs were aligned to the genome.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set includes 23,013 transcript models as part of an updated version (Aug 2012) of CCDS

New species: Ferret (Ferret)

We are releasing the first ferret assembly, MusPutFur1.0 from the Broad Institute. In addition to the gene annotation, an rnaseq database will be released where users can view BAM files and transcript models for a number of tissues including: brain, heart, kidney, liver, lung, lymph, pancreas, skin, skeletal muscle, spleen, testis and trachea.

Zebrafish VEGA Merge (Zebrafish)

Ensembl (automatic annotation) and Havana (manual annotation) merged gene set of zebrafish has been updated. This represents the annotation presented in Vega release 49.

Platypus annotation updated using RNA-Seq (Platypus)

RNA-Seq from a range of tissues has been used to update the Platypus gene set.

Vega zebrafish annotation updated (Zebrafish)

Manual annotation of zebrafish from Havana has been updated and contains the data released in Vega 49.

Vega pig annotation added (Pig)

Manual annotation of pig from Havana has been added. These data were released in Vega 49.

Vega human annotation updated (Human)

Manual annotation of human from Havana has been updated and contains the data released in Vega 49.

Opossum annotation updated with RNA-Seq (Opossum)

RNA-Seq data from a range of tissues has been used to update the Opossum gene set.

Orangutan annotation updated with RNA-Seq (Orangutan)

RNA-Seq data from a range of tissues has been used to update the Orangutan gene set.

Opossum RNA-Seq models and alignments added (Opossum)

Opossum RNA-Seq alignments and gene models have been added for a range of tissues.

Platypus RNA-Seq models and alignments added (Platypus)

Platypus RNA-Seq alignments and gene models have been added for a range of tissues

Orangutan RNA-Seq models and alignments added (Orangutan)

Orangutan RNA-Seq models and alignments for a range of tissues have been added.

Zebrafish RNA-Seq alignments added (Zebrafish)

Zebrafish RNA-Seq alignments have been added for a range of tissues.

Tasmanian Devil RNA-Seq alignments added (Tasmanian devil)

Tasmanian Devil RNA-Seq alignments have been added for a range of tissues.

Coelacanth RNA-Seq alignments added (Coelacanth)

Coelacanth RNA-Seq alignments have been added for a range of tissues

New species: Ferret RNASeq database and Bam files (all species)

We are releasing the first ferret assembly, MusPutFur1.0 from the Broad Institute. In addition to the gene annotation, an rnaseq database will be released where users can view BAM files and transcript models for a number of tissues including: brain, heart, kidney, liver, lung, lymph, pancreas, skin, skeletal muscle, spleen, testis and trachea.

Updated lizard mitochondrial sequence and genes (Anole lizard)

The assembly and seq_region tables for lizard have been updated. The mitochondrial sequence will now display correctly instead of displaying as Ns. Although there has been no change to coordinates, transcripts will no longer display as Ns and translations will no longer display as Xs.

Updated turkey mitochondrial sequence and genes (Turkey)

The sequence was updated to JF275060.1.

Mouse estgene exon stable ids (Mouse)

Adding missing mouse estgene exon stable ids

Updated C. intestinalis mitochondrial sequence and genes (C.intestinalis)

The mitochondrial sequence was updated to HT000188.1. There are no genes annotated on this new MT sequence.

Removed version from contig coordinate system (multiple species)

These changes should not change the assembly: Updated the coord_system table so that the version is NULL for the contig coordinate system. Updated the meta table so that the assembly.mapping entries for contigs do not contain a version.
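
As a rough illustration of the two updates described above, the following Python sketch clears the version on a contig coordinate-system row and strips a version from the contig component of an assembly.mapping value. The row layout, the '#' and '|' separators and the example values are assumptions made for this sketch; they are not taken from the release note.

```python
import re
from typing import Dict, Optional


def clear_contig_version(row: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    """Return a copy of a coord_system row with version set to None (NULL) for contigs."""
    updated = dict(row)
    if updated.get("name") == "contig":
        updated["version"] = None
    return updated


def strip_contig_version(mapping: str) -> str:
    """Remove ':<version>' from any contig component of an assembly.mapping value."""
    # Split on the assumed separators but keep them, so the value can be rebuilt.
    parts = re.split(r"([#|])", mapping)
    cleaned = []
    for part in parts:
        name, _, _version = part.partition(":")
        # Only the contig component loses its version; everything else is untouched.
        cleaned.append("contig" if name == "contig" else part)
    return "".join(cleaned)


# Hypothetical values, for illustration only.
print(clear_contig_version({"name": "contig", "version": "1"}))
print(strip_contig_version("chromosome:ASM1#contig:1"))
```

Neither function touches sequence coordinates, which matches the note that the assembly itself should not change.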

Imported ncRNAs (Pig)

We have imported a set of ncRNAs for pig. These were provided by the SGSC.

Release 68

Mouse: new assembly and gene set for GRCm38 (Mouse)

We have updated our mouse genome assembly to the latest GRCm38 primary assembly. A full Ensembl genebuild was done on this new assembly, and the results merged with manual annotation from the HAVANA team to provide the gene set for Ensembl release 68.

New dog assembly (Dog)

We have updated our dog genome assembly to the latest CanFam3.1. Gene annotation on this assembly involved a combination of annotation approaches: the standard genebuild procedure was complemented with models generated by our RNASeq pipeline. In addition to the gene annotation, an rnaseq database has been released where users can view BAM files and transcript models for a number of tissues including: blood, brain, heart, kidney, liver, lung, ovary, skin, skeletal muscle and testis.

Squirrel db rename (Squirrel)

Thirteen-lined ground squirrel was renamed from Spermophilus tridecemlineatus to Ictidomys tridecemlineatus. This is reflected in the new database name.

Update to Ensembl-Havana GENCODE gene set (release 13) (Human)

Updated Ensembl-Havana gene set (GENCODE release 13). This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set. Please note that, compared to GENCODE 12, genes contributed by Ensembl have been filtered to remove 7231 transcripts believed to be poorly supported.

New species: Chinese softshell turtle (Chinese softshell turtle)

The Chinese softshell turtle genome (Pelodiscus sinensis) has been added to Ensembl. The genebuild involved a combination of annotation approaches: the standard genebuild procedure was complemented with models generated by our RNASeq pipeline. In addition to the gene annotation, an rnaseq database has been released where users can view BAM files and transcript models for a number of samples including developmental stages. 

Human: updated cDNA alignments (Human)

A new cdna database was created for e68: The latest set of cDNAs for human (as of Jun 2012) from the European Nucleotide Archive and NCBI RefSeq (release 52) were aligned to the current genome using Exonerate. A total of 227,153 cDNAs were aligned to the genome.

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 26,395 transcript models as part of an updated version (Apr 2012) of CCDS

Human: GRCh37.p8 gene annotation (Human)

The patches for GRCh37.p8 were annotated using a combination of manual annotation, annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.

Human: assembly updated to GRCh37.p8 (Human)

The human genome assembly was updated to GRCh37.p8 and the assembly information in all human databases has been altered accordingly. This minor assembly update contains 140 assembly patches. The DNA sequence for the primary assembly (chromosomes 1-22, X, Y, unlocalized scaffolds and unplaced scaffolds) remains unchanged.

Human: GRCh37.p8 Karyotype Bands (Human)

Karyotype bands were updated in regions overlapping patches

Vega human annotation updated (Human)

Manual annotation of human from Havana has been updated and contains the data released in Vega 48.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set includes 22,106 transcript models as part of an updated version (April 2012) of CCDS.

Mouse: updated cDNA alignments (Mouse)

A new cdna database was created for e68: The latest set of cDNAs for mouse from the European Nucleotide Archive and NCBI RefSeq were aligned to the current genome using Exonerate. A total of 228,993 cDNAs were aligned to the genome, showing an increase of 96 cDNAs compared to release 67.

Vega mouse annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated on the new assembly GRCm38. The data represent the annotation presented in Vega release 48.

Chicken MT contig renamed (Chicken)

The chicken mitochondrial contig (seq_region_id=102842) name was updated from AY172335 to X52392.1. The underlying DNA sequence was not changed.

Release 67

New species: Nile Tilapia (all species)

The Nile Tilapia genome (Oreochromis niloticus) has been added to Ensembl. The genebuild involved a combination of annotation approaches: the standard genebuild procedure and RNA-seq.

New squirrel assembly (Squirrel)

The thirteen-lined ground squirrel gene annotation in e67 is based on the high coverage assembly SpeTri2.0 provided by the Broad Institute of MIT and Harvard.

The final gene set consists of 18826 protein-coding genes, 387 pseudogenes, 19 retrotransposed genes and 3166 ncRNAs.

Sus scrofa 10.2 (Pig)

The new assembly of the pig genome, Sscrofa10.2, from the Swine Genome Sequencing Consortium.

Coelacanth annotation updated (Coelacanth)

The Latimeria chalumnae gene set has been expanded to include gene models built using RNASeq from Latimeria menadoensis.

Update to Ensembl-Havana GENCODE gene set (release 12) (Human)

Updated Ensembl-Havana gene set (GENCODE release 12) based on updated Ensembl gene set and latest Havana gene annotation.

Human: updated cDNA alignments (Human)

A new cdna database was created for e67: The latest set of cDNAs for human (as of 21/Mar/2012) from the European Nucleotide Archive and NCBI RefSeq (release 51) were aligned to the current genome using Exonerate. A total of 221,544 cDNAs were aligned to the genome.

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 26,425 transcript models as part of an updated version (January 2012) of CCDS

Human: GRCh37.p7 gene annotation (Human)

The patches for GRCh37.p7 were annotated using a combination of manual annotation, annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.

Human: assembly updated to GRCh37.p7 (Human)

The human genome assembly was updated to GRCh37.p7 and the assembly information in all human databases has been altered accordingly. This minor assembly update contains 130 assembly patches. The DNA sequence for the primary assembly (chromosomes 1-22, X, Y, unlocalized scaffolds and unplaced scaffolds) remains unchanged.

Human: GRCh37.p7 Karyotype Bands (Human)

Karyotype bands were updated in regions overlapping patches

Vega human annotation updated (Human)

Manual annotation of human from Havana has been updated and includes annotation of the GRCh37.p5 fix and novel patches. The data is that released in Vega 47.

Vega Zebrafish annotation updated (Zebrafish)

Manual annotation of zebrafish from Havana has been updated. The data is that released in Vega 47.

Mouse: updated cDNA alignments (Mouse)

A new cdna database was created for e67: The latest set of cDNAs for mouse from the European Nucleotide Archive and NCBI RefSeq were aligned to the current genome using Exonerate.

Zebrafish Ensembl-Havana Merge (all species)

Manual annotation of zebrafish from Vega has been updated.  This represents the annotation presented in Vega release 47. 

Analyses updates (all species)

Update in analyses display for a more consistent system across species

Incorporating MT chromosome to zebra finch annotation (Zebra Finch)

Addition of mitochondrial chromosome for zebra finch gene annotation.

Release 66

Chimpanzee RNAseq data (Chimpanzee)

Chimpanzee RNAseq data were aligned to the chimpanzee 2.1.4 genome using BWA. The Ensembl RNAseq pipeline was used to generate transcript models from these reads. Both the original BAM files and the transcript models are available in the chimpanzee rnaseq database and on the website.

New species: Coelacanth (Coelacanth)

The Coelacanth genome (Latimeria chalumnae) has been added to Ensembl. The genebuild involved a combination of annotation approaches: the standard genebuild procedure and RNA-seq. The RNA-seq gene models are stored in the otherfeatures database.

 

New ciona intestinalis assembly (C.intestinalis)

The Ciona intestinalis gene annotation in e66 is based on the high coverage assembly KH provided by Kyoto University.

The final gene set consists of 16658 protein-coding genes, 27 pseudogenes and 429 ncRNAs.

Vega human annotation updated (Human)

Manual annotation of human from Havana has been updated and includes annotation of the GRCh37.p5 fix patches. The data is that released in Vega 46.

Human: updated cDNA alignments (Human)

A new cdna database was created for e66: The latest set of cDNAs for human (as of 18/Jan/2011) from the European Nucleotide Archive and NCBI RefSeq were aligned to the current genome using Exonerate. A total of 224,907 cDNAs were aligned to the genome, showing an increase of 491 cDNAs compared to release 65.

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 26,437 transcript models as part of an updated version of CCDS.

Human: GRCh37.p6 gene annotation (Human)

The patches for GRCh37.p6 were annotated using a combination of manual annotation, annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the otherfeatures database.

Human: assembly updated to GRCh37.p6 (Human)

The human genome assembly was updated to GRCh37.p6 and the assembly information in all human databases has been altered accordingly. This minor assembly update contains 124 assembly patches. The DNA sequence for the primary assembly (chromosomes 1-22, X, Y, unlocalized scaffolds and unplaced scaffolds) remains unchanged.

Human: GRCh37.p6 Karyotype Bands (Human)

Karyotype bands were updated in regions overlapping patches

Update to Ensembl-Havana GENCODE gene set (release 11) (Human)

Updated Ensembl-Havana gene set (GENCODE release 11) based on updated Ensembl gene set and latest Havana gene annotation.

Mouse: gene set updated (Mouse)

The Ensembl/HAVANA merged gene set was updated to incorporate the latest manual annotation from the HAVANA team.

Vega mouse annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated. The data represent the annotation presented in Vega release 46.

Mouse: updated cDNA alignments (Mouse)

A new cdna database was created for e66: The latest set of cDNAs for mouse  (as of 13/Jan/2011) from the European Nucleotide Archive and NCBI RefSeq were aligned to the current genome using Exonerate. A total of 227,829 cDNAs were aligned to the genome, showing an increase of 390 cDNAs compared to release 65.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set includes 22,144 transcript models as part of an updated version (Nov 2011) of CCDS.

Zfin clone names (Zebrafish)

Zfin clone names have been added to the zebrafish database as seq region synonyms.

Flagging obsolete Uniprot proteins (all species)

Flagging obsolete Uniprot proteins used as supporting evidence for the transcripts and the exons

Flagging obsolete Ensembl proteins (all species)

Flagging obsolete Human Ensembl proteins used as supporting evidence for the transcripts and the exons

Release 65

New Chimpanzee assembly (Chimpanzee)

The first genebuild on the new chimpanzee assembly, CHIMP2.1.4.

New bushbaby assembly (Bushbaby)

The bushbaby gene annotation in e65 is based on the high coverage assembly OtoGar3 provided by the Broad Institute.

The gene set for bushbaby was generated using bushbaby and primate proteins as well as human ensembl translations.

The final gene set consists of 19506 protein_coding genes, 1151 pseudogenes and 7276 ncRNAs.

More detailed information can be found here.

New species: Atlantic cod (all species)

The Atlantic cod (Gadus morhua) has been added to Ensembl. The genebuild involved a combination of annotation approaches: the standard genebuild procedure and whole-genome alignment and projection from stickleback.

Human cdna update (Human)

cDNA alignments for human were updated using the most up-to-date set of cDNAs from the European Nucleotide Archive and NCBI RefSeq.

Update human otherfeatures db: new CCDS import (Human)

Update to CCDS set for human

Projection of annotation to GRC assembly patches (Human)

Annotation from the primary assembly is projected to the assembly patches. The projected annotation is then supplemented with annotation based on evidence alignment. This annotation is stored in the human otherfeatures database.

Update to Ensembl-Havana GENCODE gene set (release 10) (Human)

Updated Ensembl-Havana gene set (GENCODE release 10) based on updated Ensembl gene set and latest Havana gene annotation.

 

Vega human annotation updated (Human)

Manual annotation of human from Havana has been updated. The data represent the annotation presented in Vega release 45.

Mouse cDNA update (Mouse)

The latest set of cDNAs for mouse (as of 14/OCT/2011) from the European Nucleotide Archive and NCBI RefSeq were aligned to the current genome using Exonerate. There are 4,216 new cDNAs and a total of 34,664 new alignments for Ensembl 65.

Zebrafish VEGA Merge (Zebrafish)

Manual annotation of zebrafish from Vega has been updated.  This represents the annotation presented in Vega release 45. 

Zebrafish Markers (Zebrafish)

Zebrafish SATMAP markers have been given a separate track so they can be easily distinguished from the other markers.

Vega zebrafish annotation updated (Zebrafish)

Manual annotation of zebrafish from Havana has been updated. The data represent the annotation presented in Vega release 45.

MT annotation for anole, elephant, panda, rabbit and turkey (Anole lizard, Elephant, Rabbit, Panda, Turkey)

MT sequences have been added to the main assembly.

Annotation for those sequences has also been provided

Flagging obsolete Uniprot proteins (all species)

Flagging obsolete Uniprot proteins used as supporting evidence for the transcripts and the exons

Flagging obsolete Ensembl proteins (all species)

Flagging obsolete Human Ensembl proteins used as supporting evidence for the transcripts and the exons

Analyses updates (all species)

Update in logic_names and descriptions for a more consistent system across species

Pfam version numbers removed (Cow, Mouse, Chimpanzee)

Pfam hit names have versions which will be removed

Assembly name update (Lamprey)

Assembly name for lamprey changed from Petromyzon_marinus_7.0 to Pmarinus_7.0

MT annotation for Tasmanian Devil (Tasmanian devil)

MT sequences have been added to the main assembly.

Annotation for those sequences has also been provided.

Karyotype bands for patches (Human)

Karyotype bands are now stored on the patches so that they can be displayed in the browser.

Release 64

Human: updated cDNA alignments (Human)

A new cdna database was created for e64: The latest set of cDNAs for human (as of 14/July/2011) from the European Nucleotide Archive and NCBI RefSeq were aligned to the current genome using Exonerate. A total of 245,557 cDNAs were aligned to the genome, showing an increase of 2,770 compared to release 63.

Human: assembly updated to GRCh37.p5 (Human)

The human genome assembly was updated to GRCh37.p5 and the assembly information in all human databases has been altered accordingly. This minor assembly update contains 105 assembly patches. The DNA sequence for the primary assembly (chromosomes 1-22, X, Y, unlocalized scaffolds and unplaced scaffolds) remains unchanged.

Human: GRCh37.p5 gene annotation (Human)

The patches for GRCh37.p5 were annotated using a combination of manual annotation, annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the otherfeatures database.

Human: gene set updated (Human)

The human gene set for e64 was updated and is equivalent to GENCODE (http://www.gencodegenes.org/) version 9. Annotation from HAVANA was retrieved from the Vega database release 44 (http://vega.sanger.ac.uk/Homo_sapiens/Info/Index). This includes the ABO gene on a GRC patch that has been annotated by HAVANA as well as the completed manual annotation of chromosome 14. This release of GENCODE contains 20900 protein-coding genes. The human gene set also includes an updated version (May 2011) of the CCDS set (http://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi), comprising 25521 transcript models.

Human: updated RefSeq gene import (Human)

The imported RefSeq gene set was updated in the human otherfeatures database. The RefSeq gene count increased from 23733 in e63 to 24435 in e64. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, so when users translate the RefSeq transcripts off the reference genome, the translations may contain stop codons.

Mouse: gene set updated (Mouse)

The Ensembl/HAVANA merged gene set was updated to incorporate the latest manual annotation from the HAVANA team. This annotation from HAVANA is also displayed in the VEGA release 44. This includes annotation of the MHC region on chromosome 17.

Mouse: cDNA alignments updated (Mouse)

A new mouse cdna database was created for e64: The latest set of cDNAs for mouse (as of 19/Jul/2011) from the European Nucleotide Archive and NCBI RefSeq were aligned to the current genome using Exonerate. There are 1,223 new cDNAs with a total of 15,200 new alignments for Ensembl 64.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set also includes 22147 transcript models as part of an updated version (May 2011) of CCDS

New species: Lamprey (Lamprey)

Petromyzon_marinus_7.0 is a new assembly of the sea lamprey (Petromyzon marinus) provided by the lamprey consortium. The gene set for lamprey was built using the Ensembl pipeline. Gene models are based on genewise alignments of lamprey proteins and other proteins from Uniprot. The protein based gene models were then extended using lamprey cDNA. To improve the accuracy of models generated from distant species, transcriptome data was used to filter out the different gene models. In addition to the coding transcript models, non-coding RNAs and pseudogenes were annotated. The final gene set consists of 6,993 projected protein coding genes containing 14,104 transcripts, 47 pseudogenes and 2,628 ncRNAs. More detailed information on the genebuild can be found here.

New assembly and genebuild: Cow UMD3.1 (Cow)

The cow gene annotation in e64 is an entirely new genebuild based on the UMD 3.1 assembly produced by the Center for Bioinformatics and Computational Biology (CBCB) at the University of Maryland.

The assembly was downloaded from here:

ftp://ftp.ncbi.nlm.nih.gov/genbank/genomes/Eukaryotes/vertebrates_mammals/Bos_taurus/Bos_taurus_UMD_3.1/

The gene annotation was generated using three different sources, namely Bos taurus proteins from UniProtKB and NCBI, UniProt mammalian and vertebrate proteins and finally translations of Ensembl human genes. The alignment of the above evidence to the genome followed procedures in the standard Ensembl genebuild pipeline. The gene-building procedure on the UMD3.1 assembly identified 19994 protein-coding genes and 797 pseudogenes. cDNA and EST sequences were aligned to the genome using Exonerate.

More detailed information about the cow genebuild can be found at http://www.ensembl.org/Bos_taurus/Info/Index

 

New species: Tasmanian devil (Tasmanian devil)

The Tasmanian devil (Sarcophilus harrisii) 7.0 assembly, provided by Illumina and the Wellcome Trust Sanger Institute, has been added as a new species to Ensembl for release 64. The Ensembl genome annotation pipeline was used to identify genes. Models built from Tasmanian devil proteins and cDNAs were given priority over predictions from other vertebrate species. 5,663 transcript models made from paired end Illumina RNASeq were added into the gene build where they added a novel model or splice variant. In total 22,391 protein-coding gene models were constructed. The final gene set consists of 18,775 protein coding genes containing 24,035 transcripts, 178 pseudogenes and 1,466 ncRNAs. More detailed information on the genebuild can be found here. An otherfeatures database containing RNASeq models is also provided.

Human: Vega annotation updated (Human)

Manual annotation of human from Havana has been updated. This represents the annotation presented in Vega release 44. Annotation by Havana of chromosome 14 has been completed.

Mouse: Vega annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated, including annotation of the MHC region on chromosome 17. The data represents the annotation presented in Vega release 44.

Updated assembly: Gorilla 3.1 (Gorilla)

The gorilla assembly was updated from gorGor3 to gorGor3.1. This new assembly uses the same underlying contigs, but the gap sizes were reduced from 1000 bases to 100 bases. Genes from the e63 release (assembly gorGor3) were projected onto the updated assembly (gorGor3.1). All genes from the e63 gene set were mapped to the new assembly. A small number of features from the 'RNASeq gene' and 'Human RefSeq/ENA cDNA' tracks were not mapped to the new assembly because their exons spanned a gap that had changed size.
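
The effect of shrinking gaps on feature coordinates can be sketched in Python as follows. This is a simplified model and not the projection code used for the release: it assumes 1-based inclusive coordinates on a single sequence, a known list of old gap intervals, and a fixed new gap length of 100 bases. The function name and the example coordinates are hypothetical.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive (start, end)


def project_exon(exon: Interval,
                 old_gaps: List[Interval],
                 new_gap_len: int = 100) -> Optional[Interval]:
    """Project an exon from the old assembly onto one whose gaps were resized.

    Returns None when the exon overlaps or spans a resized gap, since such an
    exon has no unambiguous position on the new assembly.
    """
    start, end = exon
    shift = 0
    for gap_start, gap_end in sorted(old_gaps):
        if gap_end < start:
            # Gap lies wholly upstream: it shrinks, so the exon moves left.
            shift += (gap_end - gap_start + 1) - new_gap_len
        elif gap_start <= end:
            # Gap overlaps the exon or sits inside it.
            return None
        # Gaps downstream of the exon do not affect its coordinates.
    return (start - shift, end - shift)


# Two hypothetical 1000-base gaps on the old assembly.
gaps = [(1001, 2000), (5001, 6000)]
print(project_exon((2501, 2600), gaps))  # one upstream gap, shifted by 900
print(project_exon((900, 2100), gaps))   # spans the first gap
```

Under this model an exon downstream of k resized gaps moves left by 900 × k bases, while an exon that crosses a gap is dropped, which is the behaviour the note describes for the unmapped features.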

Flagging withdrawn UniProt proteins (all species)

Across all species, 175 human Ensembl translations were flagged on 1193 transcripts and exons, using human e64 as reference. In addition, 193 UniProt proteins were flagged on 555 transcripts and exons, using the August release of UniProt as reference. Please see our blog for more details.

Flagging withdrawn Ensembl proteins (all species)

Flagged withdrawn human Ensembl proteins used as supporting evidence. Please see our blog for more details.

Removal of ambiguous bases from Takifugu rubripes (Fugu)

It is Ensembl policy to only allow AGCTN bases in the DNA table; ambiguous bases are not allowed and are changed to N. Therefore, the 'R' and 'Y' ambiguous bases on scaffold_20 were changed to 'N' for takifugu_rubripes_core_64_4.
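
A minimal Python sketch of this policy is shown below. It is an illustration only, not the script used for the release; the function name is hypothetical, and preserving the case of allowed bases is an assumption of the sketch.

```python
ALLOWED_BASES = set("ACGTN")


def mask_ambiguous(seq: str) -> str:
    """Replace every base that is not A, C, G, T or N with 'N'."""
    # Allowed bases are kept as they are (including lower case);
    # anything else, such as the IUPAC codes R or Y, becomes 'N'.
    return "".join(base if base.upper() in ALLOWED_BASES else "N" for base in seq)


print(mask_ambiguous("ACGRYTN"))
```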

Release 63

New microbat assembly (Microbat)

A full gene annotation on the new high coverage microbat assembly, Myoluc2.0

Removed duplicated dna in panda (Panda)

Scaffold dna sequences removed from the dna table

Rabbit xrefs (Rabbit)

Missing xrefs added for ncRNAs

Human Vega annotation (Human)

Manual annotation of human from Havana has been updated. This represents the annotation presented in Vega release 43. Annotation by Havana of chromosome 12 has been completed and that of chromosome 14 is nearly finished.

 

Zebrafish Vega annotation (Zebrafish)

Manual annotation of zebrafish from Havana has been updated. This represents the annotation presented in Vega release 43. There is an increase of about 50Mb of annotated sequence, with an extra 1300 annotated loci.

Human cDNA update (Human)

The latest set of cDNAs for human (as of 05/May/2011) from the European Nucleotide Archive and NCBI RefSeq were aligned to the current genome using Exonerate. There are 1104 new cDNA alignments for Ensembl 63.

Mouse cDNA update (Mouse)

The latest set of cDNAs for mouse (as of 05/May/2011) from the European Nucleotide Archive and NCBI RefSeq were aligned to the current genome using Exonerate. There are 1009 new cDNA alignments for Ensembl 63.

Update to Ensembl-Havana GENCODE gene set (release 8) (Human)

Update to Ensembl-Havana GENCODE gene set (release 8) - this is based on updated Ensembl gene set and latest Havana gene annotation.

Flagging obsolete Uniprot proteins (all species)

Flag the obsolete proteins in Uniprot used as supporting evidence. 122 proteins used as evidence were flagged as obsolete for this release.

Flagging obsolete Ensembl proteins (all species)

Flag obsolete human Ensembl proteins used as supporting evidence for other species. 364 proteins used as evidence were flagged as obsolete.

Logic name update (all species)

Whenever possible, logic names updated to be consistent across all databases

Zebrafish Vega merge (Zebrafish)

A new Vega gene set has been merged with the Ensembl geneset from release 61.

Update to CCDS and RefSeq models in human otherfeatures database (Human)

Imported CCDS models into human otherfeatures databases. (CCDS Data freeze 9 February 2011).

Imported RefSeq models into human otherfeatures database.

Release 62

Patch for panda (Panda)

Transcript supporting features added for pseudogenes

Patch for rabbit (Rabbit)

  • Geneset re-clustered
  • Transcript supporting features added for pseudogenes
  • Assembly updated to match the official NCBI one

Patch for mouse (Mouse)

Patched the mouse Ensembl-Havana merged gene set to maintain its consistency with the latest CCDS gene set (as of 9 February 2011).

Human Vega annotation (Human)

Manual annotation of human from Havana has been updated. This represents the annotation presented in Vega release 42

Patch for marmoset (Marmoset)

  • Deprecated contig sequences removed
  • Raw-computes re-run
  • Geneset re-clustered
  • Mapping added
  • Transcript supporting features for pseudogenes added
  • New seq region synonyms

Human otherfeatures (Human)

Removed EST alignments with hcoverage <90 and perc_ident <94.

GENCODE gene set update (release 7) (Human)

Update to the Ensembl/Havana GENCODE gene set based on a complete re-annotation of the Ensembl gene set and combined with the latest Vega gene set

Human cDNA update (Human)

New cDNA db for human.

GRCh37.p3 (Human)

Adding the third patch release for the human assembly. This alters the assembly information in all human databases.

GRCh37.p3 annotation (Human)

Annotation of the patches in the other features db.

Gibbon build (Gibbon)

First release of gene build for Gibbon, Nomascus leucogenys (Northern white-cheeked gibbon). Assembly: Nleu1.0.

Zebrafish WGS/clone assembly track (Zebrafish)

Added a WGS/clone assembly track.

Flagging obsolete Uniprot proteins (all species)

Flagging Transcript attribute where the Uniprot evidence was removed

Flagging obsolete Ensembl proteins (all species)

Flagging Transcript attribute where the evidence was removed from 2x genomes

Mouse RefSeq import (Mouse)

RefSeq annotations imported into the mouse otherfeatures database

Xenopus tropicalis new assembly 4.2 (Xenopus)

New assembly of Xenopus tropicalis version 4.2

Human Body Map missing liver (Human)

Add the liver models

Mouse cDNA update (Mouse)

New cDNA db for mouse. ens-staging2: mus_musculus_cdna_62_37o

Updated human otherfeatures db: new CCDS import (Human)

Update to CCDS set for human

Updated mouse otherfeatures db: New CCDS import (Mouse)

Update to CCDS set for mouse

Release 61

Human cDNA update (Human)

Updated set of cDNA alignments to the human genome.

Haplotype correction (Human)

Correction of an error that added one extra N to the end of the alternative versions of the chromosomes for five of the haplotypes. The altered alternative chromosomes are: HSCHR6_MHC_MANN, HSCHR6_MHC_MCF, HSCHR6_MHC_SSTO, HSCHR4_1 and HSCHR17_1.

Zebrafish Havana merge (all species)

A merge of the zebrafish core gene set with Havana manual annotation. The core gene set has been altered to include missing genes that were lost in e60 due to a problem in gene clustering.

GENCODE gene set update (all species)

GENCODE gene set update (release 6)

Update to the Ensembl/Havana GENCODE gene set using the latest Vega gene set.

Updates to mouse and human Vega annotation (all species)

The Vega annotation for both human and mouse has been updated. This matches the annotation presented in Vega release 41.

new rnaseq database (all species)

A new rnaseq database, built on the core schema tables, is being released. It contains RNA-Seq data from the Human BodyMap project. This database has not been released before and was originally planned for e60.

mouse cDNA update (Mouse)

mouse cDNA update

Chromosome and haplotype synonyms (Human)

Addition of synonyms for the human chromosomes and haplotypes in the seq_region_synonym tables of the human databases.

Zebrafish Vega annotation (Zebrafish)

Manual annotation of zebrafish from Havana is now present in Ensembl. This represents the annotation presented in Vega release 40.

Mouse gene set update (all species)

A merge of Ensembl core gene set and Vega manual annotation.

The core gene set has been improved by incorporating new data resources which had become available since the last NCBIM37 genebuild (April 2007), resulting in the correction of existing gene models and the recovery of new mouse genes with human orthologues.

A new otherfeatures database is also available.

RepeatMask data have been updated by re-running RepeatMasker with options "-nolow -s -species mouse".

New assembly for lizard (Anole lizard)

A new assembly for lizard

Turkey (Turkey)

The first genebuild for turkey

New Canonical Transcript definition (all species)

For previous releases, the canonical transcript of a gene has been set to the transcript with the longest translation (for coding genes) or to the transcript with the longest mRNA (for noncoding genes). From release 61, the canonical transcript for human and mouse will now be set to the longest CCDS transcript. Where no CCDS transcript exists for the gene, the longest Ensembl-HAVANA merge transcript will be used.
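
The selection rule introduced in release 61 can be sketched in Python as follows. The Transcript class, its fields and the use of a single length value to decide which transcript is "longest" are assumptions made for illustration. The note does not say how length is measured for CCDS transcripts, nor what happens when a gene has neither kind of transcript, so the sketch simply returns None in that case.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Transcript:
    stable_id: str
    length: int              # assumed single measure of transcript length
    is_ccds: bool = False    # transcript matches a CCDS model
    is_merged: bool = False  # Ensembl-HAVANA merged transcript


def pick_canonical(transcripts: List[Transcript]) -> Optional[Transcript]:
    """Prefer the longest CCDS transcript, then the longest merged transcript."""
    for wanted in (lambda t: t.is_ccds, lambda t: t.is_merged):
        candidates = [t for t in transcripts if wanted(t)]
        if candidates:
            return max(candidates, key=lambda t: t.length)
    return None  # case not covered by the release note


# Hypothetical gene with three transcripts.
gene = [
    Transcript("T1", 3000, is_merged=True),
    Transcript("T2", 2000, is_ccds=True),
    Transcript("T3", 2500, is_ccds=True),
]
print(pick_canonical(gene).stable_id)
```

Note that a CCDS transcript wins even when a longer merged transcript exists, which is the main difference from the earlier longest-translation rule.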

Removal of ambiguous bases from human DNA sequence (all species)

Ambiguous bases have been replaced with 'N' for the following two human contigs:

  • contig::AF152363.1:1:185763:1. This contig held 28 ambiguous bases: S(4), W(6), M(5), K(4), R(5), Y(4).
  • contig::AF152364.1:1:170452:1. This contig held 4 ambiguous bases: S(1), W(1), Y(1), K(1).
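A replacement of this kind can be sketched in a few lines. The function below is an illustration only, not the script used for the release: it treats the IUPAC nucleotide ambiguity codes other than N as ambiguous, replaces each with 'N', and returns per-code counts in the same form as the tallies above. The function name is hypothetical.

```python
from collections import Counter
from typing import Tuple

# IUPAC nucleotide ambiguity codes other than N itself.
AMBIGUITY_CODES = set("RYSWKMBDHV")


def mask_ambiguous(seq: str) -> Tuple[str, Counter]:
    """Replace ambiguity codes with 'N' and count how many of each were seen.

    Lower-case ambiguity codes are counted under their upper-case letter
    and are also replaced with an upper-case 'N'.
    """
    counts: Counter = Counter()
    masked = []
    for base in seq:
        code = base.upper()
        if code in AMBIGUITY_CODES:
            counts[code] += 1
            masked.append("N")
        else:
            masked.append(base)
    return "".join(masked), counts
```

Summing the returned counts gives the total number of replaced bases, which can be checked against the per-contig totals reported above.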

Updated CCDS (all species)

Updated CCDS databases for Human and Mouse. Populates other_features with new gene models and serves data for CCDS Public Note DAS track.

Release 60

Update to human vega annotation (all species)

An update to Vega human annotation

Gencode gene set update (all species)

Update to the Ensembl/Havana Gencode gene set using the latest Vega gene set.

Human cDNA update (all species)

Updated set of cDNA alignments to the human genome.

Rabbit chromosomes (all species)

Chromosome mapping added for the rabbit genome. Coordinates updated accordingly.

Human (GRCh37) assembly patch release 2 (all species)

Addition of the GRCh37 patch release 2 patches. These are toplevel, non-reference regions of the assembly.

Updated human otherfeatures db: EST alignments (all species)

Human ESTs were realigned. New EST-based genes were produced from these EST alignments.

Panda genebuild (all species)

The first genebuild for the panda genome

Update human otherfeatures db: new CCDS import (all species)

Update to CCDS set for human

Updated mouse otherfeatures db: New CCDS import (all species)

Update to CCDS set for mouse

cDNA based gene annotation of human assembly patches (all species)

Annotate the human assembly patches using Exonerate's cDNA2genome model, which aligns cDNAs to the genome using annotation identifying the coding regions of the cDNAs.

Zebrafish genebuild (all species)

Full genebuild on the new Zv9 assembly

Mouse cDNA update (all species)

Updated set of cDNA alignments to the mouse genome.

Flagging Translation attribute where the evidence was removed (all species)

Add a flag to the translation where a human Ensembl translation used as evidence was removed from the current human database. These are indicated on the web display by colouring them in grey on transcript supporting evidence view.

Flagging Translation attribute where the UniProt evidence was removed (all species)

Add a flag to the translation where supporting evidence from UniProt has since been removed from the UniProt database. These are indicated on the web display by colouring them in grey on transcript supporting evidence view.

Updating the ENCODE excluded regions (all species)

Update of the ENCODE excluded regions

Fix duplicate transcript attributes (multiple species)

Duplicate transcript attributes removed

Release 59

Update to human RepeatMasking (all species)

  • Update to human Repeatmasking: this involves re-running the RepeatMasker analysis on toplevel slices, with the '-nolow' flag, so that low complexity regions are not masked.

Update human otherfeatures db (all species)

Update human otherfeatures database:

  • New CCDS models

Update of mouse gene set (all species)

Update of mouse gene set incorporating new Vega genes, implementing new code for HavanaAdder.

Selenocysteine update (all species)

Update 172 transcripts across 25 species to remove 3 base pair introns where there should be a selenocysteine.

The following species are affected:

  • Bos taurus
  • Choloepus hoffmanni
  • Dasypus novemcinctus
  • Dipodomys ordii
  • Echinops telfairi
  • Equus caballus
  • Erinaceus europaeus
  • Felis catus
  • Gallus gallus
  • Gorilla gorilla
  • Macropus eugenii
  • Microcebus murinus
  • Myotis lucifugus
  • Ochotona princeps
  • Ornithorhynchus anatinus
  • Otolemur garnettii
  • Pongo pygmaeus
  • Procavia capensis
  • Pteropus vampyrus
  • Sorex araneus
  • Spermophilus tridecemlineatus
  • Tarsius syrichta
  • Tupaia belangeri
  • Tursiops truncatus
  • Vicugna pacos
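The removal of 3 base pair introns in the selenocysteine update can be pictured as joining two neighbouring exons when the gap between them is exactly three bases. The sketch below assumes exons are given as 1-based inclusive `(start, end)` pairs on one strand; the function name is hypothetical, and the sketch does not cover how the selenocysteine itself is recorded on the translation.

```python
from typing import List, Tuple

Exon = Tuple[int, int]  # (start, end), 1-based inclusive coordinates


def merge_short_introns(exons: List[Exon], intron_len: int = 3) -> List[Exon]:
    """Join adjacent exons separated by an intron of exactly `intron_len` bases."""
    merged: List[Exon] = []
    for start, end in sorted(exons):
        # Intron length is the number of bases strictly between the two exons.
        if merged and start - merged[-1][1] - 1 == intron_len:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged
```

For example, exons `(100, 200)` and `(204, 300)` are separated by bases 201 to 203, so they would be joined into a single exon `(100, 300)`, while a 4 base gap would be left untouched.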

New human assembly patches (all species)

Update human assembly to include sequence for the new assembly patches provided by the GRC (patch_release_1).

  • Assembly patches are toplevel and non-reference

Additional mapping for marmoset (all species)

An additional contig to scaffold mapping for unplaced and unlocalized scaffolds

Update of the corresponding gene coordinates

Mouse cDNA update (all species)

Mouse cDNA update

cDNA based gene annotation of human assembly patches (all species)

Annotate human assembly patches with cDNA based gene models.

Saccharomyces cerevisiae core database (all species)

Core database imported from Ensembl Genomes (updated from SGD in March 2010, currently in EG5). It is in the release 58 schema and will be patched to the release 59 schema when the SQL is available. A corresponding other_features database containing ESTs is also included.

Assembly fixes (all species)

The assembly and seq_region tables of the otherfeatures databases for these species will be synchronised with their core databases:

  • bos_taurus_otherfeatures_58_4g
  • sus_scrofa_otherfeatures_58_9b
  • pan_troglodytes_otherfeatures_58_21m
  • danio_rerio_otherfeatures_58_8d
  • pongo_pygmaeus_otherfeatures_58_1d

Human cDNA update (all species)

New human cDNA update database with integrated assembly patches.

Update mouse otherfeatures db (all species)

Update mouse otherfeatures database:

  • New CCDS models

Future Plans

Read about our future plans on our blog!