News for Ensembl Release 82 (September 2015)

New web displays and tools

Ensembl mobile site (all species)

We now have a mobile version of Ensembl at http://m.ensembl.org

You can quickly search for a gene, variants or phenotypes on the mobile site. Have a look and let us know your comments/feedback.

VEP plugins (all species)

VEP plugins are now supported through the VEP web interface.

Marking a region on images (all species)

A new feature for marking a selected region has been added to the location, gene and other images. To mark a region, either drag-select it and then choose the mark option from the zmenu, or click on a feature in an image and use the zmenu to mark that feature's location.

New species selector for region comparison config (all species)

The species selector drop-down box on the config page for Region comparison view has been moved to the top of the tracks list. This makes the drop-down easier to find, and so makes it easier to configure the species shown in the view.

New species, assemblies and genebuilds

Zebrafish developmental stage RNASeq data set (Zebrafish)

Models were built from different developmental stages and different tissue samples: 2 cells, 6 hours post fertilisation, 1 day post fertilisation (dpf), 2 dpf, 3 dpf, 5 dpf, ovary, male head, female head, male body, female body. We provide the alignment BAM files, the intron supporting evidence and the gene models.

Vega Mouse annotation updated (Mouse)

Manual annotation of mouse from Havana has been updated and contains the data released in Vega 62.

Mouse: update to Ensembl-Havana GENCODE gene set (Mouse)

Updated Ensembl-Havana mouse gene set. This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.

Other updates


Compara dumps (all species)

  • EMF / Fasta / OrthoXML / PhyloXML dumps for ProteinTrees + PhyloXML dumps for CAFE ProteinTrees
  • EMF / Fasta / OrthoXML / PhyloXML dumps for ncRNAtrees + PhyloXML dumps for CAFE ncRNAtrees

ncRNAtrees and homologies (all species)

Classification based on Rfam models (v11.0)
Multiple sequence alignments with Infernal
Phylogenetic reconstruction using RAxML
Phylogenetic reconstruction using FastTree2 and RAxML-Light for very big families
Additional multiple sequence alignments with Prank (w/ genomic flanks)
Additional phylogenetic reconstruction using PhyML and NJ
Phylogenetic tree merging using TreeBeST
Per family gene dynamics using CAFE
Homology inference
Secondary structure plots

Schema: first/last_release columns in genome_db (all species)

The two columns allow us to track more precisely when each genome was added, and which ones are current.
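
As a rough illustration of how the two columns might be read, the sketch below assumes that first_release holds the release in which a genome first appeared and that last_release stays NULL (None in Python) while the genome is still current. That interpretation, the helper name is_current and the example rows are assumptions for illustration, not taken from the schema itself.

```python
from typing import Optional


def is_current(first_release: Optional[int], last_release: Optional[int],
               release: int = 82) -> bool:
    """Return True if a genome_db row is live in the given release.

    Assumed semantics (not confirmed by the release notes):
      - first_release is None -> the row has never been released
      - last_release is None  -> the row is still current
    """
    if first_release is None or first_release > release:
        return False
    return last_release is None or last_release >= release


# Hypothetical rows: (name, first_release, last_release)
rows = [
    ("genome_a", 75, None),    # added in 75, still current
    ("genome_b", 60, 80),      # last seen in release 80
    ("genome_c", None, None),  # loaded but never released
]
current = [name for name, first, last in rows if is_current(first, last)]
print(current)
```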

ProteinTrees and homologies (all species)

GeneTrees (protein-coding) with new/updated genebuilds and assemblies

 -- all-vs-all blastp (ncbi-blast-2.2.30+)
 -- Clustering using hcluster_sg
 -- Multiple sequence alignments using MCoffee (Version_9.03.r1318) or Mafft (mafft-7.221)
 -- Phylogenetic reconstruction using TreeBeST
 -- Homology inference
 -- Pairwise gene-based dN/dS scores for high coverage species pairs only (both on orthologues and paralogues) (codeml/PAML v4.3)
 -- GeneTree stable ID mapping
 -- Per family gene dynamics using CAFE (v2.2)

Protein Families (all species)

Updated MCL families including all Ensembl transcript isoforms (including human non-reference haplotypes) and the newest UniProt Metazoa.

 -- Getting distances by NCBI BlastP (v.2.2.30+)
 -- Clustering by MCL (v.14-137)
 -- Multiple Sequence Alignments with MAFFT (v.7.221)
 -- Family stable ID mapping

Schema version update (all species)

81 -> 82

Schema: first/last_release columns in method_link_species_set (all species)

The two columns allow us to track more precisely when each dataset was produced, and which ones are current.

Schema: New species_set_header table, with first/last_release columns (all species)

A new header table that allows us to check the database integrity with foreign keys. The table also records when each set was used.

Replace TBlat with LastZ (all species)

Recompute some TBlat pairwise comparisons with LastZ (which gives a higher coverage):

  • {M.mus, G.gal, T.nig} vs X.tro
  • X.tro vs L.cha
  • G.acu vs {L.cha, P.mar}
  • {M.mus, P.mar} vs C.int
  • G.gal vs C.sav
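
The brace notation in the list above is compact, so here is a small sketch that expands it into individual pairwise comparisons. It assumes that "{A, B} vs C" means every species inside the braces is compared with the species on the other side; the variable names are only for illustration.

```python
from itertools import product

# Species groups copied from the list above: (left side, right side).
comparisons = [
    (["M.mus", "G.gal", "T.nig"], ["X.tro"]),
    (["X.tro"], ["L.cha"]),
    (["G.acu"], ["L.cha", "P.mar"]),
    (["M.mus", "P.mar"], ["C.int"]),
    (["G.gal"], ["C.sav"]),
]

# Expand each group into one (species, species) tuple per comparison.
pairs = [pair for left, right in comparisons for pair in product(left, right)]

for a, b in pairs:
    print(f"{a} vs {b}")
```

Under that reading, the five bullets expand to nine pairwise comparisons.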


Ensembl VM Build (all species)

The Ensembl Virtual Machine appliance will be updated to version 82.

External database references update (multiple species)

Xrefs update for:

mus_musculus (mouse), homo_sapiens (human), poecilia_formosa (amazon molly), monodelphis_domestica (opossum), bos_taurus (cow), macaca_mulatta (rhesus monkey), gorilla_gorilla (gorilla), equus_caballus (horse), gadus_morhua (cod), meleagris_gallopavo (turkey), felis_catus (cat), tetraodon_nigroviridis (pufferfish), myotis_lucifugus (microbat)

LRG Import (Human)

Import of the latest version of the Locus Reference Genomic (LRG) dataset.

patch_81_82_a.sql - schema_version update (all species)

Update schema_version in meta table to 82.
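
To show what this kind of patch amounts to, the sketch below runs an equivalent update against an in-memory SQLite table. It is a simplified stand-in: the real databases run on MySQL and the real meta table has more columns than the two used here, so treat the table definition as an assumption.

```python
import sqlite3

# In-memory stand-in for a database with a (simplified) meta table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meta (meta_key TEXT, meta_value TEXT)")
conn.execute("INSERT INTO meta VALUES ('schema_version', '81')")

# The kind of statement a schema_version patch is expected to run.
conn.execute(
    "UPDATE meta SET meta_value = ? WHERE meta_key = 'schema_version'", ("82",)
)

version = conn.execute(
    "SELECT meta_value FROM meta WHERE meta_key = 'schema_version'"
).fetchone()[0]
print(version)
```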

patch_81_82_a.sql - schema_version update in ontology db (all species)

Update schema_version in meta table to 82.

patch_81_82a.sql - schema_version update in production db (all species)

Update schema_version in production database to 82.

Stable ID lookup (all species)

Stable ID lookup provided for REST services

Includes lookup for RefSeq and CCDS entries
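
For readers who want to try a lookup, here is a minimal sketch that builds a request URL for the REST lookup/id endpoint. The base URL and the example identifier are assumptions for illustration. Whether RefSeq and CCDS accessions resolve through this same path is not stated above and is not assumed here.

```python
from urllib.parse import quote

REST_SERVER = "https://rest.ensembl.org"  # assumed base URL


def lookup_url(identifier: str) -> str:
    """Build a lookup/id request URL for one identifier, asking for JSON."""
    safe_id = quote(identifier, safe="")
    return f"{REST_SERVER}/lookup/id/{safe_id}?content-type=application/json"


# Example identifier chosen for illustration only.
print(lookup_url("ENSG00000157764"))
```

The resulting URL can be fetched with any HTTP client; the sketch stops at building it so that it runs without network access.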

patch_81_82_b.sql - xref_width (all species)

Extend column width for xref display_label and dbprimary_acc

patch_81_82_c.sql - seq_synonym_key (all species)

Update the unique key on the seq_region_synonym table to include seq_region_id.
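
The practical effect of including seq_region_id in the unique key can be sketched with an in-memory SQLite table. The two-column table and the (synonym, seq_region_id) key below are simplifying assumptions; the real table has more columns and the exact key definition should be checked in the patch file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified table with an assumed composite unique key.
conn.execute(
    """
    CREATE TABLE seq_region_synonym (
        seq_region_id INTEGER NOT NULL,
        synonym       TEXT    NOT NULL,
        UNIQUE (synonym, seq_region_id)
    )
    """
)

# The same synonym on two different sequence regions is accepted ...
conn.execute("INSERT INTO seq_region_synonym VALUES (1, 'chr1')")
conn.execute("INSERT INTO seq_region_synonym VALUES (2, 'chr1')")

# ... but repeating it on the same region violates the unique key.
try:
    conn.execute("INSERT INTO seq_region_synonym VALUES (1, 'chr1')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM seq_region_synonym").fetchone()[0]
print(duplicate_rejected, row_count)
```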

VEP plugins (REST) (all species)

VEP REST endpoints to support use of VEP plugins


patch_81_82_a.sql - schema_version update (all species)

Update schema_version in meta table to 82


Human: updated cDNA alignments (Human)

A new cdna database was created for e82: The latest set of cDNAs for human (as of June 2015) from the European Nucleotide Archive and NCBI RefSeq (release 70) were aligned to the current genome using Exonerate.

Mouse: updated cDNA alignments (Mouse)

A new cdna database was created for e82: The latest set of cDNAs for mouse (as of June 2015) from the European Nucleotide Archive and NCBI RefSeq (release 70) were aligned to the current genome using Exonerate.

Updated mouse otherfeatures db: New CCDS import (Mouse)

This release of the mouse gene set also includes 23,830 transcript models as part of an updated version (May 2015) of CCDS

Stable id mapping for GRCz10 (Zebrafish)

The stable ID events table for zebrafish is missing entries. This needs to be fixed for the gene, transcript and translation tables.

Mouse XREFs cleanup (Mouse)

The XREFs for mouse need to be cleaned up. We only want to keep Ens%, OTT%, Vega%, HGNC and LRG XREFs.

Rat XREFs cleanup (Rat)

The XREFs for rat need to be cleaned up. We only want to keep Ens%, OTT%, Vega%, HGNC and LRG XREFs.


Ensembl 82 mart databases (all species)

  • Ensembl Genes 82
    • Renamed filters and attributes from variation to variant
    • Renamed the "Transcript length" attribute to "Transcript length (including UTRs and CDS)" 
  • Ensembl Variation 82
    • Renamed filters and attributes from variation to variant
  • Ensembl Regulation 82
    • Added "Ensembl Gene ID" filter for the miRNA Target Regions dataset
  • Vega 62

EMBL and Genbank Dumps (all species)

EMBL and Genbank dumps for all species.

External reference projection (all species)

Gene ontology (GO) identifiers and gene name projection to all species.

FASTA & GTF dumps (all species)

FASTA & GTF dumps for all the species


dbSNP 144 (Horse, Cat, Chicken, Human)

dbSNP 144 data imported

Structural variants (multiple species)

  • Added new studies and updated other studies from DGVa

HGMD data update (Human)

Import of the latest release of public HGMD data (version 2015.2) and remapping to GRCh38

Phenotype data updates (multiple species)

  • Human phenotype data has been updated from different sources including NHGRI-EBI GWAS, OMIM, ClinVar, UniProt, Orphanet, GOA and Decipher.
  • OMIA data for Cow, Dog, Horse, Sheep
  • RGD data for Rat
  • AnimalQTL for Cow, Horse, Chicken, Pig, Sheep
  • IMPC data (release 3.1) for Mouse

New REST endpoint: Return variants in LD with a given SNP (all species)

We added a new REST API endpoint which returns variants in LD with a given SNP, subject to cut-off values for r2 and D'.
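
To make the role of the two cut-offs concrete, the sketch below applies them locally to a handful of made-up LD records. The record layout (variation1, variation2, r2, d_prime), the default thresholds and the function name are assumptions for illustration, not the endpoint's documented output.

```python
from typing import Iterable


def filter_ld(records: Iterable[dict], min_r2: float = 0.8,
              min_d_prime: float = 0.8) -> list:
    """Keep LD records whose r2 and D' both reach the given cut-offs."""
    kept = []
    for rec in records:
        # Values are converted with float() in case they arrive as strings.
        if float(rec["r2"]) >= min_r2 and float(rec["d_prime"]) >= min_d_prime:
            kept.append(rec)
    return kept


# Made-up records for illustration only.
records = [
    {"variation1": "rsA", "variation2": "rsB", "r2": "0.95", "d_prime": "1.0"},
    {"variation1": "rsA", "variation2": "rsC", "r2": "0.40", "d_prime": "0.9"},
    {"variation1": "rsA", "variation2": "rsD", "r2": "0.85", "d_prime": "0.7"},
]
strong = filter_ld(records)
print([r["variation2"] for r in strong])
```

Only pairs that pass both thresholds are kept, so a pair with a high r2 but a low D' is dropped.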

NHLBI Exome Sequencing Project data for GRCh38 (Human)

We imported the most recent data (v.0.0.30, Nov. 3, 2014) from the Exome Sequencing Project (ESP).

HumanCoreExome-12 variants GRCh38 (Human)

We imported variants from the HumanCoreExome-12 chip.

New REST endpoint: Return the variation sources list for a given species (all species)

We added a new REST API "info" endpoint which returns the list of variation sources for a given species. This includes the source versions, descriptions and URLs.
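
A minimal sketch of how such a listing could be requested and summarised follows. The base URL, the info/variation/<species> path and the field names in the made-up response are assumptions for illustration; check the REST documentation for the actual response layout.

```python
from urllib.parse import quote

REST_SERVER = "https://rest.ensembl.org"  # assumed base URL


def sources_url(species: str) -> str:
    """Build the request URL for a species' variation sources (JSON)."""
    safe_species = quote(species, safe="")
    return f"{REST_SERVER}/info/variation/{safe_species}?content-type=application/json"


def summarise(sources: list) -> list:
    """Format each source as 'name (version)'; missing fields become '?'."""
    return [f"{s.get('name', '?')} ({s.get('version', '?')})" for s in sources]


# Made-up response fragment; field names are assumptions.
example = [{"name": "dbSNP", "version": "144", "url": "https://example.org"}]
print(sources_url("homo_sapiens"))
print(summarise(example))
```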


Improved data upload form (all species)

The design of the "Add your data" form has been revised to make it easier to use. There are now only two input boxes, one for selecting a file on your computer and the other for pasting data or a file URL. The form will attempt to identify the file format from the extension (if any), so you will only need to select the format manually if pasting data or if the file extension is ambiguous.

At the same time we are implementing more validation on the server side, so that if you accidentally select the wrong value in the dropdown, this will be reported back to you.
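
The extension-based detection described above can be pictured with a short sketch. The extension table and the function name are hypothetical; the list of formats the site actually recognises is not given here.

```python
import os
from typing import Optional

# Hypothetical extension-to-format table, for illustration only.
FORMATS = {
    ".bed": "BED",
    ".gff": "GFF",
    ".gtf": "GTF",
    ".vcf": "VCF",
    ".bam": "BAM",
    ".bw": "BigWig",
}


def guess_format(filename: str) -> Optional[str]:
    """Return a format name from the file extension, or None if unknown."""
    name = filename.lower()
    if name.endswith(".gz"):  # look through a compression suffix
        name = name[:-3]
    ext = os.path.splitext(name)[1]
    return FORMATS.get(ext)


print(guess_format("peaks.bed.gz"), guess_format("notes.txt"))
```

A return value of None corresponds to the case where the format has to be chosen manually, for example when the extension is missing or ambiguous.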

Improvements to PDF export (all species)

Line thicknesses have been increased in the PDF renderer for graph (wiggle) tracks so that they don't disappear.

Retirement of archive 68 (all species)

This release cycle we will be retiring archive 68 (July 2012) in accordance with our three-year rolling retirement policy. The data will remain available on our public database server; only the web interface will be removed.

Homologues - export as PhyloXML (all species)

We have added PhyloXML to the list of formats available for exporting Orthologues and Paralogues. Simply select 'PhyloXML' from the dropdown list or format thumbnails when exporting data from the Gene Orthologue or Paralogue pages.

Gene Expression Widget updated (all species)

The gene expression widget has been updated to the latest version.

Colouring of exons in text sequence (all species)

We have changed the colour scheme of our exon text view to help colour-blind users. UTRs are now displayed in dark orange instead of purple, to make them easier to distinguish from the neighbouring translated sequence.


Export mode for projectors and print (all species)

To help with the display of Ensembl images on projectors, a new export option, hi-vis, customizes the exported image to be maximally visible on projectors. This option has been added to the standard image export menu.

In this hi-vis image, colours have been gamma-corrected to improve contrast in brightly lit environments and measures have been taken to broaden otherwise very thin lines.
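
As a rough illustration of what gamma correction does to a colour, the sketch below applies the textbook formula out = 255 * (in / 255) ** (1 / gamma) to each channel. The gamma value of 2.0 and the function name are assumptions; the parameters used by the actual export are not stated here.

```python
def gamma_correct(rgb, gamma=2.0):
    """Apply out = 255 * (in / 255) ** (1 / gamma) to each 0-255 channel."""
    return tuple(round(255 * (c / 255) ** (1 / gamma)) for c in rgb)


# A dark grey becomes lighter; pure black and pure white are unchanged.
print(gamma_correct((64, 64, 64)))
print(gamma_correct((0, 0, 0)), gamma_correct((255, 255, 255)))
```

With a gamma above 1, mid-tones are lifted while the end points stay fixed, which is one way to keep dark features visible on a washed-out projector image.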

The same parameters have been applied to our print presets, which are now labelled 'Journal' and 'Poster' and produce a x2 and x5 enlargement respectively. More information about outputting for print is available via the help icons in this section of the menu.

As these parameters are tunable, and each situation is different, we welcome feedback on the effectiveness of this new feature so that we can make it as broadly useful as possible.

Improved Variation Tables (all species)

Variation tables for genes and transcripts have been reimplemented to effectively handle the large number of variants now known for many genes. At the same time, the ability to filter, sort, and select this data has been improved.

Filtering by variant type is now achieved by selecting the "Type:" filter at the top of the main table, rather than by a preceding auxiliary table.

Further features and refinements are expected to be added in forthcoming releases.

Future Plans

Read about our future plans on our blog!