News for Human Ensembl Release 82 (September 2015)


New web displays and tools

Ensembl mobile site

We now have a mobile version of Ensembl at

You can quickly search for genes, variants or phenotypes on the mobile site. Have a look and let us know your comments and feedback.

VEP plugins

Support for VEP plugins through the VEP web interface

Marking a region on images

A new feature for marking a selected region has been added to the location, gene and other images. To mark a region, either drag-select it and use the zmenu to mark it, or click on a feature in an image and use the zmenu to mark that feature's location.

New species selector for region comparison config

The species selector drop-down box on the config page for Region comparison view has been moved to the top of the tracks list. This makes the drop-down easier to find, and so makes it easier to configure the species shown in the view.

Other updates


Compara dumps

  • EMF / Fasta / OrthoXML / PhyloXML dumps for ProteinTrees + PhyloXML dumps for CAFE ProteinTrees
  • EMF / Fasta / OrthoXML / PhyloXML dumps for ncRNAtrees + PhyloXML dumps for CAFE ncRNAtrees

ncRNAtrees and homologies

Classification based on Rfam models (v11.0)
Multiple sequence alignments with Infernal
Phylogenetic reconstruction using RAxML
Phylogenetic reconstruction using FastTree2 and RAxML-Light for very big families
Additional multiple sequence alignments with Prank (w/ genomic flanks)
Additional phylogenetic reconstruction using PhyML and NJ
Phylogenetic tree merging using TreeBeST
Per family gene dynamics using CAFE
Homology inference
Secondary structure plots

Schema: first/last_release columns in genome_db

The two columns allow us to track more precisely when each genome was added, and which ones are current.
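As a rough illustration of how such a pair of columns could be interpreted, the sketch below assumes that an empty (NULL) last_release marks an entry that is still current, and that an empty first_release marks one that has not been released yet. These conventions and the helper names is_current and present_in are assumptions made for this example, not taken from the Compara API.

```python
from typing import Optional


def is_current(first_release: Optional[int], last_release: Optional[int]) -> bool:
    """Assumed convention: released at some point and not retired since."""
    return first_release is not None and last_release is None


def present_in(first_release: Optional[int],
               last_release: Optional[int],
               release: int) -> bool:
    """True if the entry was part of the given release under the same assumptions."""
    if first_release is None or release < first_release:
        return False
    # No last_release means the entry is still in use
    return last_release is None or release <= last_release
```

Under these assumptions, an entry with first_release 70 and last_release 81 was present in release 75 but is no longer current in release 82.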

ProteinTrees and homologies

GeneTrees (protein-coding) with new/updated genebuilds and assemblies

 -- all-vs-all blastp (ncbi-blast-2.2.30+)
 -- Clustering using hcluster_sg
 -- Multiple sequence alignments using MCoffee (Version_9.03.r1318) or Mafft (mafft-7.221)
 -- Phylogenetic reconstruction using TreeBeST
 -- Homology inference
 -- Pairwise gene-based dN/dS scores for high coverage species pairs only (both on orthologues and paralogues) (codeml/PAML v4.3)
 -- GeneTree stable ID mapping
 -- Per family gene dynamics using CAFE (v2.2)

Protein Families

Updated MCL families including all Ensembl transcript isoforms (including human non-reference haplotypes) and the newest UniProt Metazoa.

 -- Getting distances by NCBI BlastP (v.2.2.30+)
 -- Clustering by MCL (v.14-137)
 -- Multiple Sequence Alignments with MAFFT (v.7.221)
 -- Family stable ID mapping

Schema version update

81 -> 82

Schema: first/last_release columns in method_link_species_set

The two columns allow us to track more precisely when each dataset was produced, and which ones are current.

Schema: New species_set_header table, with first/last_release columns

A new header table that allows us to check the database integrity with foreign keys. The table also records when each set was used.

Replace TBlat with LastZ

Recompute some TBlat pairwise comparisons with LastZ (which gives a higher coverage):

  • {M.mus,, T.nig} vs X.tro
  • X.tro vs L.cha
  • G.acu vs {L.cha, P.mar}
  • {M.mus, P.mar} vs
  • vs C.sav


External database references update

Xrefs update for:

mus_musculus (mouse), homo_sapiens (human), poecilia_formosa (amazon molly), monodelphis_domestica (opossum), bos_taurus (cow), macaca_mulatta (rhesus monkey), gorilla_gorilla (gorilla), equus_caballus (horse), gadus_morhua (cod), meleagris_gallopavo (turkey), felis_catus (cat), tetraodon_nigroviridis (pufferfish), myotis_lucifugus (microbat)

LRG Import

Importing the latest version of Locus Reference Genomic dataset

Ensembl VM Build

The Ensembl Virtual Machine appliance will be updated to version 82.

patch_81_82_a.sql - schema_version update

Update schema_version in meta table to 82.

patch_81_82_a.sql - schema_version update in ontology db

Update schema_version in meta table to 82.

patch_81_82a.sql - schema_version update in production db

Update schema_version in production database to 82.

Stable ID lookup

Stable ID lookup provided for REST services

Includes lookup for RefSeq and CCDS entries

patch_81_82_b.sql - xref_width

Extend column width for xref display_label and dbprimary_acc

patch_81_82_c.sql - seq_synonym_key

Update the unique key on the seq_region_synonym table to include seq_region_id

VEP plugins (REST)

VEP REST endpoints to support use of VEP plugins


patch_81_82_a.sql - schema_version update

Update schema_version in meta table to 82


Human: updated cDNA alignments

A new cDNA database was created for e82. The latest set of cDNAs for human (as of June 2015) from the European Nucleotide Archive and NCBI RefSeq (release 70) was aligned to the current genome using Exonerate.


Ensembl 82 mart databases

  • Ensembl Genes 82
    • Renamed filters and attributes from variation to variant
    • Renamed the "Transcript length" attribute to "Transcript length (including UTRs and CDS)" 
  • Ensembl Variation 82
    • Renamed filters and attributes from variation to variant
  • Ensembl Regulation 82
    • Added "Ensembl Gene ID" filter for the miRNA Target Regions dataset
  • Vega 62

EMBL and Genbank Dumps

EMBL and Genbank dumps for all species.

External reference projection

Gene ontology (GO) identifiers and gene name projection to all species.

FASTA & GTF dumps

FASTA & GTF dumps for all the species


dbSNP 144

dbSNP 144 data imported

Structural variants

  • Added new studies and updated other studies from DGVa

HGMD data update

Import of the latest release of public HGMD data (version 2015.2) and remapping to GRCh38

Phenotype data updates

  • Human phenotype data has been updated from different sources including NHGRI-EBI GWAS, OMIM, ClinVar, UniProt, Orphanet, GOA and Decipher.
  • OMIA data for Cow, Dog, Horse, Sheep
  • RGD data for Rat
  • AnimalQTL for Cow, Horse, Chicken, Pig, Sheep
  • IMPC data (release 3.1) for Mouse

NHLBI Exome Sequencing Project data for GRCh38

We imported the most recent data (v.0.0.30, Nov. 3, 2014) from the Exome Sequencing Project (ESP).

HumanCoreExome-12 variants GRCh38

We imported variants from the HumanCoreExome-12 chip.

New REST endpoint: Return variants in LD with a given SNP

We added a new REST API endpoint which returns the variants in LD with a given SNP, subject to cut-off values for r2 and D'.
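A request to an LD endpoint of this kind might be built as in the sketch below. The server address, the path layout (species, variant ID, population) and the query parameter names r2 and d_prime follow the public REST documentation as we understand it and may differ from the release 82 form. The helper names ld_url and fetch_json are hypothetical, and the variant and population shown in the usage note are only illustrative.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode

REST_SERVER = "https://rest.ensembl.org"  # assumed public REST server


def ld_url(species, variant_id, population, r2=None, d_prime=None):
    """Build an LD query URL; the path layout is an assumption (see above)."""
    path = "/ld/{}/{}/{}".format(
        quote(species, safe=""),
        quote(variant_id, safe=""),
        quote(population, safe=":"),  # population names contain colons
    )
    params = {}
    if r2 is not None:
        params["r2"] = r2            # minimum r2 cut-off
    if d_prime is not None:
        params["d_prime"] = d_prime  # minimum D' cut-off
    query = urlencode(params)
    return REST_SERVER + path + ("?" + query if query else "")


def fetch_json(url):
    """Fetch a URL and decode the JSON body (performs a network request)."""
    request = urllib.request.Request(url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, ld_url("human", "rs1042779", "1000GENOMES:phase_3:KHV", r2=0.8) gives a URL that can then be passed to fetch_json.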

New REST endpoint: Return the variation sources list for a given species

We added a new REST API "info" endpoint which returns the list of variation sources for a given species. This includes the source versions, descriptions and URLs.
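Once the response from such an endpoint has been decoded from JSON, the source versions can be collected as sketched below. The sketch assumes the response is a list of records with "name", "version", "description" and "url" keys, and that it would be served from a path like /info/variation/:species; both are assumptions. The function name source_versions and the sample payload are hypothetical, with only the dbSNP version taken from this release.

```python
def source_versions(sources):
    """Map each source name to its version (None when no version is given)."""
    return {s["name"]: s.get("version") for s in sources if "name" in s}


# Hypothetical payload shaped like a decoded JSON response
example_payload = [
    {"name": "dbSNP", "version": "144",
     "description": "illustrative entry", "url": "https://example.org/"},
    {"name": "ExampleSource",
     "description": "illustrative entry without a version"},
]

versions = source_versions(example_payload)
for name in sorted(versions):
    print(name, versions[name])
```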


Improved data upload form

The design of the "Add your data" form has been revised to make it easier to use. There are now only two input boxes, one for selecting a file on your computer and the other for pasting data or a file URL. The form will attempt to identify the file format from the extension (if any), so you will only need to select the format manually if you are pasting data or if the file extension is ambiguous.

At the same time we are implementing more validation on the server side, so that if you accidentally select the wrong value in the dropdown, this will be reported back to you.
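The extension-based detection described above could work along the lines of the following sketch. The extension table, the handling of compressed files and the function name guess_format are assumptions for illustration and do not reflect the actual web code.

```python
import os

# Hypothetical mapping from file extension to format name
EXTENSION_FORMATS = {
    ".bed": "BED",
    ".gff": "GFF",
    ".gtf": "GTF",
    ".vcf": "VCF",
    ".psl": "PSL",
    ".wig": "WIG",
}


def guess_format(filename):
    """Return a format name, or None if the extension is missing or ambiguous."""
    name = filename.lower()
    if name.endswith(".gz"):
        name = name[:-3]  # look at the extension underneath the compression suffix
    extension = os.path.splitext(name)[1]
    return EXTENSION_FORMATS.get(extension)
```

A result of None corresponds to the case where the format would have to be chosen manually, for example for a generic .txt file or for pasted data.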

Improvements to PDF export

Line thicknesses have been increased in the PDF renderer for graph (wiggle) tracks so that they don't disappear.

Retirement of archive 68

This release cycle we will be retiring archive 68 (July 2012) in accordance with our three-year rolling retirement policy. The data will remain available on our public database server; only the web interface will be removed.

Homologues - export as PhyloXML

We have added PhyloXML to the list of formats available for exporting Orthologues and Paralogues. Simply select 'PhyloXML' from the dropdown list or format thumbnails when exporting data from the Gene Orthologue or Paralogue pages.

Gene Expression Widget updated

The gene expression widget has been updated to the latest version.

Colouring of exons in text sequence

We have changed the colour scheme of our exon text view to help colour-blind users. UTRs are now displayed in dark orange instead of purple, to make them easier to distinguish from the neighbouring translated sequence. You can see an example here:;g=ENSG00000128573;r=7:114414997-114693768;t=ENST00000403559

Export mode for projectors and print

To help with the display of Ensembl images on projectors, a new export option, hi-vis, customizes the exported image to be maximally visible on projectors. This option has been added to the standard image export menu.

In this hi-vis image, colours have been gamma-corrected to improve contrast in brightly lit environments and measures have been taken to broaden otherwise very thin lines.
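For readers unfamiliar with gamma correction, the sketch below shows the general technique on a single RGB colour. The gamma value, the direction of the correction and the function name gamma_correct are assumptions chosen for illustration; the parameters actually used for the hi-vis export are not stated here.

```python
def gamma_correct(rgb, gamma=2.0):
    """Apply out = 255 * (in / 255) ** (1 / gamma) to each 8-bit channel."""
    return tuple(round(255 * (channel / 255) ** (1.0 / gamma)) for channel in rgb)


# Black and white are unchanged; mid-range values are shifted
print(gamma_correct((0, 0, 0)))
print(gamma_correct((64, 64, 64)))
print(gamma_correct((255, 255, 255)))
```

With a gamma above 1 this formula lifts dark and mid-range values while leaving pure black and pure white untouched.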

The same parameters have been applied to our print presets, which are now labelled 'Journal' and 'Poster' and produce a x2 and x5 enlargement respectively. More information about outputting for print is available via the help icons in this section of the menu.

As these parameters are tunable, and each situation is different, we welcome feedback on the effectiveness of this new feature so that we can make it as broadly useful as possible.

Improved Variation Tables

Variation tables for genes and transcripts have been reimplemented to effectively handle the large number of variants now known for many genes. At the same time, the ability to filter, sort, and select this data has been improved.

Filtering by variant type is now achieved by selecting the "Type:" filter at the top of the main table, rather than by a preceding auxiliary table.

Further features and refinements are expected to be added in forthcoming releases.