News categories

New web displays and tools

Gencode Basic Renderer (Human)

A new renderer, GENCODE Basic, has been added for GENCODE. The GENCODE Basic set comprises only a subset of the transcripts, i.e. the fragments and other problematic biotypes are excluded from the basic set.

New VEP interface (Human)

The VEP web interface has been completely overhauled and now offers:

  • new results page with summary charts, interactive filtering and more
  • more options and configuration
  • 750 variant limit removed - limits are now imposed only on uploaded file size
  • VEP now runs on a job submission system

Highlighting the current feature (Human)

In Region in detail and similar views, we currently highlight the feature you have come from so that it's easier to find amongst all the tracks. However, if you would like to turn off this highlighting, e.g. in order to have a cleaner image to export, you can now do so via the control panel.

You'll find the option under 'Information and decorations', labelled 'Highlight current feature'.

New species, assemblies and genebuilds

Merged genes and transcripts can be fetched using 'source' column (Human)

From this release we will introduce a new use for the gene.source column and the new transcript.source column. These columns will now indicate whether genes have been annotated by both Ensembl and Havana ('ensembl_havana'), Ensembl only ('ensembl'), or Havana only ('havana'). This will feed into BioMart to make it easier for users to fetch genes and transcripts from only the annotation sources they are interested in. In release 74 and earlier releases, this information could be found using the analysis.logic_name. Note: An additional source, 'insdc', is used for genes and transcripts on the mitochondrial chromosome because they are imported from the MT GenBank file.
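As a rough illustration of filtering by the new source values, here is a minimal Python sketch using an in-memory SQLite table. Only the 'source' column name and its four values come from the text above; the table layout, the stable IDs and the genes_by_source helper are made up for this example and do not reflect the full Ensembl core schema.

```python
import sqlite3

# Toy in-memory table. Only the 'source' column and its values are taken
# from the release notes; the stable IDs are placeholders, not real genes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gene (stable_id TEXT PRIMARY KEY, source TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO gene (stable_id, source) VALUES (?, ?)",
    [
        ("GENE_A", "ensembl_havana"),
        ("GENE_B", "ensembl"),
        ("GENE_C", "havana"),
        ("GENE_MT", "insdc"),
    ],
)


def genes_by_source(connection, sources):
    """Return the stable IDs of genes whose source is in `sources`, sorted."""
    placeholders = ",".join("?" for _ in sources)
    rows = connection.execute(
        f"SELECT stable_id FROM gene WHERE source IN ({placeholders}) "
        "ORDER BY stable_id",
        list(sources),
    )
    return [row[0] for row in rows]


# Genes with any Havana (manual) annotation, merged or not.
manually_annotated = genes_by_source(conn, ["havana", "ensembl_havana"])
```

The same WHERE clause would apply to a transcript table once it carries a source column.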

Vega Zebrafish annotation updated (Human)

Manual annotation of zebrafish from Havana has been updated and contains the data released in Vega 55.

New alignments

New track - Age of Base (Human)

In release 75 we have added a new track for human, showing the timing of the most recent mutation as determined by inter-species whole genome alignments. You can find the track in the comparative genomics menu under "Conservation regions" (or search for "age of base").

Each base pair in which the human reference genome differs by substitution from one of its inferred ancestral genomes is coloured in either grey (event prior to the primate branch), blue (primate specific), red (human-specific, fixed variant), or yellow (human-specific segregating variant, i.e. SNP). Clicking on a mutation position reveals the sub-tree of species which have inherited the same mutation from their common ancestor. It also reveals a score that represents the age of the mutation in arbitrary units, and determines the intensity of the colouring. The more recent the mutation, the lower the score and the darker the colour.
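The colour scheme described above can be summarised as a small lookup table. This Python sketch is illustrative only: the category keys are labels invented here, and the track's actual score scale is not specified, so the helper only encodes the stated ordering (lower score means more recent and darker).

```python
# Colour categories as described for the Age of Base track.
# The dictionary keys are labels invented for this sketch.
COLOUR_BY_CATEGORY = {
    "pre_primate": "grey",          # event prior to the primate branch
    "primate_specific": "blue",     # primate specific
    "human_fixed": "red",           # human-specific, fixed variant
    "human_segregating": "yellow",  # human-specific segregating variant (SNP)
}


def colour_for(category):
    """Return the display colour for a mutation category."""
    return COLOUR_BY_CATEGORY[category]


def is_more_recent(score_a, score_b):
    """True if score_a marks a more recent (and so darker) mutation."""
    return score_a < score_b
```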

Note that this is a beta version of the track - if you find it useful, please let us know!

Other updates

Human: updated RefSeq gene import (Human)

The imported RefSeq gene set was updated in the human otherfeatures database. Please note that RefSeq annotates gene models on cDNA sequence and not on the reference genome, so when RefSeq transcripts are translated off the reference genome, the translations may contain stop codons.

Updated human otherfeatures db: New CCDS import (Human)

This release of the human gene set also includes 29,033 transcript models as part of an updated version (December 2013) of CCDS.

Human: updated cDNA alignments (Human)

A new cDNA database was created for release 75. The latest set of cDNAs for human (as of December 2013) from the European Nucleotide Archive and NCBI RefSeq (release nn) was aligned to the current genome using Exonerate.

Splicing events (Human)

The ASTD project computationally predicted genes in a similar way to Ensembl every release, with a focus on alternative mRNA structures (splicing events, poly(A) sites, TSS) and features (ppt, exon-exon junction types).

Since 2010, the storage and display of this alternative splicing information has been an integral part of Ensembl for the following species:

  • Homo sapiens
  • Mus musculus
  • Rattus norvegicus
  • Danio rerio
  • Caenorhabditis elegans
  • Drosophila melanogaster

These data have been updated for this release for the species listed above.

Phenotype data updates (Human)

Human phenotype data will be updated from sources including ClinVar and Decipher.

Mouse phenotype data from IMPC will be updated.

Citation data update (Human)

Human variation citation data will be updated from EPMC and UCSC.

Structural variations (Human)

DGVa data will be updated and new studies imported.

PolyPhen update (Human)

PolyPhen predictions will be updated using code version 2.2.2, release 405 and the latest available databases.

HGMD data update (Human)

The latest release of public HGMD data (version 2013.3 from September 2013) will be imported.

NHLBI ESP data update (Human)

Human NHLBI ESP data will be updated to version v.0.0.22.

result_set.name unique (patch_74_75_b) (Human)

The name field of the result set table now has a unique key, and the names have been updated by appending the relevant analysis logic name, in line with the other set tables.
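To illustrate why appending the analysis logic name makes otherwise identical names unique, here is a small Python sketch. The underscore separator, the set names and the logic names are all assumptions chosen for the example; the text above does not state the exact naming format.

```python
def qualified_names(rows, separator="_"):
    """Append each row's analysis logic name to its result set name.

    `rows` is a list of (name, logic_name) pairs. The separator is an
    assumption for this sketch, not the documented convention.
    """
    return [f"{name}{separator}{logic_name}" for name, logic_name in rows]


# Hypothetical rows: the first two share a name but differ in analysis.
rows = [
    ("set_one", "analysis_a"),
    ("set_one", "analysis_b"),
    ("set_two", "analysis_a"),
]

names = qualified_names(rows)
all_unique = len(names) == len(set(names))
```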

External database references update (Human)

Xrefs update for:

human, sloth, sea squirt, chicken, kangaroo rat, stickleback, coelacanth, wallaby, pika, medaka, hyrax, megabat, fugu, tarsier, alpaca, dolphin, tree shrew, zebrafish

New TarBase microRNA target sites (Human)

Our conservative MiRanda miRNA targets set, which is no longer maintained, will be replaced by predictions from Diana TarBase.

TarBase v6.0 has replaced the MiRanda targets as a drop-in set of ExternalFeatures. Separate adaptor classes will follow in r76.

TarBase analysis added (Human)

Added TarBase_v6.0 to analysis table

Ensembl 75 mart databases (Human)

  • Ensembl Genes 75
    • Renamed Saccharomyces cerevisiae assembly from EF4 to R64-1-1
    • Added new Transcript source filter and attribute for all the species
    • Added new filter and attribute for VEGA protein ID and WormBase Gene Sequence-name accession.
    • Added new variation species turkey (Meleagris gallopavo)
    • Renamed "Protein domains" filter and attribute sections to "Protein domains and families".
  • Ensembl Variation 75
    • Added new variation species turkey (Meleagris gallopavo)
    • Renamed Saccharomyces cerevisiae assembly from EF4 to R64-1-1

EMBL and Genbank Dumps (Human)

EMBL and Genbank dumps for all species.

External reference projection (Human)

Gene ontology (GO) identifiers and gene name projection to all species.

FASTA & GTF dumps (Human)

FASTA & GTF dumps for all the species

LRG Import (Human)

Importing the latest version of the Locus Reference Genomic dataset.

Gene and Transcript Adaptor support for fetch_all_by_Source() (Human)

GeneAdaptor and TranscriptAdaptor will support the retrieval of their respective feature objects by the new source column.

SQLite Support (Human)

The Ensembl core API will support SQLite databases. This work has been contributed by the Anacode team at the WTSI.

input_subset.analysis_id (patch_74_75_c) (Human)

An analysis_id has been added to the input_subset table, which will mirror the input_set.analysis_id. 

Consequently, InputSubset has been changed to inherit from Set, and the feature_type validation in the Set subclass constructors has been moved to the Set constructor.

This work is a prerequisite to retiring the InputSet class/table.

InputSet retired (patch_74_75_d) (Human)

The InputSet class has been retired and ResultSet will be used directly instead. The result_set_input table has been patched to replace input_set entries with input_subset entries. The ResultSet classes have been updated to make the association of a dbfile_registry_entry record optional.

The InputSet class and table will remain in the schema until all dependent code has been migrated to the new usage model.

Array size (Human)

The Array size attribute and associated methods/constructor parameters have been deprecated or removed.

patch_74_75a.sql - schema_version update in production db (Human)

Update schema_version in meta table to 75.

patch_74_75_c.sql (Human)

Adding a new table, genome_statistics.

Populated during the production run, it contains basic statistics such as the number of genes, the length of the genome and the number of predictions.

Microarray mapping (Human)

Microarray mapping has been updated for those species with new genome assemblies, new gene builds or new arrays.

patch_74_75_e.sql (Human)

Attrib-related tables no longer allow duplicate entries.

Unique key constraints have been added to enforce this.
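The effect of such a constraint can be seen in a minimal Python sketch using an in-memory SQLite database. The table name demo_attrib and its three columns are assumptions for illustration; the actual patch targets the real attrib tables, and the exact columns covered by each key are not listed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical attrib-style table with a unique key over all three columns.
conn.execute(
    """
    CREATE TABLE demo_attrib (
        feature_id     INTEGER NOT NULL,
        attrib_type_id INTEGER NOT NULL,
        value          TEXT    NOT NULL,
        UNIQUE (feature_id, attrib_type_id, value)
    )
    """
)

conn.execute("INSERT INTO demo_attrib VALUES (1, 10, 'example')")

# A second, identical row violates the unique key and is rejected.
try:
    conn.execute("INSERT INTO demo_attrib VALUES (1, 10, 'example')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM demo_attrib").fetchone()[0]
```

Loading scripts that previously relied on inserting the same attribute twice would need to handle this error or deduplicate beforehand.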

Experiment FeatureType and CellType (patch_74_75_f) (Human)

The experiment table has had additional feature_type_id and cell_type_id fields added, and the associated API classes have been updated. This is in line with the current usage within the analysis pipelines and prevents the experiment class from being used as a study with which many feature/cell types can be associated.

ProteinTrees and homologies (Human)


GeneTrees (protein-coding) with new/updated genebuilds and assemblies

  • all-vs-all blastp (ncbi-blast-2.2.28+)
  • Clustering using hcluster_sg
  • Multiple sequence alignments using MCoffee (Version_9.03.r1318) or Mafft (mafft-7.113)
  • Phylogenetic reconstruction using TreeBeST
  • Homology inference
  • Pairwise gene-based dN/dS scores for high coverage species pairs only (both on orthologues and paralogues) (codeml/PAML v4.3)
  • GeneTree stable ID mapping
  • Per family gene dynamics using CAFE (v2.2)

ncRNAtrees and homologies (Human)


  • Classification based on Rfam models (v11.0)
  • Multiple sequence alignments with Infernal
  • Phylogenetic reconstruction using RAxML
  • Phylogenetic reconstruction using FastTree2 and RAxML-Light for very big families
  • Additional multiple sequence alignments with Prank (w/ genomic flanks)
  • Additional phylogenetic reconstruction using PhyML and NJ
  • Phylogenetic tree merging using TreeBeST
  • Per family gene dynamics using CAFE
  • Homology inference
  • Secondary structure plots

Protein Families (Human)


Updated MCL families including all Ensembl transcript isoforms (including human non-reference haplotypes) and newest Uniprot Metazoa.

  • Getting distances by NCBI BlastP (v.2.2.28+)
  • Clustering by MCL (v.12-135)
  • Multiple Sequence Alignments with MAFFT (v.7.113)
  • Family stable ID mapping

Compara dumps (Human)


  • Data dumps for ProteinTrees
  • Data dumps for ncRNAtrees
  • OrthoXML dumps for ProteinTrees
  • OrthoXML dumps for ncRNAtrees
  • PhyloXML dumps for ProteinTrees
  • PhyloXML dumps for ncRNAtrees

API/schema changes (Human)


  • Extended the genome_db table (and the corresponding API) with two extra fields (has_karyotype and is_high_coverage)
  • Web display information is now annotated in species_tree_node entries instead of species_set_tags

Link from Region in Detail to individual exons (Human)

The popup menu that appears when you click on a transcript now includes a link to the Exon table if you click on an individual exon, and the exon you clicked on is shown in bold on the table. Note that this link will only appear when you are zoomed in enough for the click coordinates to clearly identify a single exon.

Retirement of archive 61 (Human)

This release cycle we will be retiring archive 61 (Feb 2011), in accordance with our three-year rolling retirement policy. The data will remain available on our public database server; only the web interface will be removed.

Remove read_coverage table (Human)

The read_coverage table and associated API support will be removed.

There are only a few individuals across our resequencing data with read coverage data, much of which has been remapped between assemblies and may no longer be reliable.

Removing this will speed up code and clean up some of the web displays.

patch_74_75_a.sql - schema_version update in ontology db (Human)

Update schema_version in meta table to 75.

patch_74_75b.sql - longer code in attrib_type (Human)

'code' column in master_attrib_type table expanded

patch_74_75_f.sql - longer code (Human)

'code' column in attrib_type table longer

Retirement of ensembl-draw (Human)

The ensembl-draw repository has been merged with ensembl-webcode.

The files that were in ensembl-draw can now be found in ensembl-webcode/modules/Sanger/Graphics and ensembl-webcode/modules/Bio/EnsEMBL.

Documentation move (Human)

To aid internal management of git permissions, we will be moving ensembl-webcode/htdocs/info into a separate public plugin, docs. No page URLs will change, but external developers will need to enable this plugin in order to display documentation for the API, etc on an Ensembl-powered website.

ensembl-webcode directory (Human)

Web code is now stored inside the top level directory ensembl-webcode.

A new variable has been added to SiteDefs, called $ENSEMBL_WEBROOT, which has the value of "$ENSEMBL_SERVERROOT/ensembl-webcode", and is used when locating files and directories that are inside ensembl-webcode. $ENSEMBL_SERVERROOT remains unchanged.
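The relationship between the two variables can be sketched in Python as a simple path join. The real definition lives in the Perl SiteDefs module; the function name and the example server root below are hypothetical.

```python
import posixpath


def ensembl_webroot(server_root):
    """Derive the web root from the server root, as described above."""
    return posixpath.join(server_root, "ensembl-webcode")


# Hypothetical server root, used only for illustration.
example_webroot = ensembl_webroot("/srv/ensembl")
```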

Stable ID lookup (Human)

Stable ID lookup provided for REST services

Removal of "default action" (Human)

We have removed the web behaviour whereby invalid URLs for genomic views were silently redirected to the default view for that gene/location/etc. This was causing issues with some scripts connecting to the website, including our own selenium testing. Invalid URLs now show a custom 404 component within the standard page template.

Search enhancements (Human)

Ensembl search has a number of improvements including (i) more extensive highlighting of the search term on the results page, (ii) improved ordering of results, and (iii) better handling of non-alphanumeric characters in search queries.

patch_74_75_a.sql - schema_version update (Human)

Update schema_version in meta table to 75.

patch_74_75_b.sql - transcript source (Human)

Adding the source column to the transcript table

patch_74_75_d.sql - default source for transcripts (Human)

The new transcript source column requires a default value, which is now set to 'ensembl'.