Ensembl News for Release 76 (August 2014)

New web displays and tools

New BLAST interface (Elephant)

Ensembl are delighted to announce the release of our new BLAST/BLAT interface, which uses the same Tools infrastructure as the new web-based VEP that came out in release 75.

Highlights include:

  • Tickets saved to a user-friendly table so you can come back later and view or re-run your search
  • Automatic polling of the backend process - no more clicking "Retrieve"!
  • Better error-reporting, so that you can find out more easily why a job failed
  • We now use ncbi-blast instead of wu-blast. This avoids the licensing issues associated with the latter and enables us to distribute our BLAST code more freely

Please let us know what you think!

Datahub listings (Elephant)

Ensembl has had support for datahubs (also known as track hubs) for some time, and we have now added a page listing available public hubs for our species. Where appropriate, a link will take you to the sample location for a species and open the control panel so that you can easily configure the hub.

Please note that, particularly with the recent release of human GRCh38, many of these hubs are on old assemblies, so where possible* our links will take you to the most recent archive which included that assembly.

If you have a datahub that you would like to have included in our registry, please contact us.

* Some assemblies are only available on archives that predate datahub support

New Assembly Converter interface (Elephant)

In addition to the new BLAST interface, we have updated our online Assembly Converter to help our users migrate to GRCh38. The new tool currently uses CrossMap, which means input is limited to the following formats:

  • BED
  • GFF
  • GTF
  • VCF
  • WIG

For more information, see our online documentation on file format specifications.

(CrossMap also supports BigBED and BigWIG, but we are unable to offer upload of large files at the moment. We recommend using CrossMap locally if you have very large datasets to process.)

Initially we are offering this tool only for human assembly mappings GRCh37 -> GRCh38 and NCBI36 -> GRCh38. More species will be added as soon as we have the chain files available.

Important note: CrossMap discards file metadata such as track lines. We are working to implement an updated assembly mapper that preserves metadata, which will be released in a future version of Ensembl.
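
Until a metadata-preserving mapper is available, one workaround is to set the metadata lines aside before conversion and put them back afterwards. The sketch below is illustrative only: it assumes that metadata lines are those starting with "track", "browser" or "#", and the function names are hypothetical rather than part of CrossMap or Ensembl. Any coordinates mentioned inside a track line are not converted by this approach.

```python
from typing import List, Tuple

# Assumption: these prefixes mark metadata lines in BED-like files.
METADATA_PREFIXES = ("track", "browser", "#")


def split_metadata(lines: List[str]) -> Tuple[List[str], List[str]]:
    """Separate metadata lines from feature lines, keeping their order."""
    metadata, features = [], []
    for line in lines:
        if line.startswith(METADATA_PREFIXES):
            metadata.append(line)
        else:
            features.append(line)
    return metadata, features


def restore_metadata(metadata: List[str], converted: List[str]) -> List[str]:
    """Put the saved metadata lines back at the top of the converted output."""
    return metadata + converted


original = [
    'track name="my_peaks" description="Example track"',
    "chr1\t1000\t2000\tpeak1",
    "chr1\t5000\t6000\tpeak2",
]
metadata, features = split_metadata(original)
# `features` is what would be sent to the converter; here it is reused
# unchanged as a stand-in for the converted output.
restored = restore_metadata(metadata, features)
```

Only the feature lines would be submitted to the converter; the saved lines are prepended to whatever comes back.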

New page - news by topic (Elephant)

In addition to viewing all news for a given release, you can now view all news on a given topic, sorted in reverse chronological order. For example, if you want to see all new web features for the last several releases, you can use this link (which also appears on the home page):

All web updates, by release

The following topics are also available:

We hope you find this page useful!

New interface for sequence export (Elephant)

We are embarking on an upgrade of our Export interface, to make it simpler to use and more directly related to the data you see on screen. The first component to be released is for "text sequence" pages, i.e. views that show DNA or peptide sequence as text.

Features of the new interface:

  • Single interface for FASTA and RTF output
  • Improved support for RTF fonts and background colours in OSX
  • Select "uncompressed" or "gzipped" in the same input form as your other output options, rather than having to click on a second link
  • Uncompressed output can be copied and pasted from the file preview or downloaded onto your computer

Note that we have removed the old "HTML" format for FASTA as it was irrelevant in the new interface.

Also, the old "Export data" button has been disabled on pages that use this new interface, to avoid confusion. The old button will be phased out over the next few releases as we upgrade other components and make new export tools available.

Support for VCF format (Elephant)

Our web interface now supports two types of VCF file. Small files (up to 5MB) can now be uploaded to the website for display by selecting the "VCF" format from the dropdown menu; if you wish to attach a tabix-indexed VCF, please select "VCF (indexed)".
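
The choice between the two options can be sketched as follows. This is a hypothetical helper, not Ensembl code: the 5MB limit comes from the text above and is interpreted here as 5 x 1024 x 1024 bytes, and a tabix index is assumed to sit next to the file with a ".tbi" suffix.

```python
import os

# Assumption: "5MB" is interpreted as 5 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 5 * 1024 * 1024


def choose_vcf_format(size_bytes: int, has_tabix_index: bool) -> str:
    """Suggest which dropdown option fits a VCF file."""
    if has_tabix_index:
        return "VCF (indexed)"
    if size_bytes <= MAX_UPLOAD_BYTES:
        return "VCF"
    raise ValueError(
        "File is too large to upload; index it with tabix and attach it instead"
    )


def choose_for_path(path: str) -> str:
    """Inspect a local file and its conventional .tbi index, if present."""
    return choose_vcf_format(os.path.getsize(path), os.path.exists(path + ".tbi"))
```

For example, `choose_vcf_format(1024, False)` suggests "VCF", while a 10MB file with an index is directed to "VCF (indexed)".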

Other updates

Drawing code changing namespace (Elephant)

As part of our ongoing reorganisation of the webcode git repositories, we have moved all Bio::EnsEMBL drawing code to EnsEMBL::Draw.

This change will affect developers who have created custom drawing code tracks or components that draw images using the DrawableContainer or VDrawableContainer modules. A simple change of namespace in your code will ensure that it continues to work.
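
As an illustration of the namespace change, the sketch below rewrites references to the two container modules named above in a piece of Perl source. It assumes the new names are EnsEMBL::Draw::DrawableContainer and EnsEMBL::Draw::VDrawableContainer, which the text implies but does not spell out; please check the webcode repository for the definitive names before applying anything like this to real code.

```python
import re

# Assumed old -> new module names; verify against the webcode repository.
RENAMES = {
    "Bio::EnsEMBL::DrawableContainer": "EnsEMBL::Draw::DrawableContainer",
    "Bio::EnsEMBL::VDrawableContainer": "EnsEMBL::Draw::VDrawableContainer",
}

# Match whole module names only, so longer names that merely start with
# one of these are left alone.
PATTERN = re.compile(
    r"(?<![\w:])(" + "|".join(re.escape(name) for name in RENAMES) + r")(?![\w:])"
)


def update_namespace(perl_source: str) -> str:
    """Return the source with the old drawing namespaces replaced."""
    return PATTERN.sub(lambda match: RENAMES[match.group(1)], perl_source)


before = (
    "use Bio::EnsEMBL::VDrawableContainer;\n"
    "my $dc = Bio::EnsEMBL::DrawableContainer->new(@args);\n"
)
after = update_namespace(before)
```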

Ensembl 76 mart databases (Elephant)

  • Ensembl Genes 76
    • Retirement of the human expression data (eGenetics/SANBI EST and GNF/Atlas) and the zebrafish expression data (ZFIN). This will result in the removal of the "EXPRESSION" filter and attribute sections.
    • Retirement of the transcript splicing events data computed by the pipeline developed as part of the ASTD project. These data will be retired for human (Homo sapiens), mouse (Mus musculus), rat (Rattus norvegicus), zebrafish (Danio rerio), C. elegans (Caenorhabditis elegans) and fruitfly (Drosophila melanogaster). This will result in the removal of the "Transcript event" filter and attribute sections.
    • Updated human assembly to GRCh38
    • Added new species Amazon molly (Poecilia formosa) and Olive baboon (Papio anubis)
    • Added new Gene and Transcript version attributes for all the species
    • Added new Transcription Start Site (TSS) attribute in the structure and sequence sections for all the species.
    • Added "QTL chromosome name" and "QTL region" filters for sheep and chicken
    • Added marker start and end filters for sheep
  • Ensembl Variation 76
    • Updated human assembly to GRCh38
    • Added "QTL chromosome name" and "QTL region" filters for sheep and chicken
  • Vega 56
    • Updated human assembly to GRCh38
  • Ensembl Regulation 76
    • Updated human assembly to GRCh38

All species: updated RefSeq sequence synonyms (Elephant)

The imported RefSeq sequence synonyms have been updated for all species.

Human: updated cDNA alignments (Elephant)

A new cDNA database was created for e75: the latest set of cDNAs for human (as of December 2013) from the European Nucleotide Archive and NCBI RefSeq (release nn) was aligned to the current genome using Exonerate.

Retiring of GBrowse format (Elephant)

In release 76 we will be discontinuing support for the GBrowse file format, as it is little-used, and will be focusing in future on improving support for more popular formats.

EMBL and Genbank Dumps (Elephant)

EMBL and Genbank dumps for all species.

External reference projection (Elephant)

Gene ontology (GO) identifiers and gene name projection to all species.

FASTA & GTF dumps (Elephant)

FASTA & GTF dumps for all the species

Retirement of the Alternative Splicing Event pipeline (Elephant)

As part of the retirement of the Alternative splicing event pipeline, we have removed the following directory from the ensembl-production git repository: ensembl-production/scripts/alternative_splicing

patch_75_76a.sql - schema_version update in production db (Elephant)

Update schema_version in production database to 76.

Stable ID lookup (Elephant)

Stable ID lookup provided for REST services

Includes lookup for RefSeq and CCDS entries
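
For illustration, a stable ID lookup over REST might look like the sketch below. It assumes the public server at https://rest.ensembl.org and its lookup/id endpoint; the server address and the response fields may differ between releases, so treat this as a sketch rather than a reference. The example ID is an arbitrary human gene.

```python
import json
import urllib.parse
import urllib.request

# Assumption: address of the public Ensembl REST server.
REST_SERVER = "https://rest.ensembl.org"


def lookup_url(stable_id: str) -> str:
    """Build the URL that asks the REST server about one stable ID."""
    return f"{REST_SERVER}/lookup/id/{urllib.parse.quote(stable_id)}"


def lookup(stable_id: str) -> dict:
    """Fetch the lookup result as a dictionary (requires network access)."""
    request = urllib.request.Request(
        lookup_url(stable_id), headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


url = lookup_url("ENSG00000157764")
```

Calling `lookup("ENSG00000157764")` would perform the request itself; it is kept separate from URL construction so that the latter can be checked offline.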

patch_75_76_a.sql - schema_version update in ontology db (Elephant)

Update schema_version in meta table to 76.

ProteinTrees and homologies (Elephant)

GeneTrees (protein-coding) with new/updated genebuilds and assemblies

-- all-vs-all blastp (ncbi-blast-2.2.27+)
-- Clustering using hcluster_sg
-- Multiple sequence alignments using MCoffee (Version_9.03.r1318) or Mafft (mafft-7.017)
-- Phylogenetic reconstruction using TreeBeST
-- Homology inference
-- Pairwise gene-based dN/dS scores for high coverage species pairs only (both on orthologues and paralogues) (codeml/PAML v4.3)
-- GeneTree stable ID mapping
-- Per family gene dynamics using CAFE (v2.2)

ncRNAtrees and homologies (Elephant)

Classification based on Rfam models (v11.0)

Multiple sequence alignments with Infernal

Phylogenetic reconstruction using RAxML

Phylogenetic reconstruction using FastTree2 and RAxML-Light for very big families

Additional multiple sequence alignments with Prank (w/ genomic flanks)

Additional phylogenetic reconstruction using PhyML and NJ

Phylogenetic tree merging using TreeBeST

Per family gene dynamics using CAFE

Homology inference

Secondary structure plots

Protein Families (Elephant)

New pipeline that makes the Families consistent with the gene-trees. It includes all the Ensembl transcript isoforms (including human non-reference haplotypes) and the newest UniProt Metazoa.

Clustering using the TreeFam 10 HMM library

Multiple Sequence Alignments with MAFFT (v.7.017)

Family stable ID mapping

API/Schema changes (Elephant)

- Several objects now inherit from Storable (methods: dbID(), adaptor(), new_fast(), new())

- Methods scheduled for deletion in e76 have been removed

- Split member into seq_member and gene_member + members depend on dnafrags

lastz alignments (Elephant)

lastz H.sap-C.hof (on H.sap)  ( choloepus_hoffmanni, homo_sapiens ) 

lastz H.sap-O.ari (on H.sap) ( ovis_aries, homo_sapiens )

lastz H.sap-C.por (on H.sap)  ( cavia_porcellus, homo_sapiens )

lastz H.sap-D.nov (on H.sap)  ( dasypus_novemcinctus, homo_sapiens )

lastz H.sap-D.ord (on H.sap)  ( dipodomys_ordii, homo_sapiens ) 

lastz H.sap-E.eur (on H.sap)  ( erinaceus_europaeus, homo_sapiens )

lastz H.sap-E.tel (on H.sap)  ( echinops_telfairi, homo_sapiens )

lastz H.sap-L.afr (on H.sap)  ( homo_sapiens, loxodonta_africana )

lastz H.sap-M.eug (on H.sap)  ( homo_sapiens, macropus_eugenii )

lastz H.sap-M.mur (on H.sap)  ( microcebus_murinus, homo_sapiens )

lastz H.sap-O.pri (on H.sap)  ( ochotona_princeps, homo_sapiens )

lastz H.sap-P.cap (on H.sap)  ( procavia_capensis, homo_sapiens )

lastz H.sap-P.vam (on H.sap)  ( pteropus_vampyrus, homo_sapiens )

lastz H.sap-S.ara (on H.sap)  ( sorex_araneus, homo_sapiens )

lastz H.sap-T.bel (on H.sap)  ( tupaia_belangeri, homo_sapiens )

lastz H.sap-T.syr (on H.sap)  ( tarsius_syrichta, homo_sapiens )

lastz H.sap-T.tru (on H.sap)  ( tursiops_truncatus, homo_sapiens )

lastz H.sap-V.pac (on H.sap)  ( vicugna_pacos, homo_sapiens )

lastz H.sap-I.tri (on H.sap)  ( homo_sapiens, ictidomys_tridecemlineatus )

lastz H.sap-M.fur (on H.sap)  ( homo_sapiens, mustela_putorius_furo ) 

lastz H.sap-M.luc (on H.sap)  ( homo_sapiens, myotis_lucifugus ) 

lastz H.sap-A.mel (on H.sap)  ( homo_sapiens, ailuropoda_melanoleuca )

lastz H.sap-E.cab (on H.sap)  ( equus_caballus, homo_sapiens )

lastz H.sap-M.dom (on H.sap)  ( monodelphis_domestica, homo_sapiens )

lastz H.sap-O.ana (on H.sap)  ( ornithorhynchus_anatinus, homo_sapiens )

lastz H.sap-B.tau (on H.sap)  ( homo_sapiens, bos_taurus )

lastz H.sap-C.fam (on H.sap)  ( homo_sapiens, canis_familiaris )

lastz H.sap-C.jac (on H.sap)  ( homo_sapiens, callithrix_jacchus ) 

lastz H.sap-F.cat (on H.sap)  ( homo_sapiens, felis_catus ) 

lastz H.sap-G.gor (on H.sap)  ( homo_sapiens, gorilla_gorilla ) 

lastz H.sap-M.mul (on H.sap)  ( macaca_mulatta, homo_sapiens )

lastz H.sap-M.mus (on H.sap)  ( homo_sapiens, mus_musculus )

lastz H.sap-N.leu (on H.sap)  ( homo_sapiens, nomascus_leucogenys )

lastz H.sap-O.cun (on H.sap)  ( homo_sapiens, oryctolagus_cuniculus )

lastz H.sap-O.gar (on H.sap)  ( homo_sapiens, otolemur_garnettii )

lastz H.sap-P.abe (on H.sap)  ( homo_sapiens, pongo_abelii )

lastz H.sap-P.tro (on H.sap)  ( homo_sapiens, pan_troglodytes )

lastz H.sap-R.nor (on H.sap)  ( homo_sapiens, rattus_norvegicus )

lastz H.sap-S.har (on H.sap)  ( homo_sapiens, sarcophilus_harrisii )

lastz H.sap-S.scr (on H.sap)  ( homo_sapiens, sus_scrofa )

lastz H.sap-A.pla (on H.sap)  ( homo_sapiens, anas_platyrhynchos )

lastz H.sap-F.alb (on H.sap)  ( homo_sapiens, ficedula_albicollis )

lastz H.sap-G.gal (on H.sap)  ( homo_sapiens, gallus_gallus )

lastz H.sap-P.sin (on H.sap)  ( homo_sapiens, pelodiscus_sinensis )

lastz H.sap (on H.sap) ( homo_sapiens )

lastz C.sav-H.sap (on H.sap) ( ciona_savignyi,homo_sapiens )

lastz H.sap-A.car (on H.sap) ( homo_sapiens,anolis_carolinensis )

lastz H.sap-C.int (on H.sap) ( homo_sapiens,ciona_intestinalis )

lastz H.sap-D.rer (on H.sap) ( homo_sapiens,danio_rerio )

lastz H.sap-G.acu (on H.sap) ( gasterosteus_aculeatus,homo_sapiens )

lastz H.sap-G.mor (on H.sap) ( homo_sapiens,gadus_morhua )

lastz H.sap-L.cha (on H.sap) ( homo_sapiens,latimeria_chalumnae )

lastz H.sap-O.lat (on H.sap) ( oryzias_latipes,homo_sapiens )

lastz H.sap-O.nil (on H.sap) ( homo_sapiens,oreochromis_niloticus )

lastz H.sap-P.mar (on H.sap) ( homo_sapiens,petromyzon_marinus )

lastz H.sap-T.gut (on H.sap) ( taeniopygia_guttata,homo_sapiens )

lastz H.sap-T.nig (on H.sap) ( tetraodon_nigroviridis,homo_sapiens )

lastz H.sap-T.rub (on H.sap) ( takifugu_rubripes,homo_sapiens )

lastz H.sap-X.mac (on H.sap) ( homo_sapiens,xiphophorus_maculatus )

lastz H.sap-X.tro (on H.sap) ( homo_sapiens,xenopus_tropicalis ) 

lastz M.gal-H.sap (on H.sap) ( homo_sapiens,meleagris_gallopavo )

lastz M.mus-C.por (on M.mus) ( mus_musculus,cavia_porcellus )

lastz C.fam-F.cat (on C.fam) ( canis_familiaris,felis_catus )

lastz C.fam-M.fur (on C.fam) (canis_familiaris,mustela_putorius_furo )

lastz M.mus-S.ara (on M.mus) ( mus_musculus,sorex_araneus)

lastz B.tau-T.tru (on B.tau) (bos_taurus,tursiops_truncatus)

lastz B.tau-F.cat (on B.tau) (bos_taurus,felis_catus)

lastz B.tau-M.fur (on B.tau) (bos_taurus,mustela_putorius_furo)

lastz B.tau-P.vam (on B.tau) (bos_taurus,pteropus_vampyrus)

lastz H.sap-P.anu (on H.sap) (homo_sapiens, papio_anubis)

lastz H.sap-C.pyg (on H.sap) (homo_sapiens, chlorocebus_pygerythrus)

lastz G.gal-M.dom (on G.gal) (gallus_gallus,monodelphis_domestica)

lastz O.lat-M.mus (on O.lat) (oryzias_latipes, mus_musculus)

Syntenies (Elephant)

H.sap-M.dom (on H.sap)

H.sap-O.ana (on H.sap)

H.sap-B.tau (on H.sap) 

H.sap-C.fam (on H.sap) 

H.sap-E.cab (on H.sap)

H.sap-M.gal (on H.sap)

H.sap-C.jac (on H.sap)

H.sap-F.cat (on H.sap)

H.sap-G.gor (on H.sap)

H.sap-M.mul (on H.sap)

H.sap-M.mus (on H.sap)

H.sap-O.cun (on H.sap)

H.sap-P.tro (on H.sap)

H.sap-R.nor (on H.sap)

H.sap-S.scr (on H.sap) 

H.sap-G.gal (on H.sap)

H.sap-A.car (on H.sap)

H.sap-D.rer (on H.sap)

H.sap-G.acu (on H.sap) 

H.sap-O.lat (on H.sap)

H.sap-T.gut (on H.sap) 

H.sap-T.nig (on H.sap)

C.fam-F.cat (on C.fam)

B.tau-F.cat (on B.tau)

H.sap-O.ari (on H.sap)

Retirement of archives 62 and 63 (Elephant)

In this release cycle we will be retiring archives 62 (April 2011) and 63 (June 2011), in accordance with our three-year rolling retirement policy. The data will remain available on our public database server; only the web interface will be removed.
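
The three-year rule can be checked with simple date arithmetic. The sketch below assumes that an archive becomes eligible for retirement once it is at least three years older than the current release, and it works at month resolution; the exact cut-off applied in practice may differ.

```python
from datetime import date

RETENTION_YEARS = 3  # from the rolling retirement policy described above


def is_due_for_retirement(archive: date, current_release: date) -> bool:
    """True if the archive is at least RETENTION_YEARS older than the release."""
    months_old = (current_release.year - archive.year) * 12 + (
        current_release.month - archive.month
    )
    return months_old >= RETENTION_YEARS * 12


release_76 = date(2014, 8, 1)
archives = {62: date(2011, 4, 1), 63: date(2011, 6, 1)}
due = {
    number: is_due_for_retirement(when, release_76)
    for number, when in archives.items()
}
```

Both archives are more than 36 months older than release 76 (40 and 38 months respectively), so both are flagged.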

patch_75_76_a.sql - schema_version update (Elephant)

Update schema_version in meta table to 76.
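
In effect, this patch changes one row of the meta table. The sketch below reproduces the idea against an in-memory SQLite database so that it can be run anywhere; the real patch is a SQL file applied to MySQL, and the simplified two-column table used here is an assumption made for illustration.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
# Simplified stand-in for the meta table (assumption: only the columns needed here).
connection.execute("CREATE TABLE meta (meta_key TEXT, meta_value TEXT)")
connection.execute("INSERT INTO meta VALUES ('schema_version', '75')")

# The essence of the patch: bump schema_version from 75 to 76.
connection.execute(
    "UPDATE meta SET meta_value = ? WHERE meta_key = 'schema_version'", ("76",)
)
connection.commit()

schema_version = connection.execute(
    "SELECT meta_value FROM meta WHERE meta_key = 'schema_version'"
).fetchone()[0]
```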

patch_75_76_b.sql - allow null karyotype (Elephant)

The band and stain columns in the karyotype table can now be null.

Ensembl VM Build (Elephant)

The Ensembl Virtual Machine appliance will be updated to version 76.

Includes more disk space for VEP compatibility

patch_75_76_c.sql - alternative splicing event retirement (Elephant)

Remove splicing event tables which are not used any more.

Retirement of splicing event related material (Elephant)

Match the removal of the splicing_event tables, by retiring the corresponding modules:

- Bio::EnsEMBL::DBSQL::SplicingEventAdaptor

- Bio::EnsEMBL::DBSQL::SplicingEventFeatureAdaptor

- Bio::EnsEMBL::DBSQL::SplicingTranscriptPairAdaptor

- Bio::EnsEMBL::SplicingEvent

- Bio::EnsEMBL::SplicingEventFeature

- Bio::EnsEMBL::SplicingTranscriptPair
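
To find code that still depends on the retired modules, a simple text scan is usually enough. The sketch below is a hypothetical helper, not part of the Ensembl API: it reports which of the six modules listed above are mentioned in a piece of Perl source.

```python
import re
from typing import List

# Module names copied from the list above.
RETIRED_MODULES = [
    "Bio::EnsEMBL::DBSQL::SplicingEventAdaptor",
    "Bio::EnsEMBL::DBSQL::SplicingEventFeatureAdaptor",
    "Bio::EnsEMBL::DBSQL::SplicingTranscriptPairAdaptor",
    "Bio::EnsEMBL::SplicingEvent",
    "Bio::EnsEMBL::SplicingEventFeature",
    "Bio::EnsEMBL::SplicingTranscriptPair",
]


def find_retired_modules(perl_source: str) -> List[str]:
    """Return the retired modules referenced in the source, in list order."""
    found = []
    for module in RETIRED_MODULES:
        # Whole-name match, so SplicingEvent does not also match SplicingEventFeature.
        pattern = r"(?<![\w:])" + re.escape(module) + r"(?![\w:])"
        if re.search(pattern, perl_source):
            found.append(module)
    return found


example = (
    "use Bio::EnsEMBL::SplicingEventFeature;\n"
    "my $gene = $gene_adaptor->fetch_by_stable_id($id);\n"
)
hits = find_retired_modules(example)
```

The same function could be applied to each file in a code base to list the places that need updating before moving to release 76.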