- ncRNA secondary structure now displayed on the Gene Summary page
- New matrix configuration for RNASeq models
- New species: sheep (Ovis aries), cave fish (Astyanax mexicanus) and spotted gar (Lepisosteus oculatus)
- Updated patches for the human assembly (GRCh37.p13) and mouse assembly (GRCm38.p2)
New web displays and tools
Matrix configuration for RNASeq models (Human)
Configuration for RNASeq models for Human and Opossum now appears as a matrix, which makes the large number of available tracks easier to manage.
Species list now exportable (Human)
In response to user requests we have reformatted the Ensembl species page as an exportable table; where relevant the columns are sortable. In addition, we have included taxon IDs and whether or not the species has variation and regulation databases.
ncRNA secondary structure (Human)
We now show secondary structure of non-coding RNAs on the Gene Summary page, using the R2R package.
If you click on the 'Enlarge' link, you will be taken to a page that shows a more detailed image of the structure, including base pair annotations.
Note for developers:
If you run an Ensembl mirror and wish to use this functionality, you will need to compile R2R on your webserver and then configure the R2R_BIN path in SiteDefs.pm, e.g.
$SiteDefs::R2R_BIN = '/usr/bin/r2r';
Without these steps, the webcode will omit the SVG image.
Phenotypes from orthologs (Human)
The gene phenotype view will show phenotypes associated with orthologues of the current gene.
Updated configuration management interface (Human)
You will now be able to share configurations and configuration sets with other users, and with groups you administer.
Configurations and sets from groups you are in will appear in the interface, so you can use them.
You will also see a section called "Suggested Configurations". This contains configurations and sets created by our outreach team, in order to provide a quick and easy means of configuring the site in a number of predefined ways.
To reach the interface, click the Manage your Data button; in the popup window, open the Personal Data tab and select Manage Configurations in the left menu.
New variation data
New dbSNP imports (Human)
dbSNP Build 138 data will be imported
New regulation data
Updated microarrays (Human)
- Illumina HumanHT-12 had its version number '_V3' appended, and the new HumanHT-12_V4 was added.
- The Illumina HumanRef-8_V3 was added.
- Agilent SurePrint G3 GE 8x60k had its version number '_V2' appended
- The Affy HuGene-1_0-st-v1 was updated to HuGene-2_0-st-v1
- The Illumina MouseRef-8_V2 was added.
- The Illumina RatRef-12 array had its version '_V1' appended.
Update to Ensembl-Havana GENCODE gene set (release 19) (Human)
Updated Ensembl-Havana gene set (GENCODE release 19). This gene set is a merge of complete Ensembl gene models and the latest Havana gene annotation. All CCDS genes are included in this gene set.
The human GRCh37.p13 gene annotation is also included:
The patches for GRCh37.p13 were annotated using a combination of manual annotation, annotation projected from the primary assembly and annotation derived from cDNA and protein alignment evidence. Annotation of the patches is stored in the core database.
Human: assembly updated to GRCh37.p13 (Human)
The human genome assembly was updated to GRCh37.p13 and the assembly information in all human databases has been altered accordingly. This minor assembly update contains 204 assembly patches. The DNA sequence for the primary assembly (chromosomes 1-22, X, Y, unlocalized scaffolds and unplaced scaffolds) remains unchanged.
Human: updated cDNA alignments (Human)
A new cDNA database was created for release 74: the latest set of cDNAs for human (as of October 2013) from the European Nucleotide Archive and NCBI RefSeq (release 61) was aligned to the current genome using Exonerate.
Updated human otherfeatures db: New CCDS import (Human)
This release of the human gene set also includes 27,732 transcript models as part of an updated version (Aug 2013) of CCDS
Human: GRCh37.p13 Karyotype Bands (Human)
Karyotype bands were updated in regions overlapping patches
Splicing events (Human)
The ASTD project computationally predicted genes in a similar way to Ensembl for every release, with a focus on alternative mRNA structures (splicing events, poly(A) sites, TSS) and features (ppt, exon-exon junction types).
Since 2010, the storage and display of this information has been an integral part of Ensembl for the following species:
- Homo sapiens
- Mus musculus
- Rattus norvegicus
- Danio rerio
- Caenorhabditis elegans
- Drosophila melanogaster
These data have been updated for this release for the species listed above.
HumanOmni5 imported (Human)
The list of variants assayed with the Illumina HumanOmni5 array will be imported as a variation_set and browser track.
Variation citation update (Human)
Variation citation data will be updated. Data mined by UCSC will be included for the first time.
External database references update (Human)
Xrefs update for:
human, mouse, sarcophilus_harrisii, otolemur_garnettii, equus_caballus, pongo_abelii, ornithorhynchus_anatinus, monodelphis_domestica, mustela_putorius_furo, xiphophorus_maculatus, macaca_mulatta, tetraodon_nigroviridis, bos_taurus, anolis_carolinensis, pan_troglodytes, gadus_morhua, loxodonta_africana, cavia_porcellus, callithrix_jacchus, sus_scrofa, rattus_norvegicus, meleagris_gallopavo, ailuropoda_melanoleuca, xenopus_tropicalis, nomascus_leucogenys, myotis_lucifugus, oreochromis_niloticus
Vega Human annotation updated (Human)
Manual annotation of human from Havana has been updated and contains the data released in Vega release 54.
Structural variations (Human)
- Update studies
- Import new studies.
Import COSMIC variants (Human)
Import COSMIC's version 67.
Human phenotype data (Human)
Update phenotype data for most of the current phenotype sources
Update NHLBI ESP data for human (Human)
We import the new version v.0.0.21 of NHLBI ESP data.
Ensembl 74 mart databases (Human)
- Ensembl Genes 74
  - Updated human assembly to GRCh37.p13, mouse assembly to GRCm38.p2 and armadillo assembly to Dasnov3.0.
  - Added new species cave fish (Astyanax mexicanus), sheep (Ovis aries) and spotted gar (Lepisosteus oculatus)
  - Added a new "Phenotype" filter section and new phenotype attributes in the GENE section
  - Replacement of the "possible orthologs" filter and attribute sections by new "Orthology confidence" in the orthologs attribute section and "Paralogy confidence" in the paralogs section. Definition for these two attributes can be found at the following location: http://www.ensembl.org/info/genome/compara/homology_method.html#homology_types
  - Added "Phase", "cDNA coding start", "cDNA coding end", "Genomic coding start" and "Genomic coding end" in the sequence attribute section
  - Added new filters and attributes including ChEMBL, RGD transcript name and new probes
- Ensembl Variation 74
  - Updated the human somatic variation database to COSMIC 67 data
  - Added a new "UCSC ID" and "Digital Object Identifier" attribute in the Variation citation section
  - Added Phenotype filters and attributes in the Human structural variation and Human structural variation somatic datasets.
  - Added new variation species sheep (Ovis aries)
- Vega 54
  - Updated human assembly to GRCh37.p13 and mouse assembly to GRCm38.p2
  - Added "ENA transcript ID" in the human filter and attribute sections
- Ensembl Regulation 74
FASTA & GTF dumps (Human)
FASTA (now includes CDS) & GTF dumps for all the species
External reference projection (Human)
Gene ontology (GO) identifiers and gene name projection to all species.
EMBL and Genbank Dumps (Human)
EMBL and Genbank dumps for all species.
LRG Import (Human)
Importing the latest version of Locus Reference Genomic dataset
Global UniParc xref update (Human)
Until now, UniParc has been specified inconsistently as an xref source in the configuration file: it is set as a source for some species but not for all.
UniParc will now be used for all species without exception.
Replacement of BiotypeMapper by SO mapper and EnsEMBL ORM API (Human)
The module Bio::EnsEMBL::Utils::BiotypeMapper encodes both the logic to convert biotypes to sequence ontology terms and the logic to deal with biotype groups.
As for the mapping between biotypes and SO terms, we will replace BiotypeMapper with a more dedicated SequenceOntologyMapper whose role is to convert EnsEMBL feature or related objects to SO terms.
The responsibility to deal with biotype groups in the production database will be granted to the Biotype and the corresponding Manager object in the EnsEMBL ORM framework.
Archive stable IDs (Human)
Archive stable IDs in the stable id lookup DB.
Stable ID lookup (Human)
Stable ID lookup provided for REST services
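As a rough illustration of what a stable ID lookup looks like from the client side, the sketch below builds and queries a lookup URL. The server address, the `lookup/id` endpoint path, the `content-type` query parameter and the example gene ID are all assumptions that are not stated in these notes; check the REST documentation for the address and endpoints that match your release.

```python
import json
import urllib.parse
import urllib.request

# Assumed server address; it is not given in the release notes.
REST_SERVER = "https://rest.ensembl.org"


def lookup_url(stable_id: str, server: str = REST_SERVER) -> str:
    """Build the URL for a stable ID lookup (assumed endpoint: lookup/id/<id>)."""
    quoted_id = urllib.parse.quote(stable_id, safe="")
    return f"{server}/lookup/id/{quoted_id}?content-type=application/json"


def lookup(stable_id: str, server: str = REST_SERVER) -> dict:
    """Fetch the lookup record for a stable ID as a dictionary (needs network access)."""
    with urllib.request.urlopen(lookup_url(stable_id, server), timeout=30) as response:
        return json.load(response)


# Example call (requires network access; the ID is only an illustration):
#   record = lookup("ENSG00000157764")
```

Keeping the URL construction separate from the network call makes it easy to point the same code at a mirror or an archive server by passing a different `server` value.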
New GenomeContainer (Human)
A new GenomeContainer module replaces the AssemblyAdaptor.
It is a more generic container for genome-related information, including:
- various annotation statistics
- various assembly lengths and slices
patch_73_74a.sql - schema_version update in production db (Human)
Update schema_version in meta table to 74.
patch_73_74_a.sql - schema_version update in ontology db (Human)
Update schema_version in meta table to 74.
input_set_input_subset_split (patch_73_74b) (Human)
The input_set and input_subset tables were patched to allow input_subsets to exist independently of input_set records. The input_set format and vendor fields were dropped in favour of an analysis field, and cell_type, feature_type and experiment fields were added to the input_subset table.
Schema change (Human)
Add clinical_significance column to variation_feature (copied from variation).
Add data_types column to source table.
15 way-mammal-epo alignments (Human)
with new Sheep assembly and Cat
21 way-amniota-pecan alignments (Human)
with new Sheep assembly
10way teleost fish EPO_LOW_COVERAGE (Human)
with new spotted gar and cave fish assemblies
ProteinTrees and homologies (Human)
GeneTrees (protein-coding) with new/updated genebuilds and assemblies
- all-vs-all blastp (ncbi-blast-2.2.27+)
- Clustering using hcluster_sg
- Multiple sequence alignments using MCoffee (Version_9.03.r1318) or Mafft (mafft-7.017)
- Phylogenetic reconstruction using TreeBeST
- Homology inference
- Pairwise gene-based dN/dS scores for high coverage species pairs only (both on orthologues and paralogues) (codeml/PAML v4.3)
- GeneTree stable ID mapping
- Per family gene dynamics using CAFE (v2.2)
ncRNAtrees and homologies (Human)
- Classification based on Rfam models (v11.0)
- Multiple sequence alignments with Infernal
- Phylogenetic reconstruction using RAxML
- Phylogenetic reconstruction using FastTree2 and RAxML-Light for very big families
- Additional multiple sequence alignments with Prank (w/ genomic flanks)
- Additional phylogenetic reconstruction using PhyML and NJ
- Phylogenetic tree merging using TreeBeST
- Per family gene dynamics using CAFE
- Homology inference
- Secondary structure plots
Protein Families (Human)
Updated MCL families including all Ensembl transcript isoforms (including human non-reference haplotypes) and newest UniProt Metazoa.
- Getting distances by NCBI BlastP (v.2.2.27+)
- Clustering by MCL (v.12-135)
- Multiple Sequence Alignments with MAFFT (v.7.017)
- Family stable ID mapping
Compara dumps (Human)
- Data dumps for ProteinTrees
- Data dumps for ncRNAtrees
- OrthoXML dumps for ProteinTrees
- OrthoXML dumps for ncRNAtrees
- PhyloXML dumps for ProteinTrees
- PhyloXML dumps for ncRNAtrees
- EMF dumps for 4 way EPO sauropsids
- EMF dumps for 7 way sauropsid EPO_LOW_COVERAGE multiple alignments
- EMF dumps for 15 way mammal EPO multiple alignments
- EMF dumps for 37 way EPO_LOW_COVERAGE multiple alignments
- EMF dumps for 21 way amniota-pecan multiple alignments
- EMF dumps for 10 way teleost fish EPO_LOW_COVERAGE multiple alignments
- BED files for 37 way EPO_LOW_COVERAGE alignments
- BED files for 10 way teleost fish EPO_LOW_COVERAGE
- BED files for 7 way sauropsids EPO_LOW_COVERAGE
API/Schema change: new Locus object (Human)
base class for DnaFragRegion, GenomicAlign and Member
API/Schema change: New SpeciesTree API (Human)
- New API to deal with species trees (+schema change in the species_tree_* tables)
API/Schema change: New API methods (+ schema change) to link gene tree nodes and homologues to the ancestral taxa (Human)
via the new fields in the species_tree_node table
API/Schema change: Inclusion of all the alternative alleles in the gene projections (Human)
between the reference sequence and the alternative sequences
API/Schema change: Changes in the CAFEGeneFamily API (Human)
to work with the new SpeciesTree API
Retirement of archive 60 (Human)
In accordance with our archive policy, we will be retiring the Ensembl 60 (Nov 2010) archive when version 74 is released. Release 60 data will still be available from our FTP site and public MySQL database - only the web front end will be retired.
Probe design support (patch_74_74_e) (Human)
The probe_design support has been dropped from the API and the schema. This does not affect the current expression array designs; it related to a historical table used in the design of tiling arrays.
SiteDefs rewrite (Human)
SiteDefs has been rewritten so that it no longer exports variables - they MUST now be referenced in other packages with $SiteDefs::VAR_NAME.
$SiteDefs::VERSION has been removed - use $SiteDefs::ENSEMBL_VERSION instead.
$SiteDefs::ENSEMBL_FLAG_NAMES_HR and $SiteDefs::ENSEMBL_FLAG_NAMES have been removed - use $SiteDefs::ENSEMBL_DEBUG_FLAG_NAMES instead.
$SiteDefs::SAMTOOLS_HTTP_PROXY and $SiteDefs::SOAP_PROXY have been replaced by $SiteDefs::HTTP_PROXY.
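For mirror maintainers with plugin code that still uses the old names, a small rewrite helper along the following lines may save some manual searching. This is only a sketch: it covers just the renames listed above, it assumes the references are already fully qualified with `$SiteDefs::`, and the function name `migrate_sitedefs_refs` is hypothetical. Where two old variables collapse into one new variable, the surrounding code may still need a manual review.

```python
import re

# Old name -> new name, taken from the renames listed in these notes.
RENAMED = {
    "VERSION": "ENSEMBL_VERSION",
    "ENSEMBL_FLAG_NAMES_HR": "ENSEMBL_DEBUG_FLAG_NAMES",
    "ENSEMBL_FLAG_NAMES": "ENSEMBL_DEBUG_FLAG_NAMES",
    "SAMTOOLS_HTTP_PROXY": "HTTP_PROXY",
    "SOAP_PROXY": "HTTP_PROXY",
}

# Longest names first, and a trailing word boundary, so that only whole
# variable names are matched (e.g. ENSEMBL_VERSION is left untouched).
_NAMES = "|".join(sorted(RENAMED, key=len, reverse=True))
_PATTERN = re.compile(r"\$SiteDefs::(" + _NAMES + r")\b")


def migrate_sitedefs_refs(perl_source: str) -> str:
    """Return perl_source with retired $SiteDefs:: variables renamed."""
    return _PATTERN.sub(lambda m: "$SiteDefs::" + RENAMED[m.group(1)], perl_source)
```

For example, `migrate_sitedefs_refs("my $v = $SiteDefs::VERSION;")` returns `my $v = $SiteDefs::ENSEMBL_VERSION;`.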
The following other variables have also been removed, with no replacements:
Status name length increased (patch_73_74_d) (Human)
The length of the name field of the status_name table was increased.
result_set input_subset support (patch_73_74_c) (Human)
The ResultSet API and schema were patched to allow input_subsets as supporting sets, and a replicate field was added to reflect the input_subset replicate value.
Ensembl VM Build (Human)
The Ensembl Virtual Machine appliance will be updated to version 74.
patch_73_74_a.sql - schema_version update (Human)
Update schema_version in meta table to 74.
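The effect of a schema_version patch can be sketched as a single update to the meta table. The example below uses an in-memory SQLite database as a stand-in for MySQL, and the meta table definition is a simplified assumption rather than the real schema; the function names are hypothetical.

```python
import sqlite3


def bump_schema_version(conn: sqlite3.Connection, new_version: int) -> str:
    """Set schema_version in the meta table and return the value now stored."""
    conn.execute(
        "UPDATE meta SET meta_value = ? WHERE meta_key = 'schema_version'",
        (str(new_version),),
    )
    conn.commit()
    row = conn.execute(
        "SELECT meta_value FROM meta WHERE meta_key = 'schema_version'"
    ).fetchone()
    return row[0]


def demo() -> str:
    """Create a toy meta table at version 73 and patch it to 74."""
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for the meta table (assumed columns).
    conn.execute(
        "CREATE TABLE meta ("
        "meta_id INTEGER PRIMARY KEY, species_id INTEGER, "
        "meta_key TEXT NOT NULL, meta_value TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT INTO meta (species_id, meta_key, meta_value) "
        "VALUES (NULL, 'schema_version', '73')"
    )
    return bump_schema_version(conn, 74)
```

Reading the value back after the update gives a quick check that the database and the API version you intend to use agree.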
patch_73_74_b.sql - dnac removal (Human)
Remove dnac table which is not used any more.
patch_73_74_c.sql - unconventional_transcript_association removal (Human)
Remove unconventional_transcript_association table which is not used any more.
patch_73_74_d.sql - QTL removal (Human)
Removal of the qtl tables (qtl, qtl_feature, qtl_synonym), which are not used any more.
patch_73_74_e.sql - Canonical_annotation removal (Human)
Removal of the canonical_annotation column in the gene table, as it is not used any more.
patch_73_74_f.sql - Pair_dna_align_feature removal (Human)
Removal of the pair_dna_align_feature_id column in the dna_align_feature table, as it is not used any more.
patch_73_74_g.sql - Adding transcript index to transcript_intron_supporting_evidence (Human)
Adding an index on transcript id to transcript_intron_supporting_evidence to speed up retrieval of supporting features from a Transcript object.
Retirement of dnac related material (Human)
To match the removal of the dnac table, we are retiring misc-scripts/utilities/dna_compress.pl (the driver script that populates and tests the experimental dnac table of ensembl) and, correspondingly, the module Bio::EnsEMBL::DBSQL::CompressedSequenceAdaptor.
Retirement of unconventional transcript association adaptor (Human)
Bio::EnsEMBL::DBSQL::UnconventionalTranscriptAssociationAdaptor will be retired, to match patch_73_74_c.sql which removes the table unconventional_transcript_association.
patch_73_74_h.sql - Adding unique index to alt_allele(gene_id) (Human)
Adding a unique index on gene_id in alt_allele to enforce a 1 gene 1 group policy.
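To see why a unique index on gene_id amounts to a 1 gene 1 group policy, consider the following minimal sketch. It uses an in-memory SQLite database in place of MySQL; the column list is a simplified assumption about the alt_allele table, and the index name `gene_idx` is hypothetical.

```python
import sqlite3


def second_group_rejected() -> bool:
    """Return True if a gene cannot be added to a second alt allele group."""
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for the alt_allele table (assumed columns).
    conn.execute(
        "CREATE TABLE alt_allele ("
        "alt_allele_id INTEGER PRIMARY KEY, "
        "alt_allele_group_id INTEGER NOT NULL, "
        "gene_id INTEGER NOT NULL)"
    )
    # The unique index is what enforces the policy.
    conn.execute("CREATE UNIQUE INDEX gene_idx ON alt_allele (gene_id)")
    conn.execute(
        "INSERT INTO alt_allele (alt_allele_group_id, gene_id) VALUES (1, 100)"
    )
    try:
        # Same gene, different group: this violates the unique index.
        conn.execute(
            "INSERT INTO alt_allele (alt_allele_group_id, gene_id) VALUES (2, 100)"
        )
    except sqlite3.IntegrityError:
        return True
    return False
```

With the index in place, the second insert fails at the database level, so scripts that load alt allele groups do not have to check for duplicate gene assignments themselves.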