- ncRNA secondary structure now displayed on the Gene Summary page
- New matrix configuration for RNASeq models
- New species: sheep (Ovis aries), cave fish (Astyanax mexicanus) and spotted gar (Lepisosteus oculatus)
- Updated patches for the human assembly (GRCh37.p13) and mouse assembly (GRCm38.p2)
New web displays and tools
Species list now exportable (Gorilla)
In response to user requests we have reformatted the Ensembl species page as an exportable table; where relevant the columns are sortable. In addition, we have included taxon IDs and whether or not the species has variation and regulation databases.
ncRNA secondary structure (Gorilla)
We now show secondary structure of non-coding RNAs on the Gene Summary page, using the R2R package.
If you click on the 'Enlarge' link, you will be taken to a page that shows a more detailed image of the structure, including base pair annotations.
Note for developers:
If you run an Ensembl mirror and wish to use this functionality, you will need to compile R2R on your webserver and then configure the R2R_BIN path in SiteDefs.pm, e.g.
$SiteDefs::R2R_BIN = '/usr/bin/r2r';
Without these steps, the webcode will omit the SVG image.
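Because a missing or non-executable binary results in the image being silently omitted, it can help to check the configured path before restarting the server. The following is a minimal sketch, not part of the Ensembl webcode; the helper name is_usable_binary is hypothetical, and the path is the example value shown above.

```python
import os


def is_usable_binary(path):
    """Return True if `path` is an existing regular file that the
    current user is allowed to execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    # Example value from the note above; substitute your own R2R_BIN path.
    r2r_bin = "/usr/bin/r2r"
    if is_usable_binary(r2r_bin):
        print("R2R binary found:", r2r_bin)
    else:
        print("R2R binary missing or not executable:", r2r_bin)
```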
Updated configuration management interface (Gorilla)
You will now be able to share configurations and configuration sets with other users, and with groups you administer.
Configurations and sets from groups you are in will appear in the interface, so you can use them.
You will also see a section called "Suggested Configurations". This contains configurations and sets created by our outreach team, in order to provide a quick and easy means of configuring the site in a number of predefined ways.
To reach the interface, click the Manage your Data button, open the Personal Data tab of the popup window, and select Manage Configurations in the left menu.
Ensembl 74 mart databases (Gorilla)
- Ensembl Genes 74
- Updated human assembly to GRCh37.p13, mouse assembly to GRCm38.p2 and armadillo assembly to Dasnov3.0.
- Added new species cave fish (Astyanax mexicanus), sheep (Ovis aries) and spotted gar (Lepisosteus oculatus)
- Added a new "Phenotype" filter section and new phenotype attributes in the GENE section
- Replaced the "possible orthologs" filter and attribute sections with new "Orthology confidence" in the orthologs attribute section and "Paralogy confidence" in the paralogs section. Definitions of these two attributes can be found at: http://www.ensembl.org/info/genome/compara/homology_method.html#homology_types
- Added "Phase", "cDNA coding start", "cDNA coding end", "Genomic coding start" and "Genomic coding end" in the sequence attribute section
- Added new filters and attributes including ChEMBL, RGD transcript name and new probes
- Ensembl Variation 74
- Updated the human somatic variation database to COSMIC 67 data
- Added new "UCSC ID" and "Digital Object Identifier" attributes in the Variation citation section
- Added Phenotype filters and attributes in the Human structural variation and Human structural variation somatic datasets.
- Added new variation species sheep (Ovis aries)
- Vega 54
- Updated human assembly to GRCh37.p13 and mouse assembly to GRCm38.p2
- Added "ENA transcript ID" in the human filter and attribute sections
- Ensembl Regulation 74
FASTA & GTF dumps (Gorilla)
FASTA (now includes CDS) & GTF dumps for all the species
External reference projection (Gorilla)
Gene ontology (GO) identifiers and gene name projection to all species.
EMBL and Genbank Dumps (Gorilla)
EMBL and Genbank dumps for all species.
LRG Import (Gorilla)
Import of the latest version of the Locus Reference Genomic (LRG) dataset.
Global UniParc xref update (Gorilla)
UniParc has so far been configured inconsistently as an xref source: the configuration file specifies it for some species but not for others.
UniParc will now be used as an xref source for all species, without exception.
Replacement of BiotypeMapper by SO mapper and EnsEMBL ORM API (Gorilla)
The module Bio::EnsEMBL::Utils::BiotypeMapper encodes both the logic to convert biotypes to sequence ontology terms and the logic to deal with biotype groups.
For the mapping between biotypes and SO terms, we will replace BiotypeMapper with a dedicated SequenceOntologyMapper, whose role is to convert EnsEMBL features or related objects to SO terms.
Responsibility for biotype groups in the production database will pass to the Biotype object and its corresponding Manager object in the EnsEMBL ORM framework.
Archive stable IDs (Gorilla)
Archive stable IDs in the stable id lookup DB.
Stable ID lookup (Gorilla)
Stable ID lookup provided for REST services
New GenomeContainer (Gorilla)
New GenomeContainer module to replace the AssemblyAdaptor
More generic container for genome related information
- various annotation statistics
- various assembly lengths and slices
patch_73_74a.sql - schema_version update in production db (Gorilla)
Update schema_version in meta table to 74.
patch_73_74_a.sql - schema_version update in ontology db (Gorilla)
Update schema_version in meta table to 74.
15 way-mammal-epo alignments (Gorilla)
with new Sheep assembly and Cat
21 way-amniota-pecan alignments (Gorilla)
with new Sheep assembly
10way teleost fish EPO_LOW_COVERAGE (Gorilla)
with new spotted gar and cave fish assemblies
ProteinTrees and homologies (Gorilla)
GeneTrees (protein-coding) with new/updated genebuilds and assemblies
- all-vs-all blastp (ncbi-blast-2.2.27+)
- Clustering using hcluster_sg
- Multiple sequence alignments using MCoffee (Version_9.03.r1318) or Mafft (mafft-7.017)
- Phylogenetic reconstruction using TreeBeST
- Homology inference
- Pairwise gene-based dN/dS scores for high coverage species pairs only (both on orthologues and paralogues) (codeml/PAML v4.3)
- GeneTree stable ID mapping
- Per family gene dynamics using CAFE (v2.2)
ncRNAtrees and homologies (Gorilla)
- Classification based on Rfam models (v11.0)
- Multiple sequence alignments with Infernal
- Phylogenetic reconstruction using RAxML
- Phylogenetic reconstruction using FastTree2 and RAxML-Light for very big families
- Additional multiple sequence alignments with Prank (w/ genomic flanks)
- Additional phylogenetic reconstruction using PhyML and NJ
- Phylogenetic tree merging using TreeBeST
- Per family gene dynamics using CAFE
- Homology inference
- Secondary structure plots
Protein Families (Gorilla)
Updated MCL families including all Ensembl transcript isoforms (including human non-reference haplotypes) and newest UniProt Metazoa.
- Getting distances by NCBI BlastP (v.2.2.27+)
- Clustering by MCL (v.12-135)
- Multiple Sequence Alignments with MAFFT (v.7.017)
- Family stable ID mapping
Compara dumps (Gorilla)
- Data dumps for ProteinTrees
- Data dumps for ncRNAtrees
- OrthoXML dumps for ProteinTrees
- OrthoXML dumps for ncRNAtrees
- PhyloXML dumps for ProteinTrees
- PhyloXML dumps for ncRNAtrees
- EMF dumps for 4 way EPO sauropsids
- EMF dumps for 7 way sauropsidEPO_LOW_COVERAGE multiple alignments
- EMF dumps for 15 way mammal EPO multiple alignments
- EMF dumps for 37 way EPO_LOW_COVERAGE multiple alignments
- EMF dumps for 21 way amniota-pecan multiple alignments
- EMF dumps for 10 way teleost fish EPO_LOW_COVERAGE multiple alignments
- BED files for 37 way EPO_LOW_COVERAGE alignments
- BED files for 10 way teleost fish EPO_LOW_COVERAGE
- BED files for 7 way sauropsids EPO_LOW_COVERAGE
API/Schema change: new Locus object (Gorilla)
base class for DnaFragRegion, GenomicAlign and Member
API/Schema change: New SpeciesTree API (Gorilla)
- New API to deal with species trees (+schema change in the species_tree_* tables)
API/Schema change: New API methods (+ schema change) to link gene tree nodes and homologues to the ancestral taxa (Gorilla)
via the new fields in the species_tree_node table
API/Schema change: Inclusion of all the alternative alleles in the gene projections (Gorilla)
between the reference sequence and the alternative sequences
API/Schema change: Changes in the CAFEGeneFamily API (Gorilla)
to work with the new SpeciesTree API
Retirement of archive 60 (Gorilla)
In accordance with our archive policy, we will be retiring the Ensembl 60 (Nov 2010) archive when version 74 is released. Release 60 data will still be available from our FTP site and public MySQL database - only the web front end will be retired.
SiteDefs rewrite (Gorilla)
SiteDefs has been rewritten so that it no longer exports variables - they MUST now be referenced in other packages with $SiteDefs::VAR_NAME.
$SiteDefs::VERSION has been removed - use $SiteDefs::ENSEMBL_VERSION instead.
$SiteDefs::ENSEMBL_FLAG_NAMES_HR and $SiteDefs::ENSEMBL_FLAG_NAMES have been removed - use $SiteDefs::ENSEMBL_DEBUG_FLAG_NAMES instead.
$SiteDefs::SAMTOOLS_HTTP_PROXY and $SiteDefs::SOAP_PROXY have been replaced by $SiteDefs::HTTP_PROXY.
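Plugin or mirror code that still uses the retired names can be located mechanically. The sketch below is an illustration only, not an Ensembl tool: the function name find_retired_variables is hypothetical, the mapping contains only the renames listed above, and it matches fully qualified scalars ($SiteDefs::NAME), so variables that were previously imported unqualified would need a separate search.

```python
import re

# Retired name -> replacement, as listed in the notes above.
RENAMED = {
    "VERSION": "ENSEMBL_VERSION",
    "ENSEMBL_FLAG_NAMES_HR": "ENSEMBL_DEBUG_FLAG_NAMES",
    "ENSEMBL_FLAG_NAMES": "ENSEMBL_DEBUG_FLAG_NAMES",
    "SAMTOOLS_HTTP_PROXY": "HTTP_PROXY",
    "SOAP_PROXY": "HTTP_PROXY",
}

# Capture the whole identifier after $SiteDefs:: so that, for example,
# ENSEMBL_VERSION is never mistaken for the retired VERSION.
PATTERN = re.compile(r"\$SiteDefs::([A-Za-z_][A-Za-z0-9_]*)")


def find_retired_variables(source):
    """Return a list of (line_number, old_name, new_name) tuples for every
    retired $SiteDefs:: scalar referenced in the given Perl source text."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            name = match.group(1)
            if name in RENAMED:
                hits.append((lineno, name, RENAMED[name]))
    return hits


if __name__ == "__main__":
    sample = "my $v = $SiteDefs::VERSION;\nmy $p = $SiteDefs::SOAP_PROXY;\n"
    for lineno, old, new in find_retired_variables(sample):
        print(f"line {lineno}: $SiteDefs::{old} -> $SiteDefs::{new}")
```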
The following other variables have also been removed, with no replacements:
Ensembl VM Build (Gorilla)
The Ensembl Virtual Machine appliance will be updated to version 74.
patch_73_74_a.sql - schema_version update (Gorilla)
Update schema_version in meta table to 74.
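As an illustration of what a schema_version patch amounts to, the sketch below applies the update to an in-memory SQLite database. This is an assumption-laden stand-in: the real patches are MySQL scripts, the meta table here is reduced to two columns, and the function name apply_schema_version_patch is hypothetical.

```python
import sqlite3


def apply_schema_version_patch(conn, new_version):
    """Set the schema_version entry of the meta table and return the
    value stored afterwards."""
    conn.execute(
        "UPDATE meta SET meta_value = ? WHERE meta_key = 'schema_version'",
        (str(new_version),),
    )
    row = conn.execute(
        "SELECT meta_value FROM meta WHERE meta_key = 'schema_version'"
    ).fetchone()
    return row[0]


def demo():
    # Simplified meta table (assumed layout) holding the previous version.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE meta (meta_key TEXT, meta_value TEXT)")
    conn.execute("INSERT INTO meta VALUES ('schema_version', '73')")
    try:
        return apply_schema_version_patch(conn, 74)
    finally:
        conn.close()


if __name__ == "__main__":
    print("schema_version is now", demo())
```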
patch_73_74_b.sql - dnac removal (Gorilla)
Remove dnac table which is not used any more.
patch_73_74_c.sql - unconventional_transcript_association removal (Gorilla)
Remove unconventional_transcript_association table which is not used any more.
patch_73_74_d.sql - QTL removal (Gorilla)
Removal of the qtl tables (qtl, qtl_feature, qtl_synonym), which are not used any more.
patch_73_74_e.sql - Canonical_annotation removal (Gorilla)
Removal of the canonical_annotation column in the gene table, as it is not used any more.
patch_73_74_f.sql - Pair_dna_align_feature removal (Gorilla)
Removal of the pair_dna_align_feature_id column in the dna_align_feature table, as it is not used any more.
patch_73_74_g.sql - Adding transcript index to transcript_intron_supporting_evidence (Gorilla)
Adding an index on transcript id to transcript_intron_supporting_evidence to speed up retrieval of supporting features from a Transcript object.
Retirement of dnac related material (Gorilla)
To match the removal of the dnac table, we are retiring misc-scripts/utilities/dna_compress.pl (the driver script to populate and test the experimental dnac table of ensembl) and, correspondingly, the module Bio::EnsEMBL::DBSQL::CompressedSequenceAdaptor.
Retirement of unconventional transcript association adaptor (Gorilla)
Bio::EnsEMBL::DBSQL::UnconventionalTranscriptAssociationAdaptor will be retired, to match patch_73_74_c.sql which removes the table unconventional_transcript_association.
patch_73_74_h.sql - Adding unique index to alt_allele(gene_id) (Gorilla)
Adding a unique index on gene_id in alt_allele to enforce a one gene, one group policy.
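The effect of a unique index on gene_id can be demonstrated in isolation. The sketch below uses an in-memory SQLite database as a stand-in for MySQL; the table is a simplified assumption of the alt_allele layout, and the index name gene_idx and the function name are hypothetical. Once the index exists, a gene already assigned to one group cannot be inserted into a second one.

```python
import sqlite3


def second_group_rejected():
    """Return True if inserting the same gene into a second group is
    refused by the unique index on gene_id."""
    conn = sqlite3.connect(":memory:")
    # Simplified table layout, assumed for illustration only.
    conn.execute(
        "CREATE TABLE alt_allele ("
        " alt_allele_id INTEGER PRIMARY KEY,"
        " alt_allele_group_id INTEGER NOT NULL,"
        " gene_id INTEGER NOT NULL)"
    )
    conn.execute("CREATE UNIQUE INDEX gene_idx ON alt_allele (gene_id)")
    conn.execute(
        "INSERT INTO alt_allele (alt_allele_group_id, gene_id) VALUES (1, 100)"
    )
    try:
        # Same gene, different group: violates the unique index.
        conn.execute(
            "INSERT INTO alt_allele (alt_allele_group_id, gene_id) VALUES (2, 100)"
        )
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.close()
    return False


if __name__ == "__main__":
    print("second group rejected:", second_group_rejected())
```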