Ensembl Variation News

Release 89

COSMIC data update (Human)

Imported cancer data from COSMIC version 80.

This import excludes the COSMIC alleles, populations and mutation types.

Structural variants (multiple species)

  • Added new studies from DGVa
  • Updated some of the existing studies from DGVa
  • Updated the 1000 Genomes study, which now includes structural variants on the X and Y chromosomes

Phenotype data updates (all species)

  • Updated Human phenotype data from different sources including NHGRI-EBI GWAS, OMIM, ClinVar, UniProt, Cosmic Gene Census, DDG2P, MIM Morbid and Orphanet.
  • OMIA data for several species
  • AnimalQTL data for several species
  • RGD data for Rat
  • ZFIN data for Zebrafish
  • IMPC data for Mouse
  • MGI data for Mouse

strain_gtype_poly table to be dropped (all species)

The strain_gtype_poly table will be dropped.

PhenCode records merged with dbSNP records (Human)

PhenCode records will be merged with dbSNP records. PhenCode names will be available as variation synonyms.

Beacon endpoint (Human)

GA4GH Beacon endpoint to be implemented.

Release 88

New dbSNP data for Human (Human)

Human is updated with the latest version of dbSNP (149)

New dbSNP data for Rat (Rat)

Rat is updated with the latest version of dbSNP (149)

COSMIC data update (Human)

Imported cancer data from COSMIC version 79.

This import excludes the COSMIC alleles, populations and mutation types.

Structural variants (Human, Pig, Sheep)

  • Added new studies from DGVa
  • Updated some of the existing studies from DGVa

HGMD-Public dataset (Human)

HGMD data will be updated to version 2016.4 (December 2016)

Phenotype data updates (all species)

  • Updated Human phenotype data from different sources including NHGRI-EBI GWAS, OMIM, ClinVar, UniProt, Cosmic Gene Census, DDG2P, MIM Morbid and Orphanet.
  • OMIA data for several species
  • AnimalQTL data for several species
  • RGD data for Rat
  • ZFIN data for Zebrafish
  • IMPC data for Mouse
  • MGI data for Mouse

New dbSNP data for Platypus (Platypus)

Platypus is updated with the latest version of dbSNP (149)

New dbSNP data for Opossum (Opossum)

Opossum is updated with the latest version of dbSNP (149)

PolyPhen version update (Human)

The version of PolyPhen run in Ensembl will be updated to 2.2.2r405c

Update LD Rest endpoints (all species)

The ld/id and ld/region endpoints have been updated: the population name is now a required parameter for both.
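As a rough illustration of the change, the sketch below builds request URLs with the population name as a mandatory path component. The server address, the path layout and the example variant, region and population values are assumptions for illustration, not taken from these notes; check them against the current REST documentation.

```python
from urllib.parse import quote

# Assumed public server address; not stated in these notes.
SERVER = "https://rest.ensembl.org"


def _build(parts):
    # Keep ':' unescaped, since it appears in population names and regions.
    return SERVER + "/" + "/".join(quote(p, safe=":") for p in parts)


def ld_id_url(species, variant_id, population):
    """Build an ld/id request URL; the population name is mandatory."""
    if not population:
        raise ValueError("population name is required")
    return _build(["ld", species, variant_id, population])


def ld_region_url(species, region, population):
    """Build an ld/region request URL; the population name is mandatory."""
    if not population:
        raise ValueError("population name is required")
    return _build(["ld", species, "region", region, population])
```

Omitting the population raises an error locally, which mirrors the idea that a request without it is no longer valid.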

VEP switching to ensembl-vep (all species)

The officially supported VEP repository will move from ensembl-tools to ensembl-vep.

The ensembl-tools version will remain available for one release, and after that be available only on archive branches.

New REST API phenotype endpoint (all species)

Creation of new REST API endpoint to get the phenotype associations overlapping a defined region.
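To make "overlapping a defined region" concrete, here is a minimal local sketch of an inclusive interval overlap test. The record layout and names are hypothetical, and this is not the server's implementation.

```python
from typing import NamedTuple


class Association(NamedTuple):
    # Hypothetical record layout, for illustration only.
    chrom: str
    start: int
    end: int
    phenotype: str


def overlapping(associations, chrom, start, end):
    """Return associations whose inclusive interval overlaps chrom:start-end."""
    return [
        a for a in associations
        if a.chrom == chrom and a.start <= end and a.end >= start
    ]
```

An association is kept when it shares at least one base with the query region, so features that only partly fall inside the region are still returned.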

Release 87

New dbSNP data for Sheep (Sheep)

Sheep will be updated to the latest version of dbSNP

Phenotype data updates (all species)

  • Updated Human phenotype data from different sources including NHGRI-EBI GWAS, OMIM, ClinVar, UniProt, Orphanet and GOA.
  • Other GOA data for several species
  • OMIA data for several species
  • AnimalQTL data for several species
  • RGD data for Rat
  • ZFIN data for Zebrafish
  • IMPC data for Mouse
  • MGI data for Mouse

COSMIC data update (Human)

Imported cancer data from COSMIC version 78.

This import excludes the COSMIC alleles, populations and mutation types.

Structural variants (Cow, Human, Macaque)

  • Added new studies from DGVa
  • Updated some of the existing studies from DGVa

Release 86

dbSNP update for Chicken, Cow, Horse and Zebra Finch (Cow, Horse, Chicken, Zebra Finch)

Chicken will be updated to dbSNP version 147.

Horse, cow and zebra finch will be updated to dbSNP version 148.

Structural variants (Cow, Dog, Horse, Human)

  • Added new studies from DGVa
  • Updated some of the existing studies from DGVa

Phenotype data updates (all species)

  • Updated Human phenotype data from different sources including NHGRI-EBI GWAS, OMIM, ClinVar, UniProt, Orphanet and GOA.
  • Other GOA data for several species
  • OMIA data for several species
  • AnimalQTL data for several species
  • RGD data for Rat
  • ZFIN data for Zebrafish
  • IMPC data for Mouse
  • MGI data for Mouse

HGMD (Human)

HGMD data will be updated to version 2016.2 (June 2016)

Patches (all species)

patch_85_86_a.sql - update schema version

patch_85_86_a.sql - add qualifier & index to phenotype_ontology_accession

patch_85_86_a.sql - add index on study.external_reference

Release 85

COSMIC data update (Human)

Imported cancer data from COSMIC version 77.

This import excludes the COSMIC alleles, populations and mutation types.

Structural variants (Dog, Human)

  • Added new studies from DGVa
  • Updated some of the existing studies from DGVa

Phenotype data updates (multiple species)

  • Updated Human phenotype data from different sources including NHGRI-EBI GWAS, OMIM, ClinVar, UniProt, Orphanet and GOA.
  • Other GOA data for several species
  • OMIA data for several species
  • AnimalQTL data for several species
  • RGD data for Rat
  • ZFIN data for Zebrafish
  • IMPC data for Mouse
  • MGI data for Mouse

SQL schema changes (all species)

  • Added new tables sample_synonym and phenotype_ontology_accession
  • Dropped column moltype in table variation_synonym
  • Changed the column attrib_id (table attrib) so it is auto_increment
  • Changed the type of the column description (table source) to varchar(400)

New dbSNP data (Human, Mouse)

The human database has been updated to dbSNP147 and the mouse database to dbSNP146.

Phenotype ontology support (Human, Mouse)

Mapping of phenotype descriptions to ontology terms will be provided where available

Equivalent alleles will be flagged (Human, Mouse)

Variation data is not always normalised at source, so an insertion or deletion in repetitive sequence may be described at different positions and assigned different identifiers. Variants sharing equivalent alleles will be flagged.
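A small sketch of why such records are equivalent: two indels described at different positions in a repeat produce the same altered sequence. This compares results directly and is only an illustration; it is not the method Ensembl uses to flag equivalent alleles, and the function names and coordinates (0-based) are assumptions.

```python
def apply_indel(ref, pos, deleted, inserted):
    """Apply an indel at 0-based pos: remove `deleted`, then add `inserted`."""
    if ref[pos:pos + len(deleted)] != deleted:
        raise ValueError("deleted allele does not match the reference")
    return ref[:pos] + inserted + ref[pos + len(deleted):]


def equivalent(ref, variant_a, variant_b):
    """Two indels are equivalent if they yield the same altered sequence.

    Each variant is a (pos, deleted, inserted) tuple.
    """
    return apply_indel(ref, *variant_a) == apply_indel(ref, *variant_b)
```

For the reference GCACACAT, deleting CA at position 1 or at position 3 both leave GCACAT, so the two descriptions would count as equivalent here.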

GA4GH REST endpoints updated (Human)

The GA4GH REST endpoints have been updated to API version v0.6.0a4 which includes support for variant annotation and sequence features.

Release 84

dbSNP 146 for Human, Cow and Dog (Cow, Dog, Human)

Update the Human, Cow and Dog Variation databases with dbSNP version 146.

Pairwise LD calculation on LD variant page (Human)

Starting from a variant page, you can calculate LD between that variant and your favourite variant. You can enter your favourite variant into a text box and select, from a dropdown list, the population from which to draw the genotype data.
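For readers unfamiliar with the two LD statistics, the following sketch computes r2 and the absolute value of D' for two biallelic sites. It assumes phased haplotypes are already available, which is a simplification; it is not the calculation performed by the website.

```python
def pairwise_ld(haplotypes):
    """Return (r2, d_prime) for two biallelic sites.

    `haplotypes` is a list of (a, b) pairs, each 0 or 1,
    one pair per phased chromosome.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at site A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # raw disequilibrium coefficient
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    r2 = d * d / denom
    # D' scales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return r2, abs(d) / d_max
```

Both values range from 0 (no association between the sites in the sample) to 1 (complete association).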

Phenotype data updates (multiple species)

  • Update of the Human phenotype data from different sources including NHGRI-EBI GWAS, OMIM, ClinVar, UniProt, Orphanet and GOA.
  • Other GOA data for several species
  • OMIA data for several species
  • RGD data for Rat
  • ZFIN data for Zebrafish
  • IMPC data for Mouse
  • MGI data for Mouse

Structural variants (multiple species)

  • Add new studies from DGVa
  • Update some of the existing studies from DGVa

HGMD data update (Human)

Import of the latest release of public HGMD data (version 2015.4) and remapping to GRCh38

COSMIC data update (Human)

Import of the cancer data from COSMIC version 75.

This import excludes the COSMIC alleles, populations and mutation types.

API update: StrainSlice (all species)

We moved StrainSlice.pm and StrainSliceAdaptor.pm from the core API to the variation API.

1000 Genomes (Human)

Rename the African population "Mandinka in The Gambia" (MAG) to "Gambian in Western Division, The Gambia" (GWD).

New Pig chips (Pig)

New chips for pig:

  • GeneSeek Genomic Profiler Porcine - HD (Illumina)
  • GeneSeek Genomic Profiler Porcine - LD BeadChip (Illumina)
  • Axiom Porcine Genotyping Array (Affymetrix)

REST - GA4GH update (Human)

The GA4GH REST endpoints will be updated to version 0.6.0.

Release 83

Chicken and pig dbSNP 145 update (Chicken, Pig)

The chicken and pig databases have been updated to use dbSNP145

Phenotype data updates (multiple species)

  • Human phenotype data has been updated from different sources including NHGRI-EBI GWAS, OMIM, ClinVar, UniProt, Orphanet and GOA.
  • New Human phenotype association data: Cancer Gene Census
  • Other GOA data for Cow, Dog, Sheep, Zebrafish, Chicken, Macaque, Turkey
  • OMIA data for Cow, Dog, Horse, Sheep
  • RGD data for Rat
  • AnimalQTL for Cow, Horse, Chicken, Pig, Sheep
  • IMPC data for Mouse

Structural variants (multiple species)

  • Added new studies from DGVa
  • Updated some of the existing studies from DGVa

Move VEP_plugins git repository to Ensembl organisation (all species)

The VEP_plugins repo (https://github.com/ensembl-variation/VEP_plugins) was created before Ensembl had a git organisation.

The repo has been moved to the Ensembl organisation to aid discovery. Old links should continue to work.

COSMIC data update (Human)

The cancer data from COSMIC version 74 has been imported.

This import excludes the COSMIC alleles, populations and mutation types.

dbSNP 145 rsIDs mapping (Gibbon)

The Gibbon database has been updated to include the rsIDs from dbSNP 145 for the variants submitted by Ensembl

ExAC data (Human)

  • New "ExAC" variation set available
  • New "ExAC" track available on the website
  • New evidence type of "ExAC" added

Schema changes (all species)

  • New 'ExAC' evidence (column evidence_attribs) in the variation and variation_feature tables.
  • Remove the column 'validation_status' from the variation and variation_feature tables.

Release 82

dbSNP 144 (Horse, Cat, Chicken, Human)

dbSNP 144 data imported

Structural variants (multiple species)

  • Added new studies and updated other studies from DGVa

HGMD data update (Human)

Import of the latest release of public HGMD data (version 2015.2) and remapping to GRCh38

Phenotype data updates (multiple species)

  • Human phenotype data has been updated from different sources including NHGRI-EBI GWAS, OMIM, ClinVar, UniProt, Orphanet, GOA and Decipher.
  • OMIA data for Cow, Dog, Horse, Sheep
  • RGD data for Rat
  • AnimalQTL for Cow, Horse, Chicken, Pig, Sheep
  • IMPC data (release 3.1) for Mouse

New REST endpoint: Return variants in LD with a given SNP (all species)

We added a new REST API endpoint which returns variants in LD with a given SNP, subject to cut-off values for r2 and D'.

NHLBI Exome Sequencing Project data for GRCh38 (Human)

We imported the most recent data (v.0.0.30. (Nov. 3, 2014)) from the Exome Sequencing Project (ESP).

HumanCoreExome-12 variants GRCh38 (Human)

We imported variants from the HumanCoreExome-12 chip.

New REST endpoint: Return the variation sources list for a given species (all species)

We added a new REST API "info" endpoint which returns the variation sources list for a given species. This includes the source versions, descriptions and URLs.

Release 81

New dbSNP import for cow (Cow)

dbSNP143 data has been imported for cow

Schema update: Storing sample data (all species)

We updated the way we store populations, individuals and samples. With the updated schema we can store samples for an individual. All genotypes and read coverage data will be stored at the sample level.

New table:

  • sample

Rename tables:

  • individual_population to sample_population
  • individual_genotype_multiple_bp to sample_genotype_multiple_bp

The change is reflected in tables that store individual data. Individual_id is changed to sample_id in the following tables:

  • compressed_genotype_region
  • read_coverage
  • structural_variation_sample

Update columns in individual table: display, has_coverage and variation_set_id columns moved into the new sample table and have been deleted from the individual table.

API support for the new sample schema (all species)

We updated our API to work with the new sample schema.

Add new modules for representing, creating and storing sample objects:

  • Sample.pm and SampleAdaptor.pm

Rename modules:

  • IndividualGenotype.pm to SampleGenotype.pm
  • IndividualGenotypeFeature.pm to SampleGenotypeFeature.pm
  • IndividualGenotypeAdaptor.pm to SampleGenotypeAdaptor.pm
  • IndividualGenotypeFeatureAdaptor.pm to SampleGenotypeFeatureAdaptor.pm

Updated variable names from individual to sample in almost all modules in the variation API. 

Updated scripts and pipelines.

Our test suite has been updated accordingly.

Phenotype data updates (multiple species)

  • Human phenotype data has been updated from different sources including NHGRI-EBI GWAS, OMIM, ClinVar, UniProt and Decipher.
  • OMIA data for Cow, Dog, Horse, Sheep
  • RGD data for Rat
  • AnimalQTL for Cow, Horse, Chicken, Pig, Sheep
  • IMPC data (release 3.1) for Mouse

Structural variations (Zebrafish, Human, Mouse, Sheep)

  • Added new studies and updated other studies from DGVa

Release 80

1000 Genomes Phase 3 (Human)

Genotypes from 1000 Genomes Phase 3 will be available (this replaces the Phase 1 data)

HGMD data update (Human)

Import of the latest release of public HGMD data (version 2014.4) and remapping to GRCh38

Phenotype data updates (multiple species)

  • Human phenotype data will be updated from different sources including NHGRI-EBI GWAS, OMIM, ClinVar, UniProt, DDG2P and Decipher.
  • OMIA data for Cow, Dog, Zebrafish, Horse, Cat, Chicken, Macaque, Turkey, Sheep and Chimpanzee
  • RGD data for Rat
  • AnimalQTL for Cow, Horse, Chicken, Pig
  • ZFIN for Zebrafish
  • EuroPhenome, 3i, IMPC, MGP for Mouse

Structural variations (Human)

  • Added new studies and updated other studies from DGVa.
  • New human study for 1000 Genomes - phase 3

dbSNP142 import for human (Human)

The human GRCh38 variation database will be updated to dbSNP142

Personal Genomes Data (Human)

Data from the Personal Genomes project will no longer be imported

Update ESP data GRCh38 (Human)

Update Exome Sequencing Project data for human GRCh38 v.0.0.30. (Nov. 3, 2014).

dbSNP142 import for mouse (Mouse)

The mouse GRCm38 variation database will be updated to dbSNP142

dbSNP142 import for cow (Cow)

The cow variation database will be updated to dbSNP142

dbSNP142 import for zebrafish (Zebrafish)

The zebrafish variation database will be updated to dbSNP142 and remapped to GRCz10

dbSNP143 import for sheep (Sheep)

The sheep variation database will be updated to dbSNP143

dbSNP143 import for pig (Pig)

The pig variation database will be updated to dbSNP143

Remap Rat data (Rat)

The rat variation database will be remapped to Rnor_6.0.

start_lost to replace initiator_codon_variant consequence type (all species)

We will replace the use of initiator_codon_variant with the more specific start_lost. The difference between the two is largely semantic.

The new term protein_altering_variant will be used for variants within the protein which are not better described by any of its child terms 

Release 79

Global Alliance REST Endpoints (all species)

New Global Alliance standard REST endpoints will be available for sets of variation data

Sift (multiple species)

Sift predictions will be updated to the latest version - 5.2.2.

Predictions with little evidence will be flagged and details of the evidence quality stored (new table protein_function_predictions_attrib)

NextGen Project genotype data (Sheep)

Genotypes from 3 populations:

  • Iranian Ovis aries
  • Iranian Ovis orientalis
  • Moroccan Ovis aries

Update IMPC data (Mouse)

Update data from the International Mouse Phenotyping Consortium (IMPC) to release 2.0 (Published: 06 November 2014).

Phenotypes and diseases from RGD (Rat)

Import phenotypes/diseases from the Rat Genome Database (RGD)

RGD QTL data (Rat)

Update the RGD QTL data for Rat.

Release 78

Import COSMIC variants (Human)

Import COSMIC's version 71 and remap the data to GRCh38

Phenotype data updates (Human)

Human phenotype data will be updated from different sources including NHGRI GWAS, OMIM, ClinVar, UniProt and Decipher.

Database schema change (all species)

  • Add a column "copy_number" in the table "structural_variation" to store the number of copies for the CNV, at the supporting evidence level.
  • Delete the table "study_variation"
  • Update the index "type_val_idx" in the "attrib" table by extending the indexed size for the "value" column (currently limited to 40 characters).

Variation API changes (all species)

  • Add a new object "Source" handling the source data used in several Variation API objects.

HGMD data update (Human)

Import of the latest release of public HGMD data (version 2014.2 from June 2014) and remapping to GRCh38

Add HumanCoreExome chip (Human)

We add the HumanCoreExome chip to our database. All variants located on the chip will be added to the set HumanCoreExome.

Update data from Animal QTL database (Cow, Horse, Chicken, Pig, Sheep)

We update our data from the Animal Quantitative Trait Loci (QTL) Database.

Structural variations (Human)

Added new studies and updated other studies from DGVa.

Release 77

Phenotype data updates (Human, Mouse)

  • Human phenotype data will be updated from different sources including ClinVar and Decipher.
  • Mouse phenotype data from IMPC will be updated.

Structural variations (Human)

Added new studies and updated other studies from DGVa.

Update Sequence Ontology terms (all species)

We update terms from the Sequence Ontology:

  • nc_transcript_variant will be updated to non_coding_transcript_variant
  • non_coding_exon_variant will be updated to non_coding_transcript_exon_variant
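Scripts that still filter on the old term names may need a translation step. A minimal sketch, using only the two renames listed above (the function name is hypothetical):

```python
# Renames taken from the list above.
RENAMED_TERMS = {
    "nc_transcript_variant": "non_coding_transcript_variant",
    "non_coding_exon_variant": "non_coding_transcript_exon_variant",
}


def update_consequences(terms):
    """Replace superseded SO terms, leaving all other terms untouched."""
    return [RENAMED_TERMS.get(term, term) for term in terms]
```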

Introduce new variation class (all species)

Add genetic_marker as a new variation class.

Add variation_attrib table (all species)

Add variation_attrib table.

For now this will be used by Ensembl Genomes to link homeologous variants on polyploid genomes, though main Ensembl may find use cases for this in future.

Citation data (all species)

Citation data will be updated from Europe PMC and UCSC. Cited variants will now be flagged when they fail standard QC filters but will still be displayed in the usual tracks.

A new column 'display' will be added to the variation table to facilitate this.

dbSNP SubSNP ids no longer held as synonyms (multiple species)

dbSNP SubSNP ids will no longer be held as synonyms. They will be retained in allele records and retrieval of variants by SubSNP id through the web site will still be supported.

Add Sequence Ontology type to regulatory feature consequences (all species)

The SO type for each regulatory feature overlapped by a variant will be reported both on the Variation web page and the VEP results.

Release 76

Update variation data to GRCh38 (Human)

We update our variation data to the new human assembly GRCh38. New locations for our variation data are computed with methods from the Ensembl core API.

dbSNP updates (Cow, Chicken, Pig, Sheep)

Chicken, cow, pig and sheep will be updated to dbSNP build 140

Human Genotyping Chips update (Human)

A browser track/variation set of variants from the Illumina Human OmniExpress-12v1 genotyping chip will be created.

Phenotype data updates (Human, Mouse)

  • Human phenotype data will be updated from different sources including ClinVar and Decipher.
  • Mouse phenotype data from IMPC will be updated.

Structural variations (multiple species)

DGVa data will be updated.

For human, the data will be projected to the new assembly (GRCh38).

Import COSMIC variants (Human)

Import COSMIC's version 69 and remap the data to GRCh38

NHLBI ESP data update (Human)

Human NHLBI ESP data will be updated to version v.0.0.26 and remapped to GRCh38

Map variations to new alternate loci in GRCh38 (Human)

We map variations to all new alternate loci in GRCh38

Sift version update (multiple species)

Sift analysis will be updated to version 5.1.0 for human, cow, chicken, horse, pig, mouse and sheep   

schema changes (all species)

The evidence column in the variation and variation_feature tables will be replaced by an evidence_attribs column to allow configurability.

A new table 'display_group' will be added to hold information on which populations should be shown in separate tables on the PopulationGenetics page. The population table will gain an extra display_group_id to link to this table.

The read_coverage table is back in the schema (was removed in 75)

HGMD data update (Human)

Import of the latest release of public HGMD data (version 2013.4 from December 2013) and remapping to GRCh38

Release 75

New dbSNP imports (Dog, Horse, Opossum, Platypus, Zebra Finch)

Dog, horse, opossum, platypus and zebra finch will be updated to dbSNP 139

New variation database for Turkey (Turkey)

The first Ensembl variation database will be created for turkey using dbSNP139

Sift protein impact predictions for horse (Horse)

Sift predictions will be available for the Ensembl horse proteome

Phenotype data updates (Human, Mouse)

Human phenotype data will be updated from sources including ClinVar and Decipher.

Mouse phenotype data from IMPC will be updated.

Citation data update (Human)

Human variation citation data will be updated from EPMC and UCSC

Structural variations (Human, Mouse, Pig)

DGVa data will be updated and new studies imported

PolyPhen update (Human)

Polyphen predictions will be updated using code version 2.2.2, release 405 and the latest available databases.

HGMD data update (Human)

The latest release of public HGMD data (version 2013.3 from September 2013) will be imported

NHLBI ESP data update (Human)

Human NHLBI ESP data will be updated to version v.0.0.22.

Remove read_coverage table (all species)

The read_coverage table and associated API support will be removed.

There are only a few individuals across our resequencing data with read coverage data, much of which has been remapped between assemblies and may no longer be reliable.

Removing this will speed up code and clean up some of the web displays.

Release 74

New dbSNP imports (Cow, Human, Mouse)

dbSNP Build 138 data will be imported

New Sheep database (Sheep)

A sheep variation database will be created containing variants from dbSNP128 and the available genotyping chips.

HumanOmni5 imported (Human)

The list of variants assayed with the Illumina HumanOmni5 array will be imported as a variation_set and browser track.

Variation citation update (Human)

Variation citation data will be updated. Data mined by UCSC will be included for the first time. 

Structural variations (Zebrafish, Human, Mouse)

  • Update studies
  • Import new studies.

Import COSMIC variants (Human)

Import COSMIC's version 67.

Human phenotype data (Human)

Update phenotype data for most of the current phenotype sources

Schema change (all species)

Add clinical_significance column to variation_feature (copied from variation).

Add data_types column to source table.

Mouse phenotype data (Mouse)

We update our phenotype data for mouse with newly available data from the IMPC (International Mouse Phenotyping Consortium).

Update NHLBI ESP data for human (Human)

We import the new version v.0.0.21 of NHLBI ESP data.

Zebrafish knockout data (Zebrafish)

Knockout data from zfin.org will be imported into the variation database's phenotype schema.

Phenotypes from orthologs (all species)

The gene phenotype view will show phenotypes associated with orthologues of the current gene.

Release 73

New dbSNP imports (Zebrafish, Chicken, Rat, Pig)

dbSNP Build 138 data will be imported

Import of genotyping chip assay lists (Cow, Horse, Chicken)

Variant lists from the Affymetrix Axiom Chicken Genotyping Array and the Illumina EquineSNP50, BovineHD, BovineLD and BovineSNP50 arrays will be imported and made available as tracks in the browser and variation_sets for API access.

Add new value to evidence classification (Human)

A new evidence value 'ESP' will be added to our current list of classifications for summarising the data supporting a variant. The new evidence value indicates that the variant was discovered in the NHLBI GO Exome Sequencing Project.

Schema changes (Human)

- add column phased_gt to genotype tables to indicate that data is phased

- add column year to publication table to store year of publication
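As background for the phased_gt flag: in VCF-style genotype strings, "|" separates phased alleles and "/" separates unphased ones. The sketch below derives a phased flag from such a string; how the phased_gt column is actually populated is not described in these notes, so treat the link as an assumption.

```python
def parse_genotype(gt):
    """Split a VCF-style genotype string into (alleles, phased)."""
    phased = "|" in gt
    separator = "|" if phased else "/"
    return gt.split(separator), phased
```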

Update ESP data (Human)

We update data from NHLBI GO Exome Sequencing Project (ESP) to EVS-v.0.0.20.

Import HGMD-PUBLIC (Human)

Import the HGMD-PUBLIC data from the release 2013.2, with regulatory data.

Import COSMIC variants (Human)

Import COSMIC's version 65.

Structural variations (Human, Mouse)

  • Update studies
  • Import new studies.

Odds ratio data (Human)

We add odds ratio data from the NHGRI GWAS catalog.

PhenCode import (Human)

Variant data from the PhenCode project will be imported and presented as a new variation_set and track

Mouse phenotype data (Mouse)

We add phenotype data for mouse from EuroPhenome, International Mouse Phenotyping Consortium and WTSI Mouse Genetics Project.

dbGaP phenotype data (Human)

Phenotype data associated with the dbSNP variants from dbGaP

Release 72

Sample schema redesign (all species)

We will improve the way individuals, strains and populations are stored in the variation database

Phenotype data (Human)

Add phenotype associations from OMIM and Orphanet to variation PhenotypeFeature schema.

Also add data from GIANT and MAGIC association studies.

Import HGMD-PUBLIC (Human)

Import the HGMD-PUBLIC data from the release 2013.1, with regulatory data.

Import COSMIC variants (Human)

Import COSMIC's version 64.

Structural variations (Cow, Dog, Human, Mouse, Pig)

  • Update studies
  • Import new studies.
  • Create a set for the high-quality structural variants from the 1000 Genomes - phase 1 study (human, study estd199).
  • Add studies for pig.

Structural variation schema change (all species)

  • Sample data are moved from the phenotype_feature_attrib table to a new structural variation table.
  • Clinical significance data are moved from the phenotype_feature_attrib table to a new column "clinical_significance_attrib_id" in the table structural_variation.
  • New column "alias" in the structural_variation table.
  • New columns "study_id" and "length" (mainly for insertion) in the structural_variation_feature table.

Additional variation database changes (all species)

  • Store multiple clinical annotation types in the variation table, replacing single attribute
  • Change the type of the column "description" to text, in the table study

Import of Mouse Genomes Project data (Mouse)

Import of the genotypes from the Mouse Genomes Project, SNP Release Version 3.

Import ClinVar (Human)

Import of ClinVar release version 20130226

Web display change (all species)

  • Add a form to search an individual in the "Individual genotypes" page.
  • Add a new page listing the publications where the variation has been cited.
  • Add new tracks for structural variations

Import EPMC data (multiple species)

Import variation data mined by Europe PMC

New variation database for gibbon (Gibbon)

We build a new variation database for gibbon.

Partial dbSNP import for Pig (Pig)

Basic information from dbSNP Build 138 will be imported

Release 71

Phenotype data (all species)

The phenotype schema and API will be redesigned and Rat eQTL data on current assembly will be imported.

Protein consequence data (multiple species)

Sift results will be provided for human, mouse, zebrafish, pig, cow, chicken, rat and dog.

Supporting evidence classification (all species)

A new evidence status will be produced for summarising the data supporting variants.

VCF exports (all species)

Variation data will be available for FTP download in VCF format, in addition to the normal GVF exports, for all supported species.

Chicken variants remapped (Chicken)

Chicken variants from dbSNP 131 will be mapped to the new assembly.

Additional variation database changes (all species)

- removal of feature_idx index from regulatory_feature_variation and transcript_variation

- increase size of minor_allele in variation & variation_feature

- new study_variation table to allow the storage of PubMed IDs in studies against variants

Additional variation display changes (all species)

- structural variants to be coloured by class

- cell types to be listed for regulatory features on variant page

- new tracks for Human showing ClinVar variants

Pig variant record merging (Pig)

Additional pig variants will have records held under dbSNP and Illumina chip names merged according to information from http://www.animalgenome.org/repository/pig/

Import HGMD-PUBLIC (Human)

Import the HGMD-PUBLIC data from the release 2012.4, with regulatory data.

Import COSMIC variants (Human)

Import COSMIC's version 63.

Structural variations (Cow, Dog, Zebrafish, Human, Mouse)

  • Update data for COSMIC structural variants (Human).
  • Update studies
  • Import new studies.

Release 70

Cat and Rat remapped (Cat, Rat)

The Cat and Rat variation databases will be remapped to their respective new assemblies

New dbSNP imports (Cow, Mouse)

dbSNP release 137 will be imported for Cow and Mouse

Phenotype annotations (Human)

Updated the annotations from the following sources:

  • NHGRI GWAS catalog
  • EGA
  • OMIM
  • UniProt

Import HGMD-PUBLIC (Human)

Import the HGMD-PUBLIC data from the release 2012.3, with regulatory data.

Import COSMIC variants (Human)

Import COSMIC's version 61.

Structural variations (Cow, Human, Macaque, Mouse)

  • Update data for COSMIC structural variants (Human).
  • Update several studies (Human, Mouse, Cow, Macaque).
  • Import new studies (Human).

Structural variant consequences (all species)

Structural variants will have a table displaying their consequences relative to transcripts and regulatory features, similar to the existing "Genes and regulation" view for simple variations.

Flanking sequence table dropped (all species)

The flanking_sequence table will be dropped. A flag indicating if the dbSNP-provided flanking sequence matches the reference will be retained in the variation_feature table, along with a web view showing the reference sequence flanking the variant position.

Links are provided to view the original flanking sequence on the dbSNP website.

New regulatory_feature_variation tables (all species)

We have added two new tables: regulatory_feature_variation and motif_feature_variation. Both tables are similar to our transcript_variation table. The tables relate a single allele of a variation_feature to a regulatory_feature or a motif_feature.

Web development (all species)

Structural variation:

  • Add a phenotype page for structural variation and updated the structural variation menu.
  • Add icons for the structural variation portal.
  • Update the table in the "Gene and regulation" page by adding consequence types, coordinates and transcript coverage.
  • Improve the display for the COSMIC structural variant: now we use a "zap" drawing to display them on the Genome browser.

Frequency calculation flag (all species)

A flag will be added to the sample table for population-type entries to indicate that the individual genotypes from this population can be used to calculate allele and population genotype frequencies. These will be calculated on the fly in place of storing them in the database in the allele and population_genotype tables.
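A minimal sketch of what calculating frequencies on the fly can look like, given the individual genotypes for one variant in one population. The input layout (a list of allele tuples) and the function name are assumptions for illustration, not the API's data model.

```python
from collections import Counter


def frequencies(genotypes):
    """Return (allele_freqs, genotype_freqs) from a list of allele tuples."""
    allele_counts = Counter(a for gt in genotypes for a in gt)
    # Sort alleles so that ("A", "G") and ("G", "A") count as one genotype.
    genotype_counts = Counter(tuple(sorted(gt)) for gt in genotypes)
    total_alleles = sum(allele_counts.values())
    total_genotypes = len(genotypes)
    allele_freqs = {a: c / total_alleles for a, c in allele_counts.items()}
    genotype_freqs = {g: c / total_genotypes for g, c in genotype_counts.items()}
    return allele_freqs, genotype_freqs
```

Deriving the values from genotypes at request time trades some computation for not having to store and update frequency rows alongside the genotypes.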

Clinical significance and global MAF in GVF dumps (all species)

Clinical significance and global minor allele frequency values will be included in GVF dump files

Release 69

Human dbSNP 137 import (Human)

Import of dbSNP Build 137 for human.

dbSNP QC (all species)

We fail variants with more than one mapping to the genome.
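The rule can be sketched as a count of distinct genomic locations per variant. The tuple layout and the variant names in any test data are hypothetical; the real QC pipeline is not described here.

```python
from collections import Counter


def failed_by_multiple_mappings(mappings):
    """Return names of variants that map to more than one genomic location.

    `mappings` is an iterable of (variant_name, chrom, start) tuples;
    identical rows are counted once.
    """
    counts = Counter(name for name, _, _ in set(mappings))
    return sorted(name for name, n in counts.items() if n > 1)
```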

Import HGMD-PUBLIC (Human)

Import the HGMD-PUBLIC data from the release 2012.2, with regulatory data.

Phenotype annotations (Human)

Updated the annotations from the following sources:

  • NHGRI GWAS catalog
  • EGA
  • OMIM
  • UniProt

Import COSMIC release 60 (Human)

Import COSMIC's latest release.

Import ESP data set (Human)

Import the Exome Sequencing Project data set ESP6500.

Structural variations (Human, Mouse)

  • Import 1000 Genomes phase 1 structural variation set + other studies (Human).
  • Import a new mouse study.

Change to flanking sequence handling (all species)

As increasing numbers of newly released variants are position justified rather than supplied with flanking sequences, Ensembl will no longer hold any submitted flanks and will use genomic reference sequence to report these where required. Alignment quality data will be imported from dbSNP to indicate variants with flanking sequences which differ from the reference, allowing poorly matching sequences to be identified.  

Additional schema changes (all species)

The name column in the source table will be restricted to 24 characters to be compatible with BioMart.

Release 68

Mouse: Remap variations (Mouse)

The mouse database is updated to the new assembly GRCm38.

Dog: New dbSNP import and remap variations (Dog)

The new dbSNP data for dog (dbSNP131) is imported into the dog variation database. All variations in the database are remapped to the new dog assembly CanFam3.1.

Phenotype annotations (Human)

Updated the annotations from the following sources:

  • NHGRI GWAS catalog
  • EGA
  • OMIM
  • UniProt

Import COSMIC release 59 (Human)

Import COSMIC's latest release.

Schema changes (all species)

  • The type of the column study_type (study table) will be changed.
  • The column hgvs_coding is renamed to hgvs_transcript in the transcript_variation table.

HGVS annotation updated (all species)

HGVS annotation was updated to support variants in coding as well as noncoding transcripts.

Use SO terms as primary consequences (all species)

Sequence ontology terms are used as primary terms to describe consequences of variations.

Structural variations (multiple species)

The structural variations are imported from DGVa:

  • New structural variations for Cow, Horse and Zebrafish.
  • Updated structural variations and new studies for Human and Mouse.
  • Removed the structural variations for Dog, because of the new assembly (no remapping available).

Release 67

dbSNP 136 import (multiple species)

- Updates for Pig, Zebrafish, Rat, Chimpanzee and Orangutan

- New Variation database for Macaque

Phenotype annotations (Human)

Updated the annotations from the following sources:

  • NHGRI GWAS catalog
  • EGA
  • OMIM
  • UniProt

Import HGMD-PUBLIC (Human)

Import the HGMD-PUBLIC data from the release 2012.1, with regulatory data.

Structural variations (Human, Macaque, Pig)

  • Update structural variation data from DGVa for Human.
  • Add structural variation data from DGVa for Macaque.
  • Add COSMIC structural variations for Human, through DGVa.
  • Remove structural variation for Pig (new assembly).

Import COSMIC release 58 (Human)

Import COSMIC's latest release

Schema changes (all species)

  • Add index in the table structural_variation_association
  • Update the tables structural_variation and structural_variation_feature to store the COSMIC structural variants.
  • Added a new translation_tag table and rebuilt the protein_function_prediction table.

1000 genomes data (Human)

Imported the genotypes of the 1000 Genomes - Phase 1

Release 66

Human dbSNP 135 import (Human)

Import of dbSNP Build 135 for human.

Schema changes (all species)

  • New columns in the table structural_variation_feature:
    • is_evidence, like in the structural_variation table
    • variation_set_id, like in the variation_feature table
  • Removed unused tables: 
    • variation_group
    • variation_group_variation
    • variation_group_feature
    • allele_group
    • allele_group_allele
    • httag
  • Updated the tagged_variation_feature table
  • Add coord_system table
    • add coord_system_id column to seq_region

Phenotype annotations (Human)

Updated the annotations from the following sources:

  • NHGRI GWAS catalog
  • EGA
  • OMIM
  • UniProt

Import COSMIC release 56 (Human)

Import COSMIC's latest release

Web display (Human)

  • Added new tracks for the structural variation sets (1000 Genomes sets).
  • New "Regulatory consequences" table in the Gene/Transcript page (renamed "Genes and regulation"). Now you can see the SIFT and PolyPhen scores in the Gene and Transcript consequences table.
  • Added the Clinical significance in the Variation summary panel

Import HGMD-PUBLIC (Human)

Import the HGMD-PUBLIC data from the release 2011.3

ENSSNP IDs (Human, Mouse)

ENSSNP IDs in human and mouse will no longer be imported and have been deleted from those two databases.

Mapping files between the ENSSNP IDs and dbSNP rsIDs are provided here:

  • Human: ftp://ftp.ebi.ac.uk/pub/databases/ensembl/snp/human/human_ENSEMBL_IDs.txt.gz
  • Mouse: ftp://ftp.ebi.ac.uk/pub/databases/ensembl/snp/mouse/mouse_ENSEMBL_IDs.txt.gz

Release 65

Human dbSNP 134 import (Human)

Import of dbSNP Build 134 for human.

Import new data types:

  • Global minor allele frequencies
  • Clinical significance
  • Suspect variants (will be failed with a new reason code)

Co-locating dbSNP variants will not be merged.

Schema changes (all species)

Changes in the structural variation tables:

  • Added features for the supporting evidence
  • Merged the structural_variation and supporting_structural_variation tables
  • Added phenotype and sample information
  • Added a failed_structural_variation table
  • Created a table to link the structural variants to their supporting evidence

Changes in the genotype tables: most of the genotype tables were rebuilt.

Changes to support the new data from dbSNP:

  • Added columns minor_allele, minor_allele_freq and minor_allele_count to the variation table
  • Added clinical_significance_attrib_id column to the variation table, and added new attributes to the attrib table to identify clinical significance under the attrib_type 'dbsnp_clin_sig'
  • Added a new failed description to identify variants marked as suspect by dbSNP (failed_description_id = 16)
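
To show how the three new columns relate to each other, the Python sketch below derives a minor allele, its frequency and its count from per-allele observation counts. It assumes the minor allele is the second most frequent allele, which is how dbSNP describes its global MAF; the function name and input layout are hypothetical, and the values stored in Ensembl are imported from dbSNP rather than computed this way.

```python
def global_minor_allele(allele_counts):
    """Return (minor_allele, minor_allele_freq, minor_allele_count) or None.

    `allele_counts` maps allele string -> number of observations.
    Assumption: the minor allele is the second most frequent allele.
    """
    # Rank by descending count; break ties alphabetically for determinism.
    ranked = sorted(allele_counts.items(), key=lambda item: (-item[1], item[0]))
    if len(ranked) < 2:
        # Only one allele was observed, so there is no minor allele.
        return None
    total = sum(allele_counts.values())
    minor_allele, minor_allele_count = ranked[1]
    return minor_allele, minor_allele_count / total, minor_allele_count
```

With three observed alleles, the least frequent one is ignored under this definition and the second most frequent one is reported.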

 

Structural variations (Dog, Human, Mouse, Pig)

Updates data and adds new studies

Remapping Chimpanzee variations (Chimpanzee)

Remaps the chimp variations to the new assembly (CHIMP2.1.4).

Phenotype annotations (Human)

Updates from the following sources:

  • NHGRI GWAS catalog
  • EGA
  • OMIM
  • UniProt

Ancestral alleles (Chimpanzee, Orangutan)

Added ancestral alleles using Compara alignments

Import COSMIC release 55 (Human)

Import COSMIC's latest release

Protein function predictions for new human genes (Human)

We will do an 'update' run of the protein function prediction pipeline to compute predictions for new and updated human transcripts.

We will also attempt a complete new run using Compara alignments in place of SIFT and PolyPhen's own alignment pipelines. Depending on how this goes, we may release either this set or the update set described above.

The attempt to use Compara alignments did not work out for this release (for a number of reasons), so we are going with the previous approach. We will investigate this further for future releases.

LRG variation (Human)

Submitted variation data for the gene CYBB (LRG_53)

Release 64

Cow dbSNP 133 import (Cow)

We imported dbSNP Build 133 for cow based on the UMD_3.1 assembly.

Schema changes (all species)

  • Schema changes for structural variations
    • Add a structural_variation_feature table: store the coordinates
    • Modification of the structural_variation table: remove the coordinates
  • Additional enum in variation source table for LSDBs

Web display updates (all species)

  • For the structural variation we have changed to using the same colours as NCBI.
  • We added a Phenotype panel (MIM diseases + variation annotations) in the Gene section

Updated consequences for transcript alleles (Human, Mouse)

The variation consequences were recalculated for human and mouse as a result of changes to the gene sets.

New phenotype page (all species)

A new phenotype page has been added; the page for human CAV3 is an example. Let us know what you think at helpdesk@ensembl.org.

LRG variant import (Human)

There was an import of variation data from LRGs for CRTAP, FKBP10, LEPRE1 and PPIB. These come from an LSDB for Osteogenesis Imperfecta.

Phenotype annotations (Human)

All (non-structural) somatic mutations were imported from the latest release 54 of COSMIC, increasing the number of mutations imported from 46080 in release 63 to 49692 for release 64. There have been some minor changes in COSMIC sample names. Variation data was updated with new human phenotype annotations from COSMIC, OMIM, NHGRI GWAS catalog, UniProt and EGA.

Structural variation (Dog, Human, Mouse, Pig)

  • Update structural variation data from DGVa for Human, Mouse, Dog and Pig.

New LRG alignments (all species)

New LRG alignments have been added; LRG_53 is an example.

Release 63

Updates to human phenotype associations (Human)

OMIM, UniProt, NHGRI GWAS catalog, HGMD mutations, COSMIC

New mouse variation database (Mouse)

Based on dbSNP 132

Add attrib_id column to variation_set (all species)

An attrib_id column is added to variation_set in order to be able to provide general and human-friendly names to variation sets without breaking the web display.

Update structural variation data from DGVa (Dog, Human, Macaque, Mouse, Pig)

Update done only for Human, Mouse, Pig and Dog.

Schema changes (all species)

Structural variation schema changes:

  • Renamed the column bound_start to inner_start and bound_end to inner_end.
  • Added a column for validation status.
  • Changed the column class to class_attrib_id, using more detailed SO terms.

Moving the failed descriptions into the attribute table has been postponed.

LRG data (Human)

  • Import LRG variant data
  • Add LRG consequences to the database

New individual genotypes (Human)

Individual genotypes from Penn State University:

  • Han Chinese Individual (YanHuang Project)
  • Seong-Jin Kim (SJK, GUMS/KOBIC)
  • Anonymous Irish Male
  • Individual from the Extinct Palaeo-Eskimo Saqqaq (Saqqaq Genome Project)
  • Individual from the Extinct Palaeo-Eskimo Saqqaq, high confidence SNPs (Saqqaq Genome Project)
  • Anonymous Korean individual, AK1 (Genomic Medicine Institute)
  • Misha Angrist (Personal Genome Project)
  • Henry Louis Gates Jr (Personal Genome Project)
  • Henry Louis Gates Sr (Personal Genome Project)
  • Rosalynn Gill (Personal Genome Project)
  • Marjolein Kriek (Leiden University Medical Centre)
  • Stephen Quake (Stanford)

Update variation consequences (Cow, Zebrafish, Human)

Updated variation consequences for human, zebrafish and cow due to new gene sets.

Update ancestral alleles (Human)

Update of the ancestral allele annotation for variations to primarily use data from the Ensembl Compara Ortheus calls. The ancestral allele data source priority is:

1. Compara high confidence

2. Compara low confidence

3. Ancestral allele calls reported by Dr Jim Mullikin (http://www.ncbi.nlm.nih.gov/books/NBK44409/)
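
The priority order above amounts to taking the first source that has a call for a given variation. A minimal Python sketch of that selection follows; the source keys and the function name are made up for illustration and do not correspond to identifiers in the Ensembl database or API.

```python
# Hypothetical source keys, listed from highest to lowest priority.
SOURCE_PRIORITY = (
    "compara_high_confidence",
    "compara_low_confidence",
    "mullikin_calls",
)


def choose_ancestral_allele(calls):
    """Pick the ancestral allele from the highest-priority source with a call.

    `calls` maps a source key to an allele string (or None when that
    source has no call). Returns (allele, source) or (None, None).
    """
    for source in SOURCE_PRIORITY:
        allele = calls.get(source)
        if allele:
            return allele, source
    return None, None
```

A low-confidence Compara call therefore overrides a call from the third source, but is itself overridden whenever a high-confidence Compara call exists.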

Pie graphs to display allele frequencies by population (all species)

Web display: added pie graphs to the Variation -> Population genetics pages for the 1000 Genomes populations.

Variation Effect Predictor 2.1 (all species)


  • Option to filter the output based on frequencies in 1000 Genomes populations
  • New US East database server available for querying
  • Ability to use a local file cache in place of or alongside connecting to an Ensembl database
  • New "standalone" mode does not depend on API installation or network connection
  • Significant improvements to the speed of the script
  • Whole-genome mode is now the default (no disadvantage for smaller datasets)
  • Improved status output with progress bars
  • Regulatory region consequences now reinstated and improved
  • Modification to the output file: the Transcript column is now Feature, and is followed by a Feature_type column

Release 62

New variation consequences (all species)

New variation consequences due to a schema change linking consequences to allele and transcript rather than just to a variation and transcript

HGVS coordinates stored in database (all species)

HGVS coordinates for variant alleles will be pre-calculated and stored in the database. These were previously calculated on the fly.

New variation database (Human)

The human variation database will be built fresh from dbSNP release 132 due to data updates by dbSNP.

Data import/update from external sources (Human)

Allele frequencies from 1000 Genomes Project. Variation submissions on LRGs from UniProt. Structural variation data from DGVa. Somatic mutation data from COSMIC. Variation phenotype data from OMIM, NHGRI, UniProt and EGA. Variation synonyms from UniProt.

Data import/update from external sources (Dog, Mouse, Pig)

Structural variation data from DGVa.

patch_61_62_a: Meta schema version (all species)

Meta schema version update

patch_61_62_b: Alter failed_variation (all species)

Drop the subsnp_id column from failed_variation

patch_61_62_c: Introduce failed_allele table (all species)

Add a table to store failed alleles

patch_61_62_d: Add type column to source table (all species)

Introduce a type column (enum) to indicate the type of a source

patch: Table to store study data (all species)

A new table to store description of studies will be introduced and foreign keys to this table will be introduced in variation_annotation and structural_variation tables.

patch: Rationalize data type for allele columns (all species)

The data type of allele columns in e.g. allele, variation and variation_feature will be harmonized to use varchar.

patch: Table to store supporting structural variations (all species)

A new table to store supporting structural variations will be introduced

patch: Re-design of the transcript_variation table (all species)

Variation consequences will be stored by allele instead of by variation. The transcript_variation table will be modified to accommodate this. In addition, HGVS coordinates will be stored as well.

patch: Drop somatic column from source table (all species)

The somatic column will be dropped from source and instead introduced in the variation table.

API changes (all species)

The API will be updated to accommodate schema patches.

SIFT and PolyPhen consequences (Human)

Non-synonymous coding consequences evaluated by SIFT and PolyPhen will be calculated

Add a variation set for variations flagged as failed (all species)

Variations that have been flagged as failed will be grouped in a variation set named 'Failed variations'

Release 61

Data (all species)

- import dbSNP 132 (human), zebrafinch, Tetraodon and new variation databases for cat, opossum

- import dbSNP for further species if available in time (mouse, rat, zebrafish)

- import new release of HGMD database

- corrections to Affymetrix CNV probe data

- import PorcineSNP60 BeadChip

- update of zebrafish variation consequences for new gene build

- variations will now be flagged and retained instead of failed and deleted for species with a new import of dbSNP

- produce GVF file dumps of all variants by species


- For Tetraodon, the Ensembl-assigned variation names ('ENSTNISNP...') used prior to release 61 have been replaced with the dbSNP-assigned rsIDs. If the mapping between the Ensembl IDs and rsIDs is required, a tab-separated file is available for download from the FTP site: ftp://ftp.ebi.ac.uk/pub/software/ensembl/snp/tetraodon/Tetraodon_Ensembl_SNP_id_to_dbSNP_rsid.txt.gz

Schema and API changes (all species)

- import dbSNP 132 (human including 1000 genomes data, zebrafinch) and also new variation species (cat, opossum, Tetraodon)
- import new release of HGMD database
- import of new Cosmic (somatic mutation) data
- import Illumina PorcineSNP60 BeadChip for pig
- import OMIM, NHGRI and UniProt phenotype data for variants
- import new DGVa data sets for structural variants
- import new variants submitted using LRG sequences that have been accessioned by dbSNP
- move CNV probe data to structural variation table
- update of zebrafish and mouse variation consequences for new gene build
- produce GVF file dumps of all variants and their consequence by species
- variants from 1kG unique to Jan09 submission to be flagged as "withdrawn by 1000 genomes"
- new set for OMIM data, failed variants
- split InDel variation class into insertions and deletions

- For speed purposes, store the variation set membership of each variation feature in the variation_feature table


API fix: the translated transcript sequence in TranscriptAlleles.pm now includes the * stop codon at the end of the string, so that STOP_LOST consequences are not missed.
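
The effect of keeping the stop codon in the translated sequence can be seen in a small example. The Python sketch below is not the TranscriptAlleles.pm code: the partial codon table, the function names and the keep_stop parameter are all assumptions introduced to illustrate why a peptide string without the trailing * hides a lost stop.

```python
# Deliberately partial codon table, enough for the example below.
# Codons missing from it raise KeyError.
CODON_TABLE = {
    "ATG": "M", "AAA": "K", "CAA": "Q", "TGG": "W",
    "TAA": "*", "TAG": "*", "TGA": "*",
}


def translate(cds, keep_stop=True):
    """Translate a coding sequence codon by codon, ignoring a partial last codon."""
    usable = len(cds) - len(cds) % 3
    peptide = "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))
    # keep_stop=False mimics a translation that drops the trailing stop.
    return peptide if keep_stop else peptide.rstrip("*")


def is_stop_lost(ref_cds, alt_cds, keep_stop=True):
    """True if the reference stop position is no longer a stop in the alternate."""
    ref_pep = translate(ref_cds, keep_stop)
    alt_pep = translate(alt_cds, keep_stop)
    stop_index = ref_pep.find("*")
    if stop_index == -1:
        # No stop in the reference peptide string: nothing can be detected.
        return False
    return len(alt_pep) <= stop_index or alt_pep[stop_index] != "*"
```

With the reference ATGAAATAA and the alternate ATGAAACAA, the change of the stop codon TAA to CAA is only detected when the * is kept in the reference peptide.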

Release 60

data (all species)

update of UniProt identifier links including phenotype information
import of new information from NHGRI
import of new data sets for structural variants from DGVa
import of an expanded data set for all short somatic sequence variants from COSMIC
GVF (Genome Variation Format) dumps for all variants
update of variant consequences for new human gene set

update of variant consequences for new zebrafish assembly and gene set

import of a new set of 150,000 Zebrafish variants

import of variants submitted on LRG_7 from UniProt

API and schema change (all species)

schema change for ensembl genomes to store the population size for each frequency calculation

Release 59

Variation data (all species)

Import of dbSNP 131 for Human and calculation of variation consequences and tag SNPs
Import of the CNV probes from the Affymetrix Genome 6 array
Import new LRG sequences
Correct population sizes (currently 1 or 0 only)
Correct zebrafish display strain defaults
Somatic mutation substitutions from 70 genes involved in cancer (COSMIC)
Update UniProt and DGVa data

Update variation consequences (in transcript_variation table ) for new gene sets in: cow, horse, chicken, platypus, mouse, orangutan

Variation API (all species)

Add a new 1000 Genomes set
modify call for nearest gene to variation feature
HGVS nomenclature on proteins

Changes to API and schema to separate somatic and germline variations

Variation schema (all species)


New enums for 1000 Genomes and also for precious SNPs in the validation status column
add schema type key into the meta table

Saccharomyces cerevisiae variation database (all species)

Variation database containing data from Saccharomyces Resequencing Project at the Sanger Institute. Present in Ensembl Genomes release 5 as schema 58. Will be patched to 59.

Drosophila melanogaster (all species)

Drosophila melanogaster variation database based on DPGP 1.0

variation displays (all species)

The LRG display now has tables listing differences between the LRG and the reference sequence.
The context panel of the variation page has been updated to show if the variant overlaps a regulatory region, structural variants or conserved region

Future Plans

Read about our future plans on our blog!