
About Ensembl Variation

The Ensembl Variation database stores areas of the genome that differ between individual genomes ("variants") and, where available, associated disease and phenotype information.
There are different types of variants for several species:

  • single nucleotide polymorphisms (SNPs)
  • short nucleotide insertions and/or deletions
  • longer variants classified as structural variants (including CNVs)

Below is a list of the most common types of variants stored in the Ensembl Variation databases:

Sequence variants

Type          Ref               Alt                 Description
SNP           ...TTGACGTA...    ...TTGGCGTA...      Single Nucleotide Polymorphism
Insertion     ...TTGACGTA...    ...TTGATGCGTA...    Insertion of one or several nucleotides
Deletion      ...TTGACGTA...    ...TTGGTA...        Deletion of one or several nucleotides
Indel         ...TTGACGTA...    ...TTGGCTCGTA...    An insertion and a deletion, affecting 2 or more nucleotides
Substitution  ...TTGACGTA...    ...TTGTAGTA...      A sequence alteration where the length of the change in the variant is the same as that of the reference
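
The distinctions in the table above can be sketched as a small classifier. This is a minimal illustration in Python, not the logic Ensembl itself uses: the function name `classify_variant` is hypothetical, and it assumes alleles are given without flanking sequence, with "-" standing for an empty allele (so the insertion in the table would be written -/TG and the deletion AC/-).

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify a sequence variant from its reference and alternative alleles.

    Hypothetical helper for illustration only. Alleles are given without
    flanking sequence; "-" denotes an empty allele.
    """
    ref = "" if ref == "-" else ref.upper()
    alt = "" if alt == "-" else alt.upper()

    if ref == alt:
        raise ValueError("reference and alternative alleles are identical")
    if not ref:
        return "insertion"      # nothing in the reference, bases added
    if not alt:
        return "deletion"       # bases in the reference, nothing in the alternative
    if len(ref) == len(alt):
        # Same length: a single base is a SNP, several bases a substitution.
        return "SNP" if len(ref) == 1 else "substitution"
    return "indel"              # both alleles non-empty, lengths differ


# The five examples from the table, reduced to the changed bases.
for ref, alt in [("A", "G"), ("-", "TG"), ("AC", "-"), ("A", "GCT"), ("AC", "TA")]:
    print(f"{ref}/{alt}: {classify_variant(ref, alt)}")
```

Reducing each table row to its changed bases keeps the rule set short; handling alleles that still carry shared flanking bases would need a trimming step first.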

Structural variants

Type           Description
CNV            Copy Number Variation: increases or decreases the copy number of a given region
Inversion      A continuous nucleotide sequence is inverted in the same position
Translocation  A region of nucleotide sequence that has translocated to a new position

[Figures: reference vs. "gain" of one copy and "loss" of one copy (CNV); reference vs. alternative (inversion); reference vs. alternative (translocation)]
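
As a rough text illustration of these three structural variant types, the sketch below treats a chromosome as a list of labelled segments and applies each rearrangement to it. This is a toy model under stated assumptions: the function names are hypothetical, segments are opaque labels rather than nucleotide sequence, and strand orientation is ignored.

```python
from typing import List


def apply_copy_number(segments: List[str], index: int, copies: int) -> List[str]:
    """Return the segments with segments[index] present `copies` times.

    The reference has one copy; 2 models a "gain" of one copy, 0 a "loss".
    """
    return segments[:index] + [segments[index]] * copies + segments[index + 1:]


def apply_inversion(segments: List[str], start: int, end: int) -> List[str]:
    """Reverse the order of segments[start:end], leaving the rest in place."""
    return segments[:start] + segments[start:end][::-1] + segments[end:]


def apply_translocation(segments: List[str], start: int, end: int, new_index: int) -> List[str]:
    """Cut out segments[start:end] and reinsert them at new_index in the remainder."""
    moved = segments[start:end]
    rest = segments[:start] + segments[end:]
    return rest[:new_index] + moved + rest[new_index:]


reference = ["A", "B", "C", "D"]
print("Reference:    ", reference)
print("CNV gain:     ", apply_copy_number(reference, 1, 2))
print("CNV loss:     ", apply_copy_number(reference, 1, 0))
print("Inversion:    ", apply_inversion(reference, 1, 3))
print("Translocation:", apply_translocation(reference, 1, 2, 3))
```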

These are only some examples of the variation types you can find in Ensembl. The full list is available here.


We predict the effects of variants on the Ensembl transcripts and regulatory features for each species. You can run the same analysis on your own data using the Variant Effect Predictor.
These data are integrated with other data sources in Ensembl, and can be accessed using the API (see the links in the right-hand side menu) or the website.



Perl API

A comprehensive Perl Application Programming Interface (API) provides efficient access to the Ensembl Variation database.


VCF import

The import_vcf.pl script populates an Ensembl Variation database from a VCF (Variant Call Format) file. A description of the VCF file format can be found on the 1000 Genomes project website.
The script can either populate a database from scratch, or add data to an existing database.
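
To make the input format concrete, the sketch below reads the first five fixed columns of a VCF data line (CHROM, POS, ID, REF, ALT). It is a minimal Python illustration of the file format only, not a description of how import_vcf.pl works; the names `VcfRecord`, `parse_vcf_line` and `read_vcf` are hypothetical, and the remaining columns (QUAL, FILTER, INFO, genotypes) are ignored.

```python
from typing import Iterable, Iterator, List, NamedTuple


class VcfRecord(NamedTuple):
    chrom: str
    pos: int          # 1-based position of the first base of REF
    ids: List[str]    # e.g. ["rs6054257"]; empty if the ID column is "."
    ref: str
    alts: List[str]   # one entry per alternative allele


def parse_vcf_line(line: str) -> VcfRecord:
    """Parse one tab-separated VCF data line (hypothetical helper)."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 8:
        raise ValueError("a VCF data line needs at least 8 tab-separated columns")
    chrom, pos, id_field, ref, alt = fields[:5]
    ids = [] if id_field == "." else id_field.split(";")
    alts = [] if alt == "." else alt.split(",")
    return VcfRecord(chrom, int(pos), ids, ref, alts)


def read_vcf(lines: Iterable[str]) -> Iterator[VcfRecord]:
    """Yield records, skipping blank lines and '#' header lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        yield parse_vcf_line(line)


example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14\n",
]
for record in read_vcf(example):
    print(record)
```

With a real file, `read_vcf(open(path))` would work the same way for uncompressed input.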


References

  • Flicek P, Amode MR, Barrell D, Beal K, Billis K, Brent S, Carvalho-Silva D, Clapham P, Coates G, Fitzgerald S, Gil L, García-Girón C, Gordon L, Hourlier T,
    Hunt S, Johnson N, Juettemann T, Kähäri AK, Keenan S, Kulesha E, Martin FJ, Maurel T, McLaren WM, Murphy DN, Nag R, Overduin B, Pignatelli M, Pritchard B,
    Pritchard E, Riat HS, Ruffier M, Sheppard D, Taylor K, Thormann A, Trevanion SJ, Vullo A, Wilder SP, Wilson M, Zadissa A, Aken BL, Birney E, Cunningham F,
    Harrow J, Herrero J, Hubbard TJ, Kinsella R, Muffato M, Parker A, Spudich G, Yates A, Zerbino DR, Searle SM.
    Ensembl 2014
    Nucleic Acids Research 42(1):D749-55 (2014)
    doi:10.1093/nar/gkt1196

  • McLaren W, Pritchard B, Rios D, Chen Y, Flicek P, Cunningham F.
    Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor
    Bioinformatics 26(16):2069-70 (2010)
    doi:10.1093/bioinformatics/btq330

  • Rios D, McLaren WM, Chen Y, Birney E, Stabenau A, Flicek P, Cunningham F.
    A Database and API for variation, dense genotyping and resequencing data
    BMC Bioinformatics 11:238 (2010)
    doi:10.1186/1471-2105-11-238

  • Chen Y, Cunningham F, Rios D, McLaren WM, Smith J, Pritchard B, Spudich GM, Brent S, Kulesha E, Marin-Garcia P, Smedley D, Birney E, Flicek P.
    Ensembl Variation Resources
    BMC Genomics 11(1):293 (2010)
    doi:10.1186/1471-2164-11-293