
About Ensembl Variation

The Ensembl Variation database stores areas of the genome that differ between individual genomes ("variants") and, where available, associated disease and phenotype information.
There are different types of variants for several species:

  • single nucleotide polymorphisms (SNPs)
  • short nucleotide insertions and/or deletions
  • longer variants classified as structural variants (including CNVs)

Below is a list of the most common types of variants stored in the Ensembl Variation databases:

Sequence variants

Type          Description
SNP           Single Nucleotide Polymorphism
Insertion     Insertion of one or several nucleotides
Deletion      Deletion of one or several nucleotides
Indel         An insertion and a deletion, affecting 2 or more nucleotides
Substitution  A sequence alteration where the length of the change in the variant is the same as that of the reference
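
The definitions in the sequence variants table can be sketched as a small classifier over a reference/alternative allele pair. This is only an illustration of the table, not the logic Ensembl itself uses to assign variant classes: the function name `classify_variant` is hypothetical, the example alleles are made up, and writing an empty allele as "-" is an assumption of this sketch.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify a reference/alternative allele pair.

    Follows the definitions in the sequence variants table.
    Assumption: an empty allele is written as "-" (or as an empty string).
    """
    ref = "" if ref == "-" else ref.upper()
    alt = "" if alt == "-" else alt.upper()

    if ref == alt:
        raise ValueError("reference and alternative alleles are identical")
    if not ref:
        # Nothing in the reference, one or more bases in the alternative.
        return "insertion"
    if not alt:
        # One or more reference bases, nothing in the alternative.
        return "deletion"
    if len(ref) == len(alt):
        # Same length: a single base is a SNP, several bases a substitution.
        return "SNP" if len(ref) == 1 else "substitution"
    # Both alleles non-empty with different lengths: bases removed and added.
    return "indel"


if __name__ == "__main__":
    # Made-up allele pairs, one per row of the table.
    for ref, alt in [("A", "G"), ("-", "TAG"), ("TAG", "-"), ("ATG", "C"), ("AT", "GC")]:
        print(f"{ref}/{alt}: {classify_variant(ref, alt)}")
```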

Structural variants

Type           Description
CNV            Copy Number Variation: increases ("gain") or decreases ("loss") the copy number of a given region
Inversion      A continuous nucleotide sequence is inverted in the same position
Translocation  A region of nucleotide sequence that has translocated to a new position
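
As a rough illustration of the structural variant types, the toy function below applies a copy number gain, a copy number loss or an inversion to a short sequence. Everything here is hypothetical: the function name, the 0-based half-open coordinates and the example sequence are assumptions, and the inversion is modelled as the reverse complement of the region, which is one common way to read an inverted segment on the same strand. Translocation is left out because it involves a second location.

```python
# Translation table used to complement a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def apply_structural_variant(sequence: str, start: int, end: int, kind: str) -> str:
    """Return `sequence` with a toy structural variant applied to sequence[start:end].

    Assumption: coordinates are 0-based and half-open, as in Python slicing.
    """
    left, region, right = sequence[:start], sequence[start:end], sequence[end:]
    if kind == "gain":
        # One extra copy of the region, placed next to the original.
        return left + region + region + right
    if kind == "loss":
        # The region is removed entirely.
        return left + right
    if kind == "inversion":
        # The region stays in place but is reverse-complemented.
        return left + region.translate(COMPLEMENT)[::-1] + right
    raise ValueError(f"unsupported variant kind: {kind!r}")


if __name__ == "__main__":
    reference = "AAACCGTTT"  # made-up sequence; the affected region is "CCG"
    for kind in ("gain", "loss", "inversion"):
        print(kind, apply_structural_variant(reference, 3, 6, kind))
```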

These are only some examples of the variant types you can find in Ensembl. The full list is available here.

We predict the effects of variants on the Ensembl transcripts and regulatory features for each species. You can run the same analysis on your own data using the Variant Effect Predictor.
These data are integrated with other data sources in Ensembl, and can be accessed using the API (see the links in the right-hand side menu) or the website.

Here are some webpage examples:

Perl API

A comprehensive Perl Application Programming Interface (API) provides efficient access to the Ensembl Variation database.

MySQL database

VCF import

The import_vcf.pl script populates an Ensembl Variation database from a VCF (Variant Call Format) file. A description of the VCF file format can be found on the 1000 Genomes project website.
The script can either populate a database from scratch, or add data to an existing database.
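
To show the kind of information a VCF import has to read, the sketch below parses the first five fixed columns of a VCF data line (CHROM, POS, ID, REF, ALT). It is not part of import_vcf.pl and says nothing about how that script works internally; the names `VcfRecord` and `parse_vcf_line` are hypothetical and the example line is illustrative.

```python
from typing import List, NamedTuple, Optional


class VcfRecord(NamedTuple):
    chrom: str
    pos: int          # 1-based position, as written in the VCF file
    ids: List[str]    # variant identifiers, empty if the ID column is "."
    ref: str
    alts: List[str]   # alternative alleles, empty if the ALT column is "."


def parse_vcf_line(line: str) -> Optional[VcfRecord]:
    """Parse one VCF line; return None for header, meta and blank lines."""
    if line.startswith("#") or not line.strip():
        return None
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 5:
        raise ValueError("a VCF data line needs at least CHROM, POS, ID, REF and ALT")
    chrom, pos, ident, ref, alt = fields[:5]
    ids = [] if ident == "." else ident.split(";")   # IDs are semicolon-separated
    alts = [] if alt == "." else alt.split(",")      # ALT alleles are comma-separated
    return VcfRecord(chrom, int(pos), ids, ref, alts)


if __name__ == "__main__":
    example = "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3"
    print(parse_vcf_line(example))
```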

