Variation database documentation

Database schema diagram

The variation database schema diagram is available as a PDF file: Variation Schema Diagram

Schema documentation

The tables of the database are described in the Variation Schema Documentation page.

Specifications

  • Database management system: MySQL
  • Storage engine: MyISAM

Database load

The variation databases can be loaded from the dump files available on the Ensembl FTP site.
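
As an illustration of the loading step, the Python sketch below builds (but does not run) a mysqlimport command line. It assumes that the dump files are tab-delimited, named after their table (e.g. variation.txt.gz), already uncompressed, and that the schema has been created beforehand. The database name, user name and helper functions are hypothetical and are not part of the Ensembl code.

```python
from pathlib import Path


def table_name(dump_file):
    """Table mysqlimport would load into: the file name up to the first dot."""
    return Path(dump_file).name.split(".", 1)[0]


def mysqlimport_command(database, dump_files, user, host="localhost"):
    """Build (but do not run) a mysqlimport call for uncompressed dump files."""
    return [
        "mysqlimport",
        "--local",                     # read the files from the client host
        "--fields-terminated-by=\\t",  # assumption: tab-delimited dumps
        f"--host={host}",
        f"--user={user}",
        database,
        *[str(f) for f in dump_files],
    ]


cmd = mysqlimport_command(
    "homo_sapiens_variation_61_37f",   # assumed example database name
    ["variation.txt", "variation_feature.txt"],
    user="ensembl_rw",                 # hypothetical user
)
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run once the files have been decompressed; mysqlimport derives each target table from the file name, which is what table_name mirrors.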

For Ensembl 61, loading several of the large Human tables on our servers takes from a few minutes to half an hour each, e.g.:

Table                            Loading time
variation                        24 minutes
variation_feature                13 minutes
flanking_sequence                3 minutes
compressed_genotype_single_bp    13 minutes
population_genotype              30 minutes

Loading the largest table in the Human variation database (the allele table) takes almost 3 hours.

Some of the MyISAM settings of our server are listed below:

Variable                     Value
myisam_data_pointer_size     6
myisam_max_sort_file_size    9223372036853727232
myisam_mmap_size             18446744073709551615
myisam_recover_options       OFF
myisam_repair_threads        1
myisam_sort_buffer_size      67108864
myisam_stats_method          nulls_unequal
myisam_use_mmap              OFF
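
The numeric values above are easier to read once converted into familiar units. The short Python sketch below is only an interpretation aid: it copies four of the listed values and relies on the standard MySQL meaning of myisam_data_pointer_size (the row pointer width in bytes). The variable names in the code are illustrative.

```python
# Values copied from the server settings listed above.
settings = {
    "myisam_data_pointer_size": 6,
    "myisam_max_sort_file_size": 9223372036853727232,
    "myisam_mmap_size": 18446744073709551615,
    "myisam_sort_buffer_size": 67108864,
}

MIB = 1024 ** 2
TIB = 1024 ** 4

# A 6-byte row pointer can address 2**48 bytes of table data.
max_table_tib = 2 ** (8 * settings["myisam_data_pointer_size"]) // TIB
sort_buffer_mib = settings["myisam_sort_buffer_size"] // MIB

# The two very large values are close to the 64-bit limits,
# i.e. effectively "no limit" rather than tuned sizes.
mmap_is_unlimited = settings["myisam_mmap_size"] == 2 ** 64 - 1
sort_file_is_unlimited = settings["myisam_max_sort_file_size"] == 2 ** 63 - MIB

print(f"maximum table size: {max_table_tib} TiB")
print(f"sort buffer: {sort_buffer_mib} MiB")
print(f"mmap size unlimited: {mmap_is_unlimited}")
print(f"sort file size unlimited: {sort_file_is_unlimited}")
```

Read this way, the only setting in the list that looks like a deliberate size choice is the 64 MiB sort buffer; the other numeric values are at or near their maximums.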

Loading from a VCF file

Ensembl provides a script to populate a variation database schema with data from a VCF file. See documentation for details.
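
The script itself is not reproduced here. As a reminder of what a VCF record carries, the Python sketch below splits one data line into the first five fixed columns defined by the VCF format (CHROM, POS, ID, REF, ALT). It is not the Ensembl import script, and the function name and the example record are made up for illustration.

```python
def parse_vcf_line(line):
    """Split one VCF data line into its first five fixed fields."""
    chrom, pos, vid, ref, alt = line.rstrip("\n").split("\t")[:5]
    return {
        "chrom": chrom,
        "pos": int(pos),                     # 1-based position
        "id": None if vid == "." else vid,   # "." means no identifier
        "ref": ref,
        "alt": alt.split(","),               # ALT may list several alleles
    }


# Made-up record, for illustration only.
example = "1\t12345\trs_example\tA\tG,T\t.\tPASS\t."
record = parse_vcf_line(example)
print(record)
```

Header lines, which start with "#", are not data lines and would need to be skipped before calling the function.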


Ensembl Software Support

Ensembl is an open project and we encourage correspondence and discussion on any aspect of Ensembl.
Please see the Ensembl Contacts page for suitable ways of getting in touch with us.