Ensembl Variant Effect Predictor Input form

When you reach the Ensembl VEP web interface, you will be presented with a form to enter your data and alter various options.

Note that the listed options change depending on the selected species.

Data input

First select the correct species for your data. Ensembl hosts many vertebrate genomes; genomes for plants, protists and fungi can be found at Ensembl Genomes.
You can optionally choose a name for the data you upload - this can make it easier for you to identify jobs and files that you have uploaded to the Ensembl VEP at a later point.
You have three options for uploading your data:
- File upload - click the "Choose file" button and locate the file on your system. If you are using VCF or another location-based input format, please ensure your file is sorted by location, as this greatly improves the speed of analysis.
- Paste file - simply copy and paste the contents of your file into the large text box
- File URL - point the Ensembl VEP to a file hosted at a publicly accessible address. This can be either an http:// or an ftp:// address.
Once you have uploaded some data, you can select it as the input for future jobs by choosing the data from the drop down menu.
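The sorting recommended for file uploads can be done before submission. The following is a minimal sketch, not part of Ensembl VEP: the function name sort_vcf_lines is hypothetical, and it assumes tab-separated VCF records whose first two columns are chromosome and position. Numeric chromosome names sort numerically; other names (for example X, or names with a "chr" prefix) sort alphabetically after them.

```python
def sort_vcf_lines(lines):
    """Return VCF lines with header lines first and records sorted by location."""
    header = [ln for ln in lines if ln.startswith("#")]
    records = [ln for ln in lines if ln.strip() and not ln.startswith("#")]

    def location_key(line):
        chrom, pos = line.split("\t")[:2]
        # Numeric chromosome names sort numerically; all others sort
        # alphabetically after them.
        chrom_key = (0, int(chrom), "") if chrom.isdigit() else (1, 0, chrom)
        return chrom_key, int(pos)

    return header + sorted(records, key=location_key)
```

For large files a dedicated tool is likely a better choice than sorting in memory; this only illustrates the ordering.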

The format of your data is automatically detected; see the examples or the input format documentation.
For pasted data you can get an instant preview of the results of your first variant by clicking the button that appears when you paste your data. This quickly shows you the consequence type, the IDs of any overlapping variants, genes, transcripts and regulatory features, as well as SIFT and PolyPhen predictions. To see the full results set submit your job as normal.
For some species you can select which transcript database to use. The default is to use Ensembl transcripts, which offer the richest annotation through Ensembl VEP.

GENCODE Basic is a subset of the GENCODE gene set, and is intended to provide a simplified, high-quality subset of the GENCODE transcript annotations that will be useful to the majority of users. GENCODE Basic includes all genes in the GENCODE gene set, with a representative subset of the transcripts (splice variants).

GENCODE Primary is a new transcript subset which covers all human exons in a minimal set of transcripts. This aims to enable annotation of all potential variant consequences without duplication across multiple transcripts.

You can also choose to use RefSeq transcripts from the otherfeatures database. Note, though, that these transcripts are simply aligned to the reference genome, and the database is missing much of the annotation found when using the main Ensembl database (e.g. protein domains, CCDS identifiers).

Identifiers

Ensembl VEP can provide additional identifiers for genes, transcripts, proteins and variants.

Gene symbol
Add the gene symbol to the output. For human genes, for example, this is typically the HGNC symbol. Equivalent to --symbol in the Ensembl VEP script.
Transcript version
Add the transcript version to the transcript identifier. Equivalent to --transcript_version.
CCDS
Add the Consensus CDS transcript identifier where available. Equivalent to --ccds.
Protein
Add the Ensembl protein identifier (ENSP). Equivalent to --protein.
UniProt
Add identifiers for translated protein products from three UniProt-related databases (SWISSPROT, TREMBL and UniParc). Equivalent to --uniprot.
HGVS
Generate HGVS identifiers for your input variants relative to the transcript coding sequence (HGVSc) and the protein sequence (HGVSp). Equivalent to --hgvs.
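The correspondence between the form options above and their script equivalents can be written as a small lookup. This is only a sketch: the flag names are those listed in this section, while the dictionary and the function identifier_flags are hypothetical helpers, not part of Ensembl VEP.

```python
# Form labels and their command-line equivalents, as listed in this section.
IDENTIFIER_FLAGS = {
    "Gene symbol": "--symbol",
    "Transcript version": "--transcript_version",
    "CCDS": "--ccds",
    "Protein": "--protein",
    "UniProt": "--uniprot",
    "HGVS": "--hgvs",
}


def identifier_flags(selected):
    """Return the script flags for the selected form options, in form order."""
    unknown = [name for name in selected if name not in IDENTIFIER_FLAGS]
    if unknown:
        raise ValueError(f"Unknown identifier option(s): {unknown}")
    # Iterate over the lookup table so the result follows the order of the form.
    return [flag for name, flag in IDENTIFIER_FLAGS.items() if name in selected]
```

For example, identifier_flags({"HGVS", "Gene symbol"}) gives the two flags that could be appended to a command line when reproducing a web job with the Ensembl VEP script.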

Variants and frequency data

Additional annotations

Predictions

Filtering options

Ensembl VEP allows you to pre-filter your results e.g. by MAF or consequence type. Note that it is also possible to perform equivalent operations on the results page for Ensembl VEP, so if you aren't sure, don't use any of these options!

By frequency
Filter variants by minor allele frequency (MAF). Two options are provided:
- Exclude common variants
  Filter out variants that are co-located with an existing variant that has a frequency greater than 0.01 (1%) in the 1000 Genomes global population. Equivalent to --filter_common in the Ensembl VEP script.
- Advanced filtering
  Enabling this option allows you to specify a population and frequency to compare to, as well as whether matching variants should be included or excluded from the results.
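The logic of the two frequency options can be sketched as follows. This is an illustration under stated assumptions rather than Ensembl VEP's own code: each variant is a dictionary with a hypothetical "frequency" field holding the frequency of its co-located existing variant in the chosen population (None if there is none), and the function name and keep parameter are invented for this example.

```python
def filter_by_frequency(variants, threshold=0.01, keep="rare"):
    """Keep variants on one side of a frequency threshold.

    keep="rare" drops variants whose frequency is greater than the threshold
    (the behaviour described for "Exclude common variants");
    keep="common" keeps only those variants instead.
    """
    if keep not in ("rare", "common"):
        raise ValueError("keep must be 'rare' or 'common'")
    kept = []
    for variant in variants:
        freq = variant.get("frequency")  # hypothetical field name
        is_common = freq is not None and freq > threshold
        if (keep == "common") == is_common:
            kept.append(variant)
    return kept
```

With the default threshold, a variant at exactly 0.01 is kept as rare, since the description above speaks of frequencies greater than 0.01.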
Return results for variants in coding regions only
Exclude variants that don't fall in a coding region of a transcript. Equivalent to --coding_only.
Restrict results
For many variants Ensembl VEP will report multiple consequence types - typically this is because the variant overlaps more than one transcript. For each of these options Ensembl VEP uses consequence ranks that are subjectively determined by Ensembl. This table gives all of the consequence types predicted by Ensembl, ordered by rank. Note that enabling one of these options not only loses potentially relevant data, but in some cases may be scientifically misleading. Options:
- Show one selected consequence
  Pick one consequence type across all those predicted for the variant; the output will include transcript- or feature-specific information. Consequences are chosen according to the canonical status, biotype and length of the transcript, along with the ranking of the consequence type according to this table. This is the best method to use if you are interested only in one consequence per variant. Equivalent to --pick.
- Show one selected consequence per gene
  Pick one consequence type for each gene using the same criteria as above. Note that if a variant overlaps more than one gene, output for each gene will be reported. Equivalent to --per_gene.
- Show only list of consequences per variant
  Give a comma-separated list of all observed consequence types for each variant. No transcript-specific or gene-specific output will be given. Equivalent to --summary.
- Show most severe per variant
  Only the most severe of all observed consequence types is reported for each variant. No transcript-specific or gene-specific output will be given. Equivalent to --most_severe.
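The last two options can be illustrated with a short sketch. The ranking below is a small illustrative subset of consequence types, listed most severe first; it is an assumption made for this example and not the full Ensembl table. The function names are hypothetical, and ordering the summary by rank is a choice made here rather than documented behaviour.

```python
# Illustrative subset of consequence types, most severe first (assumption:
# the real table contains many more terms).
RANKED_CONSEQUENCES = [
    "stop_gained",
    "frameshift_variant",
    "missense_variant",
    "synonymous_variant",
    "intron_variant",
    "upstream_gene_variant",
]
RANK = {name: index for index, name in enumerate(RANKED_CONSEQUENCES)}


def most_severe(consequences):
    """Return the single highest-ranked consequence observed for a variant."""
    return min(consequences, key=RANK.__getitem__)


def summary(consequences):
    """Return a comma-separated list of the distinct consequences observed."""
    return ",".join(sorted(set(consequences), key=RANK.__getitem__))
```

In both cases the per-transcript detail is discarded, which is why no transcript-specific or gene-specific output is available with these options.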

Advanced options

The Ensembl VEP web interface allows you to set the following advanced options:

Buffer size
By default Ensembl VEP processes variants in blocks of 5000 (the "buffer size").
In some cases, reducing the buffer size can prevent memory issues for large Ensembl VEP queries (e.g. those using regulatory data, many plugins or custom annotations).
This is why the maximum buffer size is automatically set to 500 on the Ensembl VEP web interface when the "Regulatory data" option is selected.
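The idea of a buffer can be sketched as a generator that hands over variants one block at a time, so that only one block needs to be held in memory. This is a generic illustration, not Ensembl VEP's implementation; the function name in_blocks is hypothetical.

```python
from itertools import islice


def in_blocks(variants, buffer_size=5000):
    """Yield successive lists of at most buffer_size variants."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    iterator = iter(variants)
    while True:
        block = list(islice(iterator, buffer_size))
        if not block:
            return
        yield block
```

A smaller buffer size means more, smaller blocks: 12000 variants are processed as three blocks at the default size, and as 24 blocks at a buffer size of 500.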
Right align variants prior to consequence calculation
By default Ensembl VEP performs consequence calculation at the given input coordinates.
Optionally, Ensembl VEP can shift insertions and deletions found within repeated regions as far as possible in the 3' direction, normalising output.
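To see what shifting within a repeat means, consider the sketch below. It is a simplified illustration and not Ensembl VEP's algorithm: it assumes 0-based coordinates on the forward strand of a plain reference string, and the function names are hypothetical. An insertion or deletion can move one base to the right whenever doing so leaves the edited sequence unchanged.

```python
def shift_deletion_3prime(ref, start, end):
    """Shift the deletion of ref[start:end] as far right as possible."""
    if not 0 <= start < end <= len(ref):
        raise ValueError("deletion must be a non-empty span inside ref")
    # Moving one base right gives the same edited sequence exactly when the
    # first deleted base equals the base just after the deletion.
    while end < len(ref) and ref[start] == ref[end]:
        start += 1
        end += 1
    return start, end


def shift_insertion_3prime(ref, pos, inserted):
    """Shift an insertion placed before ref[pos] as far right as possible."""
    while pos < len(ref) and inserted and ref[pos] == inserted[0]:
        # Rotate the inserted sequence so the edited sequence stays the same.
        inserted = inserted[1:] + ref[pos]
        pos += 1
    return pos, inserted
```

For the reference GCACACAT, deleting the CA at positions 1-2 and deleting the CA at positions 5-6 both give GCACAT; the shifted form reports the rightmost of these equivalent placements.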

Jobs

Once you have clicked Run, your input will be checked and submitted to the Ensembl VEP as a job. All jobs associated with your session or account are shown in the Recent Tickets table. You may submit multiple jobs simultaneously.

The Jobs column of the table shows the current status of the job.

Queued - your job is waiting to be submitted to the system
Running - your job is currently running
Done - your job is finished - click the [View results] link to be taken to the results page
Failed - there is a problem with your job - click the magnifying glass icon to see more details

The following actions are available for each job:

Save icon: save the job (you need to log in with an Ensembl account).
Edit icon: resubmit a job (for example, to slightly tweak the data or parameters before re-running).
Magnifying glass icon: see a summary of the options that you selected for your Ensembl VEP job, as well as the data versions associated with this run.
Share icon: display a URL to share with other users. You can also disable URL sharing here.
Trash can icon: delete a job.

Analysis	Jobs	Submitted at (GMT)
Ensembl Variant Effect Predictor	Ensembl VEP analysis of pasted data in Bos_taurus: Done [View results]	11/05/2023, 17:00
Ensembl Variant Effect Predictor	Ensembl VEP analysis of pasted data in Ovis_aries: Done [View results]	11/05/2023, 16:55
Ensembl Variant Effect Predictor	Ensembl VEP analysis of pasted data in Homo_sapiens: Failed	11/05/2023, 16:54
Ensembl Variant Effect Predictor	Ensembl VEP analysis of pasted data in Homo_sapiens: Running	11/05/2023, 16:51
Ensembl Variant Effect Predictor	Ensembl VEP analysis of pasted data in Homo_sapiens: Queued	11/05/2023, 16:49
