Advanced notice of Ensembl Tools maintenance
Please note that due to planned maintenance, Ensembl Tools will be unavailable from Tuesday, September 9, at 08:30 AM BST until Wednesday, September 10, at 08:30 AM BST.
Jobs submitted during the maintenance window will be queued and processed once maintenance is complete. We apologise for the inconvenience.

External references

Ensembl pulls in links from other gene, transcript, protein and RNA databases to the same features in Ensembl, allowing you to easily navigate to see your genes of interest elsewhere. We do not guarantee a 100% match between Ensembl features and external references, instead we are looking to find the best match between databases.

We have four methods of mapping between databases. We employ them according to the following priority:

  1. Direct We use a third party to agree the mapping between the Ensembl feature and the external feature.
  2. Location overlap We match up the genomic location of the feature between Ensembl and the external database.
  3. Perfect matches We compare sequences to find perfect matches between databases. This is not done using the sequences themselves, rather using a checksum, which is a shortened code that represents the sequence. The checksum will only match if the sequence match is perfect.
  4. Alignment We use Exonerate to align the sequences between the Ensembl features and the external features. We select the top hit or hits that meet a minimum threshold of identity. The alignment and the percentage identity is maintained.

We also use indirect links, using other databases as an intermediary. For example, we use direct links and alignment to map to Uniprot proteins, then use their links out to PDBe protein structures.