Orthologues inferred from gene trees are determined using all species in that particular database, i.e. all the (mostly) chordates in Ensembl, all the fungi in Ensembl Fungi, all the plants in Ensembl Plants, all the metazoa in Ensembl Metazoa, all the protists in Ensembl Protists, or all the species in the Pan-Compara set for Pan-Compara orthologues in Ensembl Genomes. A detailed description of the method is provided here.
Unaligned sequences (nucleotide and/or amino acid) of orthologous genes can be exported in FASTA format by clicking on Sequence export. The Compara API and BioMart can also be used to export orthologues.
Species are grouped by clade in the top table, such as Primates, Rodents, and Fish. By default, the full list of orthologues is shown below the table. Click on Show details to display only the orthologues for species in one clade.
The number of species for each orthologue type is shown in the top table. Orthologue types are assigned by comparing two species, and are as follows:
- 1-to-1 orthologues: only one copy is found in each species
- 1-to-many orthologues: one gene in one species is orthologous to multiple genes in another species
- Many-to-many orthologues: multiple orthologues are found in both species
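The three types above can be illustrated with a minimal sketch that labels orthologue pairs between two species by counting how many partners each gene has. This is not how Ensembl Compara assigns types internally (it works from the gene tree); the function name and the gene names are hypothetical and serve only to show the definitions.

```python
from collections import defaultdict


def classify_orthologue_pairs(pairs):
    """Label each (gene_a, gene_b) orthologue pair between two species.

    Illustrative only: types are derived from partner counts in the
    pair list, not from a gene tree.
    """
    partners_of_a = defaultdict(set)
    partners_of_b = defaultdict(set)
    for gene_a, gene_b in pairs:
        partners_of_a[gene_a].add(gene_b)
        partners_of_b[gene_b].add(gene_a)

    labels = {}
    for gene_a, gene_b in pairs:
        a_has_many = len(partners_of_a[gene_a]) > 1
        b_has_many = len(partners_of_b[gene_b]) > 1
        if a_has_many and b_has_many:
            labels[(gene_a, gene_b)] = "many-to-many"
        elif a_has_many or b_has_many:
            labels[(gene_a, gene_b)] = "1-to-many"
        else:
            labels[(gene_a, gene_b)] = "1-to-1"
    return labels


# Hypothetical gene names for two species (h = human, m = mouse).
example_pairs = [
    ("hA", "mA"),                    # single copy in both species
    ("hB", "mB1"), ("hB", "mB2"),    # one human gene, two mouse genes
    ("hC1", "mC1"), ("hC1", "mC2"),  # two genes in each species
    ("hC2", "mC1"), ("hC2", "mC2"),
]
example_labels = classify_orthologue_pairs(example_pairs)
```

Because the type is a property of the species pair, the same gene can be a 1-to-1 orthologue with respect to one species and a 1-to-many orthologue with respect to another.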
Orthologues are defined in Ensembl as genes for which the most recent common ancestor node is a speciation event. These ancestral speciation events are represented by blue nodes in the gene trees.
Possible orthologues are homologues between species where the most recent common ancestor is a weakly supported duplication event. Although they should be called paralogues according to the Compara rules, the low confidence in the duplication node might suggest an error in the phylogenetic reconstruction. We list these cases here as they might be real orthologues, especially in cases where no better orthologue is found.
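The rules described above can be sketched on a toy gene tree: find the most recent common ancestor of two genes and inspect whether that node is a speciation or a duplication. The Node class, the confidence field, and the 0.25 cut-off for a "weakly supported" duplication are assumptions made for illustration; they are not taken from the Compara code or documentation.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cut-off below which a duplication counts as weakly supported.
WEAK_DUPLICATION = 0.25


@dataclass(eq=False)  # identity-based hashing so nodes can be put in a set
class Node:
    name: str
    kind: str = "leaf"            # "leaf", "speciation" or "duplication"
    species: Optional[str] = None
    confidence: float = 1.0       # support for a duplication node
    parent: Optional["Node"] = None


def attach(parent, *children):
    """Link children to their parent and return the parent."""
    for child in children:
        child.parent = parent
    return parent


def ancestors(node):
    """Return the node followed by all of its ancestors up to the root."""
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path


def mrca(a, b):
    """Most recent common ancestor of two nodes in the same tree."""
    seen = set(ancestors(a))
    for node in ancestors(b):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")


def classify(a, b):
    node = mrca(a, b)
    if node.kind == "speciation":
        return "orthologue"
    if a.species != b.species and node.confidence < WEAK_DUPLICATION:
        return "possible orthologue"
    return "paralogue"


# Toy tree: a duplication followed by a human/mouse speciation in each copy.
human_a, mouse_a = Node("human_A", species="human"), Node("mouse_A", species="mouse")
human_b, mouse_b = Node("human_B", species="human"), Node("mouse_B", species="mouse")
spec_a = attach(Node("spec_A", kind="speciation"), human_a, mouse_a)
spec_b = attach(Node("spec_B", kind="speciation"), human_b, mouse_b)
root = attach(Node("dup", kind="duplication", confidence=0.9), spec_a, spec_b)
```

With a well-supported duplication at the root, human_A and mouse_B are classed as paralogues; lowering root.confidence below the cut-off turns the same pair into a possible orthologue.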
List of selected orthologues
The list of orthologues underneath the top table shows the species, the orthologue type, the dN/dS value (if calculated), the Ensembl gene ID and name, links to other views, the Target %ID and the Query %ID, the Gene Order Conservation (GOC) score, the Whole Genome Alignment (WGA) coverage, and an indication of confidence of orthology.
If you are searching for a gene in human, for example, and looking for its homologue in another species such as mouse, the Query %ID refers to the percentage of the query sequence (the human protein) that matches the homologue (the mouse protein). Target %ID refers to the percentage of the target sequence (the mouse protein) that matches the query sequence (the human protein).
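The difference between the two percentages can be shown with a small sketch. It assumes a pairwise protein alignment given as two equal-length strings with "-" for gaps, and that each %ID is the number of identical aligned positions divided by the ungapped length of the respective sequence. The function name and the sequences are made up for illustration.

```python
def percent_identities(query_aln, target_aln):
    """Return (query %ID, target %ID) for two aligned, gapped sequences."""
    if len(query_aln) != len(target_aln):
        raise ValueError("aligned sequences must have the same length")

    # Count columns where both sequences carry the same residue (not a gap).
    identical = sum(
        1 for q, t in zip(query_aln, target_aln) if q == t and q != "-"
    )
    query_len = len(query_aln.replace("-", ""))
    target_len = len(target_aln.replace("-", ""))
    if query_len == 0 or target_len == 0:
        raise ValueError("sequences must contain at least one residue")

    return 100 * identical / query_len, 100 * identical / target_len


# Made-up alignment: an 8-residue query against a 5-residue target.
query_pid, target_pid = percent_identities("MKTLLVAG", "MKT-LV--")
```

Here five positions are identical, so the query %ID is 5/8 = 62.5 while the target %ID is 5/5 = 100: the shorter target is fully covered by matches, but only part of the longer query is.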