Orangutan assembly and gene annotation


This site presents the 6X whole genome shotgun assembly of a female Sumatran orangutan (Pongo pygmaeus abelii) named Susie, housed at the Gladys Porter Zoo (Brownsville, TX). The primary donor-derived reads were assembled with PCAP (Huang, 2006) under stringent parameters. Aligning the orangutan assembly against the human genome made it possible to identify interchromosomal cross-overs and thus eliminate global mis-assemblies larger than 50 kb.

Of the 3.09 Gb of total sequence, 3.08 Gb are ordered and oriented along the chromosomes. Gap sizes between supercontigs were estimated from the size of the corresponding regions in the human genome, with a maximum allowed gap size of 30 kb.
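The gap-size rule above can be illustrated with a minimal sketch. The function name `estimate_gap` and the handling of negative values (treated as no gap) are assumptions made for illustration; only the 30 kb cap comes from the text.

```python
MAX_GAP_BP = 30_000  # maximum allowed gap size stated in the text (30 kb)


def estimate_gap(human_gap_bp: int, max_gap_bp: int = MAX_GAP_BP) -> int:
    """Return the gap to place between two adjacent supercontigs.

    human_gap_bp is the distance between the two supercontigs'
    alignments on the human genome (hypothetical input).
    """
    if human_gap_bp < 0:
        # Assumption: supercontigs that overlap in human get no gap.
        return 0
    # Cap the human-derived estimate at the maximum allowed size.
    return min(human_gap_bp, max_gap_bp)


for human_gap in (12_000, 30_000, 250_000):
    print(human_gap, "->", estimate_gap(human_gap))
```

Under this sketch, any region that spans more than 30 kb in human is represented by a 30 kb gap, so large human-specific insertions do not inflate the orangutan chromosome coordinates.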

The genome assembly represented here corresponds to GCA_000001545.1.

Gene annotation

Due to the high sequence similarity to the human genome, the orangutan genebuild was based on a projection of human gene structures. The projections were made through chained whole genome BLASTZ alignments. These projected genes were combined with orangutan-specific proteins, and additional human genes were added using Exonerate where the projection was unable to produce satisfactory gene models. UTRs were added using orangutan-specific ESTs and cDNAs as well as human cDNAs.
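The fallback from projection to exonerate described above might be sketched as follows. This is only an illustration of the decision logic: the `GeneModel` class, the `coverage` field, and the 0.9 threshold are hypothetical, since the text does not state what makes a projected model "satisfactory".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GeneModel:
    human_gene_id: str
    source: str      # "projection" or "exonerate"
    coverage: float  # fraction of the human coding sequence represented


def choose_model(
    projected: Optional[GeneModel],
    exonerate_model: Optional[GeneModel],
    min_coverage: float = 0.9,  # hypothetical cut-off, not from the source
) -> Optional[GeneModel]:
    """Prefer the projected model; fall back to the exonerate model."""
    if projected is not None and projected.coverage >= min_coverage:
        return projected
    # Projection missing or unsatisfactory: use the exonerate alignment
    # of the human gene, if one exists.
    return exonerate_model


good = GeneModel("geneA", "projection", 0.98)
poor = GeneModel("geneB", "projection", 0.40)
rescue = GeneModel("geneB", "exonerate", 0.95)

print(choose_model(good, None).source)
print(choose_model(poor, rescue).source)
```

The two calls select the projected model for the first gene and the exonerate model for the second.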

More information

General information about this species can be found in Wikipedia.



Assembly: PPYG2, INSDC Assembly GCA_000001545.1, Sep 2007
Database version: 82.1
Base Pairs: 3,109,347,532
Golden Path Length: 3,446,771,396
Genebuild by: Ensembl
Genebuild method: Projection build
Genebuild started: Oct 2007
Genebuild released: Mar 2008
Genebuild last updated/patched: Aug 2012
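As a quick worked check on the two length figures above, the sketch below computes their difference. It assumes that Golden Path Length counts the full chromosome-level sequence including gaps, while Base Pairs counts only sequenced bases. That interpretation is an assumption here, not something stated on this page.

```python
BASE_PAIRS = 3_109_347_532          # from the statistics above
GOLDEN_PATH_LENGTH = 3_446_771_396  # from the statistics above

# Assumption: the difference corresponds to gap (unsequenced) positions.
gap_bp = GOLDEN_PATH_LENGTH - BASE_PAIRS
gap_fraction = gap_bp / GOLDEN_PATH_LENGTH

print(f"Implied gap length: {gap_bp:,} bp")  # 337,423,864 bp
print(f"Share of golden path: {gap_fraction:.1%}")
```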

Gene counts

Coding genes: 20,424
Non coding genes: 6,996
Small non coding genes: 5,796
Misc non coding genes: 1,200
Gene transcripts: 29,447
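The gene counts above are internally consistent, which the short check below makes explicit. The transcripts-per-gene ratio assumes that the transcript count covers both coding and non-coding genes; the page does not say so directly.

```python
coding = 20_424
non_coding = 6_996
small_non_coding = 5_796
misc_non_coding = 1_200
transcripts = 29_447

# Small and miscellaneous non-coding genes sum to the non-coding total.
assert small_non_coding + misc_non_coding == non_coding

total_genes = coding + non_coding
# Assumption: transcripts belong to coding and non-coding genes alike.
transcripts_per_gene = transcripts / total_genes

print(f"Total genes: {total_genes:,}")  # 27,420
print(f"Transcripts per gene: {transcripts_per_gene:.2f}")  # 1.07
```

A ratio close to 1 is what one would expect from a projection build, where most genes carry a single projected transcript.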


Genscan gene predictions: 53,999
Short Variants: 10,004,323