Platypus assembly and gene annotation


The platypus (Ornithorhynchus anatinus) genome of a female nicknamed "Glennie" (collected at the Upper Barnard River on Glen Rock Station, New South Wales) was sequenced to a total of 6x whole-genome coverage. The sequencing strategy combined whole-genome shotgun plasmid, fosmid and BAC end sequences. The combined sequence reads were assembled using the PCAP software (Genome Res. 13(9):2164-70, 2003). This draft sequence assembly, submitted to GenBank, is referred to as Ornithorhynchus_anatinus-5.0. The database now contains the longer-range mapping of the sequence onto Ultracontigs and Chromosomes. Although some of the Supercontigs are mapped to chromosomes, these represent only 21% of the platypus DNA, so we have not emphasised a chromosomal view of platypus for the current release. Future improvements to the platypus draft sequence assembly will depend on the availability of funding and on improvements to existing assembler software. Funding for the sequencing of the platypus genome was provided by the National Human Genome Research Institute (NHGRI), National Institutes of Health (NIH).

The genome assembly represented here corresponds to GCF_000002275.2.

Gene annotation

The gene set for platypus was built using a modified version of the standard Ensembl genebuild pipeline, using available cDNA evidence to add UTRs and improve the protein-based gene models. However, this initial gene set was limited by the lack of species-specific evidence. The gene models were assessed by generating sets of potential orthologs to genes from other mammalian species and chicken. Potentially missing predictions and partial gene predictions were identified by examining the orthologs, and exonerate was used to align orthologous human and chicken peptides in order to build new gene models. We have now extended the initial gene set using recently released cDNA data from 454 sequencing, plus additional annotation from the Oxford Functional Genomics group. These data have enabled us both to clarify existing models and to add new transcripts.

More information

General information about this species can be found in Wikipedia.



Assembly: OANA5, INSDC Assembly GCF_000002275.2, Dec 2005
Database version: 82.1
Base Pairs: 1,917,748,604
Golden Path Length: 2,073,148,626
Genebuild by: Ensembl
Genebuild method: Full genebuild
Genebuild started: Jan 2007
Genebuild released: Aug 2007
Genebuild last updated/patched: Aug 2012

Gene counts

Coding genes: 21,698
Non coding genes: 3,871
Small non coding genes: 3,844
Misc non coding genes: 27
Gene transcripts: 28,002


Genscan gene predictions: 133,723
Short Variants: 1,487,771
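
The figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the Ensembl pipeline: the constant and function names are hypothetical, and it assumes that the difference between Golden Path Length and Base Pairs reflects gaps and other unsequenced positions in the assembly, and that the transcript count covers both coding and non-coding genes.

```python
# Figures copied from the statistics listed on this page.
BASE_PAIRS = 1_917_748_604
GOLDEN_PATH_LENGTH = 2_073_148_626
CODING_GENES = 21_698
NON_CODING_GENES = 3_871
SMALL_NON_CODING_GENES = 3_844
MISC_NON_CODING_GENES = 27
GENE_TRANSCRIPTS = 28_002


def unsequenced_positions(golden_path: int, base_pairs: int) -> int:
    """Positions in the golden path not backed by sequenced bases.

    Assumption: the difference corresponds to assembly gaps.
    """
    return golden_path - base_pairs


def sequenced_fraction(golden_path: int, base_pairs: int) -> float:
    """Fraction of the golden path covered by sequenced bases."""
    return base_pairs / golden_path


def transcripts_per_gene(transcripts: int, coding: int, non_coding: int) -> float:
    """Average transcripts per gene.

    Assumption: the transcript total spans coding and non-coding genes.
    """
    return transcripts / (coding + non_coding)


if __name__ == "__main__":
    gaps = unsequenced_positions(GOLDEN_PATH_LENGTH, BASE_PAIRS)
    frac = sequenced_fraction(GOLDEN_PATH_LENGTH, BASE_PAIRS)
    ratio = transcripts_per_gene(GENE_TRANSCRIPTS, CODING_GENES, NON_CODING_GENES)
    print(f"Unsequenced positions: {gaps:,}")
    print(f"Sequenced fraction of golden path: {frac:.1%}")
    print(f"Transcripts per gene: {ratio:.2f}")
    # The two non-coding subcategories should sum to the non-coding total.
    print(SMALL_NON_CODING_GENES + MISC_NON_CODING_GENES == NON_CODING_GENES)
```

Under these assumptions, the small and miscellaneous non-coding categories add up to the reported non-coding total, and a little over nine tenths of the golden path consists of sequenced bases.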