Fugu assembly and gene annotation


This site presents version 4 of the Fugu genome, released in June 2005 by the International Fugu Genome Consortium. Takifugu rubripes has a very compact genome, with less than 15% consisting of dispersed repetitive sequence, which makes it ideal for gene discovery.

The latest assembly comprises 7,213 scaffolds, covering 390 Mb of the genome, plus the mitochondrial genome. Ninety percent of the genome lies on 1,118 scaffolds; 74 scaffolds are larger than 1 Mb each, and the largest scaffold is 7 Mb. Please refer to the Fugu Project webpage for more details of the sequencing effort.
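A figure such as "90% of the genome is on N scaffolds" can be derived from a list of scaffold lengths by sorting them from largest to smallest and counting how many are needed to reach the target fraction of the total. The following is a minimal sketch of that calculation; the function name and the toy lengths are illustrative assumptions, not data from the Fugu assembly.

```python
def scaffolds_to_cover(lengths, fraction=0.9):
    """Return how many of the largest scaffolds are needed to reach
    `fraction` of the summed scaffold length (hypothetical helper)."""
    if not lengths:
        return 0
    target = fraction * sum(lengths)
    running = 0
    # Walk the scaffolds from largest to smallest, accumulating length.
    for count, length in enumerate(sorted(lengths, reverse=True), start=1):
        running += length
        if running >= target:
            return count
    return len(lengths)


# Toy lengths (arbitrary units), not real Fugu scaffold sizes.
toy_lengths = [50, 30, 15, 3, 2]
print(scaffolds_to_cover(toy_lengths, 0.9))
```

With real data, the input would be the lengths of all scaffolds in the assembly.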

Gene annotation

This is the first full Ensembl genebuild of this genome. It was carried out incrementally, using Fugu proteins first and then adding protein sequences from other fish, mammals, other vertebrates and finally non-vertebrates.

More information

General information about this species can be found in Wikipedia.



Assembly: FUGU 4.0, Jun 2005
Database version: 80.4
Base Pairs: 393,312,790
Golden Path Length

The golden path is the length of the reference assembly. It consists of the sum of all top-level sequences in the seq_region table, omitting any redundant regions such as haplotypes and PARs (pseudoautosomal regions).
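The definition above amounts to a filtered sum. The sketch below illustrates it on in-memory records; the `SeqRegion` class, its field names and the example values are assumptions made for illustration and do not reproduce the actual Ensembl database schema or API.

```python
from dataclasses import dataclass


@dataclass
class SeqRegion:
    """Hypothetical stand-in for a row describing one sequence region."""
    name: str
    length: int
    is_top_level: bool
    is_redundant: bool = False  # e.g. a haplotype or PAR copy


def golden_path_length(regions):
    """Sum the lengths of top-level regions, skipping redundant ones."""
    return sum(
        r.length for r in regions if r.is_top_level and not r.is_redundant
    )


# Toy records, not real Fugu sequence regions.
regions = [
    SeqRegion("scaffold_1", 1000, is_top_level=True),
    SeqRegion("scaffold_2", 500, is_top_level=True),
    SeqRegion("haplotype_1", 200, is_top_level=True, is_redundant=True),
    SeqRegion("contig_1", 300, is_top_level=False),
]
print(golden_path_length(regions))
```

Only the two non-redundant top-level scaffolds contribute to the total in this toy example.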

Genebuild by: Ensembl
Genebuild method: Full genebuild
Genebuild started: Nov 2007
Genebuild released: Mar 2008
Genebuild last updated/patched: May 2010

Gene counts

Coding genes

Genes and/or transcripts that contain an open reading frame (ORF).

Non coding genes: 703
Small non coding genes

Small non coding genes are usually fewer than 200 bases long. They may be transcribed but are not translated. In Ensembl, genes with the following biotypes are classed as small non coding genes: miRNA, miscRNA, rRNA, scRNA, snlRNA, snoRNA, snRNA, and also the pseudogenic forms of these biotypes. The majority of the small non coding genes in Ensembl are annotated automatically by our ncRNA pipeline. Please note that tRNAs are annotated separately using tRNAscan. tRNAs are included as 'simple features', not genes, because they are not annotated using aligned sequence evidence.

Misc non coding genes: 10

A pseudogene shares an evolutionary history with a functional protein-coding gene but it has been mutated through evolution to contain frameshift and/or stop codon(s) that disrupt the open reading frame.

Gene transcripts: 48,706

Nucleotide sequence resulting from the transcription of the genomic DNA to mRNA. One gene can have different transcripts or splice variants resulting from the alternative splicing of different exons in genes.


Genscan gene predictions: 29,699