C.intestinalis assembly and gene annotation

Assembly

The Ciona intestinalis genome is the smallest of any experimentally manipulable chordate, and thus provides a good system for exploring vertebrate evolutionary origins. This Ensembl website presents the sequence data provided by Kyoto University, together with an additional Ensembl genebuild (see below).

The current assembly, which includes unmapped scaffolds, is 115 Mb in size, with 78 Mb mapped to chromosome arms. The N50 size is the length such that 50% of the assembled genome lies in blocks of that length or longer. The N50 size of scaffolds is 98.07 kb.
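The N50 definition above can be illustrated with a minimal sketch. The function name `n50` and the toy scaffold lengths are assumptions made for illustration only; they are not Ciona data and not part of any Ensembl tool.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that blocks of length >= L together
    contain at least 50% of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)  # longest blocks first
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Hypothetical toy scaffold lengths (illustrative only, not Ciona data).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # prints 70: the 80 and 70 blocks hold 150 of 300 bases
```

Applied to the real scaffold lengths of an assembly, the same calculation would yield a figure comparable to the 98.07 kb scaffold N50 quoted above.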

Other assemblies

JGI2 (Ensembl release 54)

Gene annotation

The standard Ensembl mammalian pipeline was modified to annotate the Ciona genome because little genomic information is available from closely related species. As in the standard pipeline, known Ciona proteins were aligned to the genome sequence. In addition, the large collection of Ciona-specific cDNA and EST sequences was aligned against the genome, and protein data from other species were then used to build additional gene models.

In addition to the coding transcript models, non-coding RNAs and pseudogenes were annotated.

Detailed information on genebuild (PDF)

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly	KH, INSDC Assembly GCA_000224145.1, Apr 2011
Base Pairs	115,227,500
Golden Path Length	115,227,500
Annotation provider	Ensembl
Annotation method	Full genebuild
Genebuild started	Aug 2011
Genebuild released	Mar 2012
Genebuild last updated/patched	Mar 2014
Database version	116.3

Gene counts

Coding genes	16,671
Non coding genes	455
Small non coding genes	442
Misc non coding genes	13
Pseudogenes	27
Gene transcripts	17,784

Coding genes: genes or transcripts that contain an open reading frame (ORF).
Pseudogenes: genes that have homology to known protein-coding genes but contain a frameshift and/or stop codon(s) that disrupt the ORF. They are thought to have arisen through duplication followed by loss of function.
Gene transcripts: a transcript is the operational unit of a gene. In a genomic context, transcripts consist of one or more exons, with adjoining exons separated by introns. The exons and introns are transcribed, and the introns are then spliced out. Transcripts may or may not encode a protein.
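The gene counts above can be cross-checked with simple arithmetic. The sketch below uses only the figures listed on this page. The dictionary keys are illustrative names, and the transcripts-per-gene ratio assumes that the transcript total covers coding genes, non-coding genes and pseudogenes together, which the page does not state explicitly.

```python
# Figures copied from the "Gene counts" table; key names are illustrative.
counts = {
    "coding": 16_671,
    "non_coding": 455,
    "small_non_coding": 442,
    "misc_non_coding": 13,
    "pseudogenes": 27,
    "transcripts": 17_784,
}

# The small and misc categories should add up to the non-coding total.
subtotal_ok = (
    counts["small_non_coding"] + counts["misc_non_coding"]
    == counts["non_coding"]
)

# Assumption: the transcript total spans all three gene classes.
total_genes = counts["coding"] + counts["non_coding"] + counts["pseudogenes"]
transcripts_per_gene = counts["transcripts"] / total_genes

print(subtotal_ok)
print(total_genes)
print(round(transcripts_per_gene, 3))
```

Under that assumption the ratio comes out only slightly above one transcript per gene, which would mean that most annotated genes carry a single transcript model.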

Other

Genscan gene predictions	10,697
