C. savignyi assembly and gene annotation

Assembly

The genome of a single Ciona savignyi individual from San Francisco Bay was shotgun-sequenced by the Broad Institute and assembled using Arachne2. The Sidow lab at Stanford used this initial assembly as the basis for the assembly presented here.

The assembly consists of 374 reftigs totalling 174 megabases, with a contig N50 of 141 kb and a reftig N50 of 1,800 kb.
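The N50 figures quoted above follow the usual definition: the length L such that sequences of length L or longer together account for at least half of the assembled bases. A minimal Python sketch of that calculation is given here for illustration. The function name `n50` and the toy lengths are hypothetical and are not the actual C. savignyi contig or reftig lengths.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total number of bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths, for illustration only.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

Applied to the real reftig lengths, the same calculation would be expected to reproduce the reftig N50 reported above.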

Gene annotation

The standard Ensembl mammalian pipeline was modified for annotation of the Ciona savignyi genome, owing to the lack of genomic information from closely related species. In addition to aligning known Ciona proteins to the sequence (as per the standard pipeline), we aligned Ciona-specific cDNA and EST sequences against the genome, and then used these in conjunction with protein data from other species to build additional gene models.

More information

General information about this species can be found in Wikipedia.

Statistics

Summary

Assembly: CSAV 2.0, Oct 2005
Database version: 76.2
Base Pairs: 177,003,750
Golden Path Length: 177,003,750
Genebuild by: Ensembl
Genebuild method: Full genebuild
Genebuild started: Apr 2006
Genebuild released: Jun 2006
Genebuild last updated/patched: Apr 2013

Gene counts

Coding genes

Genes and/or transcripts that contain an open reading frame (ORF).

11,616
Small non coding genes

Small non coding genes are usually fewer than 200 bases long. They may be transcribed but are not translated. In Ensembl, genes with the following biotypes are classed as small non coding genes: miRNA, miscRNA, rRNA, scRNA, snlRNA, snoRNA, snRNA, tRNA, and also the pseudogenic forms of these biotypes. The majority of the small non coding genes in Ensembl are annotated automatically by our ncRNA pipeline.

340
Pseudogenes

A pseudogene shares an evolutionary history with a functional protein-coding gene, but it has accumulated mutations, such as frameshifts and/or stop codons, that disrupt the open reading frame.

216
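
The idea of a disrupted open reading frame can be illustrated with a short Python sketch that looks for in-frame stop codons before the end of a coding sequence, and for lengths that are not a multiple of three. This is only an illustration of the definition above, not the method Ensembl uses to call pseudogenes. The function names and example sequences are hypothetical, and the standard stop codons (TAA, TAG, TGA) are assumed.

```python
# Stop codons of the standard genetic code (an assumption of this sketch).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def premature_stops(cds):
    """Return 0-based codon indices of in-frame stops before the last codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


def orf_is_intact(cds):
    """True if cds starts with ATG, ends with a stop, and has no disruption."""
    cds = cds.upper()
    return (
        len(cds) % 3 == 0            # no frameshift relative to the start
        and cds.startswith("ATG")
        and cds[-3:] in STOP_CODONS
        and not premature_stops(cds)
    )


# Hypothetical example sequences, for illustration only.
print(orf_is_intact("ATGGCCAAATGA"))    # intact reading frame
print(premature_stops("ATGTAAAAATGA"))  # in-frame stop at the second codon
print(orf_is_intact("ATGGCAAATGA"))     # one base deleted: frameshift
```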
Gene transcripts

Nucleotide sequence resulting from the transcription of the genomic DNA to mRNA. One gene can have different transcripts or splice variants resulting from the alternative splicing of different exons in genes.

20,711

Other

FGENESH gene prediction: 13,464
Genefinder gene prediction: 12,480
Genscan gene predictions: 12,655
Snap gene prediction: 35,571