The genome of a single Ciona savignyi from San Francisco Bay was shotgun-sequenced by the Broad Institute and assembled using Arachne2. The Sidow lab at Stanford used that initial assembly as the basis for the assembly described here.
The assembly consists of 374 reftigs totalling 174 megabases (Mb), with a contig N50 of 141 kb and a reftig N50 of 1,800 kb.
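As a rough illustration of what an N50 figure means, the sketch below computes it from a list of sequence lengths: sort the lengths from longest to shortest, then report the length at which the running total first reaches half of the assembled bases. The function name `n50` and the toy lengths are assumptions made for this example; they are not CSAV 2.0 data and this is not the code used to produce the figures above.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy lengths in kb, invented for illustration only.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 70: 80 + 70 = 150, half of the 300 kb total
```

The same function applies to contigs or reftigs; only the input list changes.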
The standard Ensembl mammalian pipeline was modified for annotation of the Ciona savignyi genome, owing to the lack of genomic information from closely related species. In addition to aligning known Ciona proteins to the sequence (as in the standard pipeline), we aligned Ciona-specific cDNA and EST sequences against the genome, and then used these in conjunction with protein data from other species to build additional gene models.
General information about this species can be found in Wikipedia.
|Assembly||CSAV 2.0, Oct 2005|
|Golden Path Length|
The golden path is the length of the reference assembly. It consists of the sum of all top-level sequences in the seq_region table, omitting any redundant regions such as haplotypes and PARs (pseudoautosomal regions).
|Genebuild method||Full genebuild|
|Genebuild started||Apr 2006|
|Genebuild released||Jun 2006|
|Genebuild last updated/patched||Apr 2013|
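The golden path definition given above amounts to a filtered sum, which can be sketched as follows. The `SeqRegion` class, its `top_level` and `redundant` fields, and the region names and lengths are all hypothetical, chosen only to mirror the wording of the definition; the real Ensembl core schema records this information differently, so treat this as a sketch rather than a query against the actual database.

```python
from dataclasses import dataclass


@dataclass
class SeqRegion:
    # Hypothetical, simplified stand-in for a row of the seq_region table.
    name: str
    length: int
    top_level: bool = True
    redundant: bool = False  # e.g. a haplotype or a PAR copy


def golden_path_length(regions):
    """Sum the lengths of top-level regions, skipping redundant ones."""
    return sum(r.length for r in regions if r.top_level and not r.redundant)


# Invented example regions, not taken from CSAV 2.0.
toy_regions = [
    SeqRegion("reftig_1", 1_800_000),
    SeqRegion("reftig_2", 950_000),
    SeqRegion("reftig_2_hap", 120_000, redundant=True),  # excluded
    SeqRegion("contig_17", 141_000, top_level=False),    # excluded
]
print(golden_path_length(toy_regions))  # 2750000
```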
Genes and/or transcripts that contain an open reading frame (ORF).
|Small non coding genes|
Small non coding genes are usually fewer than 200 bases long. They may be transcribed but are not translated. In Ensembl, genes with the following biotypes are classed as small non coding genes: miRNA, miscRNA, rRNA, scRNA, snlRNA, snoRNA, snRNA, and also the pseudogenic forms of these biotypes. The majority of the small non coding genes in Ensembl are annotated automatically by our ncRNA pipeline. Please note that tRNAs are annotated separately using tRNAscan. tRNAs are included as 'simple features', not genes, because they are not annotated using aligned sequence evidence.
A pseudogene shares an evolutionary history with a functional protein-coding gene, but it has accumulated mutations, such as frameshifts and/or stop codons, that disrupt the open reading frame.
|Gene transcripts||20,711|
Nucleotide sequence resulting from the transcription of the genomic DNA to mRNA. One gene can have different transcripts or splice variants resulting from the alternative splicing of different exons in genes.
|FGENESH gene prediction||13,464|
|Genefinder gene prediction||12,480|
|Genscan gene prediction||12,655|
|Snap gene prediction||35,571|
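The biotype rule for small non coding genes described earlier can be expressed as a simple membership test. This is a minimal sketch: the biotype names are those listed in the text, but the `_pseudogene` suffix used to mark pseudogenic forms and the function name `is_small_noncoding` are assumptions made for illustration.

```python
# Biotypes named in the text as small non coding genes.
SMALL_NONCODING_BIOTYPES = {
    "miRNA", "miscRNA", "rRNA", "scRNA", "snlRNA", "snoRNA", "snRNA",
}

# Assumed naming convention for pseudogenic forms, e.g. "miRNA_pseudogene".
PSEUDOGENE_SUFFIX = "_pseudogene"


def is_small_noncoding(biotype):
    """Return True if the biotype, or its pseudogenic form, is in the list."""
    if biotype.endswith(PSEUDOGENE_SUFFIX):
        biotype = biotype[: -len(PSEUDOGENE_SUFFIX)]
    return biotype in SMALL_NONCODING_BIOTYPES


for name in ["miRNA", "snoRNA_pseudogene", "tRNA", "protein_coding"]:
    print(name, is_small_noncoding(name))
```

Note that `tRNA` is deliberately absent from the set, reflecting the statement that tRNAs are annotated separately and stored as simple features rather than genes.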