Assembly and annotation
The Mouse Genomes Project is an ongoing effort to sequence the genomes of the common laboratory mouse strains and to catalogue all forms of molecular variation. The project has produced de novo assemblies and strain-specific gene annotation for 16 laboratory and wild-derived strains (129S1/SvImJ, A/J, AKR/J, BALB/cJ, C3H/HeJ, C57BL/6NJ, CAST/EiJ, CBA/J, DBA/2J, FVB/NJ, LP/J, NOD/ShiLtJ, NZO/HlLtJ, PWK/PhJ, SPRET/EiJ, and WSB/EiJ) from a mixture of short- and long-range Illumina libraries, optical maps, and third-generation sequencing.
The strain-specific genome annotation was created by combining three approaches: mapping over GENCODE M8 mouse transcripts; refining the mapped transcripts with Augustus using strain-specific RNA-seq; and running Augustus CGP (comparative gene prediction) with strain-specific RNA-seq to annotate novel or private transcripts.
The assemblies and annotation were loaded into the Ensembl framework and additional analyses were run. These included: repeat masking using RepeatMasker; ab initio gene predictions from Genscan; CpG island identification; prediction of transcription start sites using Eponine; tRNA predictions from tRNAscan; and alignments of sequences from UniProt, UniGene and the ENA vertebrate RNA collection. Protein domains were annotated using InterProScan.
The Mouse Genomes Project has provided a multiple alignment of all their genomes with Mus musculus and Rattus norvegicus in the form of a HAL file generated by the progressive-Cactus aligner. Additionally, we have computed a LastZ alignment of human and Mus spretus.
We provide two sets of gene-trees and orthologues in Ensembl. The standard gene-trees and orthologues comprise genes from one representative for every Ensembl species, whilst the Murinae-specific gene-trees and orthologues comprise genes from all mouse strains together with genes from Mus musculus, Mus spretus and Rattus norvegicus. To compare genes from mouse strains with genes from species not in the Murinae set, a stepwise approach via one of these three species is required.
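As an illustration of the stepwise approach, the following minimal Python sketch chains two orthologue lookups: first from a strain gene to a bridging species in the Murinae set, then from the bridging species to the target species in the standard set. The function name, the dictionary layout and all gene identifiers are hypothetical placeholders chosen for this example; they are not Ensembl identifiers or part of any Ensembl API, and in practice the two mappings would be filled from the Murinae-specific and standard orthologue sets respectively.

```python
from typing import Dict, List


def stepwise_orthologues(
    strain_gene: str,
    murinae_orthologues: Dict[str, List[str]],
    standard_orthologues: Dict[str, List[str]],
) -> List[str]:
    """Chain two orthologue mappings through a bridging species.

    murinae_orthologues: strain gene -> genes in the bridging species
        (hypothetical layout, e.g. filled from the Murinae-specific set).
    standard_orthologues: bridging-species gene -> genes in the target species
        (hypothetical layout, e.g. filled from the standard set).
    """
    targets = set()
    # Step 1: strain gene -> gene(s) in the bridging species.
    for bridge_gene in murinae_orthologues.get(strain_gene, []):
        # Step 2: bridging-species gene -> gene(s) in the target species.
        targets.update(standard_orthologues.get(bridge_gene, []))
    # Sort so the result is deterministic.
    return sorted(targets)


# Placeholder identifiers, invented purely for illustration.
strain_to_bridge = {
    "strain_gene_A": ["bridge_gene_1"],
    "strain_gene_B": ["bridge_gene_2", "bridge_gene_3"],
}
bridge_to_target = {
    "bridge_gene_1": ["target_gene_X"],
    "bridge_gene_2": ["target_gene_Y"],
    "bridge_gene_3": ["target_gene_Y", "target_gene_Z"],
}

if __name__ == "__main__":
    print(stepwise_orthologues("strain_gene_B", strain_to_bridge, bridge_to_target))
```

A strain gene with no orthologue in the bridging species, or whose bridging gene has no orthologue in the target species, yields an empty list, so the choice of bridging species can affect which comparisons are possible.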