Pig assembly and gene annotation


The Sscrofa10.2 assembly of the pig genome was produced in August 2011 by the Swine Genome Sequencing Consortium (SGSC). It consists of 20 chromosomes (1-18, X and Y) and 4562 unplaced scaffolds. This genome assembly has GCA_000003025.4 as its GenBank assembly accession.


Gene annotation

Sscrofa10.2 was annotated using the standard Ensembl automatic gene annotation system, incorporating RNA-Seq data provided by the SGSC. The annotation process is described in the document below. The Ensembl annotations were then merged with Vega annotations at the transcript level. Transcripts were merged if they shared the same internal exon-intron boundaries (i.e. had an identical splicing pattern); slight differences in the terminal exons were allowed. Importantly, all Vega source transcripts were included in the final merged gene set. The Vega annotations comprised manual annotation of 2,000 genes, both from Havana and from the Immune Response Annotation Group (IRAG) community annotation initiative, which was performed under the guidance of the Havana group.
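The merge criterion described above can be illustrated with a minimal sketch. It is not the Ensembl merge code: the exon representation, the function names `intron_chain` and `can_merge`, and the example coordinates are all hypothetical. The sketch also simplifies "slight differences in the terminal exons" by ignoring the outer ends of the terminal exons entirely, and it assumes single-exon transcripts are never merged by this rule.

```python
from typing import List, Tuple

# Hypothetical representation: an exon is (start, end), 1-based inclusive,
# with all coordinates given on the same strand.
Exon = Tuple[int, int]


def intron_chain(exons: List[Exon]) -> List[Tuple[int, int]]:
    """Return the internal exon-intron boundaries as (exon_end, next_exon_start) pairs."""
    ordered = sorted(exons)
    return [(ordered[i][1], ordered[i + 1][0]) for i in range(len(ordered) - 1)]


def can_merge(a: List[Exon], b: List[Exon]) -> bool:
    """True if two multi-exon transcripts have an identical splicing pattern.

    The transcript start and end (outer ends of the terminal exons) are not
    compared, so they may differ. Single-exon transcripts have no introns and
    are treated as non-mergeable (an assumption of this sketch).
    """
    chain_a, chain_b = intron_chain(a), intron_chain(b)
    return bool(chain_a) and chain_a == chain_b


# Made-up coordinates for illustration only.
ensembl_tx = [(100, 200), (300, 400), (500, 650)]
vega_tx = [(80, 200), (300, 400), (500, 700)]     # same introns, different ends
alt_tx = [(100, 200), (300, 420), (500, 650)]     # one internal boundary differs

print(can_merge(ensembl_tx, vega_tx))  # True
print(can_merge(ensembl_tx, alt_tx))   # False
```

Comparing intron chains rather than whole exon lists is what lets the terminal exons vary in length while still requiring every splice site to match.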

More information

General information about this species can be found in Wikipedia.



Assembly: Sscrofa10.2, INSDC Assembly GCA_000003025.4, Aug 2011
Database version: 82.102
Base Pairs: 3,024,658,544
Golden Path Length: 2,808,525,991
Genebuild by: Ensembl
Genebuild method: Full genebuild
Genebuild started: Sep 2011
Genebuild released: May 2012
Genebuild last updated/patched: Feb 2014

Gene counts

Coding genes: 21,630 (incl 10 readthrough)
Non coding genes: 3,124
Small non coding genes: 2,804
Long non coding genes: 135 (incl 1 readthrough)
Misc non coding genes: 185
Gene transcripts: 30,585
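
As a quick consistency check on the gene counts above, the three non-coding subcategories should add up to the non-coding total. The short sketch below uses only the figures listed here; the variable names are illustrative.

```python
# Non-coding gene subcategories as listed in the gene counts.
non_coding_breakdown = {
    "small_non_coding": 2804,
    "long_non_coding": 135,
    "misc_non_coding": 185,
}
non_coding_total = 3124  # "Non coding genes" as listed
coding_genes = 21630     # "Coding genes" as listed

subtotal = sum(non_coding_breakdown.values())
print(subtotal)                      # 3124
print(subtotal == non_coding_total)  # True

# Coding plus non-coding genes (categories not listed here are not counted).
print(coding_genes + non_coding_total)  # 24754
```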


Genscan gene predictions: 52,372
Short Variants: 52,684,746
Structural variants: 85