News for Chimpanzee Ensembl Release 89 (May 2017)
News categories
New regulation data
Microarray Probe Mapping Update
Update microarray probe mappings for all arrays of all species
Map array probes onto 15 mouse strains
Array probes have been mapped onto the following mouse strains:
mus_musculus_129s1svimj
mus_musculus_aj
mus_musculus_akrj
mus_musculus_balbcj
mus_musculus_c3hhej
mus_musculus_c57bl6nj
mus_musculus_casteij
mus_musculus_cbaj
mus_musculus_dba2j
mus_musculus_fvbnj
mus_musculus_lpj
mus_musculus_nodshiltj
mus_musculus_nzohlltj
mus_musculus_pwkphj
mus_musculus_wsbeij
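
For scripts that need to loop over the strains listed above, the names can be kept in one constant. This is only a sketch: the production names are copied from the list, while `MOUSE_STRAINS`, `PREFIX` and `strain_suffix` are hypothetical names introduced for illustration and are not part of any Ensembl API.

```python
# Strain production names copied verbatim from the list above.
MOUSE_STRAINS = [
    "mus_musculus_129s1svimj",
    "mus_musculus_aj",
    "mus_musculus_akrj",
    "mus_musculus_balbcj",
    "mus_musculus_c3hhej",
    "mus_musculus_c57bl6nj",
    "mus_musculus_casteij",
    "mus_musculus_cbaj",
    "mus_musculus_dba2j",
    "mus_musculus_fvbnj",
    "mus_musculus_lpj",
    "mus_musculus_nodshiltj",
    "mus_musculus_nzohlltj",
    "mus_musculus_pwkphj",
    "mus_musculus_wsbeij",
]

PREFIX = "mus_musculus_"


def strain_suffix(production_name: str) -> str:
    """Return the strain part of a production name (hypothetical helper)."""
    if not production_name.startswith(PREFIX):
        raise ValueError(f"not a mouse strain name: {production_name!r}")
    return production_name[len(PREFIX):]


# Sanity check against the heading: 15 distinct strains are announced.
assert len(set(MOUSE_STRAINS)) == 15
```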
Other updates
Compara
ncRNAtrees and homologies
- Classification based on Rfam models (v12.1)
- Multiple sequence alignments with Infernal
- Phylogenetic reconstruction using RAxML
- Phylogenetic reconstruction using FastTree2 and ExaML for very big families
- Additional multiple sequence alignments with Prank (w/ genomic flanks)
- Additional phylogenetic reconstruction using PhyML and NJ
- Phylogenetic tree merging using TreeBeST
- Per family gene dynamics using CAFE
- Homology inference
- Secondary structure plots
patch_88_89_a.sql - Schema version update
88 -> 89
Protein Families
Updated HMM families including all Ensembl transcript isoforms (including human non-reference haplotypes) and the newest UniProt Metazoa.
-- Clustering by PantherScore (based on Ensembl HMM library)
-- Multiple Sequence Alignments with MAFFT (v.7.221)
ProteinTrees and homologies
GeneTrees (protein-coding) with new/updated genebuilds and assemblies
-- all-vs-all blastp (ncbi-blast-2.2.30+)
-- Clustering using hcluster_sg
-- Multiple sequence alignments using MCoffee (Version_9.03.r1318) or MAFFT (mafft-7.221)
-- Phylogenetic reconstruction using TreeBeST
-- Homology inference
-- Pairwise gene-based dN/dS scores for high coverage species pairs only (both on orthologues and paralogues) (codeml/PAML v4.3)
-- GeneTree stable ID mapping
-- Per family gene dynamics using CAFE (v2.2)
-- Computation of pairwise gene-order conservation score
-- Comparison of orthologies with whole-genome alignments
-- High-confidence calls
Core
GO terms for transcripts
GO terms have been introduced for some miRNAs. As a result, GO terms are now linked to transcripts rather than translations.
Regulation
Database schema changes
patch_88_89_a - Schema change
patch_88_89_b - Create table probe_seq
patch_88_89_c - Create table probe_feature_transcript
patch_88_89_d - Create table probe_transcript
patch_88_89_e - Create table probe_set_transcript
patch_88_89_f - Remove probe features from object_xref and xref table
patch_88_89_g - Remove probe mappings from the xref tables
patch_88_89_h - Remove probe set mappings from the xref tables
patch_88_89_i - Add link columns to array table
patch_88_89_j - Add array_chip_id column to probe_set table
patch_88_89_k - Add probe_seq_id column to probe table
Deprecated methods
The following methods have been deprecated and will be removed in Ensembl release 93:
Bio::EnsEMBL::Funcgen::Epigenome::tissue()
Bio::EnsEMBL::Funcgen::Epigenome::ontology_accession()
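
To find Perl scripts that still call the two deprecated methods before they are removed, a rough text scan may help. The sketch below is an assumption-laden illustration, not an Ensembl tool: `find_deprecated_calls` is a hypothetical helper, it matches on method name only (so calls to same-named methods on other classes will also be reported), and its comment stripping is deliberately crude.

```python
import re

# Method names taken from the deprecation notice above.
DEPRECATED = ("tissue", "ontology_accession")

# Matches "->tissue" or "->ontology_accession" as a whole method name,
# with or without parentheses after it.
_CALL = re.compile(r"->\s*(" + "|".join(DEPRECATED) + r")\b")


def find_deprecated_calls(perl_source: str):
    """Return (line_number, method_name) pairs for suspected calls."""
    hits = []
    for line_no, line in enumerate(perl_source.splitlines(), start=1):
        # Crude comment stripping: drop everything after the first '#'.
        code = line.split("#", 1)[0]
        for match in _CALL.finditer(code):
            hits.append((line_no, match.group(1)))
    return hits
```

Each hit still needs a manual check that the object really is a Bio::EnsEMBL::Funcgen::Epigenome.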
Production
Ensembl 89 mart databases
- Ensembl Genes 89
- Updated Microarray probes/probesets for all the species
- Dataset "meugenii_gene_ensembl" was renamed to "neugenii_gene_ensembl"
- Dataset "tsyrichta_gene_ensembl" was renamed to "csyrichta_gene_ensembl"
- GO and GOSlim terms were moved from Translation to Transcript
- Mouse Genes 89
- Microarray probes/probesets added for all the mouse strains
- GO and GOSlim terms were moved from Translation to Transcript
- Ensembl Variation 89
- New filters for regulatory and motif consequence types
- Ensembl Regulation 89
- Updated VISTA Enhancers for human and mouse
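
Scripts that refer to the two renamed mart datasets by their old names will need updating. One possible approach, sketched here with hypothetical names (`RENAMED_DATASETS`, `current_dataset_name`), is to translate old names through a small lookup table before building a query; only the four dataset names themselves come from the notes above.

```python
# Old name -> new name, as listed in the Ensembl Genes 89 notes above.
RENAMED_DATASETS = {
    "meugenii_gene_ensembl": "neugenii_gene_ensembl",
    "tsyrichta_gene_ensembl": "csyrichta_gene_ensembl",
}


def current_dataset_name(name: str) -> str:
    """Map a pre-89 dataset name to its release 89 name (hypothetical helper).

    Names that were not renamed are returned unchanged.
    """
    return RENAMED_DATASETS.get(name, name)
```

For example, `current_dataset_name("meugenii_gene_ensembl")` yields the new name, while a dataset name that does not appear in the table passes through untouched.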
Variation
Phenotype data updates
- Updated Human phenotype data from several sources, including NHGRI-EBI GWAS, OMIM, ClinVar, UniProt, COSMIC Gene Census, DDG2P, MIM Morbid and Orphanet
- OMIA data for several species
- AnimalQTL data for several species
- RGD data for Rat
- ZFIN data for Zebrafish
- IMPC data for Mouse
- MGI data for Mouse
strain_gtype_poly table to be dropped
The strain_gtype_poly table will be dropped.