[BioC] MT reference differences between hg19/GRCH37

Julian Gehring julian.gehring at embl.de
Sun Mar 17 13:53:56 CET 2013


Hi,

The UCSC hg19 and GRCh37 reference genomes use different reference 
sequences for the mitochondrial genome (MT).  The two sequences differ 
in length and have mismatches at multiple positions.  For a short 
explanation, see 
https://lists.soe.ucsc.edu/pipermail/genome/2009-July/019631.html.
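
To make the kind of difference concrete, here is a minimal sketch in 
Python of how two versions of a reference sequence could be compared.  
The function name summarize_differences and the two toy sequences are 
made up for illustration; they are not real mitochondrial data and not 
part of any Bioconductor package.  The sketch uses difflib from the 
Python standard library, which is only practical for short sequences; 
for real references a proper pairwise alignment would be the better 
choice.

```python
from difflib import SequenceMatcher


def summarize_differences(ref_a: str, ref_b: str) -> dict:
    """Report the length difference and the non-matching blocks between two sequences."""
    # autojunk=False: with a 4-letter alphabet the default heuristic would
    # mark every base as "popular" and ignore it for sequences of 200+ characters.
    matcher = SequenceMatcher(None, ref_a, ref_b, autojunk=False)
    # Keep only the opcodes that describe a difference (replace/delete/insert).
    edits = [
        (tag, i1, i2, j1, j2)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
    return {
        "length_a": len(ref_a),
        "length_b": len(ref_b),
        "length_difference": len(ref_a) - len(ref_b),
        "edits": edits,
    }


# Hypothetical toy sequences standing in for the two MT references.
toy_ucsc = "GATCACAGGTCTATCACCCTATTAACC"
toy_grch = "GATCACAGGTCTATCACCTATTAGCC"

summary = summarize_differences(toy_ucsc, toy_grch)
print("length difference:", summary["length_difference"])
for tag, i1, i2, j1, j2 in summary["edits"]:
    # Coordinates are 0-based, end-exclusive offsets into each sequence.
    print(tag, toy_ucsc[i1:i2] or "-", toy_grch[j1:j2] or "-", (i1, i2), (j1, j2))
```

Any insertion or deletion shifts the coordinates of everything 
downstream of it, so a position reported against one reference does not 
necessarily point at the same base in the other.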

Bioconductor normally provides only the UCSC references (see e.g. the 
BSgenome.* or TxDb.* packages).  When using data aligned to the GRCh37 
reference, to what extent does using the UCSC reference influence the 
analysis of features located on the MT, and for which kinds of 
downstream analyses could this become critical?  Locating SNPs on the 
MT, for example, is one such critical case.

The problem will likely be solved with the hg20/GRCh38 release, but data 
that require the hg19/GRCh37 releases may still be relevant for several 
years.  Would it be reasonable to provide alternative reference 
packages, such as a GRCh37 BSgenome package?

Best wishes
Julian


