[BioC] MT reference differences between hg19/GRCh37
Julian Gehring
julian.gehring at embl.de
Sun Mar 17 13:53:56 CET 2013
Hi,
The UCSC hg19 and GRCh37 reference genomes use different reference
sequences for the mitochondrial genome (MT): they differ in length and
have mismatches at multiple positions. For a short explanation of this, see
https://lists.soe.ucsc.edu/pipermail/genome/2009-July/019631.html.
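As a rough illustration of how such differences could be listed, here is a
minimal Python sketch that aligns two sequences with the standard library's
difflib and reports the non-matching stretches. The function name
summarize_differences and the toy sequences are made up for this example;
for the real MT sequences one would load both from FASTA files, and a proper
pairwise aligner would likely be a better choice than difflib.

```python
from difflib import SequenceMatcher


def summarize_differences(seq_a, seq_b):
    """Report lengths and non-matching stretches between two sequences.

    Hypothetical helper for illustration only. Coordinates are 0-based,
    half-open intervals as returned by difflib.
    """
    # autojunk=False: with a four-letter alphabet, difflib's "popular
    # element" heuristic would otherwise discard every base in sequences
    # of 200 or more characters.
    matcher = SequenceMatcher(None, seq_a, seq_b, autojunk=False)
    differences = [
        tuple(op) for op in matcher.get_opcodes() if op[0] != "equal"
    ]
    return {
        "length_a": len(seq_a),
        "length_b": len(seq_b),
        "differences": differences,
    }


# Toy sequences (not real MT data): one substitution at index 4.
toy_a = "ACGTACGT"
toy_b = "ACGTTCGT"
result = summarize_differences(toy_a, toy_b)
for tag, a_start, a_end, b_start, b_end in result["differences"]:
    print(tag, a_start, a_end, b_start, b_end)
```

Each reported tuple gives the kind of difference (replace, insert or delete)
and the affected interval in each sequence, so length differences show up as
insertions or deletions rather than as a shifted run of mismatches.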
Bioconductor normally provides only the UCSC references (see e.g. the
BSgenome.* or TxDb.* packages). When using data aligned to the GRCh37
reference, to what extent does using the UCSC reference influence the
analysis of features located on the MT, and for which kinds of
downstream analyses could this become critical? Locating SNPs on
the MT, for example, is one such critical case.
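One cheap sanity check before any MT analysis is to compare the contig
length recorded in the alignment header with the length of the reference
one intends to use. The sketch below parses the @SQ lines (SN and LN tags)
of a SAM header in plain Python. The helper names and the example header are
hypothetical, and the two lengths used (16569 for the GRCh37 MT, 16571 for
the hg19 chrM) are stated from memory and should be verified against the
actual reference files.

```python
def contig_lengths(sam_header_text):
    """Return {contig name: length} parsed from the @SQ lines of a SAM header."""
    lengths = {}
    for line in sam_header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Each field after the record type has the form TAG:VALUE.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:])
        lengths[fields["SN"]] = int(fields["LN"])
    return lengths


def same_length(header_lengths, contig, reference_length):
    """True if the contig exists in the header with the given length."""
    return header_lengths.get(contig) == reference_length


# Hypothetical header of a file aligned to GRCh37 (assumed MT length: 16569).
example_header = "@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:MT\tLN:16569\n"
header_lengths = contig_lengths(example_header)

# Assumed length of the hg19 chrM sequence; verify before relying on it.
HG19_CHRM_LENGTH = 16571
print(same_length(header_lengths, "MT", HG19_CHRM_LENGTH))
```

A length mismatch only shows that the two references differ; equal lengths
would not prove that the sequences are identical. Note also that the contig
is named MT in one reference and chrM in the other, so names need to be
mapped before comparing.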
The problem will likely be solved with the hg20/GRCh38 release, but data
that require the hg19/GRCh37 releases may still be relevant for several
years. Would it be reasonable to provide alternative reference
packages, such as a GRCh37 BSgenome package?
Best wishes
Julian