[Bioc-devel] (crazy) copy-on-modification bug in GenomicFeatures
Hervé Pagès
hpages at fredhutch.org
Tue Sep 22 12:55:46 CEST 2015
Hi Kasper,
On 09/21/2015 06:30 PM, Kasper Daniel Hansen wrote:
> An anonymous student found this.
> library(GenomicFeatures)
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
> seqlevels(txdb, force=TRUE) <- c("chr22")
> gr <- GRanges(seqnames = "chr22", ranges = IRanges(start = 1, end =
> 52330658))
> gr.trans.chr22 <- subsetByOverlaps(transcripts(txdb), gr, ignore.strand =
> length(gr.trans.chr22)
> If you run the EXAMPLE to END_EXAMPLE code twice in a row in an R
> session, you get the answer 1868 the first time and 2576 the second time.
Thanks for the report. Not sure this has anything to do with
copy-on-modification but it certainly is crazy ;-)
After spending some time looking at the seqlevels() setter for TxDb
objects, I found many more problems with it, all of which should be
addressed in GenomicFeatures 1.21.29 (devel) and 1.20.5 (release).
Also, from now on the user won't need to specify 'force=TRUE'
when setting user-supplied seqlevels on a TxDb object (the 'force'
arg is simply ignored).
> In fact, you get behavior like this in a fresh setting
> library(GenomicFeatures)
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
> seqlevels(txdb, force=TRUE) <- c("chr22")
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
> seqlevels(TxDb.Hsapiens.UCSC.hg19.knownGene)
> [1] "chr22"
OK so this is indeed a long-standing issue and is a
consequence of the fact that TxDb objects are implemented
as a reference class. IMO reference semantics should
be used carefully and are not appropriate for TxDb objects.
So I think we should switch to a classic S4 implementation with
healthy copy-on-modification semantics ASAP. Unfortunately
this won't happen for BioC 3.2...
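To illustrate the difference between the two semantics, here is a minimal
sketch in base R (methods package only). The class names RefTx and ValTx
and their single 'seqlevels' field are hypothetical stand-ins made up for
this example; they are not the actual TxDb implementation.

```r
library(methods)

# Reference class (hypothetical stand-in): assignment copies the
# reference, not the data, so both names point to the same object.
RefTx <- setRefClass("RefTx", fields = list(seqlevels = "character"))

a <- RefTx$new(seqlevels = c("chr21", "chr22"))
b <- a                    # no copy is made here
b$seqlevels <- "chr22"    # modifies the object shared by 'a' and 'b'
a$seqlevels               # "chr22": the "original" changed too

# Classic S4 class (hypothetical stand-in): copy-on-modification, so
# changing the copy leaves the original untouched.
setClass("ValTx", slots = c(seqlevels = "character"))

x <- new("ValTx", seqlevels = c("chr21", "chr22"))
y <- x
y@seqlevels <- "chr22"    # only 'y' is modified
x@seqlevels               # still "chr21" "chr22"
```

With a reference class, an independent copy has to be requested
explicitly, e.g. with b <- a$copy().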
> so here the modification of txdb gets carried through to the original
> object.
> Best,
> Kasper
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319