[Bioc-devel] (crazy) copy-on-modification bug in GenomicFeatures

Hervé Pagès hpages at fredhutch.org
Tue Sep 22 12:55:46 CEST 2015


Hi Kasper,

On 09/21/2015 06:30 PM, Kasper Daniel Hansen wrote:
> An anonymous student found this.
>
> PREP
> library(GenomicFeatures)
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
>
> EXAMPLE
> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
> seqlevels(txdb, force=TRUE) <- c("chr22")
> gr <- GRanges(seqnames = "chr22",
>               ranges = IRanges(start = 1, end = 52330658))
> gr.trans.chr22 <- subsetByOverlaps(transcripts(txdb), gr,
>                                    ignore.strand = TRUE)
> length(gr.trans.chr22)
> END_EXAMPLE
>
> If you run the code from EXAMPLE to END_EXAMPLE twice in a row in the
> same R session, you get 1868 the first time and 2576 the second time.

Thanks for the report. Not sure this has anything to do with
copy-on-modification but it certainly is crazy ;-)

After spending some time looking at the seqlevels() setter for TxDb
objects, I found many more problems with it, all of which should be
addressed in GenomicFeatures 1.21.29 (devel) and 1.20.5 (release).

Also, from now on the user won't need to specify 'force=TRUE'
when setting user-supplied seqlevels on a TxDb object (the 'force'
arg is simply ignored).

>
> In fact, you get behavior like this in a fresh setting
>
> library(GenomicFeatures)
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
>
> txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
> seqlevels(txdb, force=TRUE) <- c("chr22")
> library(TxDb.Hsapiens.UCSC.hg19.knownGene)
> seqlevels(TxDb.Hsapiens.UCSC.hg19.knownGene)
> [1] "chr22"

OK so this is indeed a long-standing issue, and it is a
consequence of the fact that TxDb objects are implemented
as a reference class: 'txdb' and the original
TxDb.Hsapiens.UCSC.hg19.knownGene refer to the same underlying
object, so modifying one modifies the other. IMO reference
semantics should be used carefully and are not appropriate for
TxDb objects. So I think we should switch to a classic S4
implementation with healthy copy-on-modification semantics ASAP.
Unfortunately this won't happen for BioC 3.2...
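
To illustrate the difference, here is a minimal sketch using only the
methods package. RefDb and ValueDb are made-up toy classes for
illustration, not how TxDb objects are actually implemented:

```r
library(methods)

# Hypothetical toy classes (assumptions for illustration only).
# Reference class: assignment copies a reference, not the object.
RefDb <- setRefClass("RefDb", fields = list(seqlevels = "character"))

# Classic S4 class: ordinary copy-on-modification semantics.
setClass("ValueDb", representation(seqlevels = "character"))

ref_orig <- RefDb$new(seqlevels = c("chr1", "chr22"))
ref_copy <- ref_orig             # both names point to the same object
ref_copy$seqlevels <- "chr22"    # also changes ref_orig

val_orig <- new("ValueDb", seqlevels = c("chr1", "chr22"))
val_copy <- val_orig             # an independent copy once modified
val_copy@seqlevels <- "chr22"    # val_orig is left untouched

ref_orig$seqlevels               # modified through ref_copy
val_orig@seqlevels               # still the original seqlevels
```

With the reference class, the change made through the copy shows up
in the original, which is the kind of behavior reported above. With
the classic S4 class, the original keeps both seqlevels.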

H.

>
> So here the modification of txdb gets carried through to the original
> object.
>
> Best,
> Kasper
>

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpages at fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


