[Bioc-devel] SummarizedExperiment: duplication of metadata, when modifying colData
Hervé Pagès
hpages at fredhutch.org
Fri Dec 15 03:29:19 CET 2017
Hi Felix,
Nice catch. This can actually be reproduced with just:
> example(SummarizedExperiment)
> metadata(se0) <- list(aa="aa")
> se0[1 , ] <- se0[1 , ]
> metadata(se0)
$aa
[1] "aa"
$aa
[1] "aa"
The culprit is this line:
ans_metadata <- c(metadata(x), metadata(value))
in the "[<-" method for SummarizedExperiment objects.
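For context on why Felix's code reaches this method at all:
colData(se[, 1])$xyz <- "abc" contains no explicit se[, 1] <- ...,
but R evaluates a nested replacement from the inside out and finishes
by calling "[<-" on 'se', roughly
se <- `[<-`(se, , 1, value=<modified se[, 1]>). Here is a minimal
base-R sketch of that mechanism. The S3 class "counted" is made up for
this illustration (it is not part of any package) and only counts the
hidden calls:

```r
calls <- 0L  # number of times the toy "[<-" method has run

# Hypothetical S3 class "counted", used only to make the hidden call visible.
`[<-.counted` <- function(x, i, value) {
  calls <<- calls + 1L
  cl <- oldClass(x)
  x <- unclass(x)
  x[i] <- value
  class(x) <- cl
  x
}

x <- structure(c(a = 1, b = 2), class = "counted")

# No explicit 'x[1] <- ...' here, yet R evaluates this roughly as
#   x <- `[<-`(x, 1, value = `names<-`(x[1], value = "z"))
names(x[1]) <- "z"

calls  # the "[<-" method ran once
```

So any replacement applied to a subset of 'se' goes through "[<-" with
'value' being the modified subset.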
So it looks like it was a deliberate decision to have
[<- combine the metadata of 'x' and 'value'. The problem is that
this breaks the more-than-reasonable expectation that something
like x[i , j] <- x[i , j] should be a no-op.
I replaced the above line with:
ans_metadata <- metadata(x)
in SummarizedExperiment 1.9.5 (devel). With this change [<-
leaves metadata(x) intact and x[i , j] <- x[i , j] behaves like
a no-op:
https://github.com/Bioconductor/SummarizedExperiment/commit/e4fcb99c442e2f17b0ccddfb05df9f160e0bbe40
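To illustrate the difference between the two lines, here is a toy
model in base R. It is only a sketch: the object is reduced to its
metadata list, old_rule() and new_rule() are hypothetical helper names
(not functions from the package), and it assumes that x[i , j] carries
the same metadata as 'x'.

```r
# Toy model: an "object" is just its metadata list; the data part is omitted.
# old_rule mimics c(metadata(x), metadata(value)); new_rule mimics metadata(x).
old_rule <- function(x_meta, value_meta) c(x_meta, value_meta)
new_rule <- function(x_meta, value_meta) x_meta

# Apply x[i, j] <- x[i, j] n times. 'value' is a subset of 'x', so it is
# assumed to carry the same metadata as 'x' each time.
self_assign <- function(rule, meta, n) {
  sizes <- integer(n)
  for (k in seq_len(n)) {
    meta <- rule(meta, meta)
    sizes[k] <- length(meta)
  }
  sizes
}

meta <- list(meep = "meep")
old_sizes <- self_assign(old_rule, meta, 3)  # doubles on every assignment
new_sizes <- self_assign(new_rule, meta, 3)  # stays at the original length
```

old_sizes comes out as 2, 4, 8, the same growth Felix reported, while
new_sizes stays at 1, 1, 1.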
Will port to release soon.
Cheers,
H.
On 12/12/2017 01:05 AM, Felix Ernst wrote:
> Hi all,
>
>
>
> I ran into some weird behaviour with SummarizedExperiment objects in Bioc 3.6
> and 3.7. I suppose it is a bug, but I might be wrong, since the way I access
> the SummarizedExperiment object is not really straightforward. Any suggestions?
>
> library(GenomicRanges)
> library(SummarizedExperiment)
>
> nrows <- 200; ncols <- 6
> counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
> colnames(counts) <- LETTERS[1:6]
> rownames(counts) <- 1:nrows
> counts2 <- counts-floor(counts)
> rowRanges <- GRanges(rep(c("chr1", "chr2"), c(50, 150)),
>                      IRanges(floor(runif(200, 1e5, 1e6)), width=100),
>                      strand=sample(c("+", "-"), 200, TRUE),
>                      feature_id=sprintf("ID%03d", 1:200))
> colData <- DataFrame(Treatment=rep(c("ChIP", "Input"), 3),
>                      row.names=LETTERS[1:6])
>
> se <- SummarizedExperiment(assays=list(counts=counts),
>                            rowRanges=rowRanges,
>                            colData=colData)
> colData(se)$xyz <- rep("",ncol(se))
> metadata(se) <- list("meep" = "meep")
>
> str(metadata(se))
> colData(se[, 1])$xyz <- "abc"
> str(metadata(se))
>
> The first metadata() call returns a list of length 1 with the correct data.
> The second call returns a list of length 2 with a duplicated entry, and every
> further colData modification (and replacement of data) duplicates the entries
> in the metadata again.
>
>> str(metadata(se))
> List of 1
> $ meep: chr "meep"
>> colData(se[, 1])$xyz <- "abc"
>> str(metadata(se))
> List of 2
> $ meep: chr "meep"
> $ meep: chr "meep"
>> colData(se[, 2])$xyz <- "abc"
>> str(metadata(se))
> List of 4
> $ meep: chr "meep"
> $ meep: chr "meep"
> $ meep: chr "meep"
> $ meep: chr "meep"
>> colData(se[, 2])$xyz <- "abc"
>> str(metadata(se))
> List of 8
> $ meep: chr "meep"
> $ meep: chr "meep"
> $ meep: chr "meep"
> $ meep: chr "meep"
> $ meep: chr "meep"
> $ meep: chr "meep"
> $ meep: chr "meep"
> $ meep: chr "meep"
>
> Thanks for any advice and suggestions.
>
> Felix
>
>
>
> ---
>
>
>
> Felix Ernst, PhD
>
> Université Libre de Bruxelles
>
> RNA MOLECULAR BIOLOGY
>
> BIOPARK Charleroi Brussels-South CAMPUS
>
> Rue Profs Jeener & Brachet, 12
>
> B-6041 Charleroi - Gosselies
>
> BELGIUM
>
> +32(2)650 9774 (office phone)
>
> felix.ernst at ulb.ac.be
--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319