[Bioc-devel] Error when index duplicate rows in SummarizedExperiment -- is this a bug?
Elizabeth Purdom
epurdom @end|ng |rom berke|ey@edu
Fri Jul 17 19:13:20 CEST 2020
Hello,
I want to be able to index duplicate rows of an assay of a SummarizedExperiment (think bootstrapping), something like this:
assay(se[c(1,1,2,2),])
However, this throws an error when the assay is a data.frame rather than a DataFrame:
# > assay(se[c(1,1,2,2),]) #throws error
# Error in `.rowNamesDF<-`(x, value = value) :
# duplicate 'row.names' are not allowed
# In addition: Warning message:
# non-unique values when setting 'row.names': ‘A1’, ‘A2’
Here’s a simple example:
library(SummarizedExperiment)
test <- data.frame(matrix(rnorm(100), ncol=5))
row.names(test) <- paste0("A", 1:nrow(test))
se <- SummarizedExperiment(test)
I can pull duplicate rows of the original data.frame:
test[c(1,1,2,2),] # works
I can also index duplicate rows of the SummarizedExperiment:
se[c(1,1,2,2),] #works
But I can’t then call `assay` on that object with the duplicated rows:
assay(se[c(1,1,2,2),]) #throws error
# > assay(se[c(1,1,2,2),])
# Error in `.rowNamesDF<-`(x, value = value) :
# duplicate 'row.names' are not allowed
# In addition: Warning message:
# non-unique values when setting 'row.names': ‘A1’, ‘A2’
Of course, I can do
assay(se)[c(1,1,2,2),]
because the underlying data.frame can be indexed that way, but then I am not indexing the corresponding `rowData`, which is my goal in indexing `se` directly, rather than the `assay`.
On the other hand, I don’t get this problem if the input object is a DataFrame or matrix:
se<-SummarizedExperiment(DataFrame(test))
assay(se[c(1,1,2,2),]) #now it works
se<-SummarizedExperiment(data.matrix(test))
assay(se[c(1,1,2,2),]) #now it works
This seems like a bug, but I thought I’d check here. It seems, at a minimum, unfortunate that you can call `se[c(1,1,2,2),]` but not `assay(se[c(1,1,2,2),])`, especially given that the underlying `data.frame` allows this call.
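The same error message can be reproduced in base R alone, which may help narrow down where it arises. The sketch below is only an illustration: it assumes, without having checked the SummarizedExperiment source, that the failure comes from duplicated row names being assigned back onto a data.frame. It shows that `[` on a data.frame makes repeated row names unique, whereas assigning the repeated names directly fails. The names `sub` and `res` are introduced here for the example.

```r
# Base R only; no Bioconductor packages are needed for this illustration.
test <- data.frame(matrix(rnorm(100), ncol = 5))
row.names(test) <- paste0("A", 1:nrow(test))

# `[` on a data.frame makes the repeated row names unique.
sub <- test[c(1, 1, 2, 2), ]
row.names(sub)
# "A1" "A1.1" "A2" "A2.1"

# Assigning the repeated names directly is what base R refuses to do.
# tryCatch() captures the error message instead of stopping the script.
res <- tryCatch(
  {
    row.names(sub) <- c("A1", "A1", "A2", "A2")
    "ok"
  },
  error = function(e) conditionMessage(e)
)
res
```

If that assumption holds, it would also be consistent with the DataFrame and matrix cases working, since both of those accept duplicated row names.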
Thanks,
Elizabeth Purdom