[Bioc-devel] SummarizedExperiment: structure loss, when mixing matrix and data.frame data

Martin Morgan martin.morgan at roswellpark.org
Sun Nov 26 16:03:44 CET 2017


It would seem to be a bug in endoapply

library(S4Vectors)

lst <- SimpleList(
     m = matrix(0, 2, 2, dimnames=list(letters[1:2], LETTERS[1:2])),
     df = data.frame(A=1:2, B=1:2, row.names=letters[1:2])
)
dimnames(lst[[1]])                      # list(c("a", "b"), c("A", "B"))
dimnames(endoapply(lst, identity)[[1]]) # NULL

specifically S4Vectors:::coerceToSimpleList

lst <- list(
     m = matrix(0, 2, 2, dimnames=list(letters[1:2], LETTERS[1:2])),
     df = data.frame(A=1:2, B=1:2, row.names=letters[1:2])
)
S4Vectors:::coerceToSimpleList(lst)
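
To see which elements lose their structure, something like the following might help. It is only a sketch: same_structure() is a made-up helper, not an S4Vectors function, and the base R lapply() calls are stand-ins so that the snippet runs without Bioconductor. To exercise the reported case, build lst with SimpleList() and pass endoapply(lst, identity) as the second argument.

```r
# Hypothetical helper (not part of S4Vectors): for each element of two
# list-like objects, report whether dim and dimnames are unchanged.
same_structure <- function(before, after) {
    stopifnot(length(before) == length(after))
    vapply(seq_along(before), function(i) {
        identical(dim(before[[i]]), dim(after[[i]])) &&
            identical(dimnames(before[[i]]), dimnames(after[[i]]))
    }, logical(1))
}

lst <- list(
    m = matrix(0, 2, 2, dimnames = list(letters[1:2], LETTERS[1:2])),
    df = data.frame(A = 1:2, B = 1:2, row.names = letters[1:2])
)

# Identity round trip with base R; structure should survive.
kept <- same_structure(lst, lapply(lst, identity))

# Deliberately strip the matrix to a plain vector to show a failing check.
stripped <- lapply(lst, function(x) if (is.matrix(x)) as.vector(x) else x)
lost <- same_structure(lst, stripped)
```

Elements flagged FALSE have lost (or changed) their dim or dimnames.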

Martin
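
P.S. Until this is sorted out, a possible stopgap, going by Felix's observation that the problem does not show up when all assays are matrices, is to convert the data.frame to a matrix before storing it. A minimal base R sketch (counts2_df is a made-up stand-in for as.data.frame(counts2); I have not checked this against the devel build):

```r
# Made-up stand-in for as.data.frame(counts2) from the original report.
counts2_df <- data.frame(A = c(0.1, 0.2), B = c(0.3, 0.4),
                         row.names = letters[1:2])

# as.matrix() keeps the explicit row names and the column names.
counts2_mat <- as.matrix(counts2_df)
is.matrix(counts2_mat)
dimnames(counts2_mat)

# With a SummarizedExperiment `se` built as in Felix's example, one would
# then assign the matrix rather than the data.frame:
#   assays(se)$counts2 <- counts2_mat
```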


On 11/26/2017 07:56 AM, Vincent Carey wrote:
> Confirmed with the following sessionInfo(), satisfying biocValid()==TRUE
> 
> > sessionInfo()
> R Under development (unstable) (2017-11-22 r73776)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Linux Mint 18.1
> 
> Matrix products: default
> BLAS: /home/stvjc/R-35-dist/lib/R/lib/libRblas.so
> LAPACK: /home/stvjc/R-35-dist/lib/R/lib/libRlapack.so
> 
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> 
> attached base packages:
> [1] parallel  stats4    stats     graphics  grDevices utils     datasets
> [8] methods   base
> 
> other attached packages:
> [1] SummarizedExperiment_1.9.2 DelayedArray_0.5.5
> [3] matrixStats_0.52.2         Biobase_2.39.0
> [5] GenomicRanges_1.31.1       GenomeInfoDb_1.15.1
> [7] IRanges_2.13.4             S4Vectors_0.17.10
> [9] BiocGenerics_0.25.0
> 
> loaded via a namespace (and not attached):
>  [1] lattice_0.20-35         bitops_1.0-6            grid_3.5.0
>  [4] zlibbioc_1.25.0         XVector_0.19.1          Matrix_1.2-12
>  [7] tools_3.5.0             RCurl_1.95-4.8          compiler_3.5.0
> [10] GenomeInfoDbData_0.99.2
> 
> On Sun, Nov 26, 2017 at 7:09 AM, Felix Ernst <felix.ernst at ulb.ac.be> wrote:
> 
>> Hi all,
>>
>> I get different results when constructing a SummarizedExperiment in
>> Bioc 3.6 and Bioc 3.7. Is this intentional, or is it a bug?
>>
>> library(GenomicRanges)
>> library(SummarizedExperiment)
>>
>> nrows <- 200; ncols <- 6
>> counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
>> colnames(counts) <- LETTERS[1:6]
>> rownames(counts) <- 1:nrows
>> counts2 <- counts-floor(counts)
>> rowRanges <- GRanges(rep(c("chr1", "chr2"), c(50, 150)),
>>                       IRanges(floor(runif(200, 1e5, 1e6)), width=100),
>>                       strand=sample(c("+", "-"), 200, TRUE),
>>                       feature_id=sprintf("ID%03d", 1:200))
>> colData <- DataFrame(Treatment=rep(c("ChIP", "Input"), 3),
>>                       row.names=LETTERS[1:6])
>>
>> se <- SummarizedExperiment(assays=list(counts=counts),
>>                             rowRanges=rowRanges,
>>                             colData=colData)
>>
>> str(assays(se)$counts)
>> assays(se)$counts2 <- as.data.frame(counts2)
>> str(assays(se)$counts)
>>
>> On Windows 10 (R 3.4.2, Bioc 3.6) this produces:
>>  num [1:200, 1:6] 8815 6314 1945 6185 5935 ...
>>  - attr(*, "dimnames")=List of 2
>>   ..$ : chr [1:200] "1" "2" "3" "4" ...
>>   ..$ : chr [1:6] "A" "B" "C" "D" ...
>>  num [1:200, 1:6] 8815 6314 1945 6185 5935 ...
>>  - attr(*, "dimnames")=List of 2
>>   ..$ : chr [1:200] "1" "2" "3" "4" ...
>>   ..$ : chr [1:6] "A" "B" "C" "D" ...
>>
>> On Ubuntu 17.10 (R-devel r73779, Bioc 3.7) this produces:
>>  num [1:200, 1:6] 8636 7040 9275 4821 2475 ...
>>  - attr(*, "dimnames")=List of 2
>>   ..$ : chr [1:200] "1" "2" "3" "4" ...
>>   ..$ : chr [1:6] "A" "B" "C" "D" ...
>>  num [1:1200] 8636 7040 9275 4821 2475 ...
>>
>> Somehow the structure of the existing matrix assay is lost: it comes
>> back as a plain numeric vector without dim or dimnames.
>>
>> This happens if I mix matrix and data.frame data, and doesn't if I use
>> only matrices. The man page refers to matrix-like objects, which a
>> data.frame is (isn't it?), and the behavior differs between Bioc 3.6
>> and Bioc 3.7.
>>
>> I can rule out a Windows/Linux difference, because the Travis build
>> error that pointed to the difference in the first place didn't occur
>> with bioc-release, only with bioc-devel.
>>
>> Thanks for any advice and suggestions.
>>
>> Felix
>>
>> _______________________________________________
>> Bioc-devel at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/bioc-devel
> 