[Bioc-devel] DESeqDataSetFromMatrix Changes Column Names
Michael Love
michaelisaiahlove at gmail.com
Tue Aug 26 10:43:44 CEST 2014
Hi Dario,
Here's some example behavior of SummarizedExperiment (here in devel).
The renaming behavior comes from GenomicRanges. In any case, I can't
avoid duplicating memory when the colnames of the matrix conflict with
the rownames of colData, unless I internally overwrite the rownames of
colData. But I don't think I will do that, because the standard is to
let the colData take precedence.
Watch the Vcells "used" value in the gc() output:
library(GenomicRanges)
gc()
m = matrix(rnorm(5e6),ncol=100,dimnames=list(1:5e4,paste0("a",1:100)))
gc() # 40 Mb or so taken by m
se = SummarizedExperiment(m)
gc() # no duplication after creating se
rm(se)
se = SummarizedExperiment(m,
  colData=DataFrame(x=1:100,row.names=paste0("b",1:100)))
colnames(se) # the rownames of colData take precedence as the colnames of se
colnames(assay(se)) # and they replace the colnames that came from m
gc() # note a duplication,
# because the colnames of the matrix in assay() were replaced
rm(se)
se = SummarizedExperiment(m, colData=DataFrame(x=1:100,row.names=colnames(m)))
gc() # no duplication, same names.
# so you can use this code to insist that
# the colnames of the DESeqDataSet come from the counts matrix
rm(m,se)
m = matrix(rnorm(5e6),ncol=100)
gc()
se = SummarizedExperiment(m, colData=DataFrame(x=1:100))
gc() # no duplication if m has no colnames going in
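
To make the "same names" case above easy to guarantee before building the object, here is a minimal sketch in base R. The helper name align_coldata_names is hypothetical (it is not part of DESeq2 or GenomicRanges), it uses a plain data.frame rather than a DataFrame so it runs without Bioconductor, and it assumes the rows of the sample table are already in the same order as the columns of the count matrix.

```r
# Hypothetical helper, not part of DESeq2 or GenomicRanges:
# copy the column names of the count matrix onto the sample table,
# so that the two sets of names cannot conflict later.
# Assumes the samples are already in the same order in both objects.
align_coldata_names <- function(counts, coldata) {
  stopifnot(ncol(counts) == nrow(coldata))
  if (!is.null(colnames(counts))) {
    rownames(coldata) <- colnames(counts)
  }
  coldata
}

# Small illustrative inputs (made-up data)
counts <- matrix(rpois(20, lambda = 10), ncol = 4,
                 dimnames = list(paste0("gene", 1:5), paste0("a", 1:4)))
coldata <- data.frame(condition = factor(c("ctl", "ctl", "trt", "trt")),
                      row.names = paste0("b", 1:4))

coldata <- align_coldata_names(counts, coldata)
identical(rownames(coldata), colnames(counts))
```

The aligned table can then be passed as colData in place of the original one. If the matrix has no colnames, the helper leaves the sample table untouched.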
R Under development (unstable) (2014-06-05 r65862)
Platform: x86_64-apple-darwin12.5.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] parallel stats graphics grDevices datasets utils methods
[8] base
other attached packages:
[1] GenomicRanges_1.17.35 GenomeInfoDb_1.1.18 IRanges_1.99.24
[4] S4Vectors_0.1.2 BiocGenerics_0.11.4 devtools_1.5
[7] slidify_0.4.5 knitr_1.6 BiocInstaller_1.15.5
loaded via a namespace (and not attached):
[1] compiler_3.2.0 digest_0.6.4 evaluate_0.5.5 formatR_0.10 httr_0.4
[6] markdown_0.7.2 memoise_0.2.1 RCurl_1.95-4.3 stats4_3.2.0 stringr_0.6.2
[11] tools_3.2.0 whisker_0.3-2 XVector_0.5.7 yaml_2.1.13
On Tue, Aug 26, 2014 at 2:00 AM, Dario Strbenac
<dstr7320 at uni.sydney.edu.au> wrote:
> I am using the latest release version. I understand your recommendation about colData and will use it.
>
> --------------------------------------
> Dario Strbenac
> PhD Student
> University of Sydney
> Camperdown NSW 2050
> Australia