[Bioc-devel] Tip of the day: unlist(..., use.names=FALSE) often saves lots of memory
Henrik Bengtsson
hb at stat.berkeley.edu
Sun Jul 6 02:14:37 CEST 2008
Hi,
I just want to share a seldom-used feature of unlist():
Using argument 'use.names=FALSE' when calling unlist() often saves
lots of memory.
By default, unlist() builds a name for every element of the result
from the names of the list, and these names can often consume much
more memory than the actual data. So, unless you really need the
'names' attribute, please consider using unlist(..., use.names=FALSE)
in your package(s). It is also faster.
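The effect can be reproduced without any Bioconductor data. The
following is only a minimal sketch on made-up data: the list 'x', its
"probeset_..." names and the sizes are hypothetical, and the exact
byte counts will vary between R versions and platforms.

```r
# Hypothetical toy data: 1000 named list elements, 10 integers each.
set.seed(1)
x <- replicate(1000, sample.int(1e6, 10), simplify = FALSE)
names(x) <- sprintf("probeset_%04d_at", seq_along(x))

# Default: each of the 10,000 values gets its own name string,
# e.g. "probeset_0001_at1", "probeset_0001_at2", ...
with_names <- unlist(x)

# No names attribute is built at all.
without_names <- unlist(x, use.names = FALSE)

# The values are identical; only the 'names' attribute differs.
stopifnot(identical(unname(with_names), without_names))

print(object.size(with_names), units = "Kb")
print(object.size(without_names), units = "Kb")
```

The second object should come out much smaller, since it holds only
the integer values.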
A common example using an AffyBatch object:
> affyBatch
AffyBatch object
size of arrays=1164x1164 features (7 kb)
cdf=HG-U133_Plus_2 (54675 affyids)
number of samples=1
number of genes=54675
annotation=hgu133plus2
notes=
> pmIndex <- indexProbes(affyBatch[,1], "pm")
> object.size(pmIndex)
[1] 6572776
> cells <- unlist(pmIndex)
> object.size(cells)
[1] 29018704
> cells2 <- unlist(pmIndex, use.names=FALSE)
> object.size(cells2)
[1] 2417056
# The names consume 92% of the memory
> object.size(cells2)/object.size(cells)
[1] 0.08329304
It is much cheaper to pass around 'cells2' compared with 'cells'.
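If the names are only needed to tell which list element a value came
from, one possible alternative is to keep a compact integer index next
to the unnamed vector. This is just a sketch and not part of the
session above: the list 'x' and the variables 'values' and 'group' are
hypothetical, and lengths() assumes R >= 3.2.0.

```r
# Hypothetical toy list standing in for something like 'pmIndex'.
x <- list(a = c(11L, 12L, 13L), b = 21L, c = c(31L, 32L))

# Unnamed values plus one integer per value recording its source element.
values <- unlist(x, use.names = FALSE)
group  <- rep(seq_along(x), lengths(x))

# Look up a name only when it is actually needed.
origin_of_4th <- names(x)[group[4]]

# Regroup the values by source element (result is named "1", "2", "3").
regrouped <- split(values, group)
```

The index costs one integer per value, rather than one name string per
value.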
/Henrik