[Bioc-devel] Tip of the day: unlist(..., use.names=FALSE) often saves lots of memory

Henrik Bengtsson hb at stat.berkeley.edu
Sun Jul 6 02:14:37 CEST 2008


Hi,

I just want to share a seldom-used feature of unlist():

  Using argument 'use.names=FALSE' when calling unlist() often saves
lots of memory.

By default, the names of the list are expanded to every element of the
result, and these names can often consume much more memory than the
actual data.  So, unless you really need the 'names' attribute, please
consider using unlist(..., use.names=FALSE) in your package(s).  It is
also faster.
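
To make the expansion of names concrete, here is a minimal sketch on a
toy list.  The list 'x' and its element names are made up for
illustration; they are not taken from any package.

```r
# Hypothetical toy list standing in for a named list of index vectors
x <- list(a = 1:3, b = 4:5)

# Default (use.names=TRUE): every element of the result gets its own name,
# built from the list name plus the position within that list element
named <- unlist(x)
print(names(named))
#> [1] "a1" "a2" "a3" "b1" "b2"

# With use.names=FALSE the result is a plain integer vector without names
plain <- unlist(x, use.names = FALSE)
print(plain)
#> [1] 1 2 3 4 5
print(is.null(names(plain)))
#> [1] TRUE
```

Each of the five integers carries one character string in the named
version, which is where the extra memory goes.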

A common example using an AffyBatch object:

> affyBatch
AffyBatch object
size of arrays=1164x1164 features (7 kb)
cdf=HG-U133_Plus_2 (54675 affyids)
number of samples=1
number of genes=54675
annotation=hgu133plus2
notes=

> pmIndex <- indexProbes(affyBatch[,1], "pm")
> object.size(pmIndex)
[1] 6572776

> cells <- unlist(pmIndex)
> object.size(cells)
[1] 29018704

> cells2 <- unlist(pmIndex, use.names=FALSE)
> object.size(cells2)
[1] 2417056

# The names consume about 92% of the memory
> object.size(cells2)/object.size(cells)
[1] 0.08329304

It is much cheaper to pass around 'cells2' than 'cells'.
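
For readers without an AffyBatch at hand, the comparison can be
sketched on synthetic data.  Everything below is an assumption made for
illustration: the list 'idx', the "probeset_" names and the group sizes
are invented, and the exact object sizes depend on the R version and
platform.  The sketch also shows one way to rebuild the group labels
later if they turn out to be needed; it uses lengths(), which requires
R >= 3.2.0.

```r
# Synthetic stand-in for pmIndex: 1000 named groups of 11 integer indices
# each (names and sizes are made up for illustration)
n <- 1000L
labels <- sprintf("probeset_%04d", seq_len(n))
idx <- split(seq_len(11L * n), rep(labels, each = 11L))

with_names    <- unlist(idx)
without_names <- unlist(idx, use.names = FALSE)

# Same values either way; only the 'names' attribute differs
stopifnot(identical(unname(with_names), without_names))

# Compare the memory footprints (exact numbers vary between systems)
print(object.size(with_names))
print(object.size(without_names))

# If the group of each element is needed later, it can be rebuilt on demand
# from the list itself instead of being carried around as names
group <- rep(names(idx), times = lengths(idx))
stopifnot(length(group) == length(without_names))
```

The point of the last step is that the grouping information is not
lost by dropping the names; it is still available from the original
list whenever it is needed.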

/Henrik


