[R] help(Memory) [forwarded message]

Martin Maechler maechler at stat.math.ethz.ch
Wed Feb 4 14:51:33 CET 2004

I can't understand that people still send things like this to

------- start of forwarded message -------
From: Tineke Casneuf <ticas at psb.ugent.be>
Sender: r-core-bounces at stat.math.ethz.ch
To: R-core at r-project.org
Subject: help(Memory)
Date: Wed, 04 Feb 2004 13:39:32 +0100


I am trying to find an appropriate package to analyse gene expression
data from DNA microarray experiments. My data are already normalized,
so for the clustering of my data I used the 'mva' package. All I
actually need is to calculate Euclidean, Manhattan, ... distances and
various kinds of correlation coefficients. I am an R beginner, and to me
it's not clear which package I should use (there are so many of them!!).
I have looked at the Bioconductor website, but it looks as if those
packages are meant for fancier analyses of smaller datasets (hundreds
of genes), like ANOVA, identification of differentially expressed
genes, ... All I want is to calculate distances and correlation
coefficients for all the genes on the microarray (up to 22 000 genes). I
have already tried to do some calculations with the mva package, but
the process kills itself and returns a warning: 'heap memory exhausted'.
So I read in the manual how to increase the heap memory: I put it up to
--vsize=2000M, but it still keeps saying it (it needed 83 Kb or so more).
I have tried to increase the heap memory to 2200M, but it won't let me
do it (too large and ignored). I used a dataset with 7 000 rows.

The commands I used are:
> scan ("list_genes", what = "list") -> genenames
> read.table(file ="list_signals", row.names = genenames) -> data
> library (mva)
> as.matrix(dist(data, method = "euclidean", diag = TRUE)) -> matrix
> write.table(matrix, file = "euclidmartix")

So here's my problem: maybe I can't use R (or this package) for this
kind of big dataset (it needs to calculate a 7000 by 7000 matrix), or
there's something wrong with my commands, since R is given 2 GB and it
still crashes. Is there maybe a better package for me to use? Or is this
amount of heap memory not unusual for such a big dataset, and do I need
to add more?

Can somebody please help me with this?

Thanks in advance,


Tineke Casneuf          Tel: 32 (0)9 3313692
DEPARTMENT OF PLANT SYSTEMS BIOLOGY           Fax:32 (0)9 3313809
GHENT UNIVERSITY/VIB,    Technology Park 927, B-9052 Gent, Belgium
Vlaams Interuniversitair Instituut voor Biotechnologie         VIB
e-mail:ticas at psb.ugent.be     http://www.psb.ugent.be/bioinformatics/
------- end of forwarded message -------
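
As a back-of-the-envelope check on the numbers in the forwarded message,
the sketch below estimates how much memory the distances for 7000 rows
take, assuming 8 bytes per double. It counts only the result objects, not
the temporary copies that as.matrix() or write.table() may make, so it is
a lower bound and not a diagnosis of the crash. The small matrix x is
made-up stand-in data, and dist() is called from 'stats', where it lives
since 'mva' was merged into that package in R 1.9.0.

```r
# Rough memory estimate for pairwise distances between n rows,
# assuming 8 bytes per double and ignoring temporary copies.
n <- 7000
bytes_dist <- n * (n - 1) / 2 * 8   # lower triangle only, as stored by dist()
bytes_full <- n^2 * 8               # full square matrix, as built by as.matrix()
cat("dist object:", bytes_dist / 1e6, "million bytes\n")
cat("full matrix:", bytes_full / 1e6, "million bytes\n")

# Made-up stand-in data: 5 "genes" (rows) by 4 "arrays" (columns).
set.seed(1)
x <- matrix(rnorm(20), nrow = 5,
            dimnames = list(paste0("g", 1:5), NULL))

d <- dist(x, method = "euclidean")  # compact "dist" object, n*(n-1)/2 values
r <- cor(t(x))                      # gene-by-gene Pearson correlations
```

For n = 7000 this gives about 196 million bytes for the dist object and
392 million bytes for the full square matrix. Keeping the result as a
"dist" object instead of converting it with as.matrix() therefore halves
the storage, and functions such as hclust() accept it directly.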
