[R] memory problem with mac os X

Huntsinger, Reid reid_huntsinger at merck.com
Tue Mar 1 18:33:53 CET 2005


Yes, this came up last week as well. "dist" uses .C() to call code that
computes the upper triangle; that code is passed an "empty" R vector of
length N(N-1)/2 to fill in. .C() returns a list containing the arguments
passed in, and dist assigns the result to another vector. I can only guess
that, because arguments are not copied in the .C call with DUP=FALSE, R is
conservative and assumes that a copy needs to be made in the assignment, so
for a while two copies exist. I haven't found this in the code yet, so maybe
something quite different is going on.
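
As a rough check on the sizes involved, here is a minimal sketch, assuming
8-byte doubles and ignoring R's per-object overhead. The helper name
dist_bytes and the value n = 18000 are my assumptions for illustration;
18000 is simply a value that reproduces the 1265554 Kb figure in the error
message quoted below.

```r
# Hypothetical helper: bytes needed for the N(N-1)/2 doubles that dist fills in.
dist_bytes <- function(n) n * (n - 1) / 2 * 8

n <- 18000                       # assumed number of observations
one_copy <- dist_bytes(n)

floor(one_copy / 1024)           # Kb for one copy of the result vector
2 * one_copy / 2^30              # GiB needed if two copies exist at once
```

If two copies really do coexist for a while, the second line is the number
that has to fit in memory, not the first.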

In any case, you can compute the distance matrix yourself more
space-efficiently, but if you really need large distance matrices you'll
need more RAM or perhaps 64-bit hardware. The following R code is less
memory-hungry than dist, but returns the whole matrix. It uses about 1.2 GB
on a 10,000-observation dataset, while dist runs into the 3 GB address space
limit on my machine. By default it computes squared Euclidean distance, but
the "func" argument can be passed another R function to get d[i,j] =
func(x[,i] - x[,j]). func receives the matrix of differences from column j
and must return one value per column. (Note that it works on columns rather
than rows.)

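function(x, func = function(y) colSums(y * y)) {
  # Computes d[i,j] = func(column i - column j).
  # func is given a matrix of column differences and returns one value per column.
  n <- ncol(x)
  # Preallocate as numeric so the first assignment does not coerce (and copy) d.
  d <- matrix(0, n, n)
  for (j in seq_len(n)) {
    # x[,j] is recycled down each column of x, so this subtracts column j from every column.
    d[, j] <- func(x - x[, j])
  }
  d
}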
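
For anyone who wants to check the function on a small case first, here is a
sketch. The name col_dist is made up for this example (the function above is
anonymous), and allocating d with matrix(0, n, n) is a small change so that d
is numeric from the start. The random test data is only illustrative.

```r
# col_dist: hypothetical name for the function above.
col_dist <- function(x, func = function(y) colSums(y * y)) {
  n <- ncol(x)
  d <- matrix(0, n, n)
  for (j in seq_len(n)) d[, j] <- func(x - x[, j])
  d
}

set.seed(1)
x <- matrix(rnorm(5 * 8), nrow = 5)    # 8 observations stored as columns

# Default func: squared Euclidean distance. dist works on rows, hence t(x).
ok_euclid <- all.equal(col_dist(x), unname(as.matrix(dist(t(x)))^2))

# A different func: Manhattan distance.
ok_manhattan <- all.equal(
  col_dist(x, function(y) colSums(abs(y))),
  unname(as.matrix(dist(t(x), method = "manhattan")))
)
```

Both comparisons should come out TRUE, which at least confirms the
column-wise convention and the way "func" is applied.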

Reid Huntsinger

-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Edouard Henrion
Sent: Monday, February 28, 2005 12:47 PM
To: r-help at stat.math.ethz.ch
Subject: [R] memory problem with mac os X


Dear list,

I am using R 2.0.1 on a dual-processor 2.5 GHz G5 with 2 GB RAM (Mac OS X
10.3.8).

I'm trying to calculate an object of type "dist". I am getting the
following memory error:

*** malloc: vm_allocate(size=1295929344) failed (error code=3)
*** malloc[25960]: error: Can't allocate region
Error: cannot allocate vector of size 1265554 Kb

When I run top in the terminal, I can see that this amount has already
been allocated... It seems that R tries to allocate the memory twice.
Does anybody have any advice about this?

Thanks,

Edouard Henrion

______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html



