[R] Applying user function over a large matrix
Ray Brownrigg
Ray.Brownrigg at mcs.vuw.ac.nz
Wed Apr 30 00:18:50 CEST 2008
In addition to Tony's suggestion, have a look at the following timings. I
suspect the difference arises because the call to apply() duplicates your
1.5GB matrix, whereas the for loop doesn't [I stand to be corrected here].
> x <- matrix(runif(210000), 21)
> unix.time({res <- numeric(ncol(x)); for(i in 1:length(res)) res[i] <- sum(x[, i])})
   user  system elapsed
  0.079   0.000   0.079
> unix.time(apply(x, 2, sum))
   user  system elapsed
   0.10    0.01    0.11
> x <- matrix(runif(2100000), 21)
> unix.time({res <- numeric(ncol(x)); for(i in 1:length(res)) res[i] <- sum(x[, i])})
   user  system elapsed
  0.791   0.010   0.801
> unix.time(apply(x, 2, sum))
   user  system elapsed
  1.096   0.011   1.107
> x <- matrix(runif(21000000), 21)
> unix.time({res <- numeric(ncol(x)); for(i in 1:length(res)) res[i] <- sum(x[, i])})
   user  system elapsed
  7.825   0.011   7.840
> unix.time(apply(x, 2, sum))
   user  system elapsed
 15.431   0.142  15.592
Also, preliminary checking using the top utility shows the for loop requires
just over half the memory of the apply() call. This is on a NetBSD system
with 2GB memory.
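
For anyone who wants to repeat the comparison, here is a minimal sketch of
the same idea as a script. The helper name col_loop and the reduced matrix
size are illustrative assumptions, not part of the session above, and
system.time() is used where the session used its older alias unix.time().
Timings will differ by machine and R version, so none are shown.

```r
# Sketch: column-wise reduction with a preallocated for loop vs. apply().
# The matrix is deliberately small; col_loop is a hypothetical helper name.
set.seed(1)
x <- matrix(runif(21 * 10000), nrow = 21)

# Preallocate the result vector, then fill it one column at a time.
col_loop <- function(m, f) {
  res <- numeric(ncol(m))
  for (i in seq_len(ncol(m))) res[i] <- f(m[, i])
  res
}

t_loop  <- system.time(r_loop  <- col_loop(x, sum))
t_apply <- system.time(r_apply <- apply(x, 2, sum))

# Both approaches should give the same answer; colSums() is the built-in
# for this particular reduction and serves as an independent check.
stopifnot(isTRUE(all.equal(r_loop, r_apply)),
          isTRUE(all.equal(r_loop, colSums(x))))

print(t_loop)
print(t_apply)
```

Passing your own function in place of sum (for example the foo from the
original question) should work the same way, provided it returns a single
number per column; if it returns a vector, res would need to be a matrix.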
HTH,
Ray Brownrigg
On Wed, 30 Apr 2008, Tony Plate wrote:
> It's quite possible that much of the time spent in loess() is setting up
> the data (i.e., the formula, terms, model.frame, etc.), and that much of
> that is repeated identically for each call to loess(). I would suggest
> looking at the code of loess() and work out what arguments it is calling
> simpleLoess() with, and then try calling stats:::simpleLoess() directly.
> (Of course you have to be careful with this because this is not using the
> published API).
>
> -- Tony Plate
>
> Sudipta Sarkar wrote:
> > Respected R experts,
> > I am trying to apply a user function that basically calls and
> > applies the R loess function from the stats package over each time
> > series. I have a large matrix of size 21 X 9000000 and I need
> > to apply the loess for each column and hence I have
> > implemented this separate user function that applies loess
> > over each column and I am calling this function foo as follows:
> > xc<-apply(t,2,foo) where t is my 21 X 9000000 matrix and
> > loess. This is turning out to be a very slow process and I
> > need to repeat this step for 25-30 such large matrix chunks.
> > Is there any trick I can use to make this work faster?
> > Any help will be deeply appreciated.
> > Regards
> >
> >
> > Sudipta Sarkar PhD
> > Senior Analyst/Scientist
> > Lanworth Inc. (Formerly Forest One Inc.)
> > 300 Park Blvd., Ste 425
> > Itasca, IL
> > Ph: 630-250-0468
> >