[R] randomForest: too many elements specified?

Liaw, Andy andy_liaw at merck.com
Fri Jan 21 19:12:01 CET 2011


I grepped for "n, n)" in all the R code of the package (current version),
and the only place it occurs is where the proximity matrix is created.
Can you run traceback() right after the error and see where it happens?
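
In an interactive session it is enough to type traceback() immediately after the error. Where that is not possible (a batch job, say), the call stack can be recorded at the moment the error is signalled. The following is only a sketch: make_prox() and fit_model() are hypothetical stand-ins, not functions from the randomForest package, and a negative dimension is used simply to force matrix() to fail.

```r
# Hypothetical stand-ins for package internals; only the call pattern
# matters here, these are NOT randomForest functions.
make_prox <- function(n) matrix(0, n, n)
fit_model <- function(n) make_prox(n)

# Record the call stack at the moment the error is signalled.
stack <- NULL
msg <- tryCatch(
  withCallingHandlers(
    fit_model(-1),  # a negative dimension forces matrix() to fail
    error = function(e) stack <<- sys.calls()
  ),
  error = function(e) conditionMessage(e)
)

# Deparse each frame; the failing matrix() call shows up in the list,
# preceded by the functions that led to it.
frames <- vapply(stack,
                 function(cl) paste(deparse(cl), collapse = " "),
                 character(1))
print(frames)
print(msg)
```

The frames listed just before matrix(0, n, n) show which function asked for the n by n matrix.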

You should seriously consider upgrading R and the packages...
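
For scale, here is a back-of-the-envelope check of what an n by n double matrix would need for the two row counts mentioned in this thread. It assumes n is exactly 90000 for the current report (the post only says ~90K), and it compares the element count with .Machine$integer.max (2^31 - 1), which, as far as I understand, is the most elements a single vector or matrix can hold in R versions of this vintage. If that is what triggers the message, the amount of RAM on the machine would not matter.

```r
# Row counts taken from the two reports in this thread
# (90000 is an assumed round figure for "~90K rows").
n <- c(current = 90000, report_2005 = 169453)

elements <- n^2                   # stored as doubles, so no integer overflow
limit    <- .Machine$integer.max  # 2^31 - 1
gib      <- elements * 8 / 2^30   # 8 bytes per double, expressed in GiB

print(data.frame(n, elements,
                 over_limit = elements > limit,
                 GiB = round(gib, 1)))
```

Both cases exceed the element limit, and the second row reproduces the "nearly 214 GB" figure quoted in the 2005 reply.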

Andy 

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Czerminski, Ryszard
> Sent: Thursday, January 20, 2011 1:08 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] randomForest: too many elements specified?
> 
> I am getting "Error in matrix(0, n, n) : too many elements specified"
> while building a randomForest model, which looks like a memory
> allocation error.
> Software versions are: randomForest 4.5-25, R version 2.7.1
> 
> The dataset is big (~90K rows, ~200 columns), but this is on a big
> machine (~120G RAM), and I call randomForest like this:
> randomForest(x,y), i.e. in supervised mode and without requesting the
> proximity matrix. The answer from Andy Liaw to an email reporting the
> same problem in 2005 (see below) is therefore probably not directly
> applicable; still, it looks like the data set is too big for this
> machine.
> 
> How does memory usage in randomForest scale with dataset size?
> Is there a way to build a global rf model with a dataset of this size?
> 
> Best regards,
> Ryszard
> 
> Ryszard Czerminski
> AstraZeneca Pharmaceuticals LP
> 35 Gatehouse Drive
> Waltham, MA 02451
> USA
> 781-839-4304
> ryszard.czerminski at astrazeneca.com
> 
> RE: [R] randomForest: too many element specified?
> Liaw, Andy
> Mon, 17 Jan 2005 05:56:28 -0800
> > From: luk
> >
> > When I run randomForest with a 169453x5 matrix, I get the
> > following message.
> >
> > Error in matrix(0, n, n) : matrix: too many elements specified
> >
> > Can you please advise me how to solve this problem?
> >
> > Thanks,
> >
> > Lu
> 
> 1.  When asking new questions, please don't reply to other posts.
> 
> 2.  When asking questions like these, please do show the commands you
> used.
> 
> My guess is that you asked for the proximity matrix, or are running
> unsupervised randomForest (by not providing a response vector).  This
> requires a couple of n by n matrices to be created (on top of other
> things), n being 169453 in this case.  To store a 169453 x 169453
> matrix in double precision, you need 169453^2 * 8 bytes, or nearly
> 214 GB of memory.  Even if you have that kind of hardware, I doubt
> you'll be able to make much sense out of the result.
> 
> Andy
> 