[R] problems with large data II
Spencer Graves
spencer.graves at pdf.com
Fri Jan 9 15:58:56 CET 2004
If you can't get more memory, you could read portions of the file
using "scan(..., skip = ..., nlines = ...)" and then compress the data
somehow to reduce the size of the object you pass to "randomForest".
You could run "scan" like this in a loop, processing, say, 10% of the
data file on each pass.
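For example, something like this (an untested sketch: the file name
"mydata.txt", a whitespace-delimited layout with no header row, and a
500-row chunk are all assumptions to adapt to your data):

n.vars <- 2000                    # columns in the file
chunk  <- 500                     # rows per pass (10% of 5000)
for (i in seq(0, 4500, by = chunk)) {
    x <- scan("mydata.txt", what = integer(0),
              skip = i, nlines = chunk, quiet = TRUE)
    x <- matrix(x, ncol = n.vars, byrow = TRUE)
    ## ... compress or summarize x here before the next pass ...
}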
Alternatively, you could pass each portion to "randomForest" and
compare the results of the several calls to "randomForest". This would
give you a type of cross-validation, which might be a wise thing to do
anyway.
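A rough sketch of that idea, under the same assumptions as above (it
also treats every column as numeric and supposes the class label sits
in column 1):

library(randomForest)
n.vars <- 2000; chunk <- 500
fits <- vector("list", 10)
for (i in 1:10) {
    x <- matrix(scan("mydata.txt", what = integer(0),
                     skip = (i - 1) * chunk, nlines = chunk,
                     quiet = TRUE),
                ncol = n.vars, byrow = TRUE)
    ## fit a forest to this 10% of the rows
    fits[[i]] <- randomForest(x = x[, -1], y = factor(x[, 1]))
}
## compare the out-of-bag error rates across the ten fits
sapply(fits, function(f) f$err.rate[f$ntree, "OOB"])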
Hope this helps.
spencer graves
PaTa PaTaS wrote:
>Thank you all for your help. The problem is not only with reading the data (5000 cases times 2000 integer variables, imported either from SPSS or a TXT file) into my R 1.8.0, but also with the procedure I would like to use, "randomForest" from the library "randomForest". It is not possible to run it with such a data set (because of an insufficient-memory exception). Moreover, my data has factors with more than 32 levels, which causes another error.
>
>Could you suggest any solution for my problem? Thank you a lot.