[R-es] increasing memory size beyond 4 GB

Thu Mar 18 11:31:03 CET 2010

Hi, how are you?

That is odd, because prediction hardly needs any resources. Anyway,
you have two relatively simple options.

1) The first may well not work: inside your loop, force calls to the
garbage collector with gc().

2) The second should work no matter what: instead of predicting on one
very large data set, split it by hand into several small ones, predict
"chunk by chunk", and stack the results back together in the original
row order.

Best regards,
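
As a rough sketch of both options combined (not tested against your data):
the helper below, predict_in_chunks(), is a hypothetical name of my own and
not part of e1071. It only relies on the generic predict(), so it should
accept your svm model as well. The demo uses lm() on the built-in mtcars
data purely as a stand-in so that it runs without any extra package, and
the chunk size is an arbitrary assumption you would need to tune.

```r
# Hypothetical helper: predict in blocks of rows so that only one block's
# intermediate objects exist at any time.
predict_in_chunks <- function(model, newdata, chunk_size = 50000L) {
  n <- nrow(newdata)
  if (n == 0L) return(predict(model, newdata))
  starts <- seq(1L, n, by = chunk_size)      # first row of each block
  pieces <- vector("list", length(starts))   # one slot per block
  for (k in seq_along(starts)) {
    rows <- starts[k]:min(starts[k] + chunk_size - 1L, n)
    pieces[[k]] <- predict(model, newdata[rows, , drop = FALSE])
    gc()  # option 1: ask R to release memory that is no longer referenced
  }
  # Stack the blocks in their original order (option 2).
  unlist(pieces, use.names = FALSE)
}

# Stand-in demo: a linear model instead of an svm, built-in data.
fit <- lm(mpg ~ wt + hp, data = mtcars)
in_chunks <- predict_in_chunks(fit, mtcars, chunk_size = 10L)
all_at_once <- unname(predict(fit, mtcars))
stopifnot(isTRUE(all.equal(in_chunks, all_at_once)))
```

In your loop you would pass the fitted model and the data frame read from
each input file to this helper in place of the single predict() call.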

Carlos J. Gil Bellosta
http://www.datanalytics.com

On 18 March 2010 at 11:22, Víctor Rodríguez Galiano
<luxorvrg at hotmail.com> wrote:
>
> The error occurs when making the prediction.
>
>> Date: Thu, 18 Mar 2010 11:19:12 +0100
>> Subject: Re: [R-es] increasing memory size beyond 4 GB
>> From: cgb at datanalytics.com
>> To: luxorvrg at hotmail.com
>> CC: r-help-es at r-project.org
>>
>> Hi, how are you?
>>
>> Do you know exactly where the error occurs? Is it when building the
>> model or when making the prediction?
>>
>> If it is the former, how large is your training set?
>>
>> Best regards,
>>
>> Carlos J. Gil Bellosta
>> http://www.datanalytics.com
>>
>>
>>
>>
>> On 18 March 2010 at 10:43, Víctor Rodríguez Galiano
>> <luxorvrg at hotmail.com> wrote:
>> >
>> > Hello again,
>> >
>> > This is my session information:
>> >
>> > R version 2.10.1 (2009-12-14)
>> > i386-pc-mingw32
>> >
>> > locale:
>> > [1] LC_COLLATE=Spanish_Spain.1252  LC_CTYPE=Spanish_Spain.1252
>> > [3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
>> > [5] LC_TIME=Spanish_Spain.1252
>> >
>> > attached base packages:
>> > [1] stats     graphics  grDevices utils     datasets  methods   base
>> >
>> >
>> > What I am trying to do is classify data stored in text files; each file is between 10 MB and 90 MB. In one text file (input) I list the files to be read for prediction, and in the output file I list the names of the output files that will hold the classifications. As I mentioned before, I have no problem with the randomForest classifier, but I do with support vector machines. Working with the training and test data causes no problems. The problem arises when I try to classify new examples.
>> >
>> >
>> >
>> > I am also including the code I use for classification, in case it helps:
>> >
>> > # R script for running a support vector machine classification model and prediction for many segments/areas
>> > # (adapted from a random forest script, hence the name calibrate.rf for the svm model)
>> > # Need to run calibration only once for full model and then run prediction in a loop for different segments/areas/regions
>> > ###################################################################################################################
>> > # Part 1: calibration
>> >
>> > library(e1071)
>> >
>> > #calibration step
>> > calibrate<-read.table("calibration.txt", header=TRUE)
>> > calibrate$calibration<-as.factor(calibrate$calibration)
>> > calibrate.rf<-svm(calibration~B1+B14+B15+B16+B17+B18+B19+B20+B21+B24+B25+B26+B51+B52+B53+B54+B55+B56+B57+B58+B59+B60+B61+B62, data=calibrate, cost=6.8, gamma=0.08)
>> > ####################################################################################################################
>> >
>> > ####################################################################################################################
>> > # Part 2: Automated Prediction
>> >
>> > #R automated prediction step for support vector
>> > #Note: first you need to calibrate the model separately and then run this script for different image segments/areas
>> > #Note: this script requires two input text files called input.txt and output.txt
>> > #The first line of input.txt gives the header, the second line the number of input segments (eg. bands and elevation values) and then the later lines list the names of the input segments with txt extension
>> > #The first line of output.txt gives the header, the second line the number of output segments which is predicted by the classifier and then the later lines list the names of the output predicted segments with txt extension
>> >
>> > # reading the parameter files
>> > input<-read.table("input.txt", header=TRUE)
>> > output<-read.table("output.txt", header=TRUE)
>> >
>> > # no_elements for 1 and 2 should be the same
>> > no_elements1<-as.integer(toString(input$para1[1]))
>> > # para2 is a column of output.txt, not of input.txt
>> > no_elements2<-as.integer(toString(output$para2[1]))
>> >
>> > # increasing the memory limit to 4000 MB (about 4 GB); size is given in MB
>> > memory.limit(size=4000)
>> >
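>> > for (i in seq_len(no_elements1)) {
>> >  # row i+1 of each parameter file holds the i-th file name
>> >  input_name<-toString(input$para1[i+1])
>> >  # named newdata so the data frame does not share a name with predict()
>> >  newdata<-read.table(input_name, header=TRUE)
>> >  predValues<-predict(calibrate.rf, newdata)
>> >  # factor predictions are stored as their integer level codes
>> >  predValues<-as.numeric(predValues)
>> >  output_name<-toString(output$para2[i+1])
>> >  write.table(predValues, output_name, row.names=FALSE, col.names=output_name)
>> > }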
>> >
>> > This is what the str() function shows:
>> >
>> >> str(output)
>> > 'data.frame':   2 obs. of  1 variable:
>> >  $ para2: Factor w/ 2 levels "1","2H_map.txt": 1 2
>> >> str(input)
>> > 'data.frame':   2 obs. of  1 variable:
>> >  $ para1: Factor w/ 2 levels "1","2H.txt": 1 2
>> >
>> >
>> >
>> >
>> >
>> >
>> > _______________________________________________
>> > R-help-es mailing list
>> > R-help-es en r-project.org
>> > https://stat.ethz.ch/mailman/listinfo/r-help-es
>> >
>> >
>