[R-es] increasing memory size beyond 4 GB

Thu Mar 18 11:42:55 CET 2010

A third option, following on from Carlos's suggestions, is to subsample
the training set and analyse how the results vary with the sample size.
It is quite likely that you do not need all of the data to obtain a
reliable prediction model.
This kind of situation is exactly what sampling is meant for.
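
As a rough sketch of that idea (not a recipe for the actual data): the code below fits an SVM on random subsamples of increasing size and records the accuracy on a held-out set. It assumes the e1071 package from the quoted script; the built-in iris data stands in for the calibration file, and the names accuracy_at, sizes and reps are purely illustrative.

```r
library(e1071)  # same package as in the quoted script

set.seed(42)
# Stand-in data (assumption): iris replaces the poster's calibration.txt.
test_idx <- sample(nrow(iris), 50)
train <- iris[-test_idx, ]
test  <- iris[test_idx, ]

# Fit on a random subsample of `size` training rows; return test accuracy.
accuracy_at <- function(size) {
  sub <- train[sample(nrow(train), size), ]
  fit <- svm(Species ~ ., data = sub)
  mean(predict(fit, test) == test$Species)
}

sizes <- c(30, 60, 100)
reps  <- 10  # repeat each size to see the spread, not just the mean

# Matrix with one row per repetition and one column per sample size
acc <- sapply(sizes, function(s) replicate(reps, accuracy_at(s)))
colnames(acc) <- sizes

# Mean and standard deviation of the accuracy for each sample size
curve <- data.frame(size = sizes,
                    mean_acc = colMeans(acc),
                    sd_acc = apply(acc, 2, sd))
print(curve)
```

If the mean accuracy flattens out and its spread shrinks well before the full sample size, a smaller training set is probably enough.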

Regards, Olivier
--  
____________________________________

Olivier G. Nuñez
Email: onunez en iberstat.es
Tel : +34 663 03 69 09
Web: http://www.iberstat.es

____________________________________

On 18/03/2010, at 11:22, Víctor Rodríguez Galiano wrote:

>
> The error occurs when making the prediction.
>
>> Date: Thu, 18 Mar 2010 11:19:12 +0100
>> Subject: Re: [R-es] increasing memory size beyond 4 GB
>> From: cgb en datanalytics.com
>> To: luxorvrg en hotmail.com
>> CC: r-help-es en r-project.org
>>
>> Hi, how are you?
>>
>> Do you know exactly where the error occurs? Is it when building the
>> model or when making the prediction?
>>
>> If it is the former, how large is your training set?
>>
>> Regards,
>>
>> Carlos J. Gil Bellosta
>> http://www.datanalytics.com
>>
>>
>>
>>
>> On 18 March 2010 at 10:43, Víctor Rodríguez Galiano
>> <luxorvrg en hotmail.com> wrote:
>>>
>>> Hello again,
>>>
>>> This is my session information:
>>>
>>> R version 2.10.1 (2009-12-14)
>>> i386-pc-mingw32
>>>
>>> locale:
>>> [1] LC_COLLATE=Spanish_Spain.1252  LC_CTYPE=Spanish_Spain.1252
>>> [3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
>>> [5] LC_TIME=Spanish_Spain.1252
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>>
>>> What I am trying to do is classify data stored in text files;
>>> each of these files is between 10 MB and 90 MB. In one text file
>>> (input) I list the files to be read for prediction, and in the
>>> output file the names of the output files that will hold the
>>> classifications. As I mentioned before, I have no problem with the
>>> randomForest classifier, but I do with the support vector machine.
>>> There is no problem when I work with the training and test data.
>>> The problem arises when I try to classify new examples.
>>>
>>>
>>>
>>> I am also attaching the code I use for the classification, in case
>>> it helps:
>>>
>>> # R script for running a support vector machine (SVM) classification
>>> # model and prediction for many segments/areas.
>>> # Run the calibration only once for the full model, then run the
>>> # prediction in a loop for the different segments/areas/regions.
>>> ####################################################################
>>> # Part 1: calibration
>>>
>>> library(e1071)
>>>
>>> #calibration step
>>> calibrate<-read.table("calibration.txt", header=TRUE)
>>> calibrate$calibration<-as.factor(calibrate$calibration)
>>> # Despite the ".rf" suffix, this object holds the SVM fit.
>>> calibrate.rf <- svm(calibration ~ B1 + B14 + B15 + B16 + B17 + B18 +
>>>                       B19 + B20 + B21 + B24 + B25 + B26 + B51 + B52 +
>>>                       B53 + B54 + B55 + B56 + B57 + B58 + B59 + B60 +
>>>                       B61 + B62,
>>>                     data = calibrate, cost = 6.8, gamma = 0.08)
>>> ####################################################################
>>>
>>> ####################################################################
>>> # Part 2: Automated Prediction
>>>
>>> # R automated prediction step for the support vector machine.
>>> # Note: first calibrate the model separately, then run this script
>>> # for the different image segments/areas.
>>> # Note: this script requires two input text files called input.txt
>>> # and output.txt.
>>> # The first line of input.txt gives the header, the second line the
>>> # number of input segments (e.g. bands and elevation values), and
>>> # the later lines list the names of the input segments with txt
>>> # extension.
>>> # The first line of output.txt gives the header, the second line the
>>> # number of output segments predicted by the classifier, and the
>>> # later lines list the names of the output predicted segments with
>>> # txt extension.
>>>
>>> # reading the parameter files
>>> input<-read.table("input.txt", header=TRUE)
>>> output<-read.table("output.txt", header=TRUE)
>>>
>>> # no_elements1 and no_elements2 should be the same
>>> no_elements1 <- as.integer(as.character(input$para1[1]))
>>> # read the count from output.txt (input.txt has no para2 column)
>>> no_elements2 <- as.integer(as.character(output$para2[1]))
>>>
>>> # raising the memory limit to 4000 MB (about 4 GB; Windows only)
>>> memory.limit(size=4000)
>>>
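>>> for (i in seq_len(no_elements1)) {
>>>   # row 1 holds the count, so the i-th file name is in row i + 1
>>>   input_name <- as.character(input$para1[i + 1])
>>>   # named "segment" so that it does not mask the predict() function
>>>   segment <- read.table(input_name, header=TRUE)
>>>   predValues <- predict(calibrate.rf, segment)
>>>   # store the predicted classes as integer level codes
>>>   predValues <- as.numeric(predValues)
>>>   output_name <- as.character(output$para2[i + 1])
>>>   write.table(predValues, output_name, row.names=FALSE,
>>>               col.names=output_name)
>>> }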
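
Since the error is reported at the prediction step, one thing worth trying is to predict each segment in blocks of rows rather than all at once. The sketch below only illustrates this. It assumes the e1071 package, uses the built-in iris data as a stand-in for a segment file, and the helper predict_in_chunks and its chunk_size argument are hypothetical names. Whether it removes the memory error in this particular case cannot be told from the thread.

```r
library(e1071)

# Predict `newdata` in blocks of `chunk_size` rows, so that the
# intermediate objects built by predict() cover one block at a time.
# Returns the predicted class labels as a character vector.
predict_in_chunks <- function(model, newdata, chunk_size = 50000) {
  starts <- seq(1, nrow(newdata), by = chunk_size)
  parts <- lapply(starts, function(s) {
    rows <- s:min(s + chunk_size - 1, nrow(newdata))
    as.character(predict(model, newdata[rows, , drop = FALSE]))
  })
  unlist(parts, use.names = FALSE)
}

# Stand-in example (assumption): iris instead of the segment files.
fit <- svm(Species ~ ., data = iris)
chunked <- predict_in_chunks(fit, iris, chunk_size = 40)
whole <- predict(fit, iris)

# The block-wise predictions should match the all-at-once predictions.
print(identical(chunked, as.character(whole)))
```

In the loop above, the call to predict() could be replaced by a call to such a helper. Reading each file in pieces with the skip and nrows arguments of read.table would reduce the footprint further.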
>>>
>>> This is what the str() function shows:
>>>
>>>> str(output)
>>> 'data.frame':   2 obs. of  1 variable:
>>>  $ para2: Factor w/ 2 levels "1","2H_map.txt": 1 2
>>>> str(input)
>>> 'data.frame':   2 obs. of  1 variable:
>>>  $ para1: Factor w/ 2 levels "1","2H.txt": 1 2
>>>
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> R-help-es mailing list
>>> R-help-es en r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-help-es
>>>
>>>