[R-sig-eco] gls-crossvalidation

Pinaud David pinaud at cebc.cnrs.fr
Thu Aug 26 11:58:05 CEST 2010


Dear Liliana,
What is "records"? Count data? It seems that you have a zero-inflated 
distribution (a lot of zeros) with overdispersion. We had the same 
problem and tried to solve it by using a zero-inflated (ZI) Poisson 
distribution rather than log(counts + 1). The catch is that in this kind 
of distribution the variance increases with the mean, so you can get 
very large predicted values... Maybe have a look at the packages "pscl" 
and "ZIGP".
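
To make the ZI Poisson suggestion concrete, here is a minimal sketch of a single 90%/10% hold-out split using zeroinfl() from "pscl". Everything in it is an assumption made for illustration: the data are simulated, the data frame "sim" and its two predictors only borrow names from the question, and the zero-inflation part is intercept-only. As far as I know zeroinfl() has no spatial correlation structure, so this does not replace the corSpher term of the gls model. The predictions are requested on the count scale, so they are compared with the observed counts on the same scale.

```r
# Illustrative sketch only: simulated data and hypothetical variable names.
library(pscl)  # provides zeroinfl()

set.seed(1)
n <- 500
sim <- data.frame(roads = rnorm(n), pop = rnorm(n))

# Structural zeros with probability 0.4, otherwise Poisson counts
is_zero <- rbinom(n, 1, 0.4)
counts  <- rpois(n, exp(0.5 + 0.6 * sim$roads + 0.3 * sim$pop))
sim$records <- ifelse(is_zero == 1, 0L, counts)

# 90% of the rows for training, the remaining 10% for testing
train_idx <- sample(seq_len(n), size = floor(0.9 * n))
train <- sim[train_idx, ]
test  <- sim[-train_idx, ]

# Count part: roads + pop; zero-inflation part (after "|"): intercept only
zip_fit <- zeroinfl(records ~ roads + pop | 1, data = train, dist = "poisson")

# type = "response" returns expected counts, i.e. the same scale as test$records
pred <- predict(zip_fit, newdata = test, type = "response")

# Mean square error on the count scale
mse <- mean((pred - test$records)^2)
mse
```

Wrapping the split, fit and MSE steps in a loop would give repeated hold-out estimates, as in the code quoted below.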
HTH
David

On 26/08/2010 11:29, Claudia liliana Ballesteros Mejia wrote:
> Dear list,
>
> I'm trying to fit a gls model with a spatial component, and I want to validate my results with cross-validation. I wrote code that returns the mean square error at the end, and supposedly it should work, but I'm getting huge numbers as results (ranging from 3 to 2000). Perhaps I should mention that my data have lots of zeros.
> Can somebody tell me what's wrong?
> This is the code I'm using for the cross-validation.
>
> m_err2.vect<- vector()
> for(j in 1:10)
>     {
>     print(j)
>     select.rec<- sample(1:nrow(data.dmi), 0.9*nrow(data.dmi))
>     train.rec<- data.rec[select.rec,]   #Selecting 90% of the data for training purpose
>     test.rec<-  data.rec[-select.rec,]   #Selecting 10% (remaining) for testing purpose
>     gls.rec<- gls (log(records+1)~roads+pop+conflict+airport+rails+PA+pristine+tur_plac, data = train.rec,correlation=corSpher(form=~X_Mol + Y_Mol, nugget=TRUE), na.action=na.omit)
>     #Create fitted values using test.dmi data
>     rec_pred<- predict(gls.rec, test.rec)
>     rec_obs<-test.rec[,"records"]
>     # Get the prediction error = Mean Square Error (MSE)= 1/n
>     m_err2<- t(rec_pred - rec_obs)%*%(rec_pred - rec_obs)/nrow(test.rec)
>     m_err2.vect<- c(m_err2.vect, m_err2)
> }
>
>   m_err2.vect
>
>   [1]  155.68777  380.22485  121.41826 1188.19114  292.95930   40.00558  253.04283   13.58491 1239.02019  149.31290
>
> mean(m_err2.vect)
> [1] 383.3448
>
>
> Thanks a lot in advance, any suggestion would be very much appreciated.
>
> Cheers,
>
> Liliana.
>
> __________ Information from ESET Mail Security, version of virus signature database 5397 (20100825) __________
>
> The message was checked by ESET Mail Security.
> http://www.eset.com
>
>
>
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

-- 
***************************************************
David PINAUD
Ingénieur de Recherche "Analyses spatiales"

Centre d'Etudes Biologiques de Chizé - CNRS UPR1934
79360 Villiers-en-Bois, France
poste 485
Tel: +33 (0)5.49.09.35.58
Fax: +33 (0)5.49.09.65.26
http://www.cebc.cnrs.fr/

***************************************************



