[R-sig-eco] gls-crossvalidation
Pinaud David
pinaud at cebc.cnrs.fr
Thu Aug 26 11:58:05 CEST 2010
Dear Liliana,
What is "records"? Count data? You seem to have a zero-inflated
distribution (a lot of zeros) with overdispersion. We had the same
problem and tried to solve it by using a zero-inflated (ZI) Poisson
distribution rather than log(counts + 1). The catch is that the
variance of this kind of distribution increases with the mean, so you
can get very large values in prediction... Maybe have a look at the
packages "pscl" and "ZIGP".
HTH
David
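
A possible contributor to the large errors in the quoted loop: predict()
on a model fitted to log(records + 1) returns values on the log scale,
while rec_obs holds raw counts, so the two are not directly comparable.
A small sketch of the effect follows. The data are simulated and lm()
stands in for gls(); both are assumptions made to keep the example
self-contained.

```r
# Simulated counts with one predictor (hypothetical data)
set.seed(42)
n <- 300
d <- data.frame(x = rnorm(n))
d$records <- rpois(n, exp(1 + 0.7 * d$x))

# Hold out 10% of the rows for testing
test_idx <- sample(seq_len(n), size = round(0.1 * n))
train <- d[-test_idx, ]
test  <- d[test_idx, ]

# lm() used as a stand-in for gls(); the response is log(records + 1)
fit <- lm(log(records + 1) ~ x, data = train)
pred_log <- predict(fit, newdata = test)   # predictions on the log scale

# Mixed scales: log-scale predictions against raw counts
mse_mixed <- mean((pred_log - test$records)^2)

# Consistent scales: both on the log scale
mse_log <- mean((pred_log - log(test$records + 1))^2)

# Consistent scales: both on the count scale (naive back-transform,
# which generally underestimates the mean count)
pred_count <- exp(pred_log) - 1
mse_count  <- mean((pred_count - test$records)^2)

print(c(mixed = mse_mixed, log = mse_log, count = mse_count))
```

Whichever scale is chosen, predictions and observations need to be on
the same one before the squared differences are averaged.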
On 26/08/2010 11:29, Claudia liliana Ballesteros Mejia wrote:
> Dear list,
>
> I'm trying to fit a gls model with a spatial component and I want to validate my results with cross-validation. I wrote the code below, which returns the mean square error at the end. It should supposedly work, but I'm getting huge numbers as results (ranging between 3 and 2000). Perhaps I should mention that my data have lots of zeros.
> Can somebody tell me what's wrong?
> This is the code I'm using for the crossvalidation.
>
> m_err2.vect<- vector()
> for(j in 1:10)
> {
> print(j)
> select.rec<- sample(1:nrow(data.dmi), 0.9*nrow(data.dmi))
> train.rec<- data.rec[select.rec,] #Selecting 90% of the data for training purpose
> test.rec<- data.rec[-select.rec,] #Selecting 10% (remaining) for testing purpose
> gls.rec<- gls (log(records+1)~roads+pop+conflict+airport+rails+PA+pristine+tur_plac, data = train.rec,correlation=corSpher(form=~X_Mol + Y_Mol, nugget=TRUE), na.action=na.omit)
> #Create fitted values using test.dmi data
> rec_pred<- predict(gls.rec, test.rec)
> rec_obs<-test.rec[,"records"]
> # Get the prediction error = Mean Square Error (MSE)= 1/n
> m_err2<- t(rec_pred - rec_obs)%*%(rec_pred - rec_obs)/nrow(test.rec)
> m_err2.vect<- c(m_err2.vect, m_err2)
> }
>
> m_err2.vect
>
> [1] 155.68777 380.22485 121.41826 1188.19114 292.95930 40.00558 253.04283 13.58491 1239.02019 149.31290
>
> mean(m_err2.vect)
> [1] 383.3448
>
>
> Thanks a lot in advance, any suggestion would be very much appreciated.
>
> Cheers,
>
> Liliana.
>
> __________ Information from ESET Mail Security, version of virus signature database 5397 (20100825) __________
>
> The message was checked by ESET Mail Security.
> http://www.eset.com
>
>
>
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
--
***************************************************
David PINAUD
Ingénieur de Recherche "Analyses spatiales"
Centre d'Etudes Biologiques de Chizé - CNRS UPR1934
79360 Villiers-en-Bois, France
poste 485
Tel: +33 (0)5.49.09.35.58
Fax: +33 (0)5.49.09.65.26
http://www.cebc.cnrs.fr/
***************************************************