[R] cv.glm {boot}

Liaw, Andy andy_liaw at merck.com
Tue Mar 15 17:08:05 CET 2005


> From: Trevor Wiens
> 
> On Tue, 15 Mar 2005 07:05:49 +0000 (GMT)
> Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
> 
> > 
> > Cross-validation assumes exchangeability of units.  You can easily write
> > your own code (lots of examples in MASS), but first you would need to
> > prove the validity of what you are attempting.  For example, dropping
> > chunks in the middle of a time series is not valid unless your prediction
> > somehow takes the temporal structure into account (and glm does not).
> > 
> 
> Yes, I'm aware of that and I do have a number of predictors 
> which vary with time (from year to year, such as precipitation 
> or properly timed vegetation indices from each year....) so 
> that isn't my problem. My spatial blocking is also valid 
> (distinct partitions of the study area). I'm also aware of 
> the problems of spatial autocorrelation and have taken some 
> measures to deal with that. I am, however, rather new at R and 
> not a statistician, so I am heavily reliant on books such as 
> Hosmer and Lemeshow or Manly (Resource Selection by Animals) 
> on procedure. Unfortunately, they are not S-plus or R oriented, 
> so I have some difficulty translating those ideas to R.
> 
> You mention lots of examples in MASS regarding 
> cross-validation, but I can't find them. Perhaps I'm looking 
> in the wrong spot. I've done help.search('validation'), .... 
> and found nothing that seemed obviously applicable to my 
> problem. I suppose I should pick up a copy of your books 
> which would probably be very helpful. However, if it isn't 
> too much trouble, I would really appreciate a bit more direct help. 

`MASS' _is_ a book, the supporting software of which contains a `scripts'
subdirectory that has R versions of the code used in the book, including
code for CV.
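
One way to find those scripts in an installed copy of MASS is sketched
below.  It assumes the package is installed and that the `scripts'
directory sits at the top level of the installed package; the search
pattern is just a guess at how the code refers to cross-validation, and
file names may differ between versions.

```r
# Locate the 'scripts' directory shipped with the installed MASS package.
# system.file() returns "" if the package or directory is not found.
scripts_dir <- system.file("scripts", package = "MASS")

# List whatever script files are there.
script_files <- list.files(scripts_dir, full.names = TRUE)
print(basename(script_files))

# Search each file for lines mentioning cross-validation.
# The pattern is an assumption, not something taken from the book.
hits <- lapply(script_files, function(f) {
  grep("[Cc]ross-valid|\\bCV\\b", readLines(f, warn = FALSE),
       value = TRUE, useBytes = TRUE)
})
names(hits) <- basename(script_files)

# Keep only the files with at least one matching line.
hits <- hits[lengths(hits) > 0]
print(names(hits))
```

The names in `hits' then show which chapter scripts to open.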

Andy

 
> I assumed I would do something like this (in 
> this example basp = Baird's Sparrow presence or absence)
> 
> train <- birddata[birddata$recordyear != 2000, ]
> test <- birddata[birddata$recordyear == 2000, ]
> train.glm <- glm(basp ~ elev + slope + precip + precip_1 ..., 
> data=train, family=binomial)
> pred <- predict(train.glm, newdata=test, type='response')
> actual <- test$basp
> what happens next??
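
A possible next step is to repeat the hold-out for each year in turn
and compare the out-of-sample predictions with the observed values.
The sketch below is only an illustration: the data are simulated
stand-ins for `birddata', the two-predictor formula and the 0.5 cutoff
are arbitrary assumptions, and holding out whole years is only
meaningful if the years can be treated as exchangeable, which is the
point raised above.

```r
# Simulated stand-in for 'birddata' (made-up values, for illustration only).
set.seed(1)
n <- 600
birddata <- data.frame(
  recordyear = sample(1998:2002, n, replace = TRUE),
  elev       = rnorm(n),
  precip     = rnorm(n)
)
birddata$basp <- rbinom(n, 1,
                        plogis(-0.5 + birddata$elev - 0.8 * birddata$precip))

# Hold out one year at a time: fit on the other years,
# then predict the held-out year.
years   <- sort(unique(birddata$recordyear))
cv_pred <- numeric(n)
for (yr in years) {
  held_out <- birddata$recordyear == yr
  fit <- glm(basp ~ elev + precip, data = birddata[!held_out, ],
             family = binomial)
  cv_pred[held_out] <- predict(fit, newdata = birddata[held_out, ],
                               type = "response")
}

# Compare out-of-sample predictions with observed presence/absence.
actual    <- birddata$basp
confusion <- table(actual = actual, predicted = as.integer(cv_pred > 0.5))
misclass  <- mean((cv_pred > 0.5) != actual)  # 0.5 cutoff is arbitrary
brier     <- mean((actual - cv_pred)^2)       # squared error of probabilities

print(confusion)
cat("misclassification rate:", misclass, "\n")
cat("Brier score:", brier, "\n")
```

With a single held-out year, as in the quoted code, the loop reduces to
one fit, and `pred' and `actual' play the roles of `cv_pred[held_out]'
and `actual[held_out]'.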
> 
> Thanks in advance.
> 
> T
> -- 
> Trevor Wiens 
> twiens at interbaun.com
> 
> The significant problems that we face cannot be solved at the same 
> level of thinking we were at when we created them. 
> (Albert Einstein)
> 



