[R] How to do cross validation with glm?

Andra Isan andra_isan at yahoo.com
Wed Aug 24 19:19:53 CEST 2011


Hi,

Thanks for the reply. What I meant is that I would like to partition my data frame dat into training and test sets, and then evaluate the model's performance on the test set. I thought cross-validation was the natural way to see how prediction works on the hold-out data. Is there an example I can look at that shows how to do cross-validation and get the predictions for my data?

Thanks a lot,
Andra
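
As a rough illustration of the train/test evaluation asked about above, here is a minimal sketch. The data are simulated, and the column names y, x1 and x2, the logistic model and the 70/30 split are all assumptions made for the example, not anything taken from the original post.

```r
# Minimal sketch of hold-out (test set) validation for a glm.
# Assumption: `dat` is simulated stand-in data with a binary response `y`.
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, size = 1, prob = plogis(0.5 + dat$x1 - 0.5 * dat$x2))

# Randomly assign 70% of the rows to the training set, the rest to the test set.
train.idx <- sample(n, size = round(0.7 * n))
train <- dat[train.idx, ]
test  <- dat[-train.idx, ]

# Fit the model on the training rows only.
fit <- glm(y ~ x1 + x2, family = binomial, data = train)

# Predict on the held-out rows; predict.glm takes `newdata`, not `data`.
pred <- predict(fit, newdata = test, type = "response")

# Two simple performance summaries on the test set.
brier    <- mean((test$y - pred)^2)               # mean squared error of the probabilities
misclass <- mean((pred > 0.5) != (test$y == 1))   # error rate at a 0.5 cutoff
```

A single split like this gives one noisy estimate of test error; repeating it over several folds is what turns it into cross-validation.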

--- On Wed, 8/24/11, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:

> From: Prof Brian Ripley <ripley at stats.ox.ac.uk>
> Subject: Re: [R] How to do cross validation with glm?
> To: "Andra Isan" <andra_isan at yahoo.com>
> Cc: r-help at r-project.org
> Date: Wednesday, August 24, 2011, 10:11 AM
> What you describe is not cross-validation, so I am afraid we do not
> know what you mean. And cv.glm does 'prediction for the hold-out
> data' for you: you can read the code to see how it does so.
> 
> I suspect you mean you want to do validation on a test set, but that
> is not what you actually claim. There are lots of examples of this
> sort of thing in MASS (the book, scripts in the package).
> 
> On Wed, 24 Aug 2011, Andra Isan wrote:
> 
> > Hi All,
> > 
> > I have a model called glm.fit, fitted with glm, and my data frame
> > is dat. I used
> > 
> > pred = predict(glm.fit, newdata = dat, type="response")
> > 
> > to see how it predicts on my whole data, but obviously I have to do
> > cross-validation: train the model on one part of my data and
> > predict on the other part. So I searched and found the function
> > cv.glm in package boot, and tried to use it as:
> > 
> > cv.err = cv.glm(dat, glm.fit, cost, K=nrow(dat))$delta
> > 
> > but I am not sure how to get the predictions for the hold-out data.
> > Is there a better way in R to fit a model on training data and
> > test it on test data?
> > 
> > Thanks,
> > Andra
> > 
> 
> -- Brian D. Ripley,               ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>
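
For the cv.glm route mentioned in the thread, a hedged sketch of K-fold cross-validation with package boot might look like the following. The simulated data, the 0.5-cutoff cost function and K = 10 are assumptions chosen for illustration; the original post does not show how cost was defined.

```r
library(boot)  # provides cv.glm; boot is a recommended package shipped with R

# Assumption: simulated stand-in data with a binary response `y`.
set.seed(1)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, size = 1, prob = plogis(0.5 + dat$x1 - 0.5 * dat$x2))

# cv.glm refits the model on each training fold itself,
# so it is given a model fitted to the full data frame.
glm.fit <- glm(y ~ x1 + x2, family = binomial, data = dat)

# Hypothetical cost function: misclassification rate at a 0.5 cutoff.
# cv.glm calls it with the observed responses and the predicted probabilities.
cost <- function(y, yhat) mean(abs(y - yhat) > 0.5)

# 10-fold cross-validation; K = nrow(dat) would give leave-one-out instead.
cv.err <- cv.glm(dat, glm.fit, cost = cost, K = 10)

# delta holds two numbers: the raw cross-validation estimate of
# prediction error and a bias-adjusted version of it.
cv.err$delta
```

Storing the result under a name other than cv.glm avoids confusion with the function of the same name.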


