[R] External validation for a hurdle model (pscl)

Maria Eugenia Utgés mariaeugeniau using gmail.com
Wed Jan 9 15:34:24 CET 2019


Hi Jeff,
Yes, my question is perhaps more general: it is not really about R
programming, data exploration, or statistical theory.
Modelling texts present external validation as the "panacea" but also as
"unreachable", so they explain other methods instead, such as
cross-validation and bootstrapping.
Here I have new data for a previously constructed model (already
internally validated by bootstrapping), but I have not found how to carry
out the external validation correctly and sufficiently, or by which means
(does it all end in just a plot? a % of correct classification?).
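
One possible way to make "by which means" concrete is sketched below. It is
only an illustration under assumptions, not an established recipe for hurdle
models: validate_external() is a hypothetical helper, the numbers are made up,
and the choice of summaries (AUC and Brier score for zero vs. non-zero, RMSE
and MAE for the counts, observed vs. predicted mean as a crude calibration
check) is one reasonable option among several. With pscl, the inputs would
presumably come from predict() on the fitted model with the new data, as in
the commented lines; the column name datosnuevos$y is an assumption.

```r
# Hypothetical helper: external-validation summaries for a two-part count model.
#   y    : observed counts at the new sites
#   p_nz : predicted probability of a non-zero count at each new site
#   mu   : predicted overall mean count at each new site
validate_external <- function(y, p_nz, mu) {
  stopifnot(length(y) == length(p_nz), length(y) == length(mu))
  is_nz <- as.integer(y > 0)
  n1 <- sum(is_nz)       # sites with a non-zero count
  n0 <- sum(1 - is_nz)   # sites with a zero count
  # AUC via the Mann-Whitney rank formula (ties receive average ranks)
  r <- rank(p_nz)
  auc <- if (n1 > 0 && n0 > 0) {
    (sum(r[is_nz == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
  } else {
    NA_real_             # undefined if only one class is observed
  }
  c(auc       = auc,
    brier     = mean((p_nz - is_nz)^2),   # accuracy of the zero/non-zero part
    rmse      = sqrt(mean((y - mu)^2)),   # error of the predicted counts
    mae       = mean(abs(y - mu)),
    mean_obs  = mean(y),                  # crude calibration-in-the-large
    mean_pred = mean(mu))
}

# With the objects from this thread the inputs might be obtained as
# (assumption: the observed response column is called y):
#   p_nz <- 1 - predict(final, newdata = datosnuevos, type = "prob")[, 1]
#   mu   <- predict(final, newdata = datosnuevos, type = "response")
#   validate_external(datosnuevos$y, p_nz, mu)

# Made-up numbers, only to show the call:
y_new  <- c(0, 0, 3, 1, 0, 5)
p_new  <- c(0.1, 0.3, 0.8, 0.6, 0.2, 0.9)
mu_new <- c(0.2, 0.5, 2.5, 1.0, 0.3, 4.0)
res <- validate_external(y_new, p_new, mu_new)
print(round(res, 3))
```

Note that the sketch compares observed counts with the overall predicted mean
(type = "response" in pscl, as far as I understand its documentation), not
with the untruncated count-component mean from type = "count". With only 38
new sites, all of these summaries will be noisy, so plots of observed versus
predicted values are probably a useful complement.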

On Tue, 8 Jan 2019 at 17:08, Jeff Newmiller (<jdnewmil using dcn.davis.ca.us>)
wrote:

> That said, the gist of the OP's outline is correct, and the main reason to
> look elsewhere is to get more thorough advice on what statistical concerns
> should be addressed than would be on topic here.
>
> One comment: reviewing plots of differences versus various independent
> variables for systematic biases is a task R is particularly well suited
> for, but discovering which plots highlight issues with your model or data
> takes familiarity with your data (explore) and with theory (which you learn
> elsewhere) and with R (which we can help with if you have more specific
> questions).
>
> On January 8, 2019 10:50:14 AM PST, Bert Gunter <bgunter.4567 using gmail.com>
> wrote:
> >This list is (mostly) about R programming. Your query is (mostly) about
> >statistics. So you should post on a statistics site like
> >stats.stackexchange.com
> >not here; I am pretty sure you'll receive lots of answers there.
> >
> >Cheers,
> >Bert
> >
> >
> >Bert Gunter
> >
> >"The trouble with having an open mind is that people keep coming along
> >and
> >sticking things into it."
> >-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> >On Tue, Jan 8, 2019 at 10:18 AM Maria Eugenia Utgés
> ><mariaeugeniau using gmail.com>
> >wrote:
> >
> >> Hi R-list,
> >> We constructed a hurdle model some time ago.
> >> We have now gathered new data in the same city (38 new sites) and
> >> want to do an external validation to see whether the model still
> >> performs OK.
> >> All the books and lectures I have read say it's the best validation
> >> option, but...
> >> I have made a (simple) search, but since having new data for a model
> >> seems to be rare, I have not found anything with enough depth to
> >> reproduce it or adapt it to hurdle models.
> >>
> >> I have predicted the probability of non-zero counts:
> >> nonzero <- 1 - predict(final, newdata = datosnuevos, type = "prob")[, 1]
> >>
> >> and the predicted mean from the count component:
> >> countmean <- predict(final, newdata = datosnuevos, type = "count")
> >>
> >> I understand that "newdata" takes into account the new values of the
> >> independent variables (environmental variables); is that right?
> >>
> >> So I have to compare the predicted values of y (calculated with the
> >> new values of the environmental variables) with the new observed values.
> >>
> >> That is, I would use the model (constructed with the old values), with
> >> the new variables as input and a "new" prediction as output, to be
> >> contrasted with the "new" observed y.
> >>
> >> Would this comparison be by means of AUC, correct classification,
> >> and/or some other options? Would the results of the external validation
> >> just be a % of correctly predicted values? Plots?
> >>
> >> I need some guidance; sorry if the explanation was "basic", but I needed
> >> to write it in my own words so as not to miss any detail.
> >>
> >> Thank you very much in advance,
> >>
> >> María Eugenia Utgés
> >>
> >> CeNDIE-ANLIS
> >> Buenos Aires
> >> Argentina
> >>
> >>         [[alternative HTML version deleted]]
> >>
> >> ______________________________________________
> >> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
>
> --
> Sent from my phone. Please excuse my brevity.
>
