[R] test logistic regression model

Mitchell Maltenfort mm@|ten @end|ng |rom gm@||@com
Sun Nov 20 18:49:26 CET 2022


Two possible fixes occur to me:

1) Redo the train/test split within each level of the factor, so that every
level is split in the same proportion and appears in both the training and
the test set.

2) If you have a lot of levels, some of them sparsely represented, consider
recoding the factor to pool the rare levels into an “other” category.
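
A minimal base R sketch of both ideas, under stated assumptions: the data
below are made up and only stand in for the original poster's data frames
(the column names book_state and TG_KraftF5 are taken from the question), the
helper names pool_rare() and stratified_train_idx() are hypothetical, and the
thresholds (fewer than 10 rows counts as rare, 70/30 split) are arbitrary
choices, not recommendations.

```r
# Made-up data standing in for the poster's data; only the column names
# come from the thread.
set.seed(1)
n <- 200
dat <- data.frame(
  TG_KraftF5 = sample(c("a", "b", "c", "d", "e"), n, replace = TRUE,
                      prob = c(0.40, 0.30, 0.25, 0.03, 0.02)),
  book_state = rbinom(n, 1, 0.5)
)

# Fix 2 (hypothetical helper): pool levels with fewer than `min_n` rows
# into a single "other" level.
pool_rare <- function(x, min_n = 10, other = "other") {
  x <- as.character(x)
  counts <- table(x)
  rare <- names(counts)[counts < min_n]
  x[x %in% rare] <- other
  factor(x)
}
dat$TG_KraftF5 <- pool_rare(dat$TG_KraftF5)

# Fix 1 (hypothetical helper): draw the training rows separately within
# each level, so every level that occurs in the data occurs in training.
stratified_train_idx <- function(f, prop = 0.7) {
  idx_by_level <- split(seq_along(f), f, drop = TRUE)
  unlist(lapply(idx_by_level, function(idx) {
    # At least one row per level goes to training; when a level has two
    # or more rows, at least one is left over for testing.
    n_train <- min(max(1, round(prop * length(idx))),
                   max(1, length(idx) - 1))
    idx[sample.int(length(idx), n_train)]
  }), use.names = FALSE)
}

train_idx <- stratified_train_idx(dat$TG_KraftF5)
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Same calls as in the original question; test now has no unseen levels.
mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family = "binomial")
pred <- predict(mymodel1, newdata = test, type = "response")
```

Pooling before splitting means the "other" level is itself split across both
sets. A level with a single row ends up only in training, which is harmless
for predict(); it is a level present only in test that triggers the error.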

On Sun, Nov 20, 2022 at 11:41 AM Bert Gunter <bgunter.4567 using gmail.com> wrote:

> small reprex:
>
> set.seed(5)
> dat <- data.frame(f = rep(c('r', 'g'), 4), y = runif(8))
> newdat <- data.frame(f = rep(c('r', 'g', 'b'), 2))
> ## convert values in newdat not seen in dat to NA
> is.na(newdat$f) <- !(newdat$f %in% dat$f)
> lmfit <- lm(y ~ f, data = dat)
>
> ##Result:
> > predict(lmfit,newdat)
>         1         2         3         4         5         6
> 0.4374251 0.6196527        NA 0.4374251 0.6196527        NA
>
> If this does not suffice, as Rui said, we need details of what you did.
> (predict.glm works like predict.lm)
>
>
> -- Bert
>
>
> On Sun, Nov 20, 2022 at 7:46 AM Rui Barradas <ruipbarradas using sapo.pt> wrote:
> >
> > At 15:29 on 20/11/2022, Gábor Malomsoki wrote:
> > > Dear Bert,
> > >
> > > Yes, I was trying to fill the nonexistent categories with NAs, but the
> > > solutions suggested on stackoverflow.com unfortunately did not work.
> > >
> > > Best regards
> > > Gabor
> > >
> > >
> > > Bert Gunter <bgunter.4567 using gmail.com> wrote on Sun, 20 Nov 2022, 16:20:
> > >
> > >> You can't predict results for categories that you've not seen before
> > >> (think about it). You will need to remove those cases from your test
> > >> set (or convert them to NA and predict them as NA).
> > >>
> > >> -- Bert
> > >>
> > >> On Sun, Nov 20, 2022 at 7:02 AM Gábor Malomsoki
> > >> <gmalomsoki1980 using gmail.com> wrote:
> > >>
> > >>> Dear all,
> > >>>
> > >>> I have created a logistic regression model on the train df:
> > >>> mymodel1 <- glm(book_state ~ TG_KraftF5, data = train, family = "binomial")
> > >>>
> > >>> Then I try to predict with the test df:
> > >>> Predict <- predict(mymodel1, newdata = test, type = "response")
> > >>> and I get this error message:
> > >>> Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels)
> > >>> Factor  "TG_KraftF5" has new levels
> > >>>
> > >>> I have tried different proposals from stackoverflow, but unfortunately
> > >>> they did not solve the problem.
> > >>> Do you have any idea how to test a logistic regression model when you
> > >>> have different levels in the train and the test df?
> > >>>
> > >>> Thank you in advance
> > >>> Regards,
> > >>> Gabor
> > >>>
> > >>>          [[alternative HTML version deleted]]
> > >>>
> > >>> ______________________________________________
> > >>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >>> https://stat.ethz.ch/mailman/listinfo/r-help
> > >>> PLEASE do read the posting guide
> > >>> http://www.R-project.org/posting-guide.html
> > >>> and provide commented, minimal, self-contained, reproducible code.
> > >>>
> > >>
> > >
> >
> > Hello,
> >
> > What exactly didn't work? You say you have tried the solutions found on
> > stackoverflow, but without a link we don't know which answers to which
> > questions you are talking about.
> > Like Bert said, if you assign NA to the new levels, which are present only
> > in test, it should work.
> >
> > Can you post links to what you have tried?
> >
> > Hope this helps,
> >
> > Rui Barradas
>