[R] Bestglm subset analysis

Bert Gunter bgunter.4567 at gmail.com
Thu Jun 30 01:28:33 CEST 2016


This is a statistics question, which is largely off topic on this
list. However, I'll give you a very brief OT response:

I would strongly suggest you consult a local statistician to explain
to you why what you are doing is likely to result in complete nonsense
(best subset of 5 or 6 from 21 predictors on 79 cases). Failing that,
try asking on a statistics site, like stats.stackexchange.com.
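
As a rough illustration of the concern, here is a minimal sketch in base R.
It is not the poster's data or code: the predictors and the response are
simulated as independent noise (an assumption made purely for illustration),
with the same dimensions as in the question (79 cases, 21 predictors, subsets
of size 5). The names X, y, r2, subsets, and best are hypothetical. Exhaustive
search still returns a "best" subset, and its R^2 is by construction the
largest of roughly twenty thousand candidates even though no predictor is
related to the response.

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
n <- 79; p <- 21; k <- 5          # dimensions taken from the question

X <- matrix(rnorm(n * p), n, p)   # simulated predictors: pure noise
y <- rnorm(n)                     # simulated response: unrelated to X

# R^2 of the least-squares fit of y on the chosen columns plus an intercept
r2 <- function(cols) {
  fit <- lm.fit(cbind(1, X[, cols, drop = FALSE]), y)
  1 - sum(fit$residuals^2) / sum((y - mean(y))^2)
}

subsets <- combn(p, k)            # every subset of size k, one per column
r2_all  <- apply(subsets, 2, r2)  # fit each candidate subset

n_subsets <- ncol(subsets)        # choose(21, 5) candidate models
best      <- subsets[, which.max(r2_all)]
best_r2   <- max(r2_all)
median_r2 <- median(r2_all)

cat("subsets searched:", n_subsets, "\n")
cat("median R^2:", median_r2, " best R^2:", best_r2, "\n")

# Refitting the winning subset as if it had been chosen in advance:
# the usual standard errors and p-values ignore the search that found it.
print(summary(lm(y ~ X[, best])))
```

Comparing the best R^2 with the median across all subsets gives a sense of
how much of the apparent fit comes from the search itself. The summary() of
the refitted model takes no account of the selection, so its p-values should
not be read at face value.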

Note also: This is a plain text list. Please don't post in HTML.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Jun 29, 2016 at 11:24 AM, D Wolf via R-help
<r-help at r-project.org> wrote:
> Hello All,
> I am working on a linear regression model and trying to find the best subset of variables for my dataset. I have 21 predictors, 1 response variable, and 79 observations. I need to find the best 5 or 6 predictors for my model. I've used leaps for lm() and I'm now trying bestglm for glm(). I'm following this webpage, which gives the code below. https://rstudio-pubs-static.s3.amazonaws.com/2897_9220b21cfc0c43a396ff9abf122bb351.html
> My code:
> library(bestglm)
> library(base)
> lbw.for.bestglm <- within(df_Chl, {y <- df_Chl$Chloro })
> res.bestglm <- bestglm(Xy = lbw.for.bestglm, family = gaussian, IC = "AIC", method = "exhaustive")
> # get coefficients
> res.bestglm$BestModels
> Here is a sample of my results (I removed the 5th through 21st predictors for brevity).
> > res.bestglm$BestModels
>     R21   R31   R32   R41 Criterion
> 1 FALSE FALSE FALSE FALSE  326.7327
> 2 FALSE  TRUE FALSE FALSE  326.9525
> 3 FALSE FALSE FALSE FALSE  327.0659
> 4 FALSE  TRUE FALSE FALSE  327.0912
> 5 FALSE  TRUE FALSE FALSE  327.8208
> Is it correct to assume I should keep variables that are TRUE from 1 through 5? What do those five rows represent?
> I know the AIC criterion result should be as low as possible. Is it possible to discern a good result for any of the IC criterion results, such as AIC, LOOCV, BICg, etc.? If BIC returns lower Criterion results, does that mean I need to use the BIC subset instead of the subset from AIC?
> Thank You,
> Doug
>
>         [[alternative HTML version deleted]]
