[R] Question about variable selection
John Fox
jfox at mcmaster.ca
Sat Feb 18 20:35:05 CET 2006
Dear Wensui and Andy,
When the explanatory variables are correlated, it's perfectly possible for
the marginal relationship between an X and Y to be zero and the partial
relationship nonzero (even in the absence of interactions) -- this is simply
a reflection of the more general point that partial and marginal
relationships can differ.
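
A minimal simulated sketch of this point, under assumptions that are not
part of the thread: two standard-normal predictors x1 and x2 with
correlation rho = 0.5, a purely additive model, and a coefficient on x2
chosen so that the population covariance between x1 and y is exactly zero.
The names x1, x2, rho, marginal and partial are illustrative only.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n   <- 10000
rho <- 0.5                       # assumed correlation between x1 and x2

x2 <- rnorm(n)
x1 <- rho * x2 + sqrt(1 - rho^2) * rnorm(n)   # var(x1) = 1, cor(x1, x2) about rho

# Additive model with no interaction term. The x2 coefficient is -1/rho,
# so cov(x1, y) = 1 * var(x1) - (1/rho) * cov(x1, x2) = 1 - 1 = 0.
y <- 1 * x1 - (1 / rho) * x2 + rnorm(n)

marginal <- coef(lm(y ~ x1))["x1"]        # slope ignoring x2: near 0
partial  <- coef(lm(y ~ x1 + x2))["x1"]   # slope holding x2 fixed: near 1
print(c(marginal = unname(marginal), partial = unname(partial)))
```

With these settings the slope from the bivariate fit should be close to
zero while the slope from the two-predictor fit should be close to 1, so
screening x1 on its bivariate relationship alone would discard a variable
that matters once x2 is in the model.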
Regards,
John
--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Wensui Liu
> Sent: Saturday, February 18, 2006 2:03 PM
> To: Liaw, Andy
> Cc: r-help at stat.math.ethz.ch
> Subject: Re: [R] Question about variable selection
>
> Thank you so much for your reply, Andy.
>
> But what if I am only interested in main effects instead of
> interactions?
>
>
>
> On 2/18/06, Liaw, Andy <andy_liaw at merck.com> wrote:
> >
> > That depends on whether the IV could have some significant
> > interactions with other IVs not considered in the bivariate
> > analysis. E.g.,
> >
> > > iv <- expand.grid(-2:2, -2:2)
> > > y <- 3 + iv[,1] * iv[,2] + rnorm(nrow(iv), sd=0.1)
> > > summary(lm(y ~ iv[,1]))
> >
> > Call:
> > lm(formula = y ~ iv[, 1])
> >
> > Residuals:
> >      Min       1Q   Median       3Q      Max
> > -4.06259 -1.06048 -0.02377  1.05901  4.04315
> >
> > Coefficients:
> >             Estimate Std. Error t value Pr(>|t|)
> > (Intercept)  3.01908    0.41482   7.278 2.09e-07 ***
> > iv[, 1]      0.01417    0.29332   0.048    0.962
> > ---
> > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> >
> > Residual standard error: 2.074 on 23 degrees of freedom
> > Multiple R-Squared: 0.0001014, Adjusted R-squared: -0.04337
> > F-statistic: 0.002333 on 1 and 23 DF, p-value: 0.9619
> >
> > > summary(lm(y ~ iv[,1] * iv[,2]))
> >
> > Call:
> > lm(formula = y ~ iv[, 1] * iv[, 2])
> >
> > Residuals:
> >      Min       1Q   Median       3Q      Max
> > -0.22390 -0.08894 -0.01279  0.13525  0.17608
> >
> > Coefficients:
> >                  Estimate Std. Error t value Pr(>|t|)
> > (Intercept)      3.019083   0.026330 114.665   <2e-16 ***
> > iv[, 1]          0.014167   0.018618   0.761    0.455
> > iv[, 2]         -0.005486   0.018618  -0.295    0.771
> > iv[, 1]:iv[, 2]  0.992865   0.013165  75.418   <2e-16 ***
> > ---
> > Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> >
> > Residual standard error: 0.1316 on 21 degrees of freedom
> > Multiple R-Squared: 0.9963, Adjusted R-squared: 0.9958
> > F-statistic: 1896 on 3 and 21 DF, p-value: < 2.2e-16
> >
> >
> >
> >
> > Andy
> >
> > From: Wensui Liu
> > >
> > > Dear Lister,
> > >
> > > I have a question about variable selection for regression.
> > >
> > > If the IV is not significantly related to the DV in the bivariate
> > > analysis, does it make sense to include this IV in the full model
> > > with multiple IVs?
> > >
> > > Thank you so much!
> > >
> > > ______________________________________________
> > > R-help at stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide!
> > > http://www.R-project.org/posting-guide.html