[R] dealing with multicollinearity

ronggui 0034058 at fudan.edu.cn
Mon Apr 11 15:01:14 CEST 2005


Why not use the vif() function (from the car package) to calculate the VIF and help you assess whether collinearity is influential?
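
For example, a minimal sketch (assuming the same fitted model A as in your post and that the car package is installed):

library(car)   # provides vif()
vif(A)         # variance inflation factors for x1 and x2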

I have never seen any book deal with this topic via perturbation analysis.

VIF, tolerance, and principal component analysis are the usual tools for dealing with collinearity; you can find the details in John Fox's book.
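
As a rough illustration of those tools, reusing your x1 and x2 (just one way to do it, not the only correct approach):

1/vif(A)                   # tolerance = 1/VIF; values near 1 mean little collinearity
X <- cbind(x1, x2)         # predictors without the intercept
prcomp(X, scale. = TRUE)   # principal components; a very small last
                           # standard deviation signals near-collinearity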

Generally, calculating the correlations directly is not essential.
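
For instance, with more than two predictors the pairwise correlations can all look moderate while the VIFs reveal a near-exact linear dependence (a purely illustrative simulation with made-up variables, not your data; car loaded as above):

set.seed(2)
u1 <- rnorm(100)
u2 <- rnorm(100)
u3 <- u1 + u2 + rnorm(100, sd = 0.05)   # almost a linear combination of u1 and u2
yy <- u1 - u2 + rnorm(100)
cor(cbind(u1, u2, u3))                  # no pairwise correlation is near 1
vif(lm(yy ~ u1 + u2 + u3))              # yet the VIFs are very large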

One more thing: if your purpose in modelling is prediction rather than interpretation, collinearity does not matter much.
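
A quick simulation of that point (again purely illustrative, with new variable names so your x1 and x2 are not overwritten): the individual coefficients of two highly collinear predictors are unstable, but the fitted values are not.

set.seed(1)
n   <- 200
z   <- rnorm(n)
x1c <- z + rnorm(n, sd = 0.1)           # x1c and x2c are strongly correlated
x2c <- z + rnorm(n, sd = 0.1)
yc  <- 1 + x1c + x2c + rnorm(n)
fit <- lm(yc ~ x1c + x2c)
summary(fit)$coefficients               # inflated standard errors for x1c, x2c
cor(fitted(fit), 1 + x1c + x2c)         # but the predictions still track the true mean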


On Mon, 11 Apr 2005 12:22:55 +0200 (CEST)
Manuel Gutierrez <manuel_gutierrez_lopez at yahoo.es> wrote:

> 
> I have a linear model y~x1+x2 of some data where the
> coefficient for x1 is higher than I would have expected
> from theory (0.88 observed vs. roughly 0.7 expected).
> I wondered whether this could be an artifact of x1 and x2
> being correlated, even though the variance inflation
> factor is not very high (1.065):
> I used perturbation analysis to evaluate collinearity:
> library(perturb)
> P <- perturb(A, pvars = c("x1", "x2"), prange = c(1, 1))
> > summary(P)
> Perturb variables:
> x1 		 normal(0,1) 
> x2 		 normal(0,1) 
> 
> Impact of perturbations on coefficients:
>             mean     s.d.     min      max     
> (Intercept)  -26.067    0.270  -27.235  -25.481
> x1             0.726    0.025    0.672    0.882
> x2             0.060    0.011    0.037    0.082
> 
> I get a mean for x1 of 0.726, which is closer to what
> is expected.
> I am not a statistical expert, so I'd like to know
> whether my evaluation of the effects of collinearity is
> correct and, in that case, what solutions there are for
> obtaining a reliable linear model.
> Thanks,
> Manuel
> 
> Some more detailed information:
> 
> > A<-lm(y~x1+x2)
> > summary(A)
> 
> Call:
> lm(formula = y ~ x1 + x2)
> 
> Residuals:
>       Min        1Q    Median        3Q       Max 
> -4.221946 -0.484055 -0.004762  0.397508  2.542769 
> 
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)    
> (Intercept) -27.23472    0.27996 -97.282  < 2e-16 ***
> x1            0.88202    0.02475  35.639  < 2e-16 ***
> x2            0.08180    0.01239   6.604 2.53e-10 ***
> ---
> Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.'
> 0.1 ` ' 1 
> 
> Residual standard error: 0.823 on 241 degrees of
> freedom
> Multiple R-Squared: 0.8411,	Adjusted R-squared: 0.8398
> 
> F-statistic: 637.8 on 2 and 241 DF,  p-value: <
> 2.2e-16 
> 
> > cor.test(x1,x2)
> 
> 	Pearson's product-moment correlation
> 
> data:  x1 and x2 
> t = -3.9924, df = 242, p-value = 8.678e-05
> alternative hypothesis: true correlation is not equal
> to 0 
> 95 percent confidence interval:
>  -0.3628424 -0.1269618 
> sample estimates:
>       cor 
> -0.248584
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



