[R] vif in package car: there are aliased coefficients in the model

John Fox jfox at mcmaster.ca
Sat Mar 28 19:17:52 CET 2015


Dear Rodolfo,

Sending the data helps, though if you had done what I suggested, you would have seen what's going on:

-------------------- snip ------------------

> dim(data)
[1] 8 8

> summary(lm(response_variable ~ predictor_1 + predictor_2 + predictor_3 + predictor_4 
+             + predictor_5 + predictor_6 + predictor_7, data = data))

Call:
lm(formula = response_variable ~ predictor_1 + predictor_2 + 
    predictor_3 + predictor_4 + predictor_5 + predictor_6 + predictor_7, 
    data = data)

Residuals:
ALL 8 residuals are 0: no residual degrees of freedom!

Coefficients: (1 not defined because of singularities)
                    Estimate Std. Error t value Pr(>|t|)
(Intercept)          -5.1905         NA      NA       NA
predictor_1yellow     2.4477         NA      NA       NA
predictor_2fora       6.5056         NA      NA       NA
predictor_2interior   6.0769         NA      NA       NA
predictor_3           0.6750         NA      NA       NA
predictor_4           3.0742         NA      NA       NA
predictor_5           0.6715         NA      NA       NA
predictor_6          -0.9850         NA      NA       NA
predictor_7               NA         NA      NA       NA

Residual standard error: NaN on 0 degrees of freedom
Multiple R-squared:      1,	Adjusted R-squared:    NaN 
F-statistic:   NaN on 7 and 0 DF,  p-value: NA

-------------------- snip ------------------

So the data set that you're using has 8 cases and 8 variables, one of which is a factor with 3 levels. Consequently, the model you're fitting by LS has 9 coefficients (counting the intercept). Necessarily the rank of the model matrix is deficient. When you eliminate a coefficient, you get a perfect fit: 8 coefficients fit to 8 cases with 0 df for error.
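
The same bookkeeping can be checked directly in R. What follows is only a minimal sketch on made-up data, not the posted file: the data frame d, its random values, and the factor level names "a", "b", "c" are all hypothetical stand-ins that merely mimic the structure described above (8 cases, a 2-level factor, a 3-level factor, and 5 numeric predictors).

```r
# Hypothetical stand-in for the posted data (NOT the original file):
# 8 cases, predictor_1 a 2-level factor, predictor_2 a 3-level factor,
# predictor_3 ... predictor_7 numeric. Values and level names are invented.
set.seed(1)
d <- data.frame(
  response_variable = rnorm(8),
  predictor_1 = factor(rep(c("a", "b"), 4)),
  predictor_2 = factor(c("a", "b", "c", "a", "b", "c", "a", "b")),
  predictor_3 = rnorm(8), predictor_4 = rnorm(8), predictor_5 = rnorm(8),
  predictor_6 = rnorm(8), predictor_7 = rnorm(8)
)

# The model matrix has one column per coefficient, intercept included.
X <- model.matrix(response_variable ~ ., data = d)
n_cases <- nrow(X)      # number of cases
n_coef  <- ncol(X)      # intercept + 1 + 2 + 5 columns
rank_X  <- qr(X)$rank   # can never exceed the number of cases

# lm() reports aliased coefficients as NA.
fit      <- lm(response_variable ~ ., data = d)
aliased  <- names(coef(fit))[is.na(coef(fit))]
df_resid <- df.residual(fit)

print(c(cases = n_cases, coefficients = n_coef, rank = rank_X,
        residual_df = df_resid))
print(aliased)
```

Whenever the number of columns of the model matrix exceeds its rank, at least one coefficient must come back as NA, and that is the condition vif() complains about.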

This is of course nonsense: You don't have enough data to fit a model of this complexity. In fact, you might not have enough data to reasonably fit a model with just 1 predictor.
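
To put a number on that, here is a small sketch, again on invented data (y and x are hypothetical random vectors, not the posted variables), showing how few residual degrees of freedom remain with 8 cases even for a single numeric predictor.

```r
# Hypothetical data: 8 cases, one numeric predictor.
set.seed(2)
n <- 8
y <- rnorm(n)
x <- rnorm(n)

fit1 <- lm(y ~ x)

# Residual df = number of cases minus number of estimated coefficients
# (here the intercept and one slope).
resid_df <- df.residual(fit1)
print(resid_df)
```

Each additional coefficient removes one more residual degree of freedom, so very little is left for estimating error variance.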

I'm cc'ing this response to the r-help email list, where you started this thread.

Best,
 John

On Sat, 28 Mar 2015 12:04:05 -0300
 Rodolfo Pelinson <rodolfopelinson at gmail.com> wrote:
> Thanks a lot for your answer and your time! But I'm still having the same
> problem.
> 
> That's the script I am using:
> ____________________________________________________________________________________________________________________
> library(car)
> 
> data <- read.table("data_vif.txt", header = TRUE, sep = "\t", row.names = 1)
> data
> 
> vif(lm(response_variable ~ predictor_1 + predictor_2 + predictor_3 +
> predictor_4 + predictor_5 + predictor_6 + predictor_7, data = data))
> 
> vif(lm(response_variable ~ predictor_1 + predictor_2 + predictor_3 +
> predictor_4 + predictor_5 + predictor_6, data = data))
> ____________________________________________________________________________________________________________________
> 
> The first vif() call above returns the following error:
> 
> "Error in vif.default(lm(response_variable ~ predictor_1 + predictor_2 +  :
>   there are aliased coefficients in the model"
> 
> Then if I remove any one of the predictors (in the script I removed
> predictor_7 as an example), it returns this:
> 
>             GVIF Df GVIF^(1/(2*Df))
> predictor_1  NaN  1             NaN
> predictor_2  NaN  2             NaN
> predictor_3  NaN  1             NaN
> predictor_4  NaN  1             NaN
> predictor_5  NaN  1             NaN
> predictor_6  NaN  1             NaN
> Warning message:
> In cov2cor(v) : diag(.) had 0 or NA entries; non-finite result is doubtful
> 
> 
> Can you help me with this? I have even attached my data set to this e-mail.
> It's a small table.
> 
> Sorry for the question.
> 
> 
> 
> 2015-03-27 21:51 GMT-03:00 John Fox <jfox at mcmaster.ca>:
> 
> > Dear Rodolfo,
> >
> > It's apparently the case that at least one of the columns of the model
> > matrix for your model is perfectly collinear with others.
> >
> > There's not nearly enough information here to figure out exactly what the
> > problem is, and the information that you provided certainly falls short of
> > allowing me or anyone else to reproduce your problem and diagnose it
> > properly. It's not even clear from your message exactly what the structure
> > of the model is, although localizacao  is apparently a factor with 3
> > levels.
> >
> >
> > If you look at the summary() output for your model or just print it, you
> > should at least see which coefficients are aliased, and that might help you
> > understand what went wrong.
> >
> > I hope this helps,
> >  John
> >
> > -------------------------------------------------------
> > John Fox, Professor
> > McMaster University
> > Hamilton, Ontario, Canada
> > http://socserv.mcmaster.ca/jfox/
> >
> >
> > > -----Original Message-----
> > > From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Rodolfo
> > > Pelinson
> > > Sent: March-27-15 3:07 PM
> > > To: r-help at r-project.org
> > > Subject: [R] vif in package car: there are aliased coefficients in the model
> > >
> > > Hello. I'm trying to use the function vif from package car on an lm fit.
> > > However, it returns the following error:
> > > "Error in vif.default(lm(MDescores.sitescores ~ hidroperiodo + localizacao
> > > +  : there are aliased coefficients in the model"
> > >
> > > When I exclude any predictor from the model, it returns this warning
> > > message:
> > > "Warning message: In cov2cor(v) : diag(.) had 0 or NA entries; non-finite
> > > result is doubtful"
> > >
> > > When I then exclude any other predictor from the model, vif finally works.
> > > I can't figure out what the problem is. These are the results that R
> > > returns:
> > >
> > > > vif(lm(MDescores.sitescores ~ hidroperiodo + localizacao + area +
> > > +     profundidade + NTVM + NTVI + PCs...c.1.., data = MDVIF))
> > > Error in vif.default(lm(MDescores.sitescores ~ hidroperiodo + localizacao +  :
> > >   there are aliased coefficients in the model
> > >
> > > > vif(lm(MDescores.sitescores ~ localizacao + area + profundidade + NTVM +
> > > +     NTVI + PCs...c.1.., data = MDVIF))
> > >              GVIF Df GVIF^(1/(2*Df))
> > > localizacao   NaN  2             NaN
> > > area          NaN  1             NaN
> > > profundidade  NaN  1             NaN
> > > NTVM          NaN  1             NaN
> > > NTVI          NaN  1             NaN
> > > PCs...c.1..   NaN  1             NaN
> > > Warning message:
> > > In cov2cor(v) : diag(.) had 0 or NA entries; non-finite result is doubtful
> > >
> > > Thanks.
> > > --
> > > Rodolfo Mei Pelinson.
> >
> >
> >
> >
> 
> 
> -- 
> Rodolfo Mei Pelinson.

------------------------------------------------
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/


