[R] How set lm() to don't return NA in summary()?

peter dalgaard pdalgd at gmail.com
Sat Aug 6 09:47:23 CEST 2011


On Aug 6, 2011, at 03:23 , Walmes Zeviani wrote:

> Hi,
> 
> I've data from an incomplete fatorial design. One level of a factor doesn't
> has the levels of the other. When I use lm(), the summary() return NA for
> that non estimable parameters. Ok, I understant it. But I use
> contrast::contrast(), gmodels::estimable(), multcomp::glht() and all these
> fail when model has NA estimates. This is becouse vcov() and coef() has
> different dimensions. Is possible set lm() to omit NA? Below same toy data
> and code.
> 
>> # toy data
>> adi <- expand.grid(cult=gl(1,3,la=LETTERS[1]), fert=101)
>> fat <- expand.grid(cult=gl(2,3,la=LETTERS[2:3]), fert=seq(50,150,50))
>> da <- rbind(adi, fat)
>> da$y <- da$fert+rnorm(nrow(da),0,10)
>> 
>> # plot
>> require(lattice)
>> xyplot(y~fert|cult, da)
>> 
>> # adjust
>> m0 <- lm(y~cult*fert, da)
>> summary(m0)
> 
> ...
> Coefficients: (1 not defined because of singularities)
>            Estimate Std. Error t value Pr(>|t|)
> (Intercept)  7.55401   10.18956   0.741    0.469
> cultB       -8.04486   13.54672  -0.594    0.561
> cultC       -3.83644    6.74848  -0.568    0.578
> fert         0.87486    0.08265  10.586 1.24e-08 ***
> cultB:fert   0.13509    0.11688   1.156    0.265
> cultC:fert        NA         NA      NA       NA
> ...
>> require(gmodels)
>> estimable(m0, cm=c(1,0,0,100,0,0))
> Erro em estimable.default(m0, cm = c(1, 0, 0, 100, 0, 0)) :
> Dimension of structure(c(1, 0, 0, 100, 0, 0), .Dim = c(1L, 6L)): 1x6, not
> compatible with no of parameters in m0: 5

...and of course (you forgot to say) just removing the 6th element of cm won't do either.

A generic trick is to modify the model matrix directly:

> M <- model.matrix(m0)[,-6]
> m1 <- update(m0, ~ M - 1)
> summary(m1)

Call:
lm(formula = y ~ M - 1, data = da)

Residuals:
    Min      1Q  Median      3Q     Max 
-27.165  -4.704  -2.446   6.666  25.328 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
M(Intercept)  -1.8618    14.6718  -0.127    0.901    
McultB         2.2102    19.5058   0.113    0.911    
McultC        11.5308     9.7171   1.187    0.253    
Mfert          0.9506     0.1190   7.989 5.65e-07 ***
McultB:fert    0.0758     0.1683   0.450    0.658    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 14.57 on 16 degrees of freedom
Multiple R-squared: 0.9866,	Adjusted R-squared: 0.9824 
F-statistic: 235.5 on 5 and 16 DF,  p-value: 2.165e-14 

> estimable(m1, cm=c(1,0,0,100,0))
              Estimate Std. Error  t value DF     Pr(>|t|)
(1 0 0 100 0) 93.20357   8.415451 11.07529 16 6.515116e-09


Two things to notice: The coefficient names are not quite the same as the original, and you need to fit without the intercept (or remove the 1st column as well). 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
"Døden skal tape!" --- Nordahl Grieg



More information about the R-help mailing list