[R] Summary coefficients give NA values because of singularities

Uwe Ligges ligges at statistik.tu-dortmund.de
Tue Dec 6 14:15:08 CET 2011



On 05.12.2011 21:57, Gathurst wrote:
> Hello,
>
> I have a data set which I am using to find a model with the most significant
> parameters included and most importantly, the p-values.  The full model is
> of the form:
>    sad[,1]~b_1 sad[,2]+b_2 sad[,3]+b_3 sad[,4]+b_4 sad[,5]+b_5 sad[,6]+b_6
> sad[,7]+b_7 sad[,8]+b_8 sad[,9]+b_9 sad[,10],
> where the 9 variables on the right hand side are all indicator variables.
> The thing I don't understand is the line ' sad[, 10]         NA         NA
> NA       NA ' as a result of 'Coefficients: (1 not defined because of
> singularities)'.
>
> I think the output is taking sad[,10] as the intercept, based on previous
> attempts at figuring my issue out, which I find a bit weird considering
> sad[,10] is either 0 or 1.  How do I produce the correct output showing all
> p-values?

You cannot: sad[,10] is either a linear combination of one or more of the
other variables or is constant (that is, collinear with the intercept).
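
A minimal sketch of the likely cause, using simulated data because the
original d is not available. It assumes the grouping variable takes only
the values 1 to 9, which would fit the 477 = 486 - 9 residual degrees of
freedom in the quoted output. The names g, y, ind, fit_all, fit_factor and
fit_noint are made up for illustration.

```r
# Simulated stand-in for the poster's data (hypothetical): a grouping
# variable with levels 1..9 and an unrelated numeric response.
set.seed(1)
g <- sample(1:9, 486, replace = TRUE)
y <- rnorm(486, mean = 4)

# One 0/1 indicator column per level, as constructed in the original post.
ind <- sapply(1:9, function(k) as.numeric(g == k))
colnames(ind) <- paste0("g", 1:9)

# Every row has exactly one indicator set, so the nine columns add up to
# the intercept column of ones and the design matrix is rank deficient.
all(rowSums(ind) == 1)

# lm() drops the last linearly dependent column and reports NA for it.
fit_all <- lm(y ~ ind)
coef(fit_all)
alias(fit_all)   # lists the linear dependency behind the NA

# Option 1: pass a factor and let R build the contrasts; the first level
# becomes the reference and each coefficient is a difference from it.
fit_factor <- lm(y ~ factor(g))
summary(fit_factor)

# Option 2: drop the intercept to get one coefficient per level.
fit_noint <- lm(y ~ 0 + ind)
summary(fit_noint)
```

With option 2 each p-value tests whether a level's mean is zero, not
whether it differs from another level, so which form is appropriate
depends on the question being asked.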

Uwe Ligges




>
> My code and output is as follows:
>
> sad<-matrix(1,ncol=11,nrow=486)
> sad[,c(1:10)]<-d[,2][-357]
> sad[,1]<-d[,29][-357]
> sad[,2][sad[,2]!=1]<-0
> sad[,3][sad[,3]!=2]<-0
> sad[,4][sad[,4]!=3]<-0
> sad[,5][sad[,5]!=4]<-0
> sad[,6][sad[,6]!=5]<-0
> sad[,7][sad[,7]!=6]<-0
> sad[,8][sad[,8]!=7]<-0
> sad[,9][sad[,9]!=8]<-0
> sad[,10][sad[,10]!=9]<-0
> sad[,2][sad[,2]==1]<-1
> sad[,3][sad[,3]==2]<-1
> sad[,4][sad[,4]==3]<-1
> sad[,5][sad[,5]==4]<-1
> sad[,6][sad[,6]==5]<-1
> sad[,7][sad[,7]==6]<-1
> sad[,8][sad[,8]==7]<-1
> sad[,9][sad[,9]==8]<-1
> sad[,10][sad[,10]==9]<-1
> sad
>
> summary(lm(sad[,1]~sad[,2]+sad[,3]
> +sad[,4]+sad[,5]+sad[,6]
> +sad[,7]+sad[,8]+sad[,9]+sad[,10]))
>
> Call:
> lm(formula = sad[, 1] ~ sad[, 2] + sad[, 3] + sad[, 4] + sad[,
>      5] + sad[, 6] + sad[, 7] + sad[, 8] + sad[, 9] + sad[, 10])
>
> Residuals:
>      Min      1Q  Median      3Q     Max
> -3.3191 -0.3893  0.0519  0.7436  1.0519
>
> Coefficients: (1 not defined because of singularities)
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept)  4.34091    0.14495  29.947   <2e-16 ***
> sad[, 2]    -0.16142    0.18128  -0.890   0.3737
> sad[, 3]    -0.23221    0.20275  -1.145   0.2527
> sad[, 4]     0.17832    0.19695   0.905   0.3657
> sad[, 5]     0.06450    0.21447   0.301   0.7638
> sad[, 6]    -0.15909    0.18713  -0.850   0.3957
> sad[, 7]    -0.39286    0.18171  -2.162   0.0311 *
> sad[, 8]    -0.08450    0.21146  -0.400   0.6896
> sad[, 9]    -0.02176    0.20170  -0.108   0.9141
> sad[, 10]         NA         NA      NA       NA
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 0.9615 on 477 degrees of freedom
> Multiple R-squared: 0.02984,    Adjusted R-squared: 0.01357
> F-statistic: 1.834 on 8 and 477 DF,  p-value: 0.06869
>
> Thanks in advance.


