[R] Coefficients: (20 not defined because of singularities)

Prof Brian Ripley ripley at stats.ox.ac.uk
Fri May 30 11:06:31 CEST 2003


It is the model matrix which is singular, *not* the variable.  You are 
trying to fit a collinear model.

Use alias() to see what is going on.
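
As a minimal sketch of what alias() reports, assume three hypothetical
positive quantities p, q, r and ratios built from them (the names and the
simulated data are illustrative only, not the poster's variables).  Since
p/r = (p/q) * (q/r), log(1/(p/r)) is exactly the sum of the other two log
terms, so the model matrix is rank deficient:

```r
# Hypothetical data: three positive quantities and a noise response.
set.seed(1)
n <- 50
p <- runif(n, 1, 2)
q <- runif(n, 1, 2)
r <- runif(n, 1, 2)
y <- rnorm(n)

r_pq <- p / q
r_qr <- q / r
r_pr <- p / r   # equals r_pq * r_qr, so its log is a sum of the other two logs

fm <- lm(y ~ log(1/r_pq) + log(1/r_qr) + log(1/r_pr))

coef(fm)         # the last coefficient is NA: "not defined because of singularities"
al <- alias(fm)  # reports the aliased term in terms of the earlier ones
al
```

Whether the poster's "xOver.y" features are ratios of this kind is an
assumption; alias() on the real fit would show the actual dependencies.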

On Fri, 30 May 2003, Thomas Fischer wrote:

> Hello,
> 
> I am trying to run a linear regression analysis on my data set. For some 
> reason most variables are removed due to singularities.
> 
> My linear regression looks this way (I am using only partial data, which 
> is selected by flags):
> 
> fm<-lm(log(cplex6.time..sec..[flags]) ~ cplex6.cities[flags] + 
> log(1/features.meanOver.frust[flags]) + 
> log(1/features.meanOver.minDist[flags]) +
> [...]
> avg..steps.to.loc..Opt..norm..[flags] + NN.List.opt..tour.max.[flags])
> 
> As I am using inversion and logarithms I set all data to positive values, 
> before running lm():
> 
> cplex6.time..sec..[cplex6.time..sec..<=0.00001]=0.00001
> features.meanOver.frust[features.meanOver.frust<=0.00001]=0.00001
> features.meanOver.minDist[features.meanOver.minDist<=0.00001]=0.00001
> [...]
> features.varOver.varDist[features.varOver.varDist<=0.00001]=0.00001
> 
> Retrieving the summary of fm, I get the message that some coefficients 
> have been removed.

No, that they are not defined, as it says.


> [...]
> Coefficients: (20 not defined because of singularities)
>                                                 Estimate Std. Error t value Pr(>|t|)
> (Intercept)                                      87.2162    44.1148   1.977   0.0511 .
> log(1/features.meanOver.frust[flags])            -2.5298     0.1515 -16.702   <2e-16 ***
> log(1/features.meanOver.minDist[flags])         154.7170    11.3917  13.582   <2e-16 ***
> log(1/features.meanOver.quant25Dist[flags])    -943.4625    71.3505 -13.223   <2e-16 ***
> log(1/features.meanOver.quart1SpanDist[flags])  776.1049    60.0571  12.923   <2e-16 ***
> log(1/features.meanOver.spanDist[flags])         -9.8069     0.1400 -70.038   <2e-16 ***
> log(1/features.meanOver.varDist[flags])         -11.3211     0.6715 -16.859   <2e-16 ***
> log(1/features.quant25Over.minDist[flags])      -46.9655     3.1438 -14.939   <2e-16 ***
> avg..steps.to.loc..Opt..norm..[flags]             0.8324     1.0919   0.762   0.4478
> ---
> Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
> [...]
> 
> 
> The summary of one of the removed coefficients looks like this:

That's the summary of the variable, not the coefficient.

> > summary(features.spanOver.quart1SpanDist[flags])
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> 0.05584 0.05797 0.06366 0.06311 0.06674 0.07290
> > summary(log(1/features.spanOver.quart1SpanDist[flags]))
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>   2.619   2.707   2.754   2.767   2.848   2.885
> 
> The summary of a coefficient that was kept looks this way:
> 
> > summary(features.quant25Over.minDist[flags])
>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
> 0.001030 0.001030 0.001030 0.001032 0.001030 0.001040
> > summary(log(1/features.quant25Over.minDist[flags]))
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>   6.869   6.878   6.878   6.877   6.878   6.878
> 
> So, I don't see the difference. Why has the first coefficient been 
> removed and the second one kept?
> Please help me.
> 
> I'm using R 1.6.2 on a Linux x86 machine.
> 
> Greetings,
> Thomas Fischer
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> 
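
A small sketch of why the per-variable summaries cannot settle this, using
hypothetical variables x1, x2, x3 (not the poster's data) with x3 = x1 + x2.
Each variable looks unremarkable on its own; which coefficient comes out
undefined depends on the order of the terms, since lm() drops a column that
is a linear combination of the columns before it:

```r
# Hypothetical, simulated variables; x3 is exactly collinear with x1 and x2.
set.seed(2)
x1 <- rnorm(30)
x2 <- rnorm(30)
x3 <- x1 + x2
y  <- 1 + x1 + rnorm(30)

summary(x3)                       # nothing here hints at the collinearity

fm_a <- lm(y ~ x1 + x2 + x3)
fm_b <- lm(y ~ x3 + x2 + x1)

coef(fm_a)                        # x3 is NA
coef(fm_b)                        # same data, reordered terms: now x1 is NA

# Both fits span the same column space, so the fitted values agree.
all.equal(fitted(fm_a), fitted(fm_b))
```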

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



