[R] Coefficients: (20 not defined because of singularities)
Prof Brian Ripley
ripley at stats.ox.ac.uk
Fri May 30 11:06:31 CEST 2003
It is the model matrix which is singular, *not* the variable. You are
trying to fit a collinear model.
Use alias() to see what is going on.
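
A minimal sketch of what alias() reports, using simulated data rather than the poster's. The names x1, x2, x3, y and fit are hypothetical, and x3 is built as an exact linear combination of x1 and x2 so that the model matrix is singular:

```r
# Hypothetical data: x3 is an exact linear combination of x1 and x2.
set.seed(1)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- x1 - 2 * x2
y  <- 1 + x1 + x2 + rnorm(n)

fit <- lm(y ~ x1 + x2 + x3)

# alias() lists the terms that are linear combinations of earlier terms.
a <- alias(fit)
a$Complete   # one row, for x3, with columns (Intercept), x1 and x2
```

The row names of the Complete component are the terms that lm() could not estimate, and each row gives the dependency on the terms that were estimated.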
On Fri, 30 May 2003, Thomas Fischer wrote:
> Hello,
>
> I am trying to run a linear regression analysis on my data set. For some
> reason most variables are removed due to singularities.
>
> My linear regression looks this way (I am using only partial data, which
> is selected by flags):
>
> fm<-lm(log(cplex6.time..sec..[flags]) ~ cplex6.cities[flags] +
> log(1/features.meanOver.frust[flags]) +
> log(1/features.meanOver.minDist[flags]) +
> [...]
> avg..steps.to.loc..Opt..norm..[flags] + NN.List.opt..tour.max.[flags])
>
> As I am using inversion and logarithms I set all data to positive values,
> before running lm():
>
> cplex6.time..sec..[cplex6.time..sec..<=0.00001]=0.00001
> features.meanOver.frust[features.meanOver.frust<=0.00001]=0.00001
> features.meanOver.minDist[features.meanOver.minDist<=0.00001]=0.00001
> [...]
> features.varOver.varDist[features.varOver.varDist<=0.00001]=0.00001
>
> Retrieving the summary of fm, I get the message, that some coefficients
> have been removed.
No, that they are not defined, as it says.
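
To illustrate the distinction, here is a small sketch on made-up data (the names d, u, v, w and fit are hypothetical). A coefficient that is "not defined" stays in the coefficient vector as NA, and the fitted rank is smaller than the number of columns in the model matrix:

```r
# Hypothetical data: w is an exact linear combination of u and v.
set.seed(2)
d   <- data.frame(u = rnorm(30), v = rnorm(30))
d$w <- d$u + d$v
d$y <- rnorm(30)

fit <- lm(y ~ u + v + w, data = d)

coef(fit)                          # the entry for w is NA
names(which(is.na(coef(fit))))     # "w"
fit$rank                           # 3
ncol(model.matrix(fit))            # 4
```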
> [...]
> Coefficients: (20 not defined because of singularities)
>                                                  Estimate Std. Error t value Pr(>|t|)
> (Intercept)                                       87.2162    44.1148   1.977   0.0511 .
> log(1/features.meanOver.frust[flags])             -2.5298     0.1515 -16.702   <2e-16 ***
> log(1/features.meanOver.minDist[flags])          154.7170    11.3917  13.582   <2e-16 ***
> log(1/features.meanOver.quant25Dist[flags])     -943.4625    71.3505 -13.223   <2e-16 ***
> log(1/features.meanOver.quart1SpanDist[flags])   776.1049    60.0571  12.923   <2e-16 ***
> log(1/features.meanOver.spanDist[flags])          -9.8069     0.1400 -70.038   <2e-16 ***
> log(1/features.meanOver.varDist[flags])          -11.3211     0.6715 -16.859   <2e-16 ***
> log(1/features.quant25Over.minDist[flags])       -46.9655     3.1438 -14.939   <2e-16 ***
> avg..steps.to.loc..Opt..norm..[flags]              0.8324     1.0919   0.762   0.4478
> ---
> Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
> [...]
>
>
> The summary of one of the removed coefficients looks like this:
That's the summary of the variable, not the coefficient.
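
A sketch of why per-variable summaries cannot explain which coefficient is undefined. It assumes, hypothetically, that the predictors are logs of ratios of a common set of quantities; the thread does not say how the features were computed, and p, q, r, lr.pq, lr.qr, lr.pr and dropped() are illustrative names only. Any one of the three log-ratios is the sum or difference of the other two, so which one is reported as not defined depends only on the order of the terms in the formula:

```r
# Hypothetical positive quantities and log-ratio predictors.
set.seed(3)
n <- 40
p <- runif(n, 1, 2)
q <- runif(n, 1, 2)
r <- runif(n, 1, 2)
lr.pq <- log(p / q)
lr.qr <- log(q / r)
lr.pr <- log(p / r)   # equals lr.pq + lr.qr up to rounding error
y <- rnorm(n)

# Helper (hypothetical): names of the coefficients lm() left undefined.
dropped <- function(fit) names(which(is.na(coef(fit))))

dropped(lm(y ~ lr.pq + lr.qr + lr.pr))   # "lr.pr"
dropped(lm(y ~ lr.pr + lr.pq + lr.qr))   # "lr.qr"
```

Each variable on its own has an unremarkable summary(); the singularity is a property of the columns taken together.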
> > summary(features.spanOver.quart1SpanDist[flags])
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> 0.05584 0.05797 0.06366 0.06311 0.06674 0.07290
> > summary(log(1/features.spanOver.quart1SpanDist[flags]))
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> 2.619 2.707 2.754 2.767 2.848 2.885
>
> The summary of a coefficient that was kept looks this way:
>
> > summary(features.quant25Over.minDist[flags])
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> 0.001030 0.001030 0.001030 0.001032 0.001030 0.001040
> > summary(log(1/features.quant25Over.minDist[flags]))
> Min. 1st Qu. Median Mean 3rd Qu. Max.
> 6.869 6.878 6.878 6.877 6.878 6.878
>
> So, I don't see the difference. Why has the first coefficient been
> removed and the second one kept?
> Please help me.
>
> I'm using R 1.6.2 on a Linux x86 machine.
>
> Greetings,
> Thomas Fischer
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>
--
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595