[R] Coefficients: (20 not defined because of singularities)
Thomas Fischer
th.fischer at gmx.net
Fri May 30 10:44:55 CEST 2003
Hello,
I am trying to run a linear regression analysis on my data set. For some
reason most variables are removed due to singularities.
My linear regression looks like this (I am using only part of the data,
which is selected by flags):
fm<-lm(log(cplex6.time..sec..[flags]) ~ cplex6.cities[flags] +
log(1/features.meanOver.frust[flags]) +
log(1/features.meanOver.minDist[flags]) +
[...]
avg..steps.to.loc..Opt..norm..[flags] + NN.List.opt..tour.max.[flags])
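For comparison, the same kind of row selection can be written with the data= and subset= arguments of lm() instead of indexing every variable with [flags]. This is only a minimal sketch on made-up data: the names secs, cities, frust and d are hypothetical stand-ins, not the variables from my data set.

```r
set.seed(7)
n <- 30
secs   <- runif(n, 0.1, 5)                    # hypothetical run times
cities <- sample(50:200, n, replace = TRUE)   # hypothetical problem sizes
frust  <- runif(n, 0.1, 1)                    # hypothetical feature
flags  <- cities > 100                        # logical row selector

# Style used above: every variable is indexed by flags inside the formula
fm_indexed <- lm(log(secs[flags]) ~ cities[flags] + log(1 / frust[flags]))

# Alternative: keep the variables in a data frame and let lm() select the rows
d <- data.frame(secs, cities, frust)
fm_subset <- lm(log(secs) ~ cities + log(1 / frust), data = d, subset = flags)

# Both fits use the same rows, so the estimates agree
print(all.equal(unname(coef(fm_indexed)), unname(coef(fm_subset))))
```

The second form also gives shorter coefficient names in the summary, since the [flags] suffix no longer appears in each term.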
As I am using inversion and logarithms, I set all data to positive values
before running lm():
cplex6.time..sec..[cplex6.time..sec..<=0.00001]=0.00001
features.meanOver.frust[features.meanOver.frust<=0.00001]=0.00001
features.meanOver.minDist[features.meanOver.minDist<=0.00001]=0.00001
[...]
features.varOver.varDist[features.varOver.varDist<=0.00001]=0.00001
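The same clamping can be sketched with pmax(), which takes the element-wise maximum of a vector and a lower bound. The vector x and the name eps below are hypothetical and only illustrate the idea on made-up values.

```r
# Hypothetical stand-in for one of the feature vectors
x <- c(-1, 0, 0.000005, 0.5, 2)
eps <- 0.00001

# Indexed assignment, as used above
x_indexed <- x
x_indexed[x_indexed <= eps] <- eps

# Element-wise maximum of x and the lower bound
x_clamped <- pmax(x, eps)

# Both approaches give the same result for these values
print(identical(x_indexed, x_clamped))
```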
When I retrieve the summary of fm, I get a message that some coefficients
have been removed.
[...]
Coefficients: (20 not defined because of singularities)
                                                Estimate Std. Error t value Pr(>|t|)
(Intercept)                                      87.2162    44.1148   1.977   0.0511 .
log(1/features.meanOver.frust[flags])            -2.5298     0.1515 -16.702   <2e-16 ***
log(1/features.meanOver.minDist[flags])         154.7170    11.3917  13.582   <2e-16 ***
log(1/features.meanOver.quant25Dist[flags])    -943.4625    71.3505 -13.223   <2e-16 ***
log(1/features.meanOver.quart1SpanDist[flags])  776.1049    60.0571  12.923   <2e-16 ***
log(1/features.meanOver.spanDist[flags])         -9.8069     0.1400 -70.038   <2e-16 ***
log(1/features.meanOver.varDist[flags])         -11.3211     0.6715 -16.859   <2e-16 ***
log(1/features.quant25Over.minDist[flags])      -46.9655     3.1438 -14.939   <2e-16 ***
avg..steps.to.loc..Opt..norm..[flags]             0.8324     1.0919   0.762   0.4478
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
[...]
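A fitted model can be asked which terms it dropped and how they depend on the retained ones. The following is a minimal sketch on made-up data, where x3 is constructed as an exact linear combination of x1 and x2. All names here (x1, x2, x3, y, fit, dropped) are hypothetical.

```r
set.seed(1)
n  <- 50
x1 <- runif(n)
x2 <- runif(n)
x3 <- x1 - x2                      # exact linear combination of x1 and x2
y  <- 1 + 2 * x1 - x2 + rnorm(n, sd = 0.1)

fit <- lm(y ~ x1 + x2 + x3)

# Coefficients "not defined because of singularities" are stored as NA
dropped <- names(coef(fit))[is.na(coef(fit))]
print(dropped)

# alias() expresses each dropped term through the terms that were kept
print(alias(fit)$Complete)
```

Applied to fm, the same two calls would list the 20 removed terms and show which retained terms each of them is a combination of.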
The summary of one of the variables whose coefficient was removed looks
like this:
> summary(features.spanOver.quart1SpanDist[flags])
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
0.05584 0.05797 0.06366 0.06311 0.06674 0.07290
> summary(log(1/features.spanOver.quart1SpanDist[flags]))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  2.619   2.707   2.754   2.767   2.848   2.885
The summary of a variable whose coefficient was kept looks like this:
> summary(features.quant25Over.minDist[flags])
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
0.001030 0.001030 0.001030 0.001032 0.001030 0.001040
> summary(log(1/features.quant25Over.minDist[flags]))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  6.869   6.878   6.878   6.877   6.878   6.878
So, I don't see the difference. Why has the first coefficient been
removed and the second one kept?
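One thing that may be worth checking, under an assumption that is not confirmed anywhere above: if the "xOver.y" features are plain ratios x/y of the same underlying quantities, then their logarithms are linear combinations of each other, for example log(1/(span/q1)) = log(1/(mean/q1)) - log(1/(mean/span)). In that case lm() keeps the terms that come first in the formula and drops later ones that are already determined by them, so which term is removed depends on the order, not on the values of the variable itself. A minimal sketch on made-up data, with hypothetical names throughout:

```r
set.seed(42)
n <- 40
# Hypothetical raw quantities; the names are illustrative only
mean_d <- runif(n, 1, 2)
span_d <- runif(n, 1, 2)
q1_d   <- runif(n, 1, 2)
y      <- rnorm(n)

r_ms <- log(1 / (mean_d / span_d))   # analogous to log(1/meanOver.span)
r_mq <- log(1 / (mean_d / q1_d))     # analogous to log(1/meanOver.quart1Span)
r_sq <- log(1 / (span_d / q1_d))     # analogous to log(1/spanOver.quart1Span)
# By construction r_sq equals r_mq - r_ms up to rounding error

fit1 <- lm(y ~ r_ms + r_mq + r_sq)   # r_sq comes last
fit2 <- lm(y ~ r_sq + r_mq + r_ms)   # same terms, reversed order

dropped1 <- names(which(is.na(coef(fit1))))
dropped2 <- names(which(is.na(coef(fit2))))
print(dropped1)   # the term listed last in fit1
print(dropped2)   # the term listed last in fit2
```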
Please help me.
I'm using R 1.6.2 on a Linux x86 machine.
Greetings,
Thomas Fischer