[R] Regression Error: Otherwise good variable causes singularity. Why?
David Winsemius
dwinsemius at comcast.net
Thu Aug 12 17:28:26 CEST 2010
On Aug 12, 2010, at 10:35 AM, asdir wrote:
>
> This command
>
>
> cdmoutcome<- glm(log(value)~factor(year)
>> +log(gdppcpppconst)+log(gdppcpppconstAII)
>> +log(co2eemisspc)+log(co2eemisspcAII)
>> +log(dist)
>> +fdiboth
>> +odapartnertohost
>> +corrupt
>> +log(infraindex)
>> +litrate
>> +africa
>> +imr
>> , data=cdmdata2, subset=zero==1, gaussian(link =
>> "identity"))
>
> results in this table
>
>
> Coefficients: (1 not defined because of singularities)
>> Estimate Std. Error t value Pr(>|t|)
>> (Intercept) 1.216e+01 5.771e+01 0.211 0.8332
>> factor(year)2006 -1.403e+00 5.777e-01 -2.429 0.0157 *
>> factor(year)2007 -2.799e-01 7.901e-01 -0.354 0.7234
>> log(gdppcpppconst) 2.762e-01 5.517e+00 0.050 0.9601
>> log(gdppcpppconstAII) -1.344e-01 9.025e-01 -0.149 0.8817
>> log(co2eemisspc) 5.655e+00 2.903e+00 1.948 0.0523 .
>> log(co2eemisspcAII) -1.411e-01 4.245e-01 -0.332 0.7399
>> log(dist) -2.938e-01 4.023e-01 -0.730 0.4658
>> fdiboth 1.326e-04 1.133e-04 1.171 0.2425
>> odapartnertohost 2.319e-03 1.437e-03 1.613 0.1078
>> corrupt 1.875e+00 3.313e+00 0.566 0.5718
>> log(infraindex) 4.783e+00 1.091e+01 0.438 0.6615
You have probably created litrate as a factor without realizing it.
That can easily happen if you just use read.table and one of the
values cannot be gracefully interpreted as a numeric. Either read the
data in with stringsAsFactors=FALSE or as.is=TRUE and then coerce the
column to numeric, or, if you want to fix an existing factor f%^&-up,
the FAQ tells you to use something like:

cdmdata2$f_ed_variable <-
    as.numeric(as.character(cdmdata2$f_ed_variable))
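
A minimal sketch of how one might check for this and find the offending
entry. The toy data frame `raw` and the bad value "n/a" are made up for
illustration; they only stand in for cdmdata2, whose actual contents are
not shown in this thread.

```r
# Hypothetical toy data standing in for cdmdata2; "n/a" is an assumed bad entry.
raw <- data.frame(litrate = c("0.47", "0.499", "n/a", "1"),
                  stringsAsFactors = TRUE)

str(raw$litrate)        # reports a Factor rather than num
is.factor(raw$litrate)  # TRUE

# as.numeric() applied directly to a factor returns the internal level
# codes, not the printed values, so go through as.character() first.
# Entries that are not numbers become NA (normally with a warning).
fixed <- suppressWarnings(as.numeric(as.character(raw$litrate)))

# Show which original entries failed to parse as numbers
bad <- raw$litrate[is.na(fixed)]
print(bad)
```

If `bad` comes back empty on the real data, the column was probably
turned into a factor somewhere else (an explicit factor() call, say)
rather than by an unparseable value at import.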
>> litrate0.47 -2.485e+01 3.190e+01 -0.779 0.4365
>> litrate0.499 -1.657e+01 2.591e+01 -0.639 0.5230
>> litrate0.523 -2.440e+01 3.427e+01 -0.712 0.4769
>> litrate0.528 -9.184e+00 1.379e+01 -0.666 0.5060
>> litrate0.595 -2.309e+01 2.776e+01 -0.832 0.4062
>> litrate0.66 -1.451e+01 2.734e+01 -0.531 0.5961
>> litrate0.675 -1.707e+01 2.813e+01 -0.607 0.5444
>> litrate0.68 -6.346e+00 1.063e+01 -0.597 0.5509
>> litrate0.699 2.717e+00 3.541e+00 0.768 0.4434
>> litrate0.706 -1.960e+01 2.933e+01 -0.668 0.5046
>> litrate0.714 -2.586e+01 4.002e+01 -0.646 0.5186
>> litrate0.736 5.641e+00 1.561e+01 0.361 0.7181
>> litrate0.743 -2.692e+01 4.253e+01 -0.633 0.5273
>> litrate0.762 -2.208e+01 3.100e+01 -0.712 0.4767
>> litrate0.802 -2.325e+01 3.766e+01 -0.617 0.5375
>> litrate0.847 -2.620e+01 3.948e+01 -0.664 0.5075
>> litrate0.86 -3.576e+01 4.950e+01 -0.722 0.4707
>> litrate0.864 -4.482e+01 6.274e+01 -0.714 0.4755
>> litrate0.872 -1.946e+01 2.715e+01 -0.717 0.4739
>> litrate0.877 -2.710e+01 3.702e+01 -0.732 0.4646
>> litrate0.879 -3.460e+01 5.147e+01 -0.672 0.5020
>> litrate0.886 -3.276e+01 4.860e+01 -0.674 0.5008
>> litrate0.889 -4.120e+01 5.755e+01 -0.716 0.4746
>> litrate0.904 -2.282e+01 2.985e+01 -0.764 0.4453
>> litrate0.91 -3.478e+01 5.037e+01 -0.691 0.4904
>> litrate0.923 -1.762e+01 2.551e+01 -0.691 0.4902
>> litrate0.925 -2.445e+01 3.611e+01 -0.677 0.4990
>> litrate0.926 -2.995e+01 4.565e+01 -0.656 0.5123
>> litrate0.928 -2.839e+01 3.933e+01 -0.722 0.4710
>> litrate0.937 -2.571e+01 3.795e+01 -0.677 0.4986
>> litrate0.94 -2.109e+01 3.051e+01 -0.691 0.4900
>> litrate0.959 -2.078e+01 2.895e+01 -0.718 0.4735
>> litrate0.96 -3.403e+01 4.798e+01 -0.709 0.4787
>> litrate0.962 -4.084e+01 5.755e+01 -0.710 0.4785
>> litrate0.971 -3.743e+01 5.247e+01 -0.713 0.4761
>> litrate0.98 -3.709e+01 5.170e+01 -0.717 0.4737
>> litrate0.986 -2.663e+01 4.437e+01 -0.600 0.5488
>> litrate0.991 -3.045e+01 4.166e+01 -0.731 0.4654
>> litrate1 -2.732e+01 4.459e+01 -0.613 0.5405
>> africa NA NA NA NA
>> imr 2.160e+00 9.357e-01 2.309 0.0216 *
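
A likely reason it is africa that ends up NA, assuming each country
carries its own litrate value: once litrate is a factor, its dummy
columns identify the countries, and a country-level 0/1 indicator such
as africa is then an exact linear combination of those dummies. The
sketch below uses invented data (four hypothetical countries, two rows
each) just to reproduce the pattern; it is not your data.

```r
# Hypothetical toy data: four countries, two observations each.
# Each country has its own literacy value; africa is constant per country.
d <- data.frame(
  litrate = factor(rep(c("0.47", "0.66", "0.91", "1"), each = 2)),
  africa  = rep(c(1, 1, 0, 0), each = 2),
  y       = c(1.2, 0.8, 2.1, 1.9, 3.3, 2.7, 4.1, 3.9)
)

# With litrate as a factor, africa = intercept - litrate0.91 - litrate1,
# so its column is redundant and lm() reports its coefficient as NA.
fit_factor <- lm(y ~ litrate + africa, data = d)
print(coef(fit_factor))

# After converting litrate to numeric, it enters as a single column
# and africa can be estimated.
d$litrate_num <- as.numeric(as.character(d$litrate))
fit_numeric <- lm(y ~ litrate_num + africa, data = d)
print(coef(fit_numeric))
```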
>
> although it should result in something similar to this:
>
>
> Coefficients: (1 not defined because of singularities)
>> Estimate Std. Error t value Pr(>|t|)
>> (Intercept) 1.216e+01 5.771e+01 0.211 0.8332
>> factor(year)2006 -1.403e+00 5.777e-01 -2.429 0.0157 *
>> factor(year)2007 -2.799e-01 7.901e-01 -0.354 0.7234
>> log(gdppcpppconst) 2.762e-01 5.517e+00 0.050 0.9601
>> log(gdppcpppconstAII) -1.344e-01 9.025e-01 -0.149 0.8817
>> log(co2eemisspc) 5.655e+00 2.903e+00 1.948 0.0523 .
>> log(co2eemisspcAII) -1.411e-01 4.245e-01 -0.332 0.7399
>> log(dist) -2.938e-01 4.023e-01 -0.730 0.4658
>> fdiboth 1.326e-04 1.133e-04 1.171 0.2425
>> odapartnertohost 2.319e-03 1.437e-03 1.613 0.1078
>> corrupt 1.875e+00 3.313e+00 0.566 0.5718
>> log(infraindex) 4.783e+00 1.091e+01 0.438 0.6615
>> litrate -2.485e+01 3.190e+01 -0.779 0.4365
>> africa -2.732e+01 4.459e+01 -0.613 0.5405
>> imr 2.160e+00 9.357e-01 2.309 0.0216 *
>
> In fact, if I don't use the litrate variable, the regression runs
> just fine. If I use the variable in a different regression, it also
> works fine. I just can't find the point where it turns ugly.
>
> I tested the litrate variable for everything I know to test for: the
> structure is numerical and it does not contain any missings. It has
> the same length as every other variable in the set and is a continuous
> variable with values between 0 and 1.
>
> Does anyone have an idea?
David Winsemius, MD
West Hartford, CT