[R] question regarding logit regression using glm
Spencer Graves
spencer.graves at pdf.com
Mon Aug 8 00:54:20 CEST 2005
The "problem" is that with 40 parameters, glm is able to get a
perfect fit for at least some of the observations. To achieve this, the
fitting algorithm sends selected parameters toward +/-Inf. Of course, it
quits before it gets to Inf, but most of your parameter estimates
exceeded 1e13 in absolute value.
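
A minimal sketch of that mechanism, using made-up data rather than the lease file: the data frame `toy` and its columns `g` and `y` are hypothetical, and the fit is unweighted, so the numbers are far smaller than in the output quoted below. Level "b" has only successes, so its log-odds has no finite maximum-likelihood estimate and the coefficient keeps growing until the iterations stop.

```r
# Made-up data: every observation in group "b" is a success.
toy <- data.frame(
  g = factor(rep(c("a", "b", "c"), each = 20)),
  y = c(rep(c(0, 1), 10),          # group a: 10 successes out of 20
        rep(1, 20),                # group b: 20 successes out of 20
        rep(c(0, 0, 1, 0), 5))     # group c: 5 successes out of 20
)

# Warnings are suppressed only to keep the sketch quiet.
fit <- suppressWarnings(
  glm(y ~ g, family = binomial(link = "logit"), data = toy)
)

# The estimate and standard error for gb are both very large.
print(coef(summary(fit))[, c("Estimate", "Std. Error")])

# Cross-tabulating factor by response shows which level has an empty cell.
print(table(toy$g, toy$y))
```

On the lease data the analogous check would presumably be table(Lease$MSA, Lease$ET); MSA levels with an empty cell are candidates for the runaway coefficients.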
What do you want? Do you really need MSA to be a factor, requiring
you to estimate 39 parameters for MSA? Does it make sense to
parameterize it some other way, like latitude and longitude? You could
fit a polynomial in lat + lon and gain substantial insight, I suspect,
that you can't get from the factor coefficients.
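
A rough sketch of the polynomial idea, under loud assumptions: the lease data shown below has no latitude or longitude columns, so `lat`, `lon`, the data frame `d` and the simulated response here are all invented for illustration. A second-degree surface in two variables needs 5 coefficients plus an intercept, compared with 39 for the MSA factor.

```r
# Hypothetical data: 'lat' and 'lon' are NOT columns in the original post.
set.seed(42)
n <- 500
d <- data.frame(lat = runif(n, 25, 48), lon = runif(n, -123, -71))

# Simulated response whose log-odds vary smoothly with location.
eta <- -0.5 + 0.05 * (d$lat - 37) - 0.02 * (d$lon + 97)
d$ET <- rbinom(n, 1, plogis(eta))

# Second-degree polynomial surface in lat and lon.
fit2 <- glm(ET ~ poly(lat, lon, degree = 2),
            family = binomial(link = "logit"), data = d)

print(length(coef(fit2)))  # intercept + 5 polynomial terms
print(fit2$converged)
```

Whether a smooth surface is adequate for real MSA effects is a modelling question; the sketch only shows the syntax and the reduction in parameter count.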
spencer graves
Haibo Huang wrote:
> I got the following warning messages when I did a
> binomial logit regression using glm():
>
> Warning messages:
> 1: Algorithm did not converge in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,
> 2: fitted probabilities numerically 0 or 1 occurred in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,
>
> Can someone share thoughts on how to solve this
> problem? Please read the following for details. Thank
> you very much!
>
> Best,
> Ed
>
>
>
> > Lease=read.csv("lease.csv", header=TRUE)
> > Lease$ET = factor(Lease$EarlyTermination)
> > SICCode=factor(Lease$SIC.Code)
> > TO=factor(Lease$TenantHasOption)
> > LO=factor(Lease$LandlordHasOption)
> > TEO=factor(Lease$TenantExercisedOption)
> >
> > RegA=glm(ET~1+MSA,
> + family=binomial(link=logit), data=Lease, weights=Origil.SQFT)
> Warning messages:
> 1: Algorithm did not converge in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,
> 2: fitted probabilities numerically 0 or 1 occurred in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,
>
> > summary(RegA)
>
> Call:
> glm(formula = ET ~ 1 + MSA, family = binomial(link = logit),
>     data = Lease, weights = Origil.SQFT)
>
> Deviance Residuals:
>        Min          1Q      Median          3Q         Max
> -6.038e+03  -2.066e-06   0.000e+00   0.000e+00   6.720e+03
>
> Coefficients:
>                       Estimate  Std. Error     z value  Pr(>|z|)
> (Intercept)          5.711e+00   8.466e-02   6.745e+01    <2e-16 ***
> MSAAnchorage        -6.493e+00   8.541e-02  -7.602e+01    <2e-16 ***
> MSAAtlanta           6.894e+14   2.310e+04   2.985e+10    <2e-16 ***
> MSAAustin           -9.362e+14   4.954e+04  -1.890e+10    <2e-16 ***
> MSABoston           -2.474e+15   2.151e+04  -1.150e+11    <2e-16 ***
> MSACharlotte        -2.150e+15   7.265e+04  -2.960e+10    <2e-16 ***
> MSAChicago          -1.174e+15   2.057e+04  -5.707e+10    <2e-16 ***
> MSACleveland        -7.607e+14   7.046e+04  -1.080e+10    <2e-16 ***
> MSAColumbus         -2.768e+15   1.685e+05  -1.642e+10    <2e-16 ***
> MSADallas            2.061e+14   3.261e+04   6.321e+09    <2e-16 ***
> MSADenver            5.470e+14   3.366e+04   1.625e+10    <2e-16 ***
> MSAEast Bay         -6.191e+01   1.344e+05   -4.61e-04         1
> MSAFt. Worth        -6.565e+00   8.483e-02  -7.739e+01    <2e-16 ***
> MSAHouston          -2.735e+15   3.576e+04  -7.648e+10    <2e-16 ***
> MSAIndianapolis     -7.483e+14   6.588e+04  -1.136e+10    <2e-16 ***
> MSALos Angeles      -1.388e+15   2.887e+04  -4.809e+10    <2e-16 ***
> MSAMinneapolis      -1.011e+15   2.731e+04  -3.702e+10    <2e-16 ***
> MSANashville         2.143e+01   9.395e+04    2.28e-04         1
> MSANew Orleans      -3.370e+15   5.038e+04  -6.689e+10    <2e-16 ***
> MSANew York         -2.526e+15   2.969e+04  -8.507e+10    <2e-16 ***
> MSANorfolk          -5.614e+01   2.020e+06   -2.78e-05         1
> MSAOakland-East Bay -2.272e+15   3.642e+04  -6.239e+10    <2e-16 ***
> MSAOrange County    -5.165e+14   2.428e+04  -2.128e+10    <2e-16 ***
> MSAOrlando          -3.215e+15   1.096e+05  -2.933e+10    <2e-16 ***
> MSAPhiladelphia     -8.871e+14   4.948e+04  -1.793e+10    <2e-16 ***
> MSAPhoenix          -1.156e+01   8.807e-02  -1.313e+02    <2e-16 ***
> MSAPortland          7.604e+14   3.841e+04   1.980e+10    <2e-16 ***
> MSARaleigh-Durham   -4.312e+01   1.294e+05   -3.33e-04         1
> MSARiverside         1.626e+15   4.645e+05   3.500e+09    <2e-16 ***
> MSASacramento       -9.873e+14   5.345e+04  -1.847e+10    <2e-16 ***
> MSASalt Lake City    1.793e+15   2.029e+05   8.839e+09    <2e-16 ***
> MSASan Antonio       9.451e+14   9.473e+04   9.977e+09    <2e-16 ***
> MSASan Diego        -3.740e+15   6.651e+04  -5.623e+10    <2e-16 ***
> MSASan Francisco     3.109e+14   2.394e+04   1.299e+10    <2e-16 ***
> MSASan Jose          7.392e+14   2.961e+04   2.497e+10    <2e-16 ***
> MSASeattle          -2.250e+15   1.581e+04  -1.423e+11    <2e-16 ***
> MSASt. Louis        -2.606e+15   1.801e+05  -1.447e+10    <2e-16 ***
> MSAStamford         -6.592e+00   8.469e-02  -7.784e+01    <2e-16 ***
> MSAWashington DC     8.460e+13   3.319e+04   2.549e+09    <2e-16 ***
> MSAWest Palm Beach  -3.924e+01   2.308e+05   -1.70e-04         1
> ---
> Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> (Dispersion parameter for binomial family taken to be 1)
>
>     Null deviance:  123111026 on 9302 degrees of freedom
> Residual deviance: 3028559052 on 9263 degrees of freedom
> AIC: 3028559132
>
> Number of Fisher Scoring iterations: 25
>
>
> > anova(RegA)
>
> Analysis of Deviance Table
>
> Model: binomial, link: logit
>
> Response: ET
>
> Terms added sequentially (first to last)
>
>       Df  Deviance  Resid. Df  Resid. Dev
> NULL                     9302   123111026
> MSA   39         0       9263  3028559052
>
>
--
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA
spencer.graves at pdf.com
www.pdf.com <http://www.pdf.com>
Tel: 408-938-4420
Fax: 408-280-7915