[R] singular information matrix in lrm.fit

Prof Brian Ripley ripley at stats.ox.ac.uk
Sun Oct 12 08:16:19 CEST 2008


I believe lrm has a criterion appropriate to single-precision calculations 
(as S-PLUS used to use).  Try reducing 'tol' from its default of 1e-7.

But your design matrix *is* near singular:

> kappa(cbind(1,x))
[1] 557390.5

so try centring/scaling your variables.
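
A minimal sketch of the effect of centring and scaling, using synthetic data as a stand-in for the poster's file (the matrix below, its dimensions, and its mean and spread are all assumptions made for illustration). Covariates with a large mean and a small spread are nearly proportional to the intercept column, which inflates the condition number; scale() removes that.

```r
# Synthetic covariates (hypothetical, not the poster's data):
# large mean, small spread, so each column is nearly a multiple of the intercept.
set.seed(1)
n <- 28
x <- matrix(rnorm(n * 3, mean = 1000, sd = 5), n, 3,
            dimnames = list(NULL, c("X1", "X2", "X3")))

# Condition number of the design matrix, intercept included.
k_raw <- kappa(cbind(1, x), exact = TRUE)

# Same check after centring each column to mean 0 and scaling to sd 1.
k_scaled <- kappa(cbind(1, scale(x)), exact = TRUE)

print(c(raw = k_raw, scaled = k_scaled))
```

The fitted probabilities are unchanged by this reparametrisation; only the coefficients and the numerical conditioning differ.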

On Sun, 12 Oct 2008, Gad Abraham wrote:

> Hi,
>
> I'm trying to do binary logistic regression on 10 covariables, comparing glm 
> to lrm from Harrell's Design package. They don't seem to agree on whether the 
> data is collinear:
>
>> library(Design)
>> load(url("http://www.csse.unimelb.edu.au/~gabraham/data.Rdata"))
>> lrm(y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10, data=x)
> singular information matrix in lrm.fit (rank= 10 ).  Offending variable(s):
> X10
> Error in j:(j + params[i] - 1) : NA/NaN argument
>
> If I understand correctly, lrm is complaining about collinearity in the data.

Not quite: it is complaining about singularity in a weighted covariance 
matrix of the inputs.
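
As an illustration of how such a weighted matrix can become singular even when the design matrix itself is well conditioned, here is a sketch under stated assumptions. It uses the standard logistic-regression information matrix X'WX with W = diag(p(1-p)); the design, the slope of 200, and the helper name info are hypothetical, and whether this is the exact mechanism in the poster's fit is not established here.

```r
# Hypothetical design: intercept plus one covariate on a fixed grid.
x1 <- seq(-2.65, 2.75, by = 0.2)
X  <- cbind(1, x1)

# Information matrix X' W X for logistic regression, W = diag(p * (1 - p)).
info <- function(p) crossprod(X, X * (p * (1 - p)))

# Moderate fit: all fitted probabilities equal to 0.5.
p_mid <- rep(0.5, length(x1))

# Near-separated fit: a very large slope pushes almost all fitted
# probabilities to 0 or 1, so almost all weights vanish.
p_sep <- plogis(200 * x1)

# Reciprocal condition numbers: values near 0 mean numerically singular.
rc_mid <- rcond(info(p_mid))
rc_sep <- rcond(info(p_sep))

print(c(moderate = rc_mid, separated = rc_sep))
```

With near-separated probabilities, effectively only the observation closest to the boundary carries weight, so the weighted matrix is close to rank one.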

> However, the rank of the matrix is 10:
>> qr(x)$rank
> [1] 10

You have forgotten about the intercept.
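
A small sketch of why the intercept matters for the rank check. The data are made up for illustration: three columns that sum to 100 in every row are linearly independent of one another, yet become rank deficient once a column of ones is added. The poster's data are near singular rather than exactly singular, so this is an exaggerated version of the same effect.

```r
# Hypothetical covariates whose rows each sum to 100.
x1 <- c(10, 20, 30, 40, 25)
x2 <- c(50, 30, 20, 10, 35)
x3 <- 100 - x1 - x2
x  <- cbind(X1 = x1, X2 = x2, X3 = x3)

# Rank of the covariates alone: full column rank.
rank_x <- qr(x)$rank

# Rank of the design matrix actually fitted, intercept included:
# the column of ones equals (X1 + X2 + X3) / 100, so one column is redundant.
design <- cbind(1, x)
rank_design <- qr(design)$rank

print(c(covariates = rank_x, with_intercept = rank_design,
        columns = ncol(design)))
```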

> glm doesn't seem to care about the supposed collinearity, but does say that 
> the data are perfectly separable:
>
>> glm(y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10, data=x,
> +    family=binomial(), control=glm.control(maxit=50))
>
> Call:  glm(formula = y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 +    X10, 
> family = binomial(), data = x, control = glm.control(maxit = 50))
>
> Coefficients:
> (Intercept)           X1           X2           X3           X4           X5
>  -6.921e+03    7.185e-02    4.344e-02   -3.980e-02   -5.362e-02   -6.387e-03
>          X6           X7           X8           X9          X10
>   2.455e-01    2.753e-02   -1.848e-01    1.903e-01   -3.187e-02
>
> Degrees of Freedom: 27 Total (i.e. Null);  17 Residual
> Null Deviance:      38.82
> Residual Deviance: 4.266e-10    AIC: 22
> Warning message:
> In glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,  :
>  fitted probabilities numerically 0 or 1 occurred
>
>
> What's the reason for this discrepancy?
>
> Thanks,
> Gad
>
>
> -- 
> Gad Abraham
> Dept. CSSE and NICTA
> The University of Melbourne
> Parkville 3010, Victoria, Australia
> email: gabraham at csse.unimelb.edu.au
> web: http://www.csse.unimelb.edu.au/~gabraham

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


