[R] Error in lm() with very small (close to zero) regressor

William Dunlap wdunlap at tibco.com
Tue Mar 31 18:31:19 CEST 2015


If you really want your coefficient estimates to be scale-equivariant, you
should test the candidate methods for that property.  E.g., here are functions
that let you check how scaling one predictor affects the estimated
coefficients: after undoing the scaling, they should give the same results for
any scale factor.

library(RcppEigen)   # provides fastLm(), used by g() and h() below

## f: lm(); g: fastLm() method 4 (Jacobi SVD); h: fastLm() method 5
## (eigendecomposition of X'X).  Each fits Y ~ X1 + X2/scale and then
## rescales the X2 coefficient, so the results should not depend on 'scale'.
f <- function (scale = 1, n = 100,
               data = data.frame(Y = seq_len(n), X1 = sqrt(seq_len(n)),
                                 X2 = log(seq_len(n))))
{
    cf <- coef(lm(data = data, Y ~ X1 + I(X2/scale)))
    cf * c(1, 1, 1/scale)
}
g <- function (scale = 1, n = 100,
               data = data.frame(Y = seq_len(n), X1 = sqrt(seq_len(n)),
                                 X2 = log(seq_len(n))))
{
    cf <- coef(fastLm(data = data, Y ~ X1 + I(X2/scale), method = 4))
    cf * c(1, 1, 1/scale)
}
h <- function (scale = 1, n = 100,
               data = data.frame(Y = seq_len(n), X1 = sqrt(seq_len(n)),
                                 X2 = log(seq_len(n))))
{
    cf <- coef(fastLm(data = data, Y ~ X1 + I(X2/scale), method = 5))
    cf * c(1, 1, 1/scale)
}

See how they compare for scale factors between 10^-15 and 10^15.  lm() is
looking pretty good.
> options(digits=4)
> scale <- 10 ^ seq(-15,15,by=5)
> sapply(scale, f)
               [,1]    [,2]    [,3]    [,4]    [,5]    [,6]    [,7]
(Intercept)  -9.393  -9.393  -9.393  -9.393  -9.393  -9.393  -9.393
X1           19.955  19.955  19.955  19.955  19.955  19.955  19.955
I(X2/scale) -20.372 -20.372 -20.372 -20.372 -20.372 -20.372 -20.372
> sapply(scale, g)
                 [,1]    [,2]    [,3]    [,4]    [,5]    [,6]       [,7]
(Intercept) 0.000e+00  -9.393  -9.393  -9.393  -9.393  -9.393 -3.126e+01
X1          2.772e-29  19.955  19.955  19.955  19.955  19.955  1.218e+01
I(X2/scale) 1.474e+01 -20.372 -20.372 -20.372 -20.372 -20.372 -2.892e-29
> sapply(scale, h)
                 [,1]      [,2]    [,3]    [,4]    [,5]       [,6]       [,7]
(Intercept) 0.000e+00 3.807e-20  -9.395  -9.393  -9.393 -3.126e+01 -3.126e+01
X1          2.945e-29 2.772e-19  19.954  19.955  19.955  1.218e+01  1.218e+01
I(X2/scale) 1.474e+01 1.474e+01 -20.369 -20.372 -20.372 -2.892e-19  6.596e-30
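
If the goal is simply to get lm() to cope with the nearly-constant, tiny
regressor from the original question, one workaround along the same lines
as f() above is to rescale that column to order 1 before fitting and
back-transform the coefficient afterwards.  A minimal sketch, using the
simulated data from the quoted message below (the set.seed() call and the
factor 1e14 are my additions, chosen only for reproducibility and
convenience):

set.seed(1)
n_obs <- 1500
y  <- rnorm(n_obs, 10, 2.89)
x1 <- rnorm(n_obs, 1.235657e-14, 4.5e-17)   # regressor of order 1e-14
x2 <- rnorm(n_obs, 10, 3.21)

fit <- lm(y ~ I(x1 * 1e14) + x2)     # rescale x1 to order 1 inside the formula
cf  <- coef(fit) * c(1, 1e14, 1)     # put the x1 coefficient back on its scale
cf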



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, Mar 31, 2015 at 5:10 AM, RiGui <raluca.gui at business.uzh.ch> wrote:

> I found a fix to my problem using fastLm() from the package RcppEigen,
> with either the Jacobi singular value decomposition (SVD) (method 4) or a
> method based on the eigenvalue-eigenvector decomposition of X'X (method 5
> of the fastLm function).
>
> install.packages("RcppEigen")
> library(RcppEigen)
>
> n_obs <- 1500
> y  <- rnorm(n_obs, 10,2.89)
> x1 <- rnorm(n_obs, 0.00000000000001235657,0.000000000000000045)
> x2 <- rnorm(n_obs, 10,3.21)
> X  <- cbind(x1,x2)
>
> bFE <- fastLm(y ~ x1 + x2, method =4)
> bFE
>
> Call:
> fastLm.formula(formula = y ~ x1 + x2, method = 4)
>
> Coefficients:
>         (Intercept)                  x1                  x2
> 9.94832839474159414 0.00000000000012293 0.00440078989949841
>
>
> Best,
>
> Raluca
>
