[R] Error in lm() with very small (close to zero) regressor

peter dalgaard pdalgd at gmail.com
Sun Mar 29 20:31:14 CEST 2015


> On 28 Mar 2015, at 18:52 , RiGui <raluca.gui at business.uzh.ch> wrote:
> 
> Thank you for your replies! 
> 
> I am terribly sorry that the code was not reproducible. This is the first
> time I am posting here. I ran the code several times before I posted, but I
> forgot about the library used.
> 
> To answer your questions:
> 
> How do you know this answer is "correct"? 
> 
> What I am doing is actually a "fixed effect" estimation. I apply a
> projection matrix to the data, both dependent and independent variables.
> This projection makes the regressors that do not vary essentially equal to
> zero (the x1 from the post).
> 
> Once I apply the projection, I need to run OLS to get the estimates, so x1
> should be zero. 

Please rethink: if a regressor is very small, its regression coefficient will be very large; if it is small and random, the OLS estimates will be highly variable.
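
To illustrate the first point, here is a minimal sketch on simulated data (the seed, sample size, and `scale_factor` are arbitrary assumptions, not values from the original post). Multiplying a regressor by a small constant leaves the fit unchanged but divides both the coefficient and its standard error by that constant:

```r
set.seed(1)                       # arbitrary seed, only for reproducibility
n <- 200
x <- rnorm(n, 10, 3)
y <- 2 + 0.5 * x + rnorm(n)

scale_factor <- 1e-6              # hypothetical rescaling, e.g. a change of units
x_small <- x * scale_factor

fit_orig  <- lm(y ~ x)
fit_small <- lm(y ~ x_small)

# Ratio of the slopes and of their standard errors; both equal 1 / scale_factor
slope_ratio <- unname(coef(fit_small)["x_small"] / coef(fit_orig)["x"])
se_ratio <- summary(fit_small)$coefficients["x_small", "Std. Error"] /
            summary(fit_orig)$coefficients["x", "Std. Error"]

# The fitted values do not change at all
same_fit <- isTRUE(all.equal(unname(fitted(fit_orig)), unname(fitted(fit_small))))
```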

R has no way of knowing that a regressor with small values isn't what the user intended (e.g. it could be picoMolar concentrations stated in Molar units); if you want a mechanism that eliminates near-zero regressors you need to do it explicitly. 
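
One possible way to do this explicitly is sketched below. The helper `drop_near_zero()` and its absolute threshold `tol` are hypothetical; they are not part of R or of any package, and a sensible threshold depends on the units of your data:

```r
# Hypothetical helper: keep only columns whose largest absolute value exceeds tol.
# The default tol is an arbitrary assumption and must be chosen for the data at hand.
drop_near_zero <- function(X, tol = 1e-10) {
  keep <- apply(abs(X), 2, max) > tol
  X[, keep, drop = FALSE]
}

set.seed(2)                               # arbitrary seed
n <- 100
X <- cbind(x1 = rnorm(n, 0, 1e-15),       # simulated "numerically zero" regressor
           x2 = rnorm(n, 10, 3))
y <- rnorm(n, 10, 3)

X_kept <- drop_near_zero(X)               # x1 is removed, x2 is kept
fit <- lm(y ~ X_kept - 1)                 # no-intercept fit on the remaining column
```

With this rule a regressor measured in very small units would be dropped too, which is exactly why the choice has to be made by the user and not by lm().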

> Therefore, the results with the scaled regressor are not correct. 
> 
> Besides, I do not see why the bOLS is wrong, since it is the formula of the
> OLS estimator from any Econometrics book.

Textbooks often gloss over details like numerical stability (and in general, textbooks often use slightly oversimplified methods in order not to confuse students unnecessarily). 
Better books will give the (X'X)^-1 X'Y formula with a warning not to use it as is, but to use, e.g., the X=QR decomposition instead [which gives (R'Q'QR)^-1 R'Q'Y = (R'R)^-1 R'Q'Y = R^-1 Q'Y, because Q'Q = I].
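
As a rough sketch of the two routes in base R, on simulated, well-conditioned data (the seed and the design matrix are assumptions for illustration only):

```r
set.seed(3)                                   # arbitrary seed
n <- 100
X <- cbind(1, rnorm(n, 10, 3), rnorm(n, 5, 2))
y <- rnorm(n, 10, 3)

# Textbook formula: solve the normal equations (X'X) b = X'y
b_normal <- solve(crossprod(X), crossprod(X, y))

# QR route: X = QR, so b = R^-1 Q'y, obtained by back-substitution
qrX  <- qr(X)
b_qr <- backsolve(qr.R(qrX), crossprod(qr.Q(qrX), y))

# lm() also works from a QR decomposition of the design matrix
b_lm <- coef(lm(y ~ X - 1))
```

On data like these the three results agree to rounding error; the difference shows up when X'X is badly conditioned, since forming X'X squares the condition number of X. In practice qr.coef(qrX, y) or qr.solve(X, y) does the QR step in one call.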


> Here again is the corrected code: 
> 
> install.packages("corpcor")
> library(corpcor)
> 
> n_obs <- 1000
> y  <- rnorm(n_obs, 10,2.89)
> x1 <- rnorm(n_obs, 0.00000000000001235657,0.000000000000000045)
> x2 <- rnorm(n_obs, 10,3.21)
> X  <- cbind(x1,x2)
> 
> bFE <- lm(y ~ x1 + x2)
> bFE
> 
> bOLS <- pseudoinverse(t(X) %*% X) %*% t(X) %*% y
> bOLS
> 

Note again that these are not comparable: bFE includes an intercept term and bOLS does not. You need to compare with

y ~ x1 + x2 - 1

and 

y ~ x2 - 1
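
A minimal sketch of such a like-for-like comparison for the second model, on freshly simulated data (the seed is arbitrary, and base R's solve() stands in for corpcor's pseudoinverse() so that the example has no package dependency):

```r
set.seed(4)                          # arbitrary seed
n_obs <- 1000
y  <- rnorm(n_obs, 10, 2.89)
x2 <- rnorm(n_obs, 10, 3.21)

# No-intercept fit via lm()
b_lm <- coef(lm(y ~ x2 - 1))

# Same model via the textbook formula; X has no column of ones
X <- cbind(x2)
b_formula <- solve(crossprod(X), crossprod(X, y))

# TRUE when both approaches give the same estimate
agree <- isTRUE(all.equal(unname(b_lm), as.vector(b_formula)))
```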


> 
> Best,
> 
> Raluca Gui 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


