[R] naive "collinear" weighted linear regression

David Winsemius dwinsemius at comcast.net
Thu Nov 12 04:06:13 CET 2009


On Nov 11, 2009, at 7:45 PM, Mauricio Calvao wrote:

> Hi there
>
> Sorry for what may be a naive or dumb question.
>
> I have the following data:
>
> > x <- c(1,2,3,4) # predictor vector
>
> > y <- c(2,4,6,8) # response vector. Notice that it is an exact,  
> perfect straight line through the origin and slope equal to 2
>
> > error <- c(0.3,0.3,0.3,0.3) # I have (equal) ``errors'', for  
> instance, in the measured responses

Which means those x, y, and "error" figures did not come from an  
experiment, but rather from theory???

>
> Of course the best fit coefficients should be 0 for the intercept  
> and 2 for the slope. Furthermore, it seems completely plausible (or  
> not?) that, since the y_i have associated non-vanishing  
> ``errors'' (dispersions), there should be corresponding non- 
> vanishing ``errors'' associated to the best fit coefficients, right?
>
> When I try:
>
> > fit_mod <- lm(y~x,weights=1/error^2)
>
> I get
>
> Warning message:
> In lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
>  extra arguments weigths are just disregarded.

  (Actually the weights are for adjusting for sampling, and I do not  
see any sampling in your "design".)

>
> Keeping on, despite the warning message, which I did not quite  
> understand, when I type:
>
> > summary(fit_mod)
>
> I get
>
> Call:
> lm(formula = y ~ x, weigths = 1/error^2)
>
> Residuals:
>         1          2          3          4
> -5.067e-17  8.445e-17 -1.689e-17 -1.689e-17
>
> Coefficients:
>             Estimate Std. Error   t value Pr(>|t|)
> (Intercept) 0.000e+00  8.776e-17 0.000e+00        1
> x           2.000e+00  3.205e-17 6.241e+16   <2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 7.166e-17 on 2 degrees of freedom
> Multiple R-squared:     1,      Adjusted R-squared:     1
> F-statistic: 3.895e+33 on 1 and 2 DF,  p-value: < 2.2e-16
>
>
> Naively, should not the column Std. Error be different from zero??  
> What I have in mind, and sure is not what Std. Error means, is that  
> if I carried out a large simulation, assuming each response y_i a  
> Gaussian random variable with mean y_i and standard deviation  
> 2*error=0.6, and then making an ordinary least squares fitting of  
> the slope and intercept, I would end up with a mean for these  
> simulated coefficients which should be 2 and 0, respectively,

Well, not precisely 2 and 0, but rather something very close ... i.e,  
within "experimental error". Please note that numbers in the range of  
10e-17 are effectively zero from a numerical analysis perspective.

http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

 > .Machine$double.eps ^ 0.5
[1] 1.490116e-08

> and, that's the point, a non-vanishing standard deviation for these  
> fitted coefficients, right?? This somehow is what I expected should  
> be an estimate or, at least, a good indicator, of the degree of  
> uncertainty which I should assign to the fitted coefficients; it  
> seems to me these deviations, thus calculated as a result of the  
> simulation, will certainly not be zero (or  3e-17, for that matter).  
> So this Std. Error does not provide what I, naively, think should be  
> given as a measure of the uncertainties or errors in the fitted  
> coefficients...

You are trying to impose an error structure on a data situation that  
you constructed artificially to be perfect.

>
> What am I not getting right??

That if you input "perfection" into R's linear regression program, you  
get appropriate warnings?

>
> Thanks and sorry for the naive and non-expert question!

You are a Professor of physics, right? You do experiments, right? You  
replicate them.  S0 perhaps I'm the one who should be puzzled.

> -- 
> #######################################
> Prof. Mauricio Ortiz Calvao
> Federal University of Rio de Janeiro
> Institute of Physics, P O Box 68528
> CEP 21941-972 Rio de Janeiro, RJ
> Brazil
-- 

David Winsemius, MD
Heritage Laboratories
West Hartford, CT




More information about the R-help mailing list