[R] naive "collinear" weighted linear regression

Mauricio Calvao orca at if.ufrj.br
Thu Nov 12 01:45:46 CET 2009


Hi there

Sorry for what may be a naive or dumb question.

I have the following data:

 > x <- c(1,2,3,4) # predictor vector

 > y <- c(2,4,6,8) # response vector. Notice that it is an exact, 
perfect straight line through the origin and slope equal to 2

 > error <- c(0.3,0.3,0.3,0.3) # I have (equal) ``errors'', for 
instance, in the measured responses

Of course the best fit coefficients should be 0 for the intercept and 2 
for the slope. Furthermore, it seems completely plausible (or not?) 
that, since the y_i have associated non-vanishing ``errors'' 
(dispersions), there should be corresponding non-vanishing ``errors'' 
associated to the best fit coefficients, right?

When I try:

 > fit_mod <- lm(y~x,weights=1/error^2)

I get

Warning message:
In lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
   extra arguments weigths are just disregarded.

Keeping on, despite the warning message, which I did not quite 
understand, when I type:

 > summary(fit_mod)

I get

Call:
lm(formula = y ~ x, weigths = 1/error^2)

Residuals:
          1          2          3          4
-5.067e-17  8.445e-17 -1.689e-17 -1.689e-17

Coefficients:
              Estimate Std. Error   t value Pr(>|t|)
(Intercept) 0.000e+00  8.776e-17 0.000e+00        1
x           2.000e+00  3.205e-17 6.241e+16   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 7.166e-17 on 2 degrees of freedom
Multiple R-squared:     1,      Adjusted R-squared:     1
F-statistic: 3.895e+33 on 1 and 2 DF,  p-value: < 2.2e-16


Naively, should not the column Std. Error be different from zero?? What 
I have in mind, and sure is not what Std. Error means, is that if I 
carried out a large simulation, assuming each response y_i a Gaussian 
random variable with mean y_i and standard deviation 2*error=0.6, and 
then making an ordinary least squares fitting of the slope and 
intercept, I would end up with a mean for these simulated coefficients 
which should be 2 and 0, respectively, and, that's the point, a 
non-vanishing standard deviation for these fitted coefficients, right?? 
This somehow is what I expected should be an estimate or, at least, a 
good indicator, of the degree of uncertainty which I should assign to 
the fitted coefficients; it seems to me these deviations, thus 
calculated as a result of the simulation, will certainly not be zero (or 
  3e-17, for that matter). So this Std. Error does not provide what I, 
naively, think should be given as a measure of the uncertainties or 
errors in the fitted coefficients...

What am I not getting right??

Thanks and sorry for the naive and non-expert question!


-- 
#######################################
Prof. Mauricio Ortiz Calvao
Federal University of Rio de Janeiro
Institute of Physics, P O Box 68528
CEP 21941-972 Rio de Janeiro, RJ
Brazil

Email: orca at if.ufrj.br
Phone: (55)(21)25627483
Homepage: http://www.if.ufrj.br/~orca
#######################################




More information about the R-help mailing list