[R] naive "collinear" weighted linear regression
Mauricio Calvao
orca at if.ufrj.br
Thu Nov 12 01:45:46 CET 2009
Hi there
Sorry for what may be a naive or dumb question.
I have the following data:
> x <- c(1,2,3,4) # predictor vector
> y <- c(2,4,6,8) # response vector. Notice that it is an exact,
>                 # perfect straight line through the origin with slope equal to 2
> error <- c(0.3,0.3,0.3,0.3) # I have (equal) ``errors'', for
>                             # instance, in the measured responses
Of course the best-fit coefficients should be 0 for the intercept and 2
for the slope. Furthermore, it seems completely plausible (or not?)
that, since the y_i have associated non-vanishing ``errors''
(dispersions), there should be corresponding non-vanishing ``errors''
associated with the best-fit coefficients, right?
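To make that expectation concrete, here is a minimal sketch of the textbook result for a straight-line fit when the measurement standard deviation is taken as known: the coefficient covariance is sigma^2 * (X'X)^-1. The names `sigma`, `X`, `cov_beta` and `se_known` are illustrative, and identifying `sigma` with the quoted error of 0.3 is an assumption.

```r
# Sketch: coefficient uncertainties when the measurement sd is assumed known.
x <- c(1, 2, 3, 4)
sigma <- 0.3                       # assumption: known sd of each y_i

X <- cbind(1, x)                   # design matrix: intercept column and x
cov_beta <- sigma^2 * solve(t(X) %*% X)

# Standard deviations of the intercept and slope estimates
se_known <- unname(sqrt(diag(cov_beta)))
se_known
```

Under this assumption the uncertainties depend only on x and sigma, not on how closely the observed y values happen to follow a line.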
When I try:
> fit_mod <- lm(y~x,weights=1/error^2)
I get
Warning message:
In lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
extra arguments weigths are just disregarded.
Keeping on, despite the warning message, which I did not quite
understand, when I type:
> summary(fit_mod)
I get
Call:
lm(formula = y ~ x, weigths = 1/error^2)
Residuals:
         1          2          3          4
-5.067e-17  8.445e-17 -1.689e-17 -1.689e-17

Coefficients:
             Estimate Std. Error   t value Pr(>|t|)
(Intercept) 0.000e+00  8.776e-17 0.000e+00        1
x           2.000e+00  3.205e-17 6.241e+16   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.166e-17 on 2 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: 1
F-statistic: 3.895e+33 on 1 and 2 DF, p-value: < 2.2e-16
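A side note on the warning: both the warning text and the echoed Call show the argument spelled `weigths`, which suggests the weights were passed under a misspelled name and ignored. The sketch below repeats the fit with the argument spelled `weights`; `fit_w` and `se_fit` are illustrative names. Because `lm` estimates the residual scale from the residuals themselves and treats the weights only as relative, a perfectly collinear data set should still give standard errors at rounding level.

```r
# Sketch: the same fit with the argument name spelled correctly.
x <- c(1, 2, 3, 4)
y <- c(2, 4, 6, 8)
error <- c(0.3, 0.3, 0.3, 0.3)

fit_w <- lm(y ~ x, weights = 1 / error^2)
coef(fit_w)

# summary() may warn about an essentially perfect fit; the warning is
# suppressed here only to keep the output short.
se_fit <- suppressWarnings(summary(fit_w))$coefficients[, "Std. Error"]
se_fit
```

With equal weights the weighted and unweighted fits coincide, so the spelling fix removes the warning but does not change the reported standard errors in any meaningful way.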
Naively, should the Std. Error column not be different from zero? What
I have in mind, which surely is not what Std. Error means, is that if I
carried out a large simulation, taking each response y_i as a Gaussian
random variable with mean y_i and standard deviation 2*error=0.6, and
then made an ordinary least squares fit of the slope and
intercept, I would end up with means for these simulated coefficients
of 2 and 0, respectively, and, that's the point, a
non-vanishing standard deviation for these fitted coefficients, right?
This somehow is what I expected should be an estimate or, at least, a
good indicator, of the degree of uncertainty which I should assign to
the fitted coefficients; it seems to me these deviations, thus
calculated as a result of the simulation, will certainly not be zero (or
3e-17, for that matter). So this Std. Error does not provide what I,
naively, think should be given as a measure of the uncertainties or
errors in the fitted coefficients...
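The simulation described above could be sketched roughly as follows. The standard deviation 2*error is taken from the text; the seed, the number of replicates `n_sim`, and the names `y_true`, `sd_y` and `sim_coef` are arbitrary choices made for illustration.

```r
# Sketch: Monte Carlo spread of OLS coefficients under Gaussian noise in y.
set.seed(1)                        # arbitrary seed, for reproducibility only
x <- c(1, 2, 3, 4)
y_true <- c(2, 4, 6, 8)            # the exact line y = 2 * x
error <- c(0.3, 0.3, 0.3, 0.3)
sd_y <- 2 * error                  # sd used in the text (0.6)
n_sim <- 5000                      # assumption: number of simulated data sets

# Each replicate draws noisy responses and refits by ordinary least squares.
sim_coef <- t(replicate(n_sim, {
  y_sim <- rnorm(length(x), mean = y_true, sd = sd_y)
  coef(lm(y_sim ~ x))
}))

colMeans(sim_coef)                 # should be close to 0 and 2
apply(sim_coef, 2, sd)             # spread of the fitted coefficients
```

The spread obtained this way reflects the noise level put into the simulation, whereas the Std. Error column from a single fit is computed from that fit's own residuals.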
What am I not getting right??
Thanks and sorry for the naive and non-expert question!
--
#######################################
Prof. Mauricio Ortiz Calvao
Federal University of Rio de Janeiro
Institute of Physics, P O Box 68528
CEP 21941-972 Rio de Janeiro, RJ
Brazil
Email: orca at if.ufrj.br
Phone: (55)(21)25627483
Homepage: http://www.if.ufrj.br/~orca
#######################################