[R] newbie's additional (probably to some extent OT) questions
Thomas W Blackwell
tblackw at umich.edu
Fri Nov 7 00:24:23 CET 2003
JB and Michael -
I'm coming into this without having reviewed the earlier emails
(if there are any) in this thread. But I will guess that the
data come from a high school physics experiment on gravitational
acceleration which drops a weight dragging a paper tape through
a buzzer with a piece of carbon paper in it. This prints periodic
marks on the paper tape. The data x are the distances traveled
at successive time points following time zero.
I think it's DYNAMITE that you're actually doing this data analysis.
It's what I always wanted to do as a high school student, but didn't
have the technical background then to carry out. In fact ... come to
think of it ... I'm pretty sure I STILL HAVE my high school ticker
tapes folded up among my high school papers somewhere, 35 years
later, still waiting to be properly analyzed !
It makes sense to fit a no-intercept model with no linear term
and only a quadratic term. The model formula x ~ 0 + I(t^2)
does this correctly. (If one wanted to account for friction,
the linear term would come back in.)
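
For anyone who wants to try the formula without the original tape, here is a minimal sketch on made-up data. The time grid, the "true" coefficient 0.0394 and the noise level 0.02 are all assumptions chosen only to resemble the output quoted below; they are not JB's measurements. Since x = (1/2) a t^2 for a body starting at rest, the fitted coefficient estimates half the acceleration, in whatever units x and t happen to be recorded.

```r
# Simulated ticker-tape data (hypothetical values, NOT the real tape)
set.seed(1)
t <- seq(0.5, 9.5, by = 0.5)          # 19 assumed time points
k_true <- 0.0394                      # assumed coefficient of t^2
x <- k_true * t^2 + rnorm(length(t), sd = 0.02)

# No intercept, no linear term, quadratic term only.
# I() protects t^2 so that ^ means arithmetic, not formula crossing.
fit0 <- lm(x ~ 0 + I(t^2))

k_hat <- coef(fit0)[["I(t^2)"]]       # estimate of a/2
a_hat <- 2 * k_hat                    # implied acceleration, in the data's units

print(summary(fit0))
print(a_hat)
```

With 19 points and one coefficient, the fit leaves 18 residual degrees of freedom, which matches the "on 18 degrees of freedom" line in the quoted summary.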
Question 1 involves a distinction between the standard deviation
of the residuals and the standard error of an estimate for the
single coefficient in the model. These are not at all the same
concept. The coefficient estimate behaves like a sample average,
and has much smaller sampling variation over repeated experiments
than one observation would. In the no-intercept model, the
standard deviation of the residuals is stated as 0.01945 on 18 df.
In the model WITH an intercept, it is stated as 0.01683 on 17 df.
I do not know 'MuPad', but its Delta^2 of 0.006813 looks to me like
the residual sum of squares from the no-intercept fit, not a variance:
18 * 0.01945^2 is about 0.0068. Dividing by the 18 df and taking the
square root recovers the residual standard deviation, 0.01945.
Naturally, this is much larger than the standard error of an
estimate: 0.0001487 or 0.0005367.
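
The distinction can be checked by hand. The sketch below again uses made-up data (time grid, coefficient and noise level are assumptions, not the real measurements) and computes three different quantities from one no-intercept fit: the residual sum of squares, the residual standard deviation, and the standard error of the coefficient. The last line is only arithmetic on the numbers quoted in this thread, under the assumption that MuPad's Delta^2 is a sum of squared residuals.

```r
# Hypothetical data, for illustration only
set.seed(2)
t <- seq(0.5, 9.5, by = 0.5)
x <- 0.0394 * t^2 + rnorm(length(t), sd = 0.02)
fit0 <- lm(x ~ 0 + I(t^2))

rss      <- sum(residuals(fit0)^2)     # residual sum of squares
df       <- df.residual(fit0)          # n minus number of coefficients
resid_sd <- sqrt(rss / df)             # "Residual standard error" in summary()

# With a single regressor z = t^2 and no intercept,
# SE(coefficient) = resid_sd / sqrt(sum(z^2)) = resid_sd / sqrt(sum(t^4))
se_coef  <- resid_sd / sqrt(sum(t^4))

print(c(rss = rss, resid_sd = resid_sd, se_coef = se_coef))
print(summary(fit0)$sigma)                           # same as resid_sd
print(summary(fit0)$coefficients[1, "Std. Error"])   # same as se_coef

# Arithmetic on the quoted output: 18 df times the squared residual SD
print(18 * 0.01945^2)                  # about 0.0068, close to the quoted 0.006813
```

The coefficient's standard error is the residual standard deviation divided by sqrt(sum(t^4)), which is why it comes out so much smaller.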
Question 2 refers to the estimated value for the intercept in a
model with constant and quadratic terms only (no linear term).
The estimated value is 0.043 +- 0.016 (no units are given).
Gosh, I'm not surprised. The observations and the predictors
are all non-negative. Linear regression produces an unbiased
estimate, given its assumptions, but when there is uncertainty
in the predictors as well, the slope estimate is known to be
biased toward zero.
(Think of the "two regression lines".) If some of that bias
shows up in the intercept, it's no surprise. If this were a
mission-critical data set, I would certainly plot the residuals
against the fitted values and look for empirical evidence to
judge whether the quadratic-only model is adequate.
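
A rough sketch of that residual check, once more on made-up data (the time grid, coefficient and noise level are assumptions, so nothing here says what the real tape will show). It plots residuals against fitted values for the quadratic-only model and then compares that model with the one that adds an intercept.

```r
# Hypothetical data, for illustration only
set.seed(3)
t <- seq(0.5, 9.5, by = 0.5)
x <- 0.0394 * t^2 + rnorm(length(t), sd = 0.02)

fit0 <- lm(x ~ 0 + I(t^2))   # quadratic term only
fit1 <- lm(x ~ I(t^2))       # intercept plus quadratic term

# Residuals against fitted values: look for curvature, a trend,
# or a spread that changes with the fitted value
plot(fitted(fit0), residuals(fit0),
     xlab = "Fitted values", ylab = "Residuals",
     main = "Quadratic-only model")
abline(h = 0, lty = 2)

# F test for whether the intercept improves on the quadratic-only fit
cmp <- anova(fit0, fit1)
print(cmp)
```

A residual cloud with no visible pattern around the dashed line would be consistent with the quadratic-only model; a systematic bend or drift would suggest a missing term.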
HTH - tom blackwell - u michigan medical school - ann arbor -
On Thu, 6 Nov 2003, JB wrote:
> (1)
> So finally, thanks to your help I have this:
>
> summary(lm(x ~ 0+I(t^2)))
>
> And then I get this result:
> =================================================
> Call:
> lm(formula = x ~ 0 + I(t^2))
>
> Residuals:
>        Min         1Q     Median         3Q        Max
> -3.332e-02 -9.362e-03  1.169e-05  1.411e-02  3.459e-02
>
> Coefficients:
>         Estimate Std. Error t value Pr(>|t|)
> I(t^2) 0.0393821  0.0001487   264.8   <2e-16 ***
> ---
> Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> Residual standard error: 0.01945 on 18 degrees of freedom
> Multiple R-Squared: 0.9997, Adjusted R-squared: 0.9997
> F-statistic: 7.014e+04 on 1 and 18 DF, p-value: < 2.2e-16
> =================================================
>
> I see in MuPad, that Delta^2 is 0.006813. Now is not the standard error the
> square root of Delta^2? Should I not get 0.069 as standard error?
>
> (2)
> When I use the model
> summary(lm(x ~ I(t^2)))
> I get (of course) another result with a slightly smaller Delta^2. But I do
> not expect such an error as this would mean that there was a systematic
> error in our measurement of the distance and if I understand the result of
> R correctly, the error was 0.04m which is impossible:
>
> ==========================================================
> Call:
> lm(formula = x ~ I(t^2))
>
> Residuals:
>        Min         1Q     Median         3Q        Max
> -0.0202520 -0.0116533 -0.0006036  0.0036699  0.0432987
>
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept) 0.0427606  0.0161085   2.655   0.0167 *
> I(t^2)      0.0379989  0.0005367  70.801   <2e-16 ***
> ---
> Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1
>
> Residual standard error: 0.01683 on 17 degrees of freedom
> Multiple R-Squared: 0.9966, Adjusted R-squared: 0.9964
> F-statistic: 5013 on 1 and 17 DF, p-value: < 2.2e-16
> =====================================================
>
> What is going on here?
> (Sorry but I am only a high school teacher and have not much idea of
> statistics.)
>
> TIA,
>
> JB
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>