[R] quadratic trends and changes in slopes
kjetil brinchmann halvorsen
kjetil at entelnet.bo
Mon Jan 20 17:43:05 CET 2003
On 20 Jan 2003 at 1:11, Martin Michlmayr wrote:
Hello!
Below is a (lengthy) response, in the form of R code and an analysis of
simulated data.
> # First, simulate some example data. Note that the added slope of 0.4 is
> # anchored at x = 4 but switched on only for x > 5, so the data do not lie
> # exactly on a continuous broken stick:
> x <- 1:9
> t <- 0.5 + 0.1*x + 0.4 * (x - 4) * ifelse(x > 5, 1, 0)
> plot(x, t)
> test <- data.frame(x, t)
> rm(x, t)
> # We can do a non-linear regression to estimate a change point:
> library(nls)  # needed in R 1.x; in current R, nls() is in the standard stats package
> test.nls1 <- nls(t ~ a + b*x + c*(x - change)*I(x > change), data=test,
+                  start=c(a=0, b=0.1, c=0.4, change=5))
> summary(test.nls1)
Formula: t ~ a + b * x + c * (x - change) * I(x > change)
Parameters:
Estimate Std. Error t value Pr(>|t|)
a 0.50000 0.13856 3.608 0.015406
b 0.10000 0.05060 1.976 0.105057
c 0.48000 0.06197 7.746 0.000573
change 4.66667 0.32773 14.239 3.08e-05
Residual standard error: 0.1131 on 5 degrees of freedom
Correlation of Parameter Estimates:
a b c
b -0.9129
c 0.7454 -0.8165
change -0.4894 0.6969 -0.2626
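You can check this fit by overlaying the fitted broken-stick curve on
the data, something like this (untested):

## Sketch: plot the data and the fitted piecewise-linear curve
plot(t ~ x, data = test)
xg <- seq(1, 9, by = 0.1)
lines(xg, predict(test.nls1, newdata = data.frame(x = xg)))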
> # But you should be aware that the standard errors assume the expectation function
> # is differentiable, which is not the case here!
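If you want standard errors that do not lean on that assumption, a
case-resampling bootstrap is one possibility. A rough sketch (untested,
and crude with only nine observations):

## Sketch: bootstrap SE for the change point, resampling cases;
## refits that fail (e.g. too few points past the break) give NA.
set.seed(42)
one.boot <- function() {
  d <- test[sample(nrow(test), replace = TRUE), ]
  fit <- try(nls(t ~ a + b*x + c*(x - change)*I(x > change),
                 data = d, start = coef(test.nls1)),
             silent = TRUE)
  if (inherits(fit, "try-error")) NA else coef(fit)["change"]
}
b <- replicate(500, one.boot())
sd(b, na.rm = TRUE)   # bootstrap standard error of the change point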
> # And so to what you asked: quadratic regressions
> # Fit a quadratic to the growing ranges 1:1, 1:2, ..., 1:9:
> models <- vector(length=9, mode="list")
> for (i in 1:9) {
+ try( models[[i]] <- lm(t ~ x + I(x^2), data=test, subset=1:i) ) }
> lapply(models, summary)
[[1]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
ALL 1 residuals are 0: no residual degrees of freedom!
Coefficients: (2 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.6
Residual standard error: NaN on 0 degrees of freedom
[[2]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
ALL 2 residuals are 0: no residual degrees of freedom!
Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.5
x 0.1
Residual standard error: NaN on 0 degrees of freedom
Multiple R-Squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 1 and 0 DF, p-value: NA
[[3]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
ALL 3 residuals are 0: no residual degrees of freedom!
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.0e-01
x 1.0e-01
I(x^2) 1.7e-16
Residual standard error: NaN on 0 degrees of freedom
Multiple R-Squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 2 and 0 DF, p-value: NA
[[4]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
1 2 3 4
2.069e-17 -6.206e-17 6.206e-17 -2.069e-17
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.000e-01 2.576e-16 1.941e+15 3.28e-16
x 1.000e-01 2.350e-16 4.256e+14 1.50e-15
I(x^2) 4.138e-17 4.626e-17 8.940e-01 0.535
Residual standard error: 9.252e-17 on 1 degrees of freedom
Multiple R-Squared: 1, Adjusted R-squared: 1
F-statistic: 2.921e+30 on 2 and 1 DF, p-value: 4.138e-16
[[5]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
1 2 3 4 5
-4.164e-21 -9.710e-19 2.938e-18 -2.946e-18 9.835e-19
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.000e-01 6.649e-18 7.520e+16 <2e-16
x 1.000e-01 5.067e-18 1.974e+16 <2e-16
I(x^2) -1.437e-18 8.285e-19 -1.735e+00 0.225
Residual standard error: 3.1e-18 on 2 degrees of freedom
Multiple R-Squared: 1, Adjusted R-squared: 1
F-statistic: 5.203e+33 on 2 and 2 DF, p-value: < 2.2e-16
[[6]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
1 2 3 4 5 6
-8.571e-02 8.571e-02 1.143e-01 6.592e-17 -2.571e-01 1.429e-01
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.90000 0.34915 2.578 0.082
x -0.28571 0.22842 -1.251 0.300
I(x^2) 0.07143 0.03194 2.236 0.111
Residual standard error: 0.1952 on 3 degrees of freedom
Multiple R-Squared: 0.8969, Adjusted R-squared: 0.8281
F-statistic: 13.05 on 2 and 3 DF, p-value: 0.03311
[[7]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
         1          2          3          4          5          6          7
-8.571e-02  8.571e-02  1.143e-01 -1.284e-16 -2.571e-01  1.429e-01  1.284e-16
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.90000 0.26342 3.417 0.0269
x -0.28571 0.15096 -1.893 0.1313
I(x^2) 0.07143 0.01844 3.873 0.0179
Residual standard error: 0.169 on 4 degrees of freedom
Multiple R-Squared: 0.9596, Adjusted R-squared: 0.9394
F-statistic: 47.5 on 2 and 4 DF, p-value: 0.001632
[[8]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
       1        2        3        4        5        6        7        8
-0.05000  0.07381  0.07857 -0.03571 -0.26905  0.17857  0.10714 -0.08333
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.79286 0.23186 3.420 0.01885
x -0.20238 0.11821 -1.712 0.14757
I(x^2) 0.05952 0.01282 4.642 0.00562
Residual standard error: 0.1662 on 5 degrees of freedom
Multiple R-Squared: 0.9744, Adjusted R-squared: 0.9642
F-statistic: 95.26 on 2 and 5 DF, p-value: 0.0001046
[[9]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
Min 1Q Median 3Q Max
-0.30476 -0.08571 0.03810 0.06667 0.18095
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.66190 0.22671 2.920 0.02665
x -0.10952 0.10410 -1.052 0.33326
I(x^2) 0.04762 0.01015 4.690 0.00336
Residual standard error: 0.1782 on 6 degrees of freedom
Multiple R-Squared: 0.9787, Adjusted R-squared: 0.9716
F-statistic: 138 on 2 and 6 DF, p-value: 9.622e-06
# With more models than this, you would want to extract only the
# relevant part of the summary for each model!
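For example, something like this (untested) pulls out just the p-value
of the quadratic term, with NA where that term could not be estimated:

## Sketch: p-value of the I(x^2) coefficient from each fitted model
p.quad <- sapply(models, function(m) {
  cf <- summary(m)$coefficients
  if ("I(x^2)" %in% rownames(cf)) cf["I(x^2)", "Pr(>|t|)"] else NA
})
round(p.quad, 4)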
Note that the quadratic term is significant (at the 5% level) only for
the last three models, that is, for the ranges 1-7, 1-8 and 1-9. This
method can only give a significant result for the x^2 term when we use
some data *past* the change point, so, as I see it, it will necessarily
overestimate the change point. You should consider whether the first
method given above is not better!
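The same point can be phrased as a nested model comparison: for each
range, the F-test of the quadratic model against the plain linear one is
equivalent to the t-test on the I(x^2) coefficient. An untested sketch:

## Sketch: compare linear and quadratic fits on each range 1:i
for (i in 4:9) {
  lin  <- lm(t ~ x,          data = test, subset = 1:i)
  quad <- lm(t ~ x + I(x^2), data = test, subset = 1:i)
  cat("range 1-", i, ": p = ", anova(lin, quad)[2, "Pr(>F)"], "\n", sep = "")
}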
Kjetil Halvorsen
> I'd like to use linear and quadratic trend analysis in order to detect
> a change in slope. Basically, I need to solve a problem similar to the
> one discussed in http://www.gseis.ucla.edu/courses/ed230bc1/cnotes4/trend1.html
>
> My subjects have counted dots: one dot, two dots, etc. up to 9 dots.
> The reaction time increases with the number of dots. The theory is that
> 1 up to 3 or 4 dots can be counted relatively fast (a process known
> as "subitizing") but that it becomes slower at around 5 dots ("counting").
> The question is when the slope changes. Most papers in the literature
> determine this by checking when the trend changes from linear to
> quadratic, i.e., deviation from linearity is seen as evidence
> that the second, slower process is used.
>
> I'd like to test the ranges 1-2, 1-3, 1-4, 1-5, 1-6, etc. and see when
> a quadratic trend is significant. However, although I have read some
> literature and done many Google searches, I cannot figure out how to
> do this with R. Can anyone show me a simple example of how to do
> this? (Either with the method described above or with a different
> method -- but please note that I only have 9 data points; 1:9.)
>
> Any help is appreciated.
>
> Thanks.
>
>
> FWIW, here's the description from one paper using this method:
>
> "For both conditions, the subitizing range for each group was established
> using quadratic trend tests on the aggregated RT data for numerosities 1-3,
> 1-4, and so on (Akin and Chase, 1978; Chi and Klahr, 1975; Pylyshyn,
> 1993). For both groups the first appearance of a quadratic trend was in
> the N=1-5 range (t(9) = 7.33, p< .001 for the control group, and t(5) =
> 5.35, p = .005 for the Turner group). This indicates a subitizing range
> of 4 for both groups. This divergence from a linear increase in RT
> suggests the deployment of a new process for the last numerosity added."
>
> --
> Martin Michlmayr
> tbm at cyrius.com