[R] quadratic trends and changes in slopes
kjetil brinchmann halvorsen
kjetil at entelnet.bo
Mon Jan 20 17:43:05 CET 2003
On 20 Jan 2003 at 1:11, Martin Michlmayr wrote:
Hello!
Below is a (lengthy) response, in the form of R code and an analysis of
simulated data.
> # First, simulate some example data. Note that the added slope of 0.4 is
> # anchored at x = 4 but switched on only for x > 5, so the data do not lie
> # exactly on a continuous broken stick:
> x <- 1:9
> t <- 0.5 + 0.1*x + 0.4 * (x - 4) * ifelse(x > 5, 1, 0)
> plot(x, t)
> test <- data.frame(x, t)
> rm(x, t)
> # We can do a non-linear regression to estimate a change point:
> library(nls)  # needed in R 1.x; in current R, nls() is in the standard stats package
> test.nls1 <- nls(t ~ a + b*x + c*(x - change)*I(x > change), data=test,
+                  start=c(a=0, b=0.1, c=0.4, change=5))
> summary(test.nls1)
Formula: t ~ a + b * x + c * (x - change) * I(x > change)
Parameters:
Estimate Std. Error t value Pr(>|t|)
a 0.50000 0.13856 3.608 0.015406
b 0.10000 0.05060 1.976 0.105057
c 0.48000 0.06197 7.746 0.000573
change 4.66667 0.32773 14.239 3.08e-05
Residual standard error: 0.1131 on 5 degrees of freedom
Correlation of Parameter Estimates:
a b c
b -0.9129
c 0.7454 -0.8165
change -0.4894 0.6969 -0.2626
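You can check this fit by overlaying the fitted broken-stick curve on
the data, something like this (untested):

## Sketch: plot the data and the fitted piecewise-linear curve
plot(t ~ x, data = test)
xg <- seq(1, 9, by = 0.1)
lines(xg, predict(test.nls1, newdata = data.frame(x = xg)))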
> # But you should be aware that the standard errors assume the expectation function
> # is differentiable, which is not the case here!
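If you want standard errors that do not lean on that assumption, a
case-resampling bootstrap is one possibility. A rough sketch (untested,
and crude with only nine observations):

## Sketch: bootstrap SE for the change point, resampling cases;
## refits that fail (e.g. too few points past the break) give NA.
set.seed(42)
one.boot <- function() {
  d <- test[sample(nrow(test), replace = TRUE), ]
  fit <- try(nls(t ~ a + b*x + c*(x - change)*I(x > change),
                 data = d, start = coef(test.nls1)),
             silent = TRUE)
  if (inherits(fit, "try-error")) NA else coef(fit)["change"]
}
b <- replicate(500, one.boot())
sd(b, na.rm = TRUE)   # bootstrap standard error of the change point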
> # And so to what you asked: quadratic regressions
> # Fit a quadratic to the growing ranges 1:1, 1:2, ..., 1:9:
> models <- vector(length=9, mode="list")
> for (i in 1:9) {
+ try( models[[i]] <- lm(t ~ x + I(x^2), data=test, subset=1:i) ) }
> lapply(models, summary)
[[1]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
ALL 1 residuals are 0: no residual degrees of freedom!
Coefficients: (2 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.6
Residual standard error: NaN on 0 degrees of freedom
[[2]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
ALL 2 residuals are 0: no residual degrees of freedom!
Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.5
x 0.1
Residual standard error: NaN on 0 degrees of freedom
Multiple R-Squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 1 and 0 DF, p-value: NA
[[3]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
ALL 3 residuals are 0: no residual degrees of freedom!
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.0e-01
x 1.0e-01
I(x^2) 1.7e-16
Residual standard error: NaN on 0 degrees of freedom
Multiple R-Squared: 1, Adjusted R-squared: NaN
F-statistic: NaN on 2 and 0 DF, p-value: NA
[[4]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
1 2 3 4
2.069e-17 -6.206e-17 6.206e-17 -2.069e-17
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.000e-01 2.576e-16 1.941e+15 3.28e-16
x 1.000e-01 2.350e-16 4.256e+14 1.50e-15
I(x^2) 4.138e-17 4.626e-17 8.940e-01 0.535
Residual standard error: 9.252e-17 on 1 degrees of freedom
Multiple R-Squared: 1, Adjusted R-squared: 1
F-statistic: 2.921e+30 on 2 and 1 DF, p-value: 4.138e-16
[[5]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
1 2 3 4 5
-4.164e-21 -9.710e-19 2.938e-18 -2.946e-18 9.835e-19
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.000e-01 6.649e-18 7.520e+16 <2e-16
x 1.000e-01 5.067e-18 1.974e+16 <2e-16
I(x^2) -1.437e-18 8.285e-19 -1.735e+00 0.225
Residual standard error: 3.1e-18 on 2 degrees of freedom
Multiple R-Squared: 1, Adjusted R-squared: 1
F-statistic: 5.203e+33 on 2 and 2 DF, p-value: < 2.2e-16
[[6]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
1 2 3 4 5 6
-8.571e-02 8.571e-02 1.143e-01 6.592e-17 -2.571e-01 1.429e-01
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.90000 0.34915 2.578 0.082
x -0.28571 0.22842 -1.251 0.300
I(x^2) 0.07143 0.03194 2.236 0.111
Residual standard error: 0.1952 on 3 degrees of freedom
Multiple R-Squared: 0.8969, Adjusted R-squared: 0.8281
F-statistic: 13.05 on 2 and 3 DF, p-value: 0.03311
[[7]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
         1          2          3          4          5          6          7
-8.571e-02  8.571e-02  1.143e-01 -1.284e-16 -2.571e-01  1.429e-01  1.284e-16
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.90000 0.26342 3.417 0.0269
x -0.28571 0.15096 -1.893 0.1313
I(x^2) 0.07143 0.01844 3.873 0.0179
Residual standard error: 0.169 on 4 degrees of freedom
Multiple R-Squared: 0.9596, Adjusted R-squared: 0.9394
F-statistic: 47.5 on 2 and 4 DF, p-value: 0.001632
[[8]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
       1        2        3        4        5        6        7        8
-0.05000  0.07381  0.07857 -0.03571 -0.26905  0.17857  0.10714 -0.08333
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.79286 0.23186 3.420 0.01885
x -0.20238 0.11821 -1.712 0.14757
I(x^2) 0.05952 0.01282 4.642 0.00562
Residual standard error: 0.1662 on 5 degrees of freedom
Multiple R-Squared: 0.9744, Adjusted R-squared: 0.9642
F-statistic: 95.26 on 2 and 5 DF, p-value: 0.0001046
[[9]]
Call:
lm(formula = t ~ x + I(x^2), data = test, subset = 1:i)
Residuals:
Min 1Q Median 3Q Max
-0.30476 -0.08571 0.03810 0.06667 0.18095
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.66190 0.22671 2.920 0.02665
x -0.10952 0.10410 -1.052 0.33326
I(x^2) 0.04762 0.01015 4.690 0.00336
Residual standard error: 0.1782 on 6 degrees of freedom
Multiple R-Squared: 0.9787, Adjusted R-squared: 0.9716
F-statistic: 138 on 2 and 6 DF, p-value: 9.622e-06
# With more models than this, you would want to extract only the
# relevant part of the summary for each model!
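For example, something like this (untested) pulls out just the p-value
of the quadratic term, with NA where that term could not be estimated:

## Sketch: p-value of the I(x^2) coefficient from each fitted model
p.quad <- sapply(models, function(m) {
  cf <- summary(m)$coefficients
  if ("I(x^2)" %in% rownames(cf)) cf["I(x^2)", "Pr(>|t|)"] else NA
})
round(p.quad, 4)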
Note that the quadratic term is significant (at the 5% level) only for
the last three models, that is, for the ranges 1-7, 1-8 and 1-9. This
method can only give a significant result for the x^2 term when we use
some data *past* the change point, so, as I see it, it will necessarily
overestimate the change point. You should consider whether the first
method given above is not better!
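The same point can be phrased as a nested model comparison: for each
range, the F-test of the quadratic model against the plain linear one is
equivalent to the t-test on the I(x^2) coefficient. An untested sketch:

## Sketch: compare linear and quadratic fits on each range 1:i
for (i in 4:9) {
  lin  <- lm(t ~ x,          data = test, subset = 1:i)
  quad <- lm(t ~ x + I(x^2), data = test, subset = 1:i)
  cat("range 1-", i, ": p = ", anova(lin, quad)[2, "Pr(>F)"], "\n", sep = "")
}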
Kjetil Halvorsen
> I'd like to use linear and quadratic trend analysis in order to detect
> a change in slope. Basically, I need to solve a problem similar to the
> one discussed in http://www.gseis.ucla.edu/courses/ed230bc1/cnotes4/trend1.html
>
> My subjects have counted dots: one dot, two dots, etc. up to 9 dots.
> The reaction time increases with the number of dots. The theory is that
> 1 up to 3 or 4 dots can be counted relatively fast (a process known
> as "subitizing") but that it becomes slower at around 5 dots ("counting").
> The question is when the slope changes. Most papers in the literature
> determine this by checking when the trend changes from linear to
> quadratic, i.e., deviation from linearity is seen as evidence
> that the second, slower process is used.
>
> I'd like to test the ranges 1-2, 1-3, 1-4, 1-5, 1-6, etc. and see when
> a quadratic trend is significant. However, although I have read some
> literature and done many Google searches, I cannot figure out how to
> do this with R. Can anyone show me a simple example of how to do
> this? (Either with the method described above or with a different
> method -- but please note that I only have 9 data points; 1:9.)
>
> Any help is appreciated.
>
> Thanks.
>
>
> FWIW, here's the description from one paper using this method:
>
> "For both conditions, the subitizing range for each group was established
> using quadratic trend tests on the aggregated RT data for numerosities 1-3,
> 1-4, and so on (Akin and Chase, 1978; Chi and Klahr, 1975; Pylyshyn,
> 1993). For both groups the first appearance of a quadratic trend was in
> the N=1-5 range (t(9) = 7.33, p< .001 for the control group, and t(5) =
> 5.35, p = .005 for the Turner group). This indicates a subitizing range
> of 4 for both groups. This divergence from a linear increase in RT
> suggests the deployment of a new process for the last numerosity added."
>
> --
> Martin Michlmayr
> tbm at cyrius.com