[R] retrieving p-values in lm

Marc Schwartz MSchwartz at mn.rr.com
Fri Dec 9 15:00:02 CET 2005


On Fri, 2005-12-09 at 14:19 +0100, Patrick Kuss wrote:
> Dear list,
> 
> I want to retrieve the p-value of a two-polynomial regression. For a
> one-polynomial lm I can easily do this with:
> summary(lm(b~a, data=c)[[4]][[8]].
> 
> But how do I find the final p-value in the two-polynomial regression? Under
> $coefficients I don't find it
> 
> Any suggestions?
> 
> Patrick
> 
> alt <-(2260,2183,2189,1930,2435,
> 2000,2100,2050,2020,2470,
> 1700,2310,2090,1560,2060,
> 1790,1940,2100,2250,2010)
> 
> H <- c(0.2034,0.1845,0.2053,0.1788,0.2196,
> 0.2037,0.1655,0.2176,0.1844,0.2033,
> 0.1393,0.2019,0.1975,0.1490,0.1917,
> 0.2180,0.2064,0.1943,0.2139,0.1320)
> 
> X <- data.frame(alt,H)
> 
> lm.res <- summary(lm(H~alt,data=X))
> lm.res
> p1 <- lm.res[[4]][[8]]
> p1
> 
> lm.res.2 <- summary(lm(H~alt+I(alt^2),data=X))
> lm.res.2
> str(lm.res.2) # where is p
> 
> p2 <- lm.res.2[[???]][[????]]


First, you might want to review Chapter 11: Statistical Models in R in
An Introduction to R, which is available with your R installation or
from the main R web site under Documentation. Specifically, page 53
describes the extractor functions to be used for getting model
information.

In this case using coef() will extract the model coefficients in both
cases:

> coef(lm.res)
                Estimate   Std. Error  t value   Pr(>|t|)
(Intercept) 6.245371e-02 4.713400e-02 1.325024 0.20173833
alt         6.179038e-05 2.261665e-05 2.732074 0.01368545

> coef(lm.res.2)
                 Estimate   Std. Error    t value  Pr(>|t|)
(Intercept) -9.433748e-02 3.133627e-01 -0.3010488 0.7670283
alt          2.178857e-04 3.091330e-04  0.7048283 0.4904618
I(alt^2)    -3.838002e-08 7.579576e-08 -0.5063610 0.6191070


In both models, the coefficients are present if you review the structure
as you have in your code above:

> names(lm.res)
 [1] "call"          "terms"         "residuals"     "coefficients" 
 [5] "aliased"       "sigma"         "df"            "r.squared"    
 [9] "adj.r.squared" "fstatistic"    "cov.unscaled" 

> names(lm.res.2)
 [1] "call"          "terms"         "residuals"     "coefficients" 
 [5] "aliased"       "sigma"         "df"            "r.squared"    
 [9] "adj.r.squared" "fstatistic"    "cov.unscaled" 


So, you can get the term p values by using:

> coef(lm.res)[, 4]
(Intercept)         alt 
 0.20173833  0.01368545 

> coef(lm.res.2)[, 4]
(Intercept)         alt    I(alt^2) 
  0.7670283   0.4904618   0.6191070 


In terms of the overall model p value, this is actually calculated when
you display (print) the model. It is not stored as part of the model
object itself. If you review the code for print.summary.lm() using:

> stats:::print.summary.lm

...
   pf(x$fstatistic[1], x$fstatistic[2], x$fstatistic[3],
      lower.tail = FALSE)
...


Where the first argument is the F statistic and the other two are the
degrees of freedom:

> lm.res$fstatistic
    value     numdf     dendf 
 7.464231  1.000000 18.000000 

> lm.res.2$fstatistic
    value     numdf     dendf 
 3.706139  2.000000 17.000000 


So, in the case of your two models:

> x <- lm.res
> pf(x$fstatistic[1], x$fstatistic[2], x$fstatistic[3],
     lower.tail = FALSE)
     value 
0.01368545 


> x <- lm.res.2
> pf(x$fstatistic[1], x$fstatistic[2], x$fstatistic[3], 
     lower.tail = FALSE)
    value 
0.0461472 


HTH,

Marc Schwartz




More information about the R-help mailing list