[R] Linear Regression Question

Wed Oct 14 11:53:31 CEST 2009

On 13-Oct-09 21:17:11, Alexandre Cohen wrote:
> Dear Sir or Madam,
> I am a student at MSc Probability and Finance at Paris 6 University/ 
> Ecole Polytechnique. I am using R and I can't find an answer to the  
> following question. I will be very thankful if you can answer it.
> 
> I have two vectors rendements_CAC40 and rendements_AlcatelLucent.
> I use the lm function as follows, and then the sumarry function:
> 
> regression=lm(rendements_CAC40 ~ rendements_AlcatelLucent);
> sum=summarry(regression);
> 
> I obtain:
> 
> Call:
> lm(formula = rendements_CAC40 ~ rendements_AlcatelLucent)
> 
> Residuals:
>       Min       1Q   Median       3Q      Max
> -6.43940 -0.84170 -0.01124  0.76235  9.08087
> 
> Coefficients:
>                           Estimate Std. Error t value Pr(>|t|)
> (Intercept)              -0.03579    0.07113  -0.503    0.615
> rendements_AlcatelLucent  0.33951    0.01732  19.608   <2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> Residual standard error: 1.617 on 515 degrees of freedom
> Multiple R-squared: 0.4274,   Adjusted R-squared: 0.4263
> F-statistic: 384.5 on 1 and 515 DF,  p-value: < 2.2e-16
> 
> I would like to access to the p-value field, but I can't find the name 
> of it, as we can see it below:
> 
>  > names(sum)
>   [1] "call"          "terms"         "residuals"     "coefficients"  "aliased"       "sigma"         "df"            "r.squared"
>   [9] "adj.r.squared" "fstatistic"    "cov.unscaled"
> 
> I thought that I could find it in the fstatistic field, but it is not:
> 
> sum$fstatistic
>     value    numdf    dendf
> 384.4675   1.0000 515.0000
> 
> Thank in advance for your time,
> Kind regards,
> Alexandre Cohen

Assuming you have executed your code with "summary" correctly spelled
(i.e. not "summarry" or "sumarry" as you have written above), then
the information you require can be found in

  sum$coefficients

which you can also write as sum$coef, since "$" accepts an unambiguous
abbreviation of a component name.

You will find that sum$coef is a matrix with 4 columns ("Estimate",
"Std. Error", "t value" and "Pr(>|t|)"), so the P-values for the
individual coefficients are in the final column, sum$coef[,4].

Emulating your calculation above with toy regression data:

  X <- (0:10) ; Y <- 1.0 + 0.25*X + 2.5*rnorm(11)
  regression <- lm(Y~X)
  sum <- summary(regression)
  sum
  # Call:
  # lm(formula = Y ~ X)
  # Residuals:
  #     Min      1Q  Median      3Q     Max 
  # -5.7182 -1.5383  0.2989  1.9806  3.9364 
  # Coefficients:
  #             Estimate Std. Error t value Pr(>|t|)
  # (Intercept)  2.10035    1.81418   1.158    0.277
  # X           -0.03147    0.30665  -0.103    0.921
  #
  # Residual standard error: 3.216 on 9 degrees of freedom
  # Multiple R-squared: 0.001169,   Adjusted R-squared: -0.1098 
  # F-statistic: 0.01053 on 1 and 9 DF,  p-value: 0.9205 

  sum$coef
  #               Estimate Std. Error    t value  Pr(>|t|)
  # (Intercept)  2.1003505  1.8141796  1.1577412 0.2767698
  # X           -0.0314672  0.3066523 -0.1026152 0.9205184

  sum$coef[,4]
  # (Intercept)           X 
  #   0.2767698   0.9205184 
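
The question also asked about the p-value printed on the "F-statistic"
line. As far as I know that value is not stored as a component of the
summary object; it is computed from the "fstatistic" component when the
summary is printed. A minimal sketch of doing the same by hand, on toy
data of the same kind (the seed and the names fit, fit_summary, coef_p
and overall_p are my own choices for illustration, not part of the
original exchange):

```r
# Toy data in the same spirit as the example above; the seed is arbitrary.
set.seed(1)
X <- 0:10
Y <- 1.0 + 0.25 * X + 2.5 * rnorm(11)

fit <- lm(Y ~ X)
fit_summary <- summary(fit)

# Coefficient p-values, selected by column name rather than by position.
coef_p <- coef(fit_summary)[, "Pr(>|t|)"]

# Overall F-test p-value: upper tail of the F distribution with
# (numdf, dendf) degrees of freedom, evaluated at the F statistic.
f <- fit_summary$fstatistic
overall_p <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)

print(coef_p)
print(overall_p)
```

With a single predictor the overall F-test and the t-test on the slope
are equivalent, which is why the toy output above shows 0.9205 both in
the X row and on the F-statistic line. With several predictors the two
generally differ, and the pf() calculation is the one that gives the
overall value.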

[And, by the way, although it in fact works, it is not a good idea
to use a function name ("sum") as the name of a variable.]
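
A small sketch of why it works and of the alternative, using the
built-in "cars" data set purely as a stand-in (the name fit_summary is
just an illustrative choice):

```r
# Binding a non-function value to the name "sum" does not break calls to
# sum(): when a name is used in a function call, R skips objects that are
# not functions while looking it up.
sum <- c(a = 1, b = 2)
total <- sum(1:10)   # still reaches the base function
print(total)

# Remove the shadowing variable and use a distinct name instead, so that
# readers (and later code) are not left guessing which "sum" is meant.
rm(sum)
fit_summary <- summary(lm(dist ~ speed, data = cars))
print(names(fit_summary))
```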

Hoping this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 14-Oct-09                                       Time: 10:53:28
------------------------------ XFMail ------------------------------