Erin Hodgess
Wed Mar 18 00:04:25 CET 2009

Dear R People:

Here are a small data frame and two model formulas:
> test.df
            y  x
1  -0.9261650  1
2   1.5702700  2
3   0.1673920  3
4   0.7893085  4
5   0.3576875  5
6  -1.4620915  6
7  -0.5506215  7
8  -0.3480292  8
9  -1.2344036  9
10  0.8502660 10
> summary(lm(exp(y)~x))

Call:
lm(formula = exp(y) ~ x)

Residuals:
    Min      1Q  Median      3Q     Max
-1.6360 -0.6435 -0.4722  0.4215  2.9127

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   2.1689     0.9782   2.217   0.0574 .
x            -0.1368     0.1577  -0.868   0.4108
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.432 on 8 degrees of freedom
Multiple R-squared: 0.08604,    Adjusted R-squared: -0.0282
F-statistic: 0.7532 on 1 and 8 DF,  p-value: 0.4108

> summary(lm(I(y^2)~x))

Call:
lm(formula = I(y^2) ~ x)

Residuals:
    Min      1Q  Median      3Q     Max
-0.9584 -0.6387 -0.2651  0.5754  1.4412

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.10084    0.62428   1.763    0.116
x           -0.03813    0.10061  -0.379    0.715

Residual standard error: 0.9138 on 8 degrees of freedom
Multiple R-squared: 0.01764,    Adjusted R-squared: -0.1052
F-statistic: 0.1436 on 1 and 8 DF,  p-value: 0.7146


These both work just fine.
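
In case anyone wants to reproduce this, here is a minimal sketch that rebuilds the data frame from the values printed above and fits the same two models. The calls above assume y and x are visible in the workspace (for example, via attach); passing data = test.df is my substitution here, and the names fit.exp and fit.sq are just illustrative.

```r
# Rebuild the data frame from the values printed above.
test.df <- data.frame(
  y = c(-0.9261650, 1.5702700, 0.1673920, 0.7893085, 0.3576875,
        -1.4620915, -0.5506215, -0.3480292, -1.2344036, 0.8502660),
  x = 1:10
)

# Same two formulas as above; the variables are looked up in test.df
# through the data argument rather than in the workspace.
fit.exp <- lm(exp(y) ~ x, data = test.df)
fit.sq  <- lm(I(y^2) ~ x, data = test.df)

summary(fit.exp)
summary(fit.sq)
```

Since the printed y values are rounded, the summaries may differ slightly in the last digits from the output shown above.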

My question is: how do you know when to use I() and when to use just
the function of the variable, please?

Thanks in advance,
PS Happy St Pat's Day!

Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodgess at gmail.com

