[R] simple lm question

R. Michael Weylandt michael.weylandt at gmail.com
Sat Dec 3 05:10:18 CET 2011


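In your code, the response is supplied as the vector df[,"e"] rather
than as the column name e, so the "." on the right-hand side expands to
every column in the data argument, including "e" itself. In effect "e"
is regressed on a, b, c, d and "e", which gives the very strange
regression coefficients you observe: a coefficient of 1 on "e" and
essentially zero everywhere else. R has no way to know that the vector
on the left-hand side is the same thing as the "e" column it sees in
the data argument.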
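
One way to check which terms the "." expands to is to call terms() on
the formula. This is only a sketch: the seed is arbitrary, and it
assumes df is as.data.frame(M), which the original post does not show.

```r
set.seed(1)                               # arbitrary seed, for reproducibility only
M <- matrix(runif(5 * 20), nrow = 20)
colnames(M) <- c("a", "b", "c", "d", "e")
df <- as.data.frame(M)                    # assumed definition of df

# Response given as an expression: "." expands to every column of df,
# so e also appears on the right-hand side
bad_terms <- attr(terms(df[, "e"] ~ ., data = df), "term.labels")
print(bad_terms)

# Response given as the column name e: "." leaves e out
good_terms <- attr(terms(e ~ ., data = df), "term.labels")
print(good_terms)
```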

In the suggested way,

lm(formula = e ~ ., data = as.data.frame(M))

e is regressed against everything that is not e and sensible results are given.
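
As a rough check, the "." form should agree with writing the
predictors out by hand, and with the M[,5] ~ M[,1] + ... version from
the original post. The sketch below assumes an arbitrary seed and
df <- as.data.frame(M); the fit_* names are only for illustration.

```r
set.seed(1)                               # arbitrary seed, for reproducibility only
M <- matrix(runif(5 * 20), nrow = 20)
colnames(M) <- c("a", "b", "c", "d", "e")
df <- as.data.frame(M)                    # assumed definition of df

fit_dot      <- lm(e ~ ., data = df)                 # suggested form
fit_explicit <- lm(e ~ a + b + c + d, data = df)     # predictors written out
fit_matrix   <- lm(M[, 5] ~ M[, 1] + M[, 2] + M[, 3] + M[, 4])

# Same coefficients and same names for the two data-frame fits
same_named <- isTRUE(all.equal(coef(fit_dot), coef(fit_explicit)))

# Same values for the matrix-indexed fit; only the coefficient names differ
same_values <- isTRUE(all.equal(unname(coef(fit_dot)),
                                unname(coef(fit_matrix))))
print(c(same_named, same_values))
```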

Michael

On Fri, Dec 2, 2011 at 11:03 PM, Worik R <worikr at gmail.com> wrote:
>>
>> Use `lm` the way it is designed to be used, with a data argument:
>>
>> > l2 <- lm(e~. , data=as.data.frame(M))
>> > summary(l2)
>>
>> Call:
>> lm(formula = e ~ ., data = as.data.frame(M))
>>
>>
> And what is the regression being done in this case?  How are the
> independent  variables used?
>
> It looks like M[,5]~M[,1]+M[,2]+M[,3]+M[,4] as those are the
> coefficients.   But the results are different when I do that explicitly:
>
>> M <- matrix(runif(5*20), nrow=20)
>> colnames(M) <- c('a', 'b', 'c', 'd', 'e')
>> df <- as.data.frame(M)
>> l1 <- lm(df[,'e']~., data=df)
>> summary(l1)
>
> Call:
> lm(formula = df[, "e"] ~ ., data = df)
>
> Residuals:
>       Min         1Q     Median         3Q        Max
> -9.580e-17 -3.360e-17 -8.596e-18  9.114e-18  2.032e-16
>
> Coefficients:
>              Estimate Std. Error    t value Pr(>|t|)
> (Intercept) -7.505e-17  7.158e-17 -1.048e+00    0.312
> a           -1.653e-17  7.117e-17 -2.320e-01    0.820
> b           -5.042e-17  5.480e-17 -9.200e-01    0.373
> c            4.236e-17  5.774e-17  7.340e-01    0.475
> d           -3.878e-17  4.946e-17 -7.840e-01    0.446
> e            1.000e+00  6.083e-17  1.644e+16   <2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 6.763e-17 on 14 degrees of freedom
> Multiple R-squared:     1,    Adjusted R-squared:     1
> F-statistic: 6.435e+31 on 5 and 14 DF,  p-value: < 2.2e-16
>
>> l3 <- lm(M[,5]~M[,1]+M[,2]+M[,3]+M[,4])
>> summary(l3)
>
> Call:
> lm(formula = M[, 5] ~ M[, 1] + M[, 2] + M[, 3] + M[, 4])
>
> Residuals:
>     Min       1Q   Median       3Q      Max
> -0.49398 -0.14203  0.01588  0.14157  0.31335
>
> Coefficients:
>            Estimate Std. Error t value Pr(>|t|)
> (Intercept)   0.6681     0.1859   3.594  0.00266 **
> M[, 1]       -0.1767     0.2419  -0.730  0.47644
> M[, 2]       -0.3874     0.2135  -1.814  0.08970 .
> M[, 3]        0.3695     0.2180   1.695  0.11078
> M[, 4]        0.1361     0.2366   0.575  0.57360
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 0.2449 on 15 degrees of freedom
> Multiple R-squared: 0.2988,    Adjusted R-squared: 0.1119
> F-statistic: 1.598 on 4 and 15 DF,  p-value: 0.2261
>
>
> cheers
> Worik