[R] simple lm question
R. Michael Weylandt
michael.weylandt at gmail.com
Sat Dec 3 05:10:18 CET 2011
In your code, by supplying a vector, M[,"e"], as the response, you are
regressing "e" against all the variables provided in the data argument,
including "e" itself -- this gives the very strange regression
coefficients you observe (a coefficient of 1 on "e" and essentially zero
on everything else). R has no way to know that the vector is related to
the "e" it sees in the data argument.
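
To see how the dot is expanded in each case, one can inspect the term
labels R derives from the formula before any fitting is done. This is
only a sketch: the seed and the names bad_terms and good_terms are my
own choices for illustration, not something from your post.

```r
set.seed(1)  # arbitrary seed, assumed here only for reproducibility
M <- matrix(runif(5 * 20), nrow = 20)
colnames(M) <- c("a", "b", "c", "d", "e")
df <- as.data.frame(M)

# Response given as an expression: R does not connect it to column "e",
# so "." expands to every column of df, including e.
bad_terms <- attr(terms(df[, "e"] ~ ., data = df), "term.labels")

# Response given as the bare name e: "." expands to every column except e.
good_terms <- attr(terms(e ~ ., data = df), "term.labels")

print(bad_terms)
print(good_terms)
```

The first vector should contain "e" and the second should not, which is
the whole difference between the two fits.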
With the suggested call,
lm(formula = e ~ ., data = as.data.frame(M))
e is regressed against everything that is not e, and sensible results are
given. That is the same regression as your explicit
M[,5]~M[,1]+M[,2]+M[,3]+M[,4].
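
A quick way to check this for yourself is to fit the model three ways
and compare the coefficients. Again just a sketch under assumptions: the
seed and the object names (fit_dot, fit_explicit, fit_matrix) are made
up for the example.

```r
set.seed(1)  # arbitrary seed, assumed here only for reproducibility
M <- matrix(runif(5 * 20), nrow = 20)
colnames(M) <- c("a", "b", "c", "d", "e")
df <- as.data.frame(M)

# Dot formula with a bare response name: e against all other columns
fit_dot <- lm(e ~ ., data = df)

# The same model with the predictors spelled out
fit_explicit <- lm(e ~ a + b + c + d, data = df)

# The same model written with matrix indexing, as in the original question
fit_matrix <- lm(M[, 5] ~ M[, 1] + M[, 2] + M[, 3] + M[, 4])

# Coefficient names differ between the data-frame and matrix versions,
# so names are dropped before the second comparison.
same_as_explicit <- isTRUE(all.equal(coef(fit_dot), coef(fit_explicit)))
same_as_matrix <- isTRUE(all.equal(unname(coef(fit_dot)),
                                   unname(coef(fit_matrix))))
print(c(same_as_explicit, same_as_matrix))
```

Naming the response by its column name and passing the data frame via
data= is what lets lm() drop that column from the right-hand side.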
Michael
On Fri, Dec 2, 2011 at 11:03 PM, Worik R <worikr at gmail.com> wrote:
>>
>> Use `lm` the way it is designed to be used, with a data argument:
>>
>> > l2 <- lm(e~. , data=as.data.frame(M))
>> > summary(l2)
>>
>> Call:
>> lm(formula = e ~ ., data = as.data.frame(M))
>>
>>
> And what is the regression being done in this case? How are the
> independent variables used?
>
> It looks like M[,5]~M[,1]+M[,2]+M[,3]+M[,4] as those are the
> coefficients. But the results are different when I do that explicitly:
>
>> M <- matrix(runif(5*20), nrow=20)
>> colnames(M) <- c('a', 'b', 'c', 'd', 'e')
>> df <- as.data.frame(M)
>> l1 <- lm(df[,'e']~., data=df)
>> summary(l1)
>
> Call:
> lm(formula = df[, "e"] ~ ., data = df)
>
> Residuals:
>        Min         1Q     Median         3Q        Max
> -9.580e-17 -3.360e-17 -8.596e-18  9.114e-18  2.032e-16
>
> Coefficients:
>               Estimate Std. Error    t value Pr(>|t|)
> (Intercept) -7.505e-17  7.158e-17 -1.048e+00    0.312
> a           -1.653e-17  7.117e-17 -2.320e-01    0.820
> b           -5.042e-17  5.480e-17 -9.200e-01    0.373
> c            4.236e-17  5.774e-17  7.340e-01    0.475
> d           -3.878e-17  4.946e-17 -7.840e-01    0.446
> e            1.000e+00  6.083e-17  1.644e+16   <2e-16 ***
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 6.763e-17 on 14 degrees of freedom
> Multiple R-squared: 1, Adjusted R-squared: 1
> F-statistic: 6.435e+31 on 5 and 14 DF, p-value: < 2.2e-16
>
>> l3 <- lm(M[,5]~M[,1]+M[,2]+M[,3]+M[,4])
>> summary(l3)
>
> Call:
> lm(formula = M[, 5] ~ M[, 1] + M[, 2] + M[, 3] + M[, 4])
>
> Residuals:
>      Min       1Q   Median       3Q      Max
> -0.49398 -0.14203  0.01588  0.14157  0.31335
>
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)   0.6681     0.1859   3.594  0.00266 **
> M[, 1]       -0.1767     0.2419  -0.730  0.47644
> M[, 2]       -0.3874     0.2135  -1.814  0.08970 .
> M[, 3]        0.3695     0.2180   1.695  0.11078
> M[, 4]        0.1361     0.2366   0.575  0.57360
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 0.2449 on 15 degrees of freedom
> Multiple R-squared: 0.2988, Adjusted R-squared: 0.1119
> F-statistic: 1.598 on 4 and 15 DF, p-value: 0.2261
>
>
> cheers
> Worik