[R] simple lm question
David Winsemius
dwinsemius at comcast.net
Sat Dec 3 05:41:01 CET 2011
On Dec 2, 2011, at 11:20 PM, Worik R wrote:
> Duh! Silly me! But my confusion persists: what regression is being
> done? See below....
<Sigh> Please note that your "df" and "M" are undoubtedly different
objects by now:
> M <- matrix(runif(5*20), nrow=20)
> colnames(M) <- c('a', 'b', 'c', 'd', 'e')
> l1 <- lm(e~., data=as.data.frame(M))
> l1
Call:
lm(formula = e ~ ., data = as.data.frame(M))
Coefficients:
(Intercept)            a            b            c            d
    0.40139     -0.15032     -0.06242      0.13139      0.23905
> l3 <- lm(M[,5]~M[,1]+M[,2]+M[,3]+M[,4])
> l3
Call:
lm(formula = M[, 5] ~ M[, 1] + M[, 2] + M[, 3] + M[, 4])
Coefficients:
(Intercept)       M[, 1]       M[, 2]       M[, 3]       M[, 4]
    0.40139     -0.15032     -0.06242      0.13139      0.23905
As expected.
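
A minimal sketch of how to check this reproducibly. The seed value is an arbitrary assumption (it does not appear in the thread), and df is rebuilt from the current M so that both calls see the same numbers:

```r
set.seed(42)                         # arbitrary seed, assumed only for reproducibility
M <- matrix(runif(5 * 20), nrow = 20)
colnames(M) <- c("a", "b", "c", "d", "e")
df <- as.data.frame(M)               # rebuild df every time M is regenerated

l1 <- lm(e ~ ., data = df)           # "." expands to a + b + c + d
l3 <- lm(M[, 5] ~ M[, 1] + M[, 2] + M[, 3] + M[, 4])

# Same response and same predictors, so the estimates agree;
# only the coefficient names differ, hence unname()
all.equal(unname(coef(l1)), unname(coef(l3)))
```

If df was created from an earlier M and M was then regenerated with runif(), the two fits use different data and will not match.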
--
David.
>
> On Sat, Dec 3, 2011 at 5:10 PM, R. Michael Weylandt <
> michael.weylandt at gmail.com> wrote:
>
>> In your code, by supplying the vector M[,"e"] as the response, you are
>> regressing "e" against all the variables provided in the data argument,
>> including "e" itself -- this gives the very strange regression
>> coefficients you observe. R has no way to know that the vector is
>> related to the "e" it sees in the data argument.
>>
>
>> In the suggested way,
>>
>> lm(formula = e ~ ., data = as.data.frame(M))
>>
>> e is regressed against everything that is not e, and sensible results
>> are given.
>>
>
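
To illustrate the point quoted above, a minimal sketch (the seed and the object names bad and good are assumptions for illustration only). When the response is written as M[, "e"], the "." still expands to every column of the data argument, e included, so e appears on both sides of the regression:

```r
set.seed(1)                          # arbitrary seed, assumed for reproducibility
M <- matrix(runif(5 * 20), nrow = 20)
colnames(M) <- c("a", "b", "c", "d", "e")
df <- as.data.frame(M)

# Response is an expression, not the column name, so "." keeps e as a predictor
bad <- lm(M[, "e"] ~ ., data = df)
names(coef(bad))                     # includes "e"
coef(bad)["e"]                       # e predicts itself exactly

# Response named as a column of the data, so "." drops it from the right-hand side
good <- lm(e ~ ., data = df)
names(coef(good))                    # no "e" among the predictors
```

In the first fit, the coefficient on e is 1 up to rounding error and the other coefficients are essentially zero, which is the "very strange" result described above.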
> But still 'l1 <- lm(e~., data=df)' is not the same as 'l3 <-
> lm(M[,5]~M[,1]+M[,2]+M[,3]+M[,4])'
>
>> M <- matrix(runif(5*20), nrow=20)
>> colnames(M) <- c('a', 'b', 'c', 'd', 'e')
>> l1 <- lm(e~., data=df)
>> summary(l1)
>
> Call:
> lm(formula = e ~ ., data = df)
>
> Residuals:
>      Min       1Q   Median       3Q      Max
> -0.38343 -0.21367  0.03067  0.13757  0.49080
>
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)  0.28521    0.29477   0.968    0.349
> a            0.09283    0.30112   0.308    0.762
> b            0.23921    0.22425   1.067    0.303
> c           -0.16027    0.24154  -0.664    0.517
> d            0.24025    0.20054   1.198    0.250
>
> Residual standard error: 0.2871 on 15 degrees of freedom
> Multiple R-squared: 0.1602, Adjusted R-squared: -0.06375
> F-statistic: 0.7153 on 4 and 15 DF, p-value: 0.5943
>
>> l3 <- lm(M[,5]~M[,1]+M[,2]+M[,3]+M[,4])
>> summary(l3)
>
> Call:
> lm(formula = M[, 5] ~ M[, 1] + M[, 2] + M[, 3] + M[, 4])
>
> Residuals:
>      Min       1Q   Median       3Q      Max
> -0.36355 -0.22679 -0.01202  0.18462  0.37377
>
> Coefficients:
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)  0.76972    0.24501   3.142  0.00672 **
> M[, 1]      -0.23830    0.24123  -0.988  0.33890
> M[, 2]      -0.02046    0.21958  -0.093  0.92699
> M[, 3]      -0.29518    0.22559  -1.308  0.21040
> M[, 4]      -0.31545    0.24570  -1.284  0.21866
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 0.2668 on 15 degrees of freedom
> Multiple R-squared: 0.2762, Adjusted R-squared: 0.08317
> F-statistic: 1.431 on 4 and 15 DF, p-value: 0.272
>
>>
>
David Winsemius, MD
West Hartford, CT