[R] lmList and lapply(... lm) different std. errors
beatlebg
rhelpforum at gmail.com
Wed Dec 15 13:24:38 CET 2010
I am trying to fit a separate linear regression for each level of 'VARIABLE2'.
I found that there are different ways to do this. First, I used the following
code (the data are given at the end of this message):
# Split on the column explicitly so the code does not rely on attach(TRY)
reg <- lapply(split(TRY, TRY$VARIABLE2), function(X){lm(X2 ~ X3, data=X)})
lapply(reg, summary)
Which produces the following:
$`1`
Call:
lm(formula = X2 ~ X3, data = X)

Residuals:
     Min       1Q   Median       3Q      Max
-1.24233 -0.30028  0.03706  0.46170  1.12408

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   3.0705     0.2323  13.215 5.95e-15 ***
X3            0.4744     0.2640   1.797   0.0813 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.5752 on 34 degrees of freedom
Multiple R-squared: 0.08672, Adjusted R-squared: 0.05986
F-statistic: 3.228 on 1 and 34 DF, p-value: 0.08126

$`2`
Call:
lm(formula = X2 ~ X3, data = X)

Residuals:
    Min      1Q  Median      3Q     Max
-1.1358 -0.6403  0.2505  0.4055  1.2088

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   2.5859     0.2968   8.713 4.53e-10 ***
X3            0.4957     0.3435   1.443    0.158
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.6765 on 33 degrees of freedom
Multiple R-squared: 0.05937, Adjusted R-squared: 0.03086
F-statistic: 2.083 on 1 and 33 DF, p-value: 0.1584

$`3`
Call:
lm(formula = X2 ~ X3, data = X)

Residuals:
     Min       1Q   Median       3Q      Max
-1.70021 -0.66049 -0.00138  0.81210  1.26162

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.9473     0.3522   5.529 2.73e-06 ***
X3            0.8515     0.3954   2.154   0.0378 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.8979 on 37 degrees of freedom
Multiple R-squared: 0.1114, Adjusted R-squared: 0.08739
F-statistic: 4.639 on 1 and 37 DF, p-value: 0.03784

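To compare the per-group fits at a glance, the slope rows of the individual summaries can be collected into one table. This is only a minimal sketch: it uses made-up data in place of TRY, and the names `toy`, `g`, `x`, `y`, `fits` and `slope_table` are placeholders chosen for illustration.

```r
# Made-up data standing in for TRY (hypothetical, not the data in this post)
set.seed(42)
toy <- data.frame(g = rep(1:3, each = 20), x = runif(60))
toy$y <- 2 + 0.5 * toy$x + rnorm(60)

# One lm() per group, as in the lapply(split(...)) call above
fits <- lapply(split(toy, toy$g), function(d) lm(y ~ x, data = d))

# One row per group: estimate, std. error, t value and p value of the slope
slope_table <- t(sapply(fits, function(m) coef(summary(m))["x", ]))
slope_table
```

Each row comes from that group's own fit, so each standard error is based on that group's own residual variance.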
It should also be possible to use the lmList function but, remarkably, I get
the same estimates with different Std. Errors. I used the following code:
# lmList() is not in base R; this assumes the nlme package (lme4 also has an lmList)
library(nlme)
modlst <- lmList(X2 ~ X3 | VARIABLE2, TRY)
summary(modlst)
Which produces
Call:
  Model: X2 ~ X3 | VARIABLE2
   Data: TRY

Coefficients:
   (Intercept)
  Estimate Std. Error   t value     Pr(>|t|)
1 3.070507  0.2969014 10.341841 0.000000e+00
2 2.585938  0.3224380  8.019952 1.665779e-12
3 1.947292  0.2882936  6.754546 8.454271e-10
   X3
   Estimate Std. Error  t value    Pr(>|t|)
1 0.4744112  0.3373931 1.406108 0.162672738
2 0.4957349  0.3731949 1.328354 0.186968753
3 0.8515270  0.3236325 2.631154 0.009803152

Residual standard error: 0.7350239 on 104 degrees of freedom

I do not understand the difference between these two methods or what causes
the difference in Std. Errors. Which method is preferable? I checked the
results with another software program, and those results corresponded with
the first method.
I really hope someone can explain where I made a mistake. Thank you.
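One thing worth checking: the lmList summary reports a single residual standard error on 104 degrees of freedom, and 104 = 34 + 33 + 37, the sum of the residual degrees of freedom of the three separate fits. That pattern is what a residual variance pooled across the groups would give. The sketch below first redoes that arithmetic with the numbers printed in the outputs above, then reproduces the same effect on made-up data using only base R. The names `toy`, `g`, `x`, `y` and the helper variables are placeholders, not part of the original analysis.

```r
# Part 1: arithmetic on the values printed in the outputs above
sigma_group <- c(0.5752, 0.6765, 0.8979)   # residual std. errors from lm()
df_group    <- c(34, 33, 37)               # residual degrees of freedom
se_x3_lm    <- c(0.2640, 0.3435, 0.3954)   # Std. Error of X3 from lm()

# Pooled residual standard error: df-weighted mean of the group variances
sigma_pooled <- sqrt(sum(df_group * sigma_group^2) / sum(df_group))
sigma_pooled   # compare with 0.7350239 in the lmList summary

# Rescaling the lm() errors by sigma_pooled / sigma_group
se_x3_rescaled <- se_x3_lm * sigma_pooled / sigma_group
se_x3_rescaled # compare with 0.3373931, 0.3731949, 0.3236325

# Part 2: the same effect on made-up data (hypothetical, not TRY)
set.seed(1)
toy <- data.frame(g = factor(rep(1:3, each = 30)), x = runif(90))
toy$y <- 2 + 0.5 * toy$x + rnorm(90, sd = rep(c(0.5, 0.7, 0.9), each = 30))

# Separate fits: each group has its own residual variance
fits <- lapply(split(toy, toy$g), function(d) lm(y ~ x, data = d))
se_separate    <- sapply(fits, function(m) coef(summary(m))["x", "Std. Error"])
sigma_separate <- sapply(fits, function(m) summary(m)$sigma)

# One fit with group-specific intercepts and slopes: same estimates,
# but a single residual variance shared by all groups
pooled    <- lm(y ~ 0 + g + g:x, data = toy)
se_pooled <- coef(summary(pooled))[paste0("g", 1:3, ":x"), "Std. Error"]

cbind(se_separate, se_pooled)

# The pooled errors are the separate ones rescaled by the sigma ratio
all.equal(unname(se_pooled),
          unname(se_separate * summary(pooled)$sigma / sigma_separate))
```

If pooling is indeed the cause, neither set of standard errors is a mistake: they answer slightly different questions, depending on whether a common residual variance across groups is a reasonable assumption. If the lmList used here is the one from nlme, its summary method takes a `pool` argument, so `summary(modlst, pool = FALSE)` would be expected to match the per-group lm() output; that is an assumption to verify against the package documentation.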
Data frame TRY:
VARIABLE2 X2 X3
1 1 2.3025851 1.00000000
2 1 3.8286414 1.00000000
3 1 4.3820266 1.00000000
4 1 3.6375862 1.00000000
5 1 3.7841896 1.00000000
6 1 3.4965076 1.00000000
7 1 2.8332133 1.00000000
8 1 3.6375862 1.00000000
9 1 4.0775374 1.00000000
10 1 3.4339872 1.00000000
11 1 3.5263605 1.00000000
12 1 3.0445224 1.00000000
13 1 2.8332133 1.00000000
14 1 2.7725887 1.00000000
15 1 3.0910425 1.00000000
16 1 4.1108739 1.00000000
17 1 3.2958369 1.00000000
18 1 2.7080502 1.00000000
19 1 2.9957323 1.00000000
20 1 3.6375862 1.00000000
21 1 3.8918203 1.00000000
22 1 3.8712010 1.00000000
23 1 3.4011974 1.00000000
24 1 3.2958369 1.00000000
25 1 4.1271344 1.00000000
26 1 4.1588831 1.00000000
27 1 4.1271344 0.90476190
28 1 3.8712010 0.66666667
29 1 4.5108595 0.66666667
30 1 3.9120230 0.33333333
31 1 3.6375862 0.23809524
32 1 3.4339872 0.04761905
33 1 2.8903718 0.00000000
34 1 2.8903718 0.00000000
35 1 2.8332133 0.00000000
36 1 1.9459101 0.00000000
37 2 2.0794415 1.00000000
38 2 3.4657359 1.00000000
39 2 3.9889840 1.00000000
40 2 3.4339872 1.00000000
41 2 3.4011974 1.00000000
42 2 3.3322045 1.00000000
43 2 2.8903718 1.00000000
44 2 3.3672958 1.00000000
45 2 3.3322045 1.00000000
46 2 3.4339872 1.00000000
47 2 3.4011974 1.00000000
48 2 3.2958369 1.00000000
49 2 2.8332133 1.00000000
50 2 3.3322045 1.00000000
51 2 3.3672958 1.00000000
52 2 3.6635616 1.00000000
53 2 2.8903718 1.00000000
54 2 1.9459101 1.00000000
55 2 2.0794415 1.00000000
56 2 2.3025851 1.00000000
57 2 2.4849066 1.00000000
58 2 2.0794415 1.00000000
59 2 2.3978953 1.00000000
60 2 2.4849066 1.00000000
61 2 4.2904594 1.00000000
62 2 3.9889840 0.57142857
63 2 3.6109179 0.52380952
64 2 3.5553481 0.33333333
65 2 3.1780538 0.33333333
66 2 3.1780538 0.33333333
67 2 2.7725887 0.33333333
68 2 3.1354942 0.19047619
69 2 1.7917595 0.09523810
70 2 1.9459101 0.19047619
71 2 1.6094379 0.00000000
72 3 2.3978953 1.00000000
73 3 2.4849066 1.00000000
74 3 1.6094379 1.00000000
75 3 1.3862944 1.00000000
76 3 1.7917595 1.00000000
77 3 1.0986123 1.00000000
78 3 2.0794415 1.00000000
79 3 1.3862944 1.00000000
80 3 1.9459101 1.00000000
81 3 3.1780538 1.00000000
82 3 2.1972246 1.00000000
83 3 2.4849066 1.00000000
84 3 2.6390573 1.00000000
85 3 3.6109179 1.00000000
86 3 2.3978953 1.00000000
87 3 2.1972246 1.00000000
88 3 1.6094379 1.00000000
89 3 3.0910425 1.00000000
90 3 3.6888795 1.00000000
91 3 3.3672958 1.00000000
92 3 3.4011974 1.00000000
93 3 2.4849066 1.00000000
94 3 3.4657359 1.00000000
95 3 4.0604430 1.00000000
96 3 3.6635616 1.00000000
97 3 3.6109179 1.00000000
98 3 3.8286414 1.00000000
99 3 3.6375862 1.00000000
100 3 3.7135721 1.00000000
101 3 3.8918203 0.80952381
102 3 3.7376696 0.85714286
103 3 3.0445224 0.66666667
104 3 3.2958369 0.33333333
105 3 2.7080502 0.00000000
106 3 1.9459101 0.00000000
107 3 2.4849066 0.04761905
108 3 1.9459101 0.00000000
109 3 0.6931472 0.00000000