[R] Interpreting Multiple Linear Regression Summary
Marc Schwartz
marc_schwartz at me.com
Wed Nov 9 20:28:59 CET 2011
On Nov 9, 2011, at 1:17 PM, Rich Shepard wrote:
> On Wed, 9 Nov 2011, Daniel Nordlund wrote:
>
>> I would guess that there is something problematic with how the data
>> frame is structured relative to what lm() is expecting.
>
> Dan,
>
> I was not comfortable with my explanation, but the formula (and data
> frame) was equivalent to those of the other 8 streams.
>
>> So, I would not give up looking for a solution just yet.
>
> OK. I'm always up for learning more about R and its processes.
>
>> Can you show us the result of str() on the data frame that you attached?
>
> Sure. I subset the original data frame to select only the 6 predictor
> variables and the response variable. Same lm() results. I'll provide the
> data frame, too.
>
> summary(lm(formula = TDS ~ Cond + Ca + Cl + Mg + Na + SO4, data =
> mod.stump.cast))
>
> Call:
> lm(formula = TDS ~ Cond + Ca + Cl + Mg + Na + SO4, data = mod.stump.cast)
>
> Residuals:
> ALL 1 residuals are 0: no residual degrees of freedom!
>
> Coefficients: (6 not defined because of singularities)
>             Estimate Std. Error t value Pr(>|t|)
> (Intercept)      125         NA      NA       NA
> Cond              NA         NA      NA       NA
> Ca                NA         NA      NA       NA
> Cl                NA         NA      NA       NA
> Mg                NA         NA      NA       NA
> Na                NA         NA      NA       NA
> SO4               NA         NA      NA       NA
>
> Residual standard error: NaN on 0 degrees of freedom
> (63 observations deleted due to missingness)
>
> str(mod.stump.cast)
> 'data.frame': 64 obs. of 7 variables:
> $ Ca : num NA NA 24.4 NA 21.4 NA NA NA NA NA ...
> $ Cl : num 1.58 5.6 3 NA 1 5 1.2 4 4 8.4 ...
> $ Cond: num NA NA 190 187 184 NA NA NA NA NA ...
> $ Mg : num NA NA 10 NA 9.1 NA NA NA NA NA ...
> $ Na : num NA NA NA NA NA NA NA NA NA NA ...
> $ SO4 : num 9.4 6.5 9 NA 7 55 6.8 105 15.6 8.4 ...
> $ TDS : num 105 181 112 144 114 308 96 430 108 108 ...
>
> summary(mod.stump.cast)
>      Ca              Cl            Cond            Mg            Na
> Min.   : 0.60  Min.   : 1.000  Min.   :  2.2  Min.   : 9.10  Min.   : 4
> 1st Qu.:23.35  1st Qu.: 2.000  1st Qu.:214.8  1st Qu.:11.00  1st Qu.: 4
> Median :28.35  Median : 4.000  Median :282.5  Median :17.40  Median : 4
> Mean   :32.77  Mean   : 4.076  Mean   :294.6  Mean   :17.85  Mean   : 4
> 3rd Qu.:40.55  3rd Qu.: 5.600  3rd Qu.:372.0  3rd Qu.:22.10  3rd Qu.: 4
> Max.   :64.30  Max.   :13.000  Max.   :636.0  Max.   :32.40  Max.   : 4
> NA's   :50.00  NA's   :11.000  NA's   : 42.0  NA's   :51.00  NA's   :62
>      SO4             TDS
> Min.   :  4.00  Min.   : 14.0
> 1st Qu.:  7.00  1st Qu.:131.2
> Median :  9.40  Median :174.0
> Mean   : 16.31  Mean   :176.9
> 3rd Qu.: 17.00  3rd Qu.:195.5
> Max.   :105.00  Max.   :430.0
> NA's   :  3.00  NA's   :  2.0
>
> mod.stump.cast
> Ca Cl Cond Mg Na SO4 TDS
> 1 NA 1.58 NA NA NA 9.4 105
> 2 NA 5.60 NA NA NA 6.5 181
> 3 24.4 3.00 190.0 10.0 NA 9.0 112
> 4 NA NA 187.0 NA NA NA 144
> 5 21.4 1.00 184.0 9.1 NA 7.0 114
> 6 NA 5.00 NA NA NA 55.0 308
> 7 NA 1.20 NA NA NA 6.8 96
> 8 NA 4.00 NA NA NA 105.0 430
> 9 NA 4.00 NA NA NA 15.6 108
> 10 NA 8.40 NA NA NA 8.4 108
> 11 NA 1.00 NA NA NA 8.8 125
> 12 NA 1.40 NA NA NA 19.4 129
> 13 NA 4.90 NA NA NA 37.0 360
> 14 NA 1.70 NA NA NA 12.0 140
> 15 NA 2.00 NA NA NA 10.0 95
> 16 NA 1.60 NA NA NA 9.1 120
> 17 NA 3.30 NA NA NA 34.0 280
> 18 NA 2.20 NA NA NA 11.0 130
> 19 NA 9.00 NA NA NA 69.0 352
> 20 NA 1.00 NA NA NA 18.0 148
> 21 NA 2.00 NA NA NA 9.0 107
> 22 28.0 1.00 248.0 11.0 4 13.0 125
> 23 32.0 1.00 NA 12.0 4 9.0 139
> 24 NA 5.00 NA NA NA 7.0 188
> 25 NA 4.00 NA NA NA 6.0 201
> 26 NA 3.00 NA NA NA 5.0 178
> 27 NA 2.27 NA NA NA 7.8 197
> 28 NA 1.76 NA NA NA 7.8 187
> 29 NA 5.81 NA NA NA 7.5 182
> 30 NA 4.23 NA NA NA 6.0 165
> 31 NA 4.23 NA NA NA 7.3 186
> 32 NA 6.25 NA NA NA 7.0 191
> 33 NA 6.72 NA NA NA 7.5 190
> 34 34.7 4.00 304.0 17.4 NA 6.0 176
> 35 NA NA 354.0 NA NA 7.0 175
> 36 42.5 5.00 379.0 21.1 NA 7.0 220
> 37 NA 5.80 NA NA NA 5.6 163
> 38 26.0 5.80 300.0 24.0 NA 5.6 163
> 39 NA 2.20 NA NA NA 5.4 152
> 40 NA 5.40 NA NA NA 11.0 221
> 41 NA 5.40 NA NA NA 10.5 171
> 42 NA 4.80 NA NA NA 9.9 204
> 43 NA 8.00 NA NA NA 11.7 174
> 44 NA 1.00 NA NA NA 8.4 190
> 45 NA 4.80 NA NA NA 12.1 174
> 46 NA 5.90 NA NA NA 16.0 210
> 47 NA 5.90 NA NA NA 20.0 190
> 48 NA 13.00 NA NA NA 7.6 180
> 49 NA 5.60 NA NA NA 17.0 200
> 50 NA 1.20 NA NA NA 6.5 180
> 51 0.6 NA 2.2 NA NA NA NA
> 52 21.4 NA 187.0 9.5 NA 8.0 120
> 53 NA NA 285.0 NA NA 22.0 135
> 54 48.3 3.00 378.0 22.1 NA 24.0 228
> 55 63.5 7.00 533.0 29.9 NA 44.0 14
> 56 NA NA 207.0 NA NA NA NA
> 57 NA NA 262.0 NA NA 13.0 156
> 58 28.7 2.00 244.0 12.6 NA 13.0 140
> 59 NA NA 238.0 NA NA 12.0 128
> 60 NA NA 280.0 NA NA 18.0 160
> 61 NA NA 380.0 NA NA 23.0 215
> 62 NA NA 402.0 NA NA 23.0 230
> 63 64.3 7.00 636.0 32.4 NA 73.0 316
> 64 23.0 4.10 300.0 21.0 NA 4.0 163
>
> Thanks,
>
> Rich
Here is your problem:
# 'DF' is the result of copying your data above from the
# clipboard on OSX
DF <- read.table(pipe("pbpaste"), header = TRUE)
> str(DF)
'data.frame': 64 obs. of 7 variables:
$ Ca : num NA NA 24.4 NA 21.4 NA NA NA NA NA ...
$ Cl : num 1.58 5.6 3 NA 1 5 1.2 4 4 8.4 ...
$ Cond: num NA NA 190 187 184 NA NA NA NA NA ...
$ Mg : num NA NA 10 NA 9.1 NA NA NA NA NA ...
$ Na : int NA NA NA NA NA NA NA NA NA NA ...
$ SO4 : num 9.4 6.5 9 NA 7 55 6.8 105 15.6 8.4 ...
$ TDS : int 105 181 112 144 114 308 96 430 108 108 ...
> na.omit(DF)
Ca Cl Cond Mg Na SO4 TDS
22 28 1 248 11 4 13 125
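lm(), like most R model functions, drops incomplete records (any row with an NA in a model variable) by default. After that removal you have only one record left (row 22) to fit the model to, which matches the "63 observations deleted due to missingness" line in your summary. A single observation can determine only the intercept, so the six slope coefficients come back as NA and there are no residual degrees of freedom.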
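
A quick way to check for this before fitting is to count the complete cases and the NAs per column. The following is only a minimal sketch on a made-up data frame: `toy` and its columns `y`, `x1`, `x2` are hypothetical and are not the stream data above. It assumes the default setting of options("na.action"), which is na.omit.

```r
# Hypothetical toy data (not the measurements from this thread): the NAs are
# scattered so that only one row is complete across all three columns.
toy <- data.frame(
  y  = c(10, 12, 15, 11, 14, 13),
  x1 = c(1.0, NA, 3.0, 4.0, NA, 6.0),
  x2 = c(NA, 2.5, 3.5, NA, NA, NA)
)

# Missing values per column: shows which variables cost the most rows.
na_per_column <- colSums(is.na(toy))
print(na_per_column)

# Number of rows that survive listwise deletion for the full model.
n_complete <- sum(complete.cases(toy))
print(n_complete)

# lm() drops incomplete rows under the default na.action (na.omit).
fit_full <- lm(y ~ x1 + x2, data = toy)
print(nobs(fit_full))   # observations actually used in the fit
print(coef(fit_full))   # slopes are NA: one row cannot identify them

# Dropping the sparsest predictor keeps more rows in the fit.
fit_small <- lm(y ~ x1, data = toy)
print(nobs(fit_small))
```

Comparing nobs() of the fitted model with nrow() of the data frame is a cheap sanity check. When the two differ a lot, fitting with fewer, better-populated predictors may be the more sensible model for the data at hand.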
HTH,
Marc Schwartz