[R] Interpreting Multiple Linear Regression Summary

Marc Schwartz marc_schwartz at me.com
Wed Nov 9 20:28:59 CET 2011


On Nov 9, 2011, at 1:17 PM, Rich Shepard wrote:

> On Wed, 9 Nov 2011, Daniel Nordlund wrote:
> 
>> I would guess that there is something problematic with how the data
>> frame is structured relative to what lm() is expecting.
> 
> Dan,
> 
>  I was not comfortable with my explanation, but the formula (and data
> frame) was equivalent to those of the other 8 streams.
> 
>> So, I would not give up looking for a solution just yet.
> 
>  OK. I'm always up for learning more about R and its processes.
> 
>> Can you show us the result of str() on the data frame that you attached?
> 
>  Sure. I subset the original data frame to select only the 6 predictor
> variables and the response variable. Same lm() results. I'll provide the
> data frame, too.
> 
> summary(lm(formula = TDS ~ Cond + Ca + Cl + Mg + Na + SO4, data =
> mod.stump.cast))
> 
> Call:
> lm(formula = TDS ~ Cond + Ca + Cl + Mg + Na + SO4, data = mod.stump.cast)
> 
> Residuals:
> ALL 1 residuals are 0: no residual degrees of freedom!
> 
> Coefficients: (6 not defined because of singularities)
>            Estimate Std. Error t value Pr(>|t|)
> (Intercept)      125         NA      NA       NA
> Cond              NA         NA      NA       NA
> Ca                NA         NA      NA       NA
> Cl                NA         NA      NA       NA
> Mg                NA         NA      NA       NA
> Na                NA         NA      NA       NA
> SO4               NA         NA      NA       NA
> 
> Residual standard error: NaN on 0 degrees of freedom
>  (63 observations deleted due to missingness)
> 
> str(mod.stump.cast)
> 'data.frame':	64 obs. of  7 variables:
> $ Ca  : num  NA NA 24.4 NA 21.4 NA NA NA NA NA ...
> $ Cl  : num  1.58 5.6 3 NA 1 5 1.2 4 4 8.4 ...
> $ Cond: num  NA NA 190 187 184 NA NA NA NA NA ...
> $ Mg  : num  NA NA 10 NA 9.1 NA NA NA NA NA ...
> $ Na  : num  NA NA NA NA NA NA NA NA NA NA ...
> $ SO4 : num  9.4 6.5 9 NA 7 55 6.8 105 15.6 8.4 ...
> $ TDS : num  105 181 112 144 114 308 96 430 108 108 ...
> 
> summary(mod.stump.cast)
>       Ca              Cl              Cond             Mg              Na
> Min.   : 0.60   Min.   : 1.000   Min.   :  2.2   Min.   : 9.10   Min.   : 4
> 1st Qu.:23.35   1st Qu.: 2.000   1st Qu.:214.8   1st Qu.:11.00   1st Qu.: 4
> Median :28.35   Median : 4.000   Median :282.5   Median :17.40   Median : 4
> Mean   :32.77   Mean   : 4.076   Mean   :294.6   Mean   :17.85   Mean   : 4
> 3rd Qu.:40.55   3rd Qu.: 5.600   3rd Qu.:372.0   3rd Qu.:22.10   3rd Qu.: 4
> Max.   :64.30   Max.   :13.000   Max.   :636.0   Max.   :32.40   Max.   : 4
> NA's   :50.00   NA's   :11.000   NA's   : 42.0   NA's   :51.00   NA's   :62
>      SO4              TDS
> Min.   :  4.00   Min.   : 14.0
> 1st Qu.:  7.00   1st Qu.:131.2
> Median :  9.40   Median :174.0
> Mean   : 16.31   Mean   :176.9
> 3rd Qu.: 17.00   3rd Qu.:195.5
> Max.   :105.00   Max.   :430.0
> NA's   :  3.00   NA's   :  2.0
> 
> mod.stump.cast
>     Ca    Cl  Cond   Mg Na   SO4 TDS
> 1    NA  1.58    NA   NA NA   9.4 105
> 2    NA  5.60    NA   NA NA   6.5 181
> 3  24.4  3.00 190.0 10.0 NA   9.0 112
> 4    NA    NA 187.0   NA NA    NA 144
> 5  21.4  1.00 184.0  9.1 NA   7.0 114
> 6    NA  5.00    NA   NA NA  55.0 308
> 7    NA  1.20    NA   NA NA   6.8  96
> 8    NA  4.00    NA   NA NA 105.0 430
> 9    NA  4.00    NA   NA NA  15.6 108
> 10   NA  8.40    NA   NA NA   8.4 108
> 11   NA  1.00    NA   NA NA   8.8 125
> 12   NA  1.40    NA   NA NA  19.4 129
> 13   NA  4.90    NA   NA NA  37.0 360
> 14   NA  1.70    NA   NA NA  12.0 140
> 15   NA  2.00    NA   NA NA  10.0  95
> 16   NA  1.60    NA   NA NA   9.1 120
> 17   NA  3.30    NA   NA NA  34.0 280
> 18   NA  2.20    NA   NA NA  11.0 130
> 19   NA  9.00    NA   NA NA  69.0 352
> 20   NA  1.00    NA   NA NA  18.0 148
> 21   NA  2.00    NA   NA NA   9.0 107
> 22 28.0  1.00 248.0 11.0  4  13.0 125
> 23 32.0  1.00    NA 12.0  4   9.0 139
> 24   NA  5.00    NA   NA NA   7.0 188
> 25   NA  4.00    NA   NA NA   6.0 201
> 26   NA  3.00    NA   NA NA   5.0 178
> 27   NA  2.27    NA   NA NA   7.8 197
> 28   NA  1.76    NA   NA NA   7.8 187
> 29   NA  5.81    NA   NA NA   7.5 182
> 30   NA  4.23    NA   NA NA   6.0 165
> 31   NA  4.23    NA   NA NA   7.3 186
> 32   NA  6.25    NA   NA NA   7.0 191
> 33   NA  6.72    NA   NA NA   7.5 190
> 34 34.7  4.00 304.0 17.4 NA   6.0 176
> 35   NA    NA 354.0   NA NA   7.0 175
> 36 42.5  5.00 379.0 21.1 NA   7.0 220
> 37   NA  5.80    NA   NA NA   5.6 163
> 38 26.0  5.80 300.0 24.0 NA   5.6 163
> 39   NA  2.20    NA   NA NA   5.4 152
> 40   NA  5.40    NA   NA NA  11.0 221
> 41   NA  5.40    NA   NA NA  10.5 171
> 42   NA  4.80    NA   NA NA   9.9 204
> 43   NA  8.00    NA   NA NA  11.7 174
> 44   NA  1.00    NA   NA NA   8.4 190
> 45   NA  4.80    NA   NA NA  12.1 174
> 46   NA  5.90    NA   NA NA  16.0 210
> 47   NA  5.90    NA   NA NA  20.0 190
> 48   NA 13.00    NA   NA NA   7.6 180
> 49   NA  5.60    NA   NA NA  17.0 200
> 50   NA  1.20    NA   NA NA   6.5 180
> 51  0.6    NA   2.2   NA NA    NA  NA
> 52 21.4    NA 187.0  9.5 NA   8.0 120
> 53   NA    NA 285.0   NA NA  22.0 135
> 54 48.3  3.00 378.0 22.1 NA  24.0 228
> 55 63.5  7.00 533.0 29.9 NA  44.0  14
> 56   NA    NA 207.0   NA NA    NA  NA
> 57   NA    NA 262.0   NA NA  13.0 156
> 58 28.7  2.00 244.0 12.6 NA  13.0 140
> 59   NA    NA 238.0   NA NA  12.0 128
> 60   NA    NA 280.0   NA NA  18.0 160
> 61   NA    NA 380.0   NA NA  23.0 215
> 62   NA    NA 402.0   NA NA  23.0 230
> 63 64.3  7.00 636.0 32.4 NA  73.0 316
> 64 23.0  4.10 300.0 21.0 NA   4.0 163
> 
> Thanks,
> 
> Rich


Here is your problem:

# 'DF' is the result of copying your data above from the
# clipboard on OSX
DF <- read.table(pipe("pbpaste"), header = TRUE)

> str(DF)
'data.frame':	64 obs. of  7 variables:
 $ Ca  : num  NA NA 24.4 NA 21.4 NA NA NA NA NA ...
 $ Cl  : num  1.58 5.6 3 NA 1 5 1.2 4 4 8.4 ...
 $ Cond: num  NA NA 190 187 184 NA NA NA NA NA ...
 $ Mg  : num  NA NA 10 NA 9.1 NA NA NA NA NA ...
 $ Na  : int  NA NA NA NA NA NA NA NA NA NA ...
 $ SO4 : num  9.4 6.5 9 NA 7 55 6.8 105 15.6 8.4 ...
 $ TDS : int  105 181 112 144 114 308 96 430 108 108 ...

> na.omit(DF)
   Ca Cl Cond Mg Na SO4 TDS
22 28  1  248 11  4  13 125


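After removing incomplete records (any record containing an NA value), which is the default behavior of R's model functions, you have only one record left to fit the model to. With a single observation, lm() can estimate only the intercept (125, the TDS value in row 22), so the six slope coefficients are NA and there are no residual degrees of freedom.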
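
A quick way to spot this before fitting is to count the complete rows and compare that with the number of rows lm() actually used. The following is only a minimal sketch: the data frame 'toy' and its columns y, x1 and x2 are made-up stand-ins for your data (not your actual values), and it assumes the session still has the default na.action setting.

```r
# Toy data frame (hypothetical values) mimicking the pattern in the thread:
# the predictors are mostly NA and only one row (row 4) is complete.
toy <- data.frame(
  y  = c(105, 181, 112, 125, 139),
  x1 = c(NA, NA, 24.4, 28.0, 32.0),
  x2 = c(NA, NA, NA, 4, NA)
)

# Missing values per column, and the number of rows with no NA at all
na_per_column <- colSums(is.na(toy))
n_complete    <- sum(complete.cases(toy))

# With the default getOption("na.action") of "na.omit", lm() drops every
# row that has an NA in any variable used in the formula
fit <- lm(y ~ x1 + x2, data = toy)

# nobs() reports how many rows were actually used in the fit
n_used <- nobs(fit)

print(na_per_column)
print(n_complete)
print(n_used)
print(coef(fit))
```

If n_complete is not larger than the number of coefficients in the model, the fit cannot estimate all of them, which is the situation in your summary() output.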

HTH,

Marc Schwartz


