[R] Question About lm()
Bromaghin, Jeffrey F
jbrom@gh|n @end|ng |rom u@g@@gov
Wed Feb 9 23:00:40 CET 2022
Hello,
I was constructing a simple linear model for a colleague, with one categorical predictor (three levels) and one quantitative predictor. I estimated the model parameters with and without an intercept, parameterizations sometimes called reference cell coding and cell means coding, respectively.
Model 1: yResp ~ -1 + xCat + xCont
Model 2: yResp ~ xCat + xCont
These two models are equivalent parameterizations and the estimated coefficients come out fine, but the R-squared and F statistics returned by summary() differ markedly. I spent some time looking at the code for both lm() and summary.lm() but did not find the source of the difference. The results from aov() and anova() also differ, so I suspect the issue involves how the sums of squares are being computed. I have also searched online for information on this, without success. I haven't used lm() for quite a while, but my recollection is that these differences didn't occur in the distant past when I was teaching.
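A minimal sketch with simulated data reproduces the discrepancy described above. The data, seed, and object names (fit1, fit2, rss, tss_centered, tss_uncentered) are hypothetical and chosen only for illustration; they are not the data from the original analysis. The sketch fits both parameterizations, confirms that the fitted values agree, and then compares the R-squared reported by summary() against values computed by hand from a total sum of squares taken about the mean and about zero.

```r
# Simulated data; all values and names are illustrative assumptions
set.seed(1)
n <- 60
xCat  <- factor(rep(c("a", "b", "c"), each = n / 3))
xCont <- rnorm(n)
yResp <- 10 + c(0, 2, 4)[as.integer(xCat)] + 1.5 * xCont + rnorm(n)

fit1 <- lm(yResp ~ -1 + xCat + xCont)  # Model 1: no intercept (cell means coding)
fit2 <- lm(yResp ~ xCat + xCont)       # Model 2: intercept (reference cell coding)

# The two parameterizations give the same fitted values,
# and therefore the same residual sum of squares
print(all.equal(fitted(fit1), fitted(fit2)))
rss <- sum(resid(fit2)^2)

# Total sum of squares about the mean versus about zero
tss_centered   <- sum((yResp - mean(yResp))^2)
tss_uncentered <- sum(yResp^2)

# Reported R-squared next to the hand-computed value for each model
print(c(reported = summary(fit2)$r.squared, by_hand = 1 - rss / tss_centered))
print(c(reported = summary(fit1)$r.squared, by_hand = 1 - rss / tss_uncentered))

# The numerator degrees of freedom of the F statistic also differ
print(summary(fit1)$fstatistic)
print(summary(fit2)$fstatistic)
```

In this sketch the reported R-squared for Model 2 matches the value based on the sum of squares about the mean, while the value for Model 1 matches the one based on the sum of squares about zero. That agrees with the Details section of ?summary.lm, which describes R-squared as being computed relative to the mean of the response when the model has an intercept and relative to zero otherwise.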
Thanks in advance for any insights you might have,
Jeff
Jeffrey F. Bromaghin
Research Statistician
USGS Alaska Science Center
907-786-7086
Jeffrey Bromaghin, Ph.D. | U.S. Geological Survey (usgs.gov) <https://www.usgs.gov/staff-profiles/jeffrey-bromaghin>
Ecosystems Analytics | U.S. Geological Survey (usgs.gov) <https://www.usgs.gov/centers/alaska-science-center/science/ecosystems-analytics>