[R] regression function for categorical predictor data
(Ted Harding)
Ted.Harding at manchester.ac.uk
Thu Sep 9 00:33:37 CEST 2010
On 08-Sep-10 21:11:27, karena wrote:
> Hi, do you guys know what function in R handles the multiple regression
> on categorical predictor data. i.e, 'lm' is used to handle continuous
> predictor data.
>
> thanks,
> karena
Karena,
lm() also handles categorical data, provided these are presented
as factors. For example:
set.seed(12345)
X <- 0.05*(-20:20) # Continuous predictor
F <- as.factor(c(rep("A",21),rep("B",20))) # Categorical predictor (factor)
## 21 obs at level "A", 20 at level "B"
## (note: naming a variable F masks R's shorthand F for FALSE)
Y <- 0.5*X + c(0.25*rnorm(21),0.25*rnorm(20)+2.0)
## Y increases linearly with X (coeff = 0.5)
## Y at Level "B" is 2.0 higher than at Level "A"
## "Error" term has SD = 0.25
plot(X,Y)
summary(lm(Y ~ X + F))
# Call: lm(formula = Y ~ X + F)
# Residuals:
#      Min       1Q   Median       3Q      Max
# -0.56511 -0.15807 -0.00034  0.16484  0.44048
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)  0.09561    0.08869   1.078    0.288
# X            0.63621    0.13671   4.654 3.89e-05 ***
# FB           1.93821    0.16181  11.978 1.80e-14 ***
# ---
# Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Residual standard error: 0.2589 on 38 degrees of freedom
# Multiple R-squared: 0.965, Adjusted R-squared: 0.9631
# F-statistic: 523.4 on 2 and 38 DF, p-value: < 2.2e-16
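The reported estimate for FB gives the change in the level of Y
resulting from a change from "A" to "B" in F, at any fixed value of X.
Here it is 1.938, close to the true value of 2.0 used in the simulation.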
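If it helps to see where the name "FB" comes from, the following is a
minimal sketch of how lm() codes the factor. It assumes R's default
treatment contrasts are in effect for unordered factors; the names fit,
fit2 and F2 are introduced here purely for illustration and were not
part of the example above.

```r
## Same simulated data as in the example above
set.seed(12345)
X <- 0.05*(-20:20)
F <- as.factor(c(rep("A",21),rep("B",20)))
Y <- 0.5*X + c(0.25*rnorm(21),0.25*rnorm(20)+2.0)

fit <- lm(Y ~ X + F)

## With treatment contrasts, the first level ("A") is the baseline and
## the factor enters the design matrix as a 0/1 indicator column "FB"
levels(F)
contrasts(F)
head(model.matrix(fit), 3)

## Choosing "B" as the baseline instead gives a coefficient "F2A",
## which is the same level difference with the opposite sign
F2 <- relevel(F, ref = "B")   # F2: illustrative name, not in the original
fit2 <- lm(Y ~ X + F2)
coef(fit)["FB"]
coef(fit2)["F2A"]
```

A factor with more than two levels is handled the same way: one
indicator column per non-baseline level, each estimate being the
difference from the baseline level.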
Hoping this helps,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 08-Sep-10 Time: 23:33:34
------------------------------ XFMail ------------------------------