[R] factor with numeric names
Saiwing Yeung
saiwing at berkeley.edu
Sat Mar 21 22:02:13 CET 2009
Hi all,
I have a pretty basic question about categorical variables, but I can't
seem to find an answer, so I am hoping someone here can help. I
found that if the factor levels are all numbers, fitting the model
with lm returns coefficient labels that are not very recognizable.
# Example: let's just assume that we want to fit this model
fit <- lm(height ~ age + Seed, data=Loblolly)
# See the category names are all mangled up here
fit
Call:
lm(formula = height ~ age + Seed, data = Loblolly)
Coefficients:
(Intercept)          age       Seed.L       Seed.Q       Seed.C       Seed^4
   -1.31240      2.59052      4.86941      0.87307      0.37894     -0.46853
     Seed^5       Seed^6       Seed^7       Seed^8       Seed^9      Seed^10
    0.55237      0.39659     -0.06507      0.35074     -0.83442      0.42085
    Seed^11      Seed^12      Seed^13
    0.53906     -0.29803     -0.77254
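A quick check that may help narrow down where these labels come from. This is only a sketch: it assumes that Seed is stored as an ordered factor in Loblolly (which is how the dataset's help page describes it) and that the default contrasts option has not been changed. Under those assumptions, lm would use polynomial contrasts for Seed, and the suffixes .L, .Q, .C, ^4, ... would name the polynomial terms rather than the seed sources. The variable names below are made up for illustration.

```r
# Loblolly ships with base R in the datasets package
data(Loblolly)

# Is Seed an ordered factor? (assumption being checked)
seed.is.ordered <- is.ordered(Loblolly$Seed)
print(seed.is.ordered)

# The original level labels of Seed
seed.levels <- levels(Loblolly$Seed)
print(seed.levels)

# Default contrasts: the first entry applies to unordered factors,
# the second to ordered factors
print(getOption("contrasts"))

# Column names of the contrast matrix R would use for Seed;
# lm() appends these to the variable name to build coefficient labels
contrast.names <- colnames(contrasts(Loblolly$Seed))
print(contrast.names)
```

If the check confirms that Seed is ordered, the labels in the output above are contrast names, so there is no one-to-one mapping from a coefficient such as Seed.L to a single seed source.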
One possible solution I found is to rename the factor levels by prefixing them with a letter:
seed.str <- paste("S", Loblolly$Seed, sep="")
seed.str <- factor(seed.str)
fit <- lm(height ~ age + seed.str, data=Loblolly)
fit
Call:
lm(formula = height ~ age + seed.str, data = Loblolly)
Coefficients:
 (Intercept)           age  seed.strS303  seed.strS305  seed.strS307
     -0.4301        2.5905        0.8600        1.8683       -1.9183
seed.strS309  seed.strS311  seed.strS315  seed.strS319  seed.strS321
      0.5350       -1.5933       -0.8867       -0.3650       -2.0350
seed.strS323  seed.strS325  seed.strS327  seed.strS329  seed.strS331
      0.3067       -1.3233       -2.6400       -2.9333       -2.2267
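The two outputs show the same age coefficient but different intercepts, which suggests the two fits are the same model written with a different coding of the seed factor. A minimal sketch to check that, using the two calls from this post; the object names fit.orig, fit.str, same.fit and same.age are made up for illustration:

```r
data(Loblolly)

# Original fit, with Seed as it comes in the dataset
fit.orig <- lm(height ~ age + Seed, data = Loblolly)

# Workaround fit: levels prefixed with "S" and re-created as a plain factor
seed.str <- factor(paste("S", Loblolly$Seed, sep = ""))
fit.str <- lm(height ~ age + seed.str, data = Loblolly)

# Compare fitted values and the age coefficient between the two fits
same.fit <- all.equal(fitted(fit.orig), fitted(fit.str))
same.age <- all.equal(coef(fit.orig)[["age"]], coef(fit.str)[["age"]])
print(same.fit)
print(same.age)
```

If both comparisons come back TRUE, the renaming only changes how the seed effects are labelled and parameterized, not the fitted model itself.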
Now it is actually possible to see which one is which, but it is kind of
lame. Can someone point me to a more elegant solution? Thank you so
much.
Saiwing Yeung