[R] Discretize factors?
Noah Silverman
noah at smartmediacorp.com
Sun May 16 19:06:35 CEST 2010
Update,
I have it working, but now it's producing really ugly labels. It must need a
small adjustment to the code. Any ideas?
## Create example data.frame
group <- c("A", "B", "B", "C", "C", "C")
a <- c(1, 4, 3, 4, 5, 6)
b <- c(5, 4, 5, 3, 4, 5)
## Build the frame directly; going through cbind() would coerce a and b to character
d <- data.frame(a, b, group)
## Create new frame with discretized group
> cbind(d[, 1:2], model.matrix(~ 0 + d[, 3]))
  a b d[, 3]A d[, 3]B d[, 3]C
1 1 5       1       0       0
2 4 4       0       1       0
3 3 5       0       1       0
4 4 3       0       0       1
5 5 4       0       0       1
6 6 5       0       0       1
So, as you can see, it works, but the labels for the groups come out ugly.
I then tried using the column name instead of the column number and still got
ugly results:
> cbind(d[,1:2], model.matrix(~0+d[,"group"]) )
  a b d[, "group"]A d[, "group"]B d[, "group"]C
1 1 5             1             0             0
2 4 4             0             1             0
3 3 5             0             1             0
4 4 3             0             0             1
5 5 4             0             0             1
6 6 5             0             0             1
Any ideas?
-N
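
One possible adjustment, offered as a sketch rather than a definitive answer: model.matrix() builds each column label by pasting the term as written in the formula in front of the factor level, so a term written as d[, 3] yields labels like "d[, 3]A". Naming the column in the formula and passing the data frame through the data argument should make the term simply "group". The names dummies and res below are illustrative only, and the frame is built with data.frame() directly as an assumption about what was intended.

```r
# Example data from the post, with group stored as a factor
group <- c("A", "B", "B", "C", "C", "C")
a <- c(1, 4, 3, 4, 5, 6)
b <- c(5, 4, 5, 3, 4, 5)
d <- data.frame(a, b, group = factor(group))

# Refer to the column by name and supply the frame via `data =`,
# so the labels are built from "group" instead of "d[, 3]"
dummies <- model.matrix(~ 0 + group, data = d)

# Combine the numeric columns with the indicator columns
res <- cbind(d[, 1:2], dummies)
colnames(res)
# "a" "b" "groupA" "groupB" "groupC"

# If bare level names are preferred, the labels can be overwritten
colnames(dummies) <- levels(d$group)
```

Dropping the "0 +" from the formula would add an intercept column and omit the first level (groupA) as the reference, which is the coding lm uses by default.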
On 5/15/10 11:02 AM, Noah Silverman wrote:
> Hi,
>
> I'm looking for an easy way to discretize factors in R.
>
> I've noticed that the lm function does this automatically with a nice
> result.
>
> If I have
>
> group <- c("A", "B","B","C","C","C")
>
> and run:
>
> lm(result ~ x1 + group)
>
> The lm function has split the group into separate binary variables {0,1}
> before performing the regression. I now have:
> groupA
> groupB
> groupC
>
> Some of the other models that I want to try won't accept factors, so
> they need to be discretized this way.
>
> Is there a command in R for this, or some easy shortcut? (I tried
> digging into the lm code, but couldn't find where this is being done.)
>
> Thanks!
>
> -N