[R] Discretize factors?
Peter Ehlers
ehlers at ucalgary.ca
Sun May 16 20:22:26 CEST 2010
On 2010-05-16 11:06, Noah Silverman wrote:
> Update,
>
> I have it working, but now it's producing really ugly labels. Must be a
> small adjustment to the code. Any ideas??
>
> ##Create example data.frame
> group<- c("A", "B","B","C","C","C")
> a<- c(1,4,3,4,5,6)
> b<- c(5,4,5,3,4,5)
> d<- data.frame(cbind(a,b,group))
>
> #create new frame with discretized group
>> cbind(d[,1:2], model.matrix(~0+d[,3]) )
> a b d[, 3]A d[, 3]B d[, 3]C
> 1 1 5 1 0 0
> 2 4 4 0 1 0
> 3 3 5 0 1 0
> 4 4 3 0 0 1
> 5 5 4 0 0 1
> 6 6 5 0 0 1
>
>
> So, as you can see, it works, but the labels for the groups don't
>
> I then tried using the column name instead of number and still got ugly
> results:
>
>> cbind(d[,1:2], model.matrix(~0+d[,"group"]) )
> a b d[, "group"]A d[, "group"]B d[, "group"]C
> 1 1 5 1 0 0
> 2 4 4 0 1 0
> 3 3 5 0 1 0
> 4 4 3 0 0 1
> 5 5 4 0 0 1
> 6 6 5 0 0 1
>
>
>
> Any ideas?
>
Can't you just use names(...) <- c() on your final dataframe?
-Peter Ehlers
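
A minimal sketch of the renaming suggested above. The target names groupA, groupB, groupC and the object name res are illustrative assumptions, not something given in the thread. The data frame is built with data.frame(a, b, group) rather than data.frame(cbind(a, b, group)), because cbind() on mixed vectors coerces a and b to character.

```r
# Rebuild the example data; data.frame() keeps a and b numeric
group <- c("A", "B", "B", "C", "C", "C")
a <- c(1, 4, 3, 4, 5, 6)
b <- c(5, 4, 5, 3, 4, 5)
d <- data.frame(a, b, group)

# Indicator columns built the same way as in the quoted message
res <- cbind(d[, 1:2], model.matrix(~ 0 + d[, 3]))

# Overwrite only the three generated labels, deriving them from the levels
# ("group" as a prefix is an assumed naming choice)
names(res)[3:5] <- paste0("group", levels(factor(d$group)))

print(names(res))
```

Assigning to names(res)[3:5] leaves the a and b column names alone, so only the generated labels change.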
> -N
>
>
>
> On 5/15/10 11:02 AM, Noah Silverman wrote:
>> Hi,
>>
>> I'm looking for an easy way to discretize factors in R.
>>
>> I've noticed that the lm function does this automatically with a nice
>> result.
>>
>> If I have
>>
>> group<- c("A", "B","B","C","C","C")
>>
>> and run:
>>
>> lm(result ~ x1 + group)
>>
>> The lm function has split the group into separate binary variables {0,1}
>> before performing the regression. I now have:
>> groupA
>> groupB
>> groupC
>>
>> Some of the other models that I want to try won't accept factors, so
>> they need to be discretized this way.
>>
>> Is there a command in R for this, or some easy shortcut? (I tried
>> digging into the lm code, but couldn't find where this is being done.)
>>
>> Thanks!
>>
>> -N
>>
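
For what it's worth, lm() builds these 0/1 columns through model.matrix(), and the column labels are simply the term as written in the formula followed by the level. A hedged sketch, assuming the data frame has a column literally named group (the object names X and d2 are made up for illustration): passing the data frame via the data argument and referring to the column by name should give the groupA/groupB/groupC labels directly.

```r
# Example data, with group stored as a factor
d <- data.frame(
  a = c(1, 4, 3, 4, 5, 6),
  b = c(5, 4, 5, 3, 4, 5),
  group = factor(c("A", "B", "B", "C", "C", "C"))
)

# ~ 0 + group drops the intercept, so every level gets its own 0/1 column;
# the term is written as "group", so the columns are labelled group<level>
X <- model.matrix(~ 0 + group, data = d)

# Combine the numeric columns with the indicator columns
d2 <- cbind(d[, c("a", "b")], X)

print(colnames(X))
```

With the intercept kept (~ group), the first level would be absorbed into the intercept and only the remaining levels would get columns, which is what lm() does by default.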