[R] How to generate a new factor variable by two other factor variables
Marc Schwartz
marc_schwartz at comcast.net
Tue Nov 4 17:24:48 CET 2008
The easiest way is probably to use interaction():
> df
  factorA factorB
1       0       0
2       0       0
3       1       0
4       0       1
5       1       1
# Note the default separator of '.'
df$factorC <- with(df, interaction(factorA, factorB))
> df
  factorA factorB factorC
1       0       0     0.0
2       0       0     0.0
3       1       0     1.0
4       0       1     0.1
5       1       1     1.1
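For anyone who wants to reproduce this from scratch, here is a minimal self-contained sketch. How df was built is an assumption on my part (only its printed values appear above), and the column name factorC_us is hypothetical, used only to show the 'sep' argument of interaction():

```r
# Rebuild the example data (assumed construction; only printed values are shown above)
df <- data.frame(factorA = factor(c(0, 0, 1, 0, 1)),
                 factorB = factor(c(0, 0, 0, 1, 1)))

# Combine the two factors; one level is created per A x B combination
df$factorC <- with(df, interaction(factorA, factorB))

# With the default lex.order = FALSE, the first factor varies fastest
levels(df$factorC)   # "0.0" "1.0" "0.1" "1.1"

# A different separator can be requested via 'sep' (hypothetical column name)
df$factorC_us <- with(df, interaction(factorA, factorB, sep = "_"))
```

The level order matters for the next step, since the numeric relabelling relies on "1.0" coming before "0.1".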
If you want the levels labelled 0 to 3, as in your example, you can adjust as follows:
df$factorC <- factor(as.numeric(df$factorC) - 1)
> df
  factorA factorB factorC
1       0       0       0
2       0       0       0
3       1       0       1
4       0       1       2
5       1       1       3
The latter step takes advantage of the integer codes that underlie a
factor (each code is the position of the value in the factor's levels)
and subtracts 1, since those codes are 1-based, not 0-based.
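To make the role of the integer codes visible, here is a small sketch under the same assumed data. The names fc, codes and fc2 are hypothetical helpers introduced for illustration only:

```r
# Same assumed data as in the example (only printed values appear in the thread)
df <- data.frame(factorA = factor(c(0, 0, 1, 0, 1)),
                 factorB = factor(c(0, 0, 0, 1, 1)))
fc <- with(df, interaction(factorA, factorB))

# as.integer() returns the 1-based position of each value in levels(fc)
codes <- as.integer(fc)          # 1 1 2 3 4

# Subtracting 1 gives the 0-based labels requested in the question
df$factorC <- factor(codes - 1)

# Equivalent relabelling without arithmetic: rename the levels in order
fc2 <- fc
levels(fc2) <- as.character(0:3)
```

Both routes depend on the level order produced by interaction(). If unused combinations are dropped (drop = TRUE), the codes would be renumbered over the remaining levels, so check levels() before relying on the mapping.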
See ?interaction
HTH,
Marc Schwartz
on 11/04/2008 10:10 AM Jorge Ivan Velez wrote:
> Dear Shuguang,
> Here are two ways. Perhaps they are not efficient enough, but they work:
>
>
> # Data
> mydata=read.table(textConnection("
> factorA factorB
> 0 0
> 0 0
> 1 0
> 0 1
> 1 1"),header=TRUE)
> closeAllConnections()
>
> # Option 1
> mydata$factorC=as.factor(
>   apply(mydata,1,function(x){
>     paste(x,sep="",collapse="")
>   }
> ))
> levels(mydata$factorC)<-list("0"="00", "1"="10", "2"="01","3"="11")
> mydata
>
> # Option 2
> # You'll need to read the data again to see how this option works
> mydata$factorC<-apply(mydata,1,function(x){
>   ifelse(sum(x)==0,0,
>     ifelse(x[1]==1 & x[2]==0,1,
>       ifelse(x[1]==0 & x[2]==1,2,3)))
>   }
> )
>
> mydata
>
>
> HTH,
>
>
> Jorge
>
>
>
> On Tue, Nov 4, 2008 at 10:29 AM, Shuguang Sun <shuguang at gmail.com> wrote:
>
>> How to generate a new factor variable by two other factor variables?
>>
>> For example, if I have two factor variables, factorA and factorB,
>> factorA factorB
>> 0 0
>> 0 0
>> 1 0
>> 0 1
>> 1 1
>>
>> Is there a simple way to generate a new 4-levels factor variable as
>>
>> factorC factorA factorB
>> 0 0 0
>> 0 0 0
>> 1 1 0
>> 2 0 1
>> 3 1 1
>>
>> --
>> Shuguang Sun
>> Fudan University, China
>>