[R] Combine two variables

Rui Barradas ruipbarradas at sapo.pt
Tue Sep 11 17:54:12 CEST 2012


Hello,

Comments inline.

On 11-09-2012 15:57, Simon Kiss wrote:
> Hi:
> I have two variables in a data frame that are the results of a wording experiment in a survey. I'd like to create a third variable that combines the two variables.  Recode doesn't seem to work, because it just recodes the first variable into the third, then recodes the second variable into the third, overwriting the first recode. I can do this with a rather elaborate indexing process, subsetting the first column and then copying the data into the second etc. But I'm looking for a cleaner way to do this. The data frame looks like this.
>
>
> df<-data.frame(var1=sample(c('a','b','c',NA),replace=TRUE, size=100), var2=sample(c('a','b','c',NA),replace=TRUE,size=100))
>
> df<-subset(df, !is.na(var1) |!is.na(var2))
>
> As you can see, if one variable has an NA, then the other variable has a valid value,

No, not necessarily. You are using sample(), and there's no reason to
believe the NAs in var1 and var2 will fall in different rows. My first
try, without your subset() line, gave me several rows with both columns
NA. I then used set.seed() to make the example reproducible.

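set.seed(1)
df1 <- data.frame(var1 = sample(c('a','b','c',NA), replace=TRUE, size=100),
                  var2 = sample(c('a','b','c',NA), replace=TRUE, size=100))
# Count the rows where both columns are NA.
# (R >= 3.6.0 changed sample()'s default algorithm, so the count may differ.)
sum(is.na(df1$var1) & is.na(df1$var2))  # 8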

So I suppose your real dataset has no rows where both columns are NA.
Try the following.

# Start from var1, then fill its NAs with the matching values of var2.
df1$var3 <- df1$var1
idx <- is.na(df1$var1)
df1$var3[idx] <- df1$var2[idx]
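
As a cross-check on the indexed assignment above, here is a minimal sketch
of the same combination written with ifelse(), plus two counts that show
which rows deserve a second look. The data values are made up for
illustration, and the names both_na and both_present are not from the
thread. The sketch assumes character columns (stringsAsFactors = FALSE).

```r
# Illustrative data only: column names follow the thread, values are invented.
df1 <- data.frame(var1 = c("a", NA, "c", NA, "b"),
                  var2 = c(NA, "b", NA, NA, "c"),
                  stringsAsFactors = FALSE)

# Take var1 where it is present, otherwise fall back to var2.
df1$var3 <- ifelse(is.na(df1$var1), df1$var2, df1$var1)

# Rows where both columns are NA: var3 stays NA there.
both_na <- is.na(df1$var1) & is.na(df1$var2)

# Rows where both columns are present: var3 keeps var1 and ignores var2.
both_present <- !is.na(df1$var1) & !is.na(df1$var2)

sum(both_na)
sum(both_present)
```

If the columns are factors (the data.frame() default before R 4.0.0),
ifelse() would return integer codes, so convert with as.character()
first or stay with the indexed assignment.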


Hope this helps,

Rui Barradas
>   so how do I just combine the two variables into one?
> Thank you for your assistance.
> Simon Kiss