[R] how to create a new column from two columns with conditions

Rui Barradas ruipbarradas using sapo.pt
Thu Apr 30 07:21:39 CEST 2020


Hello,

Inline.

At 22:44 on 29/04/20, Ana Marija wrote:
> Hi Rui,
> 
> thanks for getting back to me
> so I tried your method and I got:
>> sum(b$PHENO==2, na.rm=T)
> [1] 828
>> sum(b$PHENO==1, na.rm=T)
> [1] 859
> 
> Can you please tell me if
> b$PHENO <- (b$FLASER == 2 | b$PLASER == 2) + 1L
> 
> just assigns PHENO=2 if b$FLASER == 2 | b$PLASER == 2 and everything else is 1?

Yes, that's it. If b$FLASER == 2 | b$PLASER == 2 returns TRUE then 
adding 1 will give

TRUE + 1 -> 1 + 1 -> 2

This is because logical values are internally coded as integers 0 and 1.
And if the condition returns FALSE it becomes

FALSE + 1 -> 0 + 1 -> 1

In both cases the result is what you want.
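The arithmetic above can be checked on a small example. This is only a sketch: the data frame toy and its values are made up for illustration, and only the column names FLASER, PLASER and PHENO come from this thread.

```r
# Hypothetical toy data (not the real data from this thread), no NA values
toy <- data.frame(FLASER = c(1, 2, 3, 1),
                  PLASER = c(1, 1, 2, 3))

# Logical condition: TRUE where either column equals 2
cond <- toy$FLASER == 2 | toy$PLASER == 2

# Adding 1L coerces FALSE/TRUE to 0/1, giving the integer codes 1/2
toy$PHENO <- cond + 1L

# The ifelse() version gives the same codes (as doubles rather than integers)
toy$PHENO2 <- ifelse(cond, 2, 1)

print(toy)
```

Rows 2 and 3 satisfy the condition and get 2; rows 1 and 4 get 1.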


> 
> Please see what my data looks like:
>> sum(b$FLASER==2, na.rm=T)
> [1] 92
>> sum(b$FLASER==1, na.rm=T)
> [1] 1533
>> sum(b$PLASER==1, na.rm=T)
> [1] 850
>> sum(b$PLASER==2, na.rm=T)
> [1] 806
>> dim(b)
> [1] 1698    5
>> unique(b$FLASER)
> [1]  1  3  2 NA
>> unique(b$PLASER)
> [1]  1  2  3 NA
> 

What I wrote above about the TRUE case is valid even if your data 
contains NAs, as it does. This is because

(TRUE | x) == (x | TRUE) == TRUE

even if x is NA.
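The rule above can be checked directly at the console. The sketch below also shows the complementary case, FALSE | NA, which is NA rather than FALSE; the name res is just an illustrative variable.

```r
# Three-valued logic of `|` in R
TRUE | NA    # TRUE: the result does not depend on the unknown value
NA | TRUE    # TRUE
FALSE | NA   # NA: the result depends on the unknown value
NA | NA      # NA

# Consequence for the recoding: an NA condition stays NA after adding 1L
res <- c(TRUE | NA, FALSE | NA) + 1L
res
```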

This is an example with some NA values in the data.

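set.seed(1234)              # make the random NA positions reproducible
b <- rbind(b, b)            # duplicate the rows of the existing data frame b
i <- sample(nrow(b), 3)     # 3 random rows get NA in FLASER
b$FLASER[i] <- NA
i <- sample(nrow(b), 2)     # 2 random rows get NA in PLASER
b$PLASER[i] <- NA
b$PLASER[10] <- 2           # set PLASER to 2 in row 10

b$PHENO <- (b$FLASER == 2 | b$PLASER == 2) + 1L
b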


As you can see,

row 5:

b$FLASER is NA, b$PLASER == 2 evaluates to TRUE -> the condition is TRUE and b$PHENO is 2

row 10:

b$FLASER == 2 evaluates to TRUE, b$PLASER is NA -> the condition is TRUE and b$PHENO is 2


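So the code is not broken by NAs: a row gets 2 whenever either column 
is 2. Note, however, that FALSE | NA is NA, so a row where one column is 
NA and the other is not 2 gets PHENO NA, not 1. That would explain why 
your two counts, 828 + 859 = 1687, are fewer than your 1698 rows.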
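If you do not have the original b at hand, the same behaviour can be reproduced with a small made-up data frame. This is a sketch under stated assumptions: toy, its values and the column PHENO_noNA are hypothetical, and coding an undetermined condition as 1 is only one possible choice, not something requested in this thread.

```r
# Hypothetical toy data with NA values (not the real data from this thread)
toy <- data.frame(FLASER = c(1, NA, 2, NA, 1),
                  PLASER = c(2, 2, NA, 1, NA))

# TRUE if either column is 2, NA if that cannot be decided, FALSE otherwise
cond <- toy$FLASER == 2 | toy$PLASER == 2

# NA conditions propagate: rows 4 and 5 get NA
toy$PHENO <- cond + 1L

# Optional: treat an undetermined (NA) condition as "not 2"
# (NA %in% TRUE is FALSE, so those rows are coded 1)
toy$PHENO_noNA <- (cond %in% TRUE) + 1L

print(toy)
```

Whether NA or 1 is the right code for those rows depends on what PHENO is used for afterwards.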

Hope this helps,

Rui Barradas

> On Wed, Apr 29, 2020 at 4:10 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:
>>
>> Hello,
>>
>> Here is another way. The condition returns FALSE/TRUE or 0/1. Add 1 to
>> get the expected result.
>> It has the advantage of being faster than ifelse.
>>
>> b$PHENO <- (b$FLASER == 2 | b$PLASER == 2) + 1L
>>
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>> At 20:42 on 29/04/20, Ana Marija wrote:
>>> Thanks, I did this:
>>> b$PHENO<- ifelse(b$FLASER ==2 | b$PLASER ==2, 2, 1)
>>>
>>> On Wed, Apr 29, 2020 at 2:36 PM Ivan Krylov <krylov.r00t using gmail.com> wrote:
>>>>
>>>> On Wed, 29 Apr 2020 14:19:18 -0500
>>>> Ana Marija <sokovic.anamarija using gmail.com> wrote:
>>>>
>>>>> My conditions for creating a new column PHENO would be this:
>>>>>
>>>>> if FLASER or PLASER =2 then PHENO=2
>>>>> otherwise PHENO=1
>>>>
>>>> On Wed, 29 Apr 2020 15:30:45 -0400
>>>> "Patrick (Malone Quantitative)" <malone using malonequantitative.com> wrote:
>>>>
>>>>> If you don't mind using tidyverse, you can do this easily with
>>>>> if_else.
>>>>
>>>> ...and if you want to stay with base R, you can use the ifelse
>>>> function.
>>>>
>>>> --
>>>> Best regards,
>>>> Ivan
>>>


