[R] "And" condition spanning over multiple columns in data frame

Rui Barradas ru|pb@rr@d@@ @end|ng |rom @@po@pt
Thu Sep 12 16:36:32 CEST 2024


At 08:42 on 12/09/2024, Francesca wrote:
> Dear contributors,
> I need to create a set of columns, based on conditions of a dataframe as
> follows.
> I have managed to do the trick for one column, but I do not seem to find
> any good example where the condition is extended to all the dataframe.
> 
> I have this dataframe, called c10Dt:
> 
> 
> 
> id cp1 cp2 cp3 cp4 cp5 cp6 cp7 cp8 cp9 cp10 cp11 cp12
> 1  1  NA  NA  NA  NA  NA  NA  NA  NA  NA   NA   NA   NA
> 2  4   8  18  15  10  12  11   9  18   8   16   15   NA
> 3  3   8   5   5   4  NA   5  NA   6  NA   10   10   10
> 4  3   5   5   4   4   3   2   1   3   2    1    1    2
> 5  1  NA  NA  NA  NA  NA  NA  NA  NA  NA   NA   NA   NA
> 6  2   5   5  10  10   9  10  10  10  NA   10    9   10
> The columns are id, cp1, cp2, and so on. What I need to do is the
> following, shown here for just one column:
>
>   c10Dt <- mutate(c10Dt, exit1 = ifelse(is.na(cp1) & id != 1, 1, 0))
>
> So, I create a new variable, called exit1: the program takes cp1,
> checks whether it is NA, and if it is NA and the value of the column
> "id" is not 1, it gives back a 1, otherwise 0. So, what I want is that
> it selects all the cases in which the id = 2, 3, or 4 is not NA in the
> corresponding values of the matrix. I managed to do it manually, column
> by column, but I feel there should be something smarter here. The
> problem is that I need to replicate this over all the columns from cp2
> to cp12, while keeping the id column fixed. I have tried with
>
>   c10Dt %>% mutate(x=across(starts_with("cp"), ~ifelse(. == NA)) & id!=1,1,0 )
>
> but the problem with across is that it will apply the condition only
> to the cp_ columns. How do I tell R to use the column id together with
> all the other columns?
>
> Thanks for any help provided.
> Francesca
> ----------------------------------

Hello,

Something like this?

1. If an ifelse instruction is meant to create a binary result, coerce 
the logical condition to integer instead. You can make this more explicit 
by substituting as.integer for the plus sign below;
2. the .names argument is used to create new columns while keeping the 
original ones.
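
As a small illustration of point 1, here is a base R sketch (the vector cond is made up for the example and is not part of the original data). It suggests that the three ways of turning a logical condition into 0/1 agree, including on NA, and differ only in the storage type of the result:

```r
# Made-up logical condition, including a missing value
cond <- c(TRUE, FALSE, NA)

via_ifelse  <- ifelse(cond, 1, 0)  # numeric (double) result
via_plus    <- +cond               # unary plus coerces logical to integer
via_integer <- as.integer(cond)    # explicit coercion, same integer result

# The two coercions are identical; ifelse() matches them in value
identical(via_plus, via_integer)
identical(as.numeric(via_plus), via_ifelse)
```

In all three cases an NA in the condition stays NA in the result.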



df1 <- read.table(text = "id cp1 cp2 cp3 cp4 cp5 cp6 cp7 cp8 cp9 cp10 cp11 cp12
1  1  NA  NA  NA  NA  NA  NA  NA  NA  NA   NA   NA   NA
2  4   8  18  15  10  12  11   9  18   8   16   15   NA
3  3   8   5   5   4  NA   5  NA   6  NA   10   10   10
4  3   5   5   4   4   3   2   1   3   2    1    1    2
5  1  NA  NA  NA  NA  NA  NA  NA  NA  NA   NA   NA   NA
6  2   5   5  10  10   9  10  10  10  NA   10    9   10", header = TRUE)
df1

library(dplyr)

df1 %>%
   mutate(across(starts_with("cp"), ~ +(is.na(.) & id != 1),
                 .names = "{col}_new"))

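For comparison, the same flags can probably be computed without dplyr. This is only a base R sketch under stated assumptions: the small data frame below is made-up illustrative data (not the poster's c10Dt), and the exit* column names follow the naming used in the original question rather than the {col}_new names used above.

```r
# Made-up illustrative data: an id column plus three cp columns
df1 <- data.frame(
  id  = c(1, 4, 3, 2),
  cp1 = c(NA, 8, 8, 5),
  cp2 = c(NA, 18, NA, 5),
  cp3 = c(NA, 15, 5, NA)
)

# Names of all columns starting with "cp"
cp_cols <- grep("^cp", names(df1), value = TRUE)

# is.na() on a data frame returns a logical matrix; the logical vector
# id != 1 has one element per row and is recycled down each column.
flags <- is.na(df1[cp_cols]) & (df1$id != 1)

# Unary plus turns TRUE/FALSE into 1/0 and keeps the matrix shape
flags <- +flags

# Assumed naming: cp1 -> exit1, cp2 -> exit2, ...
colnames(flags) <- sub("^cp", "exit", cp_cols)

# Original columns are kept, flag columns are appended
result <- cbind(df1, flags)
result
```

The recycling lines up because a matrix is stored column by column, so a vector with one element per row is matched against each column in turn.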


Hope this helps,

Rui Barradas

