Hello:

This seems like an obvious question, but I am having trouble answering it. I am new to R, so I apologize if it's too simple to be posting. I have searched for solutions to no avail.

I have data that I am trying to set up for further analysis ("training data"). What I need is 12 groups based on patterns of 4 variables. The complication comes in when missing data is present. Let me describe with an example, focusing on just 3 of the 12 groups:

    > vec=c(1,1,1,1,1,1,NA,NA,1,1,0,0,1,NA,1,1,1,NA,0,0,1,NA,1,0,0,0,0,1,0,0,0,0,NA,NA,NA,NA,1,NA,0,NA,1,NA,1,NA)
    > a=matrix(vec, ncol=4,nrow=11, byrow=TRUE)
    > edit(a)
          col1 col2 col3 col4
     [1,]    1    1    1    1
     [2,]    1    1   NA   NA
     [3,]    1    1    0    0
     [4,]    1   NA    1    1
     [5,]    1   NA    0    0
     [6,]    1   NA    1    0
     [7,]    0    0    0    1
     [8,]    0    0    0    0
     [9,]   NA   NA   NA   NA
    [10,]    1   NA    0   NA
    [11,]    1   NA    1   NA

Here are 11 individuals. I want the following groups (coded as three separate binary variables):

Group1 - scored a 1 on col1 and on at least one other column (multiple 1s)
Group2 - scored a 1 on col1 but only once (no other 1s)
Group3 - did not score a 1 in col1

This seems straightforward, except missingness complicates it. Take individual 5: this person should be placed in Groups 1 AND 2 because we don't know the score on col2. Same with individual 10, though the response pattern differs.

I tried using if statements, but am running into the problem that if is not vectorized, and I can't seem to make if run with apply. I can use ifelse, but it's very clunky and inefficient to list all possible patterns. (Note that this does not cover all patterns; it's just an example of what I've been doing.)

    dd$TEST1=ifelse(is.na(d$C8W1raw),1,
      (ifelse(d$C8W1raw==1 & is.na(d$C9W1raw) & is.na(d$C11AW1raw) & is.na(d$C12AW1rraw),777899,
        (ifelse((d$C8W1raw==1 & d$C9W1raw==1) | (d$C8W1raw==1 & d$C11AW1raw==1) | (d$C8W1raw==1 & d$C12AW1rraw==1),1,
          (ifelse(d$C8W1raw==1 & ((is.na(d$C9W1raw) | d$C9W1raw==0) & (is.na(d$C11AW1raw) | d$C11AW1raw==0) & (is.na(d$C12AW1rraw) | d$C12AW1rraw==0)),777899,
            0)))))))

Any ideas on how to approach this efficiently?

Thanks,
Andrea
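
For illustration only, here is a minimal sketch of one vectorized way the three indicators for the 11-row example could be built without enumerating patterns. It rests on assumptions that are not stated in the post: each indicator means "membership is possible given the observed values", an NA could be either 0 or 1, and a missing col1 therefore leaves Group3 (and possibly Groups 1 and 2) open. The names first, rest, known1, nmiss, may_be_1, may_be_0 and group1 to group3 are made up for the sketch.

```r
# Example data from the post (11 individuals, 4 columns)
vec <- c(1,1,1,1, 1,1,NA,NA, 1,1,0,0, 1,NA,1,1, 1,NA,0,0, 1,NA,1,0,
         0,0,0,1, 0,0,0,0, NA,NA,NA,NA, 1,NA,0,NA, 1,NA,1,NA)
a <- matrix(vec, ncol = 4, byrow = TRUE)

first <- a[, 1]                  # col1
rest  <- a[, -1, drop = FALSE]   # col2..col4

# Per-row counts over col2..col4, ignoring NAs where appropriate
known1 <- rowSums(rest == 1, na.rm = TRUE)  # 1s that are actually observed
nmiss  <- rowSums(is.na(rest))              # values that might be 1

# Assumption: a missing col1 could be either 0 or 1
may_be_1 <- is.na(first) | first == 1
may_be_0 <- is.na(first) | first == 0

# 1 = membership is possible given the observed data, 0 = ruled out
group1 <- as.integer(may_be_1 & (known1 + nmiss) >= 1)  # some other 1 is possible
group2 <- as.integer(may_be_1 & known1 == 0)            # no other 1 is observed
group3 <- as.integer(may_be_0)                          # col1 is 0 or unknown

cbind(a, group1, group2, group3)
```

Under these assumptions, individuals 5 and 10 are flagged in both Group1 and Group2, and individual 9 (all missing) is flagged in all three. If a missing col1 should be handled differently, only the may_be_1 and may_be_0 lines would need to change.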