[R] replace NA's with row means for specific columns

Zahra captiva24 at yahoo.com
Mon Nov 2 20:49:01 CET 2015


Hi there,

I am looking for some help replacing missing values in R with the row mean. This is survey data, and I am trying to impute missing values within each set of questions separately, using the mean of the scores for the other questions in that set.

I have a dataset that looks like this:

ID   A1  A2  A3   B1  B2  B3   C1  C2  C3  C4
b     4   5  NA    2  NA   4    5   1   3  NA
c     4   5   1   NA   3   4    5   1   3   2
d    NA   5   1    1  NA   4    5   1   3   2
e     4   5   4    5  NA   4    5   1   3   2


I want to replace any NA's in columns A1:A3 with the row mean of those columns only. So for ID=b, I want the NA in A3[ID=b] to become (4+5)/2 = 4.5, which is the average of the values in A1 and A2 for that row.
The same goes for columns B1:B3: I want the NA in B2[ID=b] to become the mean of B1 and B3 in row ID=b, which is (2+4)/2 = 3. And likewise for C1:C4, I want C4[ID=b] to become (5+1+3)/3 = 3, which is the mean of C1:C3.

Then I want to go to row ID=c and do the same thing and so on.
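
To make the goal concrete, here is a rough sketch of the kind of result I am after, not a vetted solution. It assumes the data sits in a data frame (called dat here) and that the question sets are exactly the three column groups shown above. The names dat, sets and impute_row_means are made up for illustration. Note that a row with every value missing in a set would get NaN, since there is nothing to average.

```r
# Rebuild the example data (values copied from the table above).
dat <- read.table(text = "
ID A1 A2 A3 B1 B2 B3 C1 C2 C3 C4
b   4  5 NA  2 NA  4  5  1  3 NA
c   4  5  1 NA  3  4  5  1  3  2
d  NA  5  1  1 NA  4  5  1  3  2
e   4  5  4  5 NA  4  5  1  3  2
", header = TRUE, stringsAsFactors = FALSE)

# Assumed grouping of columns into question sets (hypothetical names).
sets <- list(
  A = c("A1", "A2", "A3"),
  B = c("B1", "B2", "B3"),
  C = c("C1", "C2", "C3", "C4")
)

# Hypothetical helper: fill each NA in `cols` with the mean of the
# non-missing values in the same row, computed over `cols` only.
impute_row_means <- function(df, cols) {
  block <- as.matrix(df[cols])
  means <- rowMeans(block, na.rm = TRUE)      # one mean per row
  idx <- which(is.na(block), arr.ind = TRUE)  # (row, col) positions of the NAs
  block[idx] <- means[idx[, "row"]]           # each NA gets its own row's mean
  df[cols] <- as.data.frame(block)
  df
}

# Apply the helper to each question set in turn.
for (cols in sets) {
  dat <- impute_row_means(dat, cols)
}

print(dat)
```

For row ID=b this should give A3 = 4.5, B2 = 3 and C4 = 3, matching the hand calculations above.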

Can anybody help me do this? I have tried using rowMeans and subsetting but can't figure out the right code to do it. 

Thanks so much.
Zahra


