[R] "apply" question

Mon May 2 16:58:39 CEST 2005

----- Original Message ----- 
From: "Christoph Scherber" <Christoph.Scherber at uni-jena.de>
To: <r-help at stat.math.ethz.ch>
Sent: Monday, May 02, 2005 10:52 AM
Subject: [R] "apply" question

> Dear R users,
>
> I've got a simple question but somehow I can't find the solution:
>
> I have a data frame with columns 1-5 containing one set of integer values, 
> and columns 6-10 containing another set of integer values. Columns 6-10 
> contain NAs in some places.
>
> I now want to calculate
> (1) the number of values in each row of columns 6-10 that are NA
> (2) the sum of all values in columns 1-5 for which there are no missing 
> values in the corresponding cells of columns 6-10.
>
>
> Example: (let's call the data frame "data")
>
> Col1   Col2   Col3   Col4   Col5   Col6   Col7   Col8   Col9   Col10
> 1      2      5      2      3      NA     5      NA     1      4
> 3      1      4      5      2      6      NA     4      NA     1
>
> The result would then be (for the first row)
> (1) "There were 2 NAs in columns 6-10."
> (2) "The sum of columns 1-5 was 2+2+3=7" (because there were NAs in the 
> 1st and 3rd positions of columns 6-10)
>
> So far, I know how to calculate the rowSums for the data.frame, but I 
> don't know how to condition these on the values of columns 6-10
>
> rowSums(data[,1:5]) #that's straightforward
> apply(data[,6:10],1,function(x)sum(is.na(x))) #this also works fine
>
> But I don't know how to select just the desired values of columns 1-5 (as 
> described above)

tmp <- rowSums(data[apply(data[,6:10],1,function(x) sum(is.na(x)))==0,1:5])

Now, tmp contains the row sums of columns 1-5, but only for the rows that 
have no NAs at all in columns 6-10.
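
If the goal is instead the cell-wise condition from the question (sum only 
those values in columns 1-5 whose partner cell in columns 6-10 is not NA), 
one possible sketch is below. It rebuilds the two example rows as "data"; 
the names vals, miss, na_count and cond_sum are made up for illustration. 
It also assumes that columns 1-5 contain no NAs of their own, because 
na.rm = TRUE would silently drop those as well.

```r
# Example data from the question (two rows, ten columns).
data <- data.frame(
  Col1 = c(1, 3), Col2 = c(2, 1), Col3 = c(5, 4), Col4 = c(2, 5),
  Col5 = c(3, 2), Col6 = c(NA, 6), Col7 = c(5, NA), Col8 = c(NA, 4),
  Col9 = c(1, NA), Col10 = c(4, 1)
)

# Work on matrices so that a logical matrix can be used as an index.
vals <- as.matrix(data[, 1:5])           # values to be summed
miss <- is.na(as.matrix(data[, 6:10]))   # TRUE where the partner cell is NA

# (1) Number of NAs per row in columns 6-10.
na_count <- rowSums(miss)

# (2) Blank out every value whose partner cell is NA, then sum the rest.
vals[miss] <- NA
cond_sum <- rowSums(vals, na.rm = TRUE)

na_count
cond_sum
```

For the first example row this keeps 2, 2 and 3 (positions 2, 4 and 5), 
which matches the 2+2+3=7 given in the question.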

Sean