[R] conditional rowsums in sapply

Dimitris Rizopoulos d.rizopoulos at erasmusmc.nl
Mon May 16 20:27:06 CEST 2011


Assuming that the non-missing entries in a row never sum to zero within a
group of same-named columns, you can try something along the following lines:

dfrm <- data.frame(a = 1:4, a = 1:4, b = 1:4,
     b = 1:4, b = 1:4, check.names = FALSE)
dfrm[3, 1:3] <- NA
dfrm

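# unlist() appends the row number to each column name ("a1", "a2", ...),
# so entries of same-named columns in the same row share a name
vals <- unlist(dfrm)
res <- tapply(vals, names(vals), sum, na.rm = TRUE)
# a sum of 0 is taken to mean that every entry was NA
res[res == 0] <- NA
# one column per group; the name order assumes fewer than 10 rows
as.data.frame(matrix(res, ncol = 2))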
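
If genuine zero sums can occur, the res == 0 step above would turn them into
NA. A minimal sketch of a variant that tests for all-NA rows directly is given
below; it is not part of the original reply, and sum_groups is a hypothetical
helper name introduced only for illustration. It matches column names exactly
rather than with grep(), on the assumption that the duplicated names are
identical strings.

```r
# same example data as above
dfrm <- data.frame(a = 1:4, a = 1:4, b = 1:4,
     b = 1:4, b = 1:4, check.names = FALSE)
dfrm[3, 1:3] <- NA

# sum_groups is a hypothetical helper, not a base R function
sum_groups <- function(d) {
    sapply(unique(names(d)), function(nm) {
        # all columns carrying this name; drop = FALSE keeps a data frame
        block <- d[, names(d) == nm, drop = FALSE]
        # row sums that ignore missing values
        s <- rowSums(block, na.rm = TRUE)
        # rows where every entry of the group is missing become NA
        s[rowSums(!is.na(block)) == 0] <- NA
        s
    })
}

out <- sum_groups(dfrm)
out
```

For the example data this gives 2, 4, NA, 8 in column a and 3, 6, 6, 12 in
column b, and the result keeps the group names as column names.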


I hope it helps.

Best,
Dimitris


On 5/16/2011 4:25 PM, Assu wrote:
> Hi all
>
> I have a data frame with duplicate columns and I want to remove the duplicates
> by summing across the rows within each group of duplicates, but I have lots of NA's.
> Data:
> dfrm<- data.frame(a = 1:4, b= 1:4, cc= 1:4, dd=1:4, ee=1:4)
> names(dfrm)<- c("a", "a", "b", "b", "b")
> dfrm[3,1:3]<-NA
> dfrm
>      a  a  b  b  b
> 1   1  1  1  1  1
> 2   2  2  2  2  2
> 3  NA NA NA  3  3
> 4   4  4  4  4  4
> I did: sapply(unique(names(dfrm)),function(x){
> rowSums(dfrm[ ,grep(x, names(dfrm)),drop=FALSE])})
>   which works. However, I want rowSums conditional:
> 1) if there is at least one value non NA in a row of each group of
> duplicates, apply rowSums to get the value independently of the existence of
> other NA's in the group row.
> 2) if all values in a row of duplicates are NA, I get NA
> In my data dfrm I would get
>
>       a   b
> 1    2   3
> 2    4   6
> 3   NA  6
> 4    8  12
> Can't use na.rm=TRUE or FALSE.
> I tried: sapply(unique(names(dfrm)),function(x) ifelse(any(!is.na(dfrm[
> ,grep(x, names(dfrm))])), rowSums(dfrm[ ,grep(x,
> names(dfrm)),drop=FALSE],na.rm=TRUE),NA))
>
> and it doesn't work.
> Can someone please help me?
> Thanks in advance.

-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/


