[R] Summing certain values within columns that satisfy a certain condition

David L Carlson dcarlson at tamu.edu
Fri Feb 27 17:07:57 CET 2015


Here is another approach, using sweep() to compare each column against its maximum:

> maxv <- apply(df, 2, max) # Get the column maximums
> maxv0 <- ifelse(maxv == 0, -1, maxv) # Replace 0 maximums with -1
> Sum <-  rowSums(sweep(df, 2, maxv0, "=="))
> data.frame(df, Sum)
   A B C D Sum
1  0 1 0 7   1
2  0 2 0 7   1
3  0 3 0 7   1
4  0 4 0 7   1
5  0 1 0 0   0
6  0 0 0 0   0
7  0 0 0 0   0
8  0 0 0 0   0
9  0 0 1 5   0
10 0 5 1 5   0
11 0 4 1 5   0
12 0 8 4 7   3
13 0 0 3 0   0
14 0 0 3 4   0
15 0 0 3 4   0
16 0 0 0 5   0
17 0 2 0 6   0
18 0 0 4 0   1
19 0 0 4 0   1
20 0 0 4 0   1
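
The same idea can be wrapped in a small function that also skips NAs, which the session above does not handle. This is only a sketch: the function name count_col_max is made up for illustration, and treating an all-zero or all-NA column as having no maximum is an assumption about what "ignoring zeros and NAs" should mean. The data frame is re-entered from the table in the original question.

```r
# Hypothetical helper: for each row, count the cells that equal their
# column's maximum, ignoring zeros and NAs.
count_col_max <- function(df) {
  m <- as.matrix(df)
  m[!is.na(m) & m == 0] <- NA   # treat zeros like missing values
  # column maximums over what is left; NA when a column has nothing left
  maxv <- apply(m, 2, function(x) if (all(is.na(x))) NA else max(x, na.rm = TRUE))
  hits <- sweep(m, 2, maxv, "==")   # TRUE where a cell equals its column maximum
  rowSums(hits, na.rm = TRUE)       # comparisons involving NA count as no match
}

# Data from the original question
df <- data.frame(
  A = rep(0, 20),
  B = c(1, 2, 3, 4, 1, 0, 0, 0, 0, 5, 4, 8, 0, 0, 0, 0, 2, 0, 0, 0),
  C = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 4, 3, 3, 3, 0, 0, 4, 4, 4),
  D = c(7, 7, 7, 7, 0, 0, 0, 0, 5, 5, 5, 7, 0, 4, 4, 5, 6, 0, 0, 0)
)
Sum <- count_col_max(df)
data.frame(df, Sum)
```

Converting zeros to NA first means a single na.rm = TRUE covers both cases, so the -1 replacement step is no longer needed.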


-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352



-----Original Message-----
From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of Don McKenzie
Sent: Thursday, February 26, 2015 3:12 PM
To: Kate Ignatius
Cc: r-help
Subject: Re: [R] Summing certain values within columns that satisfy a certain condition

Kate, here is a transparent solution (tested, but without NA treatment). Doubtless there are cleverer, faster ones that later posters will present.

HTH

# example with four columns and 20 rows
nrows <- 20

A <- sample(1:100, nrows, replace = TRUE)
B <- sample(1:100, nrows, replace = TRUE)
C <- sample(1:100, nrows, replace = TRUE)
D <- sample(1:100, nrows, replace = TRUE)

# row indices of each column's maximum, kept per column in a list
# so that ties for the maximum are handled correctly
locs <- list(which(A == max(A)), which(B == max(B)), which(C == max(C)), which(D == max(D)))

# indicator matrix: 1 where a cell holds its column's maximum
mat1 <- matrix(0, nrows, 4)
for (i in 1:4)
	mat1[locs[[i]], i] <- 1
SUM <- rowSums(mat1)
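
The loop above can also be written without hard-coding four columns. A minimal sketch, assuming the columns are collected in a data frame (called dat here, a name introduced only for this example) and, as above, without NA or zero treatment:

```r
set.seed(1)  # arbitrary seed, only so the example is repeatable
nrows <- 20

# four columns of random integers, as in the example above
dat <- as.data.frame(replicate(4, sample(1:100, nrows, replace = TRUE)))
names(dat) <- c("A", "B", "C", "D")

# one logical column per variable: TRUE where the value is the column maximum
is_max <- sapply(dat, function(x) x == max(x))

# number of column maximums found in each row
SUM <- rowSums(is_max)
```

Because the comparison is done column by column, ties for a column's maximum are all marked, and the same code works for any number of columns.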


> On Feb 26, 2015, at 12:23 PM, Kate Ignatius <kate.ignatius at gmail.com> wrote:
> 
> Hi,
> 
> Suppose I had a data frame like so:
> 
> A B C D
> 0 1 0 7
> 0 2 0 7
> 0 3 0 7
> 0 4 0 7
> 0 1 0 0
> 0 0 0 0
> 0 0 0 0
> 0 0 0 0
> 0 0 1 5
> 0 5 1 5
> 0 4 1 5
> 0 8 4 7
> 0 0 3 0
> 0 0 3 4
> 0 0 3 4
> 0 0 0 5
> 0 2 0 6
> 0 0 4 0
> 0 0 4 0
> 0 0 4 0
> 
> For each row, I want to count how many column maximums appear, to
> eventually get the following outcome, while ignoring zeros and NAs:
> 
> A B C D Sum
> 0 1 0 7 1
> 0 2 0 7 1
> 0 3 0 7 1
> 0 4 0 7 1
> 0 1 0 0 0
> 0 0 0 0 0
> 0 0 0 0 0
> 0 0 0 0 0
> 0 0 1 5 0
> 0 5 1 5 0
> 0 4 1 5 0
> 0 8 4 7 3
> 0 0 3 0 0
> 0 0 3 4 0
> 0 0 3 4 0
> 0 0 0 5 0
> 0 2 0 6 0
> 0 0 4 0 1
> 0 0 4 0 1
> 0 0 4 0 1
> 
> I've used the following code, but it doesn't seem to work (my Sum
> column is all 1s):
> 
> (apply(df,1, function(x)  (sum(x %in% c(pmax(x))))))
> 
> Is this code too simple?
> 
> Thanks!
> 
> K.
> 
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


