[R] Computing means of multiple variables based on a condition

KMNanus kmnanus at gmail.com
Thu May 26 00:37:54 CEST 2016


I have a large dataset, a sample of which is:

a <- c("A", "B", "A", "B", "A", "B", "A", "B", "A", "B")
b <- c(15, 35, 20, 99, 75, 64, 33, 78, 45, 20)
c <- c(111, 234, 456, 876, 246, 662, 345, 480, 512, 179)
d <- c(1.1, 3.2, 14.2, 8.7, 12.5, 5.9, 8.3, 6.0, 2.9, 9.3)

df <- data.frame(a,b,c,d)

I’m trying to construct a data frame that shows the means of b and c, grouped by a, for the rows that satisfy a condition on d (for example, d>=2).

I want to create the data frame below, then use ggplot2 to create a line plot of b at various conditions of d.

I can compute the grouped means for one condition at a time (d>=2, d>=4, etc.) using dplyr, but haven’t figured out how to combine them all into one data frame.
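
For reference, a single condition computed on its own might look like the sketch below. It uses base R's aggregate() rather than dplyr (the dplyr code is not shown here), and the name means_d2 is only illustrative.

```r
# Sample data from above, built directly as a data frame
df <- data.frame(
  a = rep(c("A", "B"), times = 5),
  b = c(15, 35, 20, 99, 75, 64, 33, 78, 45, 20),
  c = c(111, 234, 456, 876, 246, 662, 345, 480, 512, 179),
  d = c(1.1, 3.2, 14.2, 8.7, 12.5, 5.9, 8.3, 6.0, 2.9, 9.3)
)

# Keep only rows with d >= 2, then average b and c within each level of a
# (means_d2 is a hypothetical name chosen for this example)
means_d2 <- aggregate(cbind(b, c) ~ a, data = df[df$d >= 2, ], FUN = mean)
means_d2
```

This gives one row per group for a single cutoff; the open question is how to repeat it for several cutoffs and stack the results.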

I’d rather not use a loop and am relatively new to R.  Is there a way I can use tapply and set it to the conditions above so that I can create the df below?


         condition    mean(b)    mean(c)
A        d>=2         ____       _____
B        d>=2         ____       _____
A        d>=4         ____       _____
B        d>=4         ____       _____
A        d>=6         ____       _____
B        d>=6         ____       _____
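
One possible way to build a table of this shape without an explicit for loop is sketched below. It is only an illustration under some assumptions: the cutoffs are taken to be 2, 4 and 6, it uses aggregate() with lapply() rather than tapply, and the names thresholds, means_at and result are hypothetical.

```r
# Sample data from above, built directly as a data frame
df <- data.frame(
  a = rep(c("A", "B"), times = 5),
  b = c(15, 35, 20, 99, 75, 64, 33, 78, 45, 20),
  c = c(111, 234, 456, 876, 246, 662, 345, 480, 512, 179),
  d = c(1.1, 3.2, 14.2, 8.7, 12.5, 5.9, 8.3, 6.0, 2.9, 9.3)
)

# Assumed cutoffs for the conditions d >= 2, d >= 4, d >= 6
thresholds <- c(2, 4, 6)

# Grouped means of b and c for the rows with d >= cutoff,
# labelled with the condition they belong to
means_at <- function(cutoff) {
  out <- aggregate(cbind(b, c) ~ a, data = df[df$d >= cutoff, ], FUN = mean)
  names(out) <- c("a", "mean_b", "mean_c")
  cbind(out["a"],
        condition = paste0("d>=", cutoff),
        out[c("mean_b", "mean_c")])
}

# Apply to every cutoff and stack the pieces into one data frame
result <- do.call(rbind, lapply(thresholds, means_at))
result
```

Because result is in long format (one row per group and condition), it should be usable for the line plot directly, for example by mapping condition to x, mean_b to y, and a to both group and colour in ggplot2. Note that a group with no rows at a given cutoff would simply be missing from that part of the output.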



Ken
kmnanus at gmail.com
914-450-0816 (tel)
347-730-4813 (fax)




