[R] ddply
arun
smartpink111 at yahoo.com
Wed Feb 26 02:03:48 CET 2014
Hi Felipe,
Pasting the code from your second email, which uses ?which.max:
# changed 'test' to 'hw' to match the data in your example
hw2 <- ddply(hw,"id",summarise, subSiteName=unique(subSiteName),nReleased=unique(nReleased),
Recaps=sum(Recaps),MeanFL=mean(MeanFL),TrapTurbidity=mean(TrapTurbidity),
WaterTemp=mean(WaterTemp), TrapWeather=TrapWeather[which.max(Recaps)])
Now change which.max() to max() to see what Recaps actually holds at that point:
ddply(hw,"id",summarise, subSiteName=unique(subSiteName),nReleased=unique(nReleased),
Recaps=sum(Recaps),MeanFL=mean(MeanFL),TrapTurbidity=mean(TrapTurbidity),
WaterTemp=mean(WaterTemp), TrapWeather=max(Recaps))
#  id subSiteName nReleased Recaps   MeanFL TrapTurbidity WaterTemp TrapWeather
#1  1       north       686      5 36.05128         2.450  8.395417           5
#2  2       north       540     11 35.47000         2.770  8.824167          11
#3  3       north      1995     51 38.32692         1.700  9.220000          51
#4  4       north      1309     35 37.17000         1.615  9.277917          35
#5  5       north       995     47 38.84152         1.815  8.660625          47
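TrapWeather is identical to the summed Recaps: summarise evaluates its arguments in order, so once Recaps=sum(Recaps) has run, Recaps refers to that single summed value and no longer to the original rows.
So it is better to give the summed column a different name: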
ddply(hw,"id",summarise, subSiteName=unique(subSiteName),nReleased=unique(nReleased),
Recaps1=sum(Recaps),MeanFL=mean(MeanFL),TrapTurbidity=mean(TrapTurbidity),
WaterTemp=mean(WaterTemp), TrapWeather=max(Recaps))
## Check the difference: TrapWeather is now the per-id maximum of the original Recaps
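Putting the two pieces together, something like the following should give the TW column you asked for. This is only a sketch: it uses three of the columns from your data, `res0` is just a name I picked for illustration, and as.character() is there so TW comes back as plain text rather than a factor.

```r
library(plyr)

# Three of the columns from the example data in this thread
hw <- data.frame(
  id          = c(1L, 2L, 2L, 3L, 4L, 4L, 5L, 5L),
  Recaps      = c(5L, 8L, 3L, 51L, 34L, 1L, 46L, 1L),
  TrapWeather = c("Cloudy", "Clear", "Clear", "Foggy",
                  "Foggy", "Clear", "Cloudy", "Rainy day")
)

# The sum goes into Recaps1, so Recaps still refers to the original
# rows of each id when which.max() is evaluated
res0 <- ddply(hw, "id", summarise,
              Recaps1 = sum(Recaps),
              TW      = as.character(TrapWeather[which.max(Recaps)]))
res0
```

For id=4 this picks the row with Recaps 34, so TW is 'Foggy', as in your example.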
If there are multiple rows with max values, then:
hw1 <- data.frame(id=5, subSiteName="north", nReleased= 995, Recaps=46, MeanFL=38.42, TrapTurbidity=2.23, WaterTemp=8.6234, TrapWeather= "Clear")
hw2 <- rbind(hw,hw1)
#either create a list column
res <- ddply(hw2,.(id),summarise, subSiteName=unique(subSiteName),nReleased=unique(nReleased),
Recaps1=sum(Recaps),MeanFL=mean(MeanFL),TrapTurbidity=mean(TrapTurbidity),
WaterTemp=mean(WaterTemp),TW=list(as.character(TrapWeather[Recaps %in% max(Recaps)])))
#or use paste()
res1 <- ddply(hw2,.(id),summarise, subSiteName=unique(subSiteName),nReleased=unique(nReleased),
Recaps1=sum(Recaps),MeanFL=mean(MeanFL),TrapTurbidity=mean(TrapTurbidity),
WaterTemp=mean(WaterTemp),TW=paste(TrapWeather[Recaps %in% max(Recaps)],collapse=","))
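To see why which.max() is not enough when there are ties, here is a small base R sketch using only the id=5 rows after the extra row is added. The vector names (`recaps`, `weather`, and so on) are made up for illustration.

```r
# Recaps and TrapWeather for id 5 once the extra row has been added
recaps  <- c(46L, 1L, 46L)
weather <- c("Cloudy", "Rainy day", "Clear")

# which.max() returns only the position of the first maximum
first_only <- weather[which.max(recaps)]

# %in% max() keeps every row that reaches the maximum
all_ties <- weather[recaps %in% max(recaps)]

# collapse the ties into a single string, as in the paste() version
pasted <- paste(all_ties, collapse = ",")

first_only  # "Cloudy"
all_ties    # "Cloudy" "Clear"
pasted      # "Cloudy,Clear"
```

The list column keeps the tied values separate, while paste() gives one string per id, which is easier to print or export.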
A.K.
On Tuesday, February 25, 2014 5:31 PM, Felipe Carrillo <mazatlanmexico at yahoo.com> wrote:
Hi Arun,
Could you help me with what appears to be a simple question?
I want to create a column called TW whose value is taken from TrapWeather,
selected based on the max value of Recaps by id.
For example, for id=4 the max Recaps value is 34, so I want TrapWeather to be 'Foggy',
and so on. Thanks Arun
library(plyr)
hw <- structure(list(id = c(1L, 2L, 2L, 3L, 4L, 4L, 5L, 5L), subSiteName = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "north", class = "factor"),
nReleased = c(686L, 540L, 540L, 1995L, 1309L, 1309L, 995L,
995L), Recaps = c(5L, 8L, 3L, 51L, 34L, 1L, 46L, 1L), MeanFL = c(36.05128205,
35.38, 35.56, 38.32692308, 36.48, 37.86, 38.44230769,
39.24074074
), TrapTurbidity = c(2.450000048, 2.710000038, 2.829999924,
1.700000048, 2.130000114, 1.100000024, 2, 1.629999995), WaterTemp = c(8.395416667,
8.55625, 9.092083333, 9.22, 9.180833333, 9.375, 8.63875,
8.6825), TrapWeather = structure(c(2L, 1L, 1L, 3L, 3L, 1L,
2L, 4L), .Label = c("Clear", "Cloudy", "Foggy", "Rainy day"
), class = "factor")), .Names = c("id", "subSiteName", "nReleased",
"Recaps", "MeanFL", "TrapTurbidity", "WaterTemp", "TrapWeather"
), class = "data.frame", row.names = c(NA, -8L))
hw2 <- ddply(test,"id",summarise, subSiteName=unique(subSiteName),nReleased=unique(nReleased),
Recaps=sum(Recaps),MeanFL=mean(MeanFL),TrapTurbidity=mean(TrapTurbidity),
WaterTemp=mean(WaterTemp), TW=TrapWeather Where Recaps==max(Recaps))
hw2