[R] weighted average grouped by variables

Thu Nov 9 14:30:31 CET 2017

Sorry, I messed up. I only checked the final result after sending my
previous mail: the ave() solution quoted below is wrong.
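
A sketch of what seems to go wrong, assuming the data from the original post (rebuilt here with data.frame(); tz = "UTC" is my own assumption in place of the empty tzone). ave() is defined as ave(x, ..., FUN = mean), so every argument passed through `...` is treated as a grouping variable. The argument w = n_vehicles therefore becomes a third grouping factor and weighted.mean() is called without weights. One possible base R alternative is to split the data frame by group and combine the per-group results; the names `wrong`, `groups` and `res` are for illustration only.

```r
# Data from the original post, rebuilt with data.frame() for readability.
# Assumption: tz = "UTC" replaces the empty tzone of the original.
mydf <- data.frame(
  date_time  = as.POSIXct(rep(1508238000, 7), origin = "1970-01-01", tz = "UTC"),
  direction  = factor(c("A", "A", "A", "A", "B", "B", "B")),
  type       = factor(c("car", "light_duty", "heavy_duty", "motorcycle",
                        "car", "light_duty", "heavy_duty"),
                      levels = c("car", "light_duty", "heavy_duty", "motorcycle")),
  speed      = c(41.1029082774049, 40.3333333333333, 40.3157894736842,
                 36.0869565217391, 33.4065155807365, 37.6222222222222, 35.5),
  n_vehicles = c(447L, 24L, 19L, 23L, 706L, 45L, 26L)
)

# The call from the previous mail: `w = n_vehicles` ends up in `...` of ave(),
# so it is used as a grouping variable, not as the weights.
wrong <- with(mydf, ave(speed, date_time, type, FUN = weighted.mean, w = n_vehicles))

# Split the rows by date_time and type, dropping empty combinations.
groups <- split(mydf, list(mydf$date_time, mydf$type), drop = TRUE)

# One output row per group, with the weights passed explicitly.
res <- do.call(rbind, lapply(groups, function(d) {
  data.frame(
    date_time          = d$date_time[1],
    type               = d$type[1],
    weighted_avg_speed = weighted.mean(d$speed, d$n_vehicles),
    n_vehicles         = sum(d$n_vehicles)
  )
}))
rownames(res) <- NULL
res
```

Comparing `wrong` with mydf$speed, and `res` with the mydf_final given in the original post, is a quick way to check both points.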

Rui Barradas

On 09-11-2017 13:27, Rui Barradas wrote:
> Hello,
>
> Using base R only, the following seems to do what you want.
>
> with(mydf, ave(speed, date_time, type, FUN = weighted.mean, w =
> n_vehicles))
>
>
> Hope this helps,
>
> Rui Barradas
>
> On 09-11-2017 13:16, Massimo Bressan wrote:
>> Hello
>>
>> an update about my question: I worked out the following solution (with
>> the package "dplyr")
>>
>> library(dplyr)
>>
>> mydf%>%
>> mutate(speed_vehicles=n_vehicles*speed) %>%
>> group_by(date_time,type) %>%
>> summarise(
>> sum_n_times_speed=sum(speed_vehicles),
>> n_vehicles=sum(n_vehicles),
>> vel=sum(speed_vehicles)/sum(n_vehicles)
>> )
>>
>>
>> In fact I was hoping to do everything in one go, i.e. without
>> creating the intermediate variable "speed_vehicles" and by
>> using the function weighted.mean() directly
>>
>> any hints for a different approach much appreciated
>>
>> thanks
>>
>>
>>
>> From: "Massimo Bressan" <massimo.bressan at arpa.veneto.it>
>> To: "r-help" <r-help at r-project.org>
>> Sent: Thursday, 9 November 2017 12:20:52
>> Subject: weighted average grouped by variables
>>
>> hi all
>>
>> I have this dataframe (created as a reproducible example)
>>
>> mydf<-structure(list(date_time = structure(c(1508238000, 1508238000,
>> 1508238000, 1508238000, 1508238000, 1508238000, 1508238000), class =
>> c("POSIXct", "POSIXt"), tzone = ""),
>> direction = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("A",
>> "B"), class = "factor"),
>> type = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L), .Label = c("car",
>> "light_duty", "heavy_duty", "motorcycle"), class = "factor"),
>> speed = c(41.1029082774049, 40.3333333333333, 40.3157894736842,
>> 36.0869565217391, 33.4065155807365, 37.6222222222222, 35.5),
>> n_vehicles = c(447L, 24L, 19L, 23L, 706L, 45L, 26L)),
>> .Names = c("date_time", "direction", "type", "speed", "n_vehicles"),
>> row.names = c(NA, -7L),
>> class = "data.frame")
>>
>> mydf
>>
>> and I need to get to this final result
>>
>> mydf_final<-structure(list(date_time = structure(c(1508238000,
>> 1508238000, 1508238000, 1508238000), class = c("POSIXct", "POSIXt"),
>> tzone = ""),
>> type = structure(c(1L, 2L, 3L, 4L), .Label = c("car", "light_duty",
>> "heavy_duty", "motorcycle"), class = "factor"),
>> weighted_avg_speed = c(36.39029, 38.56521, 37.53333, 36.08696),
>> n_vehicles = c(1153L,69L,45L,23L)),
>> .Names = c("date_time", "type", "weighted_avg_speed", "n_vehicles"),
>> row.names = c(NA, -4L),
>> class = "data.frame")
>>
>> mydf_final
>>
>>
>> my question:
>> how to compute a weighted mean i.e. "weighted_avg_speed"
>> from "speed" (the values whose weighted mean is to be computed) and
>> "n_vehicles" (the weights)
>> grouped by "date_time" and "type"?
>>
>> note the complication of the "motorcycle" case, which is present in
>> only one of the two directions
>>
>> any help for that?
>>
>> thank you
>>
>> max