Dear Massimo,

It seems straightforward to use weighted.mean() in a dplyr context:

library(dplyr)
mydf %>%
  group_by(date_time, type) %>%
  summarise(vel = weighted.mean(speed, n_vehicles))

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician
Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkel...@inbo.be
Kliniekstraat 25, B-1070 Brussel
www.inbo.be
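
For anyone following the thread, here is a minimal sketch of how the same weighted.mean() call could also return the group totals, so that the output has the shape of the requested mydf_final. It is not a definitive implementation. Assumptions on my side: the example data are rebuilt with data.frame() instead of structure(), the time zone is set to "UTC" (the original leaves tzone empty), and the object name `result` is only illustrative.

```r
library(dplyr)

# Example data from the thread, rebuilt with data.frame() (assumption: tz = "UTC")
mydf <- data.frame(
  date_time  = as.POSIXct(1508238000, origin = "1970-01-01", tz = "UTC"),
  direction  = factor(c("A", "A", "A", "A", "B", "B", "B")),
  type       = factor(
    c("car", "light_duty", "heavy_duty", "motorcycle",
      "car", "light_duty", "heavy_duty"),
    levels = c("car", "light_duty", "heavy_duty", "motorcycle")
  ),
  speed      = c(41.1029082774049, 40.3333333333333, 40.3157894736842,
                 36.0869565217391, 33.4065155807365, 37.6222222222222, 35.5),
  n_vehicles = c(447L, 24L, 19L, 23L, 706L, 45L, 26L)
)

result <- mydf %>%
  group_by(date_time, type) %>%
  summarise(
    # Compute the weighted mean first: it needs the per-row weights
    weighted_avg_speed = weighted.mean(speed, n_vehicles),
    # Only then overwrite n_vehicles with the group total
    n_vehicles = sum(n_vehicles)
  ) %>%
  ungroup()

result
```

The order inside summarise() matters here: once n_vehicles has been replaced by its group sum, later expressions see that single value rather than the per-row weights. A type that occurs in only one direction, such as "motorcycle", simply forms a group with one row, so its weighted mean is its own speed. With the data above, the rounded speeds and the totals agree with the values listed in mydf_final.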

2017-11-09 14:16 GMT+01:00 Massimo Bressan <massimo.bres...@arpa.veneto.it>:

> Hello
>
> an update about my question: I worked out the following solution (with the
> package "dplyr")
>
> library(dplyr)
>
> mydf%>%
>   mutate(speed_vehicles=n_vehicles*mydf$speed) %>%
>   group_by(date_time,type) %>%
>   summarise(
>     sum_n_times_speed=sum(speed_vehicles),
>     n_vehicles=sum(n_vehicles),
>     vel=sum(speed_vehicles)/sum(n_vehicles)
>   )
>
> In fact I was hoping to manage everything in "one go", i.e. without the
> need to create the "intermediate" variable called "speed_vehicles" and with
> the use of the function weighted.mean()
>
> any hints for a different approach much appreciated
>
> thanks
>
> From: "Massimo Bressan" <massimo.bres...@arpa.veneto.it>
> To: "r-help" <r-help@r-project.org>
> Sent: Thursday, 9 November 2017 12:20:52
> Subject: weighted average grouped by variables
>
> hi all
>
> I have this dataframe (created as a reproducible example)
>
> mydf<-structure(list(date_time = structure(c(1508238000, 1508238000,
>   1508238000, 1508238000, 1508238000, 1508238000, 1508238000),
>   class = c("POSIXct", "POSIXt"), tzone = ""),
>   direction = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L), .Label = c("A", "B"),
>   class = "factor"),
>   type = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L), .Label = c("car",
>   "light_duty", "heavy_duty", "motorcycle"), class = "factor"),
>   avg_speed = c(41.1029082774049, 40.3333333333333, 40.3157894736842,
>   36.0869565217391, 33.4065155807365, 37.6222222222222, 35.5),
>   n_vehicles = c(447L, 24L, 19L, 23L, 706L, 45L, 26L)),
>   .Names = c("date_time", "direction", "type", "speed", "n_vehicles"),
>   row.names = c(NA, -7L),
>   class = "data.frame")
>
> mydf
>
> and I need to get to this final result
>
> mydf_final<-structure(list(date_time = structure(c(1508238000,
>   1508238000, 1508238000, 1508238000), class = c("POSIXct", "POSIXt"),
>   tzone = ""),
>   type = structure(c(1L, 2L, 3L, 4L), .Label = c("car", "light_duty",
>   "heavy_duty", "motorcycle"), class = "factor"),
>   weighted_avg_speed = c(36.39029, 38.56521, 37.53333, 36.08696),
>   n_vehicles = c(1153L,69L,45L,23L)),
>   .Names = c("date_time", "type", "weighted_avg_speed", "n_vehicles"),
>   row.names = c(NA, -4L),
>   class = "data.frame")
>
> mydf_final
>
> my question:
> how to compute a weighted mean, i.e. "weighted_avg_speed",
> from "speed" (the values whose weighted mean is to be computed) and
> "n_vehicles" (the weights),
> grouped by "date_time" and "type"?
>
> note the complication of the case "motorcycle" (not present in both
> directions)
>
> any help for that?
>
> thank you
>
> max
>
> --
> ------------------------------------------------------------
> Massimo Bressan
>
> ARPAV
> Agenzia Regionale per la Prevenzione e
> Protezione Ambientale del Veneto
>
> Dipartimento Provinciale di Treviso
> Via Santa Barbara, 5/a
> 31100 Treviso, Italy
>
> tel: +39 0422 558545
> fax: +39 0422 558516
> e-mail: massimo.bres...@arpa.veneto.it
> ------------------------------------------------------------

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.