On Tue, 14 Sep 2021, Eric Berger wrote:
> This code is not correct:
>
>   disc_by_month %>%
>     group_by(year, month) %>%
>     summarize(disc_by_month, vol = mean(cfs, na.rm = TRUE))
>
> It should be:
>
>   disc %>%
>     group_by(year, month) %>%
>     summarize(vol = mean(cfs, na.rm = TRUE))

Eric/Avi: That makes no difference:
disc_by_month
# A tibble: 590,940 × 6
# Groups:   year, month [66]
    year month   day  hour   min    cfs
   <int> <int> <int> <int> <int>  <dbl>
 1  2016     3     3    12     0 149000
 2  2016     3     3    12    10 150000
 3  2016     3     3    12    20 151000
 4  2016     3     3    12    30 156000
 5  2016     3     3    12    40 154000
 6  2016     3     3    12    50 150000
 7  2016     3     3    13     0 153000
 8  2016     3     3    13    10 156000
 9  2016     3     3    13    20 154000
10  2016     3     3    13    30 155000
# … with 590,930 more rows

I wondered if I need to group first by hour, then day, then year-month.
This, too, produces the same output:

disc %>%
  group_by(hour) %>%
  group_by(day) %>%
  group_by(year, month) %>%
  summarize(disc_by_month, vol = mean(cfs, na.rm = TRUE))

And disc shows the read dataframe.

I don't understand why the columns are not grouping.

Thanks,

Rich
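
For reference, here is a minimal sketch of the grouping behaviour discussed above. It is not a confirmed diagnosis of the problem in this thread. It assumes dplyr 1.0.0 or later (for the `.groups` argument) and uses a made-up four-row data frame, `toy`, in place of the real discharge data. The names `toy`, `monthly`, and `regrouped` are hypothetical and appear only here. The sketch illustrates two points: summarize() returns a new tibble with one row per group when it is given only named summary expressions, and it leaves its input untouched, so the result has to be assigned or printed directly; and a later group_by() call replaces any earlier grouping unless `.add = TRUE` is given.

```r
library(dplyr)

# Hypothetical toy data standing in for the discharge readings;
# the column names mirror the thread, the values are invented.
toy <- tibble(
  year  = c(2016L, 2016L, 2016L, 2016L),
  month = c(3L, 3L, 4L, 4L),
  cfs   = c(100, 200, 300, NA)
)

# Only named summary expressions inside summarize(): one row per
# (year, month) group. The input `toy` is not modified, so the
# result is assigned to a new object.
monthly <- toy %>%
  group_by(year, month) %>%
  summarize(vol = mean(cfs, na.rm = TRUE), .groups = "drop")

print(monthly)

# Successive group_by() calls replace the earlier grouping by default,
# so only the variables in the last call define the groups.
regrouped <- toy %>%
  group_by(month) %>%
  group_by(year)

print(group_vars(regrouped))
```

If nested grouping is really wanted, naming all the variables in a single group_by() call is the usual approach; for a monthly mean, grouping by year and month alone should be enough.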