This problem nearly always boils down to using meta knowledge about the file.
Having informal TZ info in the file is very helpful, but PST is not necessarily
a uniquely-defined time zone specification, so you have to draw on information
outside of the file to know that these codes correspond to -
What is the best way to read (from a text file) timestamps from the fall
time change, where there are two 1:15am's? E.g., here is an extract from a
US Geological Survey web site giving data on the river through our county
on 2020-11-01, when we changed from PDT to PST,
https://nwis.waterdata.usgs.
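One way to make those readings unambiguous, sketched here with invented column names (tz_cd is the code column USGS RDB files usually carry, but nothing below is taken from the actual file): treat PDT and PST as fixed offsets via the Etc/GMT zones, so each row maps to a unique instant.

# Sketch (not from the thread): resolve the two 01:15 readings on 2020-11-01
# by honouring the per-row PDT/PST code with fixed-offset zones.
# Column names (datetime, tz_cd, cfs) are assumptions, not the file's own.
x <- data.frame(
  datetime = c("2020-11-01 01:15", "2020-11-01 01:15"),
  tz_cd    = c("PDT", "PST"),
  cfs      = c(120, 118)
)
tz_of <- c(PDT = "Etc/GMT+7", PST = "Etc/GMT+8")  # note: signs are reversed in Etc/ names
secs <- mapply(function(dt, tz) as.numeric(as.POSIXct(dt, tz = tz)),
               x$datetime, tz_of[x$tz_cd])
x$when <- as.POSIXct(secs, origin = "1970-01-01", tz = "Etc/GMT+8")
x$when   # two distinct instants, one hour apart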
On Fri, 3 Sep 2021, Jeff Newmiller wrote:
The fact that your projects are in a single time zone is irrelevant. I am
not sure how you can be so confident in saying it does not matter whether
the data were recorded in PDT or PST, since if it were recorded in PDT
then there would be a day in March
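To see that point in code (my example, not from the thread): lay a regular 5-minute grid over 2020 and count samples per civil day in America/Los_Angeles; the March change-over day comes up short and the November one runs long.

# Regular 5-minute grid for 2020, labelled by local (DST-observing) calendar day
grid <- seq(as.POSIXct("2020-01-01 00:00", tz = "America/Los_Angeles"),
            as.POSIXct("2020-12-31 23:55", tz = "America/Los_Angeles"),
            by = "5 min")
per_day <- table(format(grid, "%Y-%m-%d"))
per_day[per_day != 288]
# 2020-03-08 has 276 samples (a 23-hour day); 2020-11-01 has 300 (a 25-hour day)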
On Fri, 3 Sep 2021, Rich Shepard wrote:
On Thu, 2 Sep 2021, Jeff Newmiller wrote:
Regardless of whether you use the lower-level split function, or the
higher-level aggregate function, or the tidyverse group_by function, the key is
learning how to create the column that is the same for all records
corresponding to the time interval of interest.
If you convert the sampdate to
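To make the grouping-column idea concrete, here is a minimal sketch (my code, not Jeff's), using the sampdate and cfs column names from the sample data later in the thread:

discharge <- data.frame(
  sampdate = c("2020-08-26", "2020-08-26", "2020-08-27"),
  samptime = c("09:30", "09:35", "09:30"),
  cfs      = c(136000, 126000, 131000)
)
discharge$day <- as.Date(discharge$sampdate)   # same value for every record in a day
aggregate(cfs ~ day, data = discharge,
          FUN = function(x) c(mean = mean(x, na.rm = TRUE),
                              sd   = sd(x, na.rm = TRUE)))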
On Thu, 2 Sep 2021, Andrew Simmons wrote:
You could use 'split' to create a list of data frames, and then apply a
function to each to get the means and sds.
cols <- "cfs" # add more as necessary
S <- split(discharge[cols], format(discharge$sampdate, format = "%Y-%m"))
means <- do.call("rbind", lapply(S, colMeans, na.rm = TRUE))
sds <-
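The archived message breaks off here; a plausible continuation for the standard deviations, parallel to the means line above (a guess, not Andrew's actual code):

# sapply() over the columns because base R has colMeans() but no colSds()
sds <- do.call("rbind",
               lapply(S, function(d) sapply(d, sd, na.rm = TRUE)))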
On Thu, 2 Sep 2021, Rich Shepard wrote:
If I correctly understand the output of as.POSIXlt, each date and time
element is separate, so input such as 2016-03-03 12:00 would now be 2016 03
03 12 00 (I've not read how the elements are separated). (The TZ is not
important because all data are either
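A quick illustration (not from the thread) of what as.POSIXlt actually returns: a list of named components rather than a flattened string, so the elements are picked out by name, not "separated":

lt <- as.POSIXlt("2016-03-03 12:00", tz = "UTC")
unlist(unclass(lt)[c("year", "mon", "mday", "hour", "min")])
# year  mon mday hour  min
#  116    2    3   12    0    (year is years since 1900, mon is zero-based)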
On Mon, 30 Aug 2021, Richard O'Keefe wrote:
x <- rnorm(samples.per.day * 365)
length(x)
[1] 105120
Reshape the fake data into a matrix where each row represents one
24-hour period.
m <- matrix(x, ncol=samples.per.day, byrow=TRUE)
Richard,
Now I understand the need to keep the date and time
On Tue, 31 Aug 2021, Jeff Newmiller wrote:
Never use stringsAsFactors on uncleaned data. For one thing, if you give a
factor to as.Date, it tries to make sense of the integer representation,
not the character representation.
Jeff,
Oops! I had changed it in a previous version of the script and
On Wed, 1 Sep 2021, Richard O'Keefe wrote:
You have missed the point. The issue is not the temporal distance, but the
fact that the data you have are NOT the raw instrumental data and are NOT
subject to the limitations of the recording instruments. The data you get
from the USGS is not the raw instrumental data
I wrote:
> > By the time you get the data from the USGS, you are already far past the
> > point
> > where what the instruments can write is important.
Rich Shepard replied:
> The data are important because they show what's happened in that period of
> record. Don't physicians take a medical histor
Never use stringsAsFactors on uncleaned data. For one thing, if you give a
factor to as.Date, it tries to make sense of the integer representation,
not the character representation.
library(dplyr)
dta <- read.csv( text =
"sampdate,samptime,cfs
2020-08-26,09:30,136000
2020-08-26,09:35,126000
2020-
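The sample data is cut off above; a hedged sketch of how the pipeline might continue (the third data row is invented), converting the character sampdate with as.Date and building a full timestamp:

library(dplyr)

dta <- read.csv(text =
"sampdate,samptime,cfs
2020-08-26,09:30,136000
2020-08-26,09:35,126000
2020-08-27,09:30,131000")   # last row invented for illustration

dta <- dta %>%
  mutate(sampdate = as.Date(sampdate),   # character, not factor, so this is safe
         when = as.POSIXct(paste(sampdate, samptime),
                           format = "%Y-%m-%d %H:%M",
                           tz = "Etc/GMT+8"))  # fixed-offset PST; an assumption
str(dta)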
On Sun, 29 Aug 2021, Jeff Newmiller wrote:
The general idea is to create a "grouping" column with repeated values for
each day, and then to use aggregate to compute your combined results. The
dplyr package's group_by/summarise functions can also do this, and there
are also proponents of the data.table package
On Tue, 31 Aug 2021, Richard O'Keefe wrote:
By the time you get the data from the USGS, you are already far past the point
where what the instruments can write is important.
Richard,
The data are important because they show what's happened in that period of
record. Don't physicians take a med
By the time you get the data from the USGS, you are already far past the point
where what the instruments can write is important.
(Obviously an instrument can be sufficiently broken that it cannot
write anything.)
The data for Rogue River that I just downloaded include this comment:
# Data for the
I do not wish to express any opinion on what should be done or how. But...
1. I assume that when data are missing, they are missing -- i.e.
simply not present in the data. So there may be several/many missing rows of
data in succession corresponding to those times,
right? (Apologies for
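One way to make the gaps explicit (my sketch, not something proposed in the thread) is to merge the readings onto a complete 5-minute grid, so the absent slots become NA rows:

obs <- data.frame(   # toy data with a missing 09:35 slot
  when = as.POSIXct(c("2020-08-26 09:30", "2020-08-26 09:40"), tz = "UTC"),
  cfs  = c(136000, 126000)
)
grid <- data.frame(
  when = seq(as.POSIXct("2020-08-26 09:30", tz = "UTC"),
             as.POSIXct("2020-08-26 09:45", tz = "UTC"), by = "5 min")
)
filled <- merge(grid, obs, by = "when", all.x = TRUE)   # absent slots become NA
filled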
On Tue, 31 Aug 2021, Richard O'Keefe wrote:
I made up fake data in order to avoid showing untested code.
It's not part of the process I was recommending.
I expect data recorded every N minutes to use NA when something
is missing, not to simply not be recorded. Well and good, all that
means is that reshaping the data is not a trivial call to
On Mon, 30 Aug 2021, Richard O'Keefe wrote:
Why would you need a package for this?
samples.per.day <- 12*24
That's 12 5-minute intervals per hour and 24 hours per day.
Generate some fake data.
Richard,
The problem is that there are days with fewer than 12 recorded values for
various reasons.
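A quick way (my addition) to see how far each day falls short of the expected 12 * 24 = 288 readings, assuming a Date column named day:

# toy example in which 2020-08-27 is missing most of its rows
obs <- data.frame(day = as.Date(c(rep("2020-08-26", 288), rep("2020-08-27", 100))))
per_day <- table(obs$day)
per_day[per_day < 288]   # days that cannot fill a complete matrix row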
It is not clear to me who Jeff Newmiller's comment about periodicity
is addressed to.
The original poster, for asking for daily summaries?
A summary of what I wrote:
- daily means and standard deviations are a very poor choice for river flow data
- if you insist on doing that anyway, no fancy package is needed
IMO assuming periodicity is a bad practice for this. Missing timestamps happen
too, and there is no reason to build a broken analysis process.
On August 29, 2021 7:09:01 PM PDT, Richard O'Keefe wrote:
>Why would you need a package for this?
>> samples.per.day <- 12*24
>
>That's 12 5-minute intervals per hour and 24 hours per day.
Why would you need a package for this?
> samples.per.day <- 12*24
That's 12 5-minute intervals per hour and 24 hours per day.
Generate some fake data.
> x <- rnorm(samples.per.day * 365)
> length(x)
[1] 105120
Reshape the fake data into a matrix where each row represents one
24-hour period.
> m <- matrix(x, ncol=samples.per.day, byrow=TRUE)
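The message is truncated here; presumably the daily summaries then come straight off the matrix rows, along these lines (a sketch, not Richard's exact code):

> daily.mean <- rowMeans(m)
> daily.sd   <- apply(m, 1, sd)
> length(daily.mean)
[1] 365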
On Sun, 29 Aug 2021, Andrew Simmons wrote:
I would suggest something like:
Thanks, Andrew.
Stay well,
Rich
On Sun, 29 Aug 2021, Rui Barradas wrote:
Hope this helps,
Rui,
Greatly! I'll study it carefully so I fully understand the process.
Many thanks.
Stay well,
Rich
Hello,
I forgot in my previous answer, sorry for the duplicated mails.
The function in my previous mail has a na.rm argument, defaulting to
FALSE, pass na.rm = TRUE to remove the NA's.
agg <- aggregate(cfs ~ date, df1, fun, na.rm = TRUE)
Or simply change the default. I prefer to set na.rm
On Sun, 29 Aug 2021, Jeff Newmiller wrote:
You may find something useful on handling timestamp data here:
https://jdnewmil.github.io/
Jeff,
I'll certainly read those articles.
Many thanks,
Rich
Hello,
You have date and hour in two separate columns, so to compute daily
stats part of the work is already done. (Were they in the same column
you would have to extract the date only.)
# convert to class "Date"
df1$date <- as.Date(df1$date)
# function to compute the stats required
# it's
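The function definition is cut off above; a plausible reconstruction (my guess, not Rui's exact code) that returns both statistics and passes na.rm through:

fun <- function(x, na.rm = FALSE) {
  c(mean = mean(x, na.rm = na.rm), sd = sd(x, na.rm = na.rm))
}
# then, as in the follow-up mail:
# agg <- aggregate(cfs ~ date, df1, fun, na.rm = TRUE)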
Hello,
I would suggest something like:
date <- seq(as.Date("2020-01-01"), as.Date("2020-12-31"), 1)
time <- sprintf("%02d:%02d", rep(0:23, each = 12), seq.int(0, 55, 5))
x <- data.frame(
date = rep(date, each = length(time)),
time = time
)
x$cfs <- stats::rnorm(nrow(x))
cols2aggregate
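The code breaks off at cols2aggregate; a hedged guess at how the aggregation might have continued, treating cols2aggregate as the vector of value columns:

cols2aggregate <- "cfs"   # add more columns as needed
agg <- aggregate(x[cols2aggregate], by = list(date = x$date),
                 FUN = function(v) c(mean = mean(v, na.rm = TRUE),
                                     sd   = sd(v, na.rm = TRUE)))
head(agg)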
You may find something useful on handling timestamp data here:
https://jdnewmil.github.io/
On August 29, 2021 9:23:31 AM PDT, Jeff Newmiller wrote:
>The general idea is to create a "grouping" column with repeated values for
>each day, and then to use aggregate to compute your combined results.
On Sun, 29 Aug 2021, Jeff Newmiller wrote:
The general idea is to create a "grouping" column with repeated values for
each day, and then to use aggregate to compute your combined results. The
dplyr package's group_by/summarise functions can also do this, and there
are also proponents of the data.table package
On Sun, 29 Aug 2021, Eric Berger wrote:
Provide dummy data (e.g. 5-10 lines), say like the contents of a csv file,
and calculate by hand what you'd like to see in the plot. (And describe
what the plot would look like.)
Eric,
Mea culpa! I extracted a set of sample data and forgot to include it
The general idea is to create a "grouping" column with repeated values for each
day, and then to use aggregate to compute your combined results. The dplyr
package's group_by/summarise functions can also do this, and there are also
proponents of the data.table package, which is high performance but
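For completeness, a sketch of the dplyr group_by/summarise variant mentioned here (my code, with column names matching the sample data elsewhere in the thread):

library(dplyr)

discharge <- data.frame(
  sampdate = c("2020-08-26", "2020-08-26", "2020-08-27"),
  cfs      = c(136000, 126000, 131000)
)

discharge %>%
  mutate(day = as.Date(sampdate)) %>%   # the repeated "grouping" column
  group_by(day) %>%
  summarise(mean_cfs = mean(cfs, na.rm = TRUE),
            sd_cfs   = sd(cfs, na.rm = TRUE))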
Hi Rich,
Your request is a bit open-ended but here's a suggestion that might help
get you an answer.
Provide dummy data (e.g. 5-10 lines), say like the contents of a csv file,
and calculate by hand what you'd like to see in the plot. (And describe
what the plot would look like.)
It sounds like what
I have a year's hydraulic data (discharge, stage height, velocity, etc.)
from a USGS monitoring gauge recording values every 5 minutes. The data
files contain 90K-93K lines and plotting all these data would produce a
solid block of color.
What I want are the daily means and standard deviation fro
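Once a daily summary exists (see the aggregate and dplyr sketches earlier in the thread), one way to plot it without the solid block of colour is a line of daily means with standard-deviation bars; the agg data frame below is invented purely for illustration:

agg <- data.frame(   # stand-in for the real daily summary
  day      = as.Date("2020-08-01") + 0:29,
  mean_cfs = 1000 + 50 * sin(seq(0, 3, length.out = 30)),
  sd_cfs   = runif(30, 20, 60)
)
plot(agg$day, agg$mean_cfs, type = "l",
     xlab = "Date", ylab = "Discharge (cfs)")
arrows(agg$day, agg$mean_cfs - agg$sd_cfs,
       agg$day, agg$mean_cfs + agg$sd_cfs,
       length = 0.02, angle = 90, code = 3)   # +/- one SD per day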