Hi Phil,

Thank you very much for this detailed explanation. It will help me when summarizing information over periods of time using either summarize (Hmisc) or summaryBy (doBy). Until now, doing so turned the "mean" time for each "group" into a number of seconds, as you explain below, and neither function converts it back into a POSIX date-time object. I tried to do so with "as.POSIXct()", but this failed because I did not provide an origin (the reference date the seconds are counted from). From now on I'll try the structure command you used below.
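
For the record, here is a minimal sketch of that conversion step. The timestamps and group labels (times, group) are made up for illustration, and base R's split/sapply stands in for summarize or summaryBy, so this only shows how a number of seconds can be turned back into a date-time once an origin is supplied:

```r
# Hypothetical data: four timestamps in two groups (not from the real series).
times <- as.POSIXct(c("2009-02-24 14:51:18", "2009-02-24 14:51:20",
                      "2009-02-24 14:51:21", "2009-02-24 14:51:23"),
                    tz = "UTC")
group <- c("a", "a", "b", "b")

# Averaging the unclassed values by group gives plain seconds.
secs <- unname(sapply(split(as.numeric(times), group), mean))

# Bare numbers need an origin to be read as date-times; POSIXct values
# count seconds from 1970-01-01 00:00:00 UTC.
back <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
format(back, "%Y-%m-%d %H:%M:%S")
```

The tz argument only affects how the result is displayed; the underlying seconds are unchanged.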

Denis
On 09-03-10 at 19:04, Phil Spector wrote:

Denis -
  If you look inside of summary.POSIXct, you'll see the
following:

x <- summary.default(unclass(object), digits = digits, ...)[1:6]

In other words, summary accepts the POSIX object, unclasses it
(resulting in a numeric value representing the number of seconds
since January 1, 1970), performs the operation, and then reassigns
the class. You can do this basic trick yourself. Suppose we have a vector of dates and want the median:

dates = as.POSIXct(c('2009-3-15','2009-2-19','2009-3-20','2009-2-18'))
median(dates)
Error in Summary.POSIXct(c(1235030400, 1237100400), na.rm = FALSE) :
 'sum' not defined for "POSIXt" objects
res = median(as.numeric(dates))
structure(res,class='POSIXct')
[1] "2009-03-02 23:30:00 PST"

  I think it's clear that you can do any arithmetic operation on
dates this way, even if it doesn't make sense:

sum(dates)
Error in Summary.POSIXct(c(1237100400, 1235030400, 1237532400,
1234944000 :
 'sum' not defined for "POSIXt" objects
res = sum(as.numeric(dates))
structure(res,class='POSIXct')
[1] "2126-09-08 23:00:00 PDT"

  I'm quite certain that median.POSIXct will be fixed pretty quickly,
but you can always unclass and reclass to do what you need.
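
  The unclass-and-reclass idea can be wrapped in a small helper. This is only a sketch: posix_apply is a hypothetical name, not a base R function, and carrying over the "tzone" attribute is an assumption about what one would want for display.

```r
# Hypothetical helper (not part of base R): apply a numeric summary to a
# POSIXct vector, then restore the date-time class and time zone.
posix_apply <- function(x, f, ...) {
  stopifnot(inherits(x, "POSIXct"))
  res <- f(as.numeric(x), ...)   # seconds since 1970-01-01 UTC
  structure(res, class = c("POSIXct", "POSIXt"), tzone = attr(x, "tzone"))
}

# Same dates as above, but pinned to UTC so the result does not depend
# on the local time zone.
dates <- as.POSIXct(c("2009-03-15", "2009-02-19", "2009-03-20", "2009-02-18"),
                    tz = "UTC")
posix_apply(dates, median)
```

  The same helper accepts any function of a numeric vector, for example posix_apply(dates, quantile, probs = 0.25), whether or not the result is meaningful as a date.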

                                                    - Phil

On Tue, 10 Mar 2009, Denis Chabot wrote:

Thanks Phil,

but how does summary() find the median of the same type of object? I would have thought the algorithm used when the length of the vector is even would also require the SUM of the POSIX vector. I am glad of the solution you propose, but still a bit puzzled!

Denis
On 09-03-10 at 12:39, Phil Spector wrote:

Denis -
There is no median method for POSIX objects, although
there is a summary method.  Thus, when you pass a POSIX
object to median, it uses median.default, which contains
the following code:

 if (n%%2L == 1L)
    sort(x, partial = half)[half]
 else sum(sort(x, partial = half + 0L:1L)[half + 0L:1L])/2
So when the length of your POSIX vector is odd, it works, but if it's even, it needs to take the sum of a POSIX object. Of course, there is no sum method for POSIX objects, since summing dates doesn't make sense.
Right now, it looks like your best bet for the median of a POSIX
object is
summary(a)['Median']
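
To make the two branches concrete, here is a sketch that mirrors the quoted code on plain numbers. The names median_sketch and to_time are hypothetical, and the example vector is made up; whether median() itself accepts an even-length POSIXct vector depends on the R version, which is why the sketch works on the unclassed seconds.

```r
# Hypothetical re-implementation of the two branches quoted above.
median_sketch <- function(x) {
  n <- length(x)
  half <- (n + 1L) %/% 2L
  if (n %% 2L == 1L) {
    sort(x, partial = half)[half]                           # odd: middle element
  } else {
    sum(sort(x, partial = half + 0L:1L)[half + 0L:1L]) / 2  # even: needs sum()
  }
}

# Restore the date-time class on a number of seconds (UTC for display).
to_time <- function(s) structure(s, class = c("POSIXct", "POSIXt"), tzone = "UTC")

a <- as.POSIXct("2009-03-10 11:30:16", tz = "UTC") + 60 * 0:10  # 11 values

odd  <- to_time(median_sketch(as.numeric(a)))        # odd length: no sum involved
even <- to_time(median_sketch(as.numeric(a[1:10])))  # even length: sum of two middle values
format(c(odd, even), "%H:%M:%S")
```

With 11 values the middle element is returned as is; with 10 values the two middle elements are summed and halved, which is the step that fails on a classed POSIX object.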

                                    - Phil Spector
                                         Statistical Computing Facility
                                         Department of Statistics
                                         UC Berkeley
                                         spec...@stat.berkeley.edu
On Tue, 10 Mar 2009, Denis Chabot wrote:
Hi,
I don't understand the following. When I create a small artificial set of date information in class POSIXct, I can calculate the mean and the median:
a = as.POSIXct(Sys.time())
a = a + 60*0:10; a
 [1] "2009-03-10 11:30:16 EDT" "2009-03-10 11:31:16 EDT" "2009-03-10 11:32:16 EDT"
 [4] "2009-03-10 11:33:16 EDT" "2009-03-10 11:34:16 EDT" "2009-03-10 11:35:16 EDT"
 [7] "2009-03-10 11:36:16 EDT" "2009-03-10 11:37:16 EDT" "2009-03-10 11:38:16 EDT"
[10] "2009-03-10 11:39:16 EDT" "2009-03-10 11:40:16 EDT"
median(a)
[1] "2009-03-10 11:35:16 EDT"
mean(a)
[1] "2009-03-10 11:35:16 EDT"
But for real data (for this post, a short subset is in object c) that I have converted into a POSIXct object, I cannot calculate the median with median(), though I do get it with summary():
c
 [1] "2009-02-24 14:51:18 EST" "2009-02-24 14:51:19 EST" "2009-02-24 14:51:19 EST"
 [4] "2009-02-24 14:51:20 EST" "2009-02-24 14:51:20 EST" "2009-02-24 14:51:21 EST"
 [7] "2009-02-24 14:51:21 EST" "2009-02-24 14:51:22 EST" "2009-02-24 14:51:22 EST"
[10] "2009-02-24 14:51:22 EST"
class(c)
[1] "POSIXt"  "POSIXct"
median(c)
Error in Summary.POSIXct(c(1235505080.6, 1235505081.1), na.rm = FALSE) :
 'sum' not defined for "POSIXt" objects
One difference is that in my own date-time series, some events are repeated (the original data contained fractions of seconds). But then, why can I get a median through summary()?
summary(c)
                     Min.                   1st Qu.                    Median
"2009-02-24 14:51:18 EST" "2009-02-24 14:51:19 EST" "2009-02-24 14:51:20 EST"
                     Mean                   3rd Qu.                      Max.
"2009-02-24 14:51:20 EST" "2009-02-24 14:51:21 EST" "2009-02-24 14:51:22 EST"
Thanks in advance,
Denis Chabot
sessionInfo()
R version 2.8.1 Patched (2009-01-19 r47650)
i386-apple-darwin9.6.0
locale:
fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] doBy_3.7     chron_2.3-30
loaded via a namespace (and not attached):
[1] Hmisc_3.5-2 cluster_1.11.12 grid_2.8.1 lattice_0.17-20 tools_2.8.1
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
