Hi Phil,
Thank you very much for this detailed explanation. It will help
me when summarizing information over periods of time with either
summarize (Hmisc) or summaryBy (doBy). Until now, the "mean" time
for each "group" came back as a number of seconds, as you explain
below, and neither function converts it back to a POSIX date-time
object. I tried to do so with as.POSIXct(), but this failed because
I did not provide a reference (the origin). From now on I'll try the
structure() command you used below.
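
A minimal sketch of the per-group conversion described above. The data and the names times, group, secs and group_means are made up for illustration, and base R's split()/sapply() stand in for summarize or summaryBy, whose exact output layout is not shown in this thread. It assumes the failure came from calling as.POSIXct() on a plain number without an origin argument, which R versions of this era require:

```r
# Hypothetical data: four timestamps in two groups (UTC to keep it reproducible)
times <- as.POSIXct(c("2009-02-24 14:51:18", "2009-02-24 14:51:20",
                      "2009-02-24 14:52:00", "2009-02-24 14:52:10"),
                    tz = "UTC")
group <- c("a", "a", "b", "b")

# Per-group mean computed on the underlying seconds; the result is plain numeric
secs <- sapply(split(as.numeric(times), group), mean)

# Convert the seconds back to POSIXct by supplying the epoch as the origin
group_means <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
group_means
```

The same last step should apply to a numeric column returned by a grouping function, provided the seconds were counted from the same epoch.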
Denis
On 09-03-10 at 19:04, Phil Spector wrote:
Denis -
If you look inside of summary.POSIXct, you'll see the
following:
x <- summary.default(unclass(object), digits = digits, ...)[1:6]
In other words, summary accepts the POSIX object, unclasses it
(resulting in a numeric value representing the number of seconds
since January 1, 1970, UTC), performs the operation, and then reassigns
the class. You can do this basic trick yourself. Suppose we have a
vector of dates and want the median:
dates = as.POSIXct(c('2009-3-15','2009-2-19','2009-3-20','2009-2-18'))
median(dates)
Error in Summary.POSIXct(c(1235030400, 1237100400), na.rm = FALSE) :
'sum' not defined for "POSIXt" objects
res = median(as.numeric(dates))
structure(res,class='POSIXct')
[1] "2009-03-02 23:30:00 PST"
I think it's clear that you can do any arithmetic operation on
dates this way, even if it doesn't make sense:
sum(dates)
Error in Summary.POSIXct(c(1237100400, 1235030400, 1237532400,
1234944000 :
'sum' not defined for "POSIXt" objects
res = sum(as.numeric(dates))
structure(res,class='POSIXct')
[1] "2126-09-08 23:00:00 PDT"
I'm quite certain that median.POSIXct will be fixed pretty quickly,
but you can always unclass and reclass to do what you need.
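
The unclass-and-reclass trick above can be wrapped in a small helper. This is only a sketch: posixct_apply is a hypothetical name, not a function from base R or any package. It sets the class to c("POSIXct", "POSIXt"), the order used by later R versions, whereas the session in this thread reports "POSIXt" "POSIXct". It also carries over the tzone attribute of the input, if any:

```r
# Hypothetical helper: run a numeric summary on a POSIXct vector, then
# put the POSIXct class (and the time zone, if set) back on the result.
posixct_apply <- function(x, f, ...) {
  stopifnot(inherits(x, "POSIXct"))
  res <- f(as.numeric(x), ...)          # work on seconds since the epoch
  structure(res,
            class = c("POSIXct", "POSIXt"),
            tzone = attr(x, "tzone"))   # NULL here simply sets no attribute
}

dates <- as.POSIXct(c("2009-03-15", "2009-02-19", "2009-03-20", "2009-02-18"),
                    tz = "UTC")
m <- posixct_apply(dates, median)
m
```

As with sum() above, nothing stops f from being an operation that makes no sense for dates; the helper only handles the bookkeeping.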
- Phil
On Tue, 10 Mar 2009, Denis Chabot wrote:
Thanks Phil,
but how does summary() find the median of the same type of object?
I would have thought the algorithm used when the vector has even
length would also require the SUM of the POSIX vector. I am glad of
the solution you propose, but still a bit puzzled!
Denis
On 09-03-10 at 12:39, Phil Spector wrote:
Denis -
There is no median method for POSIX objects, although
there is a summary method. Thus, when you pass a POSIX
object to median, it uses median.default, which contains
the following code:
if (n%%2L == 1L)
    sort(x, partial = half)[half]
else sum(sort(x, partial = half + 0L:1L)[half + 0L:1L])/2
So when the length of your POSIX vector is odd, it works, but if
it's even, it would need to take the sum of a POSIX
object. Of course, there is no sum method for POSIX objects,
since it doesn't make sense.
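
To make the odd/even difference concrete, here is a sketch that reproduces the branch quoted above on plain numbers. median_sketch is a hypothetical name, and the definition of half is an assumption (the excerpt does not show it). The point is only that the odd branch picks one element while the even branch has to call sum():

```r
# Sketch of the quoted branch logic, applied to plain numeric input.
median_sketch <- function(x) {
  n <- length(x)
  half <- (n + 1L) %/% 2L   # assumed definition; not shown in the excerpt
  if (n %% 2L == 1L) {
    # odd length: the middle element is returned as-is, no arithmetic
    sort(x, partial = half)[half]
  } else {
    # even length: the two middle elements are summed and halved
    sum(sort(x, partial = half + 0L:1L)[half + 0L:1L]) / 2
  }
}

odd  <- as.numeric(as.POSIXct(c("2009-02-18", "2009-02-19", "2009-03-15"),
                              tz = "UTC"))
even <- as.numeric(as.POSIXct(c("2009-02-18", "2009-02-19", "2009-03-15",
                                "2009-03-20"), tz = "UTC"))

median_sketch(odd)    # middle element only
median_sketch(even)   # average of the two middle elements
```

This matches the original report: the artificial vector had 11 elements and worked, while the real subset had 10 and hit the sum() call.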
Right now, it looks like your best bet for a summary of POSIX
objects is
summary(a)['Median']
- Phil Spector
Statistical Computing Facility
Department of Statistics
UC Berkeley
spec...@stat.berkeley.edu
On Tue, 10 Mar 2009, Denis Chabot wrote:
Hi,
I don't understand the following. When I create a small
artificial set of date information in class POSIXct, I can
calculate the mean and the median:
a = as.POSIXct(Sys.time())
a = a + 60*0:10; a
 [1] "2009-03-10 11:30:16 EDT" "2009-03-10 11:31:16 EDT" "2009-03-10 11:32:16 EDT"
 [4] "2009-03-10 11:33:16 EDT" "2009-03-10 11:34:16 EDT" "2009-03-10 11:35:16 EDT"
 [7] "2009-03-10 11:36:16 EDT" "2009-03-10 11:37:16 EDT" "2009-03-10 11:38:16 EDT"
[10] "2009-03-10 11:39:16 EDT" "2009-03-10 11:40:16 EDT"
median(a)
[1] "2009-03-10 11:35:16 EDT"
mean(a)
[1] "2009-03-10 11:35:16 EDT"
But for real data (for this post, a short subset is in object c)
that I have converted into a POSIXct object, I cannot calculate
the median with median(), though I do get it with summary():
c
 [1] "2009-02-24 14:51:18 EST" "2009-02-24 14:51:19 EST" "2009-02-24 14:51:19 EST"
 [4] "2009-02-24 14:51:20 EST" "2009-02-24 14:51:20 EST" "2009-02-24 14:51:21 EST"
 [7] "2009-02-24 14:51:21 EST" "2009-02-24 14:51:22 EST" "2009-02-24 14:51:22 EST"
[10] "2009-02-24 14:51:22 EST"
class(c)
[1] "POSIXt" "POSIXct"
median(c)
Error in Summary.POSIXct(c(1235505080.6, 1235505081.1), na.rm = FALSE) :
'sum' not defined for "POSIXt" objects
One difference is that in my own date-time series, some events
are repeated (the original data contained fractions of seconds).
But then, why can I get a median through summary()?
summary(c)
                     Min.                   1st Qu.                    Median
"2009-02-24 14:51:18 EST" "2009-02-24 14:51:19 EST" "2009-02-24 14:51:20 EST"
                     Mean                   3rd Qu.                      Max.
"2009-02-24 14:51:20 EST" "2009-02-24 14:51:21 EST" "2009-02-24 14:51:22 EST"
Thanks in advance,
Denis Chabot
sessionInfo()
R version 2.8.1 Patched (2009-01-19 r47650)
i386-apple-darwin9.6.0
locale:
fr_CA.UTF-8/fr_CA.UTF-8/C/C/fr_CA.UTF-8/fr_CA.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods
base
other attached packages:
[1] doBy_3.7 chron_2.3-30
loaded via a namespace (and not attached):
[1] Hmisc_3.5-2 cluster_1.11.12 grid_2.8.1
lattice_0.17-20 tools_2.8.1
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.