Marie Sivertsen wrote:
I am relatively new to R, so maybe I am missing something, but I have now
tried as.Date and have problems understanding how it works (or
doesn't work, as it seems).
Brian D Ripley wrote:
On Thu, 22 Jan 2009, Terry Therneau wrote:
One idea is to use the as.date function, for the older (and less capable) 'date'
class. This is currently loaded by default with library(survival). It returns
NA for an invalid date rather than dying.
So does as.Date **if you specify the format** (as you have to with your as.date:
it has a default one):
as.Date("2001/1/1")
Works fine
as.Date("1/1/2001")
Prints "1-01-20" ???
as.Date("13/1/2001")
Prints "13-01-20" ???
as.Date("1/13/2001")
Prints error: not in standard unambiguous format
It seems as if both "1/1/2001" and "13/1/2001" were considered by R to be
in a standard unambiguous format (otherwise an error would be reported?),
and yet they are parsed incorrectly compared with what one would think is
obvious.
It is also surprising that not only "13/1/2001" but also "1/2/2001" and
"2/1/2001" are parsed successfully but incorrectly, as if they were
unambiguous, and yet "1/13/2001" is rejected as ambiguous, though there is
really just one way to parse it meaningfully.
I think the strings that are parsed incorrectly should raise errors, and
the last example should be parsed successfully. What is the reason for the
observed behavior?
There are two issues:
a) as.Date ignores trailing characters. This is what causes it to read
trailing four-digit years as "20" or "19". I.e., under "%Y/%m/%d",
"2/1/2001" makes sense as "2/1/20" (January 20, year 2 AD) followed by
"01". This is a documented feature, although the usefulness may not be
clear to you. I suspect that
the point is that you sometimes get odd date strings, say
"Jan 24, 2009 - pd"
and you don't want to have to add code to strip off the unneeded part.
b) The error message could be better. A format is never ambiguous in
itself, and no string uniquely determines its format ("03-02-01" is always
a problem), and we're not trying to auto-detect anyway. What the message
really means is that the string is clearly not "%Y-%m-%d" or "%Y/%m/%d".
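To make the two points above concrete, here is a small sketch in base R.
It assumes the strings from the original question; the helper
parse_dmy_strict() is a hypothetical name invented for this illustration,
not part of R, and it is only one possible way to reject trailing text.

```r
# With no format, as.Date only tries "%Y-%m-%d" and then "%Y/%m/%d".
# "2/1/2001" fits "%Y/%m/%d" as year 2, month 1, day 20; the "01" is ignored.
default_parse <- as.Date("2/1/2001")
default_month_day <- format(default_parse, "%m-%d")   # month and day only

# With an explicit format the intended reading is used, and an impossible
# date gives NA instead of an error.
d1 <- as.Date("1/1/2001",  format = "%d/%m/%Y")
d2 <- as.Date("13/1/2001", format = "%d/%m/%Y")
d3 <- as.Date("1/13/2001", format = "%m/%d/%Y")
d4 <- as.Date("1/13/2001", format = "%d/%m/%Y")       # no month 13

# Trailing characters after the part matched by the format are ignored.
d5 <- as.Date("2009-01-24 - pd", format = "%Y-%m-%d")

# Hypothetical helper (an assumption, not an R function): accept only
# strings that look exactly like d/m/yyyy, so trailing text yields NA.
parse_dmy_strict <- function(x) {
  looks_ok <- grepl("^[0-9]{1,2}/[0-9]{1,2}/[0-9]{4}$", x)
  out <- as.Date(x, format = "%d/%m/%Y")
  out[!looks_ok] <- NA
  out
}

strict_good <- parse_dmy_strict("13/1/2001")
strict_bad  <- parse_dmy_strict("13/1/2001 - pd")
```

If the silent acceptance of trailing text is unwanted, a pattern check of
this kind before calling as.Date is one option; whether the extra
strictness is worth it depends on how messy the input is.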
--
O__ ---- Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalga...@biostat.ku.dk) FAX: (+45) 35327907
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.