Marie Sivertsen wrote:
I am relatively new to R, so maybe I am missing something, but I have
now tried as.Date and have problems understanding how it works (or
doesn't work, as it seems).


Brian D Ripley wrote:
On Thu, 22 Jan 2009, Terry Therneau wrote:
One idea is to use the as.date function, for the older (and less capable) 'date'
class.  This is currently loaded by default with library(survival).  It returns
NA for an invalid date rather than dying.
So does as.Date **if you specify the format** (as you have to with your as.date:
it has a default one):
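
A minimal sketch of the point above, with input strings made up for illustration: when the format is given explicitly, as.Date returns NA for a string that does not yield a valid date, rather than stopping with an error.

```r
# Illustrative inputs only; the format strings use standard strptime conversions.
ok      <- as.Date("2001-01-13", format = "%Y-%m-%d")  # a valid date
bad_day <- as.Date("2001-02-30", format = "%Y-%m-%d")  # February 30 does not exist
bad_mon <- as.Date("1/13/2001",  format = "%d/%m/%Y")  # month 13 does not exist

# NA marks the strings that could not be converted under the given format.
is.na(c(ok, bad_day, bad_mon))
```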


as.Date("2001/1/1")
Works fine

as.Date("1/1/2001")
Prints "1-01-20" ???

as.Date("13/1/2001")
Prints "13-01-20" ???

as.Date("1/13/2001")
Prints error: not in standard unambiguous format

It seems as if both "1/1/2001" and "13/1/2001" were considered by R to
be in a standard unambiguous format (otherwise an error would be
reported?), and yet they are parsed incorrectly according to what one
would think is obvious.  It is also surprising that not only
"13/1/2001" but also "1/2/2001" and "2/1/2001" are successfully but
incorrectly parsed, as if they were unambiguous, and yet "1/13/2001"
is considered ambiguous, though there is really just one way to parse
it meaningfully.

I think the strings that are incorrectly parsed should raise errors,
and the last example should be successfully parsed.  What is the
reason for the observed behaviour?


There are two issues:

a) as.Date ignores trailing characters. This is what causes it to read a trailing 4-digit year as "20" or "19". I.e., under the default "%Y/%m/%d" format, 2/1/2001 makes sense as 2/1/20 (January 20, year 2 AD) followed by an ignored "01". This is a documented feature, although its usefulness may not be clear to you. I suspect that the point is that you sometimes get odd date strings, say
"Jan 24, 2009 - pd"
and you don't want to have to add code to strip off the unneeded part.
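
A small sketch of (a), assuming the default behaviour in which "%Y-%m-%d" is tried first and then "%Y/%m/%d". The helper parse_dmy_strict is hypothetical (it is not part of base R) and is only one possible way to reject trailing junk when you do not want it ignored.

```r
# Trailing characters after a complete match are silently dropped.
d1 <- as.Date("2009-01-24 - pd")   # the trailing " - pd" is ignored
d2 <- as.Date("2/1/2001")          # read as year 2, month 1, day 20; "01" is ignored

# Supplying the intended format removes the surprise.
d3 <- as.Date("2/1/2001", format = "%d/%m/%Y")   # 2 January 2001

# Hypothetical helper: accept only strings that are entirely day/month/year,
# and return NA for anything with extra characters.
parse_dmy_strict <- function(x) {
  ok  <- grepl("^[0-9]{1,2}/[0-9]{1,2}/[0-9]{4}$", x)
  out <- as.Date(x, format = "%d/%m/%Y")
  out[!ok] <- NA
  out
}

parse_dmy_strict(c("2/1/2001", "2/1/2001 - pd"))
```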

b) The error message could be better. Formats are never ambiguous and none of them are defined uniquely by their strings ("03-02-01" is always a problem) and we're not trying to auto-detect anyway. What it really means is that the string is clearly not "%Y-%m-%d" or "%Y/%m/%d".
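
To illustrate (b) with a sketch: the same string gives three different dates under three different formats, so the string alone cannot tell you which format was meant. The variable names are just for illustration.

```r
x <- "03-02-01"
ymd <- as.Date(x, format = "%y-%m-%d")   # 1 February 2003
dmy <- as.Date(x, format = "%d-%m-%y")   # 3 February 2001
mdy <- as.Date(x, format = "%m-%d-%y")   # 2 March 2001

# Without a format, only "%Y-%m-%d" and "%Y/%m/%d" are tried; a string that
# fits neither (month 13 here) triggers the "standard unambiguous format" error.
res <- tryCatch(as.Date("1/13/2001"), error = function(e) conditionMessage(e))
```

Here res holds the error message as a character string instead of stopping the script, which is handy when checking many inputs.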

--
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalga...@biostat.ku.dk)              FAX: (+45) 35327907

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
