Hello Everyone,
 
I have a question about how best to check dates for entry errors. I recently 
discovered that R will read the incorrectly entered date "11/23/21931" without 
producing a warning or an error message, at least when a format is supplied. 
It appears to use only the first four digits of the year.
 
> as.Date("11/23/21931", format = "%m/%d/%Y")
[1] "2193-11-23"

> as.Date("21931-11-23")
Error in charToDate(x) : 
  character string is not in a standard unambiguous format

Similarly, under some circumstances, R will convert an impossible date like 
February 31, 2011 to NA rather than issuing a warning.

> as.Date("02/31/2011", format = "%m/%d/%Y")
[1] NA

> as.Date("2011-02-31")
Error in charToDate(x) : 
  character string is not in a standard unambiguous format
 
In the first of these calls (the one with an explicit format), the date 
silently becomes NA, so one could easily lose it rather than recognizing that 
it is in error and needs to be corrected.
 
So my question is how best to check these sorts of date values.
 
So far, I've been checking date values with things like:
 
sort( unique(DOB) )
sort( unique(substr(DOB, 1, 4) ) )
sort( unique(substr(DOB, 6, 7) ) )
sort( unique(substr(DOB, 9, 10) ) )
 
These are good for seeing, say, year values that are clearly in error, but 
don't do much to catch the impossible date I cited above.
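
One further check that might help is a round-trip comparison: convert each 
string with an explicit format, format the result back, and flag any value 
that either fails to parse or does not reproduce the original string. This is 
only a minimal sketch. The helper name check_dates and the sample vector dob 
are made up for illustration, and it assumes the dates were entered as 
zero-padded mm/dd/yyyy strings (an entry like "1/5/2011" would be flagged as a 
mismatch even though it is valid).

```r
# Hypothetical helper (not part of base R): flag date strings that either
# fail to parse or do not survive a parse-and-format round trip.
check_dates <- function(x, fmt = "%m/%d/%Y") {
  parsed <- as.Date(x, format = fmt)

  # A value was entered, but as.Date() silently returned NA
  # (e.g. an impossible date such as February 31).
  unparsed <- !is.na(x) & is.na(parsed)

  # The value parsed, but formatting it back does not reproduce the input
  # (e.g. extra digits in the year that were ignored during parsing).
  mismatch <- !is.na(parsed) & format(parsed, fmt) != x

  data.frame(raw = x, parsed = parsed,
             unparsed = unparsed, mismatch = mismatch,
             stringsAsFactors = FALSE)
}

# Made-up example values, including a genuinely missing entry.
dob <- c("11/23/1931", "11/23/21931", "02/31/2011", NA)
res <- check_dates(dob)

# Show only the rows that need a second look.
res[res$unparsed | res$mismatch, ]
```

A missing entry is flagged by neither column, so real NAs stay distinguishable 
from entry errors. A range check on the parsed column (for example, no date 
later than Sys.Date()) could be added alongside.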
 
How can I use R to better scrutinize my date data? Is there any way to make it 
complain more when there is a problem with my date data?
 
Thanks,
 
Paul
 
 
