Hello,
Inline.
At 09:22 on 21/01/20, Chris Evans wrote:
I think that might risk giving the wrong date for a date like 1/3/1990, which I
think is mdy, not dmy, in Val's data.
As I read the data, where the separator is "/" the format is mdy and where the separator is "-" it's dmy.
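To make the ambiguity concrete, here is a minimal base R sketch (it uses only as.Date(), not lubridate). The string from the example above parses without complaint under either reading, so a parser alone cannot tell which one is intended:

```r
x <- "1/3/1990"

as_dmy <- as.Date(x, format = "%d/%m/%Y")  # read as day/month/year
as_mdy <- as.Date(x, format = "%m/%d/%Y")  # read as month/day/year

as_dmy   # "1990-03-01"
as_mdy   # "1990-01-03"
```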
Maybe you're right, but I really don't know: in my country (Portugal) we
use "/" with dmy. Anyway, what matters is that the OP must understand the
data much better; the way it is posted is likely to cause errors. See, for
instance, the expected output, which has numbers greater than 12 in the
1st or the 2nd place, depending on the row.
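One way to check this, sketched in base R under the assumption that every real date has three fields separated by "-" or "/": split each string and see which position ever exceeds 12, since such a position cannot be the month. The data frame is the sample posted by Val; the helper name fields_over_12 is made up for this illustration.

```r
DFX <- read.table(text = "name ddate
A 19-10-02
B 22-11-20u
C 19-01-15
D 11/19/2006
F 9/9/2011
G 12/29/2010
H DEX", header = TRUE, stringsAsFactors = FALSE)

# For the strings that contain separator `sep`, report whether the 1st
# and the 2nd field ever exceed 12 (such a field cannot be a month).
fields_over_12 <- function(x, sep) {
  x <- x[grepl(sep, x, fixed = TRUE)]
  parts <- strsplit(x, sep, fixed = TRUE)
  first  <- suppressWarnings(as.integer(vapply(parts, `[`, "", 1)))
  second <- suppressWarnings(as.integer(vapply(parts, `[`, "", 2)))
  c(first  = any(first  > 12, na.rm = TRUE),
    second = any(second > 12, na.rm = TRUE))
}

fields_over_12(DFX$ddate, "-")  # first TRUE,  second FALSE: consistent with dmy
fields_over_12(DFX$ddate, "/")  # first FALSE, second TRUE:  consistent with mdy
```

This only rules readings out; a separator whose fields never exceed 12 stays ambiguous and needs knowledge of where the data came from.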
So I would go for:
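library(lubridate)
DFX$dnew <- as.Date(NA)                       # start from an all-NA Date column
dash  <- grepl("-", DFX$ddate, fixed = TRUE)  # "-" separator: treat as dmy
slash <- grepl("/", DFX$ddate, fixed = TRUE)  # "/" separator: treat as mdy
DFX$dnew[dash]  <- dmy(DFX$ddate[dash])
DFX$dnew[slash] <- mdy(DFX$ddate[slash])
# remove what didn't fit either format
DFX <- DFX[!is.na(DFX$dnew), ]
DFX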
  name      ddate       dnew
1    A   19-10-02 2002-10-19
2    B   22-11-20 2020-11-22
3    C   19-01-15 2015-01-19
4    D 11/19/2006 2006-11-19
5    F   9/9/2011 2011-09-09
6    G 12/29/2010 2010-12-29
But I am so much in awe of Rui's skills with R, and those of most of the
regular commentators here, that I submit this a little nervously!
Thanks!
Rui Barradas
Many thanks to all who teach me so much here; it's lovely, if I am correct,
to contribute for a change!
Chris
----- Original Message -----
From: "Rui Barradas" <ruipbarra...@sapo.pt>
To: "Val" <valkr...@gmail.com>, "r-help@R-project.org (r-help@r-project.org)"
<r-help@r-project.org>
Sent: Tuesday, 21 January, 2020 00:40:29
Subject: Re: [R] Mixed format
Hello,
The following strategy works with your data.
It uses the fact that most dates are in one of 3 formats: dmy, mdy or ymd.
It tries those formats one by one and, after each try, looks for NA's in
the new column.
# first round, format is dmy
DFX$dnew <- lubridate::dmy(DFX$ddate)
na <- is.na(DFX$dnew)
# second round, format is mdy
DFX$dnew[na] <- lubridate::mdy(DFX$ddate[na])
na <- is.na(DFX$dnew)
# last round, format is ymd
DFX$dnew[na] <- lubridate::ymd(DFX$ddate[na])
# remove what didn't fit any format
DFX <- DFX[!is.na(DFX$dnew), ]
DFX
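For comparison, the same try-one-format-after-another idea can be sketched in base R with as.Date() and explicit format strings. This is not the lubridate code above: the helper name try_formats and the two formats chosen are assumptions for illustration, and each format includes its separator, so a "/" date is never tried as dmy. Note also that as.Date() ignores trailing characters, so "22-11-20u" is accepted here.

```r
DFX <- read.table(text = "name ddate
A 19-10-02
B 22-11-20u
C 19-01-15
D 11/19/2006
F 9/9/2011
G 12/29/2010
H DEX", header = TRUE, stringsAsFactors = FALSE)

# Try each format in turn, filling only the entries that are still NA.
try_formats <- function(x, formats) {
  out <- rep(as.Date(NA), length(x))
  for (fmt in formats) {
    todo <- is.na(out)                         # entries not parsed yet
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

DFX$dnew <- try_formats(DFX$ddate, c("%d-%m-%y", "%m/%d/%Y"))
# remove what didn't fit any format
DFX <- DFX[!is.na(DFX$dnew), ]
DFX
```

With the sample data this keeps rows A to G and drops row H.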
Hope this helps,
Rui Barradas
At 22:58 on 20/01/20, Val wrote:
Hi All,
I have a data frame in which one column holds dates in mixed formats,
some in the form "%m-%d-%y" and some in "%m/%d/%Y"; some entries are not dates at all.
Is there a way to delete the rows that contain non-dates and
standardize the dates to a single format such as %m-%d-%Y?
Please see my sample data and desired output.
DFX<-read.table(text="name ddate
A 19-10-02
B 22-11-20u
C 19-01-15
D 11/19/2006
F 9/9/2011
G 12/29/2010
H DEX",header=TRUE)
Desired output
name ddate
A 19-10-2002
B 22-11-2020
C 19-01-2015
D 11-19-2006
F 09-09-2011
G 12-29-2010
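Once a column has been parsed to class Date, writing it back in a single layout is a matter of format(). A minimal sketch, assuming the dates are already parsed (the vector d below is typed in by hand for illustration). Note that "%m-%d-%Y" puts the month first in every row, which is not what the first three rows of the desired output above show.

```r
d <- as.Date(c("2002-10-19", "2011-09-09"))

# Render every date as month-day-year with a four-digit year
out <- format(d, "%m-%d-%Y")
out   # "10-19-2002" "09-09-2011"
```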
Thank you
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.