Hello,

Inline.

At 09:22 on 21/01/20, Chris Evans wrote:
I think that might risk giving the wrong date for a date like 1/3/1990, which I
think is mdy rather than dmy in Val's data.

As I read the data, where the separator is "/" the format is mdy and where the separator is "-" it's dmy.
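
To make the concern concrete, here is a minimal sketch in base R (not the lubridate calls used elsewhere in this thread) that reads the string "1/3/1990" under both candidate formats. The variable names are only illustrative.

```r
# The same string read under the two candidate formats (base R only).
x <- "1/3/1990"
as_dmy <- as.Date(x, format = "%d/%m/%Y")  # day first: 1 March 1990
as_mdy <- as.Date(x, format = "%m/%d/%Y")  # month first: 3 January 1990
print(as_dmy)
print(as_mdy)
# Both parses succeed, so a "try dmy first, then fall back" approach
# would never reach the mdy attempt for this value.
c(dmy_ok = !is.na(as_dmy), mdy_ok = !is.na(as_mdy))
```

Because neither field exceeds 12, nothing in the string itself says which reading is right; only knowledge of the data (here, possibly the separator) can decide.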

Maybe you're right, but I really don't know; in my country (Portugal) we use "/" with dmy. In any case, what matters is that the OP needs a much better understanding of the data, because the way it is posted is likely to cause errors. See, for instance, the expected output, which has numbers greater than 12 in the first or the second position, depending on the row.
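
One way to look at this is to tabulate, per row, which of the first two fields is greater than 12. This is only a sketch in base R; the vector below copies the date-like values from the original post, and the names `parts`, `first`, `second` and `check` are made up for illustration.

```r
# Date-like values copied from the sample data in the original post.
ddate <- c("19-10-02", "22-11-20u", "19-01-15",
           "11/19/2006", "9/9/2011", "12/29/2010")

# Split on either separator and keep the first two fields as integers.
parts  <- strsplit(ddate, "[-/]")
first  <- as.integer(vapply(parts, `[`, character(1), 1))
second <- as.integer(vapply(parts, `[`, character(1), 2))

# A first field > 12 cannot be a month (rules out mdy);
# a second field > 12 cannot be a month (rules out dmy).
check <- data.frame(
  ddate        = ddate,
  sep          = ifelse(grepl("/", ddate, fixed = TRUE), "/", "-"),
  first_gt_12  = first > 12,
  second_gt_12 = second > 12
)
check
```

Rows where both flags are FALSE (such as 9/9/2011) cannot be settled this way, which is why knowing how the data were recorded matters.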


So I would go for:

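library(lubridate)

# Start from an all-NA Date column so this also runs when dnew does not exist yet
DFX$dnew <- as.Date(NA)

# Values separated by "-" are read as dmy, values separated by "/" as mdy
dash  <- grepl("-", DFX$ddate, fixed = TRUE)
slash <- grepl("/", DFX$ddate, fixed = TRUE)
DFX$dnew[dash]  <- dmy(DFX$ddate[dash])
DFX$dnew[slash] <- mdy(DFX$ddate[slash])

# Drop the rows that did not parse as dates
DFX <- DFX[!is.na(DFX$dnew), ]
DFX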

   name      ddate       dnew
1    A   19-10-02 2002-10-19
2    B   22-11-20 2020-11-22
3    C   19-01-15 2015-01-19
4    D 11/19/2006 2006-11-19
5    F   9/9/2011 2011-09-09
6    G 12/29/2010 2010-12-29

But I am so much in awe of Rui's skills with R, and those of most of the regular
commentators here, that I submit this a little nervously!

Thanks!

Rui Barradas

Many thanks to all who teach me so much here. It is lovely, if I am correct, to
contribute for a change!

Chris


----- Original Message -----
From: "Rui Barradas" <ruipbarra...@sapo.pt>
To: "Val" <valkr...@gmail.com>, "r-help@R-project.org (r-help@r-project.org)" 
<r-help@r-project.org>
Sent: Tuesday, 21 January, 2020 00:40:29
Subject: Re: [R] Mixed format

Hello,

The following strategy works with your data.
It uses the fact that most dates are in one of three formats: dmy, mdy or ymd.
It tries those formats one by one and, after each attempt, looks for the NAs
that remain in the new column.


# first round, format is dmy
DFX$dnew <- lubridate::dmy(DFX$ddate)
na <- is.na(DFX$dnew)

# second round, format is mdy
DFX$dnew[na] <- lubridate::mdy(DFX$ddate[na])
na <- is.na(DFX$dnew)

# last round, format is ymd
DFX$dnew[na] <- lubridate::ymd(DFX$ddate[na])

# remove what didn't fit any format
DFX <- DFX[!is.na(DFX$dnew), ]
DFX
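
Once the column is of class Date, writing it out in a single format such as the "%m-%d-%Y" mentioned in the question can be done with base R's format(). The sketch below uses a small hypothetical vector `dnew` whose values match three rows of the parsed output shown in this thread, rather than the data frame itself.

```r
# Hypothetical parsed dates (three values taken from the output in this thread).
dnew <- as.Date(c("2002-10-19", "2006-11-19", "2011-09-09"))

# format() returns character strings in the requested layout.
out <- format(dnew, "%m-%d-%Y")
out
```

Note that this puts the month first in every row, and that the result is character, not Date, so it is probably best done only as a final display step.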


Hope this helps,

Rui Barradas

At 22:58 on 20/01/20, Val wrote:
Hi All,

I have a data frame in which one column holds dates in mixed formats:
some are in the form "%m-%d-%y", some in the form "%m/%d/%Y", and some
entries are not dates at all.

Is there a way to delete the rows that contain non-dates and
standardize the dates in a single format such as %m-%d-%Y?
Please see my sample data and desired output.

DFX<-read.table(text="name ddate
    A  19-10-02
    B  22-11-20u
    C  19-01-15
    D  11/19/2006
    F  9/9/2011
    G  12/29/2010
    H  DEX",header=TRUE)

Desired output
name ddate
A  19-10-2002
B  22-11-2020
C  19-01-2015
D  11-19-2006
F  09-09-2011
G  12-29-2010

Thank you

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

