Re: [Rd] Date class shows Inf as NA; this confuses the use of is.na()

2018-06-11 Thread Martin Maechler
> Joris Meys 
> on Sat, 9 Jun 2018 13:45:21 +0200 writes:

> And now I've seen I copied the wrong part of ?is.na

> > The default method for is.na applied to an atomic vector
> > returns a logical vector of the same length as its argument x,
> > containing TRUE for those elements marked NA or, for
> > numeric or complex vectors, NaN, and FALSE otherwise.

> Key point being "atomic vector" here.

and a Date vector *is* atomic .. (so I'm confused about what
that issue is .. but read on).


> On Sat, Jun 9, 2018 at 1:41 PM, Joris Meys wrote:

>> Hi Werner,
>> 
>> on ?is.na it says:
>> 
>> > The default method for anyNA handles atomic vectors
>> > without a class and NULL.
>> 
>> I hear you, and it is confusing to say the least. Looking
>> deeper, the culprit seems to be in the conversion of a
>> Date to POSIXlt prior to the formatting:
>> 
>> > x <- as.Date(Inf, origin = '1970-01-01')
>> > is.na(as.POSIXlt(x))
>> [1] TRUE
>> 
>> Given this implicit conversion, I'd argue that as.Date
>> should really return NA as well when passed an infinite
>> value. The other option is to provide an is.na method for
>> the Date class, which is -given is.na is an internal
>> generic- rather trivial:
>> 
>> > is.na.Date <- function(x) is.na(as.POSIXlt(x))
>> > is.na(x)
>> [1] TRUE
>> 
>> This might be a workaround for your current problem
>> without needing changes to R itself. But this will give a
>> "wrong" answer in the sense that this still works:
>> 
>> > Sys.Date() - x
>> Time difference of -Inf days
>> 

>> I personally would go for NA as the "correct" date for an
>> infinite value, but given that this will have
>> implications in other areas, there is a possibility of
>> breaking code and it should be investigated a bit further
>> imho.
>> Cheers
>> Joris

Indeed.  I could argue it is wrong to treat '+/- Inf' as NA for
dates (as well as for date times), because the Inf *does*
contain information in some sense:

 Infinitely far in the future
vs   Infinitely far in the past

which may make sense in some cases ... in the same sense that +Inf
and -Inf do make sense for numbers in some cases.
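
To make this concrete, here is a minimal sketch (not from the original thread) of infinite dates used as open-ended interval bounds. The helper name in_interval and the sample date are assumptions chosen purely for illustration.

```r
# Illustrative sketch: infinite Dates as open-ended interval bounds.
origin     <- "1970-01-01"
far_past   <- as.Date(-Inf, origin = origin)
far_future <- as.Date(Inf,  origin = origin)

# Hypothetical helper (not part of base R): does d lie within [start, end]?
in_interval <- function(d, start, end) start <= d & d <= end

d <- as.Date("2018-06-11")

far_past < d                          # comparisons use the underlying numbers
d < far_future
in_interval(d, far_past, far_future)  # interval open at both ends
in_interval(d, d + 1, far_future)     # d lies before this interval's start
is.na(far_future)                     # the default is.na() does not flag Inf
```

The comparisons work because a Date is stored as a number of days, so the usual ordering of -Inf and +Inf carries over.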

Martin

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Date class shows Inf as NA; this confuses the use of is.na()

2018-06-11 Thread Joris Meys
On Mon, Jun 11, 2018 at 11:12 AM, Martin Maechler <
maech...@stat.math.ethz.ch> wrote:

>
> and a Date vector *is* atomic .. (so I'm confused about what
> that issue is .. but read on).
>

Indeed. I tend to exclude everything with a formal class from "atomic" (e.g.
factors et al.) because they do behave differently sometimes, but
technically that's not correct. Thank you for pointing that out.


> Indeed.  I could argue it is wrong to treat '+/- Inf' as NA for
> dates (as well as for date times), because the Inf *does*
> contain information in some sense:
>
>  Infinitely far in the future
> vs   Infinitely far in the past
>
> which may make sense in some cases ... in the same sense that +Inf
> and -Inf do make sense for numbers in some cases.
>
> Martin
>

I considered that too. But as shown in the code above, anything that relies
on POSIXlt to process the date will actually convert the Inf value to NA.

The problem becomes a bit more confusing, because as.POSIXct() does not
convert to NA.

> x <-  as.Date(Inf, origin = '1970-01-01')
> is.na(x)
[1] FALSE
> is.na(as.POSIXct(x))
[1] FALSE
> is.na(as.POSIXlt(x))
[1] TRUE

I can guess why this happens. For a date that's infinitely far in the
future, it is impossible to determine an exact hour, minute, second, day,
month, ... So these values in the POSIXlt "list" format can't be anything
but NA.
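
A minimal sketch for checking this guess on one's own installation follows. It classifies dates by their underlying number, which does not depend on any conversion, and then prints a few POSIXlt components. The variable names are illustrative, and the POSIXlt output is deliberately not asserted here because it may differ between R versions.

```r
# Illustrative sketch: inspect what is stored in a Date versus what the
# POSIXlt conversion yields.
x <- as.Date(c(Inf, -Inf, NA, 17693), origin = "1970-01-01")  # 17693 = 2018-06-11

num <- unclass(x)   # the underlying number of days since the origin
is.na(num)          # flags only the genuine NA
is.infinite(num)    # flags the two infinite dates

# Components of the "list" representation for the infinite date.
# R 3.5.0 was reported in this thread to give NA after conversion;
# other versions may behave differently.
lt <- as.POSIXlt(x[1])
str(unclass(lt)[c("year", "mon", "mday")])

is.na(as.POSIXct(x[1]))   # the POSIXct conversion, for comparison
```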

So I totally understand the value of having Inf dates. The trade-off to
consider here is whether we strive for consistency among the different
datetime classes, or strive for correct representation of the actual value
of the date.

Cheers
Joris
-- 
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)




Re: [Rd] Date class shows Inf as NA; this confuses the use of is.na()

2018-06-11 Thread Emil Bode
I don't think there's much wrong with is.na(as_date(Inf, 
origin='1970-01-01')) being FALSE, as there still is some "non-NA-ness" about the 
value (as difftime shows). What is confusing is the printed output. The 
way cat treats it is clearer: it does print Inf.

So would this be a solution?

format.Date <- function (x, ...) 
{
  xx <- format(as.POSIXlt(x), ...)
  names(xx) <- names(x)
  # dates that are not NA themselves but whose formatted value is NA (e.g. Inf)
  bad <- is.na(xx) & !is.na(x)
  xx[bad] <- paste('Invalid date:', as.numeric(x[bad]))
  xx
}

Which causes this behaviour, which I think is clearer:

environment(print.Date) <- .GlobalEnv
x <- as_date(Inf, origin='1970-01-01')
print(x)
# [1] "Invalid date: Inf"
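
As a variation on the same idea, the sketch below leaves base R's format.Date untouched and uses a standalone helper that shows infinite dates as "Inf" or "-Inf", the way cat does. The name format_date_inf is hypothetical and this is only one possible approach, not something proposed in the thread.

```r
# Illustrative sketch: a standalone formatter (hypothetical name) that shows
# infinite dates as "Inf"/"-Inf" and leaves genuine NAs as NA.
format_date_inf <- function(x, ...) {
  stopifnot(inherits(x, "Date"))
  num <- unclass(x)                        # underlying number of days
  out <- rep(NA_character_, length(x))

  fin <- is.finite(num)
  out[fin] <- format(x[fin], ...)          # ordinary dates: usual formatting

  inf <- is.infinite(num)
  out[inf] <- ifelse(num[inf] > 0, "Inf", "-Inf")

  names(out) <- names(x)
  out
}

x <- as.Date(c(Inf, -Inf, NA, 17693), origin = "1970-01-01")
format_date_inf(x)
```

Because the helper decides on the stored number rather than on the result of the POSIXlt conversion, it does not depend on how that conversion treats infinite values.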

Best regards, 
Emil Bode
 
Data-analyst
 

> On Fri, Jun 8, 2018 at 11:21 PM, Werner Grundlingh wrote:
>
>> Indeed. as_date is from lubridate, but the same holds for as.Date.
>>
>> The output and its interpretation should be consistent, otherwise it
>> leads to confusion when programming. I understand that the difference
>> exists after asking a question on Stack Overflow:
>>   https://stackoverflow.com/q/50766089/914686
>> This understanding is never mentioned in the documentation - that an Inf
>> date is actually represented as NA:
>>   https://www.rdocumentation.org/packages/base/versions/3.5.0/topics/as.Date
>> So I'm of the impression that the display should be fixed as a first
>> option (thereby providing clarity/transparency in terms of back-end and
>> output), or the documentation amended (to highlight this) as a second
>> option.
>>

[Rd] Rgui 3.5.0 print issue

2018-06-11 Thread Diego Zardetto
Dear all,

I would like to have your opinion about an issue I have recently run into
while using tcltk in R 3.5.0 under Windows 7 64bit.

Here is a reproducible example of the issue, along with information about
platform and OS.

###############################################################
# R 3.5.0 issue: print does not work properly for data.frames #
#                when called from a tcltk window.             #
#                                                             #
# NOTE: The issue shows up when using Rgui, but disappears    #
#       if Rterm is used.                                     #
#                                                             #
# NOTE: The issue starts with R 3.5.0, and is still there     #
#       in R 3.5.0 patched build for Windows, as well as in   #
#       R-Devel.                                              #
###############################################################

# Reproducible example
library(tcltk)
data(cars)
win1 <- tktoplevel()
butOK <- tkbutton(win1, text = "OK", command = function() print(cars))
tkgrid(butOK)

# NOTE: Upon pressing OK, the rownames of cars are not printed on
#       screen, but end up in R's prompt.

################################
# R version and platform info. #
################################
> R.version
               _
platform       x86_64-w64-mingw32
arch           x86_64
os             mingw32
system         x86_64, mingw32
status
major          3
minor          5.0
year           2018
month          04
day            23
svn rev        74626
language       R
version.string R version 3.5.0 (2018-04-23)
nickname       Joy in Playing

##########################
# Operating system info. #
##########################
> Sys.info()
                     sysname                      release
                   "Windows"                      "7 x64"
                     version                     nodename
"build 7601, Service Pack 1"                    "PC79258"
                     machine                        login
                    "x86-64"                   "zardetto"
                        user               effective_user
                  "zardetto"                   "zardetto"

I would appreciate any feedback you could provide.

Thanks
D.




Re: [Rd] Date class shows Inf as NA; this confuses the use of is.na()

2018-06-11 Thread Gabe Becker
Emil et al.,


On Mon, Jun 11, 2018 at 1:08 AM, Emil Bode  wrote:

> I don't think there's much wrong with is.na(as_date(Inf,
> origin='1970-01-01'))==FALSE, as there still is some "non-NA-ness" about
> the value (as difftime shows), but that the output when printing is
> confusing. The way cat is treating it is clearer: it does print Inf.
>
> So would this be a solution?
>
> format.Date <- function (x, ...)
> {
>   xx <- format(as.POSIXlt(x), ...)
>   names(xx) <- names(x)
>   xx[is.na(xx) & !is.na(x)] <- paste('Invalid date:', as.numeric(x[is.na(xx) & !is.na(x)]))
>   xx
> }
>
> Which causes this behaviour, which I think is clearer:
>
> environment(print.Date) <- .GlobalEnv
> x <- as_date(Inf, origin='1970-01-01')
> print(x)
> # [1] "Invalid date: Inf"
>

In my opinion, it's either invalid or it isn't. If it's actually invalid,
as_date (and the equivalent core function, which is what is actually relevant
on this list) should fail, because it's an invalid date.

If it *isn't* invalid, having the print method tell users it is seems
problematic.

And I think people seem to be leaning towards it not being invalid. A bit
surprising to me, as my personal first thought was that infinite dates
don't make any sense, but I don't really have a horse in this race and so
defer to the cooler heads that are saying having an infinite date perhaps
should not be disallowed explicitly. If it's not, though, it's not invalid
and we shouldn't confuse users by saying it is, imho.
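
For illustration only, the "fail if invalid" option could look like the sketch below. The wrapper name as_finite_date and the error message are hypothetical. This is not how as.Date behaves and it was not proposed as a patch in this thread.

```r
# Illustrative sketch of the "it's invalid, so fail" option: a hypothetical
# wrapper around as.Date() that refuses infinite input.
as_finite_date <- function(x, origin = "1970-01-01") {
  if (any(is.infinite(x))) {
    stop("infinite values cannot be converted to a Date")
  }
  as.Date(x, origin = origin)
}

as_finite_date(17693)   # a finite day count converts as usual

# Capture the error message instead of stopping the script.
res <- tryCatch(as_finite_date(Inf), error = function(e) conditionMessage(e))
res
```

Under the other option, where infinite dates stay allowed, no such check is needed and only the printed representation would have to change.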

Best,
~G

