Re: [Rd] Bug with `[<-.POSIXlt` on specific OSes

2022-10-12 Thread Martin Maechler
> Kurt Hornik 
> on Tue, 11 Oct 2022 16:44:13 +0200 writes:

> Davis Vaughan writes:
>> I've got a bit more information about this one. It seems like it
>> (only? not sure) appears when `TZ = "UTC"`, which is why I didn't see
>> it before on my Mac, which defaults to `TZ = ""`. I think this is at
>> least explainable by the fact that those "optional" fields aren't
>> technically needed when the time zone is UTC.

> Exactly.  Debugging `[<-.POSIXlt` with

> x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
> Sys.setenv(TZ = "UTC")
> x[1] <- NA

> shows we get into

> value <- unclass(as.POSIXlt(value))
> if (ici) {
> for (n in names(x)) names(x[[n]]) <- nms
> }
> for (n in names(x)) x[[n]][i] <- value[[n]]

> where

> Browse[2]> names(value)
> [1] "sec"   "min"   "hour"  "mday"  "mon"   "year"  "wday"  "yday"  "isdst"
> Browse[2]> names(x)
> [1] "sec"    "min"    "hour"   "mday"   "mon"    "year"   "wday"   "yday"
> [9] "isdst"  "zone"   "gmtoff"

> Without having looked at the code, the docs say

> ‘zone’ (Optional.) The abbreviation for the time zone in force at
> that time: ‘""’ if unknown (but ‘""’ might also be used for
> UTC).

> ‘gmtoff’ (Optional.) The offset in seconds from GMT: positive
> values are East of the meridian.  Usually ‘NA’ if unknown,
> but ‘0’ could mean unknown.

> so perhaps we should fill with the values for the unknown case?

> -k
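
The failure and the suggested fill can be replayed outside the debugger. This is only a minimal sketch: the two lists are shortened stand-ins for the unclassed objects shown above, and fill_optional() is a hypothetical helper, not code from R itself.

```r
# Stand-ins for the unclassed POSIXlt lists seen in the debugger
# (shortened to three fields for illustration).
x     <- list(sec = 0, zone = "CST", gmtoff = -21600L)
value <- list(sec = NA_real_)            # no 'zone', no 'gmtoff'

# Replaying the assignment loop fails at the first field missing from
# 'value', because value[["zone"]] is NULL.
msg <- tryCatch({
  for (n in names(x)) x[[n]][1L] <- value[[n]]
  "no error"
}, error = function(e) conditionMessage(e))
msg

# Hypothetical helper: pad the optional fields with the values the
# documentation gives for the "unknown" case.
fill_optional <- function(v) {
  n <- length(v$sec)
  if (is.null(v$zone))   v$zone   <- rep("", n)
  if (is.null(v$gmtoff)) v$gmtoff <- rep(NA_integer_, n)
  v
}

value <- fill_optional(value)
for (n in names(x)) x[[n]][1L] <- value[[n]]
str(x)
```

Once both lists carry the same set of names, the loop no longer assigns a zero-length replacement.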

Well,

I think you both know  I'm in the midst of dealing with these
issues, to fix both

[.POSIXlt  and
[<-.POSIXlt

Yes, one needs a way to not only "fill" the partially filled
entries but also to *normalize* out-of-range values
(say negative seconds, minutes > 60, etc)
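
For illustration, an out-of-range field can already be normalized at the R level by a round trip through POSIXct. This is a rough sketch of that idea only, assuming base R, and is not the new C-level function described below.

```r
# Start from a regular POSIXlt and push 'sec' out of its usual range.
lt <- as.POSIXlt("2022-10-12 10:00:00", tz = "UTC")
lt$sec <- 90                       # 90 seconds; 'min' is still 0
c(min = lt$min, sec = lt$sec)

# Converting to POSIXct and back re-derives every field from the
# underlying instant, so the excess seconds carry over into 'min'.
fixed <- as.POSIXlt(as.POSIXct(lt), tz = "UTC")
c(hour = fixed$hour, min = fixed$min, sec = fixed$sec)
```

The round trip works on the instant in time only, so it says nothing about filling list components of unequal length.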

All this is available in our C code, but not on the R level,
so yesterday, I wrote a C function to be called via .Internal(.)
from a new R function that provides this.

Provisionally called

   balancePOSIXlt()

because it both balances the 9 to 11 list-components of POSIXlt
and puts all numbers of (sec, min, hour, mday, mon)
into a correct range (and also computes correct wday and yday numbers),
but I'm happy for proposals of better names.
I had contemplated  validatePOSIXlt() as alternative, but then
dismissed that as in some sense we now do agree that
"imbalanced" POSIXlt's are not really invalid ..

.. and yes, to Davis:  Even though I've spent so many hours with
POSIXlt, POSIXct and Date during the last week, I'm still
surprised more often than I like by the effects of timezone
settings there.

Martin


>> I can reproduce this now on my personal Mac:

>> ```

>> x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))

>> Sys.setenv(TZ = "")

>> x[1] <- NA

>> x

>> #> [1] NA


>> x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))

>> Sys.setenv(TZ = "America/New_York")

>> x[1] <- NA

>> x

>> #> [1] NA


>> x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))

>> Sys.setenv(TZ = "UTC")

>> x[1] <- NA
>> #> Error in x[[n]][i] <- value[[n]] : replacement has length zero

>> x

>> #> [1] "2013-01-31 CST"
>> ```

>> Here are `sessionInfo()` and `Sys.getenv("TZ")` outputs for 3 GitHub
>> Actions platforms where the bug exists (note they all set `TZ = "UTC"`!):

>> Linux:

>> ```

>>> sessionInfo()

>> R version 4.2.1 (2022-06-23)

>> Platform: x86_64-pc-linux-gnu (64-bit)

>> Running under: Ubuntu 18.04.6 LTS


>> Matrix products: default

>> BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3

>> LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so


>> locale:

>> [1] LC_CTYPE=C.UTF-8   LC_NUMERIC=C   LC_TIME=C.UTF-8

>> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8

>> [7] LC_PAPER=C.UTF-8   LC_NAME=C  LC_ADDRESS=C

>> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C


>> attached base packages:

>> [1] stats graphics  grDevices utils datasets  methods   base


>> loaded via a namespace (and not attached):

>> [1] compiler_4.2.1


>>> Sys.getenv("TZ")

>> [1] "UTC"
>> ```

>> Mac:

>> ```

>>> sessionInfo()

>> R version 4.2.1 (2022-06-23)

>> Platform: x86_64-apple-darwin17.0 (64-bit)

>> Running under: macOS Big Sur ... 10.16


>> Matrix products: default

>> BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib

>> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib


>> locale:

>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8


>> attached base packages:

>> [1] stats graphics  grDevices utils datasets  methods   base


>> loaded via a namespace (and not attached):

>> [1

Re: [Rd] A potential POSIXlt->Date bug introduced in r-devel

2022-10-12 Thread Prof Brian Ripley
Confirmed on Fedora 36, which has a 32-bit time_t for an i686 compile.  I 
was a bit surprised that this has not been changed, but gather that Linux 
distros prefer to drop ix86 rather than fix it.


There is a simple workaround, to configure R with 
--with-internal-tzcode, which always uses a 64-bit time_t.  Given that 
2038 is not that far away, avoiding 32-bit time_t is generally a very 
good idea (not just for people working with dates in 5881580!).
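
The numbers behind both remarks can be checked with a little arithmetic. A minimal sketch in base R; the mean Gregorian year length of 365.2425 days is an approximation introduced here for illustration, not what the C code uses.

```r
# Largest value a signed 32-bit time_t can hold, shown as a UTC date-time.
last32 <- .POSIXct(2^31 - 1, tz = "UTC")
format(last32, "%Y-%m-%d %H:%M:%S")

# The failing regression test uses day number 2^31 + 10
# (days since 1970-01-01).  As seconds this is far beyond 32 bits.
days <- 2^31 + 10
secs <- days * 86400
secs > 2^31 - 1

# Approximate calendar year via the mean Gregorian year length.
approx_year <- 1970 + floor(days / 365.2425)
approx_year            # the "5881580" mentioned above
approx_year - 1900     # the $year value expected by the test
```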


That test should not really be run on platforms with 32-bit time_t, but 
that is not currently known at R level.


On 06/10/2022 13:38, Prof Brian Ripley wrote:

On 06/10/2022 09:41, Berwin A Turlach wrote:

G'day all,

On Thu, 6 Oct 2022 10:15:29 +0200
Martin Maechler  wrote:


Davis Vaughan
 on Wed, 5 Oct 2022 17:04:11 -0400 writes:



 > # Weird, where is the `NA`?
 > as.Date(x)
 > #> [1] "2013-01-31" "1970-01-01" "2013-03-31"
 > ```

I agree that the above is wrong, i.e., a bug in current  R-devel.


I have no intention of hijacking this thread, but I wonder whether this
is a good opportunity to mention that the 32 bit build of R-devel falls
over on my machine since 25 September.  It fails one of the regression
tests in reg-tests-1d.R.  The final lines of reg-tests-1d.Rout.fail
are:


tools::Rd2txt(rd, out <- textConnection(NULL, "w"), fragment = TRUE)
stopifnot(any(as.character(rd) != "\n"),

+   identical(textConnectionValue(out)[2L], "LaTeX"));
close(out)

## empty output in R <= 4.2.x


Yes, known for a few days on the R-core list. I am in the middle of an 
OS upgrade on that machine and won't have time to do more than report 
until that (and all the re-building and re-checking) is complete.



## as.POSIXlt()  gave integer overflow
stopifnot(as.POSIXlt(.Date(2^31 + 10))$year == 5879680L)

Error: as.POSIXlt(.Date(2^31 + 10))$year == 5879680L is not TRUE
Execution halted


I should have reported this earlier, but somehow did not find the time
to do so.  So I thought I would mention it here. :)

Cheers,

Berwin

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel






--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford



Re: [Rd] Bug with `[<-.POSIXlt` on specific OSes

2022-10-12 Thread Martin Maechler
> Martin Maechler 
> on Wed, 12 Oct 2022 10:17:28 +0200 writes:

> Kurt Hornik 
> on Tue, 11 Oct 2022 16:44:13 +0200 writes:

> Davis Vaughan writes:
>>> I've got a bit more information about this one. It seems like it
>>> (only? not sure) appears when `TZ = "UTC"`, which is why I didn't see
>>> it before on my Mac, which defaults to `TZ = ""`. I think this is at
>>> least explainable by the fact that those "optional" fields aren't
>>> technically needed when the time zone is UTC.

>> Exactly.  Debugging `[<-.POSIXlt` with

>> x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
>> Sys.setenv(TZ = "UTC")
>> x[1] <- NA

>> shows we get into

>> value <- unclass(as.POSIXlt(value))
>> if (ici) {
>> for (n in names(x)) names(x[[n]]) <- nms
>> }
>> for (n in names(x)) x[[n]][i] <- value[[n]]

>> where

>> Browse[2]> names(value)
>> [1] "sec"   "min"   "hour"  "mday"  "mon"   "year"  "wday"  "yday"  "isdst"
>> Browse[2]> names(x)
>> [1] "sec"    "min"    "hour"   "mday"   "mon"    "year"   "wday"   "yday"
>> [9] "isdst"  "zone"   "gmtoff"

>> Without having looked at the code, the docs say

>> ‘zone’ (Optional.) The abbreviation for the time zone in force at
>> that time: ‘""’ if unknown (but ‘""’ might also be used for
>> UTC).

>> ‘gmtoff’ (Optional.) The offset in seconds from GMT: positive
>> values are East of the meridian.  Usually ‘NA’ if unknown,
>> but ‘0’ could mean unknown.

>> so perhaps we should fill with the values for the unknown case?

>> -k

> Well,

> I think you both know  I'm in the midst of dealing with these
> issues, to fix both

> [.POSIXlt  and
> [<-.POSIXlt

> Yes, one needs a way to not only "fill" the partially filled
> entries but also to *normalize* out-of-range values
> (say negative seconds, minutes > 60, etc)

> All this is available in our C code, but not on the R level,
> so yesterday, I wrote a C function to be called via .Internal(.)
> from a new R function that provides this.

> Provisionally called

> balancePOSIXlt()

> because it both balances the 9 to 11 list-components of POSIXlt
> and puts all numbers of (sec, min, hour, mday, mon)
> into a correct range (and also computes correct wday and yday numbers),
> but I'm happy for proposals of better names.
> I had contemplated  validatePOSIXlt() as alternative, but then
> dismissed that as in some sense we now do agree that
> "imbalanced" POSIXlt's are not really invalid ..

> .. and yes, to Davis:  Even though I've spent so many hours with
> POSIXlt, POSIXct and Date during the last week, I'm still
> surprised more often than I like by the effects of timezone
> settings there.

> Martin

I have committed the new R and C code now, defining  balancePOSIXlt(),
to get feedback from the community.

I've extended the documentation in  help(DateTimeClasses),
and notably factored out the description
of  POSIXlt  mentioning the  "ragged" and "out-of-range" cases.

This needs more testing and experiments, and I have not
announced it in  NEWS  yet.

Planned next is to use it in  [.POSIXlt and [<-.POSIXlt
so they will work correctly.

But please share your thoughts, propositions, ...

Martin


>>> I can reproduce this now on my personal Mac:

>>> ```

>>> x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))

>>> Sys.setenv(TZ = "")

>>> x[1] <- NA

>>> x

>>> #> [1] NA


>>> x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))

>>> Sys.setenv(TZ = "America/New_York")

>>> x[1] <- NA

>>> x

>>> #> [1] NA


>>> x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))

>>> Sys.setenv(TZ = "UTC")

>>> x[1] <- NA
>>> #> Error in x[[n]][i] <- value[[n]] : replacement has length zero

>>> x

>>> #> [1] "2013-01-31 CST"
>>> ```

>>> Here are `sessionInfo()` and `Sys.getenv("TZ")` outputs for 3 GitHub
>>> Actions platforms where the bug exists (note they all set `TZ = "UTC"`!):

>>> Linux:

>>> ```

>>>> sessionInfo()

>>> R version 4.2.1 (2022-06-23)

>>> Platform: x86_64-pc-linux-gnu (64-bit)

>>> Running under: Ubuntu 18.04.6 LTS


>>> Matrix products: default

>>> BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3

>>> LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so


>>> locale:

>>> [1] LC_CTYPE=C.UTF-8   LC_NUMERIC=C   LC_TIME=C.UTF-8

>>> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8

>>> [7] LC_PAPER=C.UTF-8   LC_NAME=C  LC_ADDRESS=C

>>> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C


>>> attached base packages:

>>> [1] stats graphics  grDevices utils datasets  methods   base


>>> loaded via 

Re: [Rd] Question about grid.group compositing operators in cairo

2022-10-12 Thread Paul Murrell

Hi

This issue has expanded to include the behaviour of compositing 
operators in R graphics more generally.


For the record, the discussion is continuing here ...

https://github.com/pmur002/rgraphics-compositing

Paul

On 4/10/22 09:20, Paul Murrell wrote:


Interim update:  I have spoken with Thomas Lin Pedersen (cc'ed), the 
author/maintainer of 'ragg' and 'svglite', who is working on adding 
group support for those graphics devices and he has voted in support of 
the current Cairo implementation, so the needle has shifted towards 
Cairo at this stage.


I still want to do more tests on other devices to gather more evidence.

Paul

p.s.  Attached (if it makes it through the filters) is a manual 
modification of your original dsvg() example that has been changed so 
that it produces the Cairo result.  This is probably not exactly how you 
would want to implement the dsvg() solution, but it is at least a proof 
of concept that the Cairo result can be produced in SVG.


On 30/09/22 10:49, Paul Murrell wrote:

Hi

Some more thoughts ...

<1>
I said before that currently, dev->group() does this ...

[OVER] shape shape shape OP shape shape shape

... and one option would be an implicit group on 'src' and 'dst' like 
this ...


([OVER] shape shape shape) OP ([OVER] shape shape shape)

... but another approach could be just an implicit group on each 
shape, like this ...


[OVER] ([OVER] shape) ([OVER] shape) OP ([OVER] shape) ([OVER] shape)

That may be a better representation of what you are already doing with 
dsvg() ?  It may also better reflect what naturally occurs in some 
graphics systems.


<2>
Changing the Cairo implementation to work like that would I think 
produce the same result as your dsvg() for ...


grid.group(src, "in", dst)

... and it would make what constitutes more than one shape much less 
surprising ...


gList(rectGrob(), rectGrob())  ## multiple shapes (obviously)
rectGrob(width=1:2/2)  ## multiple shapes (less obvious)
rectGrob(gp=gpar(col=, fill=)) ## NOT multiple shapes (no surprise)

... and it should not break any pre-existing non-group behaviour.

<3>
One casualty from this third option would be that the following would 
no longer solve the overlapping fill and stroke problem ...


grid.group(overlapRect, "source")

... although the fact that that currently works is really a bit 
surprising AND that result could still be achieved by explicitly 
drawing separate shapes ...


grid.group(rectGrob(gp=gpar(col=rgb(1,0,0,.5), lwd=20, fill=NA)),
    "source",
    rectGrob(gp=gpar(col=NA, fill="green")))
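
To see why "source" avoids the blended overlap, the two operators can be worked through by hand. This is a minimal sketch using the standard Porter-Duff formulas on premultiplied RGBA values; the helper names are hypothetical and nothing here is taken from the Cairo device code.

```r
# Premultiplied colour as c(r, g, b, a).
premul <- function(r, g, b, a) c(r * a, g * a, b * a, a)

# Porter-Duff operators on premultiplied colours.
op_over   <- function(src, dst) src + dst * (1 - src[4])
op_source <- function(src, dst) src

stroke <- premul(1, 0, 0, 0.5)   # semitransparent red border
fill   <- premul(0, 1, 0, 1)     # opaque green fill

# Where the border overlaps the fill:
op_over(stroke, fill)      # red and green mix
op_source(stroke, fill)    # only the semitransparent red survives
```

With "over" the overlap comes out as an opaque mix of red and green, while "source" leaves only the semitransparent red, so the border looks the same inside and outside the filled area.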

<4>
I need to try some of this out and also check in with some other 
people who I think are working on implementing groups on different 
graphics devices.


<5>
In summary, don't go changing dsvg() too much just yet!

Paul

On 29/09/2022 1:30 pm, Paul Murrell wrote:

Hi

Would it work to explicitly record a filled-and-stroked shape as two 
separate elements (one only filled and one only stroked) ?


Then it should only be as hard to apply the active operator on both 
of those elements as it is to apply the active operator to more than 
one shape (?)
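
On the R side the same idea might look roughly like this. It is a sketch only: splitRect() is a hypothetical helper, and how dsvg() would record the two elements internally is not shown.

```r
library(grid)

# Hypothetical helper: describe one filled-and-stroked rectangle as two
# grobs, one that is only filled and one that is only stroked.
splitRect <- function(fill, col, lwd, ...) {
  gList(rectGrob(..., gp = gpar(fill = fill, col = NA)),            # fill only
        rectGrob(..., gp = gpar(fill = NA, col = col, lwd = lwd)))  # stroke only
}

parts <- splitRect(fill = "green", col = rgb(1, 0, 0, .5), lwd = 20,
                   width = .5, height = .5)
length(parts)
```

Drawing the result with grid.draw(parts) on an open device applies the fill first and the stroke second, as two separate shapes.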


Paul

On 29/09/22 10:17, Panagiotis Skintzos wrote:

Thank you for the very thorough explanation Paul.

To answer your question on 11: The dsvg device, simply defines svg
elements with their attributes (rect with fill & stroke in my 
examples).

It does not do any internal image processing like cairo.

My concern is how to proceed with the implementation in dsvg.

If I leave it as it is now, there will be cases where it will give
different results from cairo (and perhaps other devices that will
implement group compositing in similar way).

On the other hand, it would be quite challenging in practice to simulate
the cairo implementation and apply first the fill and then the stroke
with the active operator, on the element itself.

Any suggestions? :-)

Panagiotis


On 28/9/22 02:56, Paul Murrell wrote:
 > Hi
 >
 > Thanks for the code (and for the previous attachments).
 >
 > Some thoughts so far (HTML version with images attached) ...
 >
 > <1>
 > As you have pointed out, the Cairo device draws a stroked-and-filled
 > shape with two separate drawing operations: the path is filled and
 > then the path is stroked.  I do not believe that there is any
 > alternative in Cairo graphics (apart from filling and stroking as an
 > isolated group and then drawing the group, which we will come back to).
 >
 > <2>
 > This fill-then-stroke approach is easy to demonstrate just with a thick
 > semitransparent border ...
 >
 > library(grid)
 > overlapRect <- rectGrob(width=.5, height=.5,
 >     gp=gpar(fill="green", lwd=20,
 >     col=rgb(1,0,0,.5)))
 > grid.newpage()
 > grid.draw(overlapRect)
 >
 > <3>
 > This fill-then-stroke approach is what happens on many (most?)
 > graphics devices, including, for example, the core windows() device,
 > the core quartz() device, the 'ragg'