>>>>> Martin Maechler >>>>> on Wed, 12 Oct 2022 10:17:28 +0200 writes:
>>>>> Kurt Hornik >>>>> on Tue, 11 Oct 2022 16:44:13 +0200 writes:
>>>>> Davis Vaughan writes:

>>> I've got a bit more information about this one. It seems like it
>>> (only? not sure) appears when `TZ = "UTC"`, which is why I didn't see
>>> it before on my Mac, which defaults to `TZ = ""`. I think this is at
>>> least explainable by the fact that those "optional" fields aren't
>>> technically needed when the time zone is UTC.

>> Exactly. Debugging `[<-.POSIXlt` with

>>     x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
>>     Sys.setenv(TZ = "UTC")
>>     x[1] <- NA

>> shows we get into

>>     value <- unclass(as.POSIXlt(value))
>>     if (ici) {
>>         for (n in names(x)) names(x[[n]]) <- nms
>>     }
>>     for (n in names(x)) x[[n]][i] <- value[[n]]

>> where

>>     Browse[2]> names(value)
>>     [1] "sec" "min" "hour" "mday" "mon" "year" "wday" "yday" "isdst"

>>     Browse[2]> names(x)
>>     [1] "sec" "min" "hour" "mday" "mon" "year" "wday" "yday"
>>     [9] "isdst" "zone" "gmtoff"

>> Without having looked at the code, the docs say

>>     ‘zone’ (Optional.) The abbreviation for the time zone in force at
>>         that time: ‘""’ if unknown (but ‘""’ might also be used for
>>         UTC).

>>     ‘gmtoff’ (Optional.) The offset in seconds from GMT: positive
>>         values are East of the meridian. Usually ‘NA’ if unknown,
>>         but ‘0’ could mean unknown.

>> so perhaps we should fill with the values for the unknown case?

>> -k

> Well,

> I think you both know I'm in the midst of dealing with these
> issues, to fix both
>     [.POSIXlt and
>     [<-.POSIXlt

> Yes, one needs a way to not only "fill" the partially filled
> entries but also to *normalize* out-of-range values
> (say negative seconds, minutes > 60, etc)

> All this is available in our C code, but not on the R level,
> so yesterday, I wrote a C function to be called via .Internal(.)
> from a new R function that provides this.
> Provisionally called
>     balancePOSIXlt()

> because it both balances the 9 to 11 list-components of POSIXlt
> and it also puts all numbers of (sec, min, hour, mday, mon)
> into a correct range (and also correctly computes wday and yday numbers).

> but I'm happy for proposals of better names.
> I had contemplated validatePOSIXlt() as alternative, but then
> dismissed that as in some sense we now do agree that
> "imbalanced" POSIXlt's are not really invalid ..

> .. and yes, to Davis: Even though I've spent so many hours with
> POSIXlt, POSIXct and Date during the last week, I'm still
> surprised more often than I like by the effects of timezone
> settings there.

> Martin

I have committed the new R and C code now, defining
balancePOSIXlt(), to get feedback from the community.

I've extended the documentation in help(DateTimeClasses), and
notably factored out the description of POSIXlt mentioning the
"ragged" and "out-of-range" cases.

This needs more testing and experiments, and I have not
announced it in NEWS yet.

Planned next is to use it in [.POSIXlt and [<-.POSIXlt
so they will work correctly.

But please share your thoughts, propositions, ...
Martin

>>> I can reproduce this now on my personal Mac:

>>> ```
>>> x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
>>> Sys.setenv(TZ = "")
>>> x[1] <- NA
>>> x
>>> #> [1] NA
>>> x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
>>> Sys.setenv(TZ = "America/New_York")
>>> x[1] <- NA
>>> x
>>> #> [1] NA
>>> x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
>>> Sys.setenv(TZ = "UTC")
>>> x[1] <- NA
>>> #> Error in x[[n]][i] <- value[[n]] : replacement has length zero
>>> x
>>> #> [1] "2013-01-31 CST"
>>> ```

>>> Here are `sessionInfo()` and `Sys.getenv("TZ")` outputs for 3 GitHub
>>> Actions platforms where the bug exists (note they all set `TZ = "UTC"`!):

>>> Linux:

>>> ```
>>> > sessionInfo()
>>> R version 4.2.1 (2022-06-23)
>>> Platform: x86_64-pc-linux-gnu (64-bit)
>>> Running under: Ubuntu 18.04.6 LTS
>>> Matrix products: default
>>> BLAS: /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
>>> LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so
>>> locale:
>>> [1] LC_CTYPE=C.UTF-8 LC_NUMERIC=C LC_TIME=C.UTF-8
>>> [4] LC_COLLATE=C.UTF-8 LC_MONETARY=C.UTF-8 LC_MESSAGES=C.UTF-8
>>> [7] LC_PAPER=C.UTF-8 LC_NAME=C LC_ADDRESS=C
>>> [10] LC_TELEPHONE=C LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>> loaded via a namespace (and not attached):
>>> [1] compiler_4.2.1
>>> > Sys.getenv("TZ")
>>> [1] "UTC"
>>> ```

>>> Mac:

>>> ```
>>> > sessionInfo()
>>> R version 4.2.1 (2022-06-23)
>>> Platform: x86_64-apple-darwin17.0 (64-bit)
>>> Running under: macOS Big Sur ... 10.16
>>> Matrix products: default
>>> BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
>>> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
>>> locale:
>>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>> loaded via a namespace (and not attached):
>>> [1] compiler_4.2.1
>>> > Sys.getenv("TZ")
>>> [1] "UTC"
>>> ```

>>> Windows:

>>> This is the best I can get you, sorry (remote worker issues), but note that
>>> it does also say `tz UTC` like the others.

>>> ```
>>> version  R version 4.2.1 (2022-06-23 ucrt)
>>> os       Windows Server x64 (build 20348)
>>> system   x86_64, mingw32
>>> ui       RTerm
>>> language (EN)
>>> collate  English_United States.utf8
>>> ctype    English_United States.utf8
>>> tz       UTC
>>> date     2022-10-11
>>> ```

>>> And here is my Mac where the bug doesn't show up by default because
>>> `TZ = ""`:

>>> ```
>>> > sessionInfo()
>>> R version 4.2.1 (2022-06-23)
>>> Platform: x86_64-apple-darwin17.0 (64-bit)
>>> Running under: macOS Big Sur ... 10.16
>>> Matrix products: default
>>> BLAS: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
>>> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
>>> locale:
>>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>> attached base packages:
>>> [1] stats graphics grDevices utils datasets methods base
>>> loaded via a namespace (and not attached):
>>> [1] compiler_4.2.1
>>> > Sys.getenv("TZ")
>>> [1] ""
>>> > Sys.timezone()
>>> [1] "America/New_York"
>>> ```

>>> -Davis

>>> On Thu, Oct 6, 2022 at 9:33 AM Davis Vaughan <da...@rstudio.com> wrote:

>>>> Hi all,
>>>>
>>>> I have found another POSIXlt bug while I've been fiddling around with it.
>>>> This one only appears on specific OSes, because it has to do with the fact
>>>> that the `gmtoff` field is optional, and isn't always used on all OSes. It
>>>> also doesn't seem to be specific to r-devel, I think it has been there
>>>> awhile.
>>>>
>>>> Here is the bug:
>>>>
>>>> ```
>>>> x <- as.POSIXlt(as.POSIXct("2013-01-31", tz = "America/Chicago"))
>>>>
>>>> # Oh no!
>>>> x[1] <- NA
>>>> #> Error in x[[n]][i] <- value[[n]] : replacement has length zero
>>>> ```
>>>>
>>>> If you look at the objects, you can see that `x` has a `gmtoff` field, but
>>>> `NA` (when converted to POSIXlt, which is what `[<-.POSIXlt` does) does not:
>>>>
>>>> ```
>>>> unclass(x)
>>>> #> $sec
>>>> #> [1] 0
>>>> #>
>>>> #> $min
>>>> #> [1] 0
>>>> #>
>>>> #> $hour
>>>> #> [1] 0
>>>> #>
>>>> #> $mday
>>>> #> [1] 31
>>>> #>
>>>> #> $mon
>>>> #> [1] 0
>>>> #>
>>>> #> $year
>>>> #> [1] 113
>>>> #>
>>>> #> $wday
>>>> #> [1] 4
>>>> #>
>>>> #> $yday
>>>> #> [1] 30
>>>> #>
>>>> #> $isdst
>>>> #> [1] 0
>>>> #>
>>>> #> $zone
>>>> #> [1] "CST"
>>>> #>
>>>> #> $gmtoff
>>>> #> [1] -21600
>>>> #>
>>>> #> attr(,"tzone")
>>>> #> [1] "America/Chicago" "CST" "CDT"
>>>>
>>>> unclass(as.POSIXlt(NA))
>>>> #> $sec
>>>> #> [1] NA
>>>> #>
>>>> #> $min
>>>> #> [1] NA
>>>> #>
>>>> #> $hour
>>>> #> [1] NA
>>>> #>
>>>> #> $mday
>>>> #> [1] NA
>>>> #>
>>>> #> $mon
>>>> #> [1] NA
>>>> #>
>>>> #> $year
>>>> #> [1] NA
>>>> #>
>>>> #> $wday
>>>> #> [1] NA
>>>> #>
>>>> #> $yday
>>>> #> [1] NA
>>>> #>
>>>> #> $isdst
>>>> #> [1] -1
>>>> #>
>>>> #> attr(,"tzone")
>>>> #> [1] "UTC"
>>>> ```
>>>>
>>>> The problem seems to be that `[<-.POSIXlt` assumes that if the field was
>>>> there in `x` then it must also be there in `value`:
>>>>
>>>> https://github.com/wch/r-source/blob/e10a971dee6a0ab851279c183cc21954d66b3be4/src/library/base/R/datetime.R#L1303-L1304
>>>>
>>>> But this isn't the case for the `NA` value that was converted to POSIXlt.
>>>>
>>>> I can't reproduce this on my personal Mac, but it affects the Linux, Mac,
>>>> and Windows machines we use for the lubridate CI checks through GitHub
>>>> Actions.
>>>>
>>>> Thanks,
>>>> Davis
>>>>

> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
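
For readers following the thread, here is a minimal sketch of Kurt's
suggestion: fill the missing optional fields with their documented
"unknown" values before the field-wise assignment. It uses plain lists
in place of real POSIXlt objects, with the values copied from the
unclass() output quoted in the thread, so that it behaves the same
whatever the R version or TZ setting. `fill_optional_fields()` is a
hypothetical helper written for illustration only. It is not the
balancePOSIXlt() that was committed, and it does none of the
out-of-range normalization Martin describes.

```r
# Hypothetical helper (an assumption, not part of base R): pad `value` so it
# has every field that `template` has, using the "unknown" values given in
# the documentation quoted in the thread ("" for zone, NA for gmtoff).
fill_optional_fields <- function(value, template) {
  len <- length(value[["sec"]])
  unknown <- list(zone = "", gmtoff = NA_integer_)
  for (nm in setdiff(names(template), names(value))) {
    if (is.null(unknown[[nm]]))
      stop("no 'unknown' default for field: ", nm)
    value[[nm]] <- rep(unknown[[nm]], len)
  }
  # Return the fields in the same order as the template
  value[names(template)]
}

# Stand-in for unclass(x): all 11 fields are present
x <- list(sec = 0, min = 0L, hour = 0L, mday = 31L, mon = 0L, year = 113L,
          wday = 4L, yday = 30L, isdst = 0L, zone = "CST", gmtoff = -21600L)

# Stand-in for unclass(as.POSIXlt(NA)) under TZ = "UTC": only 9 fields
value <- list(sec = NA_real_, min = NA_integer_, hour = NA_integer_,
              mday = NA_integer_, mon = NA_integer_, year = NA_integer_,
              wday = NA_integer_, yday = NA_integer_, isdst = -1L)

value <- fill_optional_fields(value, x)

# The assignment loop from `[<-.POSIXlt`, which failed with
# "replacement has length zero" when `zone` and `gmtoff` were absent
i <- 1L
for (n in names(x)) x[[n]][i] <- value[[n]]
```

With the padding in place the loop runs to completion, leaving `zone`
as "" and `gmtoff` as NA for the replaced element.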