Re: [Rd] R-devel: as.character() for hexmode no longer pads with zeros
> Henrik Bengtsson
>     on Wed, 22 Sep 2021 20:48:05 -0700 writes:

> The update in rev 80946
> (https://github.com/wch/r-source/commit/d970867722e14811e8ba6b0ba8e0f478ff482f5e)
> caused as.character() on hexmode objects to no longer pad with zeros.

Yes -- very much on purpose; by me, after discussing a related issue
within R-core which showed "how wrong" the previous (current R)
behavior of the as.character() method is for
hexmode and octmode objects:

If you look at the whole rev 80946, you also read NEWS

 * as.character() for "hexmode" or "octmode" objects now
   fulfills the important basic rule

      as.character(x)[j] === as.character(x[j])
                          ^

   rather than just calling format().

The format() generic (notably for "atomic-alike" objects) should indeed
return a character vector where each string has the same "width",
however, the result of as.character(x) --- at least for all
"atomic-alike" / "vector-alike" objects --
for a single x[j] should not be influenced by other elements in x.

> Before:

> > x <- structure(as.integer(c(0,8,16,24,32)), class="hexmode")
> > x
> [1] "00" "08" "10" "18" "20"
> > as.character(x)
> [1] "00" "08" "10" "18" "20"

> After:

> > x <- structure(as.integer(c(0,8,16,24,32)), class="hexmode")
> > x
> [1] "00" "08" "10" "18" "20"
> > as.character(x)
> [1] "0" "8" "10" "18" "20"

> Was that intended?

Yes!
You have to explore your example a bit to notice how "illogical"
the behavior before was:

    > as.character(as.hexmode(0:15))
     [1] "0" "1" "2" "3" "4" "5" "6" "7" "8" "9" "a" "b" "c" "d" "e" "f"
    > as.character(as.hexmode(0:16))
     [1] "00" "01" "02" "03" "04" "05" "06" "07" "08" "09" "0a" "0b" "0c" "0d" "0e"
    [16] "0f" "10"

    > as.character(as.hexmode(16^(0:2)))
    [1] "001" "010" "100"
    > as.character(as.hexmode(16^(0:3)))
    [1] "0001" "0010" "0100" "1000"
    > as.character(as.hexmode(16^(0:4)))
    [1] "00001" "00010" "00100" "01000" "10000"

all breaking the rule in the NEWS and given above.

If you want format() you should use format(),
but as.character() should never have used format() ..

Martin

> /Henrik

> PS. This breaks R.utils::intToHex()
> [https://cran.r-project.org/web/checks/check_results_R.utils.html]

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
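The rule from the NEWS entry can be checked mechanically. The following is only a sketch: the helper name `elementwise_ok()` is made up for illustration and is not part of R. It compares as.character() applied to the whole vector with as.character() applied to each element separately, and then shows format() as the explicit way to ask for zero-padded strings of a common width.

```r
# Hypothetical helper (not part of R): TRUE if as.character(x)[j] equals
# as.character(x[j]) for every j, i.e. no element is influenced by the others.
elementwise_ok <- function(x) {
  whole <- as.character(x)
  each  <- vapply(seq_along(x), function(j) as.character(x[j]), character(1))
  identical(whole, each)
}

x <- as.hexmode(c(0L, 8L, 16L, 24L, 32L))

# The result depends on the R version: the thread describes the old method
# as padding to a common width, which breaks the rule for this x.
elementwise_ok(x)

# Padding is what format() is for; width is an argument of format.hexmode().
padded  <- format(x)             # common width taken from the widest element
padded4 <- format(x, width = 4)  # explicit minimum width
print(padded)
print(padded4)
```

Code that needs fixed-width hex strings can therefore call format() directly and does not have to rely on what as.character() happens to do.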
Re: [Rd] R-devel: as.character() for hexmode no longer pads with zeros
Thanks for confirming and giving details on the rationale (... and
I'll update R.utils to use format() instead).

Regarding as.character(x)[j] === as.character(x[j]): I agree with
this - is that property of as.character()/subsetting explicitly
stated/documented somewhere? I wonder if this is a property we
should all strive for with other types of objects?

/Henrik

On Thu, Sep 23, 2021 at 12:46 AM Martin Maechler wrote:
> [...]
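A package function that relied on the old padding can switch to format(), as suggested above. The sketch below is an illustration under stated assumptions: `int_to_hex()` is a hypothetical stand-in and is not the actual R.utils::intToHex() implementation.

```r
# Hypothetical zero-padding converter built on format() instead of
# as.character(); NOT the R.utils implementation.
int_to_hex <- function(x, width = NULL) {
  # width = NULL lets format.hexmode() pick the width of the widest element
  format(as.hexmode(as.integer(x)), width = width)
}

hex_default <- int_to_hex(c(0, 8, 16, 24, 32))
hex_wide    <- int_to_hex(255, width = 4)
print(hex_default)
print(hex_wide)
```

Because the padding now comes from format(), the result does not depend on which as.character() method the running R version provides.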
Re: [Rd] Detect UCRT-built R from within R sessions (and in configure.win)
On 9/20/21 11:03 AM, Hiroaki Yutani wrote:
> I tried to use configure.ucrt, and found it results in the following
> NOTE on the released version of R, unfortunately.
>
> * checking top-level files ... NOTE
> Non-standard file/directory found at top level:
>   'configure.ucrt'
>
> Will this be accepted by CRAN if I submit a package that contains
> configure.ucrt? Or, is it too early to use it in a CRAN package?

Thanks, that's right, so I've ported this part to R-devel and R-patched,
configure.ucrt and cleanup.ucrt will be treated as "standard". There is
nothing we can do about already released versions, the NOTE will appear.

You can also use configure.win and branch on R.version$crt, e.g.

    !is.null(R.version$crt) && R.version$crt == "ucrt"

or

    identical(R.version$crt, "ucrt")

> In either case, while I don't have a strong opinion here, I'm starting
> to feel that it might be preferable to provide an environmental
> variable rather than creating ".ucrt" versions of files. In my
> understanding, the plan is to switch all the Windows R to UCRT at some
> point in future. But, it's not clear to me how to unify these ".win"
> files and ".ucrt" files smoothly.

With R.version$crt, you can already get a make (or even environment)
variable. Writing R Extensions has examples how to invoke R in make
files to get "R CMD config" values, so here you would invoke "Rscript"
instead with one of the conditions above.

Either is fine. With .ucrt files, you can avoid copy pasting of common
code using "include" directives. With the variable, you can use make
conditionals. As you found now, with the variable you have the advantage
of not getting a NOTE with already released versions of R. The .ucrt
files are easier to maintain in hot-patches, but that is not an
advantage for package authors.

Once a package depends on a version of R that will already use UCRT, one
either would refactor/remove the conditionals, or integrate the ".ucrt"
files back into the ".win". So, in the long term, there should be no
conditionals on R.version$crt nor ".ucrt" files.

Best
Tomas

> Best,
> Hiroaki Yutani
>
> On Tue, Sep 14, 2021 at 23:44, Hiroaki Yutani wrote:
>> Thanks for both, I'll try these features.
>>
>> On Tue, Sep 14, 2021 at 22:40, Tomas Kalibera wrote:
>>> On 9/9/21 5:54 AM, Hiroaki Yutani wrote:
>>>> Thank you for the prompt reply.
>>>>
>>>>> There is not such a mechanism, yet, but can be added, at least
>>>>> for diagnostics.
>>>>
>>>> For example, can R.version somehow contain the information?
>>>
>>> Yes, now added to the experimental builds. R.version$crt contains
>>> "ucrt" (and would contain "msvcrt" if R was built against MSVCRT).
>>>
>>>>> We could add support for configure.ucrt, which would take
>>>>> precedence over configure.win on the UCRT builds (like
>>>>> Makevars.ucrt takes precedence over Makevars.win). Would that
>>>>> work for you?
>>>>
>>>> Yes, configure.ucrt should work for me. There might be someone who
>>>> prefers to switch by some envvar rather than creating another
>>>> file, but I don't have a strong opinion here.
>>>
>>> The experimental builds now support configure.ucrt and cleanup.ucrt
>>> files.
>>>
>>> Best
>>> Tomas
>>>
>>>> Best,
>>>> Hiroaki Yutani
>>>>
>>>> On Thu, Sep 9, 2021 at 0:48, Tomas Kalibera wrote:
>>>>> On 9/8/21 2:08 PM, Hiroaki Yutani wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Are there any proper ways to know whether the session is running
>>>>>> on the R that is built with the UCRT toolchain or not? Checking
>>>>>> if the encoding is UTF-8 might do the trick, but I'm not sure if
>>>>>> it's always reliable.
>>>>>
>>>>> There is not such a mechanism, yet, but can be added, at least
>>>>> for diagnostics.
>>>>>
>>>>> You are right that checking for UTF-8 encoding would not always
>>>>> be reliable. For example, the version of Windows may be too old
>>>>> to allow R to use UTF-8 as native encoding (e.g. Windows server
>>>>> 2016), then R will use the native code page as it does today in
>>>>> the MSVCRT builds.
>>>>>
>>>>>> Also, I'd like to know if there's any mechanism to detect the
>>>>>> UCRT in configure.win. I know there are Makevars.ucrt and
>>>>>> Makefile.ucrt, but one might want to do some feature test that
>>>>>> is specific to the UCRT toolchain.
>>>>>
>>>>> We could add support for configure.ucrt, which would take
>>>>> precedence over configure.win on the UCRT builds (like
>>>>> Makevars.ucrt takes precedence over Makevars.win). Would that
>>>>> work for you?
>>>>>
>>>>> Best
>>>>> Tomas
>>>>>
>>>>>> Best,
>>>>>> Hiroaki Yutani
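The R.version$crt condition discussed in this message can be wrapped in a small function so that it is also safe on R versions that have no `crt` field. This is a minimal sketch: the name `is_ucrt()` and its `rv` argument are assumptions made for illustration, while the condition itself is the one given in the message.

```r
# Hypothetical helper; 'rv' defaults to the running session's R.version and
# can be replaced by a plain list to exercise the other cases.
is_ucrt <- function(rv = R.version) {
  # identical() returns FALSE when rv$crt is NULL (field absent)
  identical(rv$crt, "ucrt")
}

is_ucrt()                               # depends on the running R build
on_ucrt   <- is_ucrt(list(crt = "ucrt"))
on_msvcrt <- is_ucrt(list(crt = "msvcrt"))
no_field  <- is_ucrt(list(platform = "x86_64-w64-mingw32"))

# From a shell script such as configure.win, the same condition could be
# printed with Rscript, for example:
#   Rscript -e 'cat(identical(R.version$crt, "ucrt"))'
```

Using identical() rather than `==` avoids a zero-length condition when the field is missing, which is why the message pairs the `==` form with an is.null() check.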
Re: [Rd] Detect UCRT-built R from within R sessions (and in configure.win)
> Thanks, that's right, so I've ported this part to R-devel and R-patched,

I noticed a while ago that R-devel no longer complains about this, thanks.

> With R.version$crt, you can already get a make (or even environment)
> variable. Writing R Extensions has examples how to invoke R in make
> files to get "R CMD config" values, so here you would invoke "Rscript"
> instead with one of the conditions above.

This slipped my mind, thanks for pointing it out! Yes, this works
perfectly without configure.ucrt. I will stick with this at least for a
while until the next version of R gets released.

> ... The .ucrt
> files are easier to maintain in hot-patches, but that is not an
> advantage for package authors.

I see, I think now I get your point. So, even if all the package authors
chose to use the Rscript way, the .ucrt files would still be needed to
make room for the (R? or CRAN?) maintainers to hot-patch the packages
that don't work nicely on UCRT.

Thanks for all the efforts to make the UCRT R a reality.

Best,
Hiroaki Yutani

On Fri, Sep 24, 2021 at 2:16, Tomas Kalibera wrote:
> [...]