date:20210923

Re: [Rd] R-devel: as.character() for hexmode no longer pads with zeros

2021-09-23 Thread Martin Maechler

> Henrik Bengtsson 
> on Wed, 22 Sep 2021 20:48:05 -0700 writes:

> The update in rev 80946
> (https://github.com/wch/r-source/commit/d970867722e14811e8ba6b0ba8e0f478ff482f5e)
> caused as.character() on hexmode objects to no longer pad with zeros.

Yes -- very much on purpose; by me, after discussing a related issue
within R-core which showed "how wrong" the previous (current R)
behavior of the as.character() method is for
hexmode and octmode objects :

If you look at the whole of rev 80946, you will also see the NEWS entry:

 * as.character() for "hexmode" or "octmode" objects now
   fulfills the important basic rule

     as.character(x)[j] === as.character(x[j])

   rather than just calling format().

The format() generic (notably for "atomic-alike" objects) should indeed
return a character vector where each string has the same "width".
However, the result of as.character(x) -- at least for all
"atomic-alike" / "vector-alike" objects -- for a single x[j] should
not be influenced by other elements in x.
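
To make the rule concrete, here is a minimal sketch (the helper name
holds_elementwise is hypothetical, not part of base R). It compares
converting the whole vector against converting each element on its own.
format() fails the check by design because it pads to a common width;
whether as.character() passes depends on whether the R build in use
includes rev 80946.

```r
# Hypothetical helper: TRUE if f(x)[j] equals f(x[j]) for every j.
holds_elementwise <- function(x, f) {
  whole <- f(x)
  all(vapply(seq_along(x),
             function(j) identical(whole[j], f(x[j])),
             logical(1)))
}

x <- as.hexmode(c(0L, 8L, 16L, 24L, 32L))

# format() pads every string to a common width, so it breaks the rule:
# format(x)[1] is "00", but format(x[1]) is "0".
holds_elementwise(x, format)        # FALSE

# Version-dependent: TRUE once as.character() no longer calls format().
holds_elementwise(x, as.character)
```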




> Before:

>> x <- structure(as.integer(c(0,8,16,24,32)), class="hexmode")
>> x
> [1] "00" "08" "10" "18" "20"
>> as.character(x)
> [1] "00" "08" "10" "18" "20"

> After:

>> x <- structure(as.integer(c(0,8,16,24,32)), class="hexmode")
>> x
> [1] "00" "08" "10" "18" "20"
>> as.character(x)
> [1] "0"  "8"  "10" "18" "20"

> Was that intended?

Yes!
You have to explore your example a bit to notice how "illogical"
the behavior before was:

> as.character(as.hexmode(0:15))
 [1] "0" "1" "2" "3" "4" "5" "6" "7" "8" "9" "a" "b" "c" "d" "e" "f"
> as.character(as.hexmode(0:16))
 [1] "00" "01" "02" "03" "04" "05" "06" "07" "08" "09" "0a" "0b" "0c" "0d" "0e"
[16] "0f" "10"

> as.character(as.hexmode(16^(0:2)))
[1] "001" "010" "100"
> as.character(as.hexmode(16^(0:3)))
[1] "0001" "0010" "0100" "1000"
> as.character(as.hexmode(16^(0:4)))
[1] "00001" "00010" "00100" "01000" "10000"

all breaking the rule in the NEWS  and given above.

If you want format()  you should use format(),
but as.character() should never have used format() ..
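
For code that relied on the zero padding, a small sketch of asking
format() for it explicitly may help. It uses only the documented width
and upper.case arguments of the hexmode format() method; the example
values are made up for illustration.

```r
x <- as.hexmode(c(0L, 8L, 16L, 255L))

# Common width chosen from the widest element ("ff" has two digits).
format(x)                                  # "00" "08" "10" "ff"

# Explicit minimum width and upper-case digits.
format(x, width = 4, upper.case = TRUE)    # "0000" "0008" "0010" "00FF"

# Unpadded, element-by-element conversion that does not depend on the
# as.character() method at all.
sprintf("%x", unclass(x))                  # "0" "8" "10" "ff"
```

Stating the width explicitly gives the same strings regardless of which
other elements happen to be in the vector.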

Martin

> /Henrik

> PS. This breaks R.utils::intToHex()
> [https://cran.r-project.org/web/checks/check_results_R.utils.html]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] R-devel: as.character() for hexmode no longer pads with zeros

2021-09-23 Thread Henrik Bengtsson

Thanks for confirming and giving details on the rationale (... and
I'll update R.utils to use format() instead).

Regarding as.character(x)[j] === as.character(x[j]): I agree with this.
Is that property of as.character()/subsetting explicitly
stated/documented somewhere?  I wonder if this is a property we should
all strive for with other types of objects.

/Henrik


Re: [Rd] Detect UCRT-built R from within R sessions (and in configure.win)

2021-09-23 Thread Tomas Kalibera




On 9/20/21 11:03 AM, Hiroaki Yutani wrote:
> I tried to use configure.ucrt, and found it results in the following
> NOTE on the released version of R, unfortunately.
>
>  * checking top-level files ... NOTE
>  Non-standard file/directory found at top level:
>  'configure.ucrt'
>
> Will this be accepted by CRAN if I submit a package that contains
> configure.ucrt? Or, is it too early to use it in a CRAN package?


Thanks, that's right, so I've ported this part to R-devel and R-patched:
configure.ucrt and cleanup.ucrt will be treated as "standard". There is
nothing we can do about already released versions; the NOTE will appear
there.


You can also use configure.win and branch on R.version$crt, e.g.

!is.null(R.version$crt) && R.version$crt == "ucrt"

or

identical(R.version$crt, "ucrt")
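
A small sketch of how the two conditions above behave (the wrapper name
is_ucrt and the mocked lists are hypothetical, used only so the cases
can be tried on any platform):

```r
# Hypothetical helper; `rv` defaults to the running session's R.version.
is_ucrt <- function(rv = R.version) identical(rv$crt, "ucrt")

# On builds without the `crt` field, R.version$crt is NULL.
# NULL == "ucrt" has length zero, which if() rejects, hence the
# is.null() guard in the first form above; identical() needs no guard.
is_ucrt(list(crt = "ucrt"))    # TRUE
is_ucrt(list(crt = "msvcrt"))  # FALSE
is_ucrt(list())                # FALSE (field absent)
is_ucrt()                      # depends on the running R
```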


> In either case, while I don't have a strong opinion here, I'm starting
> to feel that it might be preferable to provide an environmental
> variable rather than creating ".ucrt" versions of files. In my
> understanding, the plan is to switch all the Windows R to UCRT at some
> point in future. But, it's not clear to me how to unify these ".win"
> files and ".ucrt" files smoothly.


With R.version$crt, you can already get a make (or even environment) 
variable. Writing R Extensions has examples of how to invoke R in make 
files to get "R CMD config" values, so here you would invoke "Rscript" 
instead with one of the conditions above.
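
As a rough sketch of the Rscript route, the R side could be a tiny
script that prints a single word for the caller to capture. The file
name tools/crt.R, the function name crt_name and the "unknown" fallback
are assumptions for illustration, not something R itself defines.

```r
# Hypothetical script, e.g. tools/crt.R, meant to be run with Rscript
# from a make file or configure.win; it prints one word to stdout.
crt_name <- function(rv = R.version) {
  crt <- rv$crt
  # Builds that predate the `crt` field return NULL here.
  if (is.null(crt)) "unknown" else crt
}
cat(crt_name(), "\n", sep = "")
```

The caller can then store the printed word in a make or shell variable
and compare it with "ucrt" in a conditional.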


Either is fine. With .ucrt files, you can avoid copy-pasting of common 
code using "include" directives. With the variable, you can use make 
conditionals. As you have now found, the variable has the advantage 
of not producing a NOTE with already released versions of R. The .ucrt 
files are easier to maintain in hot-patches, but that is not an 
advantage for package authors.


Once a package depends on a version of R that already uses UCRT, one 
would either refactor/remove the conditionals, or integrate the ".ucrt" 
files back into the ".win". So, in the long term, there should be 
neither conditionals on R.version$crt nor ".ucrt" files.


Best
Tomas

Best,
Hiroaki Yutani

On Tue, Sep 14, 2021 at 23:44, Hiroaki Yutani wrote:


Thanks for both, I'll try these features.

On Tue, Sep 14, 2021 at 22:40, Tomas Kalibera wrote:



On 9/9/21 5:54 AM, Hiroaki Yutani wrote:

Thank you for the prompt reply.


There is no such mechanism yet, but one can be added, at least for
diagnostics.

For example, can R.version somehow contain the information?

Yes, now added to the experimental builds. R.version$crt contains "ucrt" (and would 
contain "msvcrt" if R was built against MSVCRT).



We could add support for configure.ucrt, which would take precedence
over configure.win on the UCRT builds (like Makevars.ucrt takes
precedence over Makevars.win). Would that work for you?

Yes, configure.ucrt should work for me. There might be someone who prefers to 
switch by some envvar rather than creating another file, but I don't have a 
strong opinion here.

The experimental builds now support configure.ucrt and cleanup.ucrt files.

Best
Tomas


Best,
Hiroaki Yutani

On Thu, Sep 9, 2021 at 0:48, Tomas Kalibera wrote:


On 9/8/21 2:08 PM, Hiroaki Yutani wrote:

Hi,

Are there any proper ways to know whether the session is running on
the R that is built with the UCRT toolchain or not? Checking if the
encoding is UTF-8 might do the trick, but I'm not sure if it's always
reliable.

There is no such mechanism yet, but one can be added, at least for
diagnostics.

You are right that checking for UTF-8 encoding would not always be
reliable. For example, the version of Windows may be too old to allow R
to use UTF-8 as native encoding (e.g. Windows Server 2016); then R will
use the native code page as it does today in the MSVCRT builds.


Also, I'd like to know if there's any mechanism to detect the UCRT in
configure.win. I know there are Makevars.ucrt and Makefile.ucrt, but
one might want to do some feature test that is specific to the UCRT
toolchain.

We could add support for configure.ucrt, which would take precedence
over configure.win on the UCRT builds (like Makevars.ucrt takes
precedence over Makevars.win). Would that work for you?

Best
Tomas


Best,
Hiroaki Yutani


Re: [Rd] Detect UCRT-built R from within R sessions (and in configure.win)

2021-09-23 Thread Hiroaki Yutani

> Thanks, that's right, so I've ported this part to R-devel and R-patched,

I noticed a while ago that R-devel no longer complains about this, thanks.

> With R.version$crt, you can already get a make (or even environment)
> variable. Writing R Extensions has examples how to invoke R in make
> files to get "R CMD config" values, so here you would invoke "Rscript"
> instead with one of the conditions above.

This slipped my mind, thanks for pointing it out! Yes, this works
perfectly without configure.ucrt. I will stick with this at least for
a while until the next version of R gets released.

> ... The .ucrt
> files are easier to maintain in hot-patches, but that is not an
> advantage for package authors.

I see, I think I get your point now. So, even if all package
authors chose the Rscript way, the .ucrt files would
still be needed to make room for the (R? or CRAN?) maintainers to
hot-patch the packages that don't work nicely on UCRT. Thanks for all
the efforts to make the UCRT R a reality.

Best,
Hiroaki Yutani

