from:"Jake Elmstedt"

[Rd] Locking of base environment in R 4.1.0 breaks simple assignment of .First() (etc) from Rprofile.site

2021-05-24 Thread Jake Elmstedt

Commits 80162 and 80163 lock the base environment and namespace during
startup, leading to an error when attempting to directly assign anything
from within Rprofile.site. While this is intentional and good, the help
file has not been updated to reflect this change.

Startup.Rd (Description, paragraph 3) reads,

> ...This code is sourced into the base package. Users need to be careful
> not to unintentionally overwrite objects in base, and it is normally
> advisable to use local if code needs to be executed: see the examples.

Since the base environment and namespace are locked as of 4.1.0, I
recommend this be edited to something to the effect of:

> Prior to R 4.1.0, this code is sourced into the base package. Users need
> to be careful not to unintentionally overwrite objects in base, and it is
> normally advisable to use local if code needs to be executed: see the
> examples. As of R 4.1.0 the base environment and namespace are locked and
> any attempted direct assignment will fail with an error, and none of the
> subsequent commands will be invoked. Users migrating to R 4.1.0 and above
> can edit assignments made in Rprofile.site from the form `x <- value` to
> `assign("x", value, envir = globalenv())`. Common uses of this may include
> the binding of functions `.First` and `.Last`.

Description, paragraph 8, reads,

> "A function .First (and .Last) can be defined in appropriate ‘.Rprofile’
> or ‘Rprofile.site’ files..."

This might be misleading to users not well-versed in the intricacies of
the R startup process. Should this be edited to indicate that variable
assignment from Rprofile.site must be done using assign() to bind the
variables in globalenv()? Something to the effect of,

> A function .First (and .Last) can be defined in appropriate ‘.Rprofile’
> or ‘Rprofile.site’ files (note: as of R 4.1.0, assignments from
> Rprofile.site must target the global environment, e.g. use
> `assign(".First", function(){}, envir = globalenv())` rather than
> `.First <- function(){}`) or have been saved in ‘.RData’.

I also recommend adding to Examples, under the Example of Rprofile.site
section, something to the effect of,

# Setting .First and .Last from within Rprofile.site requires
# assignment into the global environment for R versions
# 4.1.0 and later and is a best practice for earlier versions.
assign(".First", function() cat("\n   Welcome to R!\n\n"),
       envir = globalenv())
assign(".Last", function() cat("\n   Goodbye!\n\n"), envir = globalenv())
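
The failure mode can be illustrated without touching base by locking a scratch environment. This is only a sketch: `locked` is a hypothetical stand-in for the base environment, and the exact error R reports at startup may differ.

```r
# Hypothetical stand-in for the locked base environment
locked <- new.env()
lockEnvironment(locked)

# Creating a new binding in a locked environment raises an error
res <- tryCatch(
  {
    assign("x", 1, envir = locked)
    "ok"
  },
  error = function(e) "error"
)
res  # "error"

# Explicitly targeting the global environment still works
assign(".First", function() cat("\n   Welcome to R!\n\n"), envir = globalenv())
exists(".First", envir = globalenv(), inherits = FALSE)  # TRUE
```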


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] Feature Request with Proposed Solution: Update utils:::format.object_size() and utils:::print.object_size() to Respect Optional Formatting Arguments

2021-06-02 Thread Jake Elmstedt

Problem:

When running the following commands:
x <- numeric(1e8)
format(object.size(x), units = "kB", standard = "SI")
#> [1] "8e+05 kB"
The object size is returned in scientific notation.

It is natural to assume we could use the argument 'scientific = FALSE'
to solve this.
format(object.size(x), units = "kB", standard = "SI", scientific = FALSE)
#> [1] "8e+05 kB"
But, the output is unchanged.

We can change the global scipen option to fix this, but this is not
ideal, nor does it address other potential optional arguments a user
may want to pass to format()/print().


Proposed Solution:
File: src/library/utils/R/object.size.R
Function:  format.object_size()

ADD lines at top of function:
dots <- list(...)

DELETE:
paste(round(x/base^power, digits = digits), unit)

ADD lines at the end of the function:
value <- c(round(x/base^power, digits))
dots[["width"]] <- NULL
dots[["digits"]] <- max(ceiling(log10(value)), 0) + digits
dots[["x"]] <- value
format(paste(do.call(format, dots), unit), ...)

By removing any potential 'width' argument and updating the 'digits'
argument to reflect significant digits rather than decimal places in
'dots', the initial value is formatted with any additional arguments
included in the originating generic `format()` call, notably:
digits, nsmall, scientific, big.mark, big.interval, small.mark,
small.interval, decimal.mark, zero.print, and drop0trailing

The outer call to format() will format the character result of paste()
with arguments 'width' and 'justify'.
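
As a rough, standalone sketch of the lines proposed above (not the actual utils source): the helper name `format_size_sketch` is hypothetical, and `base`, `power`, and `unit` are passed in directly here, whereas the real function derives them from `units` and `standard`.

```r
# Hypothetical helper isolating the proposed tail of format.object_size()
format_size_sketch <- function(x, base, power, unit, digits = 1L, ...) {
  dots <- list(...)
  # c() is kept from the proposal; it drops any class attribute on the value
  value <- c(round(x / base^power, digits))
  # 'width' is applied only by the outer format() call
  dots[["width"]] <- NULL
  # convert decimal places into significant digits for format()
  dots[["digits"]] <- max(ceiling(log10(value)), 0) + digits
  dots[["x"]] <- value
  format(paste(do.call(format, dots), unit), ...)
}

size <- 8e8  # bytes; roughly the size of numeric(1e8), ignoring the header
format_size_sketch(size, base = 1000, power = 1, unit = "kB")
#> [1] "8e+05 kB"
format_size_sketch(size, base = 1000, power = 1, unit = "kB", scientific = FALSE)
#> [1] "800000 kB"
```

With no extra arguments the result matches the current output, and 'scientific = FALSE' now takes effect.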


Function:  print.object_size()

DELETE:
y <- format.object_size(x, units = units, standard = standard, digits = digits)

ADD:
y <- format.object_size(x, units, standard, digits, ...)

This simply passes additional arguments to the above edited format.object_size()

These changes have the effect of allowing all possible optional
arguments to format() to be meaningfully used in format.object_size().

Potential conflicts:
Results of the updated function will be identical to the current
results in all cases where no additional arguments have been passed to
the function. So, code which does not rely on these arguments being
ignored will be unaffected.

However, existing code which passes additional formatting arguments
for "object_size" class objects which are currently being ignored may
result in different output. This could potentially cause errors if the
end user is doing anything programmatic with the results, though this
is only likely to cause problems if the code passes the currently
ignored 'big.mark', 'small.mark', 'decimal.mark', or 'zero.print'
arguments when formatting or printing "object_size" class objects. That said, it
can probably be expected most programmatic work will be done directly
on the "object_size" objects rather than the results of the format()
or print() methods for them. So, I would expect nearly zero issues
with existing code.


Re: [Rd] WISH: set.seed(seed) to produce error if length(seed) != 1 (now silent)

2021-09-17 Thread Jake Elmstedt

What about splitting the baby and having set.seed(1:2), set.seed(6.1),
etc. issue a warning rather than throw an error?

It informs the user that their expectations have deviated from
reality, encourages proper programming practices, and carries
substantially lower risk of breaking things than an exception.
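
A minimal sketch of what the warning behavior could look like, written as a user-level wrapper rather than a change to set.seed() itself. The name `set_seed_checked` and the warning texts are hypothetical.

```r
# Hypothetical wrapper: warn on suspicious seeds, then defer to set.seed()
set_seed_checked <- function(seed, ...) {
  if (!is.null(seed)) {
    if (length(seed) != 1L) {
      warning("'seed' has length ", length(seed), "; using seed[1] only")
      seed <- seed[1L]
    }
    if (is.numeric(seed) && !is.na(seed) && seed != trunc(seed)) {
      warning("'seed' is not a whole number: ", seed)
    }
  }
  set.seed(seed, ...)
}

set_seed_checked(42L)  # silent
set_seed_checked(1:2)  # warns about the length, then seeds with 1
set_seed_checked(6.1)  # warns about the non-integer value
```

Existing scripts keep running, but the mismatch is no longer silent.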


On Fri, Sep 17, 2021 at 1:13 PM Avi Gross via R-devel wrote:
>
> R wobbles a bit as there is no normal datatype that is a singleton variable.  
> Saying x <- 5 just creates a vector of current length 1. It is perfectly 
> legal to then write x[2] <- 6 and so on. The vector lengthens. You can
> truncate it back to 1, if you wish: length(x) <- 1
>
> So the question here is what happens if you supply more info than is needed? 
> If it is an integer vector of length greater than one, should it ignore 
> everything but the first entry? I note it happily accepts not-quite integers 
> like TRUE and FALSE.  It also accepts floating point numbers like 1.23 or
> 1.2e5.
>
> The goal seems to be to set a unique starting point, rounded or transformed 
> if needed. The visible part of the function does not even look at the seed 
> before calling the internal representation. So although superficially 
> choosing the first integer in a vector makes some sense, it can be a problem 
> if a program assumes the entire vector is consumed and perhaps hashed in some 
> way to make a seed. If the program later changes parts of the vector other 
> than the first entry, it may assume re-setting the seed gets something else 
> and yet it may be exactly the same.
>
> So, yes, I suspect it is an ERROR to take anything that cannot be coerced by 
> something like as.integer() into a vector of length 1.
>
> I have noted other places in R where I may get a warning when giving a longer 
> vector that only the first element will be used.  Are they all problems that
> need to be addressed?
>
> Here is a short one:
>
> > x <- c(1:3)
> > if (x > 2) y <- TRUE
> Warning message:
>   In if (x > 2) y <- TRUE :
>   the condition has length > 1 and only the first element will be used
> > y
> Error: object 'y' not found
>
> The above is not vectorized and makes the choice of x==1 and thus does not 
> set y.
>
> Now a vectorized variant works as expected, making a vector of length 3 for y:
>
> > x
> [1] 1 2 3
>
> > y <- ifelse(x > 2, TRUE, FALSE)
> > y
> [1] FALSE FALSE  TRUE
>
> I have no doubt fixing lots of this stuff, if indeed it is a fix, can break 
> lots of existing code. Sure, it is not harmful to ask a programmer to always 
> say x[1] to guarantee they are getting what they want, or to add a function 
> like first(x) that does the same.
>
> R has some compromises or features I sometimes wonder about. If it had a 
> concept of a numeric scalar, then some things that now happen might start 
> being an error.
>
> What happens when you multiply a vector by a scalar as in 5*x is that every
> component of x is multiplied by 5, but x*x does componentwise multiplication.
> So say x is c(1:3), what should this do using a twosome times a threesome?
>
> x[1:2]*x
> [1] 1 4 3
> Warning message:
>   In x[1:2] * x :
>   longer object length is not a multiple of shorter object length
>
> Is it recycling to get a 1 in pseudo-position 3?
>
> Yep, this shows recycling:
>
> > x <- 1:9
> > x[1:2]*x
> [1]  1  4  3  8  5 12  7 16  9
> Warning message:
>   In x[1:2] * x :
>   longer object length is not a multiple of shorter object length
>
> You do get a warning but not telling you what it did.
>
> In essence, the earlier case of 5*x arguably recycled the 5 as many times as 
> needed but with no warning.
>
> My point is that many languages, especially older ones, were designed a 
> certain way and have been updated but we may be stuck with what we have. A 
> brand new language might come up with a new way that includes vectorizing the 
> heck out of things but allowing and even demanding that you explicitly 
> convert things to a scalar in a context that needs it or to explicitly asking 
> for recycling when you want it or ...
>
>
>
>
> -Original Message-
> From: R-devel  On Behalf Of Henrik Bengtsson
> Sent: Friday, September 17, 2021 8:39 AM
> To: GILLIBERT, Andre 
> Cc: R-devel 
> Subject: Re: [Rd] WISH: set.seed(seed) to produce error if length(seed) != 1 
> (now silent)
>
> > I’m curious, other than proper programming practice, why?
>
> Life's too short for troubleshooting silent mistakes - mine or others.
>
> While at it, searching the interwebs for uses of set.seed() turns up
> mistakes/misunderstandings like passing a non-integer seed, e.g.
>
> > set.seed(6.1); sum(.Random.seed)
> [1] 73930104
> > set.seed(6.2); sum(.Random.seed)
> [1] 73930104
>
> which clearly is not what the user expected.  There are also a few cases of
> set.seed() called with a character string, e.g.
>
> > set.seed("42"); sum(.Random.seed)
> [1] -2119381568
> > set.seed(42); sum(.Random.seed)
> [1] -2119381568
>
> which works just because as.numeric("42") is used.
>
> /Henrik
>
> On Fri, Sep 17, 2021 at 12:55 PM GILLIBE
