R wobbles a bit as there is no normal datatype that is a singleton variable.  
Saying x <- 5 just creates a vector of current length 1. It is perfectly legal 
to then write x[2] <- 6 and so on. The vector lengthens. You can truncate it 
back to 1, if you wish: length(x) <- 1
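
As a small sketch of the above (the variable names len_start and len_grown 
are just for illustration), a "scalar" grows and shrinks like any other 
vector:

```r
x <- 5                    # a numeric vector that currently has length 1
len_start <- length(x)

x[2] <- 6                 # assigning past the end silently lengthens x
len_grown <- length(x)

length(x) <- 1            # truncate back to a single element
print(c(start = len_start, grown = len_grown, final = length(x)))
```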

So the question here is what happens if you supply more info than is needed? If 
it is an integer vector of length greater than one, should it ignore everything 
but the first entry? I note it happily accepts not-quite integers like TRUE and 
FALSE. It also accepts floating point numbers like 1.23 or 1.2e5. 
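
A rough way to check this is to compare the RNG state after each call. The 
helper seed_state() below is hypothetical, not part of R, and the sketch 
assumes the seed is coerced to an integer much as as.integer() would do it, 
so TRUE and 1.23 should both behave like 1:

```r
# Hypothetical helper: return the RNG state that set.seed(seed) produces.
# Note that calling it resets the session's RNG as a side effect.
seed_state <- function(seed) {
  set.seed(seed)
  get(".Random.seed", envir = globalenv())
}

reference <- seed_state(1L)

# Each entry reports whether the seed leads to the same state as seed 1.
same_as_one <- c(
  logical_true = identical(seed_state(TRUE), reference),
  double_1_23  = identical(seed_state(1.23), reference)
)
print(same_as_one)
```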

The goal seems to be to set a unique starting point, rounded or transformed if 
needed. The visible part of the function does not even look at the seed before 
passing it on to the internal code. So although superficially choosing the 
first integer in a vector makes some sense, it can be a problem if a program 
assumes the entire vector is consumed and perhaps hashed in some way to make a 
seed. If the program later changes parts of the vector other than the first 
entry, it may assume that re-setting the seed gives a different stream, and 
yet the stream may be exactly the same.

So, yes, I suspect it is an ERROR to take anything that cannot be coerced by 
something like as.integer() into a vector of length 1.
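
To make that concrete, here is a minimal sketch of a stricter front end. The 
name set_seed_strict() is made up for illustration and this is not how 
set.seed() is implemented; it only shows the kind of check being discussed:

```r
# Hypothetical wrapper: refuse anything that is not a single value
# coercible to an integer, then hand over to the real set.seed().
set_seed_strict <- function(seed) {
  if (is.null(seed)) {
    # set.seed(NULL) is documented as re-initializing the RNG, so allow it.
    set.seed(NULL)
    return(invisible(NULL))
  }
  if (length(seed) != 1L) {
    stop("seed must have length 1, not ", length(seed), call. = FALSE)
  }
  seed_int <- suppressWarnings(as.integer(seed))
  if (is.na(seed_int)) {
    stop("seed cannot be coerced to an integer", call. = FALSE)
  }
  set.seed(seed_int)
  invisible(seed_int)
}

set_seed_strict(42)                        # accepted
res <- try(set_seed_strict(1:3), silent = TRUE)
print(inherits(res, "try-error"))          # the long vector is rejected
```

Whether non-whole numbers such as 6.1 should also be rejected is a separate 
design choice that this sketch leaves alone.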

I have noted other places in R where, on giving a longer vector, I may get a 
warning that only the first element will be used.  Are they all problems that 
need to be addressed?

Here is a short one:

> x <- c(1:3)
> if (x > 2) y <- TRUE
Warning message:
  In if (x > 2) y <- TRUE :
  the condition has length > 1 and only the first element will be used
> y
Error: object 'y' not found

The above is not vectorized: it tests only x[1], which is 1, so the condition 
is FALSE and y is never set.

Now a vectorized variant works as expected, making a vector of length 3 for y:

> x
[1] 1 2 3

> y <- ifelse(x > 2, TRUE, FALSE)
> y
[1] FALSE FALSE  TRUE
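
A sketch of ways to state the intent explicitly, so that if() always sees a 
condition of length one. This also matters because, as far as I know, R 4.2.0 
and later turn the warning shown earlier into an error. The variable names 
are only illustrative:

```r
x <- 1:3

# Test the first element only, and say so.
first_only <- if (x[1] > 2) TRUE else FALSE

# Collapse the whole vector to one logical value.
any_big <- any(x > 2)
all_big <- all(x > 2)

# Keep the elementwise result; the comparison is already vectorized,
# so ifelse(x > 2, TRUE, FALSE) gives the same logical vector.
elementwise <- x > 2

print(list(first_only = first_only, any_big = any_big,
           all_big = all_big, elementwise = elementwise))
```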

I have no doubt fixing lots of this stuff, if indeed it is a fix, can break 
lots of existing code. Sure, it is not harmful to ask a programmer to always 
say x[1] to guarantee they are getting what they want, or to add a function 
like first(x) that does the same. 
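
Such a helper is not in base R under that name as far as I know, so the 
definition below is only a hypothetical sketch of what first(x) might look 
like:

```r
# Hypothetical helper: return the first element, failing loudly on an
# empty vector instead of quietly returning NA or a zero-length result.
first <- function(x) {
  if (length(x) == 0L) {
    stop("'x' has length 0, so there is no first element", call. = FALSE)
  }
  x[[1L]]
}

x <- c(1:3)
y <- if (first(x) > 2) TRUE else FALSE
print(y)
```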

R has some compromises or features I sometimes wonder about. If it had a 
concept of a numeric scalar, then some things that now happen might start being 
an error.

What happens when you multiply a vector by a scalar as in 5*x is that every 
component of x is multiplied by 5, but x*x does componentwise multiplication.  
So say x is c(1:3). What should this do, multiplying a twosome by a threesome?

> x[1:2]*x
[1] 1 4 3
Warning message:
  In x[1:2] * x :
  longer object length is not a multiple of shorter object length

Is it recycling x[1] to get a 1 in pseudo-position 3?

Yep, a longer x (the output below matches x being 1:9) shows the recycling:

> x[1:2]*x
[1]  1  4  3  8  5 12  7 16  9
Warning message:
  In x[1:2] * x :
  longer object length is not a multiple of shorter object length

You do get a warning, but it does not tell you what it did.

In essence, the earlier case of 5*x arguably recycled the 5 as many times as 
needed but with no warning. 
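
A sketch that spells the recycling out with rep_len(), so that the implicit 
and explicit versions can be compared side by side (the variable names are 
only illustrative):

```r
x <- 1:3

# Scalar case: the 5 is reused for every element of x, with no warning.
implicit_scalar <- 5 * x
explicit_scalar <- rep_len(5, length(x)) * x

# Mismatched lengths: the shorter vector is recycled, with a warning
# because 3 is not a multiple of 2.
short <- x[1:2]
implicit_short <- suppressWarnings(short * x)
explicit_short <- rep_len(short, length(x)) * x   # no warning here

print(identical(implicit_scalar, explicit_scalar))
print(identical(implicit_short, explicit_short))
```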

My point is that many languages, especially older ones, were designed a certain 
way and have been updated, but we may be stuck with what we have. A brand new 
language might come up with a new way that includes vectorizing the heck out of 
things but allows, and even demands, that you explicitly convert things to a 
scalar in a context that needs it, or explicitly ask for recycling when 
you want it, or ...




-----Original Message-----
From: R-devel <r-devel-boun...@r-project.org> On Behalf Of Henrik Bengtsson
Sent: Friday, September 17, 2021 8:39 AM
To: GILLIBERT, Andre <andre.gillib...@chu-rouen.fr>
Cc: R-devel <r-devel@r-project.org>
Subject: Re: [Rd] WISH: set.seed(seed) to produce error if length(seed) != 1 
(now silent)

> I’m curious, other than proper programming practice, why?

Life's too short for troubleshooting silent mistakes - mine or others.

While at it, searching the interwebs for uses of set.seed() turns up 
mistakes/misunderstandings like using set.seed(<double>), e.g.

> set.seed(6.1); sum(.Random.seed)
[1] 73930104
> set.seed(6.2); sum(.Random.seed)
[1] 73930104

which clearly is not what the user expected.  There are also a few cases of 
set.seed(<character>), e.g.

> set.seed("42"); sum(.Random.seed)
[1] -2119381568
> set.seed(42); sum(.Random.seed)
[1] -2119381568

which works just because as.numeric("42") is used.

/Henrik

On Fri, Sep 17, 2021 at 12:55 PM GILLIBERT, Andre 
<andre.gillib...@chu-rouen.fr> wrote:
>
> Hello,
>
> A vector with a length >= 2 to set.seed would probably be a bug. An error 
> message will help the user to fix his R code. The bug may be accidental or 
> due to bad understanding of the set.seed function. For instance, a user may 
> think that the whole state of the PRNG can be passed to set.seed.
>
> The "if" instruction emits a warning when the condition has length >= 2, 
> because it is often a bug. I would expect a warning or error with set.seed().
>
> Validating inputs and emitting errors early is a good practice.
>
> Just my 2 cents.
>
> Sincerely.
> Andre GILLIBERT
>
> -----Original Message-----
> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of 
> Avraham Adler
> Sent: Friday, September 17, 2021 12:07
> To: Henrik Bengtsson
> Cc: R-devel
> Subject: Re: [Rd] WISH: set.seed(seed) to produce error if 
> length(seed) != 1 (now silent)
>
> Hi, Henrik.
>
> I’m curious, other than proper programming practice, why?
>
> Avi
>
> On Fri, Sep 17, 2021 at 11:48 AM Henrik Bengtsson < 
> henrik.bengts...@gmail.com> wrote:
>
> > Hi,
> >
> > according to help("set.seed"), argument 'seed' to set.seed() should be:
> >
> >   a single value, interpreted as an integer, or NULL (see ‘Details’).
> >
> > From code inspection (src/main/RNG.c) and testing, it turns out that 
> > if you pass a 'seed' with length greater than one, it silently uses 
> > seed[1], e.g.
> >
> > > set.seed(1); sum(.Random.seed)
> > [1] 4070365163
> > > set.seed(1:3); sum(.Random.seed)
> > [1] 4070365163
> > > set.seed(1:100); sum(.Random.seed)
> > [1] 4070365163
> >
> > I'd like to suggest that set.seed() produces an error if 
> > length(seed) > 1.  As a reference, for length(seed) == 0, we get:
> >
> > > set.seed(integer(0))
> > Error in set.seed(integer(0)) : supplied seed is not a valid integer
> >
> > /Henrik
> >
> > ______________________________________________
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >