Re: [Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

2018-08-31 Thread Emil Bode

On 30/08/2018, 20:15, "R-devel on behalf of Hadley Wickham" 
 wrote:

On Thu, Aug 30, 2018 at 10:58 AM Martin Maechler
 wrote:
>
> > Joris Meys
> > on Thu, 30 Aug 2018 14:48:01 +0200 writes:
>
> > On Thu, Aug 30, 2018 at 2:09 PM Dénes Tóth
> >  wrote:
> >> Note that `||` and `&&` have never been symmetric:
> >>
> >> TRUE || stop() # returns TRUE
> >> stop() || TRUE # returns an error
> >>
> >>
> > Fair point. So the suggestion would be to check whether x
> > is of length 1 and whether y is of length 1 only when
> > needed. I.e.
>
> > c(TRUE,FALSE) || TRUE
>
> > would give an error and
>
> > TRUE || c(TRUE, FALSE)
>
> > would pass.
>
> > Thought about it a bit more, and I can't come up with a
> > use case where the first line must pass. So if the short
> > circuiting remains and the extra check only gives a small
> > performance penalty, adding the error could indeed make
> > some bugs more obvious.
>
> I agree "in theory".
> Thank you, Henrik, for bringing it up!
>
> In practice I think we should start having a warning signalled.
> I have checked the source code in the mean time, and the check
> is really very cheap
> { because it can/should be done after checking isNumber(): so
>   then we know we have an atomic and can use XLENGTH() }
>
>
> The 0-length case I don't think we should change as I do find
> NA (is logical!) to be an appropriate logical answer.

Can you explain your reasoning a bit more here? I'd like to understand
the general principle, because from my perspective it's more
parsimonious to say that the inputs to || and && must be length 1,
rather than to say that inputs could be length 0 or length 1, and in
the length 0 case they are replaced with NA.

Hadley

I would say the value NA would cause warnings later on, that are easy to track 
down, so a return of NA is far less likely to cause problems than an unintended 
TRUE or FALSE. And I guess there would be some code reliant on 'logical(0) || 
TRUE' returning TRUE, that wouldn't necessarily be a mistake.
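
For illustration, a minimal sketch of the stricter semantics discussed above: each operand must have length 1, but the right-hand side is only checked when it is actually needed. This is not code from the thread; `strict_or` is a hypothetical helper, and restricting it to logical inputs is an assumption made to keep the example short.

```r
# Hypothetical helper, not part of base R: a scalar-only OR that keeps
# short-circuiting. Thanks to lazy evaluation, `y` is only forced when needed.
strict_or <- function(x, y) {
  if (!is.logical(x) || length(x) != 1L)
    stop("'x' must be a logical vector of length 1")
  if (isTRUE(x)) return(TRUE)   # short-circuit: `y` is never evaluated
  if (!is.logical(y) || length(y) != 1L)
    stop("'y' must be a logical vector of length 1")
  x || y
}

strict_or(TRUE, stop("never evaluated"))   # TRUE, right-hand side not forced
strict_or(TRUE, c(TRUE, FALSE))            # TRUE, right-hand side not checked
try(strict_or(c(TRUE, FALSE), TRUE))       # error: 'x' has length 2
try(strict_or(logical(0), TRUE))           # error: 'x' has length 0
```

With such a helper the zero-length case is an error rather than NA, which is the more restrictive of the two options debated here.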

But I think it's hard to predict exactly how people are using these operators. I 
personally can't imagine a situation where I'd use || or && outside an 
if-statement, so I'd rather keep the current behaviour, because I'm not sure 
whether I'm reliant on logical(0) || TRUE somewhere in my code (even though that 
would be ugly code, it's not wrong per se).
But I could always rewrite it, so I believe it's more a question of how much 
would have to be rewritten. Maybe implement it first in devel, to see how many 
people complain?

Emil Bode




__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] build package with unicode (farsi) strings

2018-08-31 Thread Farid Ch
Thank you all for your valuable insights. The most viable workaround is a 
modification of Hadley's line of code:

library(magrittr)  # provides %>%
stringi::stri_escape_unicode(letters_fa) %>%
  paste0("'", ., "'", collapse = ',') %>%
  paste0('c(', ., ')')

The output string can then be copied and pasted without manual editing. 
However, imagine having to go through this process for all of the English 
strings that you write daily! That would not be very productive, would it?



I think R deserves better support for internationalization, and I know this 
implies fundamental revisions to the code to avoid the unnecessary conversion 
to a native (OS) locale, i.e. reading and writing directly as Unicode.



Farid




From: Hadley Wickham 
Sent: Friday, August 31, 2018 2:48:17 AM
To: ONKELINX, Thierry
Cc: faridc...@gmail.com; r-devel@r-project.org
Subject: Re: [Rd] build package with unicode (farsi) strings

On Thu, Aug 30, 2018 at 2:11 AM Thierry Onkelinx
 wrote:
>
> Dear Farid,
>
> Try using the ASCII notation. letters_fa <- c("\u0627", "\u0641"). The full
> code table is available at https://www.utf8-chartable.de

It's a little easier to do this with code:

letters_fa <- c('الف','ب','پ','ت','ث','ج','چ','ح','خ','ر','ز','د')
writeLines(stringi::stri_escape_unicode(letters_fa))
#> \u0627\u0644\u0641
#> \u0628
#> \u067e
#> \u062a
#> \u062b
#> \u062c
#> \u0686
#> \u062d
#> \u062e
#> \u0631
#> \u0632
#> \u062f

Hadley

--
http://hadley.nz



Re: [Rd] Detecting whether a process exists or not by its PID?

2018-08-31 Thread Tomas Kalibera

On 08/31/2018 01:18 AM, Henrik Bengtsson wrote:

> Hi, I'd like to test whether a (localhost) PSOCK cluster node is still
> running or not by its PID, e.g. it may have crashed / core dumped.
> I'm ok with getting false-positive results because *another* process
> with the same PID has since started.

kill(sig=0) is specified by POSIX but indeed as you say there is a race 
condition due to PID-reuse.  In principle, detecting that a worker 
process is still alive cannot be done correctly outside base R. At 
user-level I would probably consider some watchdog, e.g. the parallel 
tasks would be repeatedly touching a file.
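
A minimal sketch of such a file-based watchdog, for illustration only: the helper names `touch_heartbeat` and `is_alive`, the file location and the 10-second threshold are assumptions, not anything provided by base R or the parallel package.

```r
# Worker side: call this regularly from within the parallel task.
touch_heartbeat <- function(path) {
  writeLines(format(Sys.time(), "%Y-%m-%d %H:%M:%OS3"), path)
  invisible(path)
}

# Supervisor side: the worker counts as alive if its heartbeat file was
# modified within the last `max_age` seconds.
is_alive <- function(path, max_age = 10) {
  mtime <- file.mtime(path)            # NA if the file does not exist
  if (is.na(mtime)) return(FALSE)
  age <- as.numeric(difftime(Sys.time(), mtime, units = "secs"))
  age <= max_age
}

hb <- tempfile("heartbeat-")
is_alive(hb)          # FALSE: no heartbeat written yet
touch_heartbeat(hb)
is_alive(hb)          # TRUE immediately after a heartbeat
```

The threshold has to be longer than the longest stretch during which a healthy task cannot touch the file, otherwise busy workers are reported as dead.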


In base R, one can do this correctly for forked processes via 
mcparallel/mccollect, not for PSOCK cluster workers which are based on 
system() (and I understand it would be a useful feature)


> j <- mcparallel(Sys.sleep(1000))
> mccollect(j, wait=FALSE)
NULL

# kill the child process

> mccollect(j, wait=FALSE)
$`1542`
NULL

More details indeed in ?mcparallel. The key part is that the job must be 
started as non-detached and as soon as mccollect() collects it, 
mccollect() must never be called on it again.


Tomas



> I can get the PID of each cluster node by querying them for their
> Sys.getpid(), e.g.
>
>    pids <- parallel::clusterEvalQ(cl, Sys.getpid())
>
> Is there a function in core R for testing whether a process with a
> given PID exists or not? From trial'n'error, I found that on Linux:
>
>    pid_exists <- function(pid) as.logical(tools::pskill(pid, signal = 0L))
>
> returns TRUE for existing processes and FALSE otherwise, but I'm not
> sure if I can trust this.  It's not a documented feature in
> ?tools::pskill, which also warns about 'signal' not being standardized
> across OSes.
>
> The other Linux alternative I can imagine is:
>
>    pid_exists <- function(pid)
>      system2("ps", args = c("--pid", pid), stdout = FALSE) == 0L
>
> Can I expect this to work on macOS as well?  What about other *nix systems?
>
> And, finally, what can be done on Windows?
>
> I'm sure there are packages on CRAN that provide this, but I'd like
> to keep dependencies at a minimum.
>
> I appreciate any feedback. Thxs,
>
> Henrik



[Rd] compairing doubles

2018-08-31 Thread Felix Ernst
Dear all,

I am a bit unsure whether this qualifies as a bug, but it is definitely strange 
behaviour. That is why I wanted to discuss it.

With the following function, I want to test for evenly spaced numbers, starting 
from anywhere.

.is_continous_evenly_spaced <- function(n){
  if(length(n) < 2) return(FALSE)
  n <- n[order(n)]
  n <- n - min(n)
  step <- n[2] - n[1]
  test <- seq(from = min(n), to = max(n), by = step)
  if(length(n) == length(test) &&
     all(n == test)){
    return(TRUE)
  }
  return(FALSE)
}

> .is_continous_evenly_spaced(c(1,2,3,4))
[1] TRUE
> .is_continous_evenly_spaced(c(1,3,4,5))
[1] FALSE
> .is_continous_evenly_spaced(c(1,1.1,1.2,1.3))
[1] FALSE

I expected the results of calls 1 and 2, but not of call 3. Upon investigation 
it turns out that n == test is TRUE for every pair except the pair at 0.2.

The types reported are always double; however, n[2] == 0.1 reports FALSE as well.

The whole problem is solved by switching from all(n == test) to 
all(as.character(n) == as.character(test)). However, that is weird, isn't it?

Does this work as intended? Thanks in advance for any help, advice and 
suggestions.

Best regards,
Felix




[Rd] Segfault when performing match on POSIXlt object

2018-08-31 Thread Marco Giuliano
Hi All,
I found a possible unexpected behavior when performing match/%in% on
POSIXlt objects, e.g. :

  d <- as.POSIXlt('2018-01-01')

  # match(,) --> segfault
  match(0,d)

  # consequently also this fails :
  0 %in% d

REPORTED ERROR ON LINUX:
   *** caught segfault ***
  address 0x16dc2, cause 'memory not mapped'

Verified on 3.5.0 on linux, 3.5.1 on Windows.

I think this could be a bug, since even if that match operation makes no
sense, the R session is not supposed to crash with segmentation fault, but
rather throw an exception.
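
A possible way to sidestep the crash in the meantime, assuming the goal is to match date-times rather than the components of the POSIXlt list: convert to POSIXct first, so that match() works on a classed double vector. This is only an illustration, not a fix that was confirmed in this thread.

```r
d <- as.POSIXlt("2018-01-01", tz = "UTC")

# POSIXct stores seconds since the epoch in a double vector, so match()
# and %in% can compare it directly.
d_ct <- as.POSIXct(d)
target <- as.POSIXct("2018-01-01", tz = "UTC")

match(target, d_ct)   # 1
target %in% d_ct      # TRUE
```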

Thanks in advance



Re: [Rd] Detecting whether a process exists or not by its PID?

2018-08-31 Thread Gábor Csárdi
On Fri, Aug 31, 2018 at 2:51 PM Tomas Kalibera  wrote:
[...]
> kill(sig=0) is specified by POSIX but indeed as you say there is a race
> condition due to PID-reuse.  In principle, detecting that a worker
> process is still alive cannot be done correctly outside base R.

I am not sure why you think so.

> At user-level I would probably consider some watchdog, e.g. the parallel
> tasks would be repeatedly touching a file.

I am pretty sure that there are simpler and better solutions. E.g. one
would be to ask the worker process for its startup time (with as much
precision as possible) and then use the (pid, startup_time) pair as a
unique id.

With this you can check if the process is still running, by checking
that the pid exists, and that its startup time matches.

This is all very simple with the ps package, on Linux, macOS and Windows.
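
A minimal sketch of that idea, assuming the ps package is installed. The functions ps_handle(), ps_pid(), ps_create_time() and ps_is_running() come from ps; the variable names are only illustrative, and the current process stands in for a worker here.

```r
if (requireNamespace("ps", quietly = TRUE)) {
  # In a real setup the worker would report these two values to the master,
  # e.g. via parallel::clusterEvalQ().
  me <- ps::ps_handle()                       # handle for the current process
  id <- list(pid = ps::ps_pid(me), created = ps::ps_create_time(me))

  # Later: rebuild a handle from the (pid, startup time) pair and test it.
  h <- ps::ps_handle(pid = id$pid, time = id$created)
  ps::ps_is_running(h)                        # TRUE while that exact process lives
}
```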

Gabor



Re: [Rd] Detecting whether a process exists or not by its PID?

2018-08-31 Thread Tomas Kalibera

On 08/31/2018 03:13 PM, Gábor Csárdi wrote:

> On Fri, Aug 31, 2018 at 2:51 PM Tomas Kalibera wrote:
> [...]
>> kill(sig=0) is specified by POSIX but indeed as you say there is a race
>> condition due to PID-reuse.  In principle, detecting that a worker
>> process is still alive cannot be done correctly outside base R.
>
> I am not sure why you think so.

To avoid the race with PID re-use one needs access to signal handling, 
to blocking signals, to handling sigchld. system/system2 and 
mcparallel/mccollect in base R use these features and the interaction is 
still safe given the specific use in system/system2 and 
mcparallel/mccollect, yet would have to be re-visited if either of the 
two uses change. These features cannot be safely used outside of base R 
in contributed packages.


Tomas






Re: [Rd] compairing doubles

2018-08-31 Thread Iñaki Ucar
On Fri, 31 Aug 2018 at 15:10, Felix Ernst wrote:
>
> Dear all,
>
> I am a bit unsure whether this qualifies as a bug, but it is definitely
> strange behaviour. That is why I wanted to discuss it.
>
> With the following function, I want to test for evenly spaced numbers,
> starting from anywhere.
>
> .is_continous_evenly_spaced <- function(n){
>   if(length(n) < 2) return(FALSE)
>   n <- n[order(n)]
>   n <- n - min(n)
>   step <- n[2] - n[1]
>   test <- seq(from = min(n), to = max(n), by = step)
>   if(length(n) == length(test) &&
>      all(n == test)){
>     return(TRUE)
>   }
>   return(FALSE)
> }
>
> > .is_continous_evenly_spaced(c(1,2,3,4))
> [1] TRUE
> > .is_continous_evenly_spaced(c(1,3,4,5))
> [1] FALSE
> > .is_continous_evenly_spaced(c(1,1.1,1.2,1.3))
> [1] FALSE
>
> I expected the results of calls 1 and 2, but not of call 3. Upon
> investigation it turns out that n == test is TRUE for every pair except
> the pair at 0.2.
>
> The types reported are always double; however, n[2] == 0.1 reports FALSE
> as well.
>
> The whole problem is solved by switching from all(n == test) to
> all(as.character(n) == as.character(test)). However, that is weird, isn't it?
>
> Does this work as intended? Thanks in advance for any help, advice and
> suggestions.

I guess this has something to do with how the sequence is built and
the inherent error of floating point arithmetic. In fact, if you
return test minus n, you'll get:

[1] 0.000000e+00 0.000000e+00 2.220446e-16 0.000000e+00

and the error gets bigger when you continue the sequence; e.g., this
is for c(1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7):

[1] 0.000000e+00 0.000000e+00 2.220446e-16 2.220446e-16 4.440892e-16
[6] 4.440892e-16 4.440892e-16 0.000000e+00

So, regardless of whether this is considered a bug or not, instead of

length(n) == length(test) && all(n == test)

I would use the following condition:

isTRUE(all.equal(n, test))
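
To see the two conditions side by side on the example from the original post, here is a small illustration. The variable names follow Felix's function; the default tolerance of all.equal() is about 1.5e-8.

```r
n <- c(1, 1.1, 1.2, 1.3)
n <- n - min(n)
test <- seq(from = min(n), to = max(n), by = n[2] - n[1])

n == test                           # element-wise exact comparison of doubles
isTRUE(all.equal(n, test))          # TRUE: equal up to the default tolerance
all.equal(n, test, tolerance = 0)   # describes any remaining difference
```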

Iñaki



Re: [Rd] compairing doubles

2018-08-31 Thread Emil Bode
Agreed that it's rounding error, and all.equal would be the way to go.
I wouldn't call it a bug; it's simply part of working with floating point 
numbers, and any language has the same issue.

And while we're at it, I think the function can be a lot shorter:
.is_continous_evenly_spaced <- function(n){
  length(n)>1 && isTRUE(all.equal(n[order(n)], seq(from=min(n), to=max(n), 
                                                   length.out = length(n))))
}

Cheers, Emil



Re: [Rd] compairing doubles

2018-08-31 Thread Iñaki Ucar
FYI, more fun with floats:

> 0.1+0.1==0.2
[1] TRUE
> 0.1+0.1+0.1+0.1==0.4
[1] TRUE
> 0.1+0.1+0.1==0.3
[1] FALSE
> 0.1+0.1+0.1==0.1*3
[1] TRUE
> 0.3==0.1*3
[1] FALSE

¯\_(ツ)_/¯

But this is not R's fault. See: https://0.30000000000000004.com
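
Printing the values with 17 significant digits shows where the comparisons above go wrong. This is a small illustration added for clarity, using only base R:

```r
sprintf("%.17g", 0.1 * 3)        # "0.30000000000000004"
sprintf("%.17g", 0.3)            # the nearest double to 0.3 is slightly below it
sprintf("%.17g", 0.1 + 0.1)      # same digits as the next line
sprintf("%.17g", 0.2)
isTRUE(all.equal(0.1 * 3, 0.3))  # TRUE: comparison with a tolerance
```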

Iñaki



Re: [Rd] Detecting whether a process exists or not by its PID?

2018-08-31 Thread Gábor Csárdi
On Fri, Aug 31, 2018 at 3:35 PM Tomas Kalibera  wrote:
>
> On 08/31/2018 03:13 PM, Gábor Csárdi wrote:
> > On Fri, Aug 31, 2018 at 2:51 PM Tomas Kalibera  
> > wrote:
> > [...]
> >> kill(sig=0) is specified by POSIX but indeed as you say there is a race
> >> condition due to PID-reuse.  In principle, detecting that a worker
> >> process is still alive cannot be done correctly outside base R.
> > I am not sure why you think so.
> To avoid the race with PID re-use one needs access to signal handling,
> to blocking signals, to handling sigchld. system/system2 and
> mcparallel/mccollect in base R use these features and the interaction is
> still safe given the specific use in system/system2 and
> mcparallel/mccollect, yet would have to be re-visited if either of the
> two uses change. These features cannot be safely used outside of base R
> in contributed packages.

Yes, _in theory_ this is right, and of course this only works for
child processes.

_In practice_, you do not need signal handling. The startup time stamp
method is completely fine, because it is practically impossible to have
two processes with the same pid and the same (high precision) startup
time. This method also works for any process (not just child
processes), so for PSOCK clusters as well.

Gabor

[...]



Re: [Rd] Segfault when performing match on POSIXlt object

2018-08-31 Thread Martin Maechler
> Marco Giuliano 
> on Fri, 31 Aug 2018 08:53:02 +0200 writes:

> Hi All, I found a possible unexpected behavior when
> performing match/%in% on POSIXlt objects, e.g. :

>   d <- as.POSIXlt('2018-01-01')

>   # match(,) --> segfault
>   match(0,d)

>   # consequently also this fails :
>   0 %in% d

> REPORTED ERROR ON LINUX:
>    *** caught segfault ***
>   address 0x16dc2, cause 'memory not mapped'

> Verified on 3.5.0 on linux, 3.5.1 on Windows.

Confirmed (Linux, I think all versions >= 3.4.0, but not in R
3.3.3 or earlier, presumably).

Note the segfault happens in spite of the match_transform() utility,
which explicitly checks for "POSIXlt", and the code comment which
says that "POSIXlt" should have been transformed to character; 
this seems to fail in recent versions of R.

> I think this could be a bug, since even if that match
> operation makes no sense, the R session is not supposed to
> crash with segmentation fault, but rather throw an
> exception.

Definitely.  It is a bug.

> Thanks in advance

Thank you for reporting!

Martin Maechler, ETH Zurich



Re: [Rd] compairing doubles

2018-08-31 Thread Mark van der Loo
how about

is_evenly_spaced <- function(x,...) all.equal(diff(sort(x)),...)

(use ellipsis to set tolerance if necessary)
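
all.equal() compares a target with a current value, so a runnable reading of this one-liner needs a second vector. A minimal sketch, assuming the intent is to compare the successive differences with their mean (that reading is an assumption, not something stated in the thread):

```r
is_evenly_spaced <- function(x, ...) {
  d <- diff(sort(x))                       # gaps between neighbouring values
  # Evenly spaced means every gap equals the mean gap, up to the tolerance
  # that all.equal() applies (pass `tolerance = ` through `...` to change it).
  length(d) > 0L && isTRUE(all.equal(d, rep(mean(d), length(d)), ...))
}

is_evenly_spaced(c(1, 2, 3, 4))          # TRUE
is_evenly_spaced(c(1, 3, 4, 5))          # FALSE
is_evenly_spaced(c(1, 1.1, 1.2, 1.3))    # TRUE
```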




Re: [Rd] compairing doubles

2018-08-31 Thread Mark van der Loo
Sorry for the second e-mail: this is worth watching:
https://www.youtube.com/watch?v=3Bu7QUxzIbA&t=1s
It's Martin Maechler's talk at useR!2018. This kind of stuff should be
mandatory material for any aspiring programmer/data scientist/statistician.

-Mark






Re: [Rd] Segfault when performing match on POSIXlt object

2018-08-31 Thread Martin Maechler
> Martin Maechler  on Fri, 31 Aug 2018 16:00:07 +0200 writes:

> Marco Giuliano on Fri, 31 Aug 2018 08:53:02 +0200 writes:

>> Hi All, I found a possible unexpected behavior when
>> performing match/%in% on POSIXlt objects, e.g. :

>> d <- as.POSIXlt('2018-01-01')

>> # match(,) --> segfault
>> match(0,d)

>> # consequently also this fails :

>> 0 %in% d

>> REPORTED ERROR ON LINUX: *** caught segfault *** address
>> 0x16dc2, cause 'memory not mapped'

>> Verified on 3.5.0 on linux, 3.5.1 on Windows.

> Confirmed (Linux; I think all versions >= 3.4.0, but not in
> R 3.3.3 or earlier, presumably).

ooops that was an offset error:

  Bug in all versions >= 3.3.3,
  but not in 3.2.5 (or earlier, presumably)
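
A possible interim workaround, sketched here as an assumption rather than
anything proposed in this thread, is to convert the POSIXlt object to POSIXct
before matching, so that match() only ever sees an ordinary atomic vector. The
variable names are illustrative.

```r
# POSIXlt stores a date-time as a list of components;
# POSIXct stores it as a plain numeric vector.
d <- as.POSIXlt("2018-01-01", tz = "UTC")
d_ct <- as.POSIXct(d)

target <- as.POSIXct("2018-01-01", tz = "UTC")
pos <- match(target, d_ct)   # position of target in d_ct
found <- target %in% d_ct

# match(0, d)  # the reported call; left commented out because it may segfault
```

This avoids the affected code path; it does not address the underlying bug.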

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Detecting whether a process exists or not by its PID?

2018-08-31 Thread luke-tierney

On Fri, 31 Aug 2018, Gábor Csárdi wrote:


On Fri, Aug 31, 2018 at 3:35 PM Tomas Kalibera  wrote:


On 08/31/2018 03:13 PM, Gábor Csárdi wrote:

On Fri, Aug 31, 2018 at 2:51 PM Tomas Kalibera  wrote:
[...]

kill(sig=0) is specified by POSIX but indeed as you say there is a race
condition due to PID-reuse.  In principle, detecting that a worker
process is still alive cannot be done correctly outside base R.

I am not sure why you think so.

To avoid the race with PID re-use one needs access to signal handling,
to blocking signals, to handling sigchld. system/system2 and
mcparallel/mccollect in base R use these features and the interaction is
still safe given the specific use in system/system2 and
mcparallel/mccollect, yet would have to be re-visited if either of the
two uses change. These features cannot be safely used outside of base R
in contributed packages.


Yes, _in theory_ this is right, and of course this only works for
child processes.

_In practice_, you do not need signal handling. The startup time stamp
method is completely fine, because it is practically impossible to have
two processes with the same pid and the same (high precision) startup
time. This method also works for any process (not just child processes),
so for PSOCK clusters as well.


PSOCK workers may not be running on the same host as the master process.
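
For reference, the kill(sig=0) check discussed above can be sketched from R
with tools::pskill(). This is a minimal sketch under stated assumptions: the
helper name pid_exists is hypothetical, the check is POSIX-only, it can only
see processes on the local host, it reports FALSE for processes the caller has
no permission to signal, and it remains subject to the PID-reuse race described
above.

```r
# Hypothetical helper: TRUE if signal 0 can be delivered to 'pid'.
# Signal 0 performs the existence/permission check without sending anything.
pid_exists <- function(pid) {
  # Guard: signal-0 semantics are POSIX-specific; do not try this on Windows.
  stopifnot(.Platform$OS.type == "unix")
  isTRUE(tools::pskill(pid, signal = 0L))
}

alive <- pid_exists(Sys.getpid())   # the current R process certainly exists
```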

Best,

luke



Gabor

[...]


--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] compairing doubles

2018-08-31 Thread Iñaki Ucar
On Fri, 31 Aug 2018 at 16:00, Mark van der Loo wrote:
>
> how about
>
> is_evenly_spaced <- function(x,...) all.equal(diff(sort(x)),...)

This doesn't work, because

1. all.equal does *not* return FALSE. Use of isTRUE or identical(.,
TRUE) is required if you want a boolean.
2. all.equal compares two objects, not elements in a vector.
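
Both points can be illustrated with a short sketch. The function name
is_evenly_spaced_ae and the idea of comparing the gaps against a constant
vector are assumptions made for illustration, not something proposed in this
thread.

```r
# Point 1: on a mismatch all.equal() returns a character description, not FALSE.
res <- all.equal(1, 1.1)
is.character(res)   # TRUE
isTRUE(res)         # FALSE

# Point 2: all.equal() needs two objects, so compare the gaps against a
# vector in which every gap equals the first one.
is_evenly_spaced_ae <- function(x, ...) {
  gaps <- diff(sort(x))
  length(gaps) > 0L &&
    isTRUE(all.equal(gaps, rep(gaps[1L], length(gaps)), ...))
}

is_evenly_spaced_ae(c(1, 1.1, 1.2, 1.3))   # TRUE
is_evenly_spaced_ae(c(1, 3, 4, 5))         # FALSE
```

The ellipsis is forwarded to all.equal(), so a tolerance can still be supplied
by the caller.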

Iñaki

>
> (use ellipsis to set tolerance if necessary)
>
>
> On Fri, 31 Aug 2018 at 15:46, Emil Bode wrote:
>>
>> Agreed that it's rounding error, and all.equal would be the way to go.
>> I wouldn't call it a bug; it's simply part of working with floating point
>> numbers, and any language has the same issue.
>>
>> And while we're at it, I think the function can be a lot shorter:
>> .is_continous_evenly_spaced <- function(n){
>>   length(n)>1 && isTRUE(all.equal(n[order(n)],
>>     seq(from=min(n), to=max(n), length.out = length(n))))
>> }
>>
>> Cheers, Emil
>>
>> On Fri, 31 Aug 2018 at 15:10, Felix Ernst wrote:
>> >
>> > Dear all,
>> >
>> > I am a bit unsure whether this qualifies as a bug, but it is definitely
>> > strange behaviour. That is why I wanted to discuss it.
>> >
>> > With the following function, I want to test for evenly spaced numbers,
>> > starting from anywhere.
>> >
>> > .is_continous_evenly_spaced <- function(n){
>> >   if(length(n) < 2) return(FALSE)
>> >   n <- n[order(n)]
>> >   n <- n - min(n)
>> >   step <- n[2] - n[1]
>> >   test <- seq(from = min(n), to = max(n), by = step)
>> >   if(length(n) == length(test) &&
>> >      all(n == test)){
>> >     return(TRUE)
>> >   }
>> >   return(FALSE)
>> > }
>> >
>> > > .is_continous_evenly_spaced(c(1,2,3,4))
>> > [1] TRUE
>> > > .is_continous_evenly_spaced(c(1,3,4,5))
>> > [1] FALSE
>> > > .is_continous_evenly_spaced(c(1,1.1,1.2,1.3))
>> > [1] FALSE
>> >
>> > I expect the results for 1 and 2, but not for 3. Upon investigation, it
>> > turns out that n == test is TRUE for every pair except the pair at 0.2.
>> >
>> > The types reported are always double; however, n[2] == 0.1 reports
>> > FALSE as well.
>> >
>> > The whole problem is solved by switching from all(n == test) to
>> > all(as.character(n) == as.character(test)). However, that is weird, isn't it?
>> >
>> > Does this work as intended? Thanks in advance for any help, advice and
>> > suggestions.
>>
>> I guess this has something to do with how the sequence is built and
>> the inherent error of floating point arithmetic. In fact, if you
>> return test minus n, you'll get:
>>
>> [1] 0.000000e+00 0.000000e+00 2.220446e-16 0.000000e+00
>>
>> and the error gets bigger when you continue the sequence; e.g., this
>> is for c(1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7):
>>
>> [1] 0.000000e+00 0.000000e+00 2.220446e-16 2.220446e-16 4.440892e-16
>> [6] 4.440892e-16 4.440892e-16 0.000000e+00
>>
>> So, independently of whether this is considered a bug or not, instead of
>>
>> length(n) == length(test) && all(n == test)
>>
>> I would use the following condition:
>>
>> isTRUE(all.equal(n, test))
>>
>> Iñaki
>>
>> >
>> > Best regards,
>> > Felix
>> >
>> >



-- 
Iñaki Ucar

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] compairing doubles

2018-08-31 Thread Mark van der Loo
Ah, my bad, you're right of course.

sum(abs(diff(diff(sort(x))))) < eps

for some reasonable eps then, would do as a oneliner, or

all(abs(diff(diff(sort(x)))) < eps)

or

max(abs(diff(diff(sort(x))))) < eps
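
The three one-liners above all test the second differences of the sorted
values, which are zero for an exactly even grid. A small sketch, where the
function name evenly_spaced_d2 and the default eps of 1e-8 are assumptions
chosen for illustration:

```r
# Second differences of the sorted input are (near) zero for an even grid.
evenly_spaced_d2 <- function(x, eps = 1e-8) {
  d2 <- diff(diff(sort(x)))
  all(abs(d2) < eps)
}

evenly_spaced_d2(c(1, 1.1, 1.2, 1.3))   # TRUE
evenly_spaced_d2(c(1, 3, 4, 5))         # FALSE
```

Note that eps is an absolute tolerance, so it should be chosen relative to the
scale of the data. With fewer than three values the second differences are
empty: the sum() and all() variants then return TRUE, while max() warns.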


-Mark

On Fri, 31 Aug 2018 at 16:14, Iñaki Ucar wrote:

> On Fri, 31 Aug 2018 at 16:00, Mark van der Loo wrote:
> >
> > how about
> >
> > is_evenly_spaced <- function(x,...) all.equal(diff(sort(x)),...)
>
> This doesn't work, because
>
> 1. all.equal does *not* return FALSE. Use of isTRUE or identical(.,
> TRUE) is required if you want a boolean.
> 2. all.equal compares two objects, not elements in a vector.
>
> Iñaki
>
> [...]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Segfault when performing match on POSIXlt object

2018-08-31 Thread Marco Giuliano
Hi Martin,
should I file a formal bug report somewhere, or have you already done it?

On Fri, Aug 31, 2018 at 4:04 PM Martin Maechler 
wrote:

> > Martin Maechler  on Fri, 31 Aug 2018 16:00:07 +0200 writes:
>
> > Marco Giuliano on Fri, 31 Aug 2018 08:53:02 +0200 writes:
>
> >> Hi All, I found a possible unexpected behavior when
> >> performing match/%in% on POSIXlt objects, e.g. :
>
> >> d <- as.POSIXlt('2018-01-01')
>
> >> # match(,) --> segfault
> >> match(0,d)
>
> >> # consequently also this fails :
>
> >> 0 %in% d
>
> >> REPORTED ERROR ON LINUX: *** caught segfault *** address
> >> 0x16dc2, cause 'memory not mapped'
>
> >> Verified on 3.5.0 on linux, 3.5.1 on Windows.
>
> > Confirmed (Linux; I think all versions >= 3.4.0, but not in
> > R 3.3.3 or earlier, presumably).
>
> ooops that was an offset error:
>
>   Bug in all versions >= 3.3.3,
>   but not in 3.2.5 (or earlier, presumably)
>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] compairing doubles

2018-08-31 Thread Serguei Sokol

On 31/08/2018 at 16:25, Mark van der Loo wrote:

> Ah, my bad, you're right of course.
>
> sum(abs(diff(diff(sort(x))))) < eps
>
> for some reasonable eps then, would do as a oneliner, or
>
> all(abs(diff(diff(sort(x)))) < eps)
>
> or
>
> max(abs(diff(diff(sort(x))))) < eps

Or with only four function calls:
diff(range(diff(sort(x)))) < eps
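
The four-call variant compares the largest and smallest gap directly. A short
sketch, where the function name evenly_spaced_range and the default eps are
assumptions for illustration:

```r
# diff(range(gaps)) is max(gaps) - min(gaps): zero when all gaps are equal.
evenly_spaced_range <- function(x, eps = 1e-8) {
  gaps <- diff(sort(x))
  diff(range(gaps)) < eps
}

evenly_spaced_range(c(1, 1.1, 1.2, 1.3))   # TRUE
evenly_spaced_range(c(1, 3, 4, 5))         # FALSE
```

As with the other one-liners, eps is an absolute tolerance, and the input needs
at least two values for range() to have anything to work on.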

Serguei.



[...]



--
Serguei Sokol
Ingenieur de recherche INRA

Cellule mathématiques
LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 62 25 01 27
email: so...@insa-toulouse.fr
http://www.lisbp.fr

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] compairing doubles

2018-08-31 Thread Marc Schwartz via R-devel



> On Aug 31, 2018, at 9:36 AM, Iñaki Ucar  wrote:
> 
> On Fri, 31 Aug 2018 at 15:10, Felix Ernst wrote:
>>
>> Dear all,
>>
>> I am a bit unsure whether this qualifies as a bug, but it is definitely
>> strange behaviour. That is why I wanted to discuss it.
>>
>> With the following function, I want to test for evenly spaced numbers,
>> starting from anywhere.
>>
>> .is_continous_evenly_spaced <- function(n){
>>   if(length(n) < 2) return(FALSE)
>>   n <- n[order(n)]
>>   n <- n - min(n)
>>   step <- n[2] - n[1]
>>   test <- seq(from = min(n), to = max(n), by = step)
>>   if(length(n) == length(test) &&
>>      all(n == test)){
>>     return(TRUE)
>>   }
>>   return(FALSE)
>> }
>>
>> > .is_continous_evenly_spaced(c(1,2,3,4))
>> [1] TRUE
>> > .is_continous_evenly_spaced(c(1,3,4,5))
>> [1] FALSE
>> > .is_continous_evenly_spaced(c(1,1.1,1.2,1.3))
>> [1] FALSE
>>
>> I expect the results for 1 and 2, but not for 3. Upon investigation, it
>> turns out that n == test is TRUE for every pair except the pair at 0.2.
>>
>> The types reported are always double; however, n[2] == 0.1 reports FALSE
>> as well.
>>
>> The whole problem is solved by switching from all(n == test) to
>> all(as.character(n) == as.character(test)). However, that is weird, isn't it?
>>
>> Does this work as intended? Thanks in advance for any help, advice and
>> suggestions.
> 
> I guess this has something to do with how the sequence is built and
> the inherent error of floating point arithmetic. In fact, if you
> return test minus n, you'll get:
> 
> [1] 0.000000e+00 0.000000e+00 2.220446e-16 0.000000e+00
>
> and the error gets bigger when you continue the sequence; e.g., this
> is for c(1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7):
>
> [1] 0.000000e+00 0.000000e+00 2.220446e-16 2.220446e-16 4.440892e-16
> [6] 4.440892e-16 4.440892e-16 0.000000e+00
>
> So, independently of whether this is considered a bug or not, instead of
> 
> length(n) == length(test) && all(n == test)
> 
> I would use the following condition:
> 
> isTRUE(all.equal(n, test))
> 
> Iñaki
> 
>> 
>> Best regards,
>> Felix


Hi,

This is essentially FAQ 7.31:

  https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

Review that and the references therein to gain some insights into binary 
representations of floating point numbers.

Rather than the more complicated code you have above, try the following:

evenlyspaced <- function(x) {
  gaps <- diff(sort(x))
  all(gaps[-1] == gaps[1])
}

Note the use of ?diff:

> diff(c(1, 2, 3, 4))
[1] 1 1 1

> diff(c(1, 3, 4, 5))
[1] 2 1 1

> diff(c(1, 1.1, 1.2, 1.3))
[1] 0.1 0.1 0.1

However, in reality, due to the floating point representation issues noted 
above:

> print(diff(c(1, 1.1, 1.2, 1.3)), 20)
[1] 0.100000000000000088818 0.099999999999999866773
[3] 0.100000000000000088818

So the differences between the numbers are not exactly 0.1.
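
The effect can be checked directly. The sketch below is illustrative only; the
variable names and the tolerance of 1e-9 are assumptions.

```r
x <- c(1, 1.1, 1.2, 1.3)
gaps <- diff(x)

exact  <- gaps[1] == gaps[2]                    # FALSE: the doubles differ
approx <- abs(gaps[1] - gaps[2]) < 1e-9         # TRUE: equal within tolerance
ae     <- isTRUE(all.equal(gaps[1], gaps[2]))   # TRUE: all.equal's default tolerance

# Show the stored values to 21 decimal places.
sprintf("%.21f", gaps)
```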

Using the function above, you get:

> evenlyspaced(c(1, 2, 3, 4))
[1] TRUE

> evenlyspaced(c(1, 3, 4, 5))
[1] FALSE

> evenlyspaced(c(1, 1.1, 1.2, 1.3))
[1] FALSE

As has been noted, if you want the gap comparison to be based upon some margin 
of error, use ?all.equal rather than the explicit equals comparison that I have 
in the function above. Something along the lines of:

evenlyspaced <- function(x) {
  gaps <- diff(sort(x))
  all(sapply(gaps[-1], function(g) isTRUE(all.equal(g, gaps[1]))))
}

In which case, you now get:

> evenlyspaced(c(1, 1.1, 1.2, 1.3))
[1] TRUE


Regards,

Marc Schwartz


[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] compairing doubles

2018-08-31 Thread Iñaki Ucar
On Fri, 31 Aug 2018 at 17:08, Serguei Sokol wrote:
>
> On 31/08/2018 at 16:25, Mark van der Loo wrote:
> > Ah, my bad, you're right of course.
> >
> > sum(abs(diff(diff(sort(x))))) < eps
> >
> > for some reasonable eps then, would do as a oneliner, or
> >
> > all(abs(diff(diff(sort(x)))) < eps)
> >
> > or
> >
> > max(abs(diff(diff(sort(x))))) < eps
> Or with only four function calls:
> diff(range(diff(sort(x)))) < eps

We may have a winner... :)

Iñaki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Segfault when performing match on POSIXlt object

2018-08-31 Thread Martin Maechler
> Marco Giuliano 
> on Fri, 31 Aug 2018 16:50:56 +0200 writes:

> Hi Martin, should I file a formal bug report somewhere, or
> have you already done it?

No, I haven't, 
and as I may not address this bug further myself (in the near
future), it may be best if you file a formal report.

I will create an account for you on R's bugzilla - you will be
notified and can update your initial pseudo-random password.

Best,
Martin

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] svg ignores cex.axis in R3.5.1 on macOS

2018-08-31 Thread Spencer Graves
  Plots produced using svg in R 3.5.1 under macOS 10.13.6 ignore
cex.axis=2.  Consider the following:



> plot(1:2, cex.axis=2)
> svg('svg_ignores_cex.axis.svg')
> plot(1:2, cex.axis=2)
> dev.off()
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib


locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods base

loaded via a namespace (and not attached):
[1] compiler_3.5.1


  ** The axis labels are appropriately expanded with the first 
"plot(1:2, cex.axis=2)".  However, when I wrote that to an svg file and 
opened it in other applications (GIMP and Safari), the cex.axis request 
was ignored.  This also occurred inside RStudio on my Mac. It worked 
properly using R 3.2.1 under Windows 7.



  Thanks,
  Spencer Graves

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Argument 'dim' misspelled in error message

2018-08-31 Thread Hervé Pagès

Hi,

The following error message misspells the name of
the 'dim' argument:

  > array(integer(0), dim=integer(0))
  Error in array(integer(0), dim = integer(0)) :
'dims' cannot be of length 0

The name of the argument is 'dim' not 'dims':

  > args(array)
  function (data = NA, dim = length(data), dimnames = NULL)
  NULL
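
For anyone wanting to check this on their own installation, the message and
the formal argument names can be inspected as follows. This is a sketch only;
the exact message wording depends on the R version in use.

```r
# Capture the error message instead of stopping.
msg <- tryCatch(
  array(integer(0), dim = integer(0)),
  error = function(e) conditionMessage(e)
)
msg

# The formal arguments of array(): the dimension argument is named 'dim'.
arg_names <- names(formals(array))
arg_names
```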

Cheers,
H.

--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] svg ignores cex.axis in R3.5.1 on macOS

2018-08-31 Thread Spencer Graves




On 2018-08-31 14:21, Spencer Graves wrote:
Plots produced using svg in R 3.5.1 under macOS 10.13.6 ignore
cex.axis=2.  Consider the following:



> plot(1:2, cex.axis=2)
> svg('svg_ignores_cex.axis.svg')
> plot(1:2, cex.axis=2)
> dev.off()
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib


locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods base

loaded via a namespace (and not attached):
[1] compiler_3.5.1


  ** The axis labels are appropriately expanded with the first 
"plot(1:2, cex.axis=2)".  However, when I wrote that to an svg file 
and opened it in other applications (GIMP and Safari), the cex.axis 
request was ignored.  This also occurred inside RStudio on my Mac. It 
worked properly using R 3.2.1 under Windows 7.



I just confirmed that when I created a file like this under Windows 7 
and brought it back to my Mac, it displayed fine.  I have not tried this 
with the current version of R under Windows 7 nor an old version of R on 
my Mac.  Thanks.  Spencer



  Thanks,
  Spencer Graves


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Segfault when performing match on POSIXlt object

2018-08-31 Thread Marco Giuliano
Bug report submitted :
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17459
Thanks!

On Fri, Aug 31, 2018 at 6:48 PM Martin Maechler 
wrote:

> > Marco Giuliano
> > on Fri, 31 Aug 2018 16:50:56 +0200 writes:
>
> > Hi Martin, should I file a formal bug report somewhere, or
> > have you already done it?
>
> No, I haven't,
> and as I may not address this bug further myself (in the near
> future), it may be best if you file a formal report.
>
> I will create an account for you on R's bugzilla - you will be
> notified and can update your initial pseudo-random password.
>
> Best,
> Martin
>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

2018-08-31 Thread Henrik Bengtsson
Thanks all for a great discussion.

I think we can introduce assertions for length(x) <= 1 (and produce a
warning/error if not) without changing the value of these &&/||
expressions.

In R 3.4.0, '_R_CHECK_LENGTH_1_CONDITION_=true' was introduced to turn
warnings on "the condition has length > 1 and only the first element
will be used" in cases like 'if (c(TRUE, TRUE)) 42'  into errors.  The
idea is to later make '_R_CHECK_LENGTH_1_CONDITION_=true' the new
default.  I guess, someday this will always produce an error.

Similarly, the test for this &&/|| issue could be controlled by
'_R_CHECK_LENGTH_1_LOGICAL_OPS_=warn' and
'_R_CHECK_LENGTH_1_LOGICAL_OPS_=err' and possibly have
'_R_CHECK_LENGTH_1_LOGICAL_OPS_=true' default to 'warn' and later
'err'.

Changing the behavior of cases where length(x) == 0 is more likely to
break *some* code out there, and might require a separate
discussion/set of validations.  It's not unlikely that someone
actually relied on this to resolve to NA.  BTW, since it hasn't been
explicitly said, it's "logical" that we have TRUE && logical(0)
resolving to NA, because it currently behaves as TRUE[1] &&
logical(0)[1], which resolves to TRUE && NA => NA.  If a decision on
the zero-length case would delay fixing the length(x) > 1 case, I
would postpone the decision on the former.
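
To make the proposal concrete, here is a user-level sketch of the assertion
described above. The helper name strict_and is hypothetical and this is not
how the check would be implemented inside R itself; it only mimics the
intended semantics: error on length > 1, keep short-circuiting, and keep NA
for zero-length input.

```r
# Hypothetical helper mimicking '&&' with a length(x) <= 1 assertion.
strict_and <- function(x, y) {
  if (length(x) > 1L)
    stop("'x' has length ", length(x), "; expected at most 1")
  x1 <- as.logical(x)[1L]                   # logical(0)[1] is NA, as noted above
  if (identical(x1, FALSE)) return(FALSE)   # short-circuit: 'y' is never evaluated
  if (length(y) > 1L)
    stop("'y' has length ", length(y), "; expected at most 1")
  x1 && as.logical(y)[1L]
}

strict_and(TRUE, TRUE)                     # TRUE
strict_and(FALSE, stop("not evaluated"))   # FALSE, right-hand side untouched
strict_and(TRUE, logical(0))               # NA, via TRUE && logical(0)[1]
```

Because function arguments are evaluated lazily, the right-hand side is only
checked when it is actually needed, which matches the asymmetric behaviour
discussed earlier in the thread.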

/Henrik

On Fri, Aug 31, 2018 at 2:48 AM Emil Bode  wrote:
>
>
> On 30/08/2018, 20:15, "R-devel on behalf of Hadley Wickham" 
>  wrote:
>
> On Thu, Aug 30, 2018 at 10:58 AM Martin Maechler
>  wrote:
> >
> > > Joris Meys
> > > on Thu, 30 Aug 2018 14:48:01 +0200 writes:
> >
> > > On Thu, Aug 30, 2018 at 2:09 PM Dénes Tóth
> > >  wrote:
> > >> Note that `||` and `&&` have never been symmetric:
> > >>
> > >> TRUE || stop() # returns TRUE stop() || TRUE # returns an
> > >> error
> > >>
> > >>
> > > Fair point. So the suggestion would be to check whether x
> > > is of length 1 and whether y is of length 1 only when
> > > needed. I.e.
> >
> > > c(TRUE,FALSE) || TRUE
> >
> > > would give an error and
> >
> > > TRUE || c(TRUE, FALSE)
> >
> > > would pass.
> >
> > > Thought about it a bit more, and I can't come up with a
> > > use case where the first line must pass. So if the short
> > > circuiting remains and the extra check only gives a small
> > > performance penalty, adding the error could indeed make
> > > some bugs more obvious.
> >
> > I agree "in theory".
> > Thank you, Henrik, for bringing it up!
> >
> > In practice I think we should start having a warning signalled.
> > I have checked the source code in the mean time, and the check
> > is really very cheap
> > { because it can/should be done after checking isNumber(): so
> >   then we know we have an atomic and can use XLENGTH() }
> >
> >
> > The 0-length case I don't think we should change as I do find
> > NA (is logical!) to be an appropriate logical answer.
>
> Can you explain your reasoning a bit more here? I'd like to understand
> the general principle, because from my perspective it's more
> parsimonious to say that the inputs to || and && must be length 1,
> rather than to say that inputs could be length 0 or length 1, and in
> the length 0 case they are replaced with NA.
>
> Hadley
>
> I would say the value NA would cause warnings later on, that are easy to 
> track down, so a return of NA is far less likely to cause problems than an 
> unintended TRUE or FALSE. And I guess there would be some code reliant on 
> 'logical(0) || TRUE' returning TRUE, that wouldn't necessarily be a mistake.
>
> But I think it's hard to predict how exactly people are using functions. I 
> personally can't imagine a situation where I'd use || or && outside an 
> if-statement, so I'd rather have the current behaviour, because I'm not sure 
> if I'm reliant on logical(0) || TRUE  somewhere in my code (even though that 
> would be ugly code, it's not wrong per se)
> But I could always rewrite it, so I believe it's more a question of how much 
> would have to be rewritten. Maybe implement it first in devel, to see how 
> many people would complain?
>
> Emil Bode
>
>
>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel