Re: [Rd] plotmath degree symbol

2018-08-30 Thread Martin Møller Skarbiniks Pedersen
On Fri, 24 Aug 2018 at 19:53, Edzer Pebesma
 wrote:
>
> In plotmath expressions, R's degree symbol, e.g. shown by
>
> plot(1, main = parse(text = "1*degree*C"))
>
> has sunk to halfway the text line, instead of touching its top. In older
> R versions this looked much better.

I can confirm this problem.

R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.5.1

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] build package with unicode (farsi) strings

2018-08-30 Thread Thierry Onkelinx
Dear Farid,

Try using the ASCII notation. letters_fa <- c("\u0627", "\u0641"). The full
code table is available at https://www.utf8-chartable.de

Best regards,



ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkel...@inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
///



2018-08-28 7:17 GMT+02:00 Faridedin Cheraghi :

> Hi,
>
> I have a R script file with Persian letters in it defined as a variable:
>
> #' @export
> letters_fa <- c('الف','ب','پ','ت','ث','ج','چ','ح','خ','ر','ز','د')
>
> I have specified the encoding field in my DESCRIPTION file of my package.
>
> ...
> Encoding: UTF-8
> ...
>
> I also included Sys.setlocale(locale="Persian") in my .RProfile, so it is
> executed when RCMD is called. However, after a BUILD and INSTALL, when I
> access the variable from the package, the characters are not printed
> correctly:
> > futils::letters_fa
>  [1] "<84><81>" "" ""
>"" ""
>  [6] "" "<86>" ""
>"" ""
> [11] "" ""
>
>
> thanks
> Farid
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

2018-08-30 Thread Emil Bode
I have to disagree, I think one of the advantages of '||' (or &&) is the lazy 
evaluation, i.e. you can use the first condition to "not care" about the second 
(and stop errors from being thrown).
So if I want to check if x is a length-one numeric with value a value between 0 
and 1, I can do 'class(x)=='numeric' && length(x)==1 && x>0 && x<1'.
In your proposal, having x=c(1,2) would throw an error or multiple warnings.
Also code that relies on the second argument not being evaluated would break, 
as we need to evaluate y in order to know length(y)
There may be some benefit in checking for length(x) only, though that could 
also cause some false positives (e.g. 'x==-1 || length(x)==0' would be a bit 
ugly, but not necessarily wrong, same for someone too lazy to write x[1] 
instead of x).

And I don’t really see the advantage. The casting to length one is (I think), a 
feature, not a bug. If I have/need a length one x, and a length one y, why not 
use '|' and '&'? I have to admit I only use them in if-statements, and if I 
need an error to be thrown when x and y are not length one, I can use the 
shorter versions and then the if throws a warning (or an error for a length-0 
or NA result).

I get it that for someone just starting in R, the differences between | and || 
can be confusing, but I guess that's just the price to pay for having a 
vectorized language.

Best regards, 
Emil Bode
 
Data-analyst
 
+31 6 43 83 89 33
emil.b...@dans.knaw.nl
 
DANS: Netherlands Institute for Permanent Access to Digital Research Resources
Anna van Saksenlaan 51 | 2593 HW Den Haag | +31 70 349 44 50 | 
i...@dans.knaw.nl  | dans.knaw.nl 

DANS is an institute of the Dutch Academy KNAW  and funding 
organisation NWO . 

On 29/08/2018, 05:03, "R-devel on behalf of Henrik Bengtsson" 
 wrote:

# Issue

'x || y' performs 'x[1] || y' for length(x) > 1.  For instance (here
using R 3.5.1),

> c(TRUE, TRUE) || FALSE
[1] TRUE
> c(TRUE, FALSE) || FALSE
[1] TRUE
> c(TRUE, NA) || FALSE
[1] TRUE
> c(FALSE, TRUE) || FALSE
[1] FALSE

This property is symmetric in LHS and RHS (i.e. 'y || x' behaves the
same) and it also applies to 'x && y'.

Note also how the above truncation of 'x' is completely silent -
there's neither an error nor a warning being produced.


# Discussion/Suggestion

Using 'x || y' and 'x && y' with a non-scalar 'x' or 'y' is likely a
mistake.  Either the code is written assuming 'x' and 'y' are scalars,
or there is a coding error and vectorized versions 'x | y' and 'x & y'
were intended.  Should 'x || y' always be considered an mistake if
'length(x) != 1' or 'length(y) != 1'?  If so, should it be a warning
or an error?  For instance,
'''r
> x <- c(TRUE, TRUE)
> y <- FALSE
> x || y

Error in x || y : applying scalar operator || to non-scalar elements
Execution halted

What about the case where 'length(x) == 0' or 'length(y) == 0'?  Today
'x || y' returns 'NA' in such cases, e.g.

> logical(0) || c(FALSE, NA)
[1] NA
> logical(0) || logical(0)
[1] NA
> logical(0) && logical(0)
[1] NA

I don't know the background for this behavior, but I'm sure there is
an argument behind that one.  Maybe it's simply that '||' and '&&'
should always return a scalar logical and neither TRUE nor FALSE can
be returned.

/Henrik

PS. This is in the same vein as
https://mailman.stat.ethz.ch/pipermail/r-devel/2017-March/073817.html
- in R (>=3.4.0) we now get that if (1:2 == 1) ... is an error if
_R_CHECK_LENGTH_1_CONDITION_=true

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

2018-08-30 Thread Joris Meys
I have to agree with Emil here. && and || are short circuited like in C and
C++. That means that

TRUE || c(TRUE, FALSE)
FALSE && c(TRUE, FALSE)

cannot give an error because the second part is never evaluated. Throwing a
warning or error for

c(TRUE, FALSE) || TRUE

would mean that the operator gives a different result depending on the
order of the objects, breaking the symmetry. Also that would be undesirable.

Regarding logical(0): per the documentation, it is indeed so that ||, &&
and isTRUE always return a length-one logical vector. Hence the NA.

On a sidenote: there is no such thing as a scalar in R. What you call
scalar, is really a length-one vector. That seems like a detail, but is
important in understanding why this admittedly confusing behaviour actually
makes sense within the framework of R imho. I do understand your objections
and suggestions, but it would boil down to removing short circuited
operators from R.

My 2 cents.
Cheers
Joris

On Wed, Aug 29, 2018 at 5:03 AM Henrik Bengtsson 
wrote:

> # Issue
>
> 'x || y' performs 'x[1] || y' for length(x) > 1.  For instance (here
> using R 3.5.1),
>
> > c(TRUE, TRUE) || FALSE
> [1] TRUE
> > c(TRUE, FALSE) || FALSE
> [1] TRUE
> > c(TRUE, NA) || FALSE
> [1] TRUE
> > c(FALSE, TRUE) || FALSE
> [1] FALSE
>
> This property is symmetric in LHS and RHS (i.e. 'y || x' behaves the
> same) and it also applies to 'x && y'.
>
> Note also how the above truncation of 'x' is completely silent -
> there's neither an error nor a warning being produced.
>
>
> # Discussion/Suggestion
>
> Using 'x || y' and 'x && y' with a non-scalar 'x' or 'y' is likely a
> mistake.  Either the code is written assuming 'x' and 'y' are scalars,
> or there is a coding error and vectorized versions 'x | y' and 'x & y'
> were intended.  Should 'x || y' always be considered an mistake if
> 'length(x) != 1' or 'length(y) != 1'?  If so, should it be a warning
> or an error?  For instance,
> '''r
> > x <- c(TRUE, TRUE)
> > y <- FALSE
> > x || y
>
> Error in x || y : applying scalar operator || to non-scalar elements
> Execution halted
>
> What about the case where 'length(x) == 0' or 'length(y) == 0'?  Today
> 'x || y' returns 'NA' in such cases, e.g.
>
> > logical(0) || c(FALSE, NA)
> [1] NA
> > logical(0) || logical(0)
> [1] NA
> > logical(0) && logical(0)
> [1] NA
>
> I don't know the background for this behavior, but I'm sure there is
> an argument behind that one.  Maybe it's simply that '||' and '&&'
> should always return a scalar logical and neither TRUE nor FALSE can
> be returned.
>
> /Henrik
>
> PS. This is in the same vein as
> https://mailman.stat.ethz.ch/pipermail/r-devel/2017-March/073817.html
> - in R (>=3.4.0) we now get that if (1:2 == 1) ... is an error if
> _R_CHECK_LENGTH_1_CONDITION_=true
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


-- 
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)


---
Biowiskundedagen 2017-2018
http://www.biowiskundedagen.ugent.be/

---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

2018-08-30 Thread Dénes Tóth

Hi,

I absolutely second Henrik's suggestion.

On 08/30/2018 01:09 PM, Emil Bode wrote:

I have to disagree, I think one of the advantages of '||' (or &&) is the lazy evaluation, 
i.e. you can use the first condition to "not care" about the second (and stop errors 
from being thrown).


I do not think Henrik's proposal implies that both arguments of `||` or 
`&&` should be evaluated before the evaluation of the condition. It 
implies that if an argument is evaluated, and its length does not equal 
one, it should return an error instead of the silent truncation of the 
argument.

So your argument is orthogonal to the issue.


So if I want to check if x is a length-one numeric with value a value between 0 and 1, I can do 
'class(x)=='numeric' && length(x)==1 && x>0 && x<1'.
In your proposal, having x=c(1,2) would throw an error or multiple warnings.
Also code that relies on the second argument not being evaluated would break, 
as we need to evaluate y in order to know length(y)
There may be some benefit in checking for length(x) only, though that could 
also cause some false positives (e.g. 'x==-1 || length(x)==0' would be a bit 
ugly, but not necessarily wrong, same for someone too lazy to write x[1] 
instead of x).

And I don’t really see the advantage. The casting to length one is (I think), a 
feature, not a bug. If I have/need a length one x, and a length one y, why not use 
'|' and '&'? I have to admit I only use them in if-statements, and if I need an 
error to be thrown when x and y are not length one, I can use the shorter versions 
and then the if throws a warning (or an error for a length-0 or NA result).

I get it that for someone just starting in R, the differences between | and || 
can be confusing, but I guess that's just the price to pay for having a 
vectorized language.


I use R for about 10 years, and use regularly `||` and `&&` for the 
standard purpose (implemented in most programming languages for the same 
purpose, that is, no evaluation of all arguments if it is not required 
to decide whether the condition is TRUE). I can not recall any single 
case when I wanted to use them for the purpose to evaluate whether the 
*first* elements of vectors fulfill the given condition.


However, I regularly write mistakenly `||` or `&&` when I actually want 
to write `|` or `&`, and have no chance to spot the error because of the 
silent truncation of the arguments.



Regards,
Denes





Best regards,
Emil Bode
  
Data-analyst
  
+31 6 43 83 89 33

emil.b...@dans.knaw.nl
  
DANS: Netherlands Institute for Permanent Access to Digital Research Resources

Anna van Saksenlaan 51 | 2593 HW Den Haag | +31 70 349 44 50 | i...@dans.knaw.nl 
 | dans.knaw.nl 

DANS is an institute of the Dutch Academy KNAW  and funding 
organisation NWO .

On 29/08/2018, 05:03, "R-devel on behalf of Henrik Bengtsson" 
 wrote:

 # Issue
 
 'x || y' performs 'x[1] || y' for length(x) > 1.  For instance (here

 using R 3.5.1),
 
 > c(TRUE, TRUE) || FALSE

 [1] TRUE
 > c(TRUE, FALSE) || FALSE
 [1] TRUE
 > c(TRUE, NA) || FALSE
 [1] TRUE
 > c(FALSE, TRUE) || FALSE
 [1] FALSE
 
 This property is symmetric in LHS and RHS (i.e. 'y || x' behaves the

 same) and it also applies to 'x && y'.
 
 Note also how the above truncation of 'x' is completely silent -

 there's neither an error nor a warning being produced.
 
 
 # Discussion/Suggestion
 
 Using 'x || y' and 'x && y' with a non-scalar 'x' or 'y' is likely a

 mistake.  Either the code is written assuming 'x' and 'y' are scalars,
 or there is a coding error and vectorized versions 'x | y' and 'x & y'
 were intended.  Should 'x || y' always be considered an mistake if
 'length(x) != 1' or 'length(y) != 1'?  If so, should it be a warning
 or an error?  For instance,
 '''r
 > x <- c(TRUE, TRUE)
 > y <- FALSE
 > x || y
 
 Error in x || y : applying scalar operator || to non-scalar elements

 Execution halted
 
 What about the case where 'length(x) == 0' or 'length(y) == 0'?  Today

 'x || y' returns 'NA' in such cases, e.g.
 
 > logical(0) || c(FALSE, NA)

 [1] NA
 > logical(0) || logical(0)
 [1] NA
 > logical(0) && logical(0)
 [1] NA
 
 I don't know the background for this behavior, but I'm sure there is

 an argument behind that one.  Maybe it's simply that '||' and '&&'
 should always return a scalar logical and neither TRUE nor FALSE can
 be returned.
 
 /Henrik
 
 PS. This is in the same vein as

 https://mailman.stat.ethz.ch/pipermail/r-devel/2017-March/073817.html
 - in R (>=3.4.0) we now get that if (1:2 == 1) ... is an error if
 _R_CHECK_LENGTH_1_CONDITION_=true
 
 __

 R-devel@r-project.org mailing list
 https://stat.ethz.ch/

Re: [Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

2018-08-30 Thread Dénes Tóth




On 08/30/2018 01:56 PM, Joris Meys wrote:

I have to agree with Emil here. && and || are short circuited like in C and
C++. That means that

TRUE || c(TRUE, FALSE)
FALSE && c(TRUE, FALSE)

cannot give an error because the second part is never evaluated. Throwing a
warning or error for

c(TRUE, FALSE) || TRUE

would mean that the operator gives a different result depending on the
order of the objects, breaking the symmetry. Also that would be undesirable.


Note that `||` and `&&` have never been symmetric:

TRUE || stop() # returns TRUE
stop() || TRUE # returns an error




Regarding logical(0): per the documentation, it is indeed so that ||, &&
and isTRUE always return a length-one logical vector. Hence the NA.

On a sidenote: there is no such thing as a scalar in R. What you call
scalar, is really a length-one vector. That seems like a detail, but is
important in understanding why this admittedly confusing behaviour actually
makes sense within the framework of R imho. I do understand your objections
and suggestions, but it would boil down to removing short circuited
operators from R.

My 2 cents.
Cheers
Joris

On Wed, Aug 29, 2018 at 5:03 AM Henrik Bengtsson 
wrote:


# Issue

'x || y' performs 'x[1] || y' for length(x) > 1.  For instance (here
using R 3.5.1),


c(TRUE, TRUE) || FALSE

[1] TRUE

c(TRUE, FALSE) || FALSE

[1] TRUE

c(TRUE, NA) || FALSE

[1] TRUE

c(FALSE, TRUE) || FALSE

[1] FALSE

This property is symmetric in LHS and RHS (i.e. 'y || x' behaves the
same) and it also applies to 'x && y'.

Note also how the above truncation of 'x' is completely silent -
there's neither an error nor a warning being produced.


# Discussion/Suggestion

Using 'x || y' and 'x && y' with a non-scalar 'x' or 'y' is likely a
mistake.  Either the code is written assuming 'x' and 'y' are scalars,
or there is a coding error and vectorized versions 'x | y' and 'x & y'
were intended.  Should 'x || y' always be considered an mistake if
'length(x) != 1' or 'length(y) != 1'?  If so, should it be a warning
or an error?  For instance,
'''r

x <- c(TRUE, TRUE)
y <- FALSE
x || y


Error in x || y : applying scalar operator || to non-scalar elements
Execution halted

What about the case where 'length(x) == 0' or 'length(y) == 0'?  Today
'x || y' returns 'NA' in such cases, e.g.


logical(0) || c(FALSE, NA)

[1] NA

logical(0) || logical(0)

[1] NA

logical(0) && logical(0)

[1] NA

I don't know the background for this behavior, but I'm sure there is
an argument behind that one.  Maybe it's simply that '||' and '&&'
should always return a scalar logical and neither TRUE nor FALSE can
be returned.

/Henrik

PS. This is in the same vein as
https://mailman.stat.ethz.ch/pipermail/r-devel/2017-March/073817.html
- in R (>=3.4.0) we now get that if (1:2 == 1) ... is an error if
_R_CHECK_LENGTH_1_CONDITION_=true

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel






__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

2018-08-30 Thread Joris Meys
On Thu, Aug 30, 2018 at 2:09 PM Dénes Tóth  wrote:

> Note that `||` and `&&` have never been symmetric:
>
> TRUE || stop() # returns TRUE
> stop() || TRUE # returns an error
>
>
Fair point. So the suggestion would be to check whether x is of length 1
and whether y is of length 1 only when needed. I.e.

c(TRUE,FALSE) || TRUE

would give an error and

TRUE || c(TRUE, FALSE)

would pass.

Thought about it a bit more, and I can't come up with a use case where the
first line must pass. So if the short circuiting remains and the extra
check only gives a small performance penalty, adding the error could indeed
make some bugs more obvious.

Cheers
Joris

-- 
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)


---
Biowiskundedagen 2017-2018
http://www.biowiskundedagen.ugent.be/

---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

2018-08-30 Thread Emil Bode
Okay, I thought you always wanted to check the length, but if we can only check 
what's evaluated I mostly agree.

I still think there's not much wrong with how length-0 logicals are treated, as 
the return of NA in cases where the value matters is enough warning I think, 
and I can imagine some code like my previous example 'x==-1 || length(x)==0', 
which wouldn't need a warning.

But we could do a check for length being >1

Greetings, Emil


On 30/08/2018, 14:55, "R-devel on behalf of Joris Meys" 
 wrote:

On Thu, Aug 30, 2018 at 2:09 PM Dénes Tóth  wrote:

> Note that `||` and `&&` have never been symmetric:
>
> TRUE || stop() # returns TRUE
> stop() || TRUE # returns an error
>
>
Fair point. So the suggestion would be to check whether x is of length 1
and whether y is of length 1 only when needed. I.e.

c(TRUE,FALSE) || TRUE

would give an error and

TRUE || c(TRUE, FALSE)

would pass.

Thought about it a bit more, and I can't come up with a use case where the
first line must pass. So if the short circuiting remains and the extra
check only gives a small performance penalty, adding the error could indeed
make some bugs more obvious.

Cheers
Joris

-- 
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)



---
Biowiskundedagen 2017-2018
http://www.biowiskundedagen.ugent.be/

---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

2018-08-30 Thread Hadley Wickham
I think this is an excellent idea as it eliminates a situation which
is almost certainly user error. Making it an error would break a small
amount of existing code (even if for the better), so perhaps it should
start as a warning, but be optionally upgraded to an error. It would
be nice to have a fixed date (R version) in the future when the
default will change to error.

In an ideal world, I think the following four cases should all return
the same error:

if (logical()) 1
#> Error in if (logical()) 1: argument is of length zero
if (c(TRUE, TRUE)) 1
#> Warning in if (c(TRUE, TRUE)) 1: the condition has length > 1 and only the
#> first element will be used
#> [1] 1

logical() || TRUE
#> [1] TRUE
c(TRUE, TRUE) || TRUE
#> [1] TRUE

i.e. I think that `if`, `&&`, and `||` should all check that their
input is a logical (or numeric) vector of length 1.

Hadley

On Tue, Aug 28, 2018 at 10:03 PM Henrik Bengtsson
 wrote:
>
> # Issue
>
> 'x || y' performs 'x[1] || y' for length(x) > 1.  For instance (here
> using R 3.5.1),
>
> > c(TRUE, TRUE) || FALSE
> [1] TRUE
> > c(TRUE, FALSE) || FALSE
> [1] TRUE
> > c(TRUE, NA) || FALSE
> [1] TRUE
> > c(FALSE, TRUE) || FALSE
> [1] FALSE
>
> This property is symmetric in LHS and RHS (i.e. 'y || x' behaves the
> same) and it also applies to 'x && y'.
>
> Note also how the above truncation of 'x' is completely silent -
> there's neither an error nor a warning being produced.
>
>
> # Discussion/Suggestion
>
> Using 'x || y' and 'x && y' with a non-scalar 'x' or 'y' is likely a
> mistake.  Either the code is written assuming 'x' and 'y' are scalars,
> or there is a coding error and vectorized versions 'x | y' and 'x & y'
> were intended.  Should 'x || y' always be considered an mistake if
> 'length(x) != 1' or 'length(y) != 1'?  If so, should it be a warning
> or an error?  For instance,
> '''r
> > x <- c(TRUE, TRUE)
> > y <- FALSE
> > x || y
>
> Error in x || y : applying scalar operator || to non-scalar elements
> Execution halted
>
> What about the case where 'length(x) == 0' or 'length(y) == 0'?  Today
> 'x || y' returns 'NA' in such cases, e.g.
>
> > logical(0) || c(FALSE, NA)
> [1] NA
> > logical(0) || logical(0)
> [1] NA
> > logical(0) && logical(0)
> [1] NA
>
> I don't know the background for this behavior, but I'm sure there is
> an argument behind that one.  Maybe it's simply that '||' and '&&'
> should always return a scalar logical and neither TRUE nor FALSE can
> be returned.
>
> /Henrik
>
> PS. This is in the same vein as
> https://mailman.stat.ethz.ch/pipermail/r-devel/2017-March/073817.html
> - in R (>=3.4.0) we now get that if (1:2 == 1) ... is an error if
> _R_CHECK_LENGTH_1_CONDITION_=true
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



-- 
http://hadley.nz

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] build package with unicode (farsi) strings

2018-08-30 Thread Ista Zahn
On Thu, Aug 30, 2018 at 3:11 AM Thierry Onkelinx
 wrote:
>
> Dear Farid,
>
> Try using the ASCII notation. letters_fa <- c("\u0627", "\u0641").

... as recommend in the manual:
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Encoding-issues

Best,
Ista

The full
> code table is available at https://www.utf8-chartable.de
>
> Best regards,
>
>
>
> ir. Thierry Onkelinx
> Statisticus / Statistician
>
> Vlaamse Overheid / Government of Flanders
> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
> FOREST
> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> thierry.onkel...@inbo.be
> Havenlaan 88 bus 73, 1000 Brussel
> www.inbo.be
>
> ///
> To call in the statistician after the experiment is done may be no more
> than asking him to perform a post-mortem examination: he may be able to say
> what the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
> ///
>
> 
>
> 2018-08-28 7:17 GMT+02:00 Faridedin Cheraghi :
>
> > Hi,
> >
> > I have a R script file with Persian letters in it defined as a variable:
> >
> > #' @export
> > letters_fa <- c('الف','ب','پ','ت','ث','ج','چ','ح','خ','ر','ز','د')
> >
> > I have specified the encoding field in my DESCRIPTION file of my package.
> >
> > ...
> > Encoding: UTF-8
> > ...
> >
> > I also included Sys.setlocale(locale="Persian") in my .RProfile, so it is
> > executed when RCMD is called. However, after a BUILD and INSTALL, when I
> > access the variable from the package, the characters are not printed
> > correctly:
> > > futils::letters_fa
> >  [1] "<84><81>" "" ""
> >"" ""
> >  [6] "" "<86>" ""
> >"" ""
> > [11] "" ""
> >
> >
> > thanks
> > Farid
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

2018-08-30 Thread William Dunlap via R-devel
Should the following two functions should always give the same result,
except for possible differences in the 'call' component of the warning
or error message?:

  f0 <- function(x, y) x || y
  f1 <- function(x, y) if (x) { TRUE } else { if (y) {TRUE } else { FALSE }
}

And the same for the 'and' version?

  g0 <- function(x, y) x && y
  g1 <- function(x, y) if (x) { if (y) { TRUE } else { FALSE } } else {
FALSE }

The proposal is to make them act the same when length(x) or length(y) is
not 1.
Should they also act the same when x or y is NA?  'if (x)' currently stops
if is.na(x)
and 'x||y' does not.  Or should we continue with 'if' restricted to
bi-valued
logical and '||' and '&&' handling tri-valued logic?



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Aug 30, 2018 at 7:16 AM, Hadley Wickham  wrote:

> I think this is an excellent idea as it eliminates a situation which
> is almost certainly user error. Making it an error would break a small
> amount of existing code (even if for the better), so perhaps it should
> start as a warning, but be optionally upgraded to an error. It would
> be nice to have a fixed date (R version) in the future when the
> default will change to error.
>
> In an ideal world, I think the following four cases should all return
> the same error:
>
> if (logical()) 1
> #> Error in if (logical()) 1: argument is of length zero
> if (c(TRUE, TRUE)) 1
> #> Warning in if (c(TRUE, TRUE)) 1: the condition has length > 1 and only
> the
> #> first element will be used
> #> [1] 1
>
> logical() || TRUE
> #> [1] TRUE
> c(TRUE, TRUE) || TRUE
> #> [1] TRUE
>
> i.e. I think that `if`, `&&`, and `||` should all check that their
> input is a logical (or numeric) vector of length 1.
>
> Hadley
>
> On Tue, Aug 28, 2018 at 10:03 PM Henrik Bengtsson
>  wrote:
> >
> > # Issue
> >
> > 'x || y' performs 'x[1] || y' for length(x) > 1.  For instance (here
> > using R 3.5.1),
> >
> > > c(TRUE, TRUE) || FALSE
> > [1] TRUE
> > > c(TRUE, FALSE) || FALSE
> > [1] TRUE
> > > c(TRUE, NA) || FALSE
> > [1] TRUE
> > > c(FALSE, TRUE) || FALSE
> > [1] FALSE
> >
> > This property is symmetric in LHS and RHS (i.e. 'y || x' behaves the
> > same) and it also applies to 'x && y'.
> >
> > Note also how the above truncation of 'x' is completely silent -
> > there's neither an error nor a warning being produced.
> >
> >
> > # Discussion/Suggestion
> >
> > Using 'x || y' and 'x && y' with a non-scalar 'x' or 'y' is likely a
> > mistake.  Either the code is written assuming 'x' and 'y' are scalars,
> > or there is a coding error and vectorized versions 'x | y' and 'x & y'
> > were intended.  Should 'x || y' always be considered an mistake if
> > 'length(x) != 1' or 'length(y) != 1'?  If so, should it be a warning
> > or an error?  For instance,
> > '''r
> > > x <- c(TRUE, TRUE)
> > > y <- FALSE
> > > x || y
> >
> > Error in x || y : applying scalar operator || to non-scalar elements
> > Execution halted
> >
> > What about the case where 'length(x) == 0' or 'length(y) == 0'?  Today
> > 'x || y' returns 'NA' in such cases, e.g.
> >
> > > logical(0) || c(FALSE, NA)
> > [1] NA
> > > logical(0) || logical(0)
> > [1] NA
> > > logical(0) && logical(0)
> > [1] NA
> >
> > I don't know the background for this behavior, but I'm sure there is
> > an argument behind that one.  Maybe it's simply that '||' and '&&'
> > should always return a scalar logical and neither TRUE nor FALSE can
> > be returned.
> >
> > /Henrik
> >
> > PS. This is in the same vein as
> > https://mailman.stat.ethz.ch/pipermail/r-devel/2017-March/073817.html
> > - in R (>=3.4.0) we now get that if (1:2 == 1) ... is an error if
> > _R_CHECK_LENGTH_1_CONDITION_=true
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>
> --
> http://hadley.nz
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

2018-08-30 Thread Martin Maechler
> Joris Meys 
> on Thu, 30 Aug 2018 14:48:01 +0200 writes:

> On Thu, Aug 30, 2018 at 2:09 PM Dénes Tóth
>  wrote:
>> Note that `||` and `&&` have never been symmetric:
>> 
>> TRUE || stop() # returns TRUE stop() || TRUE # returns an
>> error
>> 
>> 
> Fair point. So the suggestion would be to check whether x
> is of length 1 and whether y is of length 1 only when
> needed. I.e.

> c(TRUE,FALSE) || TRUE

> would give an error and

> TRUE || c(TRUE, FALSE)

> would pass.

> Thought about it a bit more, and I can't come up with a
> use case where the first line must pass. So if the short
> circuiting remains and the extra check only gives a small
> performance penalty, adding the error could indeed make
> some bugs more obvious.

I agree "in theory".
Thank you, Henrik, for bringing it up!

In practice I think we should start having a warning signalled.
I have checked the source code in the mean time, and the check
is really very cheap
{ because it can/should be done after checking isNumber(): so
  then we know we have an atomic and can use XLENGTH() }


The 0-length case I don't think we should change as I do find
NA (is logical!) to be an appropriate logical answer.

Martin Maechler
ETH Zurich and R Core team.

> Cheers Joris

> -- 
> Joris Meys Statistical consultant

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

2018-08-30 Thread Rui Barradas

Hello,

Inline.

Às 16:44 de 30/08/2018, William Dunlap via R-devel escreveu:

Should the following two functions should always give the same result,
except for possible differences in the 'call' component of the warning
or error message?:

   f0 <- function(x, y) x || y
   f1 <- function(x, y) if (x) { TRUE } else { if (y) {TRUE } else { FALSE }
}

And the same for the 'and' version?

   g0 <- function(x, y) x && y
   g1 <- function(x, y) if (x) { if (y) { TRUE } else { FALSE } } else {
FALSE }

The proposal is to make them act the same when length(x) or length(y) is
not 1.
Should they also act the same when x or y is NA?  'if (x)' currently stops
if is.na(x)
and 'x||y' does not.  Or should we continue with 'if' restricted to
bi-valued
logical and '||' and '&&' handling tri-valued logic?


I expect R to continue to do


f0(FALSE, NA)# [1] NA
f0(NA, FALSE)# [1] NA

g0(TRUE, NA)# [1] NA
g0(NA, TRUE)# [1] NA

f1(FALSE, NA)
#Error in if (y) { : missing value where TRUE/FALSE needed
f1(NA, FALSE)
#Error in if (x) { : missing value where TRUE/FALSE needed

g1(TRUE, NA)
#Error in if (x) { : missing value where TRUE/FALSE needed
g1(NA, TRUE)
#Error in if (x) { : missing value where TRUE/FALSE needed



Please don't change this.
There's more to the logical operators than the operands' lengths. That 
issue needs to be fixed but it doesn't mean a radical change should happen.
And the same goes for 'if'. Here the problem is completely different, 
there's more to 'if' than '||' and '&&'. Any change should be done with 
increased care. (Which I'm sure will, as always.)


Rui Barradas






Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Aug 30, 2018 at 7:16 AM, Hadley Wickham  wrote:


I think this is an excellent idea as it eliminates a situation which
is almost certainly user error. Making it an error would break a small
amount of existing code (even if for the better), so perhaps it should
start as a warning, but be optionally upgraded to an error. It would
be nice to have a fixed date (R version) in the future when the
default will change to error.

In an ideal world, I think the following four cases should all return
the same error:

if (logical()) 1
#> Error in if (logical()) 1: argument is of length zero
if (c(TRUE, TRUE)) 1
#> Warning in if (c(TRUE, TRUE)) 1: the condition has length > 1 and only
the
#> first element will be used
#> [1] 1

logical() || TRUE
#> [1] TRUE
c(TRUE, TRUE) || TRUE
#> [1] TRUE

i.e. I think that `if`, `&&`, and `||` should all check that their
input is a logical (or numeric) vector of length 1.

Hadley

On Tue, Aug 28, 2018 at 10:03 PM Henrik Bengtsson
 wrote:


# Issue

'x || y' performs 'x[1] || y' for length(x) > 1.  For instance (here
using R 3.5.1),


c(TRUE, TRUE) || FALSE

[1] TRUE

c(TRUE, FALSE) || FALSE

[1] TRUE

c(TRUE, NA) || FALSE

[1] TRUE

c(FALSE, TRUE) || FALSE

[1] FALSE

This property is symmetric in LHS and RHS (i.e. 'y || x' behaves the
same) and it also applies to 'x && y'.

Note also how the above truncation of 'x' is completely silent -
there's neither an error nor a warning being produced.


# Discussion/Suggestion

Using 'x || y' and 'x && y' with a non-scalar 'x' or 'y' is likely a
mistake.  Either the code is written assuming 'x' and 'y' are scalars,
or there is a coding error and vectorized versions 'x | y' and 'x & y'
were intended.  Should 'x || y' always be considered an mistake if
'length(x) != 1' or 'length(y) != 1'?  If so, should it be a warning
or an error?  For instance,
'''r

x <- c(TRUE, TRUE)
y <- FALSE
x || y


Error in x || y : applying scalar operator || to non-scalar elements
Execution halted

What about the case where 'length(x) == 0' or 'length(y) == 0'?  Today
'x || y' returns 'NA' in such cases, e.g.


logical(0) || c(FALSE, NA)

[1] NA

logical(0) || logical(0)

[1] NA

logical(0) && logical(0)

[1] NA

I don't know the background for this behavior, but I'm sure there is
an argument behind that one.  Maybe it's simply that '||' and '&&'
should always return a scalar logical and neither TRUE nor FALSE can
be returned.

/Henrik

PS. This is in the same vein as
https://mailman.stat.ethz.ch/pipermail/r-devel/2017-March/073817.html
- in R (>=3.4.0) we now get that if (1:2 == 1) ... is an error if
_R_CHECK_LENGTH_1_CONDITION_=true

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




--
http://hadley.nz

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

2018-08-30 Thread Joris Meys
On Thu, Aug 30, 2018 at 5:58 PM Martin Maechler 
wrote:

>
> I agree "in theory".
> Thank you, Henrik, for bringing it up!
>
> In practice I think we should start having a warning signalled.
>

I agree. I wouldn't know who would count on the automatic selection of the
first value, but better safe than sorry.


> I have checked the source code in the mean time, and the check
> is really very cheap
> { because it can/should be done after checking isNumber(): so
>   then we know we have an atomic and can use XLENGTH() }
>
>
That was my idea as well after going through the source code. I didn't want
to state it as I don't know enough of the code base and couldn't see if
there were complications I missed. Thank you for confirming!

Cheers
Joris
-- 
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)


---
Biowiskundedagen 2017-2018
http://www.biowiskundedagen.ugent.be/

---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] ROBUSTNESS: x || y and x && y to give warning/error if length(x) != 1 or length(y) != 1

2018-08-30 Thread Hadley Wickham
On Thu, Aug 30, 2018 at 10:58 AM Martin Maechler
 wrote:
>
> > Joris Meys
> > on Thu, 30 Aug 2018 14:48:01 +0200 writes:
>
> > On Thu, Aug 30, 2018 at 2:09 PM Dénes Tóth
> >  wrote:
> >> Note that `||` and `&&` have never been symmetric:
> >>
> >> TRUE || stop() # returns TRUE stop() || TRUE # returns an
> >> error
> >>
> >>
> > Fair point. So the suggestion would be to check whether x
> > is of length 1 and whether y is of length 1 only when
> > needed. I.e.
>
> > c(TRUE,FALSE) || TRUE
>
> > would give an error and
>
> > TRUE || c(TRUE, FALSE)
>
> > would pass.
>
> > Thought about it a bit more, and I can't come up with a
> > use case where the first line must pass. So if the short
> > circuiting remains and the extra check only gives a small
> > performance penalty, adding the error could indeed make
> > some bugs more obvious.
>
> I agree "in theory".
> Thank you, Henrik, for bringing it up!
>
> In practice I think we should start having a warning signalled.
> I have checked the source code in the mean time, and the check
> is really very cheap
> { because it can/should be done after checking isNumber(): so
>   then we know we have an atomic and can use XLENGTH() }
>
>
> The 0-length case I don't think we should change as I do find
> NA (is logical!) to be an appropriate logical answer.

Can you explain your reasoning a bit more here? I'd like to understand
the general principle, because from my perspective it's more
parsimonious to say that the inputs to || and && must be length 1,
rather than to say that inputs could be length 0 or length 1, and in
the length 0 case they are replaced with NA.

Hadley

-- 
http://hadley.nz

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] build package with unicode (farsi) strings

2018-08-30 Thread Hadley Wickham
On Thu, Aug 30, 2018 at 2:11 AM Thierry Onkelinx
 wrote:
>
> Dear Farid,
>
> Try using the ASCII notation. letters_fa <- c("\u0627", "\u0641"). The full
> code table is available at https://www.utf8-chartable.de

It's a little easier to do this with code:

letters_fa <- c('الف','ب','پ','ت','ث','ج','چ','ح','خ','ر','ز','د')
writeLines(stringi::stri_escape_unicode(letters_fa))
#> \u0627\u0644\u0641
#> \u0628
#> \u067e
#> \u062a
#> \u062b
#> \u062c
#> \u0686
#> \u062d
#> \u062e
#> \u0631
#> \u0632
#> \u062f

Hadley

-- 
http://hadley.nz

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Detecting whether a process exists or not by its PID?

2018-08-30 Thread Henrik Bengtsson
Hi, I'd like to test whether a (localhost) PSOCK cluster node is still
running or not by its PID, e.g. it may have crashed / core dumped.
I'm ok with getting false-positive results due to *another* process
with the same PID has since started.

I can the PID of each cluster nodes by querying them for their
Sys.getpid(), e.g.

pids <- parallel::clusterEvalQ(cl, Sys.getpid())

Is there a function in core R for testing whether a process with a
given PID exists or not? From trial'n'error, I found that on Linux:

  pid_exists <- function(pid) as.logical(tools::pskill(pid, signal = 0L))

returns TRUE for existing processes and FALSE otherwise, but I'm not
sure if I can trust this.  It's not a documented feature in
?tools::pskill, which also warns about 'signal' not being standardized
across OSes.

The other Linux alternative I can imagine is:

  pid_exists <- function(pid) system2("ps", args = c("--pid", pid),
stdout = FALSE) == 0L

Can I expect this to work on macOS as well?  What about other *nix systems?

And, finally, what can be done on Windows?

I'm sure there are packages on CRAN that provides this, but I'd like
to keep dependencies at a minimum.

I appreciate any feedback. Thxs,

Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Detecting whether a process exists or not by its PID?

2018-08-30 Thread Gábor Csárdi
On Fri, Aug 31, 2018 at 1:18 AM Henrik Bengtsson
 wrote:
[...]
>   pid_exists <- function(pid) as.logical(tools::pskill(pid, signal = 0L))
>
> returns TRUE for existing processes and FALSE otherwise, but I'm not
> sure if I can trust this.  It's not a documented feature in
> ?tools::pskill, which also warns about 'signal' not being standardized
> across OSes.

Yes, as long as tools::pskill() is willing to call a killl(0) system
call, AFAIK this will work fine on all UNIX systems.

> The other Linux alternative I can imagine is:
>
>   pid_exists <- function(pid) system2("ps", args = c("--pid", pid),
> stdout = FALSE) == 0L
>
> Can I expect this to work on macOS as well?  What about other *nix systems?

There is no --pid option on macOS. I think simply `ps ` is
better, but some very minimal systems might not have ps at all.

> And, finally, what can be done on Windows?

You need to call OpenProcess from C, or find some base R function that
does that without messing up the process. Seems like tools::psnice()
does that.

> I'm sure there are packages on CRAN that provides this, but I'd like
> to keep dependencies at a minimum.

Yes, e.g. the ps package does this, and it does it properly, i.e. you
don't need to worry about pid reuse. Pid reuse does cause problems
quite frequently, especially on Windows, and especially on a system
that starts a lot of processes, like win-builder.

Gabor

> I appreciate any feedback. Thxs,
>
> Henrik
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel