Re: [Rd] ?Syntax wrong about `?`'s precedence ?

2019-08-30 Thread Stephen Ellison
> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Ant F
> Sent: 29 August 2019 12:06
> To: r-devel@r-project.org
> Subject: [Rd] ?Syntax wrong about `?`'s precedence ?
> ...
> See the following example :
> 
> `?` <- `+`

I'm curious: what did you expect to happen if you replace the function '?' with 
the operator '+'?
'?' is surely now being evaluated as a user-defined function and not as an 
operator. 
Would you expect the results of doing that to be the same as evaluation without 
replacement?

S Ellison





__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] ?Syntax wrong about `?`'s precedence ?

2019-08-30 Thread William Dunlap via R-devel
Precedence is a property of the parser and has nothing to do with the
semantics assigned to various symbols.  Using just core R functions you can
see the precedence of '?' is between those of '=' and '<-'.

> # '=' has lower precedence than '?'
> str(as.list(parse(text="a ? b = c")[[1]]))
List of 3
 $ : symbol =
 $ : language `?`(a, b)
 $ : symbol c
> str(as.list(parse(text="a = b ? c")[[1]]))
List of 3
 $ : symbol =
 $ : symbol a
 $ : language `?`(b, c)
> # '<-' has higher precedence than '?'
> str(as.list(parse(text="a ? b <- c")[[1]]))
List of 3
 $ : symbol ?
 $ : symbol a
 $ : language b <- c
> str(as.list(parse(text="a <- b ? c")[[1]]))
List of 3
 $ : symbol ?
 $ : language a <- b
 $ : symbol c
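
The check above can be repeated more compactly with a small wrapper around the same parse() idiom. This is only a sketch: the helper name top_call is made up for illustration and is not part of R, and the results for the '=' cases may depend on the R version, so no expected output is claimed for them.

```r
# Hypothetical helper (not part of R): name of the outermost call in an expression.
top_call <- function(text) {
  expr <- parse(text = text, keep.source = FALSE)[[1]]
  as.character(expr[[1]])
}

# The outermost call is the operator that binds least tightly.
exprs <- c("a ? b = c", "a = b ? c", "a ? b <- c", "a <- b ? c")
res <- vapply(exprs, top_call, character(1))
print(res)
```

Comparing the printed values for the '=' rows and the '<-' rows shows where the parser places '?' relative to the two assignment operators.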

Bill Dunlap
TIBCO Software
wdunlap tibco.com




Re: [Rd] ?Syntax wrong about `?`'s precedence ?

2019-08-30 Thread Kevin Ushey
See also: https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16710



Re: [Rd] ?Syntax wrong about `?`'s precedence ?

2019-08-30 Thread peter dalgaard
...and 14955, which seems to have the explanation (but was marked as 
closed/fixed??). The parser does list '?' as lower precedence than '=', but 
'='-assignments are not normal 'expr's which can appear as arguments to '?'. 
(Presumably because of named arguments: f(a=b) differs from f(a<-b).)  
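
To illustrate the named-argument point, here is a minimal sketch that inspects the two calls without evaluating them. The symbols f, a and b are placeholders only; nothing is defined or run.

```r
# f(a = b) records 'a' as an argument name; f(a <- b) passes an
# assignment expression as a single unnamed argument.
call_named  <- quote(f(a = b))
call_assign <- quote(f(a <- b))

print(names(call_named))    # argument names recorded in the call
print(names(call_assign))   # no named arguments in this call
print(call_assign[[2]])     # the argument is itself an assignment call
```

Because '=' inside a call already means "name this argument", the grammar cannot treat an '='-assignment as an ordinary expression in argument position.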

Other tokens which have lower precedence than assignments are flow-control 
items, IF ELSE WHILE FOR REPEAT, but I don't see any way to confuse them in the 
same way as '?'.

It might be possible to resolve the situation by specifying '?' syntax 
explicitly as
expr_or_assign '?' expr_or_assign, but, well, "There be Tygers here"...

-pd



-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



[Rd] inconsistent handling of factor, character, and logical predictors in lm()

2019-08-30 Thread Fox, John
Dear R-devel list members,

I've discovered an inconsistency in how lm() and similar functions handle 
logical predictors as opposed to factor or character predictors. An "lm" object 
for a model that includes factor or character predictors includes the levels of 
a factor or unique values of a character predictor in the $xlevels component of 
the object, but not the FALSE/TRUE values for a logical predictor even though 
the latter is treated as a factor in the fit.

For example:

-------------------- snip --------------------

> m1 <- lm(Sepal.Length ~ Sepal.Width + Species, data=iris)
> m1$xlevels
$Species
[1] "setosa" "versicolor" "virginica" 
 
> m2 <- lm(Sepal.Length ~ Sepal.Width + as.character(Species), data=iris)
> m2$xlevels
$`as.character(Species)`
[1] "setosa" "versicolor" "virginica" 

> m3 <- lm(Sepal.Length ~ Sepal.Width + I(Species == "setosa"), data=iris)
> m3$xlevels
named list()

> m3

Call:
lm(formula = Sepal.Length ~ Sepal.Width + I(Species == "setosa"), 
    data = iris)

Coefficients:
               (Intercept)                 Sepal.Width  I(Species == "setosa")TRUE  
                    3.5571                      0.9418                     -1.7797  

-------------------- snip --------------------

I believe that the culprit is .getXlevels(), which makes provision for factor 
and character predictors but not for logical predictors:

-------------------- snip --------------------

> .getXlevels
function (Terms, m) 
{
    xvars <- vapply(attr(Terms, "variables"), deparse2, 
        "")[-1L]
    if ((yvar <- attr(Terms, "response")) > 0) 
        xvars <- xvars[-yvar]
    if (length(xvars)) {
        xlev <- lapply(m[xvars], function(x) if (is.factor(x)) 
            levels(x)
        else if (is.character(x)) 
            levels(as.factor(x)))
        xlev[!vapply(xlev, is.null, NA)]
    }
}

-------------------- snip --------------------

It would be simple to modify the last test in .getXlevels to 

else if (is.character(x) || is.logical(x))

which would cause .getXlevels() to return c("FALSE", "TRUE") (assuming both 
values are present in the data). I'd find that sufficient, but alternatively 
there could be a separate test for logical predictors that returns c(FALSE, 
TRUE).
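
The effect of the proposed test can be sketched outside of stats with a stand-alone helper. Everything here is an assumption for illustration: the function name xlevels_with_logical and the column is_setosa are made up, the helper works on model-frame columns directly rather than reproducing .getXlevels(), and it is not the actual patch.

```r
# Hypothetical helper (not part of stats): collect levels for factor,
# character, and logical columns of a model frame.
xlevels_with_logical <- function(mf) {
  xlev <- lapply(mf, function(x) {
    if (is.factor(x)) {
      levels(x)
    } else if (is.character(x) || is.logical(x)) {
      levels(as.factor(x))
    }  # anything else (e.g. numeric) yields NULL
  })
  xlev[!vapply(xlev, is.null, NA)]
}

# A plain logical predictor added to a copy of iris.
iris2 <- transform(iris, is_setosa = Species == "setosa")
mf <- model.frame(Sepal.Length ~ Sepal.Width + is_setosa, data = iris2)

# Drop the response column before collecting levels.
xl <- xlevels_with_logical(mf[-1])
print(xl)
```

With both values present in the data, the logical column contributes the character levels "FALSE" and "TRUE", and the numeric predictor is omitted.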

I discovered this issue when a function in the effects package failed for a 
model with a logical predictor. Although it's possible to program around the 
problem, I think that it would be better to handle factors, character 
predictors, and logical predictors consistently.

Best,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: socialsciences.mcmaster.ca/jfox/



[Rd] New lazyload rdx key type: list(eagerKey=, lazyKeys=)

2019-08-30 Thread William Dunlap via R-devel
Prior to R-3.6.0 the keys in the lazyload key files, e.g.
pkg/data/Rdata.rdx or pkg/R/pkg.rdx, seemed to all be 2-long integer
vectors.  Now they can be lists.  The ones I have seen have two components:
"eagerKey", a 2-long integer vector, and "lazyKeys", a named list of
2-long integer vectors.

> rdx <- readRDS(system.file(package="survival", "data", "Rdata.rdx"))
> str(Filter(is.list, rdx$references))
List of 2
 $ env::1:List of 2
  ..$ eagerKey: int [1:2] 273691 183
  ..$ lazyKeys:List of 1
  .. ..$ lines: int [1:2] 273874 284
 $ env::2:List of 2
  ..$ eagerKey: int [1:2] 473142 166
  ..$ lazyKeys:List of 1
  .. ..$ lines: int [1:2] 473308 310

or

>  rdx <- readRDS(system.file(package="lambda.r", "R", "lambda.r.rdx"))
> length(Filter(is.integer, rdx$references))
[1] 4
> str(Filter(Negate(is.integer), rdx$references))
List of 5
 $ env::5:List of 2
  ..$ eagerKey: int [1:2] 28278 328
  ..$ lazyKeys:List of 2
  .. ..$ lines: int [1:2] 28606 80
  .. ..$ parseData: int [1:2] 28686 389
 $ env::6:List of 2
  ..$ eagerKey: int [1:2] 29075 327
  ..$ lazyKeys:List of 2
  .. ..$ lines: int [1:2] 29402 71
  .. ..$ parseData: int [1:2] 29473 321
 $ env::7:List of 2
  ..$ eagerKey: int [1:2] 29794 325
  ..$ lazyKeys:List of 2
  .. ..$ lines: int [1:2] 30119 117
  .. ..$ parseData: int [1:2] 30236 752
... many more ...
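
A quick way to see which kinds of keys a given index file holds is to tabulate them. This is a rough sketch that assumes the stats package's R/stats.rdx file is present and has a "references" component like the files shown above; the .rdx layout is internal to R and may change.

```r
# Locate the lazyload index of a package that ships with R.
rdx_file <- system.file("R", "stats.rdx", package = "stats")

key_kind <- character(0)
if (nzchar(rdx_file)) {
  rdx <- readRDS(rdx_file)
  # Classify each reference key as a list or by its basic type.
  key_kind <- vapply(rdx$references,
                     function(k) if (is.list(k)) "list" else typeof(k),
                     character(1))
}
print(table(key_kind))
```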

All the ones I've seen involve the environment in srcref attributes and
most packages do not keep that.  Will these be used for more sorts of
environments in the future?

What is the meaning of the lazyKeys?  Are these stored as promises until
needed or is there some special option to never or always load them?

Bill Dunlap
TIBCO Software
wdunlap tibco.com



Re: [Rd] inconsistent handling of factor, character, and logical predictors in lm()

2019-08-30 Thread Abby Spurdle
> I think that it would be better to handle factors, character predictors, and 
> logical predictors consistently.

"logical predictors" can be regarded as categorical or continuous (i.e. 0 or 1).
And the model matrix should be the same, either way.
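
A small sketch of that equivalence, using a made-up data frame d: the column generated for a logical predictor holds the same values as the 0/1 numeric version, although the column names and attributes differ.

```r
# Toy data, for illustration only.
d <- data.frame(x = c(TRUE, FALSE, TRUE, TRUE, FALSE))

mm_logical <- model.matrix(~ x, data = d)              # x treated as categorical
mm_numeric <- model.matrix(~ as.numeric(x), data = d)  # x treated as 0/1

print(colnames(mm_logical))
print(colnames(mm_numeric))

# Compare the predictor columns by value, ignoring names.
same_values <- all(mm_logical[, 2] == mm_numeric[, 2])
print(same_values)
```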

I think the first question to be asked is, which is the best approach,
categorical or continuous?
The continuous approach seems simpler and more efficient to me, but
output from the categorical approach may be more intuitive, for some
people.

I note that using factors and characters doesn't necessarily
produce consistent output for $xlevels
(because factors can have their levels reordered).
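
For instance, in this sketch with made-up values, the same data give different level orders depending on whether they are stored as a factor with custom levels or as a character vector:

```r
ch <- c("b", "a", "b")

# A factor can carry any level order the user chose.
f <- factor(ch, levels = c("b", "a"))

# A character vector is converted with sorted unique values as levels.
lev_factor    <- levels(f)
lev_character <- levels(as.factor(ch))

print(lev_factor)
print(lev_character)
```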
