Re: [Rd] R CMD build then check fails on R-devel due to serialization version

2018-01-11 Thread Jim Hester
This change poses difficulties for automated build systems such as
travis-ci, which is widely used in the R community. In particular,
because this is a WARNING and not a NOTE, it causes all R-devel
builds with vignettes to fail, since the default settings fail the
build if R CMD check issues a WARNING.

The simplest change would be for R-core to make this message a
NOTE rather than a WARNING; the serialization would still be tested
and there would still be a check against vignettes built with R-devel,
but it would not cause these builds to fail.
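
A workaround in the meantime, along the lines the check message
itself suggests (quoted below), is to rewrite the offending file in
the version 2 format after building:

obj <- readRDS("build/vignette.rds")
saveRDS(obj, "build/vignette.rds", version = 2)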

On Wed, Jan 10, 2018 at 3:52 PM, Duncan Murdoch
 wrote:
> On 10/01/2018 1:26 PM, Neal Richardson wrote:
>>
>> Hi,
>> Since yesterday I'm seeing `R CMD check --as-cran` failures on the
>> R-devel daily build (specifically, R Under development (unstable)
>> (2018-01-09 r74100)) for multiple packages:
>>
>> * checking serialized R objects in the sources ... WARNING
>> Found file(s) with version 3 serialization:
>> ‘build/vignette.rds’
>> Such files are only readable in R >= 3.5.0.
>> Recreate them with R < 3.5.0 or save(version = 2) or saveRDS(version =
>> 2) as appropriate
>>
>> As far as I can tell, revision 74099
>>
>> (https://github.com/wch/r-source/commit/d9530001046a582ff6a43ca834d6c3586abd0a97),
>> which changes the default serialization format to 3, clashes with
>> revision 73973
>> (https://github.com/wch/r-source/commit/885764eb74f2211a547b13727f2ecc5470c3dd00),
>> which checks that serialized R objects are _not_ version 3. It seems
>> that with the current development version of R, if you `R CMD build`
>> and then run `R CMD check --as-cran` on the built package, it will
>> fail.
>>
>
> I think the message basically says:  don't do that.  You should build with
> R-release for now.  You always need to check with R-devel, so life is
> complicated.
>
> If you build with R-devel without forcing the old format, nobody using
> R-release will be able to use your tarball.
>
> Eventually I guess the new format will be accepted by CRAN, but it will
> likely be a while:  nobody expects everyone to instantly upgrade to a new R
> release, let alone to an unreleased development version.
>
> Presumably that particular file (build/vignette.rds) could be automatically
> built in the old format for now, but the new format needs testing, so it
> makes sense to me to leave it as a default, even if it makes it more
> complicated to submit a package to CRAN.
>
> Duncan Murdoch
>
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] substitute() on arguments in ellipsis ("dot dot dot")?

2018-08-15 Thread Jim Hester
Assuming you are fine with a pairlist instead of a list, avoiding the
`as.list()` call for dots2 saves a reasonable amount of time and makes it
clearly the fastest.

library(rlang)

dots1 <- function(...) as.list(substitute(list(...)))[-1L]
dots2 <- function(...) as.list(substitute(...()))
dots2.5 <- function(...) substitute(...())
dots3 <- function(...) match.call(expand.dots = FALSE)[["..."]]
dots4 <- function(...) exprs(...)

bench::mark(
  dots1(1+2, "a", rnorm(3), stop("bang!")),
  dots2(1+2, "a", rnorm(3), stop("bang!")),
  dots2.5(1+2, "a", rnorm(3), stop("bang!")),
  dots3(1+2, "a", rnorm(3), stop("bang!")),
  dots4(1+2, "a", rnorm(3), stop("bang!")),
  check = FALSE
)[1:4]
#> # A tibble: 5 x 4
#>   expression                                        min     mean   median
#>   <bch:expr>                                   <bch:tm> <bch:tm> <bch:tm>
#> 1 "dots1(1 + 2, \"a\", rnorm(3), stop(\"bang!\…   2.38µs   5.63µs   2.89µs
#> 2 "dots2(1 + 2, \"a\", rnorm(3), stop(\"bang!\…   2.07µs    3.1µs    2.6µs
#> 3 "dots2.5(1 + 2, \"a\", rnorm(3), stop(\"bang…    471ns  789.5ns    638ns
#> 4 "dots3(1 + 2, \"a\", rnorm(3), stop(\"bang!\…   3.17µs   4.83µs   4.22µs
#> 5 "dots4(1 + 2, \"a\", rnorm(3), stop(\"bang!\…   3.16µs   4.43µs   3.87µs
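
As a quick sanity check (not part of the benchmark above), the
unwrapped result really is a pairlist:

d <- dots2.5(1 + 2, "a")
is.pairlist(d)
#> [1] TRUE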


On Mon, Aug 13, 2018 at 7:59 PM Hadley Wickham  wrote:

> Since you're already using bang-bang ;)
>
> library(rlang)
>
> dots1 <- function(...) as.list(substitute(list(...)))[-1L]
> dots2 <- function(...) as.list(substitute(...()))
> dots3 <- function(...) match.call(expand.dots = FALSE)[["..."]]
> dots4 <- function(...) exprs(...)
>
> bench::mark(
>   dots1(1+2, "a", rnorm(3), stop("bang!")),
>   dots2(1+2, "a", rnorm(3), stop("bang!")),
>   dots3(1+2, "a", rnorm(3), stop("bang!")),
>   dots4(1+2, "a", rnorm(3), stop("bang!")),
>   check = FALSE
> )[1:4]
> #> # A tibble: 4 x 4
> #>   expression                                         min     mean   median
> #>   <bch:expr>                                    <bch:tm> <bch:tm> <bch:tm>
> #> 1 "dots1(1 + 2, \"a\", rnorm(3), stop(\"bang!\"…   3.23µs   4.15µs   3.81µs
> #> 2 "dots2(1 + 2, \"a\", rnorm(3), stop(\"bang!\"…   2.72µs   4.48µs   3.37µs
> #> 3 "dots3(1 + 2, \"a\", rnorm(3), stop(\"bang!\"…   4.06µs   4.94µs   4.69µs
> #> 4 "dots4(1 + 2, \"a\", rnorm(3), stop(\"bang!\"…   3.92µs    4.9µs   4.46µs
>
>
> On Mon, Aug 13, 2018 at 4:19 AM Henrik Bengtsson
>  wrote:
> >
> > Thanks all, this was very helpful.  Peter's finding - dots2() below -
> > is indeed interesting - I'd be curious to learn what goes on there.
> >
> > The different alternatives perform approximately the same;
> >
> > dots1 <- function(...) as.list(substitute(list(...)))[-1L]
> > dots2 <- function(...) as.list(substitute(...()))
> > dots3 <- function(...) match.call(expand.dots = FALSE)[["..."]]
> >
> > stats <- microbenchmark::microbenchmark(
> >   dots1(1+2, "a", rnorm(3), stop("bang!")),
> >   dots2(1+2, "a", rnorm(3), stop("bang!")),
> >   dots3(1+2, "a", rnorm(3), stop("bang!")),
> >   times = 10e3
> > )
> > print(stats)
> > # Unit: microseconds
> > #                                        expr  min   lq mean median   uq  max neval
> > #  dots1(1 + 2, "a", rnorm(3), stop("bang!")) 2.14 2.45 3.04   2.58 2.73 1110 10000
> > #  dots2(1 + 2, "a", rnorm(3), stop("bang!")) 1.81 2.10 2.47   2.21 2.34 1626 10000
> > #  dots3(1 + 2, "a", rnorm(3), stop("bang!")) 2.59 2.98 3.36   3.15 3.31 1037 10000
> >
> > /Henrik
> >
> > On Mon, Aug 13, 2018 at 7:10 AM Peter Meilstrup
> >  wrote:
> > >
> > > Interestingly,
> > >
> > >as.list(substitute(...()))
> > >
> > > also works.
> > >
> > > On Sun, Aug 12, 2018 at 1:16 PM, Duncan Murdoch
> > >  wrote:
> > > > On 12/08/2018 4:00 PM, Henrik Bengtsson wrote:
> > > >>
> > > >> Hi. For any number of *known* arguments, we can do:
> > > >>
> > > >> one <- function(a) list(a = substitute(a))
> > > >> two <- function(a, b) list(a = substitute(a), b = substitute(b))
> > > >>
> > > >> and so on. But how do I achieve the same when I have:
> > > >>
> > > >> dots <- function(...) list(???)
> > > >>
> > > >> I want to implement this such that I can do:
> > > >>
> > > >>> exprs <- dots(1+2)
> > > >>> str(exprs)
> > > >>
> > > >> List of 1
> > > >>   $ : language 1 + 2
> > > >>
> > > >> as well as:
> > > >>
> > > >>> exprs <- dots(1+2, "a", rnorm(3))
> > > >>> str(exprs)
> > > >>
> > > >> List of 3
> > > >>   $ : language 1 + 2
> > > >>   $ : chr "a"
> > > >>   $ : language rnorm(3)
> > > >>
> > > >> Is this possible to achieve using plain R code?
> > > >
> > > >
> > > > I think so.  substitute(list(...)) gives you a single expression
> containing
> > > > a call to list() with the unevaluated arguments; you can convert
> that to
> > > > what you want using something like
> > > >
> > > > dots <- function (...) {
> > > >   exprs <- substitute(list(...))
> > > >   as.list(exprs[-1])
> > > > }
> > > >
> > > > Duncan Murdoch
> > > >
> > > >
> > > > __
> > > > R-devel@r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r

[Rd] Change windows installer default to only install 64 bit R

2018-11-09 Thread Jim Hester
The R installer by default installs both the 32-bit and 64-bit versions
of R. This can cause user confusion, as users end up with multiple
versions of R installed and are then unsure which to use, or mistakenly
open the wrong one.

We can remove much of this ambiguity by changing the default choice in
the installer to only install the 64-bit version. If users do need the
32-bit version it is still simple for them to install it by checking
the appropriate box during installation.

The following diff (also attached) simply reorders the options in the
install dialog to make a 64-bit-only install the default. If users do
not have 64-bit support in their OS the options fall back to a 32-bit
install by default, the same as they do currently in that situation.

There also seemed to be a bug where 32-bit installs did not include
the message translations; the diff fixes that as well.

Thanks for your consideration,

Jim

Index: src/gnuwin32/installer/types3264.iss
===================================================================
--- src/gnuwin32/installer/types3264.iss (revision 75569)
+++ src/gnuwin32/installer/types3264.iss (working copy)
@@ -1,8 +1,8 @@
 
 [Types]
+Name: "user64"; Description: "64-bit {cm:user}"; Check: Is64BitInstallMode
 Name: "user"; Description: {cm:user}; Check: Is64BitInstallMode
 Name: "user32"; Description: "32-bit {cm:user}"
-Name: "user64"; Description: "64-bit {cm:user}"; Check: Is64BitInstallMode
 Name: "custom"; Description: {cm:custom}; Flags: iscustom
 
 [Components]
@@ -9,4 +9,4 @@
 Name: "main"; Description: "Core Files"; Types: user user32 user64 custom
 Name: "i386"; Description: "32-bit Files"; Types: user user32 custom
 Name: "x64"; Description: "64-bit Files"; Types: user user64 custom
-Name: "translations"; Description: "Message translations"; Types: user user64 custom
+Name: "translations"; Description: "Message translations"; Types: user user32 user64 custom
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Using `--configure-args` with configure.win on Windows

2018-12-20 Thread Jim Hester
Looking at the code for `R CMD INSTALL` [1] it looks like
`--configure-args` is not used on Windows, so there is not a way to pass
arguments to the `configure.win` script like there is for `configure`.

Is this lack intentional or simply an oversight because support for
configure.win was added later?

[1]:
https://github.com/wch/r-source/blob/8bc3a6f4b0c2fca3195cac427e9ad8b4448eaa73/src/library/tools/R/install.R#L670-L697
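
For reference, the Unix-alike usage that currently has no Windows
counterpart would be something like this (package and flag names
invented for the example):

R CMD INSTALL --configure-args='--with-mylib=/opt/mylib' mypkg_1.0.tar.gz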

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Package inclusion in R core implementation

2019-03-04 Thread Jim Hester
Conversely, what is the process to remove a package from core R? It seems
to me some (many?) of the packages included are there more out of
historical accident than any technical need to be in the core
distribution. Having them as core (or recommended) packages makes them
harder to update independently of R and makes testing, development and
contribution more cumbersome.

On Fri, Mar 1, 2019 at 4:35 AM Morgan Morgan 
wrote:

> Hi,
>
> It sometimes happens that some packages get included to R like for example
> the parallel package.
>
> I was wondering if there is a process to decide whether or not to include a
> package in the core implementation of R?
>
> For example, why not include the Rcpp package, which became for a lot of
> user the main tool to extend R?
>
> What is our view on the (not so well known) dotCall64 package which is an
> interesting alternative for extending R?
>
> Thank you
> Best regards,
> Morgan
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Use of C++ in Packages

2019-03-29 Thread Jim Hester
First, thank you to Tomas for writing his recent post[0] on the R
developer blog. It raised important issues in interfacing R's C API
and C++ code.

However I do _not_ think the conclusion reached in the post is helpful:
  > don’t use C++ to interface with R

There are now more than 1,600 packages on CRAN using C++; the time is
long past when that type of warning is going to be useful to the R
community.

These same issues will also occur with any newer language (such as
Rust or Julia[1]) which uses RAII to manage resources and tries to
interface with R. It doesn't seem a productive way forward for R to
say it can't interface with these languages without first doing
expensive copies into an intermediate heap.

The advice to avoid C++ is also antithetical to John Chambers' vision
of first S, and then R, as an interface language (from Extending R [2]):

  > The *interface* principle has always been central to R and to S
before. An interface to subroutines was _the_ way to extend the first
version of S. Subroutine interfaces have continued to be central to R.

The book also has extensive sections on both C++ (via Rcpp) and Julia,
so clearly John thinks these are legitimate ways to extend R.

So if 'don't use C++' is not realistic and the current R API does not
allow safe use of C++ exceptions what are the alternatives?

One thing we could do is look how this is handled in other languages
written in C which also use longjmp for errors.

Lua is one example: it provides an alternative interface, lua_pcall[3]
and lua_cpcall[4], which wrap a normal Lua call and return an error
code rather than long jumping. These interfaces can then be safely
wrapped by RAII / exception-based languages.
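
As an R-level sketch of that calling convention (illustrative only;
`pcall` is a made-up name here, and the actual proposal concerns R's
C API, not R code):

pcall <- function(expr) {
  tryCatch(
    list(status = 0L, value = force(expr)),  # 0 = success, value computed
    error = function(e) list(status = 1L, message = conditionMessage(e))
  )
}
pcall(1 + 1)$value
#> [1] 2
pcall(stop("bang"))$status  # the error becomes a status code, no jump
#> [1] 1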

This alternative error-code interface is not just useful for C++, but
also for resource cleanup in C: it is currently non-trivial to handle
cleanup in all the possible cases where a longjmp can occur (interrupts,
warnings, custom conditions, timeouts, any allocation, etc.), even with
R finalizers.

It is past time for R to consider a non-jumpy C interface, so it can
continue to be used as an effective interface to programming routines
in the years to come.

[0]: 
https://developer.r-project.org/Blog/public/2019/03/28/use-of-c---in-packages/
[1]: https://github.com/JuliaLang/julia/issues/28606
[2]: https://doi.org/10.1201/9781315381305
[3]: http://www.lua.org/manual/5.1/manual.html#lua_pcall
[4]: http://www.lua.org/manual/5.1/manual.html#lua_cpcall

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Underscores in package names

2019-08-08 Thread Jim Hester
Are there technical reasons that package names cannot be snake case?
This seems to be enforced by `.standard_regexps()$valid_package_name`
which currently returns

   "[[:alpha:]][[:alnum:].]*[[:alnum:]]"

Is there any technical reason this couldn't be altered to accept `_`
as well, e.g.

  "[[:alpha:]][[:alnum:]._]*[[:alnum:]]"

I realize that historically `_` has not always been valid in variable
names, but this has now been acceptable for 15+ years (since R 1.9.0 I
believe). Might we also allow underscores for package names?
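
For illustration, checking a few candidate names against the current
pattern (names invented):

rx <- sprintf("^%s$", .standard_regexps()$valid_package_name)
grepl(rx, c("mypackage", "my.package", "my_package"))
#> [1]  TRUE  TRUE FALSE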

Jim

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Underscores in package names

2019-08-09 Thread Jim Hester
To be clear, I'd be happy to contribute code to make this work, with
the changes mentioned by Duncan and elsewhere in the codebase, if
someone on R-core was interested in reviewing it.

Jim

On Thu, Aug 8, 2019 at 11:05 AM Duncan Murdoch  wrote:
>
> On 08/08/2019 10:31 a.m., Jim Hester wrote:
> > Are there technical reasons that package names cannot be snake case?
> > This seems to be enforced by `.standard_regexps()$valid_package_name`
> > which currently returns
> >
> > "[[:alpha:]][[:alnum:].]*[[:alnum:]]"
> >
> > Is there any technical reason this couldn't be altered to accept `_`
> > as well, e.g.
> >
> >"[[:alpha:]][[:alnum:]._]*[[:alnum:]]"
> >
> > I realize that historically `_` has not always been valid in variable
> > names, but this has now been acceptable for 15+ years (since R 1.9.0 I
> > believe). Might we also allow underscores for package names?
>
> The tarball names separate the package name from the version number
> using an underscore.  There is code that is written to assume there is
> at most one underscore, e.g. .check_package_CRAN_incoming in
> src/library/tools/R/QC.r.
>
> That code could be changed, but so could the proposed package name...
>
> Duncan Murdoch
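
A tiny illustration of the splitting assumption Duncan describes,
with made-up names:

sub("_.*", "", "mypackage_2.3.1.tar.gz")
#> [1] "mypackage"
sub("_.*", "", "my_package_2.3.1.tar.gz")  # a naive split picks the wrong name
#> [1] "my"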

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Underscores in package names

2019-08-15 Thread Jim Hester
Martin,

Thank you for discussing this amongst R-core and for detailing the
R-core discussion here.

Some specific examples where having underscores available would have
been useful.

1. My primerTree package (2013) was originally primer_tree, but I had
to change the name to camelCase to comply with the check requirements.
Using camelCase in the package name makes reading code jarring, as the
functions all use snake_case.
2. The widely used testthat package would likely be called test_that,
like the corresponding function within the package. This also
highlights one of the drawbacks of the current situation, without
separators the package name is more difficult to read, does it have
two t's or three?
3. The assertive suite of packages use `.` for separation, e.g.
`assertive.base`, `assertive.datetimes` etc. but all functions within
the packages use `_` separators, again likely this was done out of
necessity rather than desire.

There are many more, I am sure; these were some that came immediately
to mind. More important than the specific examples is the opportunity
cost of having this restriction, which we cannot really quantify.

Using dots for separators has a number of practical problems.
Functions using dots are ambiguous: e.g. is `as.data.frame()` a
regular function, an `as.data()` method for a `frame` object, or an
`as()` method for a `data.frame` object? And in fact regular functions
can be accidentally promoted to S3 methods by defining an S3 generic,
which does actually happen in real life, confusing users [1] (see the
small demonstration below). While package names are not functions,
using dots in package names encourages the use of dots in functions,
a dangerous practice. Dots in names are also one of the common stones
cast at R as a language, as dots are used for object-oriented method
dispatch in other common languages.
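
A small demonstration of the dotted-name hazard (function and class
names invented for the example): a function intended as a plain helper
is picked up by S3 dispatch purely because of its name.

summary.notes <- function(object, ...) "intended as a plain function"
x <- structure(list(), class = "notes")
summary(x)  # S3 dispatch finds summary.notes(), whether intended or not
#> [1] "intended as a plain function"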

The prevalence of dotted functions is the only major naming convention
which is steadily decreasing over time. It now accounts for only
around 15% of all function names when looking at all 94 million lines
of code currently available on CRAN (see Figure 2 of Yen et al. [2]).

Thanks again for the public discussion,

Jim

[1]: https://twitter.com/_ColinFay/status/1105579764797108230
[2]: https://osf.io/preprints/socarxiv/ts2wq/

On Wed, Aug 14, 2019 at 5:16 AM Martin Maechler
 wrote:
>
> > Duncan Murdoch
> > on Fri, 9 Aug 2019 20:23:28 -0400 writes:
>
> > On 09/08/2019 4:37 p.m., Gabriel Becker wrote:
> >> Duncan,
> >>
> >>
> >> On Fri, Aug 9, 2019 at 1:17 PM Duncan Murdoch  >> > wrote:
> >>
> >> On 09/08/2019 2:41 p.m., Gabriel Becker wrote:
> >> > Note that this proposal would make mypackage_2.3.1 a valid
> >> *package name*,
> >> > whose corresponding tarball name might be mypackage_2.3.1_2.3.2
> >> after a
> >> > patch. Yes its a silly example, but why allow that kind of ambiguity?
> >> >
> >> CRAN already has a package named "FuzzyNumbers.Ext.2", whose tarball is
> >> FuzzyNumbers.Ext.2_3.2.tar.gz, so I think we've already lost that game.
> >>
> >>
> >> I suppose technically 2 is a valid version number for a package (?) so 
> I
> >> suppose you have me there. But as Ben pointed out while I was writing
> >> this, all I can really say is that in practice they read to me (as
> >> someone who has administered R on a large cluster and written
> >> build-system software for it) as substantially different levels of
> >> ambiguity. I do acknowledge, as Ben does, that yes a more complex
> >> regular expression/splitting algorithm can be written that would handle
> >> the more general package names. I just don't personally see a 
> motivation
> >> that justifies changing something this fundamental (even if it is both
> >> narrow and was initially more or less arbitrarily chosen) about R at
> >> this late date.
> >>
> >> I guess at the end of the day, I guess what I'm saying is that breaking
> >> and changing things is sometimes good, but if we're going to rock the
> >> boat personally I'd want to do so going after bigger wins than this 
> one.
> >> Thats just my opinion though.
>
> > Sorry, I wasn't clear.  I agree with you.  I was just saying that the
> > particular argument based on ugly tarball names isn't the reason.
>
> > Duncan Murdoch
>
> Thank you (and Gabe).
>
> We have had some R core internal "talk" about Jim Hester's
> suggestion (of adding underscores to the allow characters in
> package names).
> Duncan had already given a good reason why such a change would be problematic
> (the underscore being used as unique separator of package name
>  and version in source and binary package archives),
> and with Jim's offer to find and provide patches for all places
> this is used in the R sources, we've convinced ourselves that
> there is much more code "out there", notably 'devops' code in
> scripts, which currently relies

Re: [Rd] install_github and survival

2019-09-09 Thread Jim Hester
I just sent a PR that makes a few small changes to the package and
fixes the installation with `install_github()` (and also `R CMD
INSTALL`) and other ways of installing the package.

https://github.com/therneau/survival/pull/84

Jim

On Thu, Sep 5, 2019 at 1:53 PM Therneau, Terry M., Ph.D. via R-devel
 wrote:
>
> I treat CRAN as the main repository for survival, but I have also had a github
> (therneau/survival) version for a couple of years.  It has a vignette2 
> directory, for
> instance, that contains extra vignettes that either take too long to run or 
> depend on
> other packages.  It also gets updated more often than CRAN (though those 
> updates mght not
> be as well tested yet).
>
> In any case, since it is there, people will of course run install_github 
> against it.
> I've added a config script to do the one extra step necessary, but when I try
> install_github it fails.   I'm clearly doing something wrong.  If someone 
> were willing to
> contribute a fix I would be most grateful.
>
> survival3.1-0 is almost ready for CRAN, by the way.   Reverse dependency 
> checks of hdnom
> turned up one last thing to repair...
>
>
> Terry Therneau
>
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] ALTREP string methods for substr and nchar

2019-12-19 Thread Jim Hester
A useful extension of ALTREP would be two new string methods: one
returning the number of characters of a given string element, and one
returning a substring of an element.

Having these methods would allow retrieving these values without
needing to create a CHARSXP for the full element data, which could
potentially be costly for long elements.

For example, say you have an ALTREP altstring vector where each element
holds the sequence of a single chromosome; it would be useful to query
the length of each chromosome and retrieve the first 100 characters,
etc., without having to put the whole chromosome in memory. I realize
there are tools in Bioconductor to handle this particular case, but the
general case seems perfect for ALTREP.
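
A minimal sketch of the access pattern those methods would serve; here
`chrom` is an ordinary in-memory vector standing in for an
ALTREP-backed one:

chrom <- c(chr1 = strrep("ACGT", 25))  # stand-in for a long sequence
nchar(chrom[["chr1"]])         # could be answered from metadata alone
#> [1] 100
substr(chrom[["chr1"]], 1, 8)  # could read just the leading bytes
#> [1] "ACGTACGT"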

Jim

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Suggestion/opinions: add an `absolute` param to `normalizePath()` to force returning an absolute path

2020-04-15 Thread Jim Hester
The fs[1] function `fs::path_abs()` does what I believe you were
expecting `normalizePath()` to do in this case. e.g.

setwd("~")
normalizePath("foo/bar")
#> Warning in normalizePath("foo/bar") :
#> path[1]="foo/bar": No such file or directory
#> [1] "foo/bar"

fs::path_abs("foo/bar")
#> /Users/jhester/foo/bar

[1]: https://CRAN.R-project.org/package=fs


On Tue, Apr 14, 2020 at 1:03 PM Dean Attali  wrote:
>
> This request stems off a bug report I posted
> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17757 where it was
> determined the current behaviour is as expected.
>
> To recap: when given a real file, normalizePath() always* returns the full
> absolute path. When given a non-existent file, normalizePath() returns a
> full path on Windows but it returns the input on other systems*. I'd argue
> that there are benefits to being able to reliably and consistently get a
> full path, regardless of whether the file exists or not. In order to not
> break existing behaviour, I propose adding an argument `absolute = FALSE`
> that will attempt to return an absolute path when the argument is set to
> TRUE. I don't have any evidence for this claim, but I believe that others
> who use this function would expect, like I did, that an absolute path is
> returned regardless of the file state. I understand the documentation is
> correct because it warns the absolute path may not be returned, but I
> believe it would be a useful feature to support.
>
>
> * I've tested this on Win7, Win10, two versions of MacOS, ubuntu. This
> behaviour may not be true in other OSes
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Add a new environment variable switch for the 'large version' check

2020-04-15 Thread Jim Hester
If you test a package with `R CMD check --as-cran`, one of the
'incoming' checks is for a large version number; it gives a NOTE like
this:

* checking CRAN incoming feasibility ... NOTE
Maintainer: ‘Jim Hester ’

Version contains large components (0.0.0.9000)

This is a useful check when packages are submitted to CRAN because it
catches these large version components, which typically are reserved
for development versions of packages.

However, when checking packages during development these large versions
are often expected, so this note can be confusing for those new to
package development.

Currently the only way to turn off this particular check is to turn
off _all_ of the CRAN incoming checks.

The following patch (also attached) adds an environment variable that
can be used to disable just this check, allowing users to turn it off
when a large version is expected. The default behavior (and CRAN's
usage) would remain unchanged.
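
With the patch applied, a development-time check could then opt out
like this (the tarball name is just an example):

_R_CHECK_CRAN_INCOMING_SKIP_LARGE_VERSION_=true R CMD check --as-cran mypkg_0.0.0.9000.tar.gz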

diff --git a/src/library/tools/R/QC.R b/src/library/tools/R/QC.R
index 062722127a..64acd72c5e 100644
--- a/src/library/tools/R/QC.R
+++ b/src/library/tools/R/QC.R
@@ -6963,7 +6963,9 @@ function(dir, localOnly = FALSE)
     if(grepl("(^|[.-])0[0-9]+", ver))
         out$version_with_leading_zeroes <- ver
     unlisted_version <- unlist(package_version(ver))
-    if(any(unlisted_version >= 1234 & unlisted_version != as.integer(format(Sys.Date(), "%Y"))))
+    if(any(unlisted_version >= 1234 & unlisted_version != as.integer(format(Sys.Date(), "%Y"))) &&
+       !config_val_to_logical(Sys.getenv("_R_CHECK_CRAN_INCOMING_SKIP_LARGE_VERSION_", "FALSE")))
         out$version_with_large_components <- ver
 
 .aspell_package_description_for_CRAN <- function(dir, meta = NULL) {
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] defining r audio connections

2020-05-07 Thread Jim Hester
https://github.com/jimhester/archive was not allowed on CRAN when I
submitted it 3 years ago due to this restriction.

Being able to write custom connections is a useful feature for a number of
applications, I would love this policy to be reconsidered.

On Wed, May 6, 2020 at 10:30 PM Henrik Bengtsson 
wrote:

> What's the gist of the problem of making/having this part of the public
> API? Is it security, is it stability, is it that the current API is under
> construction, is it a worry about maintenance load for R Core, ...? Do we
> know why?
>
> It sounds like it's a feature that is  useful. I think we missed out on
> some great enhancements in the past because of it not being part of the
> public API.
>
> /Henrik
>
> On Wed, May 6, 2020, 16:26 Martin Morgan  wrote:
>
> > yep, you're right, after some initial clean-up and running with or
> without
> > --as-cran R CMD check gives a NOTE
> >
> >   *  checking compiled code
> >   File ‘socketeer/libs/socketeer.so’:
> > Found non-API calls to R: ‘R_GetConnection’,
> >‘R_new_custom_connection’
> >
> >   Compiled code should not call non-API entry points in R.
> >
> >   See 'Writing portable packages' in the 'Writing R Extensions' manual.
> >
> > Connections in general seem more useful than ad-hoc functions, though
> > perhaps for Frederick's use case Duncan's suggestion is sufficient. For
> > non-CRAN packages I personally would implement a connection.
> >
> > (I mistakenly thought this was a more specialized mailing list; I
> wouldn't
> > have posted to R-devel on this topic otherwise)
> >
> > Martin Morgan
> >
> > On 5/6/20, 4:12 PM, "Gábor Csárdi"  wrote:
> >
> > AFAIK that API is not allowed on CRAN. It triggers a NOTE or a
> > WARNING, and your package will not be published.
> >
> > Gabor
> >
> > On Wed, May 6, 2020 at 9:04 PM Martin Morgan <
> mtmorgan.b...@gmail.com>
> > wrote:
> > >
> > > The public connection API is defined in
> > >
> > >
> >
> https://github.com/wch/r-source/blob/trunk/src/include/R_ext/Connections.h
> > >
> > > I'm not sure of a good pedagogic example; people who want to write
> > their own connections usually want to do so for complicated reasons!
> > >
> > > This is my own abandoned attempt
> >
> https://github.com/mtmorgan/socketeer/blob/b0a1448191fe5f79a3f09d1f939e1e235a22cf11/src/connection.c#L169-L192
> > where connection_local_client() is called from R and _connection_local()
> > creates and populates the appropriate structure. Probably I have done
> > things totally wrong (e.g., by not checking the version of the API, as
> > advised in the header file!)
> > >
> > > Martin Morgan
> > >
> > > On 5/6/20, 2:26 PM, "R-devel on behalf of Duncan Murdoch" <
> > r-devel-boun...@r-project.org on behalf of murdoch.dun...@gmail.com>
> > wrote:
> > >
> > > On 06/05/2020 1:09 p.m., frede...@ofb.net wrote:
> > > > Dear R Devel,
> > > >
> > > > Since Linux moved away from using a file-system interface for
> > audio, I think it is necessary to write special libraries to interface
> with
> > audio hardware from various languages on Linux.
> > > >
> > > > In R, it seems like the appropriate datatype for a
> `snd_pcm_t`
> > handle pointing to an open ALSA source or sink would be a "connection".
> > Connection types are already defined in R for "file", "url", "pipe",
> > "fifo", "socketConnection", etc.
> > > >
> > > > Is there a tutorial or an example package where a new type of
> > connection is defined, so that I can see how to do this properly in a
> > package?
> > > >
> > > > I can see from the R source that, for example, `do_gzfile` is
> > defined in `connections.c` and referenced in `names.c`. However, I
> thought
> > I should ask here first in case there is a better place to start, than
> > trying to copy this code.
> > > >
> > > > I only want an object that I can use `readBin` and `writeBin`
> > on, to read and write audio data using e.g. `snd_pcm_writei` which is
> part
> > of the `alsa-lib` package.
> > >
> > > I don't think R supports user-defined connections, but probably
> > writing
> > > readBin and writeBin equivalents specific to your library
> > wouldn't be
> > > any harder than creating a connection.  For those, you will
> > probably
> > > want to work with an "external pointer" (see Writing R
> > Extensions).
> > > Rcpp probably has support for these if you're working in C++.
> > >
> > > Duncan Murdoch
> > >
> > > __
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
> > > __
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
> > __
> > R-devel

[Rd] Mapping parse tree elements to tokens

2015-07-29 Thread Jim Hester
I would like to map the parsed tokens obtained from utils::getParseData()
to the parse tree and elements obtained by base::parse().

It looks like back when this code was in the parser package the parse()
function annotated the elements in the tree with their id, which would
allow you to perform this mapping.  However when the code was included in R
this functionality was removed.

?getParseData states
  The ‘id’ values are not attached to the elements of the parse
  tree, they are only retained in the table returned by
  ‘getParseData’.

Is there another way you can map between the getParseData() tokens and
elements of the parse tree that makes this additional annotation
unnecessary?  Or is this simply not possible?
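
For reference, a minimal example of the table in question; the id and
parent columns are what I would like to relate back to the result of
parse():

p <- parse(text = "x <- 1  # a comment", keep.source = TRUE)
utils::getParseData(p)[, c("id", "parent", "token", "text")]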

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Mapping parse tree elements to tokens

2015-07-29 Thread Jim Hester
As Michael guessed, my main use case was code analysis.  A concrete example
where this would help is with my test code coverage tool covr.  There is
currently a bug when tracking coverage for if / else statements when the
clauses do not contain brackets (https://github.com/jimhester/covr/issues/39).
Only one source reference is generated in this case (it is parsed as a
single expression), so it is not possible to track each of the clauses
separately.  While I can get the source reference for the entire
statement, in order to extract the if / else clauses I need to either use
the tokenized information from getParseData(), or re-parse the entire
if / else expression by hand (which seems prone to error to me).
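
To make the problem concrete, a braceless if / else comes back as a
single expression with a single source reference:

exprs <- parse(text = "if (x) f() else g()", keep.source = TRUE)
length(exprs)                 # one expression...
#> [1] 1
utils::getSrcref(exprs)[[1]]  # ...and one srcref covering both clauses
#> if (x) f() else g()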

Another example of where this would help is linking comments to
expressions.  While I know this topic has been discussed previously (
https://stat.ethz.ch/pipermail/r-devel/2009-March/052731.html) and I am
fine with the default parser dropping comments, having the ability to map
the more detailed tokens back to the parse tree would allow comments to
be attached to their closest expression.

Of the three options you propose I think simply supplying the index as an
additional column from the getParseData() output would be the most
straightforward to implement and use.

While it is true that you can get most of the way there with the current
source references, as Michael mentions, in some cases having more
fine-grained location information is useful, and there is currently no
great way to get it without re-parsing the full expressions from the
source reference.

The current getParseData output is already very implementation-specific,
so I don't think it would be a great additional support burden to add the
indexing information.  Likely the whole function would have to be removed
if a different parsing method were used.

Regardless I am glad others have shown some interest in this issue, thank
you for taking the time to read and respond!

Jim

On Wed, Jul 29, 2015 at 2:47 PM, Duncan Murdoch 
wrote:

> On 29/07/2015 2:30 PM, Michael Lawrence wrote:
>
>> Probably need a generic tree based on "ParseNode" objects that
>> associate the line information with the symbol (for leaf nodes). As
>> Duncan notes, it should be possible to gather that from the table.
>>
>> But it would be nice if there was an "expr" column in the parse data
>> column in addition to "text". It would contain the parsed object.
>> Otherwise, to use the table, one is often reparsing the text, which
>> just seems redundant and inconvenient.
>>
>
> Can you (both Jim and Michael) describe the uses you might have for this?
> There are lots of possible changes that could make this information
> available:
>
>  - attach to each item in the parse tree, as the parser package did.  (Bad
> idea for general use which is why I dropped it, but
> it could be done as a special option to parse, if you aren't planning to
> evaluate the expression.)
>  - give the index into the parse tree of each item, i.e. c(1,1), c(1,2),
> c(1,3) in the example below, or just 1,2,3 along with a function to
> reconstruct the full path.
>  - give a copy of the branch of the parse tree, as Michael suggests.
>
> etc.  Which is best for your purposes?
>
> Duncan Murdoch
>
>
>> Michael
>>
>> On Wed, Jul 29, 2015 at 9:43 AM, Duncan Murdoch
>>  wrote:
>> > On 29/07/2015 12:13 PM, Jim Hester wrote:
>> >>
>> >> I would like to map the parsed tokens obtained from
>> utils::getParseData()
>> >> to the parse tree and elements obtained by base::parse().
>> >>
>> >> It looks like back when this code was in the parser package the parse()
>> >> function annotated the elements in the tree with their id, which would
>> >> allow you to perform this mapping.  However when the code was included
>> in
>> >> R
>> >> this functionality was removed.
>> >
>> >
>> > Yes, not all elements of the parse tree can legally have attributes
>> > attached.
>> >>
>> >>
>> >> ?getParseData states
>> >>The ‘id’ values are not attached to the elements of the parse
>> >>tree, they are only retained in the table returned by
>> >>‘getParseData’.
>> >>
>> >> Is there another way you can map between the getParseData() tokens and
>> >> elements of the parse tree that makes this additional annotation
>> >> unnecessary?  Or is this simply not possible?
>> >
>> >
>> > I think you can't get to it, though you can get close by looking at the
>> id &
>> > parent values in the

[Rd] unloadNamespace() does not address unevaluated promises in the S3 Methods Table

2015-12-22 Thread Jim Hester
Given the extremely simple package at
https://github.com/jimhester/testUnload, which includes only one S3 method,
'print.object', the following code produces a lazy-load error in a new R
session (R-devel r69801):
install.packages("testUnload", repos = NULL)
library("testUnload")
unloadNamespace("testUnload")
install.packages("testUnload", repos = NULL)
library("testUnload")
#> Error in get(method, envir = home) :
#>   lazy-load database '{sic}/testUnload/R/testUnload.rdb' is corrupt
#> In addition: Warning message:
#> In get(method, envir = home) : internal error -3 in R_decompress1
#> Error: package or namespace load failed for ‘testUnload’

Upon investigation, this is because the code in registerS3Methods creates a
promise using 'delayedAssign' for the 'print.object' function in the
'.__S3MethodsTable__.' environment within the base environment (which is
where the 'print' generic is defined); see lines 1387-1489 in
src/library/base/R/namespace.R.

When the second install.packages is called the files are changed before the
original promise is evaluated, which causes the error. An easy way to see
this is to explicitly evaluate the promise prior to the reinstall, which
removes the error.

library("testUnload")
get(".__S3MethodsTable__.", envir = baseenv())$print.object
#> function(x, ...) x
#> 
unloadNamespace("testUnload")
install.packages("testUnload", repos = NULL)
library("testUnload")


Explicitly deleting the promise after unloading the namespace also fixes
this issue.

library("testUnload")
unloadNamespace("testUnload")
rm(list="print.object", envir = get(".__S3MethodsTable__.", envir =
baseenv()))
install.packages("testUnload", repos = NULL)
library("testUnload")


In my opinion, once the namespace is unloaded the corresponding entries
should be removed from the S3 methods table by default.

If others agree with this assessment I can try to provide a patch to
'unloadNamespace' to fix this, but I imagine it will be somewhat tricky to
get correct so others more familiar with this code may be better suited
than I.

Jim

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] Support for user defined unary functions

2017-03-16 Thread Jim Hester
R has long supported user defined binary (infix) functions, defined
with `%fun%`. A one line change [1] to R's grammar allows users to
define unary (prefix) functions in the same manner.

`%chr%` <- function(x) as.character(x)
`%identical%` <- function(x, y) identical(x, y)

%chr% 100
#> [1] "100"

%chr% 100 %identical% "100"
#> [1] TRUE

This seems a natural extension of the existing functionality and
requires only a minor change to the grammar. If this change seems
acceptable I am happy to provide a complete patch with suitable tests
and documentation.

[1]:
Index: src/main/gram.y
===================================================================
--- src/main/gram.y (revision 72358)
+++ src/main/gram.y (working copy)
@@ -357,6 +357,7 @@
     |   '+' expr %prec UMINUS   { $$ = xxunary($1,$2); setId( $$, @$); }
     |   '!' expr %prec UNOT     { $$ = xxunary($1,$2); setId( $$, @$); }
     |   '~' expr %prec TILDE    { $$ = xxunary($1,$2); setId( $$, @$); }
+    |   SPECIAL expr            { $$ = xxunary($1,$2); setId( $$, @$); }
     |   '?' expr                { $$ = xxunary($1,$2); setId( $$, @$); }
 
     |   expr ':'  expr          { $$ = xxbinary($2,$1,$3); setId( $$, @$); }

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Support for user defined unary functions

2017-03-16 Thread Jim Hester
Gabe,

The unary functions have the same precedence as normal SPECIALS
(although the new unary forms take precedence over binary SPECIALS),
so they are lower precedence than unary + and -. Yes, both of your
examples are valid with this patch; here are the results and quoted
forms to show the precedence.

`%chr%` <- function(x) as.character(x)
`%identical%` <- function(x, y) identical(x, y)
quote("100" %identical% %chr% 100)
#>  "100" %identical% (`%chr%`(100))

"100" %identical% %chr% 100
#> [1] TRUE

`%num%` <- as.numeric
quote(1 + - %num% "5")
#> 1 + -(`%num%`("5"))

1 + - %num% "5"
#> [1] -4

Jim

On Thu, Mar 16, 2017 at 12:01 PM, Gabriel Becker  wrote:
> Jim,
>
> This seems cool. Thanks for proposing it. To be concrete, he user-defined
> unary operations would be of the same precedence (or just slightly below?)
> built-in unary ones? So
>
> "100" %identical% %chr% 100
>
> would work and return TRUE under your patch?
>
> And  with %num% <- as.numeric, then
>
> 1 + - %num% "5"
>
> would also be legal (though quite ugly imo) and work?
>
> Best,
> ~G
>
> On Thu, Mar 16, 2017 at 7:24 AM, Jim Hester 
> wrote:
>>
>> R has long supported user defined binary (infix) functions, defined
>> with `%fun%`. A one line change [1] to R's grammar allows users to
>> define unary (prefix) functions in the same manner.
>>
>> `%chr%` <- function(x) as.character(x)
>> `%identical%` <- function(x, y) identical(x, y)
>>
>> %chr% 100
>> #> [1] "100"
>>
>> %chr% 100 %identical% "100"
>> #> [1] TRUE
>>
>> This seems a natural extension of the existing functionality and
>> requires only a minor change to the grammar. If this change seems
>> acceptable I am happy to provide a complete patch with suitable tests
>> and documentation.
>>
>> [1]:
>> Index: src/main/gram.y
>> ===================================================================
>> --- src/main/gram.y (revision 72358)
>> +++ src/main/gram.y (working copy)
>> @@ -357,6 +357,7 @@
>>     |   '+' expr %prec UMINUS   { $$ = xxunary($1,$2); setId( $$, @$); }
>>     |   '!' expr %prec UNOT     { $$ = xxunary($1,$2); setId( $$, @$); }
>>     |   '~' expr %prec TILDE    { $$ = xxunary($1,$2); setId( $$, @$); }
>> +   |   SPECIAL expr            { $$ = xxunary($1,$2); setId( $$, @$); }
>>     |   '?' expr                { $$ = xxunary($1,$2); setId( $$, @$); }
>>
>>     |   expr ':'  expr          { $$ = xxbinary($2,$1,$3); setId( $$, @$); }
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>
>
> --
> Gabriel Becker, PhD
> Associate Scientist (Bioinformatics)
> Genentech Research

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Support for user defined unary functions

2017-03-16 Thread Jim Hester
I used the `function(x)` form to explicitly show the function was
being called with only one argument; clearly performance implications
are not relevant for these examples.

I think of this mainly as a gap in the tooling we provide users and
package authors. R has native prefix `+1`, functional `f(1)` and infix
`1 + 1` operators, but we only provide a mechanism to create user
defined functional and infix operators.

One could also argue that the user defined infix operators are also
ugly and could be replaced by `f(a, b)` calls as well; beauty is in
the eye of the beholder.

The unquote example [1] shows one example where this gap in tooling
caused authors to co-opt existing unary exclamation operator, this
same gap is part of the reason the formula [2] and question mark [3]
operators have been used elsewhere in non standard contexts.

If the language provided package authors with a native way to create
unary operators like it already does for the other operator types
these machinations would be unnecessary.

[1]: https://github.com/hadley/rlang/blob/master/R/tidy-unquote.R#L17
[2]: https://cran.r-project.org/package=ensurer
[3]: https://cran.r-project.org/package=types

On Thu, Mar 16, 2017 at 1:04 PM, Gabriel Becker  wrote:
> Martin,
>
> Jim can speak directly to his motivations; I don't claim to be able to do
> so. That said, I suspect this is related to a conversation on twitter about
> wanting an infix "unquote" operator in the context of the non-standard
> evaluation framework Hadley Wickham and Lionel Henry (and possibly others)
> are working on.
>
> They're currently using !!! and !! for things related to this, but this
> effectively requires non-standard parsing, as ~!!x is interpreted as
> ~(`!!`(x)) rather than ~(!(!(x)) as the R parser understands it. Others and
> I pointed out this was less than desirable, but if something like it was
> going to happen it would hopefully happen in the language specification,
> rather than in a package (and also hopefully not using !! specifically).
>
> Like you, I actually tend to prefer the functional form myself in most
> cases. There are functional forms that would work for the above case (e.g.,
> something like the .() that DBI uses), but that's probably off topic here,
> and not a decision I'm directly related to anyway.
>
> Best,
> ~G
>
>
>
> On Thu, Mar 16, 2017 at 9:51 AM, Martin Maechler
>  wrote:
>>
>> >>>>> Jim Hester 
>> >>>>> on Thu, 16 Mar 2017 12:31:56 -0400 writes:
>>
>> > Gabe,
>> > The unary functions have the same precedence as normal SPECIALS
>> > (although the new unary forms take precedence over binary SPECIALS).
>> > So they are lower precedence than unary + and -. Yes, both of your
>> > examples are valid with this patch, here are the results and quoted
>> > forms to see the precedence.
>>
>> > `%chr%` <- function(x) as.character(x)
>>
>>   [more efficient would be `%chr%` <- as.character]
>>
>> > `%identical%` <- function(x, y) identical(x, y)
>> > quote("100" %identical% %chr% 100)
>> > #>  "100" %identical% (`%chr%`(100))
>>
>> > "100" %identical% %chr% 100
>> > #> [1] TRUE
>>
>> > `%num%` <- as.numeric
>> > quote(1 + - %num% "5")
>> > #> 1 + -(`%num%`("5"))
>>
>> > 1 + - %num% "5"
>> > #> [1] -4
>>
>> > Jim
>>
>> I'm sorry to be a bit of a spoiler to "coolness", but
>> you may know that I like to  applaud Norm Matloff for his book
>> title "The Art of R Programming",
>> because for me good code should also be beautiful to some extent.
>>
>> I really very much prefer
>>
>>f(x)
>> to%f% x
>>
>> and hence I really really really cannot see why anybody would prefer
>> the ugliness of
>>
>>1 + - %num% "5"
>> to
>>1 + -num("5")
>>
>> (after setting  num <- as.numeric )
>>
>> Martin
>>
>>
>> > On Thu, Mar 16, 2017 at 12:01 PM, Gabriel Becker
>>  wrote:
>> >> Jim,
>> >>
>> >> This seems cool. Thanks for proposing it. To be concrete, he
>> user-defined
>> >> unary operations would be of the same precedence (or just slightly
>> below?)
>> >> built-in unary ones? So
>> >>
>> >> "100" %identical% %chr% 100
>> >>
>> >

Re: [Rd] Support for user defined unary functions

2017-03-17 Thread Jim Hester
This works the same way as `?` (defined in R code) and `-` and `+`
(defined in C) do now: you define one function that handles calls with
both unary and binary arguments.

quote(a %f% %f% b)
#> a %f% (`%f%`(b))

`%f%` <- function(a, b) {
  if (missing(b)) {
    force(a); cat("unary\n")
  } else {
    force(a); force(b); cat("binary\n")
  }
}
a <- 1
b <- 2

a %f% %f% b
#> unary
#> binary


This also brings up the question of what happens to existing
user-defined functions such as `%in%` when they are used as unary
functions (likely by mistake). Happily, this produces a useful error
when run, assuming no default value for the second argument.

%in% a
#> Error in match(x, table, nomatch = 0L) :
#>   argument "table" is missing, with no default


On Thu, Mar 16, 2017 at 7:13 PM, Duncan Murdoch
 wrote:
> I don't have a positive or negative opinion on this yet, but I do have a
> question.  If I define both unary and binary operators with the same name
> (in different frames, presumably), what would happen?
>
> Is "a %chr% b" a syntax error if unary %chr% is found first?  If both might
> be found, does "a %chr% %chr% b" mean "%chr%(a, %chr% b)", or is it a syntax
> error (like typing "a %chr%(%chr%(b))" would be)?
>
> Duncan Murdoch
>
>
>
>
>
> On 16/03/2017 10:24 AM, Jim Hester wrote:
>>
>> R has long supported user defined binary (infix) functions, defined
>> with `%fun%`. A one line change [1] to R's grammar allows users to
>> define unary (prefix) functions in the same manner.
>>
>> `%chr%` <- function(x) as.character(x)
>> `%identical%` <- function(x, y) identical(x, y)
>>
>> %chr% 100
>> #> [1] "100"
>>
>> %chr% 100 %identical% "100"
>> #> [1] TRUE
>>
>> This seems a natural extension of the existing functionality and
>> requires only a minor change to the grammar. If this change seems
>> acceptable I am happy to provide a complete patch with suitable tests
>> and documentation.
>>
>> [1]:
>> Index: src/main/gram.y
>> ===================================================================
>> --- src/main/gram.y (revision 72358)
>> +++ src/main/gram.y (working copy)
>> @@ -357,6 +357,7 @@
>>     |   '+' expr %prec UMINUS   { $$ = xxunary($1,$2); setId( $$, @$); }
>>     |   '!' expr %prec UNOT     { $$ = xxunary($1,$2); setId( $$, @$); }
>>     |   '~' expr %prec TILDE    { $$ = xxunary($1,$2); setId( $$, @$); }
>> +   |   SPECIAL expr            { $$ = xxunary($1,$2); setId( $$, @$); }
>>     |   '?' expr                { $$ = xxunary($1,$2); setId( $$, @$); }
>>
>>     |   expr ':'  expr          { $$ = xxbinary($2,$1,$3); setId( $$, @$); }
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Support for user defined unary functions

2017-03-17 Thread Jim Hester
I agree there is no reason they _need_ to have the same precedence, but
I think SPECIALS already have the proper precedence for both unary
and binary calls: namely, higher than all the binary operators (except
for `:`), but lower than the other unary operators. Even if we gave
unary specials their own precedence I think it would end up in the
same place.

`%l%` <- function(x) tail(x, n = 1)
%l% 1:5
#> [1] 5
%l% -5:-10
#> [1] -10

On Thu, Mar 16, 2017 at 6:57 PM, William Dunlap  wrote:
> I am biased against introducing new syntax, but if one is
> experimenting with it one should make sure the precedence feels right.
> I think the unary and binary minus-sign operators have different
> precedences so I see no a priori reason to make the unary and binary
> %xxx% operators to be the same.
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
>
> On Thu, Mar 16, 2017 at 3:18 PM, Michael Lawrence
>  wrote:
>> I guess this would establish a separate "namespace" of symbolic prefix
>> operators, %*% being an example in the infix case. So you could have stuff
>> like %?%, but for non-symbolic (spelled out stuff like %foo%), it's hard to
>> see the advantage vs. foo(x).
>>
>> Those examples you mention should probably be addressed (eventually) in the
>> core language, and it looks like people are already able to experiment, so
>> I'm not sure there's a significant impetus for this change.
>>
>> Michael
>>
>>
>> On Thu, Mar 16, 2017 at 10:51 AM, Jim Hester 
>> wrote:
>>
>>> I used the `function(x)` form to explicitly show the function was
>>> being called with only one argument, clearly performance implications
>>> are not relevant for these examples.
>>>
>>> I think of this mainly as a gap in the tooling we provide users and
>>> package authors. R has native prefix `+1`, functional `f(1)` and infix
>>> `1 + 1` operators, but we only provide a mechanism to create user
>>> defined functional and infix operators.
>>>
>>> One could also argue that the user defined infix operators are also
>>> ugly and could be replaced by `f(a, b)` calls as well; beauty is in
>>> the eye of the beholder.
>>>
>>> The unquote example [1] shows one example where this gap in tooling
>>> caused authors to co-opt existing unary exclamation operator, this
>>> same gap is part of the reason the formula [2] and question mark [3]
>>> operators have been used elsewhere in non standard contexts.
>>>
>>> If the language provided package authors with a native way to create
>>> unary operators like it already does for the other operator types
>>> these machinations would be unnecessary.
>>>
>>> [1]: https://github.com/hadley/rlang/blob/master/R/tidy-unquote.R#L17
>>> [2]: https://cran.r-project.org/package=ensurer
>>> [3]: https://cran.r-project.org/package=types
>>>
>>> On Thu, Mar 16, 2017 at 1:04 PM, Gabriel Becker 
>>> wrote:
>>> > Martin,
>>> >
>>> > Jim can speak directly to his motivations; I don't claim to be able to do
>>> > so. That said, I suspect this is related to a conversation on twitter
>>> about
>>> > wanting an infix "unquote" operator in the context of the non-standard
>>> > evaluation framework Hadley Wickham and Lionel Henry (and possibly
>>> others)
>>> > are working on.
>>> >
>>> > They're currently using !!! and !! for things related to this, but this
>>> > effectively requires non-standard parsing, as ~!!x is interpreted as
>>> > ~(`!!`(x)) rather than ~(!(!(x)) as the R parser understands it. Others
>>> and
>>> > I pointed out this was less than desirable, but if something like it was
>>> > going to happen it would hopefully happen in the language specification,
>>> > rather than in a package (and also hopefully not using !! specifically).
>>> >
>>> > Like you, I actually tend to prefer the functional form myself in most
>>> > cases. There are functional forms that would work for the above case
>>> (e.g.,
>>> > something like the .() that DBI uses), but that's probably off topic
>>> here,
>>> > and not a decision I'm directly related to anyway.
>>> >
>>> > Best,
>>> > ~G
>>> >
>>> >
>>> >
>>> > On Thu, Mar 16, 2017 at 9:51 AM, Martin Maechler
>>> >  wrote:
>>> >>
>>> >> 

Re: [Rd] Support for user defined unary functions

2017-03-17 Thread Jim Hester
The unquoting discussion is IMHO separate from this proposal and as
you noted probably better served by a native operator with different
precedence.

I think the main benefit of providing user-defined prefix operators is
that it allows package authors to experiment with operator ideas and
gauge community interest. The current situation means any novel unary
semantics either need to co-opt existing unary operators or propose
changes to the R parser, neither of which is ideal for
experimentation.

The user-defined pipe operator (%>%), now used by more than 300
packages, is an example of how giving package authors the power to
experiment can produce ideas that benefit the community.

On Fri, Mar 17, 2017 at 9:46 AM, Gabriel Becker  wrote:
> Jim,
>
> One more note about precedence. It prevents a solution like the one you
> proposed from solving all of the problems you cited. By my reckoning, a
> "What comes next is for NSE" unary operator needs an extremely low
> precedence, because it needs to greedily grab "everything" (or a large
> amount) that comes after it. Normal-style unary operators, on the other
> hand, explicitly don't want that.
>
> From what I can see, your patch provides support for the latter but not the
> former.
>
> That said I think there are two issues here. One is can users define unary
> operators. FWIW my opinion on that is roughly neutral to slightly positive.
> The other issue is can we have quasi quotation of the type that Hadley and
> Lionel need in the language. This could be solved without allowing
> user-defined unary specials, and we would probably want it to be, as I doubt
> ~ %!%x + %!%y + z is  particularly aesthetically appealing to most (it isn't
> to me). I'd propose coopting unary @ for that myself. After off list
> discussions with Jonathan Carrol and with Michael Lawrence I think it's
> doable, unambiguous, and even imo pretty intuitive for an "unquote"
> operator.
>
> Best,
> ~G
>
> On Fri, Mar 17, 2017 at 5:10 AM, Jim Hester 
> wrote:
>>
>> I agree there is no reason they _need_ to be the same precedence, but
>> I think SPECIALS are already have the proper precedence for both unary
>> and binary calls. Namely higher than all the binary operators (except
>> for `:`), but lower than the other unary operators. Even if we gave
>> unary specials their own precedence I think it would end up in the
>> same place.
>>
>> `%l%` <- function(x) tail(x, n = 1)
>> %l% 1:5
>> #> [1] 5
>> %l% -5:-10
>> #> [1] -10
>>
>> On Thu, Mar 16, 2017 at 6:57 PM, William Dunlap  wrote:
>> > I am biased against introducing new syntax, but if one is
>> > experimenting with it one should make sure the precedence feels right.
>> > I think the unary and binary minus-sign operators have different
>> > precedences, so I see no a priori reason to make the unary and binary
>> > %xxx% operators the same.
>> > Bill Dunlap
>> > TIBCO Software
>> > wdunlap tibco.com
>> >
>> >
>> > On Thu, Mar 16, 2017 at 3:18 PM, Michael Lawrence
>> >  wrote:
>> >> I guess this would establish a separate "namespace" of symbolic prefix
>> >> operators, %*% being an example in the infix case. So you could have
>> >> stuff
>> >> like %?%, but for non-symbolic (spelled out stuff like %foo%), it's
>> >> hard to
>> >> see the advantage vs. foo(x).
>> >>
>> >> Those examples you mention should probably be addressed (eventually) in
>> >> the
>> >> core language, and it looks like people are already able to experiment,
>> >> so
>> >> I'm not sure there's a significant impetus for this change.
>> >>
>> >> Michael
>> >>
>> >>
>> >> On Thu, Mar 16, 2017 at 10:51 AM, Jim Hester 
>> >> wrote:
>> >>
>> >>> I used the `function(x)` form to explicitly show the function was
>> >>> being called with only one argument, clearly performance implications
>> >>> are not relevant for these examples.
>> >>>
>> >>> I think of this mainly as a gap in the tooling we provide users and
>> >>> package authors. R has native prefix `+1`, functional `f(1)` and infix
>> >>> `1 + 1` operators, but we only provide a mechanism to create user
>> >>> defined functional and infix operators.
>> >>>
>> >>> One could also argue that the user defined infix operators are also
>> >>> ugly and could be replac

[Rd] Consider increasing the size of HSIZE

2017-05-16 Thread Jim Hester
The HSIZE constant, which sets the size of the hash table used to
store symbols, is currently defined as `#define HSIZE 4119`. This value
was last increased in r5182 on 1999-07-15.

https://github.com/jimhester/hashsize#readme contains code that
simulates a normal R workflow by loading a handful of packages. In the
example more than 20,000 symbols end up in the hash table,
resulting in a load factor greater than 5. The histogram in the
linked repository shows the distribution of bucket sizes for the hash
table.

This high load factor means most queries into the hash table result in
a collision, requiring an additional linear search of the linked list
for each bucket. It is common for growable hash tables to increase
their size when the load factor exceeds 0.75, so I think it
would be of benefit to increase the HSIZE constant considerably; to
32768 or possibly 65536. This will result in increased memory
requirements for the hash table, but far fewer collisions.
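
A quick back-of-the-envelope check of those numbers (the symbol count is
the approximate figure from the linked example):

n_symbols <- 20000  # symbols present after loading a handful of packages
HSIZE <- 4119       # current hash table size
n_symbols / HSIZE   # load factor, i.e. average chain length
#> [1] 4.855547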

To get an idea of the performance implications the repository includes
some benchmarks of looking up the first element in a given hash
bucket, and the last element (for buckets over 10 elements long). The
results are somewhat noisy, because for longer symbol names hashing the
name and performing string comparisons tends to dominate the time rather
than searching the list. But for symbols of similar length there is a
2X-4X difference in lookup performance between retrieving the first
element in a bucket and retrieving the last (indicated by the `total`
column in the table).

Increasing the size of `HSIZE` seems like an easy way to improve the
performance of an operation that occurs thousands if not millions of
times for every R session, with very limited cost in memory.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R history: Why the suffix character ‘L’ for integer constants?

2017-06-16 Thread Jim Hester
The relevant sections of the C standard are
http://c0x.coding-guidelines.com/5.2.4.2.1.html, which specifies that C
ints are only guaranteed to be 16 bits and C long ints at least 32 bits in
size, as Peter mentioned. Also http://c0x.coding-guidelines.com/6.4.4.1.html
specifies l or L as the suffix for long int constants.

However R does define integers as `int` in its source code, so use of L is
not strictly correct if a compiler uses 16-bit int types. I guess this
ambiguity is why the `int32_t` typedef exists.
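
For what it is worth, an R integer is in practice 32 bits on every
platform R currently supports, which is easy to confirm from R itself:

typeof(1L)
#> [1] "integer"
.Machine$integer.max
#> [1] 2147483647   # 2^31 - 1, i.e. a 32-bit signed int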

On Fri, Jun 16, 2017 at 3:01 PM, William Dunlap via R-devel <
r-devel@r-project.org> wrote:

> "Writing R Extensions" says "int":
>
> R storage mode   C type           FORTRAN type
> logical          int*             INTEGER
> integer          int*             INTEGER
> double           double*          DOUBLE PRECISION
> complex          Rcomplex*        DOUBLE COMPLEX
> character        char**           CHARACTER*255
> raw              unsigned char*   none
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Fri, Jun 16, 2017 at 11:53 AM, peter dalgaard  wrote:
> >
> > Wikipedia claims that C ints are still only guaranteed to be at least 16
> bits, and longs are at least 32 bits. So no, R's integers are long.
> >
> > -pd
> >
> > > On 16 Jun 2017, at 20:20 , William Dunlap via R-devel <
> r-devel@r-project.org> wrote:
> > >
> > > But R "integers" are C "ints", as opposed to S "integers", which are C
> > > "long ints".  (I suppose R never had to run on ancient hardware with 16
> bit
> > > ints.)
> > >
> > > Bill Dunlap
> > > TIBCO Software
> > > wdunlap tibco.com
> > >
> > > On Fri, Jun 16, 2017 at 10:47 AM, Yihui Xie  wrote:
> > >
> > >> Yeah, that was what I heard from our instructor when I was a graduate
> > >> student: L stands for Long (integer).
> > >>
> > >> Regards,
> > >> Yihui
> > >> --
> > >> https://yihui.name
> > >>
> > >>
> > >> On Fri, Jun 16, 2017 at 11:00 AM, Serguei Sokol <
> so...@insa-toulouse.fr
> >
> > >> wrote:
> > >>> Le 16/06/2017 à 17:54, Henrik Bengtsson a écrit :
> > 
> >  I'm just curious (no complaints), what was the reason for choosing
> the
> >  letter 'L' as a suffix for integer constants?  Does it stand for
> >  something (literal?), is it because it visually stands out, ..., or
> no
> >  specific reason at all?
> > >>>
> > >>> My guess is that it is inherited form C "long integer" type (contrary
> to
> > >>> "short integer" or simply "integer")
> > >>> https://en.wikipedia.org/wiki/C_data_types
> > >>
> > >> __
> > >> R-devel@r-project.org mailing list
> > >> https://stat.ethz.ch/mailman/listinfo/r-devel
> > >
> > >   [[alternative HTML version deleted]]
> > >
> > > __
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> > --
> > Peter Dalgaard, Professor,
> > Center for Statistics, Copenhagen Business School
> > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> > Phone: (+45)38153501
> > Office: A 4.23
> > Email: pd@cbs.dk  Priv: pda...@gmail.com
> >
> >
> >
> >
> >
> >
> >
> >
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] The ByteCompile & LazyLoading fields

2017-07-04 Thread Jim Hester
In WRE [1] it states

> Several optional fields take logical values: these can be specified as ‘yes’, 
> ‘true’, ‘no’ or ‘false’: capitalized values are also accepted.

And if you look at the source [2], [3] you will see exactly what
values this entails.

[1]: 
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#The-DESCRIPTION-file
[2]: 
https://github.com/wch/r-source/blob/212add0254abe36d1f77e5248f9c9a2bf95884d8/src/library/tools/R/install.R#L
[3]: 
https://github.com/wch/r-source/blob/a3a73a730962fa214b4af0ded55b497fb5688b8b/src/library/tools/R/utils.R#L2162-L2168
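
A rough sketch of the kind of normalization involved (this is not the
actual tools internals; the exact accepted spellings are in the linked
sources):

parse_desc_logical <- function(x) {
  if (x %in% c("yes", "Yes", "true", "True", "TRUE")) TRUE
  else if (x %in% c("no", "No", "false", "False", "FALSE")) FALSE
  else NA
}
parse_desc_logical("yes")
#> [1] TRUE
parse_desc_logical("YES")
#> [1] NA   # not one of the documented spellings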

On Mon, Jul 3, 2017 at 2:35 PM, Colin Gillespie  wrote:
> Hi,
>
> In the DESCRIPTION file the ByteCompile and LazyLoading arguments appear to
> accept any value.
>
> From the manual the field should be a "logical field". However, authors
> interpret this in a variety of ways:
>
> unique(tools::CRAN_package_db()$ByteCompile)
> # [1] NA     "TRUE" "yes"  "true" "Yes"  "no"
> unique(tools::CRAN_package_db()$LazyData)
> #  [1] NA      "true"  "TRUE"  "yes"   "no"              "false"
> #  [7] "True"  "Yes"   "FALSE" "YES"   "LazyData: true"  "NA"
> # [13] "No"
>
> I presume that all non-NA values are treated as TRUE.
>
> This observation applies to other logical fields in the DESCRIPTION file.
>
> Colin
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] write.csv

2017-07-04 Thread Jim Hester
On Linux at least you can use `/dev/full` [1] to test writing to a full device.

> echo 'foo' > /dev/full
bash: echo: write error: No space left on device

Although that won't be a perfect test for this case where part of the
file is written successfully.

An alternative suggestion for testing this is to create and mount a
loop device [2] with a small file.

[1]: https://en.wikipedia.org/wiki//dev/full
[2]: https://stackoverflow.com/a/16044420/2055486
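
For instance, a quick way to exercise this from R itself (Linux only; at
the time of this thread the failure went unreported, so the expectation
below is what a fixed R should do):

tryCatch(
  write.csv(1:10, "/dev/full"),
  error = function(e) conditionMessage(e)
)
#> on a fixed R this returns a write error message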

On Tue, Jul 4, 2017 at 3:38 PM, Duncan Murdoch  wrote:
> On 04/07/2017 8:40 AM, Lipatz Jean-Luc wrote:
>>
>> I would really like the bug fixed. At least this one, because I know
>> people in my institute using this function.
>> I understand your arguments about open source, but I also saw in this mailing
>> list a proposal for a fix for this bug, for which there was no answer from
>> the people who are able to include it in the distribution. It looks as if
>> there were the interesting bugs, and then all the others.
>
>
> Please post a link to that, and I'll look.  Bug reports should be posted to
> the bug list.  It's unfortunate that it is currently so difficult to do so,
> but if they are only posted here, they are often overlooked.
>
>> I don't understand the other arguments: the example was reproduced with a
>> simple USB key, and you cannot assume that a disk will always have enough
>> free space, especially when it has several users.
>
>
> I am not denying that it's a bug, I'm just saying that it is a difficult one
> to test automatically (so we probably won't add a regression test once it's
> fixed), and it's not one that has been reported often.  I didn't know there
> were any reports before yours.
>
> Duncan Murdoch
>
>
>> JLL
>>
>>
>> -Message d'origine-
>> De : Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
>> Envoyé : mardi 4 juillet 2017 14:24
>> À : Lipatz Jean-Luc; r-devel@r-project.org
>> Objet : Re: [Rd] write.csv
>>
>> On 04/07/2017 5:40 AM, Lipatz Jean-Luc wrote:
>>>
>>> Hi all,
>>>
>>> I am currently studying how to generalize the usage of R in my
>>> statistical institute and I encountered a problem that I cannot declare on
>>> bugzilla (cannot understand why).
>>
>>
>> Bugzilla was badly abused by spammers last year, so you need to have your
>> account created manually by one of the admins to post there.  Write to me
>> privately if you'd like me to create an account for you.  (If you want it
>> attached to a different email address, that's fine.)
>>
>> Sorry for trying this mailing list but I am really worried about the
>> problem itself and the possible implications of using R in a professional
>> data production context.
>>>
>>> The issue about 'write.csv' is that it just doesn't check if there is
>>> enough space on disk and doesn't report failure to write data.
>>>
>>> Example (R 3.4.0 windows 32 bits, but I reproduced the problem with older
>>> versions and under Mac OS/X)
>>>
 fwrite(as.list(1:1000000),"G:/Test")
>>>
>>> Error in fwrite(as.list(1:1e+06), "G:/Test") :
>>>   No space left on device: 'G:/Test'

 write.csv(1:1000000,"G:/Test")

>>>
>>> I have a big concern here, because it means that you could save some
>>> important data at one point of time and discover a long time after that you
>>> actually lost them.
>>
>>  > I suppose that the fix is relatively straightforward, but how can we
>> be sure that there is not another function with the same bad properties?
>>
>> R is open source.  You could work out the patch for this bug, and in the
>> process see the pattern of coding that leads to it.  Then you'll know if
>> other functions use the same buggy pattern.
>>
>>> Is the lesson that you should not use a R function, even from the core,
>>> without having personnally tested it against extreme conditions?
>>
>>
>> I think the answer to that is yes.  Most people never write such big
>> files that they fill their disk:  if they did, all sorts of things would
>> go wrong on their systems.  So this kind of extreme condition isn't
>> often tested.  It's not easy to test in a platform independent way:  R
>> would need to be able to create a volume with a small capacity.  That's
>> a very system-dependent thing to do.
>>
>>> And wouldn't it be the work of the developpers to do such elementary
>>> tests?
>>
>>
>> Again, R is open source.  You can and should contribute code (and
>> therefore become one of the developers) if you are working in unusual
>> conditions.
>>
>> R states quite clearly in the welcome message every time it starts: "R
>> is free software and comes with ABSOLUTELY NO WARRANTY."  This is
>> essentially the same lack of warranty that you get with commercial
>> software, though it's stated a lot more clearly.
>>
>> Duncan Murdoch
>>
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Wish List: base::source() + Add Execution Time Argument

2017-12-21 Thread Jim Hester
R does provide the addTaskCallback / taskCallbackManager to run a
callback function after every top level command. However, there is no
equivalent interface that is run _before_ each command, which
would make it possible to time top level calls and provide other
execution measurements.
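
A minimal sketch of the existing after-the-fact hook:

addTaskCallback(function(expr, value, ok, visible) {
  message("completed: ", deparse(expr)[[1]])
  TRUE  # returning TRUE keeps the callback registered
})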

On Thu, Dec 21, 2017 at 11:31 AM, William Dunlap via R-devel
 wrote:
> Is source() the right place for this?  It may be, but we've had customers
> who would like
> this sort of thing done for commands entered by hand.  And there are those
> who want
> a description of any "non-trivial" objects created in .GlobalEnv by each
> expression, ...
> Do they need a way to wrap each expression evaluated in envir=.GlobalEnv
> with a
> function of their choice, one that would print times, datasets created,
> etc.?
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Thu, Dec 21, 2017 at 3:46 AM, Juan Telleria  wrote:
>
>> Dear R Developers,
>>
>> Adding a timer to the base source() function to report the execution time
>> of the sourced code would be a very welcome feature, and in my opinion
>> not difficult to implement as an additional function argument.
>>
>> The source(timing = TRUE) function shall execute internally the following
>> code for each statement:
>>
>> old <- Sys.time() # get start time at the beginning of source()
>> # source code
>> # print elapsed time
>> new <- Sys.time() - old # calculate difference
>> print(new) # print in nice format
>>
>> Thank you.
>>
>> Kind regards,
>>
>> Juan Telleria
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] subset data.frame at C level

2020-06-23 Thread Jim Hester
It looks to me like internally .subset2 uses `get1index()`, but this
function is declared in Defn.h, which AFAIK is not part of the exported R
API.

Looking at the code for `get1index()`, it looks like it just loops over the
(translated) names, so I guess I will just do that [0].

[0]:
https://github.com/r-devel/r-svn/blob/1ff1d4197495a6ee1e1d88348a03ff841fd27608/src/main/subscript.c#L226-L235
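
For reference, a minimal C sketch of that loop using only the installed
API (error handling and encoding corner cases omitted):

#include <string.h>
#include <Rinternals.h>

SEXP get_column(SEXP df, const char *name)
{
    SEXP names = Rf_getAttrib(df, R_NamesSymbol);
    for (R_xlen_t i = 0; i < Rf_xlength(names); i++) {
        /* translateChar mirrors what get1index() does with each name */
        if (strcmp(Rf_translateChar(STRING_ELT(names, i)), name) == 0)
            return VECTOR_ELT(df, i);
    }
    return R_NilValue; /* column not found */
}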

On Wed, Jun 17, 2020 at 6:11 AM Morgan Morgan 
wrote:

> Hi,
>
> Hope you are well.
>
> I was wondering if there is a function at C level that is equivalent to
> mtcars$carb or .subset2(mtcars, "carb").
>
> If I have the index of the column then the answer would be VECTOR_ELT(df,
> asInteger(idx)) but I was wondering if there is a way to do it directly
> from the name of the column without having to loop over columns names to
> find the index?
>
> Thank you
> Best regards
> Morgan
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Sys.timezone() fails on Linux under Microsoft WSL

2021-04-26 Thread Jim Hester
One way to avoid the call to timedatectl is to set the `TZ` environment
variable on your machine to your local timezone; if this is set,
`Sys.timezone()` uses it and does not try to query timedatectl for the
timezone.
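
For example (the zone name here is just illustrative):

Sys.setenv(TZ = "America/New_York")
Sys.timezone()
#> [1] "America/New_York"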

This is a common issue in Docker containers as well: as on WSL,
timedatectl is present in Docker but non-functional.

Jim

On Tue, Apr 13, 2021 at 9:19 AM Brenton Wiernik  wrote:

> In Microsoft’s Windows Subsystem for Linux (WSL or WSL2), there is no
> systemd framework, so utilities that depend on it fail. This includes
> timedatectl, which R uses in Sys.timezone(). The timedatectl utility is
> present on Linux systems installed under WSL/WSL2, but is non-functional.
> So, when Sys.timezone() checks for Sys.which("timedatectl"), it receives a
> false positive. The subsequent methods after this if () do work, however.
>
> This can be fixed if line 42 of Sys.timezone() were changed from:
> if (nzchar(Sys.which("timedatectl"))) {
>
> to:
> if (nzchar(Sys.which("timedatectl")) && !grepl("microsoft", system("uname
> -r", intern = TRUE), ignore.case = TRUE)) {
>
> "uname -r" returns for example:
> "5.4.72-microsoft-standard-WSL2"
>
> So checking for "microsoft" or "WSL" would probably work.
>
>
> Brenton Wiernik
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [External] Possible ALTREP bug

2021-05-28 Thread Jim Hester
From reading the discussion on the Bioconductor issue tracker it seems
the GC is not suspended for the non-string ALTREP Elt methods
primarily due to performance concerns.

If this is the case perhaps an additional flag could be added to the
`R_set_altrep_*()` functions so ALTREP authors could indicate if GC should
be halted when that particular method is called for that particular ALTREP
class.

This would avoid the performance hit (other than a boolean check) for the
standard case when no allocations are expected, but allow authors to
indicate that R should pause GC if needed for methods in their class.

On Fri, May 28, 2021 at 9:42 AM  wrote:

> integer and real Elt methods are not expected to allocate. You would
> have to suspend GC to be able to do that. This currently can't be done
> from package code.
>
> Best,
>
> luke
>
> On Fri, 28 May 2021, Gábor Csárdi wrote:
>
> > I have found some weird SEXP corruption behavior with ALTREP, which
> > could be a bug. (Or I could be doing something wrong.)
> >
> > I have an integer ALTREP vector that calls back to R from the Elt
> > method. When this vector is indexed in a lapply(), its first element
> > gets corrupted. Sometimes it's just a type change to logical, but
> > sometimes the corruption causes a crash.
> >
> > I saw this on macOS from R 3.5.3 to 4.2.0. I created a small package
> > that demonstrates this: https://github.com/gaborcsardi/redfish
> >
> > The R callback in this package calls `loadNamespace("Matrix")`, but
> > the same crash happens for other packages as well, and sometimes it
> > also happens if I don't load any packages at all. (But that example
> > was much more complicated, so I went with the package loading.)
> >
> > It is somewhat random, and sometimes turning off the JIT avoids the
> > crash, but not always.
> >
> > Hopefully I am just doing something wrong in the ALTREP code (see
> > https://github.com/gaborcsardi/redfish/blob/main/src/test.c), and it
> > is not actually a bug.
> >
> > Thanks,
> > Gabor
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa  Phone: 319-335-3386
> Department of Statistics andFax:   319-335-3017
> Actuarial Science
> 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R test coverage

2023-11-27 Thread Jim Hester
It should be possible to use covr to do this; see this old issue
(https://github.com/r-lib/covr/issues/59). However, interpreting the
results can prove challenging, as covr naturally uses functions
from base R internally. Unfortunately base R and many of the
internal and recommended packages have somewhat bespoke installations,
so it would likely take some additional work on your end to get
reporting working. I had thought of doing this at one point, but
wasn't sure there would be any audience for the results, so did not
pursue it further.

For measuring coverage of the C code in the R runtime alone you could
use gcov and run the test suite, which depending on your contribution
may be the most informative route.

(replying for the list, accidentally replied only to Lluís the first time)

On Mon, Nov 27, 2023 at 10:15 AM Lluís Revilla  wrote:
>
> Hi all,
>
> I recently proposed a change in R that has led to the addition of new code
> to the R source code.
>
> The code added, as far as I know, has no tests and it might affect many
> packages in CRAN.
> I was wondering if adding a test would be helpful or it is already covered
> by some other test.
> Which brought me to the question: what is the coverage of R checks
> (check-devel)?
>
> My searches in the mailing list or other places didn't return anything
> close to it.
> But I am curious if someone has already done this and how.
>
> Many thanks,
>
> Lluís
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] LDFLAGS defined in R_MAKEVARS_USER file is ignored for R CMD SHLIB on Windows

2015-05-11 Thread Jim Hester
Example input and output to reproduce this can be found at
https://gist.github.com/jimhester/b7f05f50794c88e44b17.

I tested this attempting to compile the [digest](
http://cran.r-project.org/web/packages/digest/index.html) package, `run.sh`
and `run.bat` were both run in the package source directory on Ubuntu 14.01
and Windows 7 respectively.

In particular, while the `CFLAGS` values were properly passed to the
compiler on both Linux and Windows, the `LDFLAGS` value was only passed to
the linker on Linux, which caused the subsequent linking errors on Windows.
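
For reference, the R_MAKEVARS_USER file involved looked roughly like this
(the flags here are placeholders, not the exact ones from the gist):

# makevars_user.mk, pointed to by the R_MAKEVARS_USER environment variable
CFLAGS = -O0 -g
LDFLAGS = -L/usr/local/lib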

Perhaps this is intended behavior; if so, is there a different compiler
variable I can use to pass flags to the linker on Windows?

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] LDFLAGS defined in R_MAKEVARS_USER file is ignored for R CMD SHLIB on Windows

2015-05-13 Thread Jim Hester
I have tracked this discrepancy down to the use of `SHLIB_LD` rather than
`SHLIB_LINK` in share/make/winshlib.mk
<https://github.com/wch/r-source/blob/7348d71d1cb18e9c4b55950fd57198e8d2abcc8b/share/make/winshlib.mk>.
This variable has been used in winshlib.mk since svn r47953
<https://github.com/wch/r-source/commit/3ebd185c0745bdc7cb8dd185bd7df5ff7f827f18>,
however, the corresponding shlib.mk for Linux has always used `SHLIB_LINK`
instead.

The attached patch updates the variables in winshlib.mk to use `SHLIB_LINK`
and makes the behavior consistent across platforms, which fixes my issue.

On Mon, May 11, 2015 at 12:28 PM, Jim Hester 
wrote:

> Example input and output to reproduce this can be found at
> https://gist.github.com/jimhester/b7f05f50794c88e44b17.
>
> I tested this attempting to compile the [digest](
> http://cran.r-project.org/web/packages/digest/index.html) package,
> `run.sh` and `run.bat` were both run in the package source directory on
> Ubuntu 14.01 and Windows 7 respectively.
>
> In particular, while the `CFLAGS` values were properly passed to the
> compiler on both Linux and Windows, the `LDFLAGS` value was only passed to
> the linker on Linux, which caused the subsequent linking errors on Windows.
>
> Perhaps this is intended behavior; if so, is there a different compiler
> variable I can use to pass flags to the linker on Windows?
>
Index: share/make/winshlib.mk
===================================================================
--- share/make/winshlib.mk	(revision 68364)
+++ share/make/winshlib.mk	(working copy)
@@ -10,13 +10,13 @@
 $(SHLIB): $(OBJECTS)
 	@if test "z$(OBJECTS)" != "z"; then \
 	  if test -e "$(BASE)-win.def"; then \
-	echo $(SHLIB_LD) -shared $(DLLFLAGS) -o $@ $(BASE)-win.def $(OBJECTS) $(ALL_LIBS); \
-	$(SHLIB_LD) -shared $(DLLFLAGS) -o $@ $(BASE)-win.def $(OBJECTS) $(ALL_LIBS); \
+	echo $(SHLIB_LINK) -shared $(DLLFLAGS) -o $@ $(BASE)-win.def $(OBJECTS) $(ALL_LIBS); \
+	$(SHLIB_LINK) -shared $(DLLFLAGS) -o $@ $(BASE)-win.def $(OBJECTS) $(ALL_LIBS); \
 	  else \
 	echo EXPORTS > tmp.def; \
 	$(NM) $^ | $(SED) -n $(SYMPAT) $(NM_FILTER) >> tmp.def; \
-	echo $(SHLIB_LD) -shared $(DLLFLAGS) -o $@ tmp.def $(OBJECTS) $(ALL_LIBS); \
-	$(SHLIB_LD) -shared $(DLLFLAGS) -o $@ tmp.def $(OBJECTS) $(ALL_LIBS); \
+	echo $(SHLIB_LINK) -shared $(DLLFLAGS) -o $@ tmp.def $(OBJECTS) $(ALL_LIBS); \
+	$(SHLIB_LINK) -shared $(DLLFLAGS) -o $@ tmp.def $(OBJECTS) $(ALL_LIBS); \
 	$(RM) tmp.def; \
 	  fi \
 	fi
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel