from:"Hugh Marera"

Re: [Rd] Use of C++ in Packages

2019-04-25 Thread Hugh Marera

Some of us are learning about development in R and use R in our data
analysis pipelines at work. What is the best way to identify packages that
currently have these C++ problems? I would like to be able to help fix the
bugs but more importantly not use these packages in critical work
pipelines. Any C++ R package bug squashing events out there?

Regards

Hugh

On Mon, Apr 1, 2019 at 6:23 PM Tomas Kalibera wrote:

> On 3/30/19 8:59 AM, Romain Francois wrote:
> > tl;dr: we need better C++ tools and documentation.
> >
> > We collectively know more now with the rise of tools like rchk and
> > improved documentation such as Tomas’s post. That’s a start, but it appears
> > that there still is a lot of knowledge that would deserve to be promoted to
> > actual documentation of best practices.
> Well there is quite a bit of knowledge in Writing R Extensions and many
> problems could have been prevented had it been read more thoroughly by
> package developers. The problem that C++ runs some functions
> automatically (like destructors), should not be too hard to identify
> based on what WRE says about the need for protection against garbage
> collection.
>
> From my experience, one can learn most about R internals from debugging
> and reading source code - when debugging PROTECT errors and other memory
> errors/memory corruption, common problems caused by bugs in native C/C++
> code - one needs to read and understand source code involved at all
> layers, one needs to understand the documentation covering code at
> different layers, and one has to think about these things, forming
> hypotheses, narrowing down to smaller examples, etc.
>
> My suggestion for package authors who write native code and want to
> learn more, and who want to be responsible (these kinds of bugs affect
> other packages indirectly and can be woken up by inconsequential and
> correct code changes, even in the R runtime): test and debug your code hard
> - look at UBSAN/ASAN/valgrind/rchk checks from CRAN and run these tools
> yourself if needed. Run with strict barrier checking and with gctorture.
> Write more tests to increase the coverage. Specifically now if you use
> C++ code, try to read all of your related code and check you do not have
> the problems I mentioned in my blog. Think of other related problems and
> if you find out about them, tell others. Make sure you only use the API from
> Writing R Extensions (and R help system). If you really can't find
> anything wrong about your package, but still want to learn more, try to
> debug some bugs reported against the R runtime or against your favorite
> packages you use (or their CRAN check reports from various tools). In
> addition to learning more about R internals, by spending much more time
> on debugging you may also get a different perspective on some of the
> things about C++ I pointed to. Finally, it would help us with the
> problem we have now - that many R packages in C++ have serious bugs.
>
> Tomas
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
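
For readers who have not used gctorture before, here is a minimal sketch of
running a piece of code with it switched on. The helper name `with_gctorture`
is hypothetical (it is not part of R or of any package mentioned above), and
the `sum()` call only stands in for a call into your own package's native code.

```r
# Hypothetical helper: evaluate an expression with gctorture switched on,
# then restore the previous setting even if the expression fails.
with_gctorture <- function(expr) {
  old <- gctorture(TRUE)               # returns the previous setting
  on.exit(gctorture(old), add = TRUE)  # always restore it
  force(expr)                          # the promise is evaluated here, under torture
}

# Stand-in for a call into native code under test.
with_gctorture(sum(as.numeric(1:10)))
#> [1] 55
```

With gctorture on, R runs a garbage collection at (almost) every allocation,
so a missing PROTECT is much more likely to show up as a crash or a wrong
result. It is very slow, so it is best kept to small, targeted tests.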


Re: [Rd] should base R have a piping operator ?

2019-10-05 Thread Hugh Marera

How is your argument different to, say,  "Should dplyr or data.table be
part of base R as they are the most popular data science packages and they
are used by a large number of users?"

Kind regards

On Sat, Oct 5, 2019 at 4:34 PM Ant F wrote:

> Dear R-devel,
>
> The most popular piping operator sits in the package `magrittr` and is used
> by a huge number of users, and imported/re-exported by more and more
> packages too.
>
> Many workflows don't even make much sense without pipes nowadays, so the
> examples in the doc will use pipes, as do the README, vignettes etc. I
> believe base R could have a piping operator so packages can use a pipe in
> their code or doc and stay dependency free.
>
> I don't suggest an operator based on complex heuristics, instead I suggest
> a very simple and fast one (>10 times faster than magrittr in my tests) :
>
> `%.%` <- function (e1, e2) {
>   eval(substitute(e2), envir = list(. = e1), enclos = parent.frame())
> }
>
> iris %.% head(.) %.% dim(.)
> #> [1] 6 5
>
> The difference with magrittr is that the dots must all be explicit (which
> sits with the choice of the name), and that special magrittr features such
> as assignment in place and building functions with `. %>% head() %>% dim()`
> are not supported.
>
> Edge cases are not surprising:
>
> ```
> x <- "a"
> x %.% quote(.)
> #> .
> x %.% substitute(.)
> #> [1] "a"
>
> f1 <- function(y) function() eval(quote(y))
> f2 <- x %.% f1(.)
> f2()
> #> [1] "a"
> ```
>
> Looking forward to your thoughts on this,
>
> Antoine
>
>
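
To make the "dots must all be explicit" point in the quoted proposal concrete,
here is a small sketch that repeats the `%.%` definition from the message
above and chains it on simple inputs. Nothing is assumed beyond base R and the
built-in `iris` data set; the example inputs are mine, not Antoine's.

```r
# Definition copied from the quoted message: evaluate the right-hand side
# with `.` bound to the value of the left-hand side.
`%.%` <- function(e1, e2) {
  eval(substitute(e2), envir = list(. = e1), enclos = parent.frame())
}

# The dot has to be written out at every step.
c(1, 4, 9) %.% sqrt(.) %.% sum(.)
#> [1] 6

iris %.% nrow(.)
#> [1] 150

# The dot can sit in any argument position, not only the first.
2 %.% seq(10, 20, by = .)
#> [1] 10 12 14 16 18 20
```

Like every `%any%` operator in R, `%.%` is left-associative, so a chain is
evaluated one step at a time from left to right.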


Re: [Rd] should base R have a piping operator ?

2019-10-05 Thread Hugh Marera

I exaggerated the comparison for effect. However, it is not very difficult
to find functions in dplyr or data.table or indeed other packages that one
may wish to be in base R. Examples, for me, could include
data.table::fread, dplyr::group_by & dplyr::summari[sZ]e combo, etc. Also,
the "popularity" of magrittr::`%>%` is mostly attributable to the tidyverse
(an advanced superset of R). Many R users don't even know that they are
installing the magrittr package.

On Sat, Oct 5, 2019 at 6:30 PM Iñaki Ucar wrote:

> On Sat, 5 Oct 2019 at 17:15, Hugh Marera wrote:
> >
> > How is your argument different to, say,  "Should dplyr or data.table be
> > part of base R as they are the most popular data science packages and they
> > are used by a large number of users?"
>
> Two packages with many features, dozens of functions and under heavy
> development to fix bugs, add new features and improve performance, vs.
> a single operator with a limited and well-defined functionality, and a
> reference implementation that hasn't changed in years (but certainly
> hackish in a way that probably could only be improved from R itself).
>
> Can't you really spot the difference?
>
> Iñaki
>

