On 26/02/2019 8:25 a.m., Sebastian Martin Krantz wrote:
Dear Developers,

Having spent time developing and thinking about how data aggregation and
summary statistics can be enhanced in R, I would like to present my
ideas/efforts in the form of two commands:

The first, which for now I called 'collap', is an upgrade of aggregate that
accommodates and extends the functionality of aggregate in various
respects, most importantly to work with multilevel and multi-type data,
multiple function calls, highly customized aggregation tasks, a much
greater flexibility in the passing of inputs and tidy output.

The second function, 'qsu', is an advanced and flexible summary command for
cross-sectional and multilevel (panel) data (i.e. it can provide overall,
between and within entities statistics, and allows for grouping, custom
functions and transformations). It also provides a quick method to compute
and output within-transformed data.

Both commands are efficiently built from core R, but provide for optional
integration with data.table, which renders them extremely fast on large
datasets. An explanation of the syntax, a demonstration and benchmark
results are provided in the attached vignette.

Since both commands accommodate existing functionality while adding
significant basic functionality, I thought that their addition to the stats
package would be a worthwhile consideration. I am happy for your feedback.

Generally the R Core group is reluctant to incorporate new functions into the base packages. Each function that is added adds to their work, and they already have too much to do. (I am no longer a member of R Core, but I don't think things have changed since I retired.)

It is much easier for them if volunteers publish functions themselves, via contributed packages.

Nowadays GitHub provides a very convenient platform on which you can develop a package containing your functions. If other users find bugs or have suggested improvements, it's very easy for them to send those to you, and you can make the fixes available immediately. Once you are satisfied that it is stable, you can submit it to CRAN, and anyone using R can easily install it.

If you find the prospect of writing a package daunting, you shouldn't. It's actually quite easy, especially if you are using RStudio or ESS (or some other helpful front-end). Hadley Wickham's book <http://r-pkgs.had.co.nz/> is a pretty accessible description of a development strategy. (It's not the only strategy, but lots of people use it.)

Duncan Murdoch

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
