On 26 Mar 2014, at 18:24, Radford Neal <radf...@cs.toronto.edu> wrote:

>> From: Richard Cotton <richiero...@gmail.com>
>>
>> The rep function is very versatile, but that versatility comes at a
>> cost: it takes a bit of effort to learn (and remember) its syntax.
>> This is a problem, since rep is one of the first functions many
>> beginners will come across. Of the three main uses of rep, two have
>> simpler alternatives.
>>
>>     rep(x, times = ) has rep.int
>>     rep(x, length.out = ) has rep_len
>>
>> I think that a rep_each function would be a worthy addition for the
>> third use case
>>
>>     rep(x, each = )
>>
>> (It might also be worth having rep_times as a synonym for rep.int.)
>
> I think this is exactly the wrong approach. Indeed, the aim should be
> to get rid of functions like rep.int (or at least discourage their
> use, even if they have to be kept for compatibility).
>
> Why is rep_each(x,n) better than rep(x,each=n)? There is no saving in
> typing (which would be trivial anyway). There *ought* to be no
> significant difference in speed (though that seems to have been the
> motive for rep.int). Are you trying to let students learn R without
> ever learning about specifying arguments by name?
>
> And where would you stop? How about seq_by(a,b,s) rather than having
> to arduously type seq(a,b,by=s)? Maybe we should have glm_binomial,
> glm_poisson, etc. so we don't have to remember the "family" argument?
> This way lies madness...

Spot on. Well, maybe a slight disagreement: In a weakly typed language
like R, you will always have performance losses due to type testing and
dispatching, and no compiler/interpreter is intelligent enough to
predict the types so that this can be avoided. Some amount of hinting is
needed for reliable speedups, either by having special functions for
simple cases (allowed to make assumptions on their inputs), or some sort
of #pragma-like construction.

Actually, rep.int seems to be a poor example of this since the speedup
is pretty negligible unless you do huge amounts of short replicates.
I expect that the S-PLUS compatibility was the main reason to have it.
Case in point:

> system.time(for(i in 1:10000000) rep("a",10))
   user  system elapsed
 16.721   0.125  19.037
> system.time(for(i in 1:10000000) rep.int("a",10))
   user  system elapsed
 14.356   0.050  14.611
> system.time(for(i in 1:1000000) rep("a",1000))
   user  system elapsed
 11.655   2.157  14.263
> system.time(for(i in 1:1000000) rep.int("a",1000))
   user  system elapsed
 10.957   1.708  12.917

For more spectacular speedups compare seq(1,10) to seq_len(10) or even
just to 1:10. Then again, the slowdown in seq() is so large that it is
hard to believe it to be completely unavoidable.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
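
For anyone who wants to try the seq() comparison suggested in the message, here is a minimal sketch in the same style as the rep timings. The iteration count n and the names n and timings are assumptions made for illustration, not part of the original post. n is far smaller than the loop counts used in the message so that the run finishes quickly, and no particular timings are claimed since they depend on the machine and the R version.

```r
# Assumed iteration count; the timings in the message used 1e6 to 1e7.
n <- 100000L

# Sanity check: all three calls produce the values 1, 2, ..., 10.
stopifnot(all(seq(1, 10) == seq_len(10)),
          identical(seq_len(10), 1:10))

# Time n evaluations of each form and keep only the elapsed seconds.
timings <- c(
  seq     = system.time(for (i in seq_len(n)) seq(1, 10))[["elapsed"]],
  seq_len = system.time(for (i in seq_len(n)) seq_len(10))[["elapsed"]],
  colon   = system.time(for (i in seq_len(n)) 1:10)[["elapsed"]]
)

print(timings)
```

Each entry includes the overhead of the for loop itself, so the numbers are only a rough relative comparison between the three forms.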