Re: [Rd] Potential improvements of ave?

2021-03-16 Thread Martin Maechler
> Gabriel Becker
> on Mon, 15 Mar 2021 15:08:44 -0700 writes:
> Abby,
> Vectors do have an internal mechanism for knowing that they are sorted via ALTREP (it was one of 2 core motivating features for 'smart vectors', the other being knowledge about presence of NAs).

Re: [Rd] Potential improvements of ave?

2021-03-16 Thread Dirk Eddelbuettel
On 16 March 2021 at 10:50, Martin Maechler wrote:
| I vaguely remember (from Luke's docs/presentation on ALTREP)
| that there are some "missing parts" here.
| One of them the not-existing R level functionality, another may be
| the C code below R's is.unsorted() ... maybe is.unsorted()
| could

[Rd] R.sh and argument escaping

2021-03-16 Thread Ivan Krylov
Hello R-devel! The following sequence of commands results in an error message on a POSIX system:

    tab="`echo -ne "\t"`"
    LC_ALL=C Rscript -e " $tab 1"
    # ARGUMENT '~+~1' __ignored__

Tabs can sneak into the -e argument from indented multi-line arguments in shell scripts: Rscript -e ' foo()

[Rd] Undefined (so far as I can tell?) behavior of browser when called at top level of sourced script?

2021-03-16 Thread Gabriel Becker
Hi all, I was asked a question about why browser() was behaving a specific way, and it turned out that it was being called in a script (rather than in a function). Putting aside the design considerations that led to that, the behavior is actually a bit puzzling, and so far as I have been able to

Re: [Rd] Potential improvements of ave?

2021-03-16 Thread Abby Spurdle
There are some relatively obvious examples: unique, which.min/which.max/etc, range/min/max, quantile, aggregate/split. Also, many time series, graphics and spline functions are dependent on the order. In the case of data.frame(s), a boolean flag would probably need to be extended to allow for multi

Re: [Rd] Potential improvements of ave?

2021-03-16 Thread SOEIRO Thomas
Dear all, Thank you for your consideration of this topic. I do not have enough knowledge of R internals to join the discussion about sorting mechanisms. In fact, I did not get how ordering could help for ave as the output must maintain the order of the input (because ave returns only x and not

Re: [Rd] Potential improvements of ave?

2021-03-16 Thread Gabriel Becker
Hi Abby, I actually have a patch submitted that does this for unique/duplicated (only numeric cases I think) but it is, as patches from external contributors go, quite sizable which means it requires a correspondingly large amount of an R-core member's time and energy to vet and consider. It is in

Re: [Rd] Potential improvements of ave?

2021-03-16 Thread Bill Dunlap
Your proposed change (roughly, replacing interaction() by unique(paste())) slows down ave() considerably when there are long columns with lots of repeated rows. I think that interaction(drop=TRUE, ...) can be changed to use less memory and be faster by making a separate branch for drop=TRUE that u

Re: [Rd] Faster sorting algorithm...

2021-03-16 Thread Radford Neal
Those interested in faster sorting may want to look at the merge sort implemented in pqR (see pqR-project.org). It's often used as the default, because it is stable, and does different collations, while being faster than shell sort (except for small vectors). Here are examples, with timings, for