date:20210315

Re: [Rd] R extension memory leak detection question

2021-03-15 Thread Tomas Kalibera


On 3/12/21 7:43 PM, xiaoyan yu wrote:

> I am writing a C++ program based on R extensions and am also trying to test
> the program with Google's address sanitizer.
>
> I thought that if I don't protect a variable returned by an allocation API
> such as Rf_allocVector, there would be a memory leak. However, the address
> sanitizer didn't report one. Is my understanding correct? Or will I see the
> memory leak only if I compile the R source code with the address sanitizer?


Yes, you should use special options for compilation and linking to use 
address sanitizer. See Writing R Extensions, section 4.3.3.


If you allocate an R object using Rf_allocVector(), but don't protect 
it, it means this object is available for the garbage collector to 
reclaim. So it is not a memory leak.


Memory leaks with a garbage collector are much less common than without, 
because if the program loses a pointer to some piece of memory, that 
piece will automatically be reclaimed (not leaked). Still, memory leaks 
are possible if the program forgets about a pointer to some piece of 
memory no longer needed, and keeps that pointer in say some global 
structure. Such memory leaks would not be found using address sanitizer.
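
The distinction between "reclaimed" and "leaked through a forgotten pointer" can be illustrated from the R side. This is only a sketch under stated assumptions, not code from the thread: `state`, `note_finalized` and `make_object` are hypothetical names, and an environment with a finalizer stands in for an object allocated from C.

```r
# Hypothetical bookkeeping, for illustration only.
state <- new.env()
state$finalized <- character()  # labels of objects the collector reclaimed
state$cache <- list()           # a "global structure" that may keep pointers

# Build a finalizer that records the label of the reclaimed object.
note_finalized <- function(label) {
  force(label)
  function(obj) state$finalized <- c(state$finalized, label)
}

make_object <- function(label, keep = FALSE) {
  obj <- new.env()
  reg.finalizer(obj, note_finalized(label))
  if (keep) state$cache[[label]] <- obj  # the "forgotten" pointer
  invisible(NULL)
}

make_object("dropped")              # no reference survives the call
make_object("cached", keep = TRUE)  # still reachable from state$cache
invisible(gc())                     # full collection; pending finalizers run

state$finalized                     # labels of the objects that were reclaimed
```

The object that lost its last reference is reclaimed by the collector, while the one parked in `state$cache` stays alive for as long as the cache holds it, which is the kind of leak described above.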


Address sanitizer/Undefined behavior sanitizer can sometimes find errors 
caused by the program forgetting to protect an R object, but this is 
relatively rare, as they don't understand the R heap specifically, so you 
cannot assume that if you create such an example, the error will always be 
found.


Best
Tomas



> Please help!
>
> Thanks,
> Xiaoyan

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



Re: [Rd] Potential improvements of ave?

2021-03-15 Thread Abby Spurdle

Hi Thomas,

These are some great suggestions.
But I can't help but feel there's a much bigger problem here.

Intuitively, the ave function could (or should) sort the data.
Then the indexing step becomes almost trivial, in terms of both time
and space complexity.
And the ave function is not the only example of where a problem
becomes much simpler, if the data is sorted.

Historically, I've never found base R functions user-friendly for
aggregation purposes, or for sorting.
(At least, not by comparison to SQL).

But that's not the main problem.
It would seem preferable to sort the data, only once.
(Rather than sorting it repeatedly, or not at all).

Perhaps, objects such as vectors and data.frame(s) could have a
boolean attribute, to indicate if they're sorted.
Or functions such as ave could have a sorted argument.
In either case, if true, the function assumes the data is sorted and
applies a more efficient algorithm.

B.

On Sat, Mar 13, 2021 at 1:07 PM SOEIRO Thomas  wrote:
>
> Dear all,
>
> I have two questions/suggestions about ave, but I am not sure if it's 
> relevant for bug reports.
>
>
>
> 1) I have performance issues with ave in a case where I didn't expect it. The 
> following code runs as expected:
>
> set.seed(1)
>
> df1 <- data.frame(id1 = sample(1:1e2, 5e2, TRUE),
>                   id2 = sample(1:3, 5e2, TRUE),
>                   id3 = sample(1:5, 5e2, TRUE),
>                   val = sample(1:300, 5e2, TRUE))
>
> df1$diff <- ave(df1$val,
>                 df1$id1,
>                 df1$id2,
>                 df1$id3,
>                 FUN = function(i) c(diff(i), 0))
>
> head(df1[order(df1$id1,
>                df1$id2,
>                df1$id3), ])
>
> But when expanding the data.frame (* 1e4), ave fails (Error: cannot allocate 
> vector of size 1110.0 Gb):
>
> df2 <- data.frame(id1 = sample(1:(1e2 * 1e4), 5e2 * 1e4, TRUE),
>                   id2 = sample(1:3, 5e2 * 1e4, TRUE),
>                   id3 = sample(1:(5 * 1e4), 5e2 * 1e4, TRUE),
>                   val = sample(1:300, 5e2 * 1e4, TRUE))
>
> df2$diff <- ave(df2$val,
>                 df2$id1,
>                 df2$id2,
>                 df2$id3,
>                 FUN = function(i) c(diff(i), 0))
>
> This use case does not seem extreme to me (e.g. aggregate et al work 
> perfectly on this data.frame).
> So my question is: Is this expected/intended/reasonable? i.e. Does ave need 
> to be optimized?
>
>
>
> 2) Gabor Grothendieck pointed out in 2011 that drop = TRUE is needed to avoid 
> warnings in case of unused levels 
> (https://stat.ethz.ch/pipermail/r-devel/2011-February/059947.html).
> Is it relevant/possible to expose the drop argument explicitly?
>
>
>
> Thanks,
>
> Thomas
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
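
A note on the failure quoted above: ave() builds its grouping factor with interaction(), which by default keeps a level for every combination of the input levels, whether observed or not. For df2 that is up to about 1e6 * 3 * 5e4 = 1.5e11 potential groups; at 8 bytes per entry this is on the order of a terabyte, which would be consistent with the size of the failed allocation. The sketch below is one possible workaround and not a proposed patch: `ave_observed` is a hypothetical helper that numbers only the combinations that actually occur and passes a single grouping vector to ave(). It assumes the grouping values convert cleanly to strings and never contain "\r".

```r
# Hypothetical helper: like ave(), but only observed combinations of the
# grouping variables become groups.
ave_observed <- function(x, ..., FUN = mean) {
  # One string key per row; "\r" is assumed not to occur in the data.
  keys <- do.call(paste, c(unname(list(...)), sep = "\r"))
  # Integer group id in order of first appearance.
  g <- match(keys, unique(keys))
  ave(x, g, FUN = FUN)
}

# Same small data set as in the quoted message.
set.seed(1)
df1 <- data.frame(id1 = sample(1:1e2, 5e2, TRUE),
                  id2 = sample(1:3, 5e2, TRUE),
                  id3 = sample(1:5, 5e2, TRUE),
                  val = sample(1:300, 5e2, TRUE))

step <- function(i) c(diff(i), 0)
d_ave <- suppressWarnings(ave(df1$val, df1$id1, df1$id2, df1$id3, FUN = step))
d_obs <- ave_observed(df1$val, df1$id1, df1$id2, df1$id3, FUN = step)

all.equal(d_ave, d_obs)
```

The number of groups is then bounded by the number of rows rather than by the product of the level counts.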


[Rd] Faster sorting algorithm...

2021-03-15 Thread Morgan Morgan

Hi,
I am not sure if this is the right mailing list, so apologies in advance if
it is not.

I found the following link/presentation:
https://www.r-project.org/dsc/2016/slides/ParallelSort.pdf

The implementation of fsort is interesting but incomplete (not sure why?)
and can be improved or made faster (by at least 25%, I believe). I might be
wrong, but there may be a couple of bugs as well.

My questions are:

1/ Is the R Core team interested in a faster sorting algo? (Multithread or
even single threaded)

2/ I see an issue with the license, which is MPL-2.0, and hence not
compatible with base R, Python and Julia. Is there an interest to change
the license of fsort so all 3 languages (and all the people using these
languages) can benefit from it? (Like suggested on the first page)

Please let me know if there is an interest to address the above points, I
would be happy to look into it (free of charge of course!).

Thank you
Best regards
Morgan


Re: [Rd] Faster sorting algorithm...

2021-03-15 Thread Avraham Adler

Isn’t the default method now “radix” which is the data.table sort, and
isn’t that already parallel using openmp where available?

Avi


Re: [Rd] Faster sorting algorithm...

2021-03-15 Thread Morgan Morgan

The default method for sort is not radix (especially for character vectors).
You might want to read the documentation of sort.
For your second question, I invite you to look at the code of fsort. It is
implemented only for positive finite doubles, and defaults to
data.table:::forder when the type is anything other than positive double.
Please read the PDF link I sent; everything is explained in it.
Thank you
Morgan
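
The point about the sort method can be checked directly in base R. A small sketch, assuming nothing beyond the documented `method` argument of sort(); the vectors `x` and `y` are made up for illustration.

```r
x <- c("b", "A", "a", "B")

# Explicit radix sort: documented to use C-locale ordering for character
# vectors, so uppercase letters sort before lowercase ones.
radix_sorted <- sort(x, method = "radix")

# Default method: for character vectors the result follows the collation
# rules of the current locale and may therefore differ from the above.
default_sorted <- sort(x)

# For a short numeric vector both calls return the same order.
y <- c(3.5, -1, 2, 10)
same_numeric <- identical(sort(y), sort(y, method = "radix"))
```

Whether `default_sorted` equals `radix_sorted` depends on the locale the session runs in, which is one reason radix is not used by default for character vectors.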


[Rd] inheritance and attach

2021-03-15 Thread Therneau, Terry M., Ph.D. via R-devel

This change in R-devel just bit me.   Under the newest release, if I attach()
another .RData directory, the methods are not detected.
Was it intentional?   Running in Linux.   Here is a script of an example that
works fine under 3.6.2 but fails in R-devel.

tmt% mkdir temp1
tmt% cd temp1
tmt% R
  # define a silly method, just for testing

charlie <- function(x, ...)
     UseMethod("charlie")


charlie.default <- function(x, ...) {
     cat("default method ", x, "\n")
     x +2
}

charlie.character <- function(x, ...) {
     cat("character method ", x, "\n")
     as.character(as.numeric(x) + 2)
}

 > quit("yes")

tmt% cd ..
tmt% R
 > attach("temp1/.RData")
 > charlie( 4)
Error in UseMethod("charlie") :
   no applicable method for 'charlie' applied to an object of class "c('double', 'numeric')"



The use case was my local test environment for the survival package.  I can 
work around it.

-- 
Terry M Therneau, PhD
Department of Health Science Research
Mayo Clinic
thern...@mayo.edu

"TERR-ree THUR-noh"



Re: [Rd] Potential improvements of ave?

2021-03-15 Thread SOEIRO Thomas

Hi Abby,

Thank you for your positive feedback.

I agree with your general comment about sorting.

For ave specifically, ordering may not help because the output must maintain 
the order of the input (as ave returns only x and not the entire data.frame).

Thanks,

Thomas
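
For reference, the sort-based idea and the order constraint can be combined by sorting once, working on contiguous runs, and writing the results back through the sort permutation. This is only a sketch of that technique, not how ave() is implemented: `ave_sorted` is a hypothetical helper, and it assumes at least one grouping vector, a non-empty x, and no NA in the grouping values.

```r
# Hypothetical helper: sort once, process contiguous runs, restore input order.
ave_sorted <- function(x, ..., FUN = mean) {
  keys <- unname(list(...))
  ord <- do.call(order, keys)          # a single sort over all keys
  n <- length(x)
  new_run <- c(TRUE, logical(n - 1L))  # TRUE where a new group starts
  for (k in keys) {
    k <- k[ord]
    new_run[-1L] <- new_run[-1L] | (k[-1L] != k[-n])
  }
  run_id <- cumsum(new_run)            # group number of each sorted row
  xs <- x[ord]
  split(xs, run_id) <- lapply(split(xs, run_id), FUN)
  x[ord] <- xs                         # scatter back to the input order
  x
}

set.seed(1)
id1 <- sample(1:100, 500, TRUE)
id2 <- sample(1:3, 500, TRUE)
val <- sample(1:300, 500, TRUE)

step <- function(i) c(diff(i), 0)
res_sorted <- ave_sorted(val, id1, id2, FUN = step)
res_ave <- suppressWarnings(ave(val, id1, id2, FUN = step))

all.equal(res_sorted, res_ave)
```

Because order() leaves ties in their original relative order, each group sees its elements in the same sequence as with ave(), and the final assignment through `ord` returns every value to the row it came from.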


Re: [Rd] inheritance and attach

2021-03-15 Thread Simon Urbanek

Terry,

NEWS: CHANGES IN R 4.0.0 NEW FEATURES

 \item S3 method lookup now by default skips the elements of the
  search path between the global and base environments.

If you use attach(), S3 methods in the attached environment are hence no 
longer found by dispatch (because it sits between global and base) unless you 
register them using .S3method(). Without registration you have to load them 
into the global env for them to work, since this is now the only environment 
that doesn't require registration.

Cheers,
Simon
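
A minimal sketch of the registration route, assuming R >= 4.0.0 (where .S3method() is available). The environment `methods_env` and the search-path name "demo_methods" are hypothetical stand-ins for the attached .RData file; the `charlie` functions mirror the ones in the original report.

```r
# Stand-in for the attached .RData file: a generic plus a default method.
methods_env <- new.env()
methods_env$charlie <- function(x, ...) UseMethod("charlie")
methods_env$charlie.default <- function(x, ...) x + 2

attach(methods_env, name = "demo_methods")

# The generic is visible on the search path, but its method is not
# considered for dispatch, so this call signals an error.
before <- tryCatch(charlie(4), error = function(e) e)

# Register the method explicitly; dispatch then finds it.
.S3method("charlie", "default", methods_env$charlie.default)
after <- charlie(4)

detach("demo_methods")
```

The alternative mentioned above, loading the objects into the global environment (for example with load() instead of attach()), needs no registration.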




Re: [Rd] [EXTERNAL] Re: inheritance and attach

2021-03-15 Thread Therneau, Terry M., Ph.D. via R-devel

Thanks Simon.  I missed that.   It is a sensible change.
I had trouble because I had just changed computing environments this weekend
(a forced change due to an institutional directive), and this caught me right
after that, so I spent some time chasing my tail.  Murphy's law...

Terry T.



Re: [Rd] Potential improvements of ave?

2021-03-15 Thread Gabriel Becker

Abby,

Vectors do have an internal mechanism for knowing that they are sorted, via
ALTREP (it was one of 2 core motivating features for 'smart vectors', the
other being knowledge about the presence of NAs).

Currently I don't think we expose it at the R level, though it is part of
the official C API. I don't know of any plans for this to change, but I
suppose it could. Plus for functions in R itself, we could even use it
without exposing it more widely. A number of functions, including sort
itself, already do this in fact, but more could. I'd be interested in
hearing which functions you think would particularly benefit from this.

~G


Re: [Rd] Faster sorting algorithm...

2021-03-15 Thread Abby Spurdle

In principle, I agree that faster ranking/sorting algorithms are
important, and should be a priority.
But I can't help but feel that the paper focuses on textbook-oriented problems.

Given that in real world problems, there's almost always some form of
prior knowledge:
Wouldn't it be better, from a management perspective, to focus on
sorting algorithms, that incorporate that prior knowledge?

I'm not sure whether that's an R-devel discussion, or for another forum...


