Re: [Rd] Bug: time complexity of substring is quadratic as string size and number of substrings increases

2019-02-28 Thread Tomas Kalibera
An optimized version of substring/substr is now in R-devel (76172). Best, Tomas On 2/22/19 8:16 PM, Tomas Kalibera wrote: On 2/20/19 7:55 PM, Toby Hocking wrote: Update: I have observed that stringi::stri_sub is linear time complexity, and it computes the same thing as base::substring. figure

Re: [Rd] Bug: time complexity of substring is quadratic

2019-02-23 Thread Radford Neal
> From: Tomas Kalibera > > Thanks for the report, I am working on a patch that will address this. > > I confirm there is a lot of potential for speedup. On my system, > > 'N=20; x <- substring(paste(rep("A", N), collapse=""), 1:N, 1:N)' > > spends 96% time in checking if the string is asci

Re: [Rd] Bug: time complexity of substring is quadratic as string size and number of substrings increases

2019-02-22 Thread Tomas Kalibera
On 2/20/19 7:55 PM, Toby Hocking wrote: Update: I have observed that stringi::stri_sub is linear time complexity, and it computes the same thing as base::substring. figure https://github.com/tdhock/namedCapture-article/blob/master/figure-substring-bug.png source: https://github.com/tdhock/namedCa

Re: [Rd] Bug: time complexity of substring is quadratic as string size and number of substrings increases

2019-02-20 Thread Toby Hocking
Update: I have observed that stringi::stri_sub is linear time complexity, and it computes the same thing as base::substring. figure https://github.com/tdhock/namedCapture-article/blob/master/figure-substring-bug.png source: https://github.com/tdhock/namedCapture-article/blob/master/figure-substring

[Rd] Bug: time complexity of substring is quadratic as string size and number of substrings increases

2019-02-20 Thread Toby Hocking
Hi all, (and especially hi to Tomas Kalibera who accepted my patch sent yesterday) I believe that I have found another bug, this time in the substring function. The use case that I am concerned with is when there is a single (character scalar) text/subject, and many substrings to extract. For exam