> I see this on current sid/amd64 (Perl 5.24) too fwiw, and also in an
> amd64 chroot with Perl 5.22. I've no idea why it goes away for you on
> stretch. Can you confirm that? Are you only testing on i386 or on amd64
> as well?

The problem goes away on stretch, where perl -v reports "subversion 1 (v5.24.1) built for x86_64-linux-gnu-thread-multi", so that is amd64.
I have checked that many times.


> It looks like the difference between
>    $par = "(" . $inp . ")"; # is slow
> and
>    $par = sprintf "(%s)", $inp; # is fast
> internally is that the first one uses copy on write semantics, meaning
> it doesn't have to copy the whole string in memory. Of course, this
> is supposed to improve performance rather than degrade it.
Yes! I think the same.
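
For anyone who wants to look at this more closely, here is a minimal sketch of a harness. It builds the string both ways, dumps the two scalars with Devel::Peek so their flags can be compared, and times a match against each. The input and the regexp below are placeholders made up for illustration (they are not the ones from this report), so the script is not expected to reproduce the slowdown by itself. Substitute the real input and pattern.

```perl
use strict;
use warnings;
use Devel::Peek qw(Dump);
use Time::HiRes qw(time);

# Keep the dumps short: show at most 40 characters of each string.
{ no warnings 'once'; $Devel::Peek::pv_limit = 40; }

# Placeholder input and pattern: assumptions, not the data from this report.
my $inp = join "", map { "line $_\n" } 1 .. 1000;
my $re  = qr/^line (\d+)$/m;

# The two constructions under discussion.
my $par_concat  = "(" . $inp . ")";
my $par_sprintf = sprintf "(%s)", $inp;

# Count the matches and time them. The string is read through $_[0]
# (an alias to the caller's variable) so no extra copy is made here.
sub time_match {
    my $re    = $_[1];
    my $start = time;
    my $count = () = $_[0] =~ /$re/g;
    return ($count, time - $start);
}

print STDERR "== concat ==\n";
Dump($par_concat);     # flags go to STDERR; compare them with the next dump
print STDERR "== sprintf ==\n";
Dump($par_sprintf);

printf "concat:  %d matches in %.3f s\n", time_match($par_concat,  $re);
printf "sprintf: %d matches in %.3f s\n", time_match($par_sprintf, $re);
```

If the copy-on-write theory is right, the FLAGS line of the two dumps should differ, but that is something to check rather than assume.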



> AFAICS the regexp behaviour stays unchanged, it's just the performance
> that drops. I tried some debugging with 'debugperl -Dr' but it mostly
> hides the problem by slowing down the execution by itself.
But you can see the difference between the "fast" and "slow" lines:
the fast one takes half a second, the slow one 30 seconds.
That is a dramatic performance impact.


> It would be nice to distill the issue to a smaller test case but it's
> rather sensitive to the input as you noted so that doesn't seem to be
> easy.
I have tried hard to cut down both the input lines and the program to get a minimal test case. But when I remove one line it is still slow, when I remove 1000 lines it is still slow, and when I remove 1001 lines it gets fast.

Also, if you ADD twice that number of lines (doubling the input in size), it gets FAST... :) :)
Twice as much data, 50 times faster.

I think I was lucky to get input data that shows the problem. I can't craft a very small data example that does.
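
Since the slowdown seems to switch on and off with the input size, a sweep over line counts may find the thresholds faster than removing lines by hand. Below is a rough sketch of that idea. The input lines, the pattern and the sizes are all placeholders (assumptions, not the real test data), and the helper names are made up for this example.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Return the first $n lines of the input, joined back into one string.
sub first_lines {
    my ($lines, $n) = @_;
    $n = @$lines if $n > @$lines;
    return join "", @{$lines}[0 .. $n - 1];
}

# Time one run of the pattern over "(" . $inp . ")" and return the seconds.
sub time_concat_match {
    my ($inp, $re) = @_;
    my $par   = "(" . $inp . ")";    # the construction reported as slow
    my $start = time;
    () = $par =~ /$re/g;             # run all matches, discard the results
    return time - $start;
}

# Placeholders (assumptions): replace with the real input lines and regexp.
my @lines = map { "line $_\n" } 1 .. 4000;
my $re    = qr/^line (\d+)$/m;

# Sweep over a few sizes; narrow the range by hand once a jump shows up.
for my $n (1000, 2000, 3000, 4000) {
    my $secs = time_concat_match(first_lines(\@lines, $n), $re);
    printf "%6d lines: %.3f s\n", $n, $secs;
}
```

Running the same sweep with the sprintf construction in place of the concatenation would show whether the thresholds are specific to the slow variant.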


