Package: diff
Version: 1:3.0-1
Severity: normal

Dear Maintainer,

ever since installing stretch, I have had the feeling that diff has become
slower. Today, I had to compare two large text files (~1.3GB) with many
small changes, essentially the output of a recursive md5sum.

Some phone calls and two hours later, I interrupted diff and tried again
with diff --speed-large-files, since these are (very likely) large files
with scattered small changes: only a few lines (file md5s) are expected
to have changed.

This didn't seem to have an effect. Since I do this kind of diff rarely
but consistently, and I thought diff should be faster than that, I
randomly downloaded diffutils-2.9 from ftp.gnu.org and compiled it with
some minor portability fixes.

Indeed, diff from diffutils-2.9 managed to give me the diff in 132s,
whether I specify --speed-large-files or not, which is an acceptable time.
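
A rough way to reproduce this kind of input without the original files
might look like the sketch below. The line count N, the file names, and
the pattern of changed lines are all assumptions for illustration; it
has not been run against the real 1.3GB listings, and N would need to be
raised considerably to approach that size.

```shell
#!/bin/sh
# Hypothetical reproduction: N, the paths, and the change pattern are
# assumptions, not the original data.
N=${N:-100000}
tmp=$(mktemp -d)

# Lines shaped like md5sum output: "<32 hex digits>  ./dir/file_<i>"
awk -v n="$N" 'BEGIN { for (i = 1; i <= n; i++) printf "%032x  ./dir/file_%d\n", i, i }' > "$tmp/a.txt"

# Alter the hash on every 1000th line to mimic a few sprinkled changes
awk 'NR % 1000 == 0 { sub(/^0/, "f") } { print }' "$tmp/a.txt" > "$tmp/b.txt"

# $1 = output file, remaining arguments = extra diff options
time_diff() {
    out=$1; shift
    start=$(date +%s)
    # diff exits with status 1 when the files differ; only other codes are errors
    diff "$@" "$tmp/a.txt" "$tmp/b.txt" > "$out" || [ $? -eq 1 ]
    end=$(date +%s)
    echo "diff $*: $((end - start))s"
}

time_diff "$tmp/plain.diff"
time_diff "$tmp/heur.diff" --speed-large-files

# Count the lines reported as changed in the default-format output
changed=$(grep -c '^<' "$tmp/plain.diff")
echo "changed lines: $changed"
rm -rf "$tmp"
```

Running the same script with different diff binaries first in PATH
should show whether the slowdown depends on the diffutils version.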

I tried to run Debian's diff with LC_ALL=C, to see whether the locale
might be causing the problem (as grep had similar regressions), but gave
up after about 20 minutes of runtime.

So sometime between diffutils-2.9 and Debian stretch's version there
has been at least a 54-fold slowdown for diff, to the point where diff
becomes useless for large files. Note that 54x is only a lower bound,
as I didn't have the patience to wait for more than two hours.
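
The factor of 54 is simply the two hours waited divided by the 132s that
diffutils-2.9 needed, rounded down. The variable names below are only
for illustration:

```shell
# Lower bound on the slowdown: interrupted run time / diffutils-2.9 run time
waited=$((2 * 60 * 60))   # two hours, in seconds
old=132                   # seconds taken by diffutils-2.9
ratio=$((waited / old))   # integer division, so the result is rounded down
echo "at least ${ratio}x slower"
```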

Not sure if regressions of this type are considered a bug or simply a
wishlist item, but diff does lose its usefulness if it fails to provide
output within hours where it used to need only minutes.

-- System Information:
Debian Release: 9.1
  APT prefers stable
  APT policy: (990, 'stable'), (500, 'stable-updates'), (500, 'unstable'), (500, 'testing'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.9.61-040961-generic (SMP w/8 CPU cores)
Locale: LANG=C, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=C (charmap=UTF-8)
Shell: /bin/sh linked to /usr/bin/bash
Init: systemd (via /run/systemd/system)

Versions of packages diff depends on:
ii  diffutils  1:3.5-3

diff recommends no packages.

diff suggests no packages.

-- no debconf information
