Package: diffutils
Version: 1:3.10-2
Severity: normal

Dear Maintainer,

The options --show-c-function and --show-function-line=RE truncate the
function line they display in the hunk header. Unfortunately, this
truncation is counted in bytes, even when the files are encoded in UTF-8,
so it can cut in the middle of a multi-byte character.

Here's an example:

------ file1.c -------
void hello(void) { // Something ... café
        foo();
        bar();
}

------ file2.c -------
void hello(void) { // Something ... café
        foo();
        baz();
}
----------------------

And the result:

$ diff --unified=1 --show-c-function file*.c
--- file1.c     2025-01-21 04:42:57.260783930 +0100
+++ file2.c     2025-01-21 04:42:59.168782326 +0100
@@ -2,3 +2,3 @@ void hello(void) { // Something ... caf�
        foo();
-       bar();
+       baz();
 }


The minor consequences are that the length of the printed line is not
consistent (in characters) and that readability is impaired.
More annoyingly, the text produced by diff is no longer valid UTF-8.
This might have unintended consequences for other tools reading the
output of diff, including a crash if the program reading it interprets
multibyte characters strictly.
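
To illustrate what a strict reader might do, here is a minimal sketch of
a structural UTF-8 check. The name utf8_is_well_formed is hypothetical
(it is not part of diffutils or of any library I know of), and the check
is deliberately simplified: it does not reject every overlong or
surrogate encoding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper for illustration only.
   Returns true if buf[0..len) is structurally valid UTF-8: every lead
   byte is followed by the number of continuation bytes it announces. */
bool utf8_is_well_formed(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    size_t i = 0;
    while (i < len) {
        unsigned char b = buf[i];
        size_t need;                     /* continuation bytes expected */

        if (b < 0x80)
            need = 0;                    /* ASCII */
        else if (b >= 0xC2 && b <= 0xDF)
            need = 1;                    /* 2-byte sequence */
        else if (b >= 0xE0 && b <= 0xEF)
            need = 2;                    /* 3-byte sequence */
        else if (b >= 0xF0 && b <= 0xF4)
            need = 3;                    /* 4-byte sequence */
        else
            return false;                /* stray continuation or bad lead */

        if (need > len - i - 1)
            return false;                /* sequence runs past the buffer */

        for (size_t k = 1; k <= need; k++)
            if ((buf[i + k] & 0xC0) != 0x80)
                return false;            /* not a 10xxxxxx byte */

        i += need + 1;
    }
    return true;
}
```

With this check, the end of the hunk header above ("caf", then the byte
0xC3, then a newline) is rejected: 0xC3 announces one continuation byte,
but the newline that follows is not one.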

I'm not sure whether diff currently has a notion of file encoding or if
all it does is manipulate byte strings. In the latter case, a proper fix
might be more work.
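
For what it's worth, the function line in the example is 41 bytes long
and the output stops after 40 of them, so the limit appears to be 40
bytes. As a rough sketch of one possible direction (not a patch against
the actual diffutils source), the cut could be moved back to the previous
character boundary. The helper name utf8_safe_cut is hypothetical, and
the code assumes the input is valid UTF-8; a real fix would probably have
to honour the current locale instead, for instance via mbrtowc().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from diffutils.
   Returns the largest length <= max_bytes at which s[0..len) can be cut
   without splitting a UTF-8 sequence. Assumes s holds valid UTF-8. */
size_t utf8_safe_cut(const char *s, size_t len, size_t max_bytes)
{
    assert(s != NULL);

    if (len <= max_bytes)
        return len;                 /* nothing to truncate */

    size_t n = max_bytes;
    /* s[n] is the first byte that would be dropped. If it is a
       continuation byte (10xxxxxx), the cut falls inside a character,
       so back up until the cut sits on a character boundary. */
    while (n > 0 && ((unsigned char)s[n] & 0xC0) == 0x80)
        n--;
    return n;
}
```

On the example line, utf8_safe_cut(line, 41, 40) yields 39, so the header
would end in "caf" rather than in a lone 0xC3 byte. The printed length
still varies in characters, but the output stays valid UTF-8.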

Best regards,
Celelibi


-- System Information:
Debian Release: trixie/sid
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 6.11.7-amd64 (SMP w/2 CPU threads; PREEMPT)
Locale: LANG=fr_FR.UTF-8, LC_CTYPE=fr_FR.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages diffutils depends on:
ii  libc6  2.40-5

diffutils recommends no packages.

Versions of packages diffutils suggests:
pn  diffutils-doc  <none>
ii  wdiff          1.2.2-7

-- no debconf information
