Package: diffutils
Version: 1:3.10-2
Severity: normal

Dear Maintainer,

The options --show-c-function and --show-function-line truncate the
line they display. Unfortunately, the truncation is counted in bytes,
even when the files are encoded in UTF-8, so it can cut in the middle
of a multi-byte character.

Here's an example:

------ file1.c -------
void hello(void) { // Something ... café
    foo();
    bar();
}
------ file2.c -------
void hello(void) { // Something ... café
    foo();
    baz();
}
----------------------

And the result:

$ diff --unified=1 --show-c-function file*.c
--- file1.c	2025-01-21 04:42:57.260783930 +0100
+++ file2.c	2025-01-21 04:42:59.168782326 +0100
@@ -2,3 +2,3 @@ void hello(void) { // Something ... caf�
     foo();
-    bar();
+    baz();
 }

The minor consequence is that the length of the printed line isn't
consistent (in characters), and readability is impaired.

More annoyingly, the text produced by diff is no longer valid UTF-8.
This might have unintended consequences for other tools reading the
output of diff, including a crash if the program reading it rejects
invalid multi-byte sequences.

I'm not sure whether diff currently has a notion of file encoding or
if all it does is manipulate byte strings. In the latter case, a proper
fix might be more work.

Best regards,
Celelibi

-- System Information:
Debian Release: trixie/sid
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable'), (500, 'stable')
Architecture: amd64 (x86_64)

Kernel: Linux 6.11.7-amd64 (SMP w/2 CPU threads; PREEMPT)
Locale: LANG=fr_FR.UTF-8, LC_CTYPE=fr_FR.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)

Versions of packages diffutils depends on:
ii  libc6  2.40-5

diffutils recommends no packages.

Versions of packages diffutils suggests:
pn  diffutils-doc  <none>
ii  wdiff          1.2.2-7

-- no debconf information
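
PS: To illustrate the truncation issue described above, here is a
minimal sketch of a byte-limited cut that does not split a character.
It is not taken from diffutils: the name utf8_truncate_len is
hypothetical, and the sketch assumes the input is valid UTF-8, which
diff cannot assume in general. It only shows the idea of backing up
over continuation bytes (bit pattern 10xxxxxx).

```c
#include <stddef.h>

/* Hypothetical helper, not part of diffutils.
   Return the largest length <= max at which S (LEN bytes, assumed to be
   valid UTF-8) can be cut without splitting a multi-byte character.  */
static size_t
utf8_truncate_len (const char *s, size_t len, size_t max)
{
  /* Nothing to truncate.  */
  if (len <= max)
    return len;

  size_t n = max;

  /* s[n] is the first byte that would be dropped.  If it is a
     continuation byte (10xxxxxx), the cut falls inside a character,
     so back up until s[n] is that character's lead byte.  */
  while (n > 0 && ((unsigned char) s[n] & 0xC0) == 0x80)
    n--;

  return n;
}
```

With the first line of file1.c above and a limit of 40 bytes (the limit
the output above suggests), this returns 39, so the printed context
would end in "caf" instead of an incomplete byte sequence.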