Package: perl
Version: 5.34.0-3
Severity: normal
Sometimes, when doing s///gre on a tainted UTF-8 string, perl warns about
malformed UTF-8 characters or outright panics.
This warns:
$ perl -Twe '$_ = $^X =~ s/./"\x{10469}"/gre'
Malformed UTF-8 character (unexpected end of string) in substitution iterator
at -e line 1.
These die:
$ perl -Twe '$_ = $^X =~ s/.*/"\x{10469}"/gre'
panic: sv_pos_b2u: bad byte offset, blen=4, byte=13 at -e line 1.
$ perl -Twe '$_ = "\x{105}$^X" =~ s/./""/gre'
panic: sv_pos_b2u: bad byte offset, blen=0, byte=2 at -e line 1.
Notably, all these warnings and panics go away when taint mode is not
used, when no tainted strings are involved, or when no UTF-8
characters are involved. No problems seem to appear when not using "/r"
either, so for now I'm just using a temporary variable as a workaround.
I think it might be an upstream bug. It seems superficially similar to
RT #122148 from 2014, but I know nothing about perl internals, so I
can't say more than that.
-- System Information:
Debian Release: bookworm/sid
APT prefers testing
APT policy: (900, 'testing'), (700, 'unstable')
Architecture: amd64 (x86_64)
Kernel: Linux 5.15.0-3-amd64 (SMP w/4 CPU threads)
Kernel taint flags: TAINT_WARN, TAINT_OOT_MODULE, TAINT_UNSIGNED_MODULE
Locale: LANG=C.UTF-8, LC_CTYPE=C.UTF-8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /bin/dash
Init: sysvinit (via /sbin/init)
Versions of packages perl depends on:
ii dpkg 1.21.1
ii libperl5.34 5.34.0-3
ii perl-base 5.34.0-3
ii perl-modules-5.34 5.34.0-3
Versions of packages perl recommends:
ii netbase 6.3
Versions of packages perl suggests:
pn libtap-harness-archive-perl <none>
ii libterm-readline-gnu-perl 1.42-2+b1
ii make 4.3-4.1
ii perl-doc 5.34.0-3
-- no debconf information