On Wed, Dec 18, 2013 at 8:53 AM, Santiago <santi...@debian.org> wrote:
...
> $ src/grep -Pr "DEFINE" /usr/lib/linux-kbuild-3.2/
> src/grep: invalid UTF-8 byte sequence in input
>
> When I'd expected something like:
>
> $ LC_ALL=C src/grep -Pr "DEFINE" /usr/lib/linux-kbuild-3.2/
> /usr/lib/linux-kbuild-3.2/scripts/kernel-doc: if ($prototype =~
> m/DEFINE_SINGLE_EVENT\((.*?),/) {
> /usr/lib/linux-kbuild-3.2/scripts/kernel-doc: if ($prototype =~
> m/DEFINE_EVENT\((.*?),(.*?),/) {
> /usr/lib/linux-kbuild-3.2/scripts/kernel-doc:## if ($prototype =~
> m/SYSCALL_DEFINE0\s*\(\s*(a-zA-Z0-9_)*\s*\)/) {
> /usr/lib/linux-kbuild-3.2/scripts/kernel-doc: if ($prototype =~
> m/SYSCALL_DEFINE0/) {
> ...
>
> Maybe, it is a pcre (v. 8.31) issue.

Hi Santiago,

Thanks for testing that.
What do you get when you run the stand-alone example
I gave in the commit log and in the test?

  printf 'j\x82\nj\n'|LC_ALL=en_US.UTF-8 grep -P j|cat -A; echo $?

For me (using pcre-8.33), it works the way I want and both lines match:

  jM-^B$
  j$
  0

Hmm... I see that with debian unstable's 8.31-2, it does indeed act
differently. I may have to think about excluding pcre support
when the version doesn't work the way I want.
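
For anyone who wants to run the same check against more than one grep
binary, here is a rough sketch of the stand-alone example wrapped in a
shell function. The function name and the three result words are made up
for illustration and are not part of grep or its test suite. It uses -c to
count matching lines instead of piping through cat -A, and writes the
invalid byte as octal \202 (the same byte as \x82) so it also works with
a plain POSIX printf. It assumes the en_US.UTF-8 locale is installed.

```shell
#!/bin/sh
# Hypothetical helper (not part of grep): report how a given grep binary
# treats a line containing an invalid UTF-8 byte when run with -P.
probe_pcre_invalid_utf8() {
    grep_cmd=${1:-grep}
    # \202 is octal for 0x82, a lone continuation byte, so the first
    # input line is not valid UTF-8. The second line is plain ASCII.
    count=$(printf 'j\202\nj\n' |
        LC_ALL=en_US.UTF-8 "$grep_cmd" -P -c j 2>/dev/null)
    status=$?
    if [ "$status" -eq 0 ] && [ "$count" = "2" ]; then
        echo ok        # both lines matched
    elif [ "$status" -ge 2 ]; then
        echo error     # grep failed (no -P support, or it rejected the input)
    else
        echo partial   # grep ran but did not match both lines
    fi
}

probe_pcre_invalid_utf8 "${1:-grep}"
```

Saved as a script, it can be pointed at a freshly built binary by passing
src/grep as the first argument. Note that "error" cannot tell a grep built
without PCRE apart from one that rejects the invalid byte, so the stderr
output is still worth looking at by hand.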