On Tue, Nov 26, 2013 at 6:30 AM, Santiago <santi...@debian.org> wrote:
> This bug was also reported in Debian ( http://bugs.debian.org/730472 ).
>
> Taking a look on it, I think the most suitable solution for the moment
> is to flag PCRE_NO_UTF8_CHECK instead of PCRE_UTF8, so
> PCRE does not check if inputs are UTF8 valid. Resulting behavior is
> similar to pre-grep-2.15. (See 15758-PCRE-no-check-UTF8.patch)

Thanks for the suggested patches and report.
Your first patch is almost right. The problem is that we cannot
remove the PCRE_UTF8 flag. If we did that, it would disable UTF-8,
reverting an older fix. See tests/pcre-utf8 for examples, or run this:

    printf '\342\202\254\n' | LC_ALL=en_US.UTF-8 src/grep -P '^\p{S}'

I've added a commit log, improved a related test and attached a
slightly different patch, but left you as the "Author".
I'll wait for an explicit ACK before pushing it.

With that, there is no need to handle PCRE_ERROR_BADUTF8
because that should not happen.
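
For anyone wondering why that printf test depends on UTF-8 mode, here is a rough Python sketch of what the three bytes look like under each interpretation. It is an illustration only and does not use PCRE or grep. The latin-1 decoding is my stand-in for "one byte is one character", and is_valid_utf8 is a hypothetical helper that only approximates the kind of validity check PCRE_NO_UTF8_CHECK skips.

```python
import unicodedata

# The bytes written by printf '\342\202\254\n' (octal escapes), without the newline.
raw = bytes([0o342, 0o202, 0o254])

# Interpreted as UTF-8, the three bytes form a single character.
text = raw.decode("utf-8")
codepoint = f"U+{ord(text):04X}"
category = unicodedata.category(text)  # general category; names starting with "S" are symbols

# Interpreted one byte at a time, the same input is three separate characters.
# Assumption: latin-1 is used here only to model "one byte = one character".
byte_chars = raw.decode("latin-1")
byte_categories = [unicodedata.category(c) for c in byte_chars]


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(codepoint, category)   # the single UTF-8 character and its category
print(byte_categories)       # categories when each byte stands alone
print(is_valid_utf8(raw), is_valid_utf8(b"\xe2\x82"))  # complete vs. truncated sequence
```

Decoded as UTF-8 the input is one symbol character at the start of the line, which is what '^\p{S}' needs. Taken byte by byte, the first unit is not a symbol, so the same pattern would have nothing to match at the line start.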