Patch updated. Paul, thanks for the previous comments. As you
suggested, the attached patch doesn't copy the buffer and splits the
input when it finds an invalid character.
For the moment, I don't see a cleaner way to avoid depending on the pcre
internals.
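
To illustrate the splitting idea, here is a minimal, self-contained sketch.
It is not the attached patch: the names valid_utf8_prefix,
for_each_valid_chunk, chunk_stats and record_chunk are hypothetical, and the
callback only stands in for the place where a matcher would be invoked on
each chunk. It assumes that skipping one offending byte at a time is an
acceptable way to resynchronize.

```c
#include <stddef.h>

/* Hypothetical helper: return the length of the longest prefix of BUF
   (SIZE bytes) that is well-formed UTF-8.  Overlong forms, surrogates,
   code points above U+10FFFF and truncated sequences end the prefix.  */
static size_t
valid_utf8_prefix (const unsigned char *buf, size_t size)
{
  size_t i = 0;
  while (i < size)
    {
      unsigned char c = buf[i];
      unsigned long cp, min;
      size_t len, k;

      if (c < 0x80)
        {
          i++;                  /* plain ASCII byte */
          continue;
        }
      else if (0xC2 <= c && c <= 0xDF)
        len = 2, cp = c & 0x1F, min = 0x80;
      else if ((c & 0xF0) == 0xE0)
        len = 3, cp = c & 0x0F, min = 0x800;
      else if (0xF0 <= c && c <= 0xF4)
        len = 4, cp = c & 0x07, min = 0x10000;
      else
        break;                  /* stray continuation or invalid lead byte */

      if (size - i < len)
        break;                  /* sequence cut off by the end of the buffer */

      for (k = 1; k < len; k++)
        {
          if ((buf[i + k] & 0xC0) != 0x80)
            break;              /* not a continuation byte */
          cp = (cp << 6) | (buf[i + k] & 0x3F);
        }
      if (k < len || cp < min || cp > 0x10FFFF
          || (0xD800 <= cp && cp <= 0xDFFF))
        break;

      i += len;
    }
  return i;
}

/* Call FN on each maximal valid chunk of BUF, skipping one offending
   byte between chunks.  Stop early if FN returns nonzero.  */
static int
for_each_valid_chunk (const unsigned char *buf, size_t size,
                      int (*fn) (const unsigned char *, size_t, void *),
                      void *ctx)
{
  size_t pos = 0;
  while (pos < size)
    {
      size_t n = valid_utf8_prefix (buf + pos, size - pos);
      if (n > 0)
        {
          int r = fn (buf + pos, n, ctx);
          if (r != 0)
            return r;
        }
      pos += n;
      if (pos < size)
        pos++;                  /* step over the invalid byte */
    }
  return 0;
}

/* Example callback: only records how many chunks and bytes were seen.
   A real caller would run the matcher on CHUNK here instead.  */
struct chunk_stats { size_t chunks; size_t bytes; };

static int
record_chunk (const unsigned char *chunk, size_t len, void *ctx)
{
  struct chunk_stats *st = ctx;
  (void) chunk;
  st->chunks++;
  st->bytes += len;
  return 0;
}
```

One consequence of this approach is that a match can never span an invalid
byte, since each chunk is handed to the callback separately.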
Regards,
Santiago
From d58b53f86bb3f4b97137f708c159
Vincent Lefevre wrote:
[...] Note that this option can also be passed to pcre_exec()
and pcre_dfa_exec(), to suppress the validity checking of
subject strings only. If the same string is being matched
many times, the option can be safely set for the second and
On 2014-08-29 06:43:45 -0700, Paul Eggert wrote:
> Thanks, but that patch seems to depend on libpcre internals, in that it
> "knows" that pcre_exec cannot possibly succeed without first checking its
> entire input buffer for invalid UTF-8 bytes. Even if that's true now, it
> reflects a performance
Thanks, but that patch seems to depend on libpcre internals, in that it
"knows" that pcre_exec cannot possibly succeed without first checking
its entire input buffer for invalid UTF-8 bytes. Even if that's true
now, it reflects a performance bug that might be fixed in a future
libpcre version.
On 16/08/14 at 11:36, Paul Eggert wrote:
> Santiago wrote:
> >Another solution would be not to check whether binary files are valid
> >(passing PCRE_NO_UTF8_CHECK to pcre_exec), but I don't know if that'd
> >avoid security holes
>
> It wouldn't. (We already tried it.)
>
Another try. This pat
Hi,
Please revert ca7868cc27db3d9deafaa2e0ac5a2bb0aa8ef373.
That commit (re)introduced a regression (see http://debbugs.gnu.org/15758).
pcresearch once again checks whether the input is valid UTF-8. The problem
is that binary files are not valid UTF-8, so grep -P, in Unicode locales,
exits with an error:
LANG=