Thanks, but that patch seems to depend on libpcre internals, in that it "knows" that pcre_exec cannot possibly succeed without first checking its entire input buffer for invalid UTF-8 bytes. Even if that's true now, it reflects a performance bug that might be fixed in a future libpcre version.

Also, I don't see why grep needs to copy the buffer when there's an encoding error. Why not simply rerun the matcher on the initial prefix that doesn't have an encoding-error byte and then, if that doesn't find a match, try matching the suffix after the encoding-error byte? This approach would not only avoid the buffer copy, it would also avoid relying on knowledge of libpcre internals.
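
To make the idea concrete, here is a rough sketch of the prefix-then-suffix loop. It is not grep's code and does not call libpcre: toy_match, its result codes, and search_skipping_errors are hypothetical names invented for illustration. toy_match stands in for the real matcher; it does a literal search, checks the whole buffer for bad bytes before matching, and for brevity treats only 0xFE and 0xFF (which never occur in UTF-8) as encoding errors.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for the stand-in matcher below. */
enum { MATCH_NONE = -1, MATCH_BAD_ENCODING = -2 };

/* Stand-in for a real matcher (NOT libpcre's API).  Searches buf[0..len)
   for the literal `needle`.  Like the behavior described above, it checks
   the whole buffer for encoding errors before matching.  Assumption: only
   bytes 0xFE and 0xFF are treated as errors, to keep the sketch short.  */
static long
toy_match (const char *buf, size_t len, const char *needle, size_t *err_off)
{
  size_t nlen = strlen (needle);

  for (size_t i = 0; i < len; i++)
    if ((unsigned char) buf[i] >= 0xFE)
      {
        *err_off = i;
        return MATCH_BAD_ENCODING;
      }

  for (size_t i = 0; nlen <= len && i <= len - nlen; i++)
    if (memcmp (buf + i, needle, nlen) == 0)
      return (long) i;

  return MATCH_NONE;
}

/* Search buf[0..len) in place, with no copy.  On an encoding error,
   rerun the matcher on the clean prefix before the bad byte; if that
   finds nothing, continue with the suffix after the bad byte.
   Returns the match offset within buf, or MATCH_NONE.  */
static long
search_skipping_errors (const char *buf, size_t len, const char *needle)
{
  size_t start = 0;
  assert (buf != NULL && needle != NULL);

  for (;;)
    {
      size_t err_off = 0, unused = 0;
      long r = toy_match (buf + start, len - start, needle, &err_off);
      if (r >= 0)
        return (long) start + r;
      if (r == MATCH_NONE)
        return MATCH_NONE;

      /* Encoding error at start + err_off: try the prefix before it.  */
      r = toy_match (buf + start, err_off, needle, &unused);
      if (r >= 0)
        return (long) start + r;

      /* No match in the prefix: move past the bad byte and go on.  */
      start += err_off + 1;
    }
}
```

With the five bytes 'a', 'b', 0xFF, 'c', 'd' as input, searching for "cd" would return offset 3 and searching for "ab" would return offset 0. The loop only needs the matcher to report where the bad byte is; it works the same whether the matcher validates the buffer up front (as the stand-in does) or stops at the first match before it ever reaches the bad byte.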

