On 2014-08-16 16:01:27 +0200, Santiago wrote:
> Workaround attached. It's too slow against binary files, but I haven't
> found a simpler solution.

To avoid the slowness, I think that it would be better to detect
invalid UTF-8 sequences directly (not via PCRE) and replace them
with null bytes *in-place*.

This might slow down the general case, though I'm not sure: if the
UTF-8 validity check (via the replacement of invalid sequences) is
done in grep, it no longer needs to be done in PCRE.
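
A minimal sketch of what I mean, not taken from grep's source. The
function names scrub_invalid_utf8 and utf8_seq_len are hypothetical,
and I assume the whole buffer is in memory, so a sequence truncated at
the end of the buffer is treated as invalid (a real implementation
reading in chunks would have to carry such a tail over to the next
chunk):

```c
#include <stddef.h>

/* Hypothetical helper: return the length (1..4) of the valid UTF-8
   sequence starting at p, or 0 if no valid sequence starts there.
   avail is the number of bytes readable from p and must be >= 1. */
static size_t utf8_seq_len(const unsigned char *p, size_t avail)
{
    unsigned char c = p[0];
    unsigned char lo = 0x80, hi = 0xBF;  /* allowed range of the 2nd byte */
    size_t n, i;

    if (c < 0x80)
        return 1;                        /* ASCII */
    else if (c >= 0xC2 && c <= 0xDF)
        n = 2;                           /* 0xC0 and 0xC1 would be overlong */
    else if (c >= 0xE0 && c <= 0xEF) {
        n = 3;
        if (c == 0xE0) lo = 0xA0;        /* reject overlong forms */
        if (c == 0xED) hi = 0x9F;        /* reject UTF-16 surrogates */
    } else if (c >= 0xF0 && c <= 0xF4) {
        n = 4;
        if (c == 0xF0) lo = 0x90;        /* reject overlong forms */
        if (c == 0xF4) hi = 0x8F;        /* reject values above U+10FFFF */
    } else
        return 0;                        /* stray continuation or bad lead */

    if (avail < n)
        return 0;                        /* truncated sequence */
    if (p[1] < lo || p[1] > hi)
        return 0;
    for (i = 2; i < n; i++)
        if ((p[i] & 0xC0) != 0x80)
            return 0;                    /* not a continuation byte */
    return n;
}

/* Hypothetical entry point: overwrite every byte that is not part of a
   valid UTF-8 sequence with a null byte, in place.  Returns the number
   of bytes that were overwritten. */
size_t scrub_invalid_utf8(unsigned char *buf, size_t len)
{
    size_t i = 0, replaced = 0;

    while (i < len) {
        size_t n = utf8_seq_len(buf + i, len - i);
        if (n == 0) {
            buf[i++] = '\0';             /* zero one byte, then resync */
            replaced++;
        } else
            i += n;                      /* skip the valid sequence */
    }
    return replaced;
}
```

This is a single linear pass with no allocation, and the buffer length
is unchanged, so offsets stay the same. Once the buffer is known to be
valid, PCRE's own check could presumably be skipped with the
PCRE_NO_UTF8_CHECK option of pcre_exec(), but I haven't measured
whether that makes up for the extra pass.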

-- 
Vincent Lefèvre <vinc...@vinc17.net> - Web: <https://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <https://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)

