Hi Paul,

> (add_utf8_anychar): Match only valid UTF-8 byte sequences
> instead of allowing overlong encodings or surrogate halves.

Do I understand correctly that, as a consequence of this change, 'grep'
with a regex of '^.*$' will no longer match lines which contain an
invalid UTF-8 byte sequence?

If so:
  - Is this effect on 'grep' intended? (And the workaround is to use
    the "C" locale.)
  - Is it consistent with the behaviour of regex and kwset, which 'grep'
    also uses, depending on the arguments (as far as I understand)?

Bruno
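
P.S. To make the question concrete, here is a rough Python sketch of how I read the ChangeLog entry. It is my own illustration and not the dfa.c code: the names `VALID_UTF8_CHAR` and `line_matches` are made up, the byte ranges follow the usual well-formed UTF-8 table (no overlong forms, no surrogates, nothing above U+10FFFF), and I am assuming that '.' excludes only the newline byte.

```python
import re

# One well-formed UTF-8 character, written as a bytes regex.
# Hypothetical stand-in for what "any char" might accept after the change.
VALID_UTF8_CHAR = (
    rb"(?:[\x00-\x09\x0B-\x7F]"              # ASCII, except newline
    rb"|[\xC2-\xDF][\x80-\xBF]"              # 2 bytes; C0/C1 would be overlong
    rb"|\xE0[\xA0-\xBF][\x80-\xBF]"          # 3 bytes; E0 80..9F would be overlong
    rb"|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}"   # 3 bytes, general case
    rb"|\xED[\x80-\x9F][\x80-\xBF]"          # 3 bytes; ED A0..BF are surrogates
    rb"|\xF0[\x90-\xBF][\x80-\xBF]{2}"       # 4 bytes; F0 80..8F would be overlong
    rb"|[\xF1-\xF3][\x80-\xBF]{3}"           # 4 bytes, general case
    rb"|\xF4[\x80-\x8F][\x80-\xBF]{2})"      # 4 bytes; up to U+10FFFF only
)


def line_matches(line: bytes) -> bool:
    """Emulate '^.*$' where '.' accepts only well-formed UTF-8 characters."""
    return re.fullmatch(VALID_UTF8_CHAR + rb"*", line) is not None


if __name__ == "__main__":
    samples = [
        b"hello",               # plain ASCII
        "h\u00e9llo".encode(),  # valid 2-byte sequence
        b"\xC0\xAF",            # overlong encoding of '/'
        b"\xED\xA0\x80",        # surrogate half U+D800
        b"abc\xFFdef",          # stray invalid byte inside a line
    ]
    for s in samples:
        print(s, line_matches(s))
```

Under this reading, a line such as b"abc\xFFdef" is not matched at all, which is the behaviour I am asking about.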