On Thu, 19 Sep 2024, Brian Inglis via Cygwin wrote:

> On 2024-09-19 07:27, Christian Franke via Cygwin wrote:
> >
> >
> > Yes, but Cygwin does not provide consistent forward/reverse UTF-8 <-> UTF-16
> > mappings.
>
> Surrogate halves are invalid in UTF-8 encoding; a UTF-16 surrogate pair
> should first be combined into a valid code point.
> The encoder should just fail if it encounters any invalid sequence!
> Handling surrogates or other invalid values as anything other than invalid
> turns the encoding into what has been called WTF-8 where W may be for
> Windows! ;^>

This may be necessary, though, in order to round-trip every name that is
valid in NTFS, which stores file names as sequences of 16-bit units and
does not reject unpaired surrogates.  In my opinion, rm -rf not failing in
the face of potentially maliciously named files/directories is more
important than strictly adhering to a standard that says 'fail if you see
these values'.

https://cygwin.com/pipermail/cygwin/2024-June/256111.html
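
To make the round-trip point concrete, here is a minimal sketch in Python.
It is not how Cygwin converts file names; it only uses Python's own
"surrogatepass" codec error handler to show a WTF-8-style mapping that
keeps a lone surrogate intact in both directions, where a strict UTF-8
encoder would fail.  The helper names are hypothetical.

```python
# Illustrative only: Python codec behaviour, not Cygwin's conversion code.
# Helper names are made up for this example.

def utf16_units_to_bytes(units):
    """Map 16-bit file-name units to bytes, passing lone surrogates through."""
    raw = b"".join(u.to_bytes(2, "little") for u in units)
    # Valid surrogate pairs are combined; unpaired halves are kept as-is.
    text = raw.decode("utf-16-le", "surrogatepass")
    # Strict UTF-8 would raise here; "surrogatepass" emits 3-byte sequences.
    return text.encode("utf-8", "surrogatepass")

def bytes_to_utf16_units(data):
    """Reverse mapping: bytes back to the original 16-bit units."""
    text = data.decode("utf-8", "surrogatepass")
    raw = text.encode("utf-16-le", "surrogatepass")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]

def strict_utf8_fails(units):
    """Return True if a strict UTF-8 encoder rejects these units."""
    raw = b"".join(u.to_bytes(2, "little") for u in units)
    text = raw.decode("utf-16-le", "surrogatepass")
    try:
        text.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False

if __name__ == "__main__":
    name = [0x61, 0xD800, 0x62]          # 'a', unpaired high surrogate, 'b'
    encoded = utf16_units_to_bytes(name)
    print(encoded, bytes_to_utf16_units(encoded) == name, strict_utf8_fails(name))
```

With a strict encoder the name above cannot be represented at all, so a
tool walking the directory has nothing to pass back to the file system;
with the lenient mapping the same 16-bit units come back out.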
