On Tue, Apr 1, 2014 at 12:50 PM, Simon Sapin <[email protected]> wrote:
> On 01/04/2014 03:01, Keegan McAllister wrote:
>>
>> It does seem like replacing truly lone surrogates with U+FFFD would
>> be an acceptable deviation from the spec, but maybe we want to avoid
>> those absolutely.
>
> As much as I’d like this to be true, I don’t know. Henri seemed pretty
> opposed to changing the spec that way, a few years ago:
>
> https://www.w3.org/Bugs/Public/show_bug.cgi?id=11298#c2
>
> Henri, any comment?

My position at the time was based on the assumption that all real
browser implementations represent DOM strings as 16-bit units. I saw
no value in adding complexity for the sake of theoretical Unicode
purity when it was really 16-bit units in every real browser
implementation.

If Servo is seriously going to 8-bit code units, then I think I'd be
OK with lone surrogates converting to the REPLACEMENT CHARACTER on the
WebIDL layer (conceptually; as an optimization, you'd probably want to
do it in the parser in the case of document.write() or innerHTML).

-- 
Henri Sivonen
[email protected]
https://hsivonen.fi/
_______________________________________________
dev-servo mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-servo
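
For illustration only, here is a minimal Rust sketch of the conversion
discussed above: 16-bit code units come in, a UTF-8 string goes out, and
any lone surrogate becomes U+FFFD REPLACEMENT CHARACTER. The function
name dom_string_to_utf8 is hypothetical and is not taken from Servo or
WebIDL; the sketch only assumes the standard library's
std::char::decode_utf16.

```rust
use std::char::{decode_utf16, REPLACEMENT_CHARACTER};

/// Hypothetical helper (not a Servo or WebIDL API): converts a sequence
/// of 16-bit code units into a UTF-8 `String`. Valid surrogate pairs are
/// combined into one code point; each lone surrogate is replaced with
/// U+FFFD.
fn dom_string_to_utf8(units: &[u16]) -> String {
    decode_utf16(units.iter().copied())
        // decode_utf16 yields Err(..) for every unpaired surrogate.
        .map(|result| result.unwrap_or(REPLACEMENT_CHARACTER))
        .collect()
}
```

Under these assumptions, the input [0x0061, 0xD800, 0x0062] (a lone high
surrogate between "a" and "b") would come out as "a\u{FFFD}b", while a
well-formed pair such as [0xD83D, 0xDE00] would be kept as the single
code point U+1F600. The standard library's String::from_utf16_lossy
performs the same replacement in one call.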

