On Oct 5, 2014, at 2:05 PM, Ms2ger <ms2...@gmail.com> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 10/05/2014 08:27 PM, Cameron Zwarich wrote:
>> If JS can’t handle WTF-8 natively, then what’s the benefit of
>> using it? I am opposed to anything that requires string copies
>> between the DOM and JS, unless there’s some really great overriding
>> reason.
>>
>
> The benefit is correctness while not paying the memory cost of 16-bit
> strings.

The solution that other browser engines have taken here (including
Gecko recently, IIRC) is to have a dual representation of Latin-1 and
16-bit strings, in both the JS engine and the rest of the browser. As
far as I can tell from the success in WebKit and Blink, this solution
works pretty well. It would disadvantage languages that are mostly
7-bit ASCII with occasional non-Latin-1 embellishments, but I’m not
sure how much worse it does than UTF-8 in practice in these
situations. Has anyone collected statistical distributions of code
points across languages?

> Another benefit is not having to copy strings coming from the parser,
> or going into any Rust library that's not entirely Servo-specific.

By no extra copies from the parser do you mean that DOM strings will
point directly into memory owned by the HTTP resource? What will keep
the HTTP resource alive as the DOM object that owns the string
migrates across new windows, etc.?

We will still need copies going into any Rust library that’s not
Servo-specific, since Rust strings are required to be valid UTF-8 or
else memory unsafety is introduced.

> Not requiring string copies between JS and the DOM would be nice, but
> sadly not feasible, regardless of encoding, due to SpiderMonkey's
> moving GC. Gecko

I think you may have left something off at the end of this paragraph?

Servo’s approach to dealing with the impending SM moving GC already
has excessive overhead, due to not using raw pointers as temporaries
like Gecko. We’re obviously going to have to revisit this interaction
regardless of our string choices.

Cameron

_______________________________________________
dev-servo mailing list
dev-servo@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-servo
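
For readers following the thread, here is a rough Rust sketch of the dual Latin-1 / 16-bit representation mentioned above, and of why handing such a string to a generic Rust library needs a copy. The names `DualString`, `from_code_units`, `payload_bytes`, `to_rust_string`, `LONE_SURROGATE_WTF8`, and `is_valid_utf8` are hypothetical and invented for illustration; this is not how Servo, SpiderMonkey, WebKit, or Blink actually lay out their strings. Only `String::from_utf16` and `std::str::from_utf8` are real standard-library calls.

```rust
// Hypothetical sketch: `DualString` and its methods are illustrative names,
// not types taken from Servo, SpiderMonkey, WebKit, or Blink.

/// A string stored either as one byte per character (Latin-1) or as
/// 16-bit code units, chosen once at construction time.
#[derive(Debug, PartialEq)]
pub enum DualString {
    /// Every code unit fits in 0x00..=0xFF.
    Latin1(Vec<u8>),
    /// At least one code unit is above 0xFF; may contain lone surrogates.
    TwoByte(Vec<u16>),
}

impl DualString {
    /// Build from 16-bit code units, narrowing to Latin-1 when possible.
    pub fn from_code_units(units: &[u16]) -> DualString {
        if units.iter().all(|&u| u <= 0xFF) {
            DualString::Latin1(units.iter().map(|&u| u as u8).collect())
        } else {
            DualString::TwoByte(units.to_vec())
        }
    }

    /// Bytes used by the character payload (ignoring `Vec` bookkeeping).
    pub fn payload_bytes(&self) -> usize {
        match self {
            DualString::Latin1(b) => b.len(),
            DualString::TwoByte(u) => u.len() * 2,
        }
    }

    /// Convert to a Rust `String`. This always copies, and it fails when
    /// the contents include a lone surrogate, which UTF-8 cannot encode.
    pub fn to_rust_string(&self) -> Option<String> {
        match self {
            // Latin-1 bytes map one-to-one onto U+0000..=U+00FF.
            DualString::Latin1(b) => Some(b.iter().map(|&x| x as char).collect()),
            DualString::TwoByte(u) => String::from_utf16(u).ok(),
        }
    }
}

/// Generalized UTF-8 (WTF-8 style) byte sequence for the lone surrogate U+D800.
pub const LONE_SURROGATE_WTF8: [u8; 3] = [0xED, 0xA0, 0x80];

/// True if `bytes` would be accepted as the contents of a Rust `str`.
pub fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}
```

Under these assumptions, a pure-ASCII or Latin-1 string costs one byte per character, while a mostly-ASCII string with a single non-Latin-1 character (such as U+2019) is widened to two bytes per code unit throughout, which is the disadvantaged case described above. A string holding a lone surrogate cannot become a Rust `String` at all, and the WTF-8 bytes for that surrogate are rejected by `std::str::from_utf8`.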