On 10/5/14, 7:51 PM, Cameron Zwarich wrote:
> Are there any plans to eliminate the copies in Gecko?
No. Measurement showed that in practice the cost of copying short
strings, which most of these are, is very low. For large strings you do
end up having to copy, but keep in mind that Gecko used to avoid the
copy only if the callee did not hold on to the string. So as long as
the passed-in thing was only used for comparisons to other strings, you
could avoid a copy.
There are very few real-life cases in which a long string is passed in
and then not stored.
Now it _is_ important to make the copy into a stack buffer for the short
string case, because if you have to malloc, you lose.
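To illustrate the stack-buffer point, here is a minimal sketch of copying into an inline buffer and falling back to the heap only for long strings. The class name `StackCopiedString` and the capacity are assumptions made up for illustration; this is not Gecko's actual string class.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical helper (not a Gecko class): copies the incoming characters
// into an inline buffer when they fit, and allocates only when they don't.
template <std::size_t InlineCapacity = 64>
class StackCopiedString {
public:
    StackCopiedString(const char16_t* src, std::size_t length)
        : length_(length) {
        char16_t* dest = inline_;
        if (length > InlineCapacity) {
            // Slow path: long strings pay for a heap allocation.
            heap_ = std::make_unique<char16_t[]>(length);
            dest = heap_.get();
        }
        if (length > 0) {
            std::memcpy(dest, src, length * sizeof(char16_t));
        }
    }

    // The object is meant to live on the caller's stack for one call.
    StackCopiedString(const StackCopiedString&) = delete;
    StackCopiedString& operator=(const StackCopiedString&) = delete;

    const char16_t* data() const { return heap_ ? heap_.get() : inline_; }
    std::size_t length() const { return length_; }
    bool usedHeap() const { return heap_ != nullptr; }

private:
    char16_t inline_[InlineCapacity];       // storage inside the object itself
    std::unique_ptr<char16_t[]> heap_;      // set only for long strings
    std::size_t length_;
};
```

With a capacity of 8, a 5-character string stays in the inline buffer while a 20-character one goes through the allocator, which is the case the short-string fast path is meant to avoid.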
> If I understand things correctly from reading Blink mailing list posts,
> pre-Oilpan Blink shares string buffers between V8 and Blink code, despite V8
> having a precise moving GC.
Sure. With enough cooperation from the JS engine you can do more here.
We're just not getting that cooperation from SpiderMonkey in this case
because they have their own performance priorities too. And given that
there was no measured loss in performance on the fast-path cases in
Gecko, I could hardly blame them.
In particular, the really serious issues arise when the string doesn't
have a dedicated string buffer but is stored inline in the movable
GCThing. This only happens for short strings, but those are again the
common case.
We could do something complicated where we copy if the storage is inline
storage and not copy otherwise, but the added complexity wouldn't have
been useful in any of the "pass strings from JS to the DOM" things we've
been tracking for Gecko.
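For concreteness, the "copy only if the storage is inline" scheme could look roughly like the sketch below. `ToyJSString` and `DOMStringArg` are toy types invented here; they do not reflect SpiderMonkey's real string layout, and a real version would also have to keep the out-of-line buffer alive for the duration of the call.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy model of an engine string, NOT SpiderMonkey's real layout.
struct ToyJSString {
    static constexpr std::size_t kInlineMax = 8;
    bool isInline;
    std::size_t length;
    char16_t inlineChars[kInlineMax];  // lives inside the movable GC thing
    const char16_t* outOfLineChars;    // dedicated buffer, stable across a moving GC
};

// Hypothetical argument holder on the DOM side.
struct DOMStringArg {
    std::u16string ownedCopy;          // filled only when a copy was needed
    const char16_t* chars = nullptr;
    std::size_t length = 0;
    bool copied = false;
};

// Fills `out` in place so that `out.chars` never points into a temporary.
inline void toDOMStringArg(const ToyJSString& s, DOMStringArg& out) {
    out.length = s.length;
    if (s.isInline) {
        // The characters would move with the GC thing, so take a copy.
        out.ownedCopy.assign(s.inlineChars, s.length);
        out.chars = out.ownedCopy.data();
        out.copied = true;
    } else {
        // The dedicated buffer stays put, so borrow the pointer.
        out.ownedCopy.clear();
        out.chars = s.outOfLineChars;
        out.copied = false;
    }
}
```

Even in this toy form the callee ends up with two ownership modes to reason about, which is the added complexity in question.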
Now it's clearly possible to construct cases in which the fact that a
copy happens at all is noticeable. But those cases already ended up
copying, because they store the string.
> It looks like they did this by having GC integration for external Blink strings
> that would trigger a deref of the Blink string buffer when a V8 handle went
> away.
SpiderMonkey has this, and Gecko uses it for returning strings from C++
to JS.
But this doesn't affect the "pass a string from JS to C++" case at all.
> V8 strings that come from Blink strings are in this wrapped representation from
> the point of their creation.
Right. Though in the case of SpiderMonkey we don't detect this
representation because it's a fairly rare case in practice; the vast
majority of strings being passed to DOM methods didn't originate in the
DOM last I measured.
> V8-created strings are converted to the external Blink representation on demand.
This causes them to be copied, yes?
It would be quite feasible to implement the setup you describe on top of
SpiderMonkey, as long as you had a fast way to tell when you're dealing
with one of your "external" strings and a way to mutate into that form
in-place; the latter doesn't seem hard to add to me, but for the former
you'd need to check with the SpiderMonkey folks.
Would it be worth it? Some data on how often strings passed to C++ are
either (a) strings that were previously passed to C++ or (b) strings
that came from C++ to start with would be useful here. At least on
real-life workloads; I'm sure in what passes for "DOM
benchmarks" it's 100% of the strings, since they just do the same thing
in a loop over and over.
> How do you avoid the copy on return values from C++ to JS in Gecko?
JS_NewExternalString, passing a pointer to something that's refcounted
and then having the finalizer deref.
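A rough model of that pattern, for readers who haven't seen it: the engine-side string borrows the characters of a refcounted buffer, takes a reference when it is created, and drops it from its finalizer. Everything below (`RefCountedBuffer`, `ToyExternalString`, `newExternalString`, `collect`) is a made-up stand-in and does not reproduce the real JS_NewExternalString signature or Gecko's buffer type.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical refcounted buffer; a stand-in, not Gecko's real type.
struct RefCountedBuffer {
    std::u16string chars;
    int refCount = 1;                  // the creator holds the first reference
    static int liveBuffers;            // bookkeeping for this demo only

    explicit RefCountedBuffer(std::u16string c) : chars(std::move(c)) { ++liveBuffers; }
    ~RefCountedBuffer() { --liveBuffers; }

    void addRef() { ++refCount; }
    void release() { if (--refCount == 0) delete this; }
};
int RefCountedBuffer::liveBuffers = 0;

// Toy stand-in for an engine-side external string: it points at characters
// it does not own and carries a finalizer to call when it is collected.
struct ToyExternalString {
    const char16_t* chars;
    std::size_t length;
    void (*finalize)(void* owner);
    void* owner;
};

// Finalizer: drop the reference taken in newExternalString.
void releaseBuffer(void* owner) {
    static_cast<RefCountedBuffer*>(owner)->release();
}

// No characters are copied; the string just takes a reference to the buffer.
ToyExternalString newExternalString(RefCountedBuffer* buf) {
    buf->addRef();
    return ToyExternalString{buf->chars.data(), buf->chars.length(),
                             releaseBuffer, buf};
}

// Simulates the GC collecting the string and running its finalizer.
void collect(ToyExternalString& s) {
    s.finalize(s.owner);
    s.chars = nullptr;
    s.length = 0;
}
```

The C++ side can release its own reference right after handing the string to JS; the buffer then stays alive until the finalizer runs.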
-Boris