Re: PSA: mozilla::Result is about to understand move semantics

Kris Maglione Tue, 13 Aug 2019 11:12:07 -0700

Just for some additional context...

The C++ mozilla::Result implementation is pretty heavily optimized for common use cases to try to pack its representation into a single machine word when possible. For instance, if you return a `Result<Ok, nsresult>`, you're really just returning an `nsresult`. If you return a `Result<Ok, T&>`, you're really just returning a `T*`. If you return a `Result<Thing*, Error&>`, you're really just returning a `void*`. For the first two, there isn't any extra overhead compared to not returning a `Result`. For the third, there are a couple of extra bitwise &s which might get optimized out anyway.
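To make the packing idea concrete, here is a minimal sketch. It is not the actual mozilla::Result code: `Status`, `StatusResult` and `TaggedPtrResult` are hypothetical names, and the low-bit tag is an assumption about how a pointer-or-reference result could fit in one word.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for nsresult: zero is success, anything else an error.
enum class Status : std::uint32_t { Ok = 0, Failure = 0x80004005 };

// Sketch of Result<Ok, Status>: the status code is the whole representation.
class StatusResult {
 public:
  static StatusResult MakeOk() { return StatusResult(Status::Ok); }
  static StatusResult MakeErr(Status aStatus) { return StatusResult(aStatus); }
  bool isOk() const { return mStatus == Status::Ok; }
  Status unwrapErr() const { return mStatus; }

 private:
  explicit StatusResult(Status aStatus) : mStatus(aStatus) {}
  Status mStatus;
};

// Sketch of Result<T*, E&>: one pointer-sized word, low bit set on error.
// Assumes T and E are at least 2-byte aligned, so the low bit is unused.
template <typename T, typename E>
class TaggedPtrResult {
  static_assert(alignof(T) >= 2 && alignof(E) >= 2, "need a spare low bit");

 public:
  static TaggedPtrResult MakeOk(T* aValue) {
    return TaggedPtrResult(reinterpret_cast<std::uintptr_t>(aValue));
  }
  static TaggedPtrResult MakeErr(E& aError) {
    return TaggedPtrResult(reinterpret_cast<std::uintptr_t>(&aError) | kErrBit);
  }
  // These are the "couple of extra bitwise &s" on the access path.
  bool isOk() const { return (mBits & kErrBit) == 0; }
  T* unwrap() const { return reinterpret_cast<T*>(mBits); }
  E& unwrapErr() const { return *reinterpret_cast<E*>(mBits & ~kErrBit); }

 private:
  static constexpr std::uintptr_t kErrBit = 1;
  explicit TaggedPtrResult(std::uintptr_t aBits) : mBits(aBits) {}
  std::uintptr_t mBits;
};

struct Thing { int mThing; };
struct Error { int mCode; };

// Both sketches occupy exactly one machine word (or less).
static_assert(sizeof(StatusResult) == sizeof(Status), "no extra storage");
static_assert(sizeof(TaggedPtrResult<Thing, Error>) == sizeof(void*),
              "fits in a single pointer");
```

The two `static_assert`s at the end are the point: returning either type costs the same as returning the bare status code or pointer.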

It's also fairly easy to add new optimized packing strategies for other common patterns as they come up.

For more complex types, we currently return a struct. Those calls compile as the caller allocating space on the stack for the result type and passing in a pointer (though this varies slightly depending on OS ABI; sometimes larger types are returned in multiple registers, or large SIMD registers), and the compiler is generally required to perform copy elision on the return value if you return a temporary or a local of the same type, i.e., when you write:


 Thing foo() {
   Thing bar;
   bar.mThing = 1;
   return bar;
 }

`bar` essentially just refers to the pointer that the caller passed in.
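A rough way to observe this is to count constructor calls. The sketch below uses a hypothetical instrumented `Thing` with global counters (not anything from the tree). As far as the language standard goes, elision is only guaranteed for the returned temporary (since C++17); for a named local it is permitted and near-universal in practice, with a move as the fallback.

```cpp
#include <cassert>
#include <utility>

// Hypothetical counters, for illustration only.
int gCopies = 0;
int gMoves = 0;

struct Thing {
  int mThing = 0;
  Thing() = default;
  Thing(const Thing& aOther) : mThing(aOther.mThing) { ++gCopies; }
  Thing(Thing&& aOther) noexcept : mThing(aOther.mThing) { ++gMoves; }
};

// Returning a temporary: the object is constructed directly in the
// caller-provided storage (guaranteed as of C++17).
Thing MakeTemporary() {
  return Thing{};
}

// Returning a named local of the same type: compilers normally construct
// `bar` directly in the caller-provided storage (NRVO). If they do not,
// the return falls back to the move constructor, never the copy constructor.
Thing MakeNamed() {
  Thing bar;
  bar.mThing = 1;
  return bar;
}
```

Calling `Thing a = MakeTemporary();` should leave both counters at zero, and `Thing b = MakeNamed();` should leave `gCopies` at zero with at most one move.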

The real problem comes when storing large values in, or extracting large values from, the `Result` itself. In general, if you construct the `Result` with a named value of any sort, even with `std::move`, it will have to be copied (unless the compiler can guarantee that there will be no observable side-effects of omitting the copy, and it's willing to try that hard to prove that there aren't).


If the `Result` is constructed with a temporary, e.g.,

 Result<Thing, nsresult> foo() {
   return Thing { ... };
 }

Then the copy of the temporary `Thing` to the `Result` *may* be omitted, and is realistically quite likely to be. But if you're doing something like this in particularly hot code, I'd at least check the machine code clang generates before going very far with it.
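As an illustration of where the one unavoidable transfer happens, here is a sketch with a hypothetical, heavily simplified `SimpleResult` (it is not mozilla::Result and has no error arm). The counters only show what the language requires. Because they make the move observable, the compiler cannot drop it here, so whether the optimizer removes the underlying memcpy for a plain struct still has to be checked in the generated machine code.

```cpp
#include <cassert>
#include <utility>

// Hypothetical counters, for illustration only.
int gCopies = 0;
int gMoves = 0;

struct Thing {
  int mThing = 0;
  Thing() = default;
  explicit Thing(int aValue) : mThing(aValue) {}
  Thing(const Thing& aOther) : mThing(aOther.mThing) { ++gCopies; }
  Thing(Thing&& aOther) noexcept : mThing(aOther.mThing) { ++gMoves; }
};

// Hypothetical simplified Result: stores the success value inline.
template <typename T>
class SimpleResult {
 public:
  explicit SimpleResult(const T& aValue) : mValue(aValue) {}
  explicit SimpleResult(T&& aValue) : mValue(std::move(aValue)) {}
  bool isOk() const { return true; }
  const T& inspect() const { return mValue; }

 private:
  T mValue;
};

// Named value: std::move selects the move constructor, but the data still
// has to be transferred from `thing` into the Result's own storage.
SimpleResult<Thing> FromNamed() {
  Thing thing(1);
  return SimpleResult<Thing>(std::move(thing));
}

// Temporary: formally also one move into the Result's storage. With a
// side-effect-free type the optimizer is free to build it in place.
SimpleResult<Thing> FromTemporary() {
  return SimpleResult<Thing>(Thing(2));
}
```

In both functions the returned `SimpleResult` itself is constructed in the caller's storage, so the only transfer is the single move of the `Thing` into the result: one, rather than two.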


-Kris

On Tue, Aug 13, 2019 at 01:37:49PM -0400, Alexis Beingessner wrote:

Just chiming in here with some brief extra context on the performance of
Result (which I really need to do a writeup about so that I can just link
it):

TL;DR: performance of Result is usually fine, but can randomly be a huge
problem. However, there are also improvements for this on the (distant)
horizon.

Rust and Swift primarily use an error handling strategy based on Result.
For the most part performance is fine, but in some situations you can get
really bad problems. LLVM is very bad at optimizing around Results, and
tends to have copy and branch heavy codegen as a result (which further
hinders other optimizations). This was enough of an issue for the binary
deserializer webrender uses for IPC (bincode) that we just landed a rewrite
to remove the Results (after several attempts by myself to fix the issues
in other ways). [0]

Meanwhile, the Swift compiler team used their expertise in LLVM to add new
custom ABIs and calling conventions to handle these performance issues (the
right fix, imo). [1] I need to interview them on these details, to figure
out if we can use their work for Rust. (Since runtime performance is mostly
excellent, it's difficult to motivate working on this and diverting the
limited resources of the Rust team away from the bigger issues like compile
times and async-await.)

Also meanwhile, the C++ standards committee is apparently [2] investigating
introducing new calling conventions for their new light-weight exceptions
proposal (which is basically adding Result to C++ properly). [3] If that
work goes forward we should be able to take advantage of it for our own C++
code, and possibly also for Rust.

Gonna work on that writeup of this issue now.


[0]: https://bugzilla.mozilla.org/show_bug.cgi?id=1550640
[1]: https://lists.llvm.org/pipermail/llvm-dev/2016-March/096250.html
[2]: https://botondballo.wordpress.com/2019/07/26/trip-report-c-standards-meeting-in-cologne-july-2019/
[3]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0709r3.pdf

On Tue, Aug 13, 2019 at 1:01 PM Kris Maglione wrote:

On Mon, Aug 12, 2019 at 10:14:19PM -0700, Bryce Seager van Dyk wrote:
>>But either way, that's going to result in a copy when the
>>Result is constructed (unless the compiler is really clever).
>
>Is it the data being moved into the Result which is incurring
>the copy here, or the actual Result that's being returned?

The former.

>I would have thought that moving the data into the Result
>avoids a copy, and then the Result itself would be moved or RVOed
>(either way avoiding a copy).

The move into the result only means that we invoke move rather
than copy constructors when initializing the value stored in the
result. That's more efficient for a lot of things, but still
requires copying data from the old struct to the new one.

The return value Result is guaranteed to be optimized, though,
so you only wind up with a single copy rather than two.

