[Bug libstdc++/78113] std::variant and std::visit's current implementations do not get optimized out (compared to "recursive visitation")

2019-04-18 Thread quicknir at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78113

--- Comment #5 from Nir Friedman  ---
Jonathan, I saw you just changed the status of this. Michael Park's and my work
has resulted in a different implementation of std::visit which has much better
codegen, also backed by performance numbers. This also improves perf for things
like comparison, since internally this uses the same mechanisms. We've actually
been trying to contact someone from the clang and gcc standard libraries about
this for a while with no success... please feel free to email me at
quick...@gmail.com to get the conversation going about how to maybe merge some
of that code in. Happy to help with the actual merging work as well.

[Bug libstdc++/78113] std::variant and std::visit's current implementations do not get optimized out (compared to "recursive visitation")

2018-09-28 Thread quicknir at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78113

Nir Friedman changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |quicknir at gmail dot com

--- Comment #3 from Nir Friedman  ---
I noticed this issue independently recently, and in fact just mentioned it in
my CppCon presentation.

I think the optimal implementation is to switch on the index as a special case
when only one variant, with up to N types (suggested value N=10), is passed to
visit. Whatever the Boost implementation is (someone mentioned recursive if?),
I do not think it is optimal. On the other hand, switch case optimizes well:
https://gcc.godbolt.org/z/ysKr5s. Note that clang with libc++ yields basically
the same results. Also note the optimal assembly for Antony's example.

I'm willing to write the implementation (basically, use a switch case for a
single variant of up to 10 types and fall back to the previous implementation
otherwise) if there's interest in this approach; otherwise I'd be curious
whether there are any objections to it. I also plan to approach the clang and
MSVC library maintainers.
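
To make the proposal concrete, here is a minimal sketch of the switch-on-index
idea for a single variant. It is not the libstdc++ or mpark/variant code: the
names switch_visit, visit_case, visit_result_t and Describe are hypothetical,
the limit is 4 alternatives rather than 10 to keep it short, and the result
type is simply taken from the first alternative. It assumes C++17.

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>

// Result type of calling the visitor on alternative 0 (assumed to be the
// same for every alternative in this sketch).
template <class Visitor, class Variant>
using visit_result_t =
    std::invoke_result_t<Visitor, decltype(std::get<0>(std::declval<Variant>()))>;

// Hypothetical helper: call the visitor on alternative I if it exists.
template <std::size_t I, class Visitor, class Variant>
visit_result_t<Visitor, Variant> visit_case(Visitor&& vis, Variant&& var) {
    using V = std::remove_cv_t<std::remove_reference_t<Variant>>;
    if constexpr (I < std::variant_size_v<V>) {
        return std::invoke(std::forward<Visitor>(vis),
                           std::get<I>(std::forward<Variant>(var)));
    } else {
        std::abort();  // index past the last alternative: never reached
    }
}

// Hypothetical switch-based visit for one variant with at most 4 alternatives.
template <class Visitor, class Variant>
visit_result_t<Visitor, Variant> switch_visit(Visitor&& vis, Variant&& var) {
    using V = std::remove_cv_t<std::remove_reference_t<Variant>>;
    static_assert(std::variant_size_v<V> <= 4,
                  "this sketch handles at most 4 alternatives");
    switch (var.index()) {
        case 0: return visit_case<0>(std::forward<Visitor>(vis), std::forward<Variant>(var));
        case 1: return visit_case<1>(std::forward<Visitor>(vis), std::forward<Variant>(var));
        case 2: return visit_case<2>(std::forward<Visitor>(vis), std::forward<Variant>(var));
        case 3: return visit_case<3>(std::forward<Visitor>(vis), std::forward<Variant>(var));
        default: throw std::bad_variant_access{};  // valueless_by_exception
    }
}

// Example visitor used only for illustration.
struct Describe {
    std::string operator()(int) const { return "int"; }
    std::string operator()(double) const { return "double"; }
    std::string operator()(const std::string&) const { return "string"; }
};
```

Calling switch_visit(Describe{}, v) on a std::variant<int, double, std::string>
dispatches through the switch; a real implementation would fall back to the
existing mechanism for multiple variants or more alternatives.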

One final thing to note: even if gcc/clang get to the point that they can
inline the function pointers and get a jump table, that still won't really be
as fast as switch case. Switch case, for instance, won't typically turn into a
jump table for 2-3 options, presumably because the compiler thinks it's
actually faster to branch. It's unlikely that you'd see the same
transformations from an array of function pointers. Bottom line is that switch
case seems to give the compiler the most information and has the most man-years
of optimization behind it, so to me it makes a lot of sense to optimize variant
by using it for the common case.
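
For contrast, the function-pointer approach referred to above can be sketched
roughly as follows. This is an illustration of the general technique only, not
the actual libstdc++ source; table_visit, table_visit_impl, invoke_alternative
and Rank are hypothetical names, and the result type is again taken from the
first alternative. It assumes C++17.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <type_traits>
#include <utility>
#include <variant>

// Hypothetical helper: one function per alternative, stored in the table.
template <std::size_t I, class R, class Visitor, class Variant>
R invoke_alternative(Visitor&& vis, Variant&& var) {
    return std::invoke(std::forward<Visitor>(vis),
                       std::get<I>(std::forward<Variant>(var)));
}

template <class R, class Visitor, class Variant, std::size_t... Is>
R table_visit_impl(Visitor&& vis, Variant&& var, std::index_sequence<Is...>) {
    using Fn = R (*)(Visitor&&, Variant&&);
    // One entry per alternative; dispatch is an indirect call through the
    // table, which the optimizer has to see through before it can inline.
    static constexpr Fn table[] = {
        &invoke_alternative<Is, R, Visitor, Variant>...};
    return table[var.index()](std::forward<Visitor>(vis),
                              std::forward<Variant>(var));
}

// Hypothetical table-based visit for a single variant.
template <class Visitor, class Variant>
decltype(auto) table_visit(Visitor&& vis, Variant&& var) {
    using V = std::remove_cv_t<std::remove_reference_t<Variant>>;
    using R = std::invoke_result_t<
        Visitor, decltype(std::get<0>(std::declval<Variant>()))>;
    if (var.valueless_by_exception()) throw std::bad_variant_access{};
    return table_visit_impl<R>(std::forward<Visitor>(vis),
                               std::forward<Variant>(var),
                               std::make_index_sequence<std::variant_size_v<V>>{});
}

// Example visitor used only for illustration.
struct Rank {
    int operator()(int) const { return 0; }
    int operator()(double) const { return 1; }
    int operator()(const std::string&) const { return 2; }
};
```

Here the compiler only sees an indexed load followed by an indirect call, so
whether it ends up as a jump table, a branch chain, or stays a call depends on
how far it can inline through the table.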

[Bug libstdc++/78113] std::variant and std::visit's current implementations do not get optimized out (compared to "recursive visitation")

2018-10-02 Thread quicknir at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78113

--- Comment #4 from Nir Friedman  ---
Started a PR on mpark github variant: https://github.com/mpark/variant/pull/52.