https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112824
--- Comment #10 from GCC Commits ---
The master branch has been updated by Hongyu Wang:
https://gcc.gnu.org/g:a52940cfee0908aed0d2a674a451f6d9984d
commit r14-6575-ga52940cfee0908aed0d2a674a451f6d9984d
Author: Hongyu Wang
Date: Mon
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112824
Hongyu Wang changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wwwhhhyyy333 at gmail dot com
--- Comment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112824
--- Comment #7 from Hongtao Liu ---
(In reply to Chris Elrod from comment #6)
> Hongtao Liu, I do think that one should ideally be able to get optimal
> codegen when using 512-bit builtin vectors or vector intrinsics, without
> needing to set `-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112824
--- Comment #8 from Chris Elrod ---
> If it's designed the way you want it to be, another issue would be like,
> should we lower 512-bit vector builtins/intrinsic to ymm/xmm when
> -mprefer-vector-width=256, the answer is we'd rather not.
To
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112824
--- Comment #6 from Chris Elrod ---
Hongtao Liu, I do think that one should ideally be able to get optimal codegen
when using 512-bit builtin vectors or vector intrinsics, without needing to set
`-mprefer-vector-width=512` (and, currently, also
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112824
Richard Biener changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112824
liuhongt at gcc dot gnu.org changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |liuhongt at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112824
--- Comment #3 from Chris Elrod ---
> I thought I hit the important cases, but my non-minimal example still gets
> unnecessary register splits and stack spills, so maybe I missed places, or
> perhaps there's another issue.
Adding the unroll p
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112824
--- Comment #2 from Chris Elrod ---
https://godbolt.org/z/3648aMTz8
Perhaps a simpler diff is that you can reproduce by uncommenting the pragma,
but codegen becomes good with it.
template
constexpr auto operator*(OuterDualUA2 a, OuterDualUA2
b
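
For context, the "512-bit builtin vectors" mentioned in the comments above are types declared with GCC's generic vector extension. The following is only a minimal sketch of such a type. The names `v8df`, `fill`, and `axpy` are hypothetical and do not come from the bug report or the godbolt link. Whether GCC keeps these values in a single zmm register or splits them depends on the target options in use (for example `-mavx512f` and, as discussed in the comments, `-mprefer-vector-width=512`).

```cpp
// Hypothetical name: a 512-bit (64-byte) vector of 8 doubles, declared with
// the GCC/Clang vector_size attribute.
typedef double v8df __attribute__((vector_size(64)));

// Hypothetical helper: set every lane of v to x.
// Vectors are passed by reference here; passing 512-bit vectors by value
// can produce ABI warnings when AVX-512 is not enabled.
inline void fill(v8df &v, double x) {
  for (int i = 0; i < 8; ++i) {
    v[i] = x;
  }
}

// Hypothetical helper: acc += a * b, element-wise over all 8 lanes.
inline void axpy(v8df &acc, const v8df &a, const v8df &b) {
  acc += a * b;
}
```

Compiling a caller of `axpy` with and without `-mprefer-vector-width=512` and comparing the generated assembly is one way to see the kind of register splitting the thread describes. The exact output depends on the GCC version and target.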