https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110013
--- Comment #2 from Devin Hussey ---
Scratch that. There is a somewhat easy way to fix this following psABI AND
using MMX with SSE.
Upon calling a function, we can have the following sequence:
func:
movdq2q mm0, xmm0
movq mm1, [esp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110013
--- Comment #1 from Devin Hussey ---
As a side note, the official psABI does say that function call parameters use
MM0-MM2. If Clang follows its own rules, then the supposed stability of the
ABI is meaningless.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110013
Bug ID: 110013
Summary: [i386] vector_size(8) on 32-bit ABI
Product: gcc
Version: 14.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
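For readers unfamiliar with the type in the summary, here is a minimal sketch of a vector_size(8) value crossing a function boundary. It is not the reproducer from the report; the names v2si and add_v2si are made up for illustration. Which registers (or stack slots) carry the arguments on 32-bit x86 is the ABI question the report raises, so the sketch only shows the source-level shape.

```c
#include <stdint.h>

/* 8-byte vector of two 32-bit lanes, using the GCC/Clang vector extension. */
typedef int32_t v2si __attribute__((vector_size(8)));

/* Hypothetical helper: both parameters and the return value are 8-byte
   vectors, so their passing convention depends on the target ABI. */
v2si add_v2si(v2si a, v2si b)
{
    return a + b;   /* lane-wise addition */
}
```

Compiling this with -m32 under different compilers and comparing the generated assembly is one way to see whether they agree on the calling convention.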
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103781
--- Comment #4 from Devin Hussey ---
Makes sense because the multiplier is what, 5 cycles on an A53?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103781
--- Comment #2 from Devin Hussey ---
Yeah, my bad. I meant SLP; I get them mixed up all the time.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103781
Bug ID: 103781
Summary: [AArch64, 11 regr.] Failed partial vectorization of mulv2di3
Product: gcc
Version: 11.2.1
Status: UNCONFIRMED
Severity: normal
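As a rough illustration of the kind of code involved, the sketch below multiplies two adjacent 64-bit lanes, which is the straight-line shape an SLP vectorizer may try to turn into a single V2DI multiply (the mulv2di3 pattern). It is not the test case from the report, and the name mul_pair is hypothetical. AArch64 Advanced SIMD has no 64-bit-lane integer multiply instruction, so whether to vectorize such code is a cost-model decision.

```c
#include <stdint.h>

/* Hypothetical example: two independent 64x64-bit multiplies on adjacent
   elements, a candidate for SLP vectorization into one V2DI multiply. */
void mul_pair(uint64_t *restrict dst, const uint64_t *a, const uint64_t *b)
{
    dst[0] = a[0] * b[0];
    dst[1] = a[1] * b[1];
}
```

Building this at -O3 for AArch64 and inspecting the assembly shows whether the compiler kept two scalar multiplies or synthesized a vector sequence.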
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103641
--- Comment #19 from Devin Hussey ---
> The new costs on AArch64 have a vector multiplication cost of 4, which is
> very reasonable.
Would this include mulv2di3 by any chance?
Because another thing I noticed is that GCC is also trying to mul
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103641
Bug ID: 103641
Summary: [aarch64][11 regression] Severe compile time regression in SLP vectorize step
Product: gcc
Version: 11.2.0
Status: UNCONFIRMED
Severity