https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89226
H.J. Lu changed:
What|Removed |Added
Resolution|--- |DUPLICATE
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89226
H.J. Lu changed:
What|Removed |Added
CC||crazylht at gmail dot com
--- Comment #8 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89226
H.J. Lu changed:
What|Removed |Added
CC||rcc.dark at gmail dot com
--- Comment #7 from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89226
--- Comment #6 from H.J. Lu ---
Please take a look at usr/hjl/pieces/master branch:
https://gitlab.com/x86-gcc/wip
[hjl@gnu-cfl-1 gcc]$ cat x.cc
#include <immintrin.h>
// DUMB PAIR
struct dumb_pair {
alignas(2*sizeof(__m256i)) __m256i x[2];
};
void co
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89226
--- Comment #5 from H.J. Lu ---
(In reply to Jakub Jelinek from comment #3)
> Seems most of the *by_pieces code actually uses widest_int_mode_for_size
> which already handles even the wider modes as long as they have a mov
> instruction. With th
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89226
--- Comment #4 from Jakub Jelinek ---
Maybe i386.c would need its own ix86_use_by_pieces_infrastructure_p target hook
if the default wouldn't do the right thing with this. Maybe we'll need to
split STORE_MAX_PIECES into separately overridable CL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89226
Jakub Jelinek changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
Last reconfirmed|
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89226
--- Comment #2 from Jakub Jelinek ---
That is because in copy1 it is a normal memcpy expansion.
And the generic move_by_pieces case is done in preference to the target-specific
one. In i386.h we have:
/* Max number of bytes we can move from memory
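For readers unfamiliar with the by-pieces expansion discussed in this comment, here is a simplified model of the idea. It is not GCC's actual code. It shows how a cap on the widest single move determines how many moves a fixed-size copy needs. The function name `split_by_pieces` and both parameter names are hypothetical, chosen only for illustration.

```cpp
#include <cstddef>
#include <vector>

// Simplified model of a by-pieces copy plan (NOT GCC's implementation):
// cover `size` bytes with power-of-two pieces no wider than `max_piece`.
// `split_by_pieces`, `size`, and `max_piece` are hypothetical names.
std::vector<std::size_t> split_by_pieces(std::size_t size, std::size_t max_piece) {
    std::vector<std::size_t> pieces;
    if (max_piece == 0) return pieces;  // no move width available

    // Start from the largest power of two not exceeding max_piece.
    std::size_t piece = 1;
    while (piece <= max_piece / 2) piece *= 2;

    std::size_t remaining = size;
    while (remaining > 0) {
        // Narrow the piece until it fits in what is left.
        while (piece > remaining) piece /= 2;
        pieces.push_back(piece);
        remaining -= piece;
    }
    return pieces;
}
```

Under this model a 64-byte (512-bit) object with a 16-byte cap needs four moves, while a 64-byte cap would need one, which is roughly why the per-target limit matters for the code generated here.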
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89226
--- Comment #1 from Marc Glisse ---
The optimized dump for copy1 looks like
*to_2(D) = *from_3(D);
so we get essentially memcpy, while copy2 has
_4 = MEM[(const struct foo512 &)from_3(D)].a;
MEM[(struct foo512 *)to_2(D)].a = _4;
_5 = M