[Bug target/97891] [x86] Consider using registers on large initializations

2020-12-23 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97891 --- Comment #6 from Hongtao.liu ---
cat test.c
    typedef struct { long a; long b; } TI;
    extern TI r;
    void foo () { r.a = 0; r.b = 0; }
gcc -Ofast -march=cascadelake -S got
    foo:
    .LFB0:
        .cfi_startproc
        movq $0, r(%rip)

[Bug target/97891] [x86] Consider using registers on large initializations

2020-11-19 Thread andysem at mail dot ru via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97891 --- Comment #5 from andysem at mail dot ru --- Using a register is beneficial even for bytes and words if there are multiple mov instructions. But there has to be a single reg0 for all movs. I'm not very knowledgeable about gcc internals, but w
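
The idea in comment #5 can be sketched roughly as follows. This is only an illustration, not code from the bug report: the four-field struct Quad and the functions clear_imm and clear_reg are hypothetical names. A store of the form "movq $0, mem" carries a 4-byte immediate in every instruction, so zeroing one register once and storing it repeatedly may give smaller code. The __asm__ statement (a GCC extension, guarded for x86-64) is used only to keep the compiler from folding the zero back into immediates; on other targets the sketch falls back to a plain constant.

```c
/* Hypothetical four-field variant of the TI struct shown in comment #6. */
typedef struct { long a; long b; long c; long d; } Quad;

/* Plain constant stores: the compiler may emit one "movq $0, mem"
   per field, each carrying its own 4-byte immediate. */
void clear_imm(Quad *p)
{
    p->a = 0;
    p->b = 0;
    p->c = 0;
    p->d = 0;
}

/* Single zeroed register reused for every store (assumption: this
   mirrors the "single reg0 for all movs" idea from comment #5). */
void clear_reg(Quad *p)
{
    long zero;
#if defined(__GNUC__) && defined(__x86_64__)
    /* xorl on the 32-bit half also clears the upper 32 bits. */
    __asm__ ("xorl %k0, %k0" : "=r" (zero) : : "cc");
#else
    zero = 0;  /* portable fallback; the compiler is free to fold it */
#endif
    p->a = zero;
    p->b = zero;
    p->c = zero;
    p->d = zero;
}
```

Both functions leave the struct in the same state; comparing the output of gcc -S for the two is one way to check whether the register form is actually shorter on a given target and option set.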

[Bug target/97891] [x86] Consider using registers on large initializations

2020-11-19 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97891 --- Comment #4 from Hongtao.liu --- (In reply to Hongtao.liu from comment #3)
> This problem is very similar to the one pass_rpad deals with.
We already have mov_xor for mov $0 to reg, so we only need to handle mov $0 to mem. and size for: xor

[Bug target/97891] [x86] Consider using registers on large initializations

2020-11-18 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97891 --- Comment #3 from Hongtao.liu --- This problem is very similar to the one pass_rpad deals with.

[Bug target/97891] [x86] Consider using registers on large initializations

2020-11-18 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97891 Richard Biener changed:
    What             |Removed |Added
    Last reconfirmed |        |2020-11-18
    Status           |UNCONFIRM

[Bug target/97891] [x86] Consider using registers on large initializations

2020-11-18 Thread andysem at mail dot ru via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97891 --- Comment #1 from andysem at mail dot ru --- As a side note, the "xorl %edx, %edx" in the original code should have been moved outside the loop, as it was in the code with __asm__ block.