On Tue, Jan 23, 2024 at 6:15 AM H.J. Lu <hjl.to...@gmail.com> wrote:
>
> On Mon, Jan 22, 2024 at 11:10 PM Richard Biener <rguent...@suse.de> wrote:
> >
> > On Mon, 22 Jan 2024, Jeff Law wrote:
> >
> > >
> > >
> > > On 1/15/24 06:34, Richard Biener wrote:
> > > > When the x86 backend generates code for cpymem with the rep_8byte
> > > > strategy for the 8 byte aligned main rep movq it needs to compute
> > > > an adjusted pointer to the source after doing a prologue aligning
> > > > the destination.  It computes that via
> > > >
> > > >    src_ptr + (dest_ptr - orig_dest_ptr)
> > > >
> > > > which is perfectly fine.  On RTL this is then
> > > >
> > > >     8: r134:DI=const(`g'+0x44)
> > > >     9: {r133:DI=frame:DI-0x4c;clobber flags:CC;}
> > > >       REG_UNUSED flags:CC
> > > >    56: r129:DI=const(`g'+0x4c)
> > > >    57: {r129:DI=r129:DI&0xfffffffffffffff8;clobber flags:CC;}
> > > >       REG_UNUSED flags:CC
> > > >       REG_EQUAL const(`g'+0x4c)&0xfffffffffffffff8
> > > >    58: {r118:DI=r134:DI-r129:DI;clobber flags:CC;}
> > > >       REG_DEAD r134:DI
> > > >       REG_UNUSED flags:CC
> > > >       REG_EQUAL const(`g'+0x44)-r129:DI
> > > >    59: {r119:DI=r133:DI-r118:DI;clobber flags:CC;}
> > > >       REG_DEAD r133:DI
> > > >       REG_UNUSED flags:CC
> > > >
> > > > but as written find_base_term happily picks the first candidate
> > > > it finds for the MINUS which means it picks const(`g') rather
> > > > than the correct frame:DI.  The way find_base_term (but also
> > > > the unfixed find_base_value used by init_alias_analysis to
> > > > initialize REG_BASE_VALUE) performs pointer analysis isn't
> > > > sound.  The following restricts the handling of multi-operand
> > > > operations to the case we know only one can be a pointer.
> > > >
> > > > This for example causes gcc.dg/tree-ssa/pr94969.c to miss some
> > > > RTL PRE (I've opened PR113395 for this).  A more drastic patch,
> > > > removing base_alias_check results in only gcc.dg/guality/pr41447-1.c
> > > > regressing (so testsuite coverage is bad).  I've looked at
> > > > gcc.dg/tree-ssa tests and mostly scheduling changes are present,
> > > > the cc1plus .text size is only 230 bytes worse.  With this
> > > > less drastic patch below most scheduling changes are gone.
> > > >
> > > > x86_64 might not be the very best target to test for impact, but
> > > > test coverage on other targets is unlikely to be very much better.
> > > >
> > > > Bootstrapped and tested on x86_64-unknown-linux-gnu (together
> > > > with 2/2).  Jeff, can you maybe throw this on your tester?
> > > > Jakub, you did the PR64025 fix which was for a similar issue.
> > > No issues across the cross compilers with those two patches.
> >
> > Thanks, pushed.  I'm probably going to revert when bigger issues
> > appear (and hopefully we'd get some test coverage then).
> >
> > Richard.
>
> The test failed with -m32:
>
> FAIL: gcc.dg/torture/pr113255.c   -O1  (test for excess errors)
> Excess errors:
> cc1: error: '-mstringop-strategy=rep_8byte' not supported for 32-bit code
>
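For anyone following the quoted analysis, here is a minimal sketch of the address arithmetic in that RTL dump, using plain integers in place of pointers. It is not GCC code: the helper name adjusted_src and its parameters are made up for illustration, and the "+ 8" simply mirrors the 0x44 -> 0x4c step seen in insns 8 and 56.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: models insns 56-59 of the quoted
   dump.  SRC stands for the source address (frame-0x4c in the dump) and
   ORIG_DEST for the original destination address (`g'+0x44).  */
static uintptr_t
adjusted_src (uintptr_t src, uintptr_t orig_dest)
{
  /* Insns 56/57: destination after the aligning prologue, masked down to
     an 8-byte boundary.  */
  uintptr_t dest = (orig_dest + 8) & ~(uintptr_t) 7;
  assert (dest % 8 == 0);

  /* Insn 58: orig_dest - dest.  Both operands derive from the destination
     object, so the result is an offset and not a pointer into it.
     Unsigned wraparound makes the negative value well defined.  */
  uintptr_t neg_delta = orig_dest - dest;

  /* Insn 59: src - (orig_dest - dest) == src + (dest - orig_dest).
     SRC is the only real pointer operand, so the result belongs to the
     source object, not to the first operand of the inner MINUS.  */
  return src - neg_delta;
}
```

With orig_dest = 0x44 the aligned destination is 0x48, so the source is advanced by 4 bytes; the result stays inside the source object even though the intermediate value in insn 58 is built only from destination addresses.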
I am checking in this:

diff --git a/gcc/testsuite/gcc.dg/torture/pr113255.c b/gcc/testsuite/gcc.dg/torture/pr113255.c
index 2f009524c6b..78af6a5a563 100644
--- a/gcc/testsuite/gcc.dg/torture/pr113255.c
+++ b/gcc/testsuite/gcc.dg/torture/pr113255.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-additional-options "-mtune=k8 -mstringop-strategy=rep_8byte" { target { x86_64-*-* i?86-*-* } } } */
+/* { dg-additional-options "-mtune=k8 -mstringop-strategy=rep_8byte" { target { { i?86-*-* x86_64-*-* } && { ! ia32 } } } } */
 
 struct S { unsigned a[10]; unsigned y; unsigned b[6]; } g[2];

-- 
H.J.