Jeff Law <jeffreya...@gmail.com> writes: > On 8/2/24 2:23 PM, Andrew Pinski wrote: >> On Wed, Jul 31, 2024 at 9:41 AM Richard Sandiford >> <richard.sandif...@arm.com> wrote: >>> >>> The testcase contains the constant: >>> >>> arr2 = svreinterpret_u8(svdup_u32(0x0a0d5c3f)); >>> >>> which was initially hoisted by hand, but which gimple optimisers later >>> propagated to each use (as expected). The constant was then expanded >>> as a load-and-duplicate from the constant pool. Normally that load >>> should then be hoisted back out of the loop, but may_trap_or_fault_p >>> stopped that from happening in this case. >>> >>> The code responsible was: >>> >>> if (/* MEM_NOTRAP_P only relates to the actual position of the memory >>> reference; moving it out of context such as when moving code >>> when optimizing, might cause its address to become invalid. >>> */ >>> code_changed >>> || !MEM_NOTRAP_P (x)) >>> { >>> poly_int64 size = MEM_SIZE_KNOWN_P (x) ? MEM_SIZE (x) : -1; >>> return rtx_addr_can_trap_p_1 (XEXP (x, 0), 0, size, >>> GET_MODE (x), code_changed); >>> } >>> >>> where code_changed is true. (Arguably it doesn't need to be true in >>> this case, if we inserted invariants on the preheader edge, but it >>> would still need to be true for conditionally executed loads.) >>> >>> Normally this wouldn't be a problem, since rtx_addr_can_trap_p_1 >>> would recognise that the address refers to the constant pool. >>> However, the SVE load-and-replicate instructions have a limited >>> offset range, so it isn't possible for them to have a LO_SUM address. >>> All we have is a plain pseudo base register. >>> >>> MEM_READONLY_P is defined as: >>> >>> /* 1 if RTX is a mem that is statically allocated in read-only memory. */ >>> (RTL_FLAG_CHECK1 ("MEM_READONLY_P", (RTX), MEM)->unchanging) >>> >>> and so I think it should be safe to move memory references if both >>> MEM_READONLY_P and MEM_NOTRAP_P are true. >>> >>> The testcase isn't a minimal reproducer, but I think it's good >>> to have a realistic full routine in the testsuite. >>> >>> Bootstrapped & regression-tested on aarch64-linux-gnu. OK to install? >> >> >> This is breaking the build on a few targets (x86_64 and powerpc64le so >> far reported, see PR 116200). >> >> From what I can tell is that it is treating `(plus:DI (ashift:DI >> (reg:DI 0 ax [690]) (const_int 3 [0x3])) (label_ref:DI 1620))` as not >> trapping and allowing it to be moved before the check of ax being in >> the range [0..2] and we have eax being (unsigned long)(unsigned int)-9 >> in value. So we get a bogus address which will trap. I put my findings >> in PR 116200 too. > I think it's the root cause of the x86_64 libgomp failures on the trunk > as well. I haven't done any debugging beyond that.
Sorry for the breakage. I've reverted the patch. Richard