Jeff Law <jeffreya...@gmail.com> writes:
> On 8/2/24 2:23 PM, Andrew Pinski wrote:
>> On Wed, Jul 31, 2024 at 9:41 AM Richard Sandiford
>> <richard.sandif...@arm.com> wrote:
>>>
>>> The testcase contains the constant:
>>>
>>>    arr2 = svreinterpret_u8(svdup_u32(0x0a0d5c3f));
>>>
>>> which was initially hoisted by hand, but which gimple optimisers later
>>> propagated to each use (as expected).  The constant was then expanded
>>> as a load-and-duplicate from the constant pool.  Normally that load
>>> should then be hoisted back out of the loop, but may_trap_or_fault_p
>>> stopped that from happening in this case.
>>>
>>> The code responsible was:
>>>
>>>        if (/* MEM_NOTRAP_P only relates to the actual position of the memory
>>>               reference; moving it out of context such as when moving code
>>>               when optimizing, might cause its address to become invalid.  
>>> */
>>>            code_changed
>>>            || !MEM_NOTRAP_P (x))
>>>          {
>>>            poly_int64 size = MEM_SIZE_KNOWN_P (x) ? MEM_SIZE (x) : -1;
>>>            return rtx_addr_can_trap_p_1 (XEXP (x, 0), 0, size,
>>>                                          GET_MODE (x), code_changed);
>>>          }
>>>
>>> where code_changed is true.  (Arguably it doesn't need to be true in
>>> this case, if we inserted invariants on the preheader edge, but it
>>> would still need to be true for conditionally executed loads.)
>>>
>>> Normally this wouldn't be a problem, since rtx_addr_can_trap_p_1
>>> would recognise that the address refers to the constant pool.
>>> However, the SVE load-and-replicate instructions have a limited
>>> offset range, so it isn't possible for them to have a LO_SUM address.
>>> All we have is a plain pseudo base register.
>>>
>>> MEM_READONLY_P is defined as:
>>>
>>> /* 1 if RTX is a mem that is statically allocated in read-only memory.  */
>>>    (RTL_FLAG_CHECK1 ("MEM_READONLY_P", (RTX), MEM)->unchanging)
>>>
>>> and so I think it should be safe to move memory references if both
>>> MEM_READONLY_P and MEM_NOTRAP_P are true.
>>>
>>> The testcase isn't a minimal reproducer, but I think it's good
>>> to have a realistic full routine in the testsuite.
>>>
>>> Bootstrapped & regression-tested on aarch64-linux-gnu.  OK to install?
>> 
>> 
>> This is breaking the build on a few targets (x86_64 and powerpc64le so
>> far reported, see PR 116200).
>> 
>>  From what I can tell is that it is treating `(plus:DI (ashift:DI
>> (reg:DI 0 ax [690]) (const_int 3 [0x3]))  (label_ref:DI 1620))` as not
>> trapping and allowing it to be moved before the check of ax being in
>> the range [0..2] and we have eax being (unsigned long)(unsigned int)-9
>> in value. So we get a bogus address which will trap. I put my findings
>> in PR 116200 too.
> I think it's the root cause of the x86_64 libgomp failures on the trunk 
> as well.  I haven't done any debugging beyond that.

Sorry for the breakage.  I've reverted the patch.

Richard

Reply via email to