Re: [PATCH] arm: [MVE intrinsics] fix vctpq intrinsic implementation [PR target/117814]

Andre Vieira (lists) Thu, 28 Nov 2024 06:14:13 -0800

Hi Christophe,

On 28/11/2024 10:22, Christophe Lyon wrote:

> The VCTP instruction creates a Vector Tail Predicate in VPR.P0, based
> on the input value, but also constrained by a VPT block (if present),
> or if used within a DLSTP/LETP loop.
>
> Therefore we need to inform the compiler that this intrinsic reads the
> FPCXT register, otherwise it could make incorrect assumptions: for
> instance in test7() from gcc.target/arm/mve/dlstp-compile-asm-2.c it
> would hoist p1 = vctp32q (g) outside of the loop.

We chatted about this off-list but it's good to share here for others too. I do not believe the transformation gcc is doing here is wrong. The transformation we do for test7, along with some others in the testsuite, relies on an analysis that checks whether masks used within the loop, other than the loop predicate mask, have a side effect. In other words, any instruction that is not predicated by the loop predicate, be that unpredicated or predicated by another mask, triggers an analysis to check whether the results are used in a safe way. Check the comments above 'arm_mve_impl_predicated_p' in arm.cc.

For test7 the non-loop predicate 'p1' is used to predicate a load inside the loop. When dlstp'ed, that load will be masked by 'p & p1' instead, which means it could be loading less than initially intended. However, the results of that load are used in a vadd predicated by 'p', which means any values that it would have loaded if not masked by 'p' would have been discarded in the add, so it has no relevant effect.
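
To make that argument concrete, here is a minimal scalar sketch in plain C. It is not the code from test7 and it does not use the real MVE intrinsics: the helper names (load_z, add_store_p, iteration, count_mismatches) are hypothetical, masks are modelled with one bit per 32-bit lane rather than as a real mve_pred16_t, and the load is assumed to be a zeroing one whose result only reaches memory through an add and store predicated by 'p'.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4 /* four 32-bit lanes, as for a vctp32q-style predicate */

/* Hypothetical model of a zeroing predicated load: inactive lanes read 0. */
static void load_z(int32_t dst[LANES], const int32_t *src, unsigned mask)
{
    for (int i = 0; i < LANES; i++)
        dst[i] = ((mask >> i) & 1u) ? src[i] : 0;
}

/* Hypothetical model of an add predicated by 'mask' followed by a store
   predicated by the same mask: only active lanes reach memory. */
static void add_store_p(int32_t *out, const int32_t a[LANES],
                        const int32_t b[LANES], unsigned mask)
{
    for (int i = 0; i < LANES; i++)
        if ((mask >> i) & 1u)
            out[i] = a[i] + b[i];
}

/* One loop iteration: 'a' is loaded under the loop predicate p, 'b' under
   load_mask, and the sum is consumed under p only. */
static void iteration(int32_t *out, const int32_t *a, const int32_t *b,
                      unsigned p, unsigned load_mask)
{
    int32_t va[LANES], vb[LANES];
    load_z(va, a, p);
    load_z(vb, b, load_mask);
    add_store_p(out, va, vb, p);
}

/* Compare masking the second load by p1 (original loop) against masking
   it by p & p1 (what implicit loop predication would do), for every pair
   of lane masks. Returns the number of lanes whose stored value differs. */
int count_mismatches(void)
{
    const int32_t a[LANES] = {1, 2, 3, 4};
    const int32_t b[LANES] = {10, 20, 30, 40};
    int mismatches = 0;

    for (unsigned p = 0; p < (1u << LANES); p++) {
        for (unsigned p1 = 0; p1 < (1u << LANES); p1++) {
            int32_t before[LANES] = {-1, -1, -1, -1};
            int32_t after[LANES] = {-1, -1, -1, -1};

            iteration(before, a, b, p, p1);
            iteration(after, a, b, p, p & p1);

            for (int i = 0; i < LANES; i++)
                if (before[i] != after[i])
                    mismatches++;
        }
    }
    return mismatches;
}
```

In this model count_mismatches() finds no differing lane: in every lane where 'p' is active, 'p & p1' equals 'p1', and lanes where 'p' is inactive are never stored. It only illustrates the reasoning; the actual check in the backend is the one described in the arm.cc comments.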

Furthermore, I also believe the compiler is already aware that VCTP writes P0, given it has an output operand with the predicate 'vpr_register_operand' and the register constraint '=Up'. During DLSTP transformation we rely on reads and writes to such operands to do our transformation, and it should also provide other backend passes with enough information.


So I don't think this patch is needed.
