"Kewen.Lin" <li...@linux.ibm.com> writes:
> Hi Richard,
>
> Thanks for the comments!
>
> on 2024/8/12 16:55, Richard Sandiford wrote:
>> "Kewen.Lin" <li...@linux.ibm.com> writes:
>>> Hi,
>>>
>>> Commit r15-2084 exposes one ICE in LRA.  Firstly, before
>>> r15-2084 KFmode has 126 bit precision while V1TImode has 128
>>> bit precision, so the subreg (subreg:V1TI (reg:KF 131) 0) is
>>> paradoxical_subreg_p, which stops some passes from doing
>>> some optimization.  After r15-2084, KFmode has the same mode
>>> precision as V1TImode, passes are able to optimize more, but
>>> it causes this ICE in LRA as described below:
>>>
>>> For insn 106 (set (mem:V1TI ...) (subreg:V1TI (reg:KF 133) 0)),
>>> which matches pattern
>>>
>>> (define_insn "*vsx_le_perm_store_<mode>"
>>>   [(set (match_operand:VSX_LE_128 0 "memory_operand" "=Z,Q")
>>>         (match_operand:VSX_LE_128 1 "vsx_register_operand" "+wa,r"))]
>> 
>> Is it well-formed to have "+" on an operand that is only semantically
>> an input?  df and most passes would consider it to be a pure input
>> and so would disregard any write that is intended to take place.
>> 
>> E.g. if we have:
>> 
>>   (set (reg:M R2) (reg:M R1)) ;; R1 dead
>>   (set (reg:M R3) (reg:M R2))
>> 
>> then it would be valid to change that to:
>> 
>>   (set (reg:M R2) (reg:M R1))
>>   (set (reg:M R3) (reg:M R1))
>> 
>> without considering the "+" on the input to the first instruction.
>
> Good question, I guess the "+" is to match the fact that operand[1]
> can be both read and written by this insn, operand[1] has to be
> re-used as the dest register of vector rotate 64bit (doubleword swap).
> but it'll get swapped back so it's safe if the register is still live
> (like the above example).  For this case that writing to but later
> restoring, I'm not sure if we can take it as "no write" (strip "+").

Yeah.  AIUI, there is no way of satisfying the constraints in a way
that makes operand 0 overlap operand 1, and:

>
> ;; The post-reload split requires that we re-permute the source
> ;; register in case it is still live.
> (define_split
>   [(set (match_operand:VSX_LE_128 0 "memory_operand")
>         (match_operand:VSX_LE_128 1 "vsx_register_operand"))]
>   "!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed && !TARGET_P9_VECTOR
>    && !altivec_indexed_or_indirect_operand (operands[0], <MODE>mode)"
>   [(const_int 0)]
> {
>   rs6000_emit_le_vsx_permute (operands[1], operands[1], <MODE>mode);
>   rs6000_emit_le_vsx_permute (operands[0], operands[1], <MODE>mode);
>   rs6000_emit_le_vsx_permute (operands[1], operands[1], <MODE>mode);

would not work correctly for that kind of overlap in any case.
 
So it looks on the face of it like the pattern would be (more) correct
without the "+".

Thanks,
Richard

>   DONE;
> })
>
> CC the author Mike as well.
>
>> 
>> But if we do stick with the patch...
>> 
>>>   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR
>>>    && !altivec_indexed_or_indirect_operand (operands[0], <MODE>mode)"
>>>   "@
>>>    #
>>>    #"
>>>   [(set_attr "type" "vecstore,store")
>>>    (set_attr "length" "12,8")
>>>    (set_attr "isa" "<VSisa>,*")])
>>>
>>> LRA makes equivalence substitution on r133 with const double
>>> (const_double:KF 0.0), selects alternative 0 and fixes up
>>> operand 1 for constraint "wa", because operand 1 is OP_INOUT,
>>> so it considers assigning back to it as well, that is:
>>>
>>>   lra_emit_move (type == OP_INOUT ? copy_rtx (old) : old, new_reg);
>>>
>>> But because old has been changed to const_double in equivalence
>>> substitution, the move is actually assigning to const_double,
>>> which is invalid and cause ICE.
>>>
>>> Considering reg:KF 133 is equivalent with (const_double:KF 0.0)
>>> even though this operand is OP_INOUT, IMHO there should not be
>>> any following uses of reg:KF 133, otherwise it doesn't have the
>>> chance to be equivalent to (const_double:KF 0.0).  From this
>>> perspective, I think we can guard the lra_emit_move with
>>> nonimmediate_operand to exclude such case.
>>>
>>> Does it sound reasonable?
>>>
>>> btw, I also tried with disallowing equivalence substitution with
>>> CONSTANT_P value if the corresponding operand is OP_INOUT or
>>> OP_OUT, it can also fix this issue, but with more thinking it
>>> seems not necessary to stop such substitution if we can handle it
>>> later as above.
>>>
>>> Bootstrapped and regtested on x86_64-redhat-linux and
>>> powerpc64{,le}-linux-gnu.
>>>
>>> BR,
>>> Kewen
>>> -----
>>>     PR rtl-optimization/116170
>>>
>>> gcc/ChangeLog:
>>>
>>>     * lra-constraints.cc (curr_insn_transform): Don't emit move back to
>>>     old operand if it's nonimmediate_operand.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>>     * gcc.target/powerpc/pr116170.c: New test.
>>> ---
>>>  gcc/lra-constraints.cc                      |  3 ++-
>>>  gcc/testsuite/gcc.target/powerpc/pr116170.c | 18 ++++++++++++++++++
>>>  2 files changed, 20 insertions(+), 1 deletion(-)
>>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr116170.c
>>>
>>> diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
>>> index 92b343fa99a..024c85c87d9 100644
>>> --- a/gcc/lra-constraints.cc
>>> +++ b/gcc/lra-constraints.cc
>>> @@ -4742,7 +4742,8 @@ curr_insn_transform (bool check_only_p)
>>>         }
>>>       *loc = new_reg;
>>>       if (type != OP_IN
>>> -         && find_reg_note (curr_insn, REG_UNUSED, old) == NULL_RTX)
>>> +         && find_reg_note (curr_insn, REG_UNUSED, old) == NULL_RTX
>>> +         && nonimmediate_operand (old, GET_MODE (old)))
>> 
>> ...IMO it would be better to use CONSTANT_P instead.  It's sometimes
>> useful to accept constants in specific instructions only, meaning that
>> they aren't general enough to be immediate_operands.
>
> OK, will do.  I used "CONSTANT_P" in the 1st draft, but considered it's
> guarding a move, by checking many existing mov optab predicates (they
> only accept general operands) so felt nonimmediate_operand seems to be a
> better fit here.
>
> BR,
> Kewen

Reply via email to