Hi Richard,

Thanks for the comments!

on 2024/8/12 16:55, Richard Sandiford wrote:
> "Kewen.Lin" <li...@linux.ibm.com> writes:
>> Hi,
>>
>> Commit r15-2084 exposes one ICE in LRA.  Firstly, before
>> r15-2084 KFmode has 126 bit precision while V1TImode has 128
>> bit precision, so the subreg (subreg:V1TI (reg:KF 131) 0) is
>> paradoxical_subreg_p, which stops some passes from doing
>> some optimization.  After r15-2084, KFmode has the same mode
>> precision as V1TImode, passes are able to optimize more, but
>> it causes this ICE in LRA as described below:
>>
>> For insn 106 (set (mem:V1TI ...) (subreg:V1TI (reg:KF 133) 0)),
>> which matches pattern
>>
>> (define_insn "*vsx_le_perm_store_<mode>"
>>   [(set (match_operand:VSX_LE_128 0 "memory_operand" "=Z,Q")
>>         (match_operand:VSX_LE_128 1 "vsx_register_operand" "+wa,r"))]
> 
> Is it well-formed to have "+" on an operand that is only semantically
> an input?  df and most passes would consider it to be a pure input
> and so would disregard any write that is intended to take place.
> 
> E.g. if we have:
> 
>   (set (reg:M R2) (reg:M R1)) ;; R1 dead
>   (set (reg:M R3) (reg:M R2))
> 
> then it would be valid to change that to:
> 
>   (set (reg:M R2) (reg:M R1))
>   (set (reg:M R3) (reg:M R1))
> 
> without considering the "+" on the input to the first instruction.

Good question, I guess the "+" is to match the fact that operand[1]
can be both read and written by this insn, operand[1] has to be
re-used as the dest register of vector rotate 64bit (doubleword swap).
but it'll get swapped back so it's safe if the register is still live
(like the above example).  For this case that writing to but later
restoring, I'm not sure if we can take it as "no write" (strip "+").

;; The post-reload split requires that we re-permute the source
;; register in case it is still live.
(define_split
  [(set (match_operand:VSX_LE_128 0 "memory_operand")
        (match_operand:VSX_LE_128 1 "vsx_register_operand"))]
  "!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed && !TARGET_P9_VECTOR
   && !altivec_indexed_or_indirect_operand (operands[0], <MODE>mode)"
  [(const_int 0)]
{
  rs6000_emit_le_vsx_permute (operands[1], operands[1], <MODE>mode);
  rs6000_emit_le_vsx_permute (operands[0], operands[1], <MODE>mode);
  rs6000_emit_le_vsx_permute (operands[1], operands[1], <MODE>mode);
  DONE;
})

CC the author Mike as well.

> 
> But if we do stick with the patch...
> 
>>   "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR
>>    && !altivec_indexed_or_indirect_operand (operands[0], <MODE>mode)"
>>   "@
>>    #
>>    #"
>>   [(set_attr "type" "vecstore,store")
>>    (set_attr "length" "12,8")
>>    (set_attr "isa" "<VSisa>,*")])
>>
>> LRA makes equivalence substitution on r133 with const double
>> (const_double:KF 0.0), selects alternative 0 and fixes up
>> operand 1 for constraint "wa", because operand 1 is OP_INOUT,
>> so it considers assigning back to it as well, that is:
>>
>>   lra_emit_move (type == OP_INOUT ? copy_rtx (old) : old, new_reg);
>>
>> But because old has been changed to const_double in equivalence
>> substitution, the move is actually assigning to const_double,
>> which is invalid and cause ICE.
>>
>> Considering reg:KF 133 is equivalent with (const_double:KF 0.0)
>> even though this operand is OP_INOUT, IMHO there should not be
>> any following uses of reg:KF 133, otherwise it doesn't have the
>> chance to be equivalent to (const_double:KF 0.0).  From this
>> perspective, I think we can guard the lra_emit_move with
>> nonimmediate_operand to exclude such case.
>>
>> Does it sound reasonable?
>>
>> btw, I also tried with disallowing equivalence substitution with
>> CONSTANT_P value if the corresponding operand is OP_INOUT or
>> OP_OUT, it can also fix this issue, but with more thinking it
>> seems not necessary to stop such substitution if we can handle it
>> later as above.
>>
>> Bootstrapped and regtested on x86_64-redhat-linux and
>> powerpc64{,le}-linux-gnu.
>>
>> BR,
>> Kewen
>> -----
>>      PR rtl-optimization/116170
>>
>> gcc/ChangeLog:
>>
>>      * lra-constraints.cc (curr_insn_transform): Don't emit move back to
>>      old operand if it's nonimmediate_operand.
>>
>> gcc/testsuite/ChangeLog:
>>
>>      * gcc.target/powerpc/pr116170.c: New test.
>> ---
>>  gcc/lra-constraints.cc                      |  3 ++-
>>  gcc/testsuite/gcc.target/powerpc/pr116170.c | 18 ++++++++++++++++++
>>  2 files changed, 20 insertions(+), 1 deletion(-)
>>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr116170.c
>>
>> diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
>> index 92b343fa99a..024c85c87d9 100644
>> --- a/gcc/lra-constraints.cc
>> +++ b/gcc/lra-constraints.cc
>> @@ -4742,7 +4742,8 @@ curr_insn_transform (bool check_only_p)
>>          }
>>        *loc = new_reg;
>>        if (type != OP_IN
>> -          && find_reg_note (curr_insn, REG_UNUSED, old) == NULL_RTX)
>> +          && find_reg_note (curr_insn, REG_UNUSED, old) == NULL_RTX
>> +          && nonimmediate_operand (old, GET_MODE (old)))
> 
> ...IMO it would be better to use CONSTANT_P instead.  It's sometimes
> useful to accept constants in specific instructions only, meaning that
> they aren't general enough to be immediate_operands.

OK, will do.  I used "CONSTANT_P" in the 1st draft, but considered it's
guarding a move, by checking many existing mov optab predicates (they
only accept general operands) so felt nonimmediate_operand seems to be a
better fit here.

BR,
Kewen

Reply via email to