"Kewen.Lin" <[email protected]> writes:
> Hi Richard,
>
> Thanks for the comments!
>
> on 2024/8/12 16:55, Richard Sandiford wrote:
>> "Kewen.Lin" <[email protected]> writes:
>>> Hi,
>>>
>>> Commit r15-2084 exposes one ICE in LRA. Firstly, before
>>> r15-2084 KFmode has 126 bit precision while V1TImode has 128
>>> bit precision, so the subreg (subreg:V1TI (reg:KF 131) 0) is
>>> paradoxical_subreg_p, which stops some passes from doing
>>> some optimization. After r15-2084, KFmode has the same mode
>>> precision as V1TImode, passes are able to optimize more, but
>>> it causes this ICE in LRA as described below:
>>>
>>> For insn 106 (set (mem:V1TI ...) (subreg:V1TI (reg:KF 133) 0)),
>>> which matches pattern
>>>
>>> (define_insn "*vsx_le_perm_store_<mode>"
>>> [(set (match_operand:VSX_LE_128 0 "memory_operand" "=Z,Q")
>>> (match_operand:VSX_LE_128 1 "vsx_register_operand" "+wa,r"))]
>>
>> Is it well-formed to have "+" on an operand that is only semantically
>> an input? df and most passes would consider it to be a pure input
>> and so would disregard any write that is intended to take place.
>>
>> E.g. if we have:
>>
>> (set (reg:M R2) (reg:M R1)) ;; R1 dead
>> (set (reg:M R3) (reg:M R2))
>>
>> then it would be valid to change that to:
>>
>> (set (reg:M R2) (reg:M R1))
>> (set (reg:M R3) (reg:M R1))
>>
>> without considering the "+" on the input to the first instruction.
>
> Good question, I guess the "+" is to match the fact that operand[1]
> can be both read and written by this insn, operand[1] has to be
> re-used as the dest register of vector rotate 64bit (doubleword swap).
> but it'll get swapped back so it's safe if the register is still live
> (like the above example). For this case that writing to but later
> restoring, I'm not sure if we can take it as "no write" (strip "+").
Yeah. AIUI, there is no way of satisfying the constraints in a way
that makes operand 0 overlap operand 1, and:
>
> ;; The post-reload split requires that we re-permute the source
> ;; register in case it is still live.
> (define_split
> [(set (match_operand:VSX_LE_128 0 "memory_operand")
> (match_operand:VSX_LE_128 1 "vsx_register_operand"))]
> "!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed && !TARGET_P9_VECTOR
> && !altivec_indexed_or_indirect_operand (operands[0], <MODE>mode)"
> [(const_int 0)]
> {
> rs6000_emit_le_vsx_permute (operands[1], operands[1], <MODE>mode);
> rs6000_emit_le_vsx_permute (operands[0], operands[1], <MODE>mode);
> rs6000_emit_le_vsx_permute (operands[1], operands[1], <MODE>mode);
would not work correctly for that kind of overlap in any case.
So it looks on the face of it like the pattern would be (more) correct
without the "+".
Thanks,
Richard
> DONE;
> })
>
> CC the author Mike as well.
>
>>
>> But if we do stick with the patch...
>>
>>> "!BYTES_BIG_ENDIAN && TARGET_VSX && !TARGET_P9_VECTOR
>>> && !altivec_indexed_or_indirect_operand (operands[0], <MODE>mode)"
>>> "@
>>> #
>>> #"
>>> [(set_attr "type" "vecstore,store")
>>> (set_attr "length" "12,8")
>>> (set_attr "isa" "<VSisa>,*")])
>>>
>>> LRA makes equivalence substitution on r133 with const double
>>> (const_double:KF 0.0), selects alternative 0 and fixes up
>>> operand 1 for constraint "wa", because operand 1 is OP_INOUT,
>>> so it considers assigning back to it as well, that is:
>>>
>>> lra_emit_move (type == OP_INOUT ? copy_rtx (old) : old, new_reg);
>>>
>>> But because old has been changed to const_double in equivalence
>>> substitution, the move is actually assigning to const_double,
>>> which is invalid and cause ICE.
>>>
>>> Considering reg:KF 133 is equivalent with (const_double:KF 0.0)
>>> even though this operand is OP_INOUT, IMHO there should not be
>>> any following uses of reg:KF 133, otherwise it doesn't have the
>>> chance to be equivalent to (const_double:KF 0.0). From this
>>> perspective, I think we can guard the lra_emit_move with
>>> nonimmediate_operand to exclude such case.
>>>
>>> Does it sound reasonable?
>>>
>>> btw, I also tried with disallowing equivalence substitution with
>>> CONSTANT_P value if the corresponding operand is OP_INOUT or
>>> OP_OUT, it can also fix this issue, but with more thinking it
>>> seems not necessary to stop such substitution if we can handle it
>>> later as above.
>>>
>>> Bootstrapped and regtested on x86_64-redhat-linux and
>>> powerpc64{,le}-linux-gnu.
>>>
>>> BR,
>>> Kewen
>>> -----
>>> PR rtl-optimization/116170
>>>
>>> gcc/ChangeLog:
>>>
>>> * lra-constraints.cc (curr_insn_transform): Don't emit move back to
>>> old operand if it's nonimmediate_operand.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/powerpc/pr116170.c: New test.
>>> ---
>>> gcc/lra-constraints.cc | 3 ++-
>>> gcc/testsuite/gcc.target/powerpc/pr116170.c | 18 ++++++++++++++++++
>>> 2 files changed, 20 insertions(+), 1 deletion(-)
>>> create mode 100644 gcc/testsuite/gcc.target/powerpc/pr116170.c
>>>
>>> diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
>>> index 92b343fa99a..024c85c87d9 100644
>>> --- a/gcc/lra-constraints.cc
>>> +++ b/gcc/lra-constraints.cc
>>> @@ -4742,7 +4742,8 @@ curr_insn_transform (bool check_only_p)
>>> }
>>> *loc = new_reg;
>>> if (type != OP_IN
>>> - && find_reg_note (curr_insn, REG_UNUSED, old) == NULL_RTX)
>>> + && find_reg_note (curr_insn, REG_UNUSED, old) == NULL_RTX
>>> + && nonimmediate_operand (old, GET_MODE (old)))
>>
>> ...IMO it would be better to use CONSTANT_P instead. It's sometimes
>> useful to accept constants in specific instructions only, meaning that
>> they aren't general enough to be immediate_operands.
>
> OK, will do. I used "CONSTANT_P" in the 1st draft, but considered it's
> guarding a move, by checking many existing mov optab predicates (they
> only accept general operands) so felt nonimmediate_operand seems to be a
> better fit here.
>
> BR,
> Kewen