Re: [PATCH v3 1/4] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

Richard Sandiford Fri, 20 Oct 2023 02:16:27 -0700

Thanks for the context.

Robin Dapp <[email protected]> writes:
>> Sorry for the slow review.  TBH I was hoping someone else would pick
>> it up, since (a) I'm not very familiar with this code, and (b) I don't
>> really agree with the way that the current code works.  I'm not sure the
>> current dependency checking is safe, so I'm nervous about adding even
>> more cases to it.  And it feels like the different ifcvt techniques are
>> not now well distinguished, so that they're beginning to overlap and
>> compete with one another.  None of that is your fault, of course. :)
>
> I might be to blame, at least partially :)  The idea back then was to
> do it here because (1) it can handle cases the other part cannot and
> (2) its costing is based on sequence cost rather than just counting
> instructions.


Ah, OK.  (2) seems like a good reason.

>> This condition looks odd.  A SET_SRC is never null.  But more importantly,
>> continuing means "assume the best", and I don't think we should assume
>> the best for (say) an insn with two parallel sets.
>
> IIRC I didn't consider two parallel sets at all :/
>
>> But I'm not sure which cases this code is trying to catch.  Is it trying
>> to catch cases where seq2 "spontaneously" uses registers that happen to
>> overlap with cond?  If so, then when does that happen?  And if it does
>> happen, wouldn't the sequence also have to set the registers first?
>
> In order for sequence costing to be superior to just counting "conditional"
> instructions we need to make sure that as few redundant instructions as
> possible are present in the costed sequences. (redundant as in "will be
> removed in a subsequent pass").
>  
> So, originally, the backend would always be passed a comparison
> (set cc (cmp a b)).  On e.g. s390 this would result in at least two
> redundant instructions per conditional move so costing was always wrong.
> When passing the backend the comparison "result" e.g. (cmp cc 0)
> there is no redundant instruction.
>
> Now, passing the comparison is useful as well when we want to turn
> a conditional move into a min/max in the backend.  This, however,
> overwrites the initial condition result and any subsequent iterations
> must not pass the result but the comparison itself to the backend.

Yeah, makes sense.  Using your example, there seem to be two things
that we're checking:

(1) Does the sequence change cc?  This is checked via:

      if (cc_cmp)
        {
          /* Check if SEQ can clobber registers mentioned in
             cc_cmp and/or rev_cc_cmp.  If yes, we need to use
             only seq1 from that point on.  */
          rtx cc_cmp_pair[2] = { cc_cmp, rev_cc_cmp };
          for (walk = seq; walk; walk = NEXT_INSN (walk))
            {
              note_stores (walk, check_for_cc_cmp_clobbers, cc_cmp_pair);
              if (cc_cmp_pair[0] == NULL_RTX)
                {
                  cc_cmp = NULL_RTX;
                  rev_cc_cmp = NULL_RTX;
                  break;
                }
            }
        }

    and is the case that Manolis's patch is dealing with.

(2) Does the sequence use a and b?  If so, we need to use temporary
    destinations for any earlier writes to a and b.

Is that right?

(1) looks OK, although Manolis's modified_in_p would work too.
(2) is the code I quoted yesterday and is the part I'm not sure
about.  First of all:

      seq1 = try_emit_cmove_seq (if_info, temp, cond,
                                 new_val, old_val, need_cmov,
                                 &cost1, &temp_dest1);

must have a consistent view of what a and b are.  So old_val and new_val
cannot at this point reference "newer" values of a and b (as set by previous
instructions in the sequence).  AIUI:

      if (int *ii = rewired_src->get (insn))
        new_val = simplify_replace_rtx (new_val, (*targets)[*ii],
                                        (*temporaries)[*ii]);

is the code that ensures this.  If a previous write to a and b has been
replaced by a temporary, this code substitutes that temporary into new_val.

The same cond, new_val and old_val are used in:

        seq2 = try_emit_cmove_seq (if_info, temp, cond,
                                   new_val, old_val, need_cmov,
                                   &cost2, &temp_dest2, cc_cmp, rev_cc_cmp);

So won't any use of a and b in seq2 to be from cond, rather than old_val
and new_val?  If so, shouldn't we set read_comparison for any use of a
and b, rather than skipping IF_THEN_ELSE?

> In my first approach I hadn't considered several special cases which,
> of course, complicated things further down the line.
>
> All that said - maybe trying hard to get rid of later-redundant insns
> is not a good approach after all?  A lot of complexity would go away
> if we counted emitted conditional instructions just like the other
> ifcvt part.  Maybe a middle ground would be to cost the initial
> sequence as well as if + jmp and compare this against insn_cost of
> only the created conditional insns?

Using seq_cost seems like the best way of costing things.  And I agree
that it's worth trying to avoid costing (and generating) redundant
instructions.

> In that case we might need to tighten/audit our preconditions in order
> not to accidentally allow cases that result in strictly worse code.
> But maybe that's an easier problem than what we are currently solving?

Not sure :)

Thanks,
Richard

Re: [PATCH v3 1/4] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets

Reply via email to