Hello!

>> I also wonder if compare-elim ought to be helping here.  Isn't that the
>> point here, to eliminate the comparison and instead get it for free as
>> part of the arithmetic?  If so, is it the fact that we have memory
>> references that prevents compare-elim from kicking in?
>
> Yes, compare-elim doesn't work with memory references but, more radically, it
> is not enabled for x86 (it is only enabled for aarch64, mn10300 and rx).

I did experiment a bit with a compare-elim pass on x86. However, as
rth said in [1]:

--quote--
If we want to use this pass for x86, then for 4.8 we should also fix the
discrepancy between the compare-elim canonical

  [(operate)
   (set-cc)]

and the combine canonical

  [(set-cc)
   (operate)]

(Because of the simplicity of the substitution in compare-elim, I prefer
the former as the canonical canonical.)
--/quote--

There were some patches floating around [2], [3] that enhanced the
compare-elim pass for x86 needs, but the target never switched to the
new pass, mostly because compare-elim did not catch all the cases that
the traditional RTX combine pass did. However, the compare-elim pass can
cross BB boundaries, where traditional RTX combine doesn't (and IIRC it
even has a comment why it doesn't try too hard to do so).

The reason x86 doesn't use both passes is simply the discrepancy
quoted above. The compare-elim pass substitutes the clobber in the
PARALLEL RTX with a new set-cc in-place, so all relevant patterns in
i386.md (and a couple of support functions in i386.c) would have to be
swapped around to the [(operate)(set-cc)] order. Unfortunately, simply
changing the i386.md insn patterns would disable existing RTX combiner
functionality, leading to various missed-optimization regressions.
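
To make the ordering problem concrete, here is a rough sketch of a
flag-setting add in schematic RTL. It is not copied from i386.md; the
modes, the register numbers and the FLAGS_REG name are only
illustrative assumptions.

  ;; arithmetic insn with the flags clobbered
  [(set (reg:SI 0) (plus:SI (reg:SI 1) (reg:SI 2)))
   (clobber (reg:CC FLAGS_REG))]

  ;; compare-elim canonical: the clobber is replaced in-place
  [(set (reg:SI 0) (plus:SI (reg:SI 1) (reg:SI 2)))
   (set (reg:CCZ FLAGS_REG)
        (compare:CCZ (plus:SI (reg:SI 1) (reg:SI 2))
                     (const_int 0)))]

  ;; combine canonical: the set-cc comes first
  [(set (reg:CCZ FLAGS_REG)
        (compare:CCZ (plus:SI (reg:SI 1) (reg:SI 2))
                     (const_int 0)))
   (set (reg:SI 0) (plus:SI (reg:SI 1) (reg:SI 2)))]

The second and third forms describe the same operation, but a pattern
written in one order will not match an insn generated in the other.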

Due to the above, I would like to propose that the existing RTX combine
pass be updated to handle [(operate)(set-cc)] patterns (exclusively?).
From my experience, a post-reload compare-elim pass would catch a bunch
of remaining cross-BB opportunities left by the RTX combine pass, so
compare-elim would be effective on x86 even after the RTX combiner
does its job. While the target-dependent changes would be fairly
trivial, I don't know about the amount of work in combine.c to handle
the new canonical patterns. Maybe the RTL maintainer can chime in (hint,
hint, wink wink ;)

There is also a hidden benefit for the "other", compare-elim-only
targets. Having this pass enabled on a wildly popular target would help
catch eventual bugs in the pass.

[1] https://gcc.gnu.org/ml/gcc-patches/2012-02/msg00251.html
[2] https://gcc.gnu.org/ml/gcc-patches/2012-02/msg00466.html
[3] https://gcc.gnu.org/ml/gcc-patches/2012-04/msg01487.html

Uros.
