On Wed, 2020-02-12 at 07:47 +0100, Hans-Peter Nilsson wrote:
> I just rebased and updated the vendors/axis branch
> axis/cris-decc0 with the following commits, which should bring
> back compare-elimination results to that of cc0 on master.
> 
> With the exception of the bit-test patterns (btst / btstq which
> is more of a "combine" matter), everything is centered around
> working together with the "cmpelim" pass with the help of
> define_subst attributes.  Regression test-cases have already
> been committed to master (the recently committed pr93372-*
> tests), covering all patterns but not all CCmodes or conditions.
> All patches regtested for cris-elf, at a smaller granularity
> than these partially squashed commits, but naturally with
> regressions for the pr93372-* testcases until the last one of
> these commits.
> 
> No performance tests yet though, but I expect axis/cris-decc0 to
> be a win over master, since as I've mentioned before, I see
> improvements in register-allocation already in libgcc, which
> should get back what's lost in all the special patterns I
> deleted.  I haven't looked into the cause, but it shouldn't
> surprise anyone that there's some noticeable goodies inside
> something to the effect of #ifndef HAVE_cc0, even with IRA.
> (Conversion to LRA is way down on the TODO list.)
> 
> It's a bit unfortunate that so many pattern names are now
> obfuscated with the define_subst_attr attributes (like
> "<acc><anz><anzvc>zero_extend<mode>si2<setcc><setnz><setnzvc>"
> instead of "zero_extend<mode>si2"), but I'll take that single
> line change in patterns over duplicated or triplicated patterns.
FWIW, I'm evaluating the converted H8 on/off.  In general it looks to be a wash
there.  There are a few cases where we're doing better, possibly because I've
actually improved the precision of condition code tracking in various patterns
and done some other simplifications along the way.

The H8 is a type-2 port.  It's easiest to think of it as everything clobbering
the condition codes, even most moves.

The H8 also doesn't perform variable or multi-position shifts -- and the
shifting patterns sometimes need scratch registers.  So at expand time we
inject a (clobber (match_scratch ...)) expression.  Then post-reload splitting
adds the clobber of the condition code register, resulting in two clobbers on
all the shift insns.

As it turns out cmp-elim won't handle that.  So I improved some of the H8
expanders to generate simpler RTL when we know at expansion time that the
scratch won't be needed.  That's allowing cmp-elim to do a reasonable job
exploiting the condition codes set by the shift/rotate insns.  But it also
allows fwprop to make a trivial improvement on some tests.  That trivial
improvement from fwprop in turn hinders combine and can occasionally cause us
to fail to narrow certain shift constructs from HI to QI modes.

Anyway, I mostly mention it because of the multi-clobber problem.  If your
CC_REG clobbering insn has more clobbers than just the CC register, then it
won't be used to eliminate comparisons.
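
Roughly, and with made-up modes and register numbers rather than anything
copied from the H8 machine description, the two insn shapes look something
like the sketch below.  The (reg:QI 3) stands in for whatever hard register
the match_scratch ended up as after reload.

```
;; One SET plus a single clobber of the CC register:
;; the shape cmp-elim is able to work with.
(parallel [(set (reg:HI 0)
                (ashift:HI (reg:HI 0) (const_int 1)))
           (clobber (reg:CC CC_REG))])

;; Same kind of shift, but with the scratch clobber from expand still
;; present next to the CC clobber: two clobbers, so cmp-elim passes it by.
(parallel [(set (reg:HI 0)
                (ashift:HI (reg:HI 0) (const_int 4)))
           (clobber (reg:QI 3))
           (clobber (reg:CC CC_REG))])
```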

Jeff
