Hi,
this patch set solves a case seen on ARM when -march= is set to armv6 or
above. ARMv6 has a uxtb (unsigned extend byte) instruction,
corresponding to a ZERO_EXTEND rtx, which is equivalent to 'and reg,
reg, #255'.
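
To make the equivalence concrete, here is a minimal C sketch. The
helper names are made up for illustration and are not taken from GCC.
It shows that zero-extending the low byte of a 32-bit value gives the
same result as masking it with 255:

```c
#include <stdint.h>

/* Hypothetical helper: what a ZERO_EXTEND of the low byte computes,
   i.e. the operation uxtb performs.  */
uint32_t zero_extend_byte(uint32_t x)
{
    return (uint32_t)(uint8_t)x;
}

/* Hypothetical helper: the 'and reg, reg, #255' formulation.  */
uint32_t and_255(uint32_t x)
{
    return x & 255u;
}
```

Both functions return the same value for every 32-bit input, which is
why either form can stand in for the other.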

The problem with 'uxtb' is that, unlike 'and', it does not have an 's'
(condition-flag-setting) form. This means that for -march=armv5 and
earlier, and+cmp can be combined into 'ands', while under armv6/v7
uxtb+cmp currently ends up as separate insns, causing a performance
regression.

A possible solution is to add patterns in the ARM backend to match the
ZERO_EXTEND compares, and fake them as 'ands'. However, I think a better
solution would be to give the backend a chance, within combine, to
canonicalize the compare insn operands. For ARM here, that means a
chance to change (zero_extend:SI (subreg:QI (reg:SI ...))) into
(and:SI (reg:SI) 255), so that the existing patterns match.
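
The rewrite itself can be sketched on a toy expression tree. This is
only an illustration of the idea: the node type, the enum values and
toy_canonicalize() are all hypothetical, and none of this is GCC's
actual rtx representation or the code in the patch.

```c
#include <stddef.h>

/* Toy expression codes; these are NOT GCC's rtx codes.  */
enum toy_code {
    TOY_REG,
    TOY_CONST,
    TOY_SUBREG_QI,
    TOY_ZERO_EXTEND_SI,
    TOY_AND_SI
};

/* Toy expression node (hypothetical, for illustration only).  */
struct toy_expr {
    enum toy_code code;
    struct toy_expr *op0;   /* first operand, or NULL */
    struct toy_expr *op1;   /* second operand, or NULL */
    long value;             /* register number or constant value */
};

/* Rewrite (zero_extend:SI (subreg:QI (reg))) in place into
   (and:SI (reg) 255), using caller-supplied storage for the constant
   node.  Returns 1 if the node was rewritten, 0 if it did not match.  */
int toy_canonicalize(struct toy_expr *e, struct toy_expr *const_storage)
{
    if (e == NULL || e->code != TOY_ZERO_EXTEND_SI)
        return 0;
    if (e->op0 == NULL || e->op0->code != TOY_SUBREG_QI)
        return 0;
    if (e->op0->op0 == NULL || e->op0->op0->code != TOY_REG)
        return 0;

    const_storage->code = TOY_CONST;
    const_storage->op0 = NULL;
    const_storage->op1 = NULL;
    const_storage->value = 255;

    e->code = TOY_AND_SI;
    e->op0 = e->op0->op0;   /* strip the subreg, keep the inner reg */
    e->op1 = const_storage;
    return 1;
}
```

Once the operand has the AND shape, a matcher that only knows about
and+compare needs no extra ZERO_EXTEND case.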

And to those wondering, yes this is a specific effort towards the
crcu8() routine in CoreMark :)

This patch is not too large, but I split it into 3 parts to be clearer.
Descriptions are in the following mails.

The entire patch set has been bootstrapped and regtested for i686 and
x86_64 without regressions. ARM cross-testing also completed without
regressions, and a native bootstrap test is currently in progress.

Thanks,
Chung-Lin

2011-04-22  Chung-Lin Tang  <clt...@codesourcery.com>

        * combine.c (simplify_comparison): Abstract out parts into...
        (simplify_compare_const): ... new function.
        (try_combine): Generalize parallel arithmetic/compare
        combining to call simplify_compare_const() and
        CANONICALIZE_COMPARISON().
        * config/arm/arm.c (arm_canonicalize_comparison): Add case to
        canonicalize left operand from ZERO_EXTEND to AND.

        testsuite/
        * gcc.target/arm/combine-movs.c: New.
        * gcc.target/arm/unsigned-extend-2.c: New.
