Hi,

This patch set addresses a case seen on ARM when -march= is set to armv6 or above. ARMv6 has a uxtb (unsigned extend byte) instruction, which corresponds to a ZERO_EXTEND rtx and is equivalent to 'and reg, reg, #255'.
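
To make the equivalence concrete, here is a minimal C sketch. The function names are made up for illustration and are not taken from the patch or its testcases. The first two functions compute the same value, one written as a zero-extension of the low byte and one as a mask with 255. The third shows the general shape of a zero-extended byte being compared against zero; which instructions a particular compiler version emits for it is not something this sketch asserts.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend the low byte: the ZERO_EXTEND form (uxtb on ARMv6+). */
static uint32_t zero_extend_byte(uint32_t x)
{
    return (uint8_t)x;
}

/* Mask the low byte: the AND form ('and reg, reg, #255'). */
static uint32_t and_255(uint32_t x)
{
    return x & 255u;
}

/* Hypothetical example: a zero-extended byte compared against zero. */
static int low_byte_is_zero(uint32_t x)
{
    return (uint8_t)x == 0;
}

/* Check that both forms agree over a spread of sample values. */
static void check_equivalence(void)
{
    for (uint32_t x = 0; x < 0x20000u; x += 0x101u)
        assert(zero_extend_byte(x) == and_255(x));
}
```

Calling check_equivalence() should pass silently, since both forms keep only bits 0-7 of the input.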

The problem with 'uxtb' is that, unlike 'and', it does not have an 's' (condition-flag-setting) form. This means that for -march=armv5 and earlier, and+cmp can be combined into 'ands', while under armv6/v7 uxtb+cmp currently ends up as separate insns, causing a performance regression.

A possible solution is to add patterns in the ARM backend to match the ZERO_EXTEND compares, and fake them as 'ands'. However, I think a better solution would be to give the backend a chance in combine to canonicalize the compare insn operands, e.g. for ARM here, a chance to change (zero_extend:SI (subreg:QI (reg:SI ...))) into (and:SI (reg:SI) 255), so that the existing patterns match.

And to those wondering, yes, this is a specific effort towards the crcu8() routine in CoreMark :)

This patch is not too large, but I split it into 3 parts to be clearer. Descriptions are in the following mails.

The entire patch set has been bootstrapped and regtested for i686 and x86_64 without regressions. ARM cross-testing also completed without regressions, and a native bootstrap test is currently running.

Thanks,
Chung-Lin

2011-04-22  Chung-Lin Tang  <clt...@codesourcery.com>

	* combine.c (simplify_comparison): Abstract out parts into...
	(simplify_compare_const): ... new function.
	(try_combine): Generalize parallel arithmetic/compare combining
	to call simplify_compare_const() and CANONICALIZE_COMPARE().
	* config/arm/arm.c (arm_canonicalize_comparison): Add case to
	canonicalize left operand from ZERO_EXTEND to AND.

	testsuite/
	* gcc.target/arm/combine-movs.c: New.
	* gcc.target/arm/unsigned-extend-2.c: New.