On Sun, Nov 24, 2024 at 10:02:22PM +0100, Uros Bizjak wrote:
>     PR target/36503
> 
> gcc/ChangeLog:
> 
>     * config/i386/i386.md (*ashl<mode>3_negcnt):
>     New define_insn_and_split pattern.
>     (*ashl<mode>3_negcnt_1): Ditto.
>     (*<insn><mode>3_negcnt): Ditto.
>     (*<insn><mode>3_negcnt_1): Ditto.
> 
> gcc/testsuite/ChangeLog:
> 
>     * gcc.target/i386/pr36503-1.c: New test.
>     * gcc.target/i386/pr36503-2.c: New test.

> +(define_insn_and_split "*ashl<mode>3_negcnt"
> +  [(set (match_operand:SWI48 0 "nonimmediate_operand")
> +     (ashift:SWI48
> +       (match_operand:SWI48 1 "nonimmediate_operand")
> +       (subreg:QI
> +         (minus
> +           (match_operand 3 "const_int_operand")
> +           (match_operand 2 "int248_register_operand" "c,r")) 0)))
> +   (clobber (reg:CC FLAGS_REG))]
> +  "ix86_binary_operator_ok (ASHIFT, <MODE>mode, operands)
> +   && INTVAL (operands[3]) == <MODE_SIZE> * BITS_PER_UNIT

Any reason for an exact comparison rather than
  && (INTVAL (operands[3]) & (<MODE_SIZE> * BITS_PER_UNIT - 1)) == 0
?
I mean, this way we can optimize 1U << (32 - x) or
1U << (1504 - x) or subtraction from any other multiple of 32.
Similarly, we can optimize 1U << (32 + x) to 1U << x and
again do that for any other multiple of 32.
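
For illustration only (not part of the patch): the equivalence comes down
to the shift count being taken modulo the operand width, which x86 does in
hardware for variable 32-bit shifts.  A minimal self-contained check, with
an explicit "& 31" so the C-level shifts stay well defined:

#include <assert.h>

unsigned int shl_sub (unsigned int x) { return 1U << ((32 - x) & 31); }
unsigned int shl_sub_big (unsigned int x) { return 1U << ((1504 - x) & 31); }
unsigned int shl_neg (unsigned int x) { return 1U << ((0U - x) & 31); }
unsigned int shl_add (unsigned int x) { return 1U << ((32 + x) & 31); }

int
main (void)
{
  for (unsigned int x = 0; x < 64; x++)
    {
      assert (shl_sub (x) == shl_neg (x));      /* 32 - x   == -x (mod 32) */
      assert (shl_sub_big (x) == shl_neg (x));  /* 1504 - x == -x (mod 32), 1504 = 47 * 32 */
      assert (shl_add (x) == 1U << (x & 31));   /* 32 + x   ==  x (mod 32) */
    }
  return 0;
}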

        Jakub
