On Mon, Nov 25, 2024 at 12:10 PM Jakub Jelinek <ja...@redhat.com> wrote:
>
> On Mon, Nov 25, 2024 at 11:52:31AM +0100, Uros Bizjak wrote:
> > > Any reason for an exact comparison rather than
> > >   && (INTVAL (operands[3]) & (<MODE_SIZE> * BITS_PER_UNIT - 1)) == 0
> > > ?
> > > I mean, we can optimize this way 1U << (32 - x) or
> > > 1U << (1504 - x) or any other multiply of 32.
> >
> > Count values outside of [0, bitwidth) are undefined in general. Also,
>
> Sure, e.g. the 1504 - x case will be undefined if x is not in [1473, 1504]
> But it very well could be in that range.
>
> What I'm arguing about is that changing the test (in all the new patterns)
> will not noticeably slow compilation down and be more general.
>
> > during the bootstrap only values of 32 or 64 were found, so I thought
> > to not complicate the condition too much
> >
> > FYI, the conversion triggers 504 times during the bootstrap, I somehow
> > expected a lower number.
> >
> > > Similarly, we can optimize 1U << (32 + x) to 1U << x and
> > > again do that for any other multiplies of 32.
> >
> > I don't think anybody uses the above idiom, it is valid only for
> > certain targets (x86 and the ones that mask count argument), but
> > undefined in general.
>
> No, it is again valid whenever x is in [-32, -1] range, which it very well
> could be (or if x is unsigned, in [-32U, -1U] range).
> Of course if it is rare, adding new patterns for it might not be worth it.

Huh, I was wrong. There are 356 cases where 1U << (32 + x) to 1U << x
optimization triggers during the bootstrap.

I am testing the attached patch.

Thanks,
Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index df78e4df9d8..788d2555a1a 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -15896,7 +15896,53 @@ (define_insn_and_split "*ashl<mode>3_mask_1"
   ""
   [(set_attr "isa" "*,bmi2")])
 
-(define_insn_and_split "*ashl<mode>3_negcnt"
+(define_insn_and_split "*ashl<mode>3_add"
+  [(set (match_operand:SWI48 0 "nonimmediate_operand")
+       (ashift:SWI48
+         (match_operand:SWI48 1 "nonimmediate_operand")
+         (subreg:QI
+           (plus
+             (match_operand 2 "int248_register_operand" "c,r")
+             (match_operand 3 "const_int_operand")) 0)))
+   (clobber (reg:CC FLAGS_REG))]
+  "ix86_binary_operator_ok (ASHIFT, <MODE>mode, operands)
+   && (INTVAL (operands[3]) & (<MODE_SIZE> * BITS_PER_UNIT - 1)) == 0
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(parallel
+     [(set (match_dup 0)
+          (ashift:SWI48 (match_dup 1)
+                        (match_dup 2)))
+      (clobber (reg:CC FLAGS_REG))])]
+{
+  operands[2] = force_reg (GET_MODE (operands[2]), operands[2]);
+  operands[2] = gen_lowpart (QImode, operands[2]);
+}
+  [(set_attr "isa" "*,bmi2")])
+
+(define_insn_and_split "*ashl<mode>3_add_1"
+  [(set (match_operand:SWI48 0 "nonimmediate_operand")
+       (ashift:SWI48
+         (match_operand:SWI48 1 "nonimmediate_operand")
+         (plus:QI
+           (match_operand:QI 2 "register_operand" "c,r")
+           (match_operand:QI 3 "const_int_operand"))))
+   (clobber (reg:CC FLAGS_REG))]
+  "ix86_binary_operator_ok (ASHIFT, <MODE>mode, operands)
+   && (INTVAL (operands[3]) & (<MODE_SIZE> * BITS_PER_UNIT - 1)) == 0
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(parallel
+     [(set (match_dup 0)
+          (ashift:SWI48 (match_dup 1)
+                        (match_dup 2)))
+      (clobber (reg:CC FLAGS_REG))])]
+  ""
+  [(set_attr "isa" "*,bmi2")])
+
+(define_insn_and_split "*ashl<mode>3_sub"
   [(set (match_operand:SWI48 0 "nonimmediate_operand")
        (ashift:SWI48
          (match_operand:SWI48 1 "nonimmediate_operand")
@@ -15927,7 +15973,7 @@ (define_insn_and_split "*ashl<mode>3_negcnt"
 }
   [(set_attr "isa" "*,bmi2")])
 
-(define_insn_and_split "*ashl<mode>3_negcnt_1"
+(define_insn_and_split "*ashl<mode>3_sub_1"
   [(set (match_operand:SWI48 0 "nonimmediate_operand")
        (ashift:SWI48
          (match_operand:SWI48 1 "nonimmediate_operand")
@@ -16678,7 +16724,53 @@ (define_insn_and_split "*<insn><mode>3_mask_1"
   ""
   [(set_attr "isa" "*,bmi2")])
 
-(define_insn_and_split "*<insn><mode>3_negcnt"
+(define_insn_and_split "*<insn><mode>3_add"
+  [(set (match_operand:SWI48 0 "nonimmediate_operand")
+       (any_shiftrt:SWI48
+         (match_operand:SWI48 1 "nonimmediate_operand")
+         (subreg:QI
+           (plus
+             (match_operand 2 "int248_register_operand" "c,r")
+             (match_operand 3 "const_int_operand")) 0)))
+   (clobber (reg:CC FLAGS_REG))]
+  "ix86_binary_operator_ok (ASHIFT, <MODE>mode, operands)
+   && (INTVAL (operands[3]) & (<MODE_SIZE> * BITS_PER_UNIT - 1)) == 0
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(parallel
+     [(set (match_dup 0)
+          (any_shiftrt:SWI48 (match_dup 1)
+                             (match_dup 2)))
+      (clobber (reg:CC FLAGS_REG))])]
+{
+  operands[2] = force_reg (GET_MODE (operands[2]), operands[2]);
+  operands[2] = gen_lowpart (QImode, operands[2]);
+}
+  [(set_attr "isa" "*,bmi2")])
+
+(define_insn_and_split "*<insn><mode>3_add_1"
+  [(set (match_operand:SWI48 0 "nonimmediate_operand")
+       (any_shiftrt:SWI48
+         (match_operand:SWI48 1 "nonimmediate_operand")
+         (plus:QI
+           (match_operand:QI 2 "register_operand" "c,r")
+           (match_operand:QI 3 "const_int_operand"))))
+   (clobber (reg:CC FLAGS_REG))]
+  "ix86_binary_operator_ok (<CODE>, <MODE>mode, operands)
+   && (INTVAL (operands[3]) & (<MODE_SIZE> * BITS_PER_UNIT - 1)) == 0
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(parallel
+     [(set (match_dup 0)
+          (any_shiftrt:SWI48 (match_dup 1)
+                             (match_dup 2)))
+      (clobber (reg:CC FLAGS_REG))])]
+  ""
+  [(set_attr "isa" "*,bmi2")])
+
+(define_insn_and_split "*<insn><mode>3_sub"
   [(set (match_operand:SWI48 0 "nonimmediate_operand")
        (any_shiftrt:SWI48
          (match_operand:SWI48 1 "nonimmediate_operand")
@@ -16709,7 +16801,7 @@ (define_insn_and_split "*<insn><mode>3_negcnt"
 }
   [(set_attr "isa" "*,bmi2")])
 
-(define_insn_and_split "*<insn><mode>3_negcnt_1"
+(define_insn_and_split "*<insn><mode>3_sub_1"
   [(set (match_operand:SWI48 0 "nonimmediate_operand")
        (any_shiftrt:SWI48
          (match_operand:SWI48 1 "nonimmediate_operand")

Reply via email to