Hi!

[ Please CC: me on rs6000 patches.  Thanks! ]

On Tue, Sep 29, 2020 at 12:26:28PM +0200, Eric Botcazou wrote:
> and the masking is present all the way down to the assembly at -O2:
> 
>         rlwinm 4,4,0,27,31
>         rotlw 3,3,4
> 
> Now this masking is redundant since it's done by the hardware so it would be 
> nice to get rid of it.  I have attached a couple of patches to that effect: 
> the first one adds new instructions while the second one only adds splitters.

>       * config/rs6000/rs6000.md (*rotl<mode>3_mask): New.
>       (*rotlsi3_mask_64): Likewise.
>       (*rotl<mode>3_dot): Change to use P mode iterator.
>       (*rotl<mode>3_mask_dot): New.
>       (*rotl<mode>3_dot2): Change to use P mode iterator.
>       (*rotl<mode>3_mask_dot2): New.

Don't call it "mask" please: *all* of our basic rotate instructions
already have something called "mask" (that is the "m" in "rlwnm" for
example; and "rotlw d,a,b" is just an extended mnemonic for
"rlwnm d,a,b,0,31").  The hardware also does not mask the shift amount
at all (instead, only the low 5 bits of RB *are* the shift amount).

> +;; Avoid useless masking of count operand

(Sentences end in a full stop.)

> +(define_insn "*rotl<mode>3_mask"
> +  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
> +     (rotate:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
> +                 (and:GPR (match_operand:GPR 2 "gpc_reg_operand" "r")
> +                          (match_operand:GPR 3 "const_int_operand" "n"))))]
> +  "(UINTVAL (operands[3]) & (GET_MODE_BITSIZE (<MODE>mode) - 1))
> +   == (unsigned HOST_WIDE_INT) (GET_MODE_BITSIZE (<MODE>mode) - 1)"

(Useless casts are useless.)

Don't mask operands[3] please (in the UINTVAL): RTL with the number
outside of that range is *undefined*.  So just check that it is equal?

>  (define_insn_and_split "*rotl<mode>3_dot"
>    [(set (match_operand:CC 3 "cc_reg_operand" "=x,?y")
> -     (compare:CC (rotate:GPR (match_operand:GPR 1 "gpc_reg_operand" "r,r")
> -                             (match_operand:SI 2 "reg_or_cint_operand" 
> "rn,rn"))
> +     (compare:CC (rotate:P (match_operand:P 1 "gpc_reg_operand" "r,r")
> +                           (match_operand:SI 2 "reg_or_cint_operand" 
> "rn,rn"))
>                   (const_int 0)))
> -   (clobber (match_scratch:GPR 0 "=r,r"))]
> -  "<MODE>mode == Pmode"
> +   (clobber (match_scratch:P 0 "=r,r"))]
> +  ""
>    "@
>     rotl<wd>%I2. %0,%1,%<hH>2
>     #"
> -  "&& reload_completed && cc_reg_not_cr0_operand (operands[3], CCmode)"
> +  "reload_completed && cc_reg_not_cr0_operand (operands[3], CCmode)"
>    [(set (match_dup 0)
> -     (rotate:GPR (match_dup 1)
> -                 (match_dup 2)))
> +     (rotate:P (match_dup 1)
> +               (match_dup 2)))
>     (set (match_dup 3)
>       (compare:CC (match_dup 0)
>                   (const_int 0)))]

Why?  This diverges the "dot" version from the non-dot for no reason.

(We can do 32-bit rotates on 64-bit implementations just fine, and even
the record ("dot") form works just fine; except for the setting of
"smaller than" in CR0.  And we can fix that even (by not using 0,31 but
a wrapping mask, say, 1,0), but more readable generated code was more
important so far.)


I don't see a patch with splitters only?  Huh.  Did you attach the same
patch twice?

Since this won't be handled before combine (or what do I miss?), it is
fine to do splitters only (splitters for combine).  But the other
approach is fine as well.

Thanks,


Segher

Reply via email to