On 29/01/2024 14:14, Matthieu Longo wrote:
Hi Richard,
Please find below the new patch where I addressed your comments and
updated the changelog.
The rev16 pattern was no longer recognised because a change in the
bswap tree pass introduced a new GIMPLE form that the final assembly
transformation pass does not recognise.
More details in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108933
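For readers who want to see why the two forms are interchangeable, here
is a minimal C sketch comparing the classic shift-and-mask rev16 idiom
with a full byte swap followed by a rotate by 16 bits. The function
names are made up for illustration and are taken neither from the patch
nor from the testsuite; the byte swap is written out by hand so the
sketch does not depend on any compiler builtin.

```c
#include <stdint.h>

/* Classic idiom: swap the two bytes inside each 16-bit half. */
static uint32_t rev16_shift_mask(uint32_t x)
{
    return ((x & 0xff00ff00u) >> 8) | ((x & 0x00ff00ffu) << 8);
}

/* Full 32-bit byte swap, i.e. what (bswap:SI ...) denotes. */
static uint32_t bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u) | (x << 24);
}

/* Rotate by 16; for a 32-bit value left and right rotates coincide. */
static uint32_t rotate16(uint32_t x)
{
    return (x << 16) | (x >> 16);
}

/* The form the bswap pass now produces: byte swap, then rotate by 16. */
static uint32_t rev16_bswap_rotate(uint32_t x)
{
    return rotate16(bswap32(x));
}

/* Returns 1 when both forms give the same result for x. */
static int rev16_forms_agree(uint32_t x)
{
    return rev16_shift_mask(x) == rev16_bswap_rotate(x);
}
```

For example, with x = 0x11223344 both functions return 0x22114433: the
byte swap gives 0x44332211 and rotating that by 16 bits exchanges the
two halfwords.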
gcc/ChangeLog:
PR target/108933
* config/arm/arm.md (arm_rev16si2): Convert to define_insn.
Correct generated RTL.
(arm_rev16si2_alt1): Correctly handle conditional execution.
(arm_rev16si2_alt2): Likewise.
gcc/testsuite/ChangeLog:
PR target/108933
* gcc.target/arm/rev16.c: Moved to...
* gcc.target/arm/rev16_1.c: ...here.
* gcc.target/arm/rev16_2.c: New test to check that rev16 is
emitted.
Thanks. I've tweaked the commit message very slightly and pushed this.
Could you please prepare backports for gcc-11 thru 13? It should just
be a matter of cherry-picking the commit.
R.
On 2024-01-22 16:25, Richard Earnshaw (lists) wrote:
On 22/01/2024 12:18, Matthieu Longo wrote:
The rev16 pattern was no longer recognised because a change in the
bswap tree pass introduced a new GIMPLE form that the final assembly
transformation pass does not recognise.
More details in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108933
gcc/ChangeLog:
PR target/108933
* config/arm/arm.md (*arm_rev16si2_alt3): new pattern to convert
a bswap + rotate by 16 bits into rev16
ChangeLog entries need to be written as sentences, so start with a
capital letter and end with a full stop; continuation lines should
start in column 8 (one hard tab, don't use spaces). But in this case,
"New pattern." is sufficient.
gcc/testsuite/ChangeLog:
PR target/108933
* gcc.target/arm/rev16.c: Moved to...
* gcc.target/arm/rev16_1.c: ...here.
* gcc.target/arm/rev16_2.c: New test to check that rev16 is
emitted.
+;; Similar pattern to match (rotate (bswap) 16)
+(define_insn "*arm_rev16si2_alt3"
+ [(set (match_operand:SI 0 "register_operand" "=l,r")
+ (rotate:SI (bswap:SI (match_operand:SI 1 "register_operand" "l,r"))
+ (const_int 16)))]
+ "arm_arch6"
+ "rev16\\t%0, %1"
+ [(set_attr "arch" "t,32")
+ (set_attr "length" "2,4")
+ (set_attr "type" "rev")]
+)
+
Unfortunately, this is insufficient. When generating Arm or Thumb2
code (but not Thumb1) we also have to handle conditional execution: we
need to have '%?' in the output template at the point where a
condition code might be needed. That means we need three alternatives,
each with its own output template (as we need a 16-bit variant for
Thumb2 that's conditional and a 16-bit variant for Thumb1 that isn't).
See the output of arm_rev16 for a guide to what is really needed.
I note that the arm_rev16si2_alt1 and arm_rev16si2_alt2 patterns are
incorrect in this regard as well; that will need fixing.
I also see that arm_rev16si2 currently expands to the alt1 variant
above; given that the preferred canonical form would now appear to use
bswap + rotate, we should change that as well. In fact, we can merge
your new pattern with the expand entirely and eliminate the need to
call gen_arm_rev16si2_alt1. Something like:
(define_insn "arm_rev16si2"
[(set (match_operand:SI 0 "s_register_operand")
(rotate:SI (bswap:SI (match_operand:SI 1 "s_register_operand"))
(const_int 16)))]
"arm_arch6"
"@
rev16...
...
R.