[PATCH] simplify-rtx: Combine bitwise operations in more cases

2025-04-23 Thread Pengfei Li
This patch transforms RTL expressions of the form (subreg (not X) off) into (not (subreg X off)) when the subreg is an operand of a bitwise AND or OR. This transformation can expose opportunities to combine a NOT operation with the bitwise AND/OR. For example, it improves the codegen of the follow

[PATCH v2] simplify-rtx: Combine bitwise operations in more cases

2025-04-28 Thread Pengfei Li
This patch transforms RTL expressions of the form (subreg (not X)) into (not (subreg X)) if the subreg is an operand of another binary logical operation. This transformation can expose opportunities to combine more logical operations. For example, it improves the codegen of the following AArch64 N

Re: [PATCH] simplify-rtx: Combine bitwise operations in more cases

2025-04-28 Thread Pengfei Li
Thanks Richard for all review comments. I have addressed the comments and sent a v2 patch in a new email thread. -- Thanks, Pengfei

[PATCH] AArch64: Fold unsigned ADD + LSR by 1 to UHADD

2025-04-28 Thread Pengfei Li
This patch implements the folding of a vector addition followed by a logical shift right by 1 (add + lsr #1) on AArch64 into an unsigned halving add, allowing GCC to emit NEON or SVE2 UHADD instructions. For example, this patch helps improve the codegen from: add v0.4s, v0.4s, v31.4s

Re: [PATCH] (not just) AArch64: Fold unsigned ADD + LSR by 1 to UHADD

2025-05-02 Thread Pengfei Li
> Heh. This is a bit of a hobby-horse of mine. IMO we should be trying > to make the generic, target-independent vector operations as useful > as possible, so that people only need to resort to target-specific > intrinsics if they're doing something genuinely target-specific. > At the moment, we

Re: [PATCH] (not just) AArch64: Fold unsigned ADD + LSR by 1 to UHADD

2025-05-01 Thread Pengfei Li
Thank you for the comments. > I don't think we can use an unbounded recursive walk, since that > would become quadratic if we ever used it when optimising one > AND in a chain of ANDs. (And using this function for ANDs > seems plausible.) Maybe we should be handling the information > in a simila