https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112398

Alexey Merzlyakov <alexey.merzlyakov at samsung dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alexey.merzlyakov at samsung dot com

--- Comment #6 from Alexey Merzlyakov <alexey.merzlyakov at samsung dot com> ---
I've put together a solution that tries to optimize the following pattern:

  (zero_extend:M (subreg:N (not:O==M (X:Q==M)))) ->
  (xor:M (zero_extend:M (subreg:N (X:M)), 0xffff))
    ... where the mask 0xffff spans the bitsize of mode N

For the cases when X:M has no non-zero bits outside of mode N,
(zero_extend:M (subreg:N (X:M))) could be simplified to just (X:M), and the
whole optimization becomes:

  (zero_extend:M (subreg:N (not:M (X:M)))) ->
  (xor:M (X:M, 0xffff))

The change was added to simplify_context::simplify_unary_operation_1() under
the ZERO_EXTEND case, and it simplifies the repro cases initially reported by
Siarhei for the RISCV64 and MIPS32 targets.

However, the shift test from Comment #3 is simplified only when using
"-fno-tree-forwprop". Forward propagation seems to optimize the code so that
the output registers of both the NOT and ZERO_EXTEND instructions are needed
at the same time, which does not allow keeping just the XOR result for the
subsequent flow (the NOT insn's result is also used). So this case seems to be
out of combine's scope (I'm not sure it is even possible to handle it there).

Anyway, the solution seems to work fine locally in the cases where combine can
shorten NOT-ZERO_EXTEND expression chains.

If you are OK with this approach and no one else is already working on this
problem, I am ready to test it carefully and submit the patch.
