https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112398
Alexey Merzlyakov <alexey.merzlyakov at samsung dot com> changed:

           What    |Removed |Added
----------------------------------------------------------------------------
                 CC|        |alexey.merzlyakov at samsung dot com

--- Comment #6 from Alexey Merzlyakov <alexey.merzlyakov at samsung dot com> ---
I've written a solution that tries to optimize the following pattern:

  (zero_extend:M (subreg:N (not:O==M (X:Q==M))))
    -> (xor:M (zero_extend:M (subreg:N (X:M))) 0xffff)

... where the mask covers the bitsize of mode N.

For the cases when X:M has no non-zero bits outside of mode N,
(zero_extend:M (subreg:N (X:M))) can be simplified to just (X:M), and the
whole optimization becomes:

  (zero_extend:M (subreg:N (not:M (X:M)))) -> (xor:M (X:M) 0xffff)

The simplification was added to
simplify_context::simplify_unary_operation_1() for the ZERO_EXTEND case, and
it handles the repro cases initially reported by Siarhei for the RISCV64 and
MIPS32 targets.

However, the shift test from Comment #3 is simplified only with
"-fno-tree-forwprop". Forward propagation seems to optimize the code so that
the output registers of both the NOT and the ZERO_EXTEND instructions are
needed at the same time, which prevents keeping only the XOR result for the
subsequent flow (the NOT insn's result is also used). So this case seems to
be outside combine's scope (I am not sure it is even possible to handle it
there).

In any case, the solution seems to work fine locally in the situations where
combine can shorten NOT-ZERO_EXTEND expressions. If you are OK with this
approach and no one else is handling this problem, I am ready to test it
carefully and submit the patch.