My apologies in advance for a middle-end patch without a test case.  The
patch below implements a simple, safe transformation that is currently
missing from the RTL optimizers: it rewrites (A&C)^(B&C) into the
equivalent (A^B)&C when C has no side effects, for example when C is a
constant.
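
For anyone who wants to convince themselves of the identity, here is a
minimal C sketch.  It is not part of the patch, and the function names are
made up purely for illustration.  It compares the two forms exhaustively
over all 8-bit operand triples, which is sufficient because AND and XOR
act on each bit position independently.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: mask both operands with C, then XOR the results.  */
static uint32_t xor_of_ands (uint32_t a, uint32_t b, uint32_t c)
{
  return (a & c) ^ (b & c);
}

/* Simplified form: XOR first, then apply the mask once.  */
static uint32_t and_of_xor (uint32_t a, uint32_t b, uint32_t c)
{
  return (a ^ b) & c;
}

/* Return 1 if both forms agree for every 8-bit triple (a, b, c),
   otherwise 0.  */
static int forms_agree_8bit (void)
{
  for (uint32_t a = 0; a < 256; a++)
    for (uint32_t b = 0; b < 256; b++)
      for (uint32_t c = 0; c < 256; c++)
        if (xor_of_ands (a, b, c) != and_of_xor (a, b, c))
          return 0;
  return 1;
}

/* Hypothetical self-check helper, only for this illustration.  */
static void self_test (void)
{
  assert (forms_agree_8bit ());
  assert (xor_of_ands (5, 3, 6) == 6);
  assert (and_of_xor (5, 3, 6) == 6);
}
```

The simplified form needs one AND instead of two, which is the saving
visible in the assembly listings.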

 

I originally identified this opportunity in gfortran, where:

 

integer function foo(i) result(res)
  integer(kind=16), intent(in) :: i
  res = poppar(i)
end function

 

Currently, on x86_64 with -O2 -march=corei7, gfortran produces:

foo_:   popcntq (%rdi), %rdx
        popcntq 8(%rdi), %rax
        andl    $1, %edx
        andl    $1, %eax
        xorl    %edx, %eax
        ret

 

With this patch, it now produces:

foo_:   popcntq (%rdi), %rdx
        popcntq 8(%rdi), %rax
        xorl    %edx, %eax
        andl    $1, %eax
        ret

 

The equivalent C/C++ testcase is:

 

unsigned int foo(unsigned int x, unsigned int y)
{
  return __builtin_parityll(x) ^ __builtin_parityll(y);
}

 

where GCC currently generates:

foo:    movl    %esi, %eax
        movl    %edi, %edi
        popcntq %rdi, %rdi
        popcntq %rax, %rax
        andl    $1, %edi
        andl    $1, %eax
        xorl    %edi, %eax
        ret

 

and with this patch, it instead generates:

foo:    movl    %esi, %eax
        movl    %edi, %edi
        popcntq %rdi, %rdi
        popcntq %rax, %rax
        xorl    %edi, %eax
        andl    $1, %eax
        ret
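
At the source level, the difference between the two listings can be
sketched roughly as follows.  This is only an illustration with
hypothetical function names, assuming a compiler that provides the GCC
builtin __builtin_popcountll (GCC or Clang); the actual rewrite happens
on RTL, not on C source.

```c
#include <assert.h>

/* Before: each popcount is masked to its low bit, then the two
   parities are XORed (two ANDs, one XOR).  */
static unsigned parity_xor_before (unsigned long long x,
                                   unsigned long long y)
{
  return ((unsigned) __builtin_popcountll (x) & 1u)
         ^ ((unsigned) __builtin_popcountll (y) & 1u);
}

/* After: the popcounts are XORed first and masked once
   (one XOR, one AND).  */
static unsigned parity_xor_after (unsigned long long x,
                                  unsigned long long y)
{
  return ((unsigned) __builtin_popcountll (x)
          ^ (unsigned) __builtin_popcountll (y)) & 1u;
}

/* Hypothetical self-check helper: both forms should agree with
   each other on a few sample inputs.  */
static void parity_self_test (void)
{
  assert (parity_xor_before (7ULL, 1ULL) == parity_xor_after (7ULL, 1ULL));
  assert (parity_xor_before (3ULL, 1ULL) == parity_xor_after (3ULL, 1ULL));
}
```

Both functions compute the same value as the testcase's
__builtin_parityll(x) ^ __builtin_parityll(y), with the mask being the
constant 1.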

 

 

The trouble is that I'm just about to submit a patch to improve constant
folding of parity in the middle-end's match.pd, which will generate
different RTL code sequences for the above two examples.  Hopefully, folks
agree it's better to have an RTL optimization that's difficult to test
than not to perform this simplification at all.  The semantics/correctness
of this transformation are tested by the run-time tests in
gfortran.dg/popcnt_poppar_2.F90.

 

This patch has been tested with "make bootstrap" and "make -k check" on
x86_64-pc-linux-gnu with no regressions.  If approved, I'd very much
appreciate it if someone could commit this change for me.

 

 

2020-06-11  Roger Sayle  <ro...@nextmovesoftware.com>

 

        * simplify-rtx.c (simplify_binary_operation_1): Simplify
        (X & C) ^ (Y & C) to (X ^ Y) & C, when C is simple (i.e. a constant).

 

 

Thanks very much in advance,
Roger
--
Roger Sayle
NextMove Software
Cambridge, UK

 

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 28c2dc6..ccf5f6d 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -3128,6 +3128,17 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
                                     mode);
       }
 
+      /* Convert (xor (and A C) (and B C)) into (and (xor A B) C).  */
+      if (GET_CODE (op0) == AND 
+         && GET_CODE (op1) == AND
+         && rtx_equal_p (XEXP (op0, 1), XEXP (op1, 1))
+         && ! side_effects_p (XEXP (op0, 1)))
+       return simplify_gen_binary (AND, mode,
+                                   simplify_gen_binary (XOR, mode,
+                                                        XEXP (op0, 0),
+                                                        XEXP (op1, 0)),
+                                   XEXP (op0, 1));
+
       /* Convert (xor (and A B) B) to (and (not A) B).  The latter may
         correspond to a machine insn or result in further simplifications
         if B is a constant.  */
