The PR18041 testcase is about bitfield insertion of the style
b->bit |= <...>, where the RMW cycle we end up generating contains
redundant masking and ORing of the original b->bit value.

The following adds a combine pattern in simplify-rtx to specifically
match (X & C) | ((X | Y) & ~C) and simplify it to X | (Y & ~C).
That improves code generation from

        movzbl  (%rdi), %eax
        orl     %eax, %esi
        andl    $-2, %eax
        andl    $1, %esi
        orl     %esi, %eax
        movb    %al, (%rdi)

to

        andl    $1, %esi
        orb     %sil, (%rdi)

If you OR in more state, association might break the pattern again.

Still, the bug was long-time assigned to me (for doing something on the
tree level to combine multiple adjacent bitfield accesses as in the
original testcase).  So this is my shot at the part of the problem
that isn't going to be solved on trees.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

The "simpler" testcase manages to break the combination on x86-64
with -m32, a combine missed optimization I guess.  A similar case
can be made for b->bit &= <...>.

OK for trunk?

Thanks,
Richard.

2018-11-05  Richard Biener  <rguent...@suse.de>

        PR middle-end/18041
        * simplify-rtx.c (simplify_binary_operation_1): Add pattern
        matching bitfield insertion.

        * gcc.target/i386/pr18041-1.c: New testcase.
        * gcc.target/i386/pr18041-2.c: Likewise.

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 2ff68ceb4e3..0d53135f1ff 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -2857,6 +2857,38 @@ simplify_binary_operation_1 (enum rtx_code code, machine_mode mode,
                                         XEXP (op0, 1));
         }
 
+      /* The following happens with bitfield merging.
+         (X & C) | ((X | Y) & ~C) -> X | (Y & ~C) */
+      if (GET_CODE (op0) == AND
+          && GET_CODE (op1) == AND
+          && CONST_INT_P (XEXP (op0, 1))
+          && CONST_INT_P (XEXP (op1, 1))
+          && (INTVAL (XEXP (op0, 1))
+              == ~INTVAL (XEXP (op1, 1))))
+        {
+          /* The IOR may be on either side.  */
+          rtx top0 = NULL_RTX, top1 = NULL_RTX;
+          if (GET_CODE (XEXP (op1, 0)) == IOR)
+            top0 = op0, top1 = op1;
+          else if (GET_CODE (XEXP (op0, 0)) == IOR)
+            top0 = op1, top1 = op0;
+          if (top0 && top1)
+            {
+              /* X may be on either side of the inner IOR.  */
+              rtx tem = NULL_RTX;
+              if (rtx_equal_p (XEXP (top0, 0),
+                               XEXP (XEXP (top1, 0), 0)))
+                tem = XEXP (XEXP (top1, 0), 1);
+              else if (rtx_equal_p (XEXP (top0, 0),
+                                    XEXP (XEXP (top1, 0), 1)))
+                tem = XEXP (XEXP (top1, 0), 0);
+              if (tem)
+                return simplify_gen_binary (IOR, mode, XEXP (top0, 0),
+                                            simplify_gen_binary
+                                              (AND, mode, tem, XEXP (top1, 1)));
+            }
+        }
+
       tem = simplify_byte_swapping_operation (code, mode, op0, op1);
       if (tem)
         return tem;
diff --git a/gcc/testsuite/gcc.target/i386/pr18041-1.c b/gcc/testsuite/gcc.target/i386/pr18041-1.c
new file mode 100644
index 00000000000..24da41a02ec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr18041-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+struct B { unsigned bit0 : 1; unsigned bit1 : 1; };
+
+void
+foo (struct B *b)
+{
+  b->bit0 = b->bit0 | b->bit1;
+}
+
+/* { dg-final { scan-assembler-times "and" 1 } } */
+/* { dg-final { scan-assembler-times "or" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr18041-2.c b/gcc/testsuite/gcc.target/i386/pr18041-2.c
new file mode 100644
index 00000000000..00ebd2ae36d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr18041-2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+struct B { unsigned bit0 : 1; unsigned bit1 : 1; };
+
+void
+bar (struct B *b, int x)
+{
+  b->bit0 |= x;
+}
+
+/* This fails to combine in 32bit mode but not for x32. */
+/* { dg-final { scan-assembler-times "and" 1 { xfail { { ! x32 } && ilp32 } } } } */
+/* { dg-final { scan-assembler-times "or" 1 { xfail { { ! x32 } && ilp32 } } } } */