On 11/5/18 7:44 AM, Richard Biener wrote:
>
> The PR18041 testcase is about bitfield insertion of the style
>
>   b->bit |= <...>
>
> where the RMW cycle we end up generating contains redundant
> masking and ORing of the original b->bit value.  The following
> adds a combine pattern in simplify-rtx to specifically match
>
>   (X & C) | ((X | Y) & ~C)
>
> and simplifying that to X | (Y & ~C).  That helps improving
> code-generation from
>
>         movzbl  (%rdi), %eax
>         orl     %eax, %esi
>         andl    $-2, %eax
>         andl    $1, %esi
>         orl     %esi, %eax
>         movb    %al, (%rdi)
>
> to
>
>         andl    $1, %esi
>         orb     %sil, (%rdi)
>
> if you OR in more state association might break the pattern again.
>
> Still the bug was long-time assigned to me (for doing sth on
> the tree level for combining multiple adjacent bitfield accesses
> as in the original testcase).  So this is my shot at the part
> of the problem that isn't going to be solved on trees.
>
> Bootstrap & regtest running on x86_64-unknown-linux-gnu.
>
> The "simpler" testcase manages to break the combination on
> x86-64 with -m32, a combine missed-optimization I guess.
>
> A similar case can be made for b->bit &= <...>.
>
> OK for trunk?
>
> Thanks,
> Richard.
>
> 2018-11-05  Richard Biener  <rguent...@suse.de>
>
>       PR middle-end/18041
>       * simplify-rtx.c (simplify_binary_operation_1): Add pattern
>       matching bitfield insertion.
>
>       * gcc.target/i386/pr18041-1.c: New testcase.
>       * gcc.target/i386/pr18041-2.c: Likewise.

There was at least one more BZ in this space (older than 18041).
Essentially all the pieces are there for combine to figure out we've got
a bitfield twiddle, but the structure of some of combine's code made it
exceedingly hard to exploit.  I wonder if this would help.  I'm sure
I'll look at it during the stage3/stage4 cycle, so we'll know then.
OK for the trunk.  As you note, there are likely corresponding cases for
BIT-AND as the toplevel op.

jeff
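
For anyone who wants to convince themselves of the identity behind the new pattern, here is a minimal sketch in plain C, not the simplify-rtx.c code from the patch.  It compares (X & C) | ((X | Y) & ~C) against X | (Y & ~C) over every 8-bit X, Y and C.  The function names (insert_naive, insert_simplified, count_mismatches, check_identity) are made up for illustration and do not appear in GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Unsimplified form quoted in the patch description:
   (X & C) | ((X | Y) & ~C).  Hypothetical helper, not GCC code.  */
uint8_t insert_naive(uint8_t x, uint8_t y, uint8_t c)
{
    return (uint8_t)((x & c) | ((x | y) & (uint8_t)~c));
}

/* Simplified form the pattern rewrites to: X | (Y & ~C).  */
uint8_t insert_simplified(uint8_t x, uint8_t y, uint8_t c)
{
    return (uint8_t)(x | (y & (uint8_t)~c));
}

/* Count the (x, y, c) triples for which the two forms disagree.  */
unsigned long count_mismatches(void)
{
    unsigned long bad = 0;
    for (unsigned x = 0; x < 256; x++)
        for (unsigned y = 0; y < 256; y++)
            for (unsigned c = 0; c < 256; c++)
                if (insert_naive((uint8_t)x, (uint8_t)y, (uint8_t)c)
                    != insert_simplified((uint8_t)x, (uint8_t)y, (uint8_t)c))
                    bad++;
    return bad;
}

/* Aborts if the identity fails for any 8-bit input.  */
void check_identity(void)
{
    assert(count_mismatches() == 0);
}
```

Calling check_identity() from a main() should return without tripping the assert.  With C = 0xFE (the andl $-2 mask in the quoted assembly) and ~C = 1 (the andl $1 mask), both helpers reduce to ORing the low bit of Y into X, which is what the shorter andl/orb sequence does.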