https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78821
--- Comment #12 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Uroš Bizjak from comment #8)
> (In reply to rguent...@suse.de from comment #6)
>
> > > In addition to a merge opportunity, there is a redundant move [*], that
> > > results in redundant operation [**]. The whole function could be just:
> > >
> > >         movw    %dx, -4(%rdi,%rsi)
> > >         notl    %edx
> > >         movw    %dx, -2(%rdi,%rsi)
> >
> > or
> >
> >         xorl    $0xffff0000, %edx
> >         movl    %edx, -4(%rdi,%rsi)
> >
> > ?
>
> Yes, even this. It looks that store merging opens many optimization
> opportunities.

Actually, the original testcase stores the same word (one copy inverted) to
two different locations. But consider the following testcase:

--cut here--
struct s {
  char a;
  char b;
  char c;
  char d;
};

void foo (struct s *__restrict a, struct s *__restrict b)
{
  a->a = b->a;
  a->b = b->b;
  a->c = ~b->c;
  a->d = b->d;
}
--cut here--

This testcase can be optimized by merging the four byte accesses into a
single 32-bit load and store and inserting an xorl with a mask between the
load and the store, as suggested above.
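
For illustration only, here is a hand-written C sketch of what the merged form would compute. It is not what GCC emits today, nor a proposal for how the pass should do it. The names foo_bytes, foo_merged and merged_matches_bytewise are made up for this sketch, and it assumes sizeof (struct s) == 4. The mask is built from a struct so that the sketch does not depend on byte order. On little-endian x86 it comes out as 0x00ff0000, since c sits at byte offset 2.

```c
#include <stdint.h>
#include <string.h>

struct s { char a; char b; char c; char d; };

/* Assumption of this sketch: the struct is exactly one 32-bit word. */
_Static_assert(sizeof(struct s) == sizeof(uint32_t), "struct s must be 4 bytes");

/* Byte-wise version, same body as the testcase. */
static void foo_bytes(struct s *__restrict a, const struct s *__restrict b)
{
  a->a = b->a;
  a->b = b->b;
  a->c = ~b->c;
  a->d = b->d;
}

/* Hand-merged version: one 32-bit load, one xor, one 32-bit store. */
static void foo_merged(struct s *__restrict a, const struct s *__restrict b)
{
  /* All bits set in the byte to invert, zero elsewhere. */
  const struct s flip = { 0, 0, -1, 0 };
  uint32_t mask, word;

  memcpy(&mask, &flip, sizeof mask);  /* mask in memory byte order */
  memcpy(&word, b, sizeof word);      /* the merged load */
  word ^= mask;                       /* the xorl between load and store */
  memcpy(a, &word, sizeof word);      /* the merged store */
}

/* Returns 1 if both versions agree for every value of the inverted byte. */
int merged_matches_bytewise(void)
{
  for (int v = 0; v < 256; v++) {
    struct s src = { 1, 2, 0, 4 }, x, y;
    unsigned char uv = (unsigned char)v;

    memcpy(&src.c, &uv, 1);
    foo_bytes(&x, &src);
    foo_merged(&y, &src);
    if (memcmp(&x, &y, sizeof x) != 0)
      return 0;
  }
  return 1;
}
```

The memcpy calls are only there to keep the sketch free of aliasing problems. A compiler would be expected to turn each of them into a plain 32-bit move.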