https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63568
--- Comment #6 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 16 Dec 2014, mpolacek at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63568
>
> --- Comment #5 from Marek Polacek <mpolacek at gcc dot gnu.org> ---
> True.  E.g. on my x86_64 i7 Nehalem I see (using ./cc1 -quiet -O2 qq.c -mbmi)
>
>         andn    %edi, %edx, %edi
>         andl    %edx, %esi
>         movl    %edi, %eax
>         orl     %esi, %eax
>         ret
>
> for return (a & ~m) | (b & m); and
>
>         xorl    %edi, %esi
>         movl    %edi, %eax
>         andl    %esi, %edx
>         xorl    %edx, %eax
>         ret
>
> for return a ^ ((a ^ b) & m);

The former is also better for instruction level parallelism - but
that just asks for a clever enough expander / combiner that can
generate that from the latter.  I think on GIMPLE we want to
canonicalize to the variant with fewer (GIMPLE) operations.
Single-use restrictions may apply (with the lack of a global
unified combine / CSE phase).
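
For reference, the two source forms being compared can be written out as below. This is only a sketch: the function names select_or and select_xor are made up for illustration and do not come from the testcase (qq.c is not shown in this bug), and unsigned 32-bit operands are assumed. Both compute a bitwise select that takes bits from b where m is set and from a elsewhere.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */

/* Four bitwise operations at source level: not, and, and, or.
   The two ANDs do not depend on each other. */
uint32_t select_or(uint32_t a, uint32_t b, uint32_t m)
{
    return (a & ~m) | (b & m);
}

/* Three bitwise operations at source level: xor, and, xor.
   Each operation depends on the result of the previous one. */
uint32_t select_xor(uint32_t a, uint32_t b, uint32_t m)
{
    return a ^ ((a ^ b) & m);
}
```

With a = 0xF0F0F0F0, b = 0x0F0F0F0F and m = 0xFFFF0000, both functions return 0x0F0FF0F0. The XOR form has one operation fewer, which is the property the canonicalization argument above relies on; the OR form has the shorter dependency chain.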