http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48297
Summary: Suboptimal optimization of boolean expression addition Product: gcc Version: 4.6.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c AssignedTo: unassig...@gcc.gnu.org ReportedBy: darkshik...@gmail.com gcc does not seem to recognize that boolean expressions, such as (a==b), are at most 1. At least not when performing math on them. Take the following function: int foo( int a, int b, int c, int x ) { return (a==x)+(b==x)+(c==x); } With -O3 -fomit-frame-pointer -march=core2 on x86_64, gcc compiles this to: 0: 39 cf cmp edi,ecx 2: 40 0f 94 c7 sete dil 6: 31 c0 xor eax,eax 8: 39 ce cmp esi,ecx a: 40 0f b6 ff movzx edi,dil e: 0f 94 c0 sete al 11: 01 f8 add eax,edi 13: 39 ca cmp edx,ecx 15: 0f 94 c2 sete dl 18: 0f b6 d2 movzx edx,dl 1b: 01 d0 add eax,edx 1d: c3 ret As can be seen, gcc extends the inputs to 32-bit (with xor or movzx) and uses 32-bit additions, even though it could avoid this by using 8-bit additions and a single movzx at the end. Let's try to force it: int bar( int a, int b, int c, int x ) { return (uint8_t)((uint8_t)((uint8_t)(a==x)+(uint8_t)(b==x))+(uint8_t)(c==x)); } 0000000000000020 <bar>: 20: 39 ce cmp esi,ecx 22: 40 0f 94 c6 sete sil 26: 39 cf cmp edi,ecx 28: 0f 94 c0 sete al 2b: 01 f0 add eax,esi 2d: 39 ca cmp edx,ecx 2f: 0f 94 c2 sete dl 32: 01 d0 add eax,edx 34: 0f b6 c0 movzx eax,al 37: c3 ret Closer -- we saved two instructions -- but this is actually worse: the additions will have false dependencies on the previous contents of those registers. What we WANT is this: cmp esi,ecx sete sil cmp edi,ecx sete al add al,sil cmp edx,ecx sete dl add al,dl movzx eax,al ret But gcc won't generate it no matter how much I attempt to massage it into doing so. Is this a missing optimization or an optimization bug?