http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46514
Summary: 128-bit shifts on x86_64 generate silly code unless the shift amount is constant
Product: gcc
Version: 4.5.1
Status: UNCONFIRMED
Severity: minor
Priority: P3
Component: rtl-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: l...@mit.edu

Created attachment 22428
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=22428
Preprocessed source

I'm using 4.5.1 (Fedora 14) with -O3, but -O2 does the same thing.

This really easy case:

uint64_t shift_test_31(__uint128_t x, uint32_t shift)
{
  if (shift != 31)
    __builtin_unreachable();
  return (uint64_t)(x >> shift);
}

generates:

0000000000000050 <shift_test_31>:
  50:   48 89 f8                mov    %rdi,%rax
  53:   48 0f ac f0 1f          shrd   $0x1f,%rsi,%rax
  58:   c3                      retq
  59:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)

which is entirely sensible. But this:

uint64_t shift_test_le_31(__uint128_t x, uint32_t shift)
{
  if (shift >= 32)
    __builtin_unreachable();
  return (uint64_t)(x >> shift);
}

generates this:

0000000000000060 <shift_test_le_31>:
  60:   89 d1                   mov    %edx,%ecx
  62:   48 89 6c 24 f8          mov    %rbp,-0x8(%rsp)
  67:   48 89 f5                mov    %rsi,%rbp
  6a:   48 0f ad f7             shrd   %cl,%rsi,%rdi
  6e:   48 d3 ed                shr    %cl,%rbp
  71:   f6 c2 40                test   $0x40,%dl
  74:   48 89 5c 24 f0          mov    %rbx,-0x10(%rsp)
  79:   48 0f 45 fd             cmovne %rbp,%rdi
  7d:   48 8b 5c 24 f0          mov    -0x10(%rsp),%rbx
  82:   48 8b 6c 24 f8          mov    -0x8(%rsp),%rbp
  87:   48 89 f8                mov    %rdi,%rax
  8a:   c3                      retq

which contains a pointless shr, test, and cmovne. (Even if I change the
__builtin_unreachable() into a real branch, I get the same code.)
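
For reference, here is a minimal sketch (not part of the original report) of why a single shrd should be enough whenever the shift count is known to be below 64: the low 64 bits of the result are just a funnel shift of the two halves. The helper names shrd_equiv and shift_ref are hypothetical, and the code assumes a compiler that provides __uint128_t (GCC or Clang targeting x86_64).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: low 64 bits of ((hi:lo) >> shift) for shift < 64.
   This is the value a lone shrd leaves in its destination register. */
static uint64_t shrd_equiv(uint64_t lo, uint64_t hi, uint32_t shift)
{
    assert(shift < 64);          /* precondition assumed by this sketch */
    if (shift == 0)
        return lo;               /* avoid the undefined 64-bit shift by 64 */
    return (lo >> shift) | (hi << (64 - shift));
}

/* Reference value, computed the same way as the functions in the report. */
static uint64_t shift_ref(uint64_t lo, uint64_t hi, uint32_t shift)
{
    __uint128_t x = ((__uint128_t)hi << 64) | lo;
    return (uint64_t)(x >> shift);
}
```

With shift < 32, bit 6 of the count (the 0x40 that the generated code tests) is always clear, so the cmovne can never select the shr result, and both instructions feeding it are dead.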