http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59100
Bug ID: 59100 Summary: requesting optimization of safe rotate function Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: regehr at cs dot utah.edu Here is the obvious rotate idiom in C (this code is from Nettle): #define ROTL32(n,x) (((x)<<(n)) | ((x)>>(32-(n)))) GCC does an admirable job of recognizing the code and turning it into a rotate instruction, when one is available. The problem is that this code executes undefined behavior when n==0 or n==32. Most crypto libraries are careful not to rotate by 32, but out of 10 libraries that I examined, 5 execute undefined behavior when rotating by zero. We can make the obvious modification to protect against undefined rotate by 0: #define ROTL32(n,x) ((n)==0?(x):(((x)<<(n)) | ((x)>>(32-(n))))) Notice that this can be turned into exactly the same object code as the earlier macro since the rotate-by-0 special case is already handled by the rotate instruction. However, this isn't what we get out of the compiler: rotl32c: movl %edi, %eax movb %sil, %cl roll %cl, %eax testl %esi, %esi cmove %edi, %eax ret I'm in the process of trying to convince crypto library maintainers to fix their rotate functions and macros, and this will be easier if the fix doesn't have a performance penalty. So would it be possible for you folks to teach the compiler to emit the better code for the safe rotate idiom?