http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59100

            Bug ID: 59100
           Summary: requesting optimization of safe rotate function
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: regehr at cs dot utah.edu

Here is the obvious rotate idiom in C (this code is from Nettle):

#define ROTL32(n,x) (((x)<<(n)) | ((x)>>(32-(n))))

GCC does an admirable job of recognizing the code and turning it into a rotate
instruction, when one is available.

The problem is that this code executes undefined behavior when n==0 or n==32.

Most crypto libraries are careful not to rotate by 32, but out of 10 libraries
that I examined, 5 execute undefined behavior when rotating by zero.

We can make the obvious modification to protect against undefined rotate by 0:

#define ROTL32(n,x) ((n)==0?(x):(((x)<<(n)) | ((x)>>(32-(n)))))

Notice that this can be turned into exactly the same object code as the earlier
macro since the rotate-by-0 special case is already handled by the rotate
instruction.  However, this isn't what we get out of the compiler:

rotl32c:
    movl    %edi, %eax
    movb    %sil, %cl
    roll    %cl, %eax
    testl    %esi, %esi
    cmove    %edi, %eax
    ret

I'm in the process of trying to convince crypto library maintainers to fix
their rotate functions and macros, and this will be easier if the fix doesn't
have a performance penalty.

So would it be possible for you folks to teach the compiler to emit the better
code for the safe rotate idiom?

Reply via email to