https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71753

            Bug ID: 71753
           Summary: Clamp function does not work with O3 optimization
           Product: gcc
           Version: 7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lukasz.spintzyk at displaylink dot com
  Target Milestone: ---

Created attachment 38830
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38830&action=edit
Code that reproduces the issue

Hi,

The issue is that with the -O3 flag the fast_clamp function returns a wrong value.
I have read your warning about compiler flags, but I decided to submit this anyway.
Sorry if this is invalid.

I have tested the code with the -fwrapv flag and it does fix the problem, so
the issue is probably related to signed arithmetic overflow optimizations.
I am submitting this because it may point to a real issue: with clang the code
works, and clang produces very similar output (see below).
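For comparison, here is a minimal sketch of a mask-based clamp to [0, 255] that does its addition on an unsigned type, where wraparound is well defined, so it should not depend on -fwrapv. This is NOT the code from attachment 38830; the names reference_clamp and portable_fast_clamp are made up for illustration, and the constant 0x7fffff00 is taken from the assembly listings in this report.

```cpp
#include <algorithm>
#include <cstdint>

// Straightforward reference: clamp x to [0, 255]. No overflow is possible.
inline int reference_clamp(int x) {
    return std::min(std::max(x, 0), 255);
}

// Hypothetical mask-based clamp (not the attachment's fast_clamp).
// All arithmetic is done on std::uint32_t, so wraparound is well defined.
inline int portable_fast_clamp(int x) {
    const std::uint32_t ux = static_cast<std::uint32_t>(x);
    // Bit 31 of (x + 0x7fffff00) is set for every x > 255 (and for a few
    // values near INT_MIN, which the negative mask below zeroes anyway).
    const std::uint32_t too_big = 0u - ((ux + 0x7fffff00u) >> 31);
    // All ones when x is negative, zero otherwise.
    const std::uint32_t negative = 0u - (ux >> 31);
    // Saturate to all ones if too big, clear if negative, keep the low byte.
    return static_cast<int>((ux | too_big) & ~negative & 0xffu);
}
```

With this sketch, portable_fast_clamp(260) gives 255 and agrees with reference_clamp at the boundaries (for example -1, 0, 255 and 256).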


I have tested this against various gcc versions, and the problem seems to have
been introduced between 4.9.3 and 5.1. It is also reproducible on gcc 7.0.


How to compile:
g++ -std=c++11 -O3 testClamp.cpp

When the error occurs, running the program should print the following:
> ./a.out
Fatal error, clamp of 260 gives 4 instead of 255



This issue is not reproducible with clang with the same optimizations enabled.
I compared the assembly output of both compilers, and the functions differ by
only one instruction (movzbl %dil,%eax). Maybe this issue can be fixed without
disabling some of the optimizations with the -fwrapv flag.

gcc assembly:
0000000000000020 <_Z10fast_clampi>:
  20:   8d 87 00 ff ff 7f       lea    0x7fffff00(%rdi),%eax
  26:   c1 f8 1f                sar    $0x1f,%eax
  29:   09 f8                   or     %edi,%eax
  2b:   c1 ff 1f                sar    $0x1f,%edi
  2e:   f7 d7                   not    %edi
  30:   21 f8                   and    %edi,%eax
  32:   c3                      retq   


clang assembly:
0000000000000020 <_Z10fast_clampi>:
  20:   8d 87 00 ff ff 7f       lea    0x7fffff00(%rdi),%eax
  26:   c1 f8 1f                sar    $0x1f,%eax
  29:   09 f8                   or     %edi,%eax
  2b:   c1 ff 1f                sar    $0x1f,%edi
  2e:   f7 d7                   not    %edi
  30:   21 c7                   and    %eax,%edi
  32:   40 0f b6 c7             movzbl %dil,%eax
  36:   c3                      retq
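
To make the one-instruction difference easier to see, the two listings can be read as operations on 32-bit registers. The sketch below is a hand-written emulation, not compiler output; sar31, eax_after_first_listing and eax_after_second_listing are made-up names, and the arithmetic shift is modelled with unsigned operations so the sketch itself has no overflow.

```cpp
#include <cstdint>

// Models `sar $0x1f, reg`: all ones if bit 31 is set, zero otherwise.
inline std::uint32_t sar31(std::uint32_t v) {
    return 0u - (v >> 31);
}

// Value left in %eax by the first listing (labelled gcc).
inline std::uint32_t eax_after_first_listing(std::uint32_t edi) {
    std::uint32_t eax = edi + 0x7fffff00u;  // lea 0x7fffff00(%rdi),%eax
    eax = sar31(eax);                       // sar $0x1f,%eax
    eax |= edi;                             // or  %edi,%eax
    edi = sar31(edi);                       // sar $0x1f,%edi
    edi = ~edi;                             // not %edi
    eax &= edi;                             // and %edi,%eax
    return eax;                             // retq
}

// Value left in %eax by the second listing (the one with movzbl).
inline std::uint32_t eax_after_second_listing(std::uint32_t edi) {
    std::uint32_t eax = edi + 0x7fffff00u;  // lea 0x7fffff00(%rdi),%eax
    eax = sar31(eax);                       // sar $0x1f,%eax
    eax |= edi;                             // or  %edi,%eax
    edi = sar31(edi);                       // sar $0x1f,%edi
    edi = ~edi;                             // not %edi
    edi &= eax;                             // and %eax,%edi
    eax = edi & 0xffu;                      // movzbl %dil,%eax
    return eax;                             // retq
}
```

For an input of 260 this emulation leaves 0xffffffff in %eax for the first listing and 0xff for the second, so in this standalone form the two differ only in the upper 24 bits; the low byte is 255 in both.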
