https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104632

            Bug ID: 104632
           Summary: Missed optimization about backward reads
           Product: gcc
           Version: 11.2.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lh_mouse at 126 dot com
  Target Milestone: ---
            Target: x86_64-linux-gnu

This is a piece of code that has been simplified from a Boyer-Moore-Horspool
implementation:

https://gcc.godbolt.org/z/766GYM8xf
```c++
// In real code this was
//   `load_le32_backwards(::std::reverse_iterator<const unsigned char*> ptr)`
unsigned
load_le32_backwards(const unsigned char* ptr)
  {
    unsigned word =    ptr[-1];
    word = word << 8 | ptr[-2];
    word = word << 8 | ptr[-3];
    word = word << 8 | ptr[-4];
    return word;
  }
```

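On x86_64, which is little-endian and permits unaligned loads, this is
equivalent to `return ((unsigned*)ptr)[-1];` (a single 32-bit load from
`ptr - 4`), but GCC fails to combine the four byte loads: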

GCC output:
```
load_le32_backwards(unsigned char const*):
        movzx   edx, BYTE PTR [rdi-1]
        movzx   eax, BYTE PTR [rdi-2]
        sal     edx, 8
        or      eax, edx
        movzx   edx, BYTE PTR [rdi-3]
        sal     eax, 8
        or      edx, eax
        movzx   eax, BYTE PTR [rdi-4]
        sal     edx, 8
        or      eax, edx
        ret
```

Clang output:
```
load_le32_backwards(unsigned char const*):             # @load_le32_backwards(unsigned char const*)
        mov     eax, dword ptr [rdi - 4]
        ret
```
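
For comparison, the single-load form can also be written without the
pointer cast. The following is only a sketch: `load_le32_backwards_memcpy` is
a hypothetical name that does not appear in the real code, and it assumes a
little-endian target where `unsigned` is 32 bits wide. Under those assumptions
it should return the same value as the byte-by-byte version:

```c++
#include <cstring>

// Byte-by-byte form, as in the report above.
unsigned
load_le32_backwards(const unsigned char* ptr)
  {
    unsigned word =    ptr[-1];
    word = word << 8 | ptr[-2];
    word = word << 8 | ptr[-3];
    word = word << 8 | ptr[-4];
    return word;
  }

// Hypothetical helper, not part of the original code: reads the four
// bytes preceding `ptr` in one go. Assumes a little-endian target and
// a 32-bit `unsigned`; `memcpy` avoids the alignment and aliasing
// concerns of `((unsigned*)ptr)[-1]`.
unsigned
load_le32_backwards_memcpy(const unsigned char* ptr)
  {
    unsigned word;
    ::std::memcpy(&word, ptr - 4, sizeof(word));
    return word;
  }
```

Ideally both functions would compile to the same single `mov` shown in the
Clang output.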
