https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93328

            Bug ID: 93328
           Summary: missed optimization opportunity in deserialization
                    code
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: felix-gcc at fefe dot de
  Target Milestone: ---

Deserialization code often deals with endianness and alignment. However, in
some cases, the protocol endianness is the same as the host endianness, and
your platform does not care about alignment or has an unaligned load
instruction. Take this code, for example:

unsigned int foo(const unsigned char* c) {
  return c[0] + c[1]*0x100 + c[2]*0x10000 + c[3]*0x1000000;
}

On i386 or x86_64, this could just be compiled into a single load,
and clang does exactly that.
gcc, however, turns it into four loads and three shifts.
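
For comparison, here is a minimal sketch of the single-load form written out by hand. The name foo_load is made up for illustration, and the sketch assumes a little-endian host with a 32-bit unsigned int (as on i386/x86_64). Compilers typically lower a fixed-size memcpy like this to one unaligned load when optimizing.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of the original report.
 * Assumes a little-endian host: the result then equals
 * c[0] + c[1]*0x100 + c[2]*0x10000 + c[3]*0x1000000. */
uint32_t foo_load(const unsigned char *c) {
  uint32_t v;
  /* memcpy avoids alignment and aliasing problems that a
   * pointer cast such as *(const uint32_t *)c would have. */
  memcpy(&v, c, sizeof v);
  return v;
}
```

On a big-endian host this function returns the bytes in the opposite order, so it is only a drop-in replacement for foo() when protocol and host endianness agree.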

For some use cases this optimization could be a huge improvement. In fact, even
if the endianness does not match, this could be a huge improvement. The compiler
could turn it into a load + bswap.
In fact, clang does compile the big endian version into a load + bswap.
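
The big endian version is not spelled out above, so here is a sketch of what is presumably meant, together with a hand-written load + bswap equivalent. The names foo_be and foo_be_bswap are made up for illustration. __builtin_bswap32 is the GCC/clang builtin, and the second function assumes a little-endian host.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical big-endian counterpart of foo(): c[0] is the
 * most significant byte. Written with unsigned shifts so no
 * intermediate result overflows a signed int. */
uint32_t foo_be(const unsigned char *c) {
  return ((uint32_t)c[0] << 24) | ((uint32_t)c[1] << 16) |
         ((uint32_t)c[2] << 8)  |  (uint32_t)c[3];
}

/* The same result as one load plus a byte swap.
 * Assumes a little-endian host and a compiler that provides
 * __builtin_bswap32 (GCC and clang do). */
uint32_t foo_be_bswap(const unsigned char *c) {
  uint32_t v;
  memcpy(&v, c, sizeof v);      /* unaligned-safe 32-bit load */
  return __builtin_bswap32(v);  /* reverse the four bytes */
}
```

The request in this report is for gcc to produce the code of foo_be_bswap when given foo_be.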
