https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107006
Bug ID: 107006
Summary: Missing optimization: common idiom for external data
Product: gcc
Version: 12.2.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: hpa at zytor dot com
Target Milestone: ---
Created attachment 53602
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53602&action=edit
C test case source
The only *portable* way in C to deal with external data structures containing
data of specific endianness, possibly unaligned, is to operate on them as byte
(char) arrays.
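
For readers without the attachment, the idiom in question looks roughly like the sketch below. The function names load_le32 and load_be32 are hypothetical and the attached test case may be written differently; this only illustrates the byte-wise approach described above.

```c
#include <stdint.h>

/* Hypothetical helper names; not taken from the attached test case. */

/* Read a 32-bit little-endian value from a possibly unaligned buffer. */
static inline uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Read a 32-bit big-endian value from a possibly unaligned buffer. */
static inline uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];
}
```

Each byte is widened to uint32_t before shifting, so the result does not depend on host byte order or on the alignment of p. On x86 one would hope for load_le32 to become a single mov and load_be32 a mov plus bswap (or a movbe).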
At least on x86 (which supports arbitrarily aligned loads), gcc *sometimes*
recognizes these as single loads, but sometimes not.
In the included test cases, there is a plain C implementation and an
implementation wrapped in a C++ class.
Compiling the former with:
gcc -std=c2x -g -O3 -W -Wall -[cSE] -o bswap.[osi] bswap.c
... recognizes the load idiom for 16-bit numbers but not for 32- or 64-bit
numbers.
Compiling the latter with:
gcc -std=c++20 -g -O3 -W -Wall -[cSE] -o bswapcc.[osi] bswapcc.cc
... *additionally* recognizes the 32-bit load, *but only in the bigendian case*
(that is, it generates a load and a bswap instruction); whereas in the
littleendian -- native -- case, this does not happen!
I am familiar with the use of packed arrays and __builtin_bswap*() for these
accesses, but unfortunately these are gcc-specific.
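
For comparison, a minimal sketch of that gcc-specific alternative follows. The names unaligned_u32, gnu_load_le32 and gnu_load_be32 are hypothetical; the attributes, __builtin_bswap32() and the __BYTE_ORDER__ macros are GNU extensions, which is exactly why this form is not portable.

```c
#include <stdint.h>

/* GNU extensions: "packed" permits an unaligned member access, and
 * "may_alias" permits reading arbitrary bytes through this type.
 * The type and function names are hypothetical. */
struct unaligned_u32 {
    uint32_t v;
} __attribute__((packed, may_alias));

/* Little-endian load: native load, swapped only on big-endian hosts. */
static inline uint32_t gnu_load_le32(const void *p)
{
    uint32_t v = ((const struct unaligned_u32 *)p)->v;
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    v = __builtin_bswap32(v);
#endif
    return v;
}

/* Big-endian load: native load, swapped only on little-endian hosts. */
static inline uint32_t gnu_load_be32(const void *p)
{
    uint32_t v = ((const struct unaligned_u32 *)p)->v;
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    v = __builtin_bswap32(v);
#endif
    return v;
}
```

This spells out the load and the byte swap explicitly rather than relying on the optimizer to recognize the shift-and-or pattern.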