https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121225
Bug ID: 121225
Summary: Missed autovectorization of bswap8 in a loop
Product: gcc
Version: 16.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: dusan.stojko...@rt-rk.com
Target Milestone: ---

Hi,

A bswap8 in a loop like this:

```
void vbswap8(unsigned int* in, int len) {
  for (int i = 0; i < len; i++)
    in[i] = (in[i] & 0xffff0000) | ((in[i] & 0xff00) >> 8) | ((in[i] & 0xff) << 8);
}
```

is not vectorized when compiled with -O3 -mavx2.

The bswap pass finds a suitable expression and transforms it like so:

...
16 bit bswap implementation found at: _13 = _7 | _10;
...
load_dst_8 = MEM <short unsigned int> [(unsigned int *)_3];
bswapdst_12 = load_dst_8 r>> 8;
_13 = (unsigned int) bswapdst_12;
...

When the vectorization pass runs later, it fails with the following message:

...
note: can tell at compile time that MEM <short unsigned int> [(unsigned int *)_3] and *_3 alias
missed: not vectorized: compilation time alias: load_dst_8 = MEM <short unsigned int> [(unsigned int *)_3];
...

The failure seems to be related to the bswap pass, since compiling with -fno-expensive-optimizations produces the vectorized version:
https://godbolt.org/z/ba5rfTbPP
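For reference, here is a rough source-level sketch of what the rewrite in the dump above appears to compute, next to the original expression. The function names are hypothetical and are not taken from GCC. The sketch only models the value computed per element: the actual transformation replaces part of the 32-bit access with a 16-bit load from the same address, which is the aliasing access the vectorizer complains about.

```c
#include <assert.h>
#include <stdint.h>

/* Per-element expression from the reported loop: swap the two low
   bytes and leave the upper 16 bits untouched. */
uint32_t swap_low_bytes_masks(uint32_t x)
{
    return (x & 0xffff0000u) | ((x & 0xff00u) >> 8) | ((x & 0xffu) << 8);
}

/* Approximation of the rewritten form: take the low 16 bits, rotate
   them by 8 (the "r>> 8" in the dump), widen back to 32 bits, and OR
   the result with the unchanged upper half. */
uint32_t swap_low_bytes_rotate(uint32_t x)
{
    uint16_t low = (uint16_t)x;
    uint16_t rotated = (uint16_t)((low >> 8) | (low << 8));
    return (x & 0xffff0000u) | rotated;
}

/* Sanity check that both forms agree on a few sample values. */
void check_forms_agree(void)
{
    static const uint32_t samples[] = {
        0u, 0x11223344u, 0xdeadbeefu, 0x0000ff00u, 0xffffffffu
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(swap_low_bytes_masks(samples[i]) ==
               swap_low_bytes_rotate(samples[i]));
}
```

Both forms produce the same value, so the question raised by the report is only whether the narrower load introduced by the rewrite should prevent vectorization of the loop.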