https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89811
Bug ID: 89811
Summary: uint32_t load is not recognized if shifts are done in a fixed-size loop
Product: gcc
Version: 9.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: nok.raven at gmail dot com
Target Milestone: ---
Host: x86_64
Target: x86_64

I expected the fixed-size loop to be unrolled and the uint32_t load pattern
to be recognized, but that does not happen. Clang has no problems with this:
https://godbolt.org/z/8ES09V

#include <cstdint>
#include <cstring>

// recognized
std::uint32_t good(const unsigned char *p)
{
    std::uint32_t result = 0;
    result |= (static_cast<std::uint32_t>(p[0]) << 0);
    result |= (static_cast<std::uint32_t>(p[1]) << 8);
    result |= (static_cast<std::uint32_t>(p[2]) << 16);
    result |= (static_cast<std::uint32_t>(p[3]) << 24);
    return result;
}

// not recognized if done in a loop
std::uint32_t loop(const unsigned char *p)
{
    std::uint32_t result = 0;
    for (int i = 0; i < 4; ++i)
        result |= (static_cast<std::uint32_t>(p[i]) << (i * 8));
    return result;
}

// other variations are not recognized either
std::uint32_t bad(const unsigned char *p)
{
    std::uint32_t result = 0;
    //result <<= 8;
    result |= static_cast<std::uint32_t>(p[3]);
    result <<= 8;
    result |= static_cast<std::uint32_t>(p[2]);
    result <<= 8;
    result |= static_cast<std::uint32_t>(p[1]);
    result <<= 8;
    result |= static_cast<std::uint32_t>(p[0]);
    return result;
}

std::uint32_t loop2(const unsigned char *p)
{
    std::uint32_t result = 0;
    for (int i = 0; i < 4; ++i) {
        result <<= 8;
        result |= static_cast<std::uint32_t>(p[3 - i]);
    }
    return result;
}
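
For anyone reproducing this, a minimal sketch of a sanity check may help. It confirms that the two loop forms compute the same little-endian value, so any difference in the generated code is a missed optimization and not a semantic one. The names load_le32_loop, load_le32_shift, load_native32 and self_check are hypothetical and not part of the report. The memcpy variant is an assumption about what a single 32-bit load would look like in source form, and it matches the shift-based versions only on a little-endian host such as the x86_64 target named above.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical name: same body as loop() in the report.
// Builds a little-endian value by shifting each byte into place.
std::uint32_t load_le32_loop(const unsigned char *p)
{
    std::uint32_t result = 0;
    for (int i = 0; i < 4; ++i)
        result |= static_cast<std::uint32_t>(p[i]) << (i * 8);
    return result;
}

// Hypothetical name: same body as loop2() in the report.
// Shifts the accumulator and ORs in bytes from most to least significant.
std::uint32_t load_le32_shift(const unsigned char *p)
{
    std::uint32_t result = 0;
    for (int i = 0; i < 4; ++i) {
        result <<= 8;
        result |= static_cast<std::uint32_t>(p[3 - i]);
    }
    return result;
}

// Assumption: a native-endian load written with memcpy, for comparison.
// Equals the functions above only on a little-endian host.
std::uint32_t load_native32(const unsigned char *p)
{
    std::uint32_t result;
    std::memcpy(&result, p, sizeof result);
    return result;
}

// Checks that both loop forms agree on a sample input.
void self_check()
{
    const unsigned char bytes[4] = {0x78, 0x56, 0x34, 0x12};
    assert(load_le32_loop(bytes) == 0x12345678u);
    assert(load_le32_shift(bytes) == 0x12345678u);
}
```

Compiling a file like this with g++ -O2 -S and comparing the bodies of the three functions should show whether the loop forms are turned into a single load.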