https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89811

            Bug ID: 89811
           Summary: uint32_t load is not recognized if shifts are done in
                    a fixed-size loop
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: nok.raven at gmail dot com
  Target Milestone: ---
              Host: x86_64
            Target: x86_64

I expected the fixed-size loop to be unrolled and the uint32_t load pattern
to be recognized, but that does not happen. Clang has no problem with this:
https://godbolt.org/z/8ES09V

#include <cstdint>
#include <cstring>

// recognized
std::uint32_t good(const unsigned char *p)
{
    std::uint32_t result = 0;
    result |= (static_cast<std::uint32_t>(p[0]) << 0);
    result |= (static_cast<std::uint32_t>(p[1]) << 8);
    result |= (static_cast<std::uint32_t>(p[2]) << 16);
    result |= (static_cast<std::uint32_t>(p[3]) << 24);
    return result;
}

// not recognized if done in a loop
std::uint32_t loop(const unsigned char *p)
{
  std::uint32_t result = 0;
  for (int i = 0; i < 4; ++i)
      result |= (static_cast<std::uint32_t>(p[i]) << (i * 8));

  return result;
}

// other variations are not recognized either
std::uint32_t bad(const unsigned char *p)
{
  std::uint32_t result = 0;
  //result <<= 8;
  result |= static_cast<std::uint32_t>(p[3]);
  result <<= 8;
  result |= static_cast<std::uint32_t>(p[2]);
  result <<= 8;
  result |= static_cast<std::uint32_t>(p[1]);
  result <<= 8;
  result |= static_cast<std::uint32_t>(p[0]);

  return result;
}

std::uint32_t loop2(const unsigned char *p)
{
  std::uint32_t result = 0;
  for (int i = 0; i < 4; ++i) {
      result <<= 8;
      result |= static_cast<std::uint32_t>(p[3 - i]);
  }

  return result;
}
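
For comparison, here is a minimal sketch of the single 32-bit load that the
byte-assembly functions above are expected to collapse into. It is not part
of the original test case: the names load_u32_memcpy and
host_is_little_endian are hypothetical, and the memcpy version returns the
same value as good() only on a little-endian host such as x86_64.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper, not from the original report: copies four bytes
// into a uint32_t in host byte order. Compilers commonly lower this to a
// single 32-bit load. On a little-endian host the result equals the
// shift-and-or assembly p[0] | p[1] << 8 | p[2] << 16 | p[3] << 24.
std::uint32_t load_u32_memcpy(const unsigned char *p)
{
    std::uint32_t result;
    std::memcpy(&result, p, sizeof result);
    return result;
}

// Hypothetical helper: reports at run time whether the host stores the
// least significant byte first.
bool host_is_little_endian()
{
    const std::uint32_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}
```

Comparing the assembly generated for load_u32_memcpy with that of loop(),
bad() and loop2() should make the missed optimization easier to see.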
