https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70970

            Bug ID: 70970
           Summary: Misaligned SSE with auto-vectorization
           Product: gcc
           Version: 5.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rcc.dark at gmail dot com
  Target Milestone: ---

Created attachment 38424
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38424&action=edit
the code

Sorry if this has been reported before. I have tested it under:

gcc version 5.3.1 20151207 (Red Hat 5.3.1-2) (GCC)
gcc version 5.2.1 20151028 (Debian 5.2.1-23) 
gcc version 6.1.0 (GCC)     <----- Windows
gcc version 5.2.0 (GCC)     <----- Windows, MinGW-W64

The following code crashes with -std=c++14 -O3:


#include <cstdint>
#include <malloc.h>

template<typename RI>
__attribute__ ((noinline))
void symmetric_difference(RI ai, RI af, RI bi)
{
   while (ai != af) {
      *ai++ ^= *bi++;
   }
}

int main( )
{
   auto p1 = reinterpret_cast<char*>(memalign(4096, 32));
   auto p2 = reinterpret_cast<char*>(memalign(4096, 32));
   // _aligned_malloc under Windows

   // the + 1 deliberately leaves both uint64_t pointers misaligned
   auto ai = reinterpret_cast<std::uint64_t*>(p1 + 1);
   auto bi = reinterpret_cast<std::uint64_t*>(p2 + 1);

   symmetric_difference(ai, ai + 64, bi);
}


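It stops crashing with -O2, or if I remove the + 1 from the pointers. GDB tells
me that the faulting instruction is:

  vmovdqa YMMWORD PTR [rbx+rcx*1],ymm0

The register rbx is not aligned and rcx = 0, so the store address is misaligned
for vmovdqa, which requires a 32-byte-aligned memory operand with ymm registers.
It seems that the loop is loading with vmovdqu but storing with vmovdqa.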
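
For comparison, here is a minimal sketch of the same XOR loop written so that it
never dereferences a misaligned std::uint64_t pointer. It is not part of the
attachment: the names xor_words_unaligned and unaligned_xor_matches_bytewise are
made up for illustration, and it assumes the intent is to XOR 64-bit words that
start at an arbitrary byte address. Each word goes through a local variable via
std::memcpy, so the compiler cannot assume any alignment for the buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not from the report): XOR `count` 64-bit words read
// from `src` into the words at `dst`. Both pointers may have any alignment,
// because every word is copied into a local with std::memcpy instead of
// being accessed through a std::uint64_t*.
void xor_words_unaligned(unsigned char* dst, const unsigned char* src,
                         std::size_t count)
{
   assert(count == 0 || (dst != nullptr && src != nullptr));
   for (std::size_t i = 0; i < count; ++i) {
      std::uint64_t a, b;
      std::memcpy(&a, dst + i * sizeof a, sizeof a);
      std::memcpy(&b, src + i * sizeof b, sizeof b);
      a ^= b;
      std::memcpy(dst + i * sizeof a, &a, sizeof a);
   }
}

// Hypothetical self-check: XOR 64 words starting one byte into each buffer
// (the same offset as the reproducer) and compare with a byte-wise XOR.
bool unaligned_xor_matches_bytewise()
{
   const std::size_t words = 64;
   const std::size_t bytes = words * sizeof(std::uint64_t);
   std::vector<unsigned char> a(bytes + 1), b(bytes + 1), expected(bytes + 1);
   for (std::size_t i = 0; i < a.size(); ++i) {
      a[i] = static_cast<unsigned char>(i * 7 + 1);
      b[i] = static_cast<unsigned char>(i * 13 + 5);
      expected[i] = static_cast<unsigned char>(a[i] ^ b[i]);
   }
   expected[0] = a[0];   // byte 0 lies before the offset and is not modified
   xor_words_unaligned(a.data() + 1, b.data() + 1, words);
   return a == expected;
}
```

Calling unaligned_xor_matches_bytewise() should return true at any optimization
level, since the result does not depend on how the buffers are aligned.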
