[Bug c++/48659] New: Segmentation fault when using openMP and SSE

2011-04-17 Thread npozar at quick dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48659

   Summary: Segmentation fault when using openMP and SSE
   Product: gcc
   Version: 4.5.2
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: npo...@quick.cz


Created attachment 24026
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=24026
A code snippet that reproduces the error

g++ tries to save an xmm register to the stack using MOVAPD (an SSE instruction
that requires a 16-byte-aligned memory operand) when switching between working
chunks during openMP multithreading. This causes a seemingly random
segmentation fault whenever the stack slot happens not to be aligned to a
16-byte boundary.

Please see the attached code. I compile it with g++ 4.5.2 (I'm using MinGW) and
flags -O3 -msse3 -fopenmp.

It is important that the optimization is on and the compiler tries to save the
xmm register containing the constant zero between working chunks. This is the
instruction that causes the segmentation fault if ebp-0x48 is not divisible by
0x10:

0040143a:   movapd %xmm1,-0x48(%ebp) // right here
0040143f:   call 0x4014bc 
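
The alignment condition above can be sketched in C++ as follows. This is only
an illustration: the helper names is_aligned and spill_slot are hypothetical,
and the two frame-pointer values are made up for the example, not taken from a
real run.

```cpp
#include <cstdint>

// Hypothetical helper: true when addr is a multiple of alignment.
// alignment must be a power of two (16 for MOVAPD).
constexpr bool is_aligned(std::uintptr_t addr, std::uintptr_t alignment) {
    return (addr & (alignment - 1)) == 0;
}

// Address of the spill slot at -0x48(%ebp) for a given frame pointer.
constexpr std::uintptr_t spill_slot(std::uintptr_t ebp) {
    return ebp - 0x48;
}

// Illustrative (made-up) frame pointers:
// 0x22ff48 - 0x48 = 0x22ff00, a multiple of 0x10, so MOVAPD is safe.
static_assert(is_aligned(spill_slot(0x22ff48), 16), "aligned slot");
// 0x22ff40 - 0x48 = 0x22fef8, not a multiple of 0x10, so MOVAPD would fault.
static_assert(!is_aligned(spill_slot(0x22ff40), 16), "misaligned slot");
```

Whether the fault occurs therefore depends only on the low four bits of the
frame pointer at the time of the call, which is why it looks random.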



PS. I have to admit that I'm completely new to GCC, openMP and SSE (I just
learned about openMP today and I've been playing with them for only a couple
of hours), so I might just be doing something really stupid.


[Bug target/49001] GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned

2011-05-14 Thread npozar at quick dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001

Norbert Pozar  changed:

       What|Removed |Added
   Severity|major   |critical


[Bug target/49001] New: GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned

2011-05-14 Thread npozar at quick dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001

   Summary: GCC uses VMOVAPS/PD AVX instructions to access stack
variables that are not 32-byte aligned
   Product: gcc
   Version: 4.6.1
Status: UNCONFIRMED
  Severity: major
  Priority: P3
 Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: npo...@quick.cz


I'm using a custom mingw64 build of GCC 4.6.1. My target is 64-bit Windows. I
compile with g++ -O3 -march=corei7-avx -mtune=corei7-avx -mavx.

GCC uses the aligned moves VMOVAPS/PD from the new AVX instruction set to access
local variables of type __m256/__m256d on the stack. But the stack pointer is
only 16-byte aligned on Win64, so this causes a segmentation fault whenever
the stack pointer is not 32-byte aligned, as in:

__m256 dummy_ps256;
void test_stackalign32() {
    __m256 x = dummy_ps256;
    dummy_ps256 = sin256_ps_avx(x);
}

which compiles to 

vmovaps dummy_ps256(%rip), %ymm0
leaq    32(%rsp), %rdx
vmovaps %ymm0, 32(%rsp)  // possible SEGFAULT
leaq    64(%rsp), %rcx
vzeroupper
call    _Z13sin256_ps_avxDv8_f
vmovaps 64(%rsp), %ymm0  // possible SEGFAULT

I couldn't figure out how to realign the stack with -mstackrealign.


[Bug target/49001] GCC uses VMOVAPS/PD AVX instructions to access stack variables that are not 32-byte aligned

2011-05-15 Thread npozar at quick dot cz
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49001

--- Comment #3 from Norbert Pozar 2011-05-16 06:05:37 UTC ---
(In reply to comment #1)
> Please provide testcase that can be compiled without changes. See [1].

I'm sorry about this.

> Probably mingw64 specific problem... CC added.

Thank you for taking the time to test the code on Linux. I was worried that
this might be mingw64 specific.

(In reply to comment #2)
> Stack alignment isn't supported on Windows.

Since this bug effectively prevents using 256-bit AVX instructions when
compiling for Windows with GCC, I was wondering if there are any plans to
support stack alignment there. It seems that simply adding

andq $-32, %rsp

to the function prologue would fix this. Or would it be feasible to replace
VMOVAPS by unaligned VMOVUPS when accessing the stack?
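
To make the effect of that AND concrete, here is a small C++ sketch of the
arithmetic only. The function name align_down_32 and the sample stack-pointer
values are hypothetical. A real prologue would also have to keep the original
%rsp somewhere (for example in a frame pointer) so that it can be restored
before returning; that part is not shown here.

```cpp
#include <cstdint>

// Hypothetical model of "andq $-32, %rsp": clear the low five bits,
// rounding the stack pointer down to the nearest multiple of 32.
constexpr std::uint64_t align_down_32(std::uint64_t rsp) {
    // -32 as a 64-bit value is 0xFFFFFFFFFFFFFFE0, the mask used by andq.
    return rsp & static_cast<std::uint64_t>(-32);
}

// A 16-byte-aligned but not 32-byte-aligned value is moved down by 16.
static_assert(align_down_32(0x22fe30) == 0x22fe20, "rounded down");
// An already 32-byte-aligned value is left unchanged.
static_assert(align_down_32(0x22fe20) == 0x22fe20, "unchanged");
// The result is always a multiple of 32, so 32(%rsp) and 64(%rsp) are too.
static_assert(align_down_32(0x22fe38) % 32 == 0, "multiple of 32");
```

Because the stack grows downward, rounding down never touches the caller's
frame; it only wastes up to 31 bytes of the current one.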