[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2019-04-10 Thread dimula73 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

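Dmitry Kazakov <dimula73 at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dimula73 at gmail dot com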

--- Comment #25 from Dmitry Kazakov <dimula73 at gmail dot com> ---
Hi, all!

I would like to add one more test file related to the problem. When GCC calls a
function that takes a __m256 value as a by-value parameter, it spills the
argument to the stack using an **aligned** move (vmovaps), but the stack
alignment guarantee on 64-bit Windows is only 16 bytes. The application
therefore crashes whenever the argument slot is not 32-byte aligned.

Affected versions: GCC 7.3.0 (MinGW64), GCC 8.1.0 (MinGW64)

Here is the test source (also attached):

#include <immintrin.h>

struct X {
    alignas(32) __m256 d;
};

void g1(X);
void g2(const X&);
void g3(const void *);

void f(float *ptr) {
    X x = {_mm256_load_ps(ptr)};
    g1(x);  // BUG: passes via unaligned (whatever rsp alignment is) stack slot
    g2(x);  // OK: passes via aligned stack location
    g3(&x); // OK: passes via aligned stack location
}


Compiled result (-O2 -march=skylake):

_Z1fPf:
.LFB5135:
        pushq   %rbx
        .seh_pushreg    %rbx
        addq    $-128, %rsp
        .seh_stackalloc 128
        .seh_endprologue
        vmovaps (%rcx), %ymm0
        leaq    95(%rsp), %rbx
        leaq    32(%rsp), %rcx
        andq    $-32, %rbx
        vmovaps %ymm0, (%rbx)    # %rbx is properly aligned
        vmovaps %ymm0, 32(%rsp)  # %rsp may be unaligned
        vzeroupper
        call    _Z2g11X
        movq    %rbx, %rcx
        call    _Z2g2RK1X
        movq    %rbx, %rcx
        call    _Z2g3PKv
        nop
        subq    $-128, %rsp
        popq    %rbx
        ret
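
To make the address arithmetic in the listing concrete, here is a minimal sketch that replays the two address computations for two hypothetical %rsp values. The helper names and the sample addresses are assumptions made for illustration; only the constants 95, 32 and -32 come from the assembly above.

```cpp
#include <cstdint>

// Address of the aligned local copy of x, mirroring
//   leaq 95(%rsp), %rbx ; andq $-32, %rbx
constexpr std::uint64_t aligned_local(std::uint64_t rsp) {
    return (rsp + 95) & ~std::uint64_t{31};
}

// Address of the by-value argument copy made for g1, mirroring
//   vmovaps %ymm0, 32(%rsp)
constexpr std::uint64_t arg_copy(std::uint64_t rsp) {
    return rsp + 32;
}

constexpr bool is_32_aligned(std::uint64_t addr) {
    return addr % 32 == 0;
}

// Two hypothetical %rsp values; both satisfy the 16-byte guarantee.
constexpr std::uint64_t rsp_lucky   = 0x1000; // happens to be a multiple of 32
constexpr std::uint64_t rsp_unlucky = 0x1010; // 16-byte aligned only

// The rounded address is 32-byte aligned in both cases.
static_assert(is_32_aligned(aligned_local(rsp_lucky)), "local copy aligned");
static_assert(is_32_aligned(aligned_local(rsp_unlucky)), "local copy aligned");

// The argument copy is 32-byte aligned only when %rsp happens to be.
static_assert(is_32_aligned(arg_copy(rsp_lucky)), "works by luck");
static_assert(!is_32_aligned(arg_copy(rsp_unlucky)), "vmovaps would fault here");
```

With a 16-byte aligned %rsp, 32(%rsp) lands on a 32-byte boundary only when %rsp itself is a multiple of 32, which would explain why the crash depends on the call chain.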

Related bug in Vc library: https://github.com/VcDevel/Vc/issues/241
Related bug in Krita: https://bugs.kde.org/show_bug.cgi?id=406209

2019-04-10 Thread dimula73 at gmail dot com

--- Comment #26 from Dmitry Kazakov <dimula73 at gmail dot com> ---
Created attachment 46133
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46133&action=edit
Test source for unaligned pass-by-value crash

Test file for the comment above

2019-04-10 Thread dimula73 at gmail dot com

--- Comment #27 from Dmitry Kazakov <dimula73 at gmail dot com> ---
As a workaround, one can either mark *all* functions that accept __m256 by
value with __attribute__((always_inline)) or pass *all* such arguments by
const reference. Const-reference arguments are passed correctly.
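
A minimal sketch of both workarounds follows. To keep it compilable without AVX flags, it uses a hypothetical 32-byte aligned struct, Vec8, as a stand-in for a struct wrapping __m256; the function names are also invented for illustration and do not come from the test file.

```cpp
#include <cstdint>

// Hypothetical stand-in for a struct wrapping __m256:
// eight floats with 32-byte alignment.
struct alignas(32) Vec8 {
    float v[8];
};

// Workaround 1: pass by const reference. Only a pointer crosses the call
// boundary, and the callee reads the caller's aligned object directly.
float sum_by_ref(const Vec8& x) {
    float s = 0.0f;
    for (float f : x.v) s += f;
    return s;
}

// Workaround 2: force inlining (GCC/Clang attribute). Once the call is
// inlined, there should be no outgoing by-value argument slot to fill.
__attribute__((always_inline)) inline float sum_inlined(Vec8 x) {
    float s = 0.0f;
    for (float f : x.v) s += f;
    return s;
}

// True when the referenced object sits on a 32-byte boundary.
bool on_32_byte_boundary(const Vec8& x) {
    return reinterpret_cast<std::uintptr_t>(&x) % 32 == 0;
}
```

In real code the same pattern would apply to the actual __m256-carrying types. The always_inline variant has to be applied to every by-value function, so the const-reference form is usually easier to audit.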

2021-08-24 Thread dimula73 at gmail dot com via Gcc-bugs

--- Comment #31 from Dmitry Kazakov <dimula73 at gmail dot com> ---
Hi, all!

Just wanted to note that the bug is still present in GCC 10.3.0 on Windows
(from MSYS-MinGW64 packages).

> gcc (Rev5, Built by MSYS2 project) 10.3.0

2024-03-27 Thread dimula73 at gmail dot com via Gcc-bugs

--- Comment #42 from Dmitry Kazakov <dimula73 at gmail dot com> ---
Hi, Avraham!

> Does it remain true that the only option to get around this bug without 
> killing all AVX2 is to pass "-Wa,-muse-unaligned-vector-move" when compiling 
> using GCC on Windows 64? Thank you

I'm not sure about your particular issue, but in our case we managed to work
around this issue by passing AVX2-related structures by reference (or by
const reference, when possible).