https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91150
Bug ID: 91150
Summary: [10 Regression] wrong code with -O -mavx512vbmi due to wrong writemask
Product: gcc
Version: 10.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: zsojka at seznam dot cz
Target Milestone: ---
Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 46594
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46594&action=edit
reduced testcase

Output:
$ x86_64-pc-linux-gnu-gcc -O -mavx512vbmi testcase.c
$ sde64 -- ./a.out
Aborted

At the assembly level, the problem seems to be:

# testcase.c:12: {
        vpxor   xmm2, xmm2, xmm2        # tmp117
        mov     eax, 4294967295 # tmp119,
        vmovdqa64       zmm4, ZMMWORD PTR [rsp+8]       # tmp118, b
        kmovq   k1, rax # tmp119, tmp119
        vmovdqu8        zmm4{k1}, zmm2  # tmp118, tmp119, tmp118, tmp117
# testcase.c:11: a <<= (v64u64) (v64u128)
        vpsllvq zmm1, zmm1, zmm4        # a, tmp123, tmp118

vmovdqu8 uses the k1 mask to write zeros into the selected bytes of zmm4. The mask loaded here is 4294967295 (0xffffffff), which zeroes the low 32 bytes. The mask should be 0xffffffffffff0000 instead, so that only the lowest-order 16-byte word is kept.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-273353-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/10.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-273353-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.0.0 20190710 (experimental) (GCC)
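
To make the mask arithmetic above concrete, here is a minimal scalar sketch in portable C (no AVX-512 hardware or intrinsics needed). It is not taken from the attached testcase; the helper names zero_above_mask, masked_byte_move and preserved_bytes are hypothetical and exist only for illustration. It assumes the usual merge-masking behaviour of a byte-granular masked move: byte i of the destination is overwritten when bit i of the mask is set, and left unchanged otherwise.

```c
#include <stdint.h>
#include <string.h>

enum { VEC_BYTES = 64 };  /* a zmm register holds 64 bytes */

/* Hypothetical helper: writemask that selects every byte above the
   lowest keep_bytes bytes (bit i set = byte i gets overwritten). */
uint64_t zero_above_mask(unsigned keep_bytes)
{
    if (keep_bytes >= VEC_BYTES)
        return 0;                       /* nothing to overwrite */
    return ~UINT64_C(0) << keep_bytes;
}

/* Scalar model of a merge-masked byte move such as
   "vmovdqu8 dst{k}, src": copy src[i] only where bit i of k is set. */
void masked_byte_move(uint8_t *dst, const uint8_t *src, uint64_t k)
{
    for (unsigned i = 0; i < VEC_BYTES; i++)
        if ((k >> i) & 1)
            dst[i] = src[i];
}

/* Merge zeros into an all-0xff vector under mask k and report, as a
   bitmap, which bytes kept their original value. */
uint64_t preserved_bytes(uint64_t k)
{
    uint8_t dst[VEC_BYTES];
    uint8_t zero[VEC_BYTES] = {0};
    uint64_t kept = 0;

    memset(dst, 0xff, sizeof dst);
    masked_byte_move(dst, zero, k);
    for (unsigned i = 0; i < VEC_BYTES; i++)
        if (dst[i] == 0xff)
            kept |= UINT64_C(1) << i;
    return kept;
}
```

Under these assumptions, zero_above_mask(16) gives 0xffffffffffff0000, and preserved_bytes of that mask is 0xffff, i.e. only the low 16 bytes survive. With the emitted mask 0xffffffff, preserved_bytes returns 0xffffffff00000000: the low 32 bytes are cleared and the high 32 bytes are left untouched, which is the opposite of what the report expects.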