https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103252

--- Comment #7 from Jason A. Donenfeld <jason at zx2c4 dot com> ---
The strange thing in this case is that the non-avx512 codegen _doesn't_ spill
to memory. It just uses the GPRs that are available. So it seems that,
somehow, the mere existence of the mask registers causes the register allocator
to be lazier than usual, and the effects combine to produce suboptimal code.
