https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111905

--- Comment #4 from Kamil Kaznowski <kamkaz at windowslive dot com> ---
(In reply to Andrew Pinski from comment #2)
> Why are you using `-mprefer-vector-width=512` here?
> 
> 512 causes the loop to be needing to be unrolled once more and that is why
> the confusion happening.

I don't think the preferred vector width mattered there: the vector width in
use was already 512 bits (zmm registers).

I created a simpler example with no extra flags affecting the preferred vector
width, working on chunks of 256 bits (8 x 32-bit). The same issue still appears:
a lot of weirdly unrolled code that I presume (hope?) is dead, plus some extra
code at the start that I can't quite figure out.
