https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102294

--- Comment #33 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Mateusz Guzik from comment #32)
> For non-simd asm you can do at most 8 bytes per one mov instruction.
> 
> Stock gcc resorts to rep movsq for sizes bigger than 40 bytes. Telling it to
> not use rep movsq results in loops of 4 movsq instructions (aka 32 bytes per
> iteration).
> 
> An ok upper limit to still do this instead of punting to libcall is 256
> bytes.
> 
> In case of -mno-simd I'm advocating for issuing the 32-byte (aka 4 store)
> loops up to 256 bytes and punting to libcall otherwise.
> 
> Fully unrolling these would raise numerous eyebrows due to i-cache footprint
> and I don't believe this is warranted.

One store can move up to 64 bytes.
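
For reference, the strategy proposed in comment #32 can be sketched in C
roughly as follows. This is only an illustration under stated assumptions,
not what GCC emits: copy_sketch and INLINE_LIMIT are hypothetical names, the
256-byte cutoff is the figure from the quoted comment, and each fixed-size
8-byte memcpy stands in for a single 64-bit mov.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cutoff, taken from the figure quoted in comment #32. */
enum { INLINE_LIMIT = 256 };

/* Hypothetical helper: copy n bytes with 8-byte scalar moves only. */
static void copy_sketch(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Above the cutoff, punt to the library call. */
    if (n > INLINE_LIMIT) {
        memcpy(d, s, n);
        return;
    }

    /* Main loop: 4 loads and 4 stores of 8 bytes, i.e. 32 bytes per
       iteration. A fixed-size 8-byte memcpy is the portable way to
       express an unaligned 64-bit move. */
    while (n >= 32) {
        uint64_t a, b, c, e;
        memcpy(&a, s, 8);
        memcpy(&b, s + 8, 8);
        memcpy(&c, s + 16, 8);
        memcpy(&e, s + 24, 8);
        memcpy(d, &a, 8);
        memcpy(d + 8, &b, 8);
        memcpy(d + 16, &c, 8);
        memcpy(d + 24, &e, 8);
        s += 32;
        d += 32;
        n -= 32;
    }

    /* Tail: fewer than 32 bytes remain; copy them one byte at a time. */
    for (; n > 0; n--)
        *d++ = *s++;
}
```

With wider stores the loop body would need fewer instructions for the same
number of bytes per iteration.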
