https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908

--- Comment #44 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #43)
> One thing I found by experiments:
> Insert 64 vaddps %xmm18, %xmm19, %xmm20(no dependence between each other,
> just emulate for pipeline) before stalled load, stlf stall case is as fast
> as no stall cases on CLX. I guess this is "distance" you mean.
> 
But the event for STLF blocks is still counted, so I guess the processor's
scheduler helps here.
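
For readers unfamiliar with the pattern, here is a minimal C sketch of the kind of access that typically causes a store-to-load forwarding (STLF) block: two narrow stores followed by one wide load that spans both. This is an illustration only, not the testcase from this PR; the function name is hypothetical, and the empty asm statement is a GCC/Clang extension used here as an assumption to keep the compiler from folding the stores and the load into registers.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, for illustration only (not from the PR). */
uint64_t narrow_stores_then_wide_load(uint32_t lo, uint32_t hi)
{
    uint32_t buf[2];
    uint64_t wide;

    /* Two separate 4-byte stores. */
    buf[0] = lo;
    buf[1] = hi;

    /* Compiler barrier (GCC/Clang extension): forces the stores above to be
       emitted to memory and the load below to read from memory. */
    __asm__ volatile("" : : "r"(buf) : "memory");

    /* One 8-byte load covering both stores. Neither store alone contains
       all the bytes the load needs, so the load typically cannot be
       forwarded from the store buffer. */
    memcpy(&wide, buf, sizeof wide);
    return wide;
}
```

The experiment quoted above corresponds to placing independent instructions between the stores and the wide load; how much of the stall that hides is hardware dependent.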
