On 08/17/2016 07:30 PM, Christian König wrote:
>> But in my measurements POPF is not fast even in the case
>> where restored flags are not changed at all:
>>
>>         mov     $200*1000*1000, %eax
>>         pushf
>>         pop     %rbx
>>         .balign 64
>> loop:   push    %rbx
>>         popf
>>         dec     %eax
>>         jnz     loop
>>
>> # perf stat -r20 ./popf_1g
>>      4,929,012,093      cycles          #    3.412 GHz             ( +-  0.02% )
>>        835,721,371      instructions    #    0.17  insn per cycle  ( +-  0.02% )
>>        1.446185359 seconds time elapsed                            ( +-  0.46% )
>>
>> If I replace POPF with a pop into an unused register, I get this:
>
> You are comparing apples and bananas here.

Yes, I know. Pop into a register here is basically free. I'd also add an STI and measure how long it takes to enable interrupts, but unfortunately STI raises a #GP at CPL 3.

