> In the x86/x86-64 world one can be almost sure that the load+execute instruction
> pair will execute (marginally to noticeably) faster than move+load-and-execute
> instruction pair as the more complex instructions are harder for on-chip
> scheduling (they retire later).
                               ^^^ retirement filling up the scheduler
                               easily.
> Perhaps we can move such a transformation somewhere more generic, perhaps to
> post-reload copyprop?
> 
> Honza
