On Wed, Sep 25, 2024 at 01:41:51PM +0200, Richard Biener wrote:
> wide_int_storage shows up high in the profile for the testcase in
> PR114855 where the apparent issue is that the conditional jump
> on 'precision' after the (inlined) memcpy stalls the pipeline due
> to the data dependence and required store-to-load forwarding.  We
> can add scheduling freedom by instead testing precision as from the
> source which speeds up the function by 30%.  I've applied the
> same logic to the copy CTOR.
>
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>
> 	* wide-int.h (wide_int_storage::wide_int_storage): Branch
> 	on source precision to avoid data dependence on memcpy
> 	destination.
> 	(wide_int_storage::operator=): Likewise.
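
For readers without wide-int.h at hand, the idea described above can be sketched roughly as follows. This is a simplified stand-alone model, not the actual GCC code: the Storage struct, kInlineBits, words_for, copy_storage and free_storage are hypothetical names invented for illustration, and the sketch assumes dst and src are distinct objects.

```cpp
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical limits; the real wide-int.h constants differ.
constexpr unsigned kWordBits = 64;
constexpr unsigned kInlineBits = 128;

// Simplified stand-in for wide_int_storage: small values live inline,
// large ones in a heap buffer, selected by 'precision'.
struct Storage {
  union {
    int64_t val[kInlineBits / kWordBits];  // inline words
    int64_t *valp;                         // heap words for large precision
  } u;
  unsigned len;
  unsigned precision;
};

inline unsigned words_for(unsigned precision) {
  return (precision + kWordBits - 1) / kWordBits;
}

// Copy src into dst (assumed to be a different object).
void copy_storage(Storage &dst, const Storage &src) {
  std::memcpy(&dst, &src, sizeof(Storage));
  // Branch on src.precision rather than dst.precision.  The value is the
  // same, but the load no longer depends on the stores done by the memcpy
  // just above, so no store-to-load forwarding is needed before the jump.
  if (src.precision > kInlineBits) {
    dst.u.valp = static_cast<int64_t *>(
        std::malloc(words_for(src.precision) * sizeof(int64_t)));
    std::memcpy(dst.u.valp, src.u.valp, src.len * sizeof(int64_t));
  }
}

// Release the heap buffer, if any.
void free_storage(Storage &s) {
  if (s.precision > kInlineBits)
    std::free(s.u.valp);
}
```

The only difference from the "before" shape is which object the condition reads from; the result of the copy is unchanged. Whether this gives a similar speedup outside the PR114855 testcase is not something this sketch demonstrates.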

LGTM.

	Jakub