https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94037

--- Comment #10 from ncm at cantrip dot org ---
(In reply to Uroš Bizjak from comment #9)
> (In reply to ncm from comment #8)
> > It seems worth mentioning that the round trip through 
> > L1 cache is just a workaround for the optimizer refusing 
> > to ever emit two CMOV instructions in a basic block.
> > 
> > Recognizing and replacing the construct with CMOVs 
> > explicitly would speed up a great many algorithms.
>
> Not universally. See PR56309.

I am aware of that report.

Transforming this rendition of swap_if as suggested
would not create any _new_ dependencies, so it may be done 
without fear of introducing regressions.

Actually using this version of swap_if in algorithms
requires careful consideration of whether it may build
such dependency chains, but its use in partitioning,
specifically, has been proven safe.
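
For readers without the original report at hand, here is a minimal sketch of
the kind of construct under discussion. It assumes the swap_if in question is
the array-indexed form that routes both values through a small stack array
(the "round trip through L1 cache"). The exact code in the report may differ,
and partition_branchless is a hypothetical name used only for illustration.

```cpp
#include <cstddef>

// Assumed array-indexed swap_if: both values are stored to a two-element
// stack array and reloaded by index, so no branch is needed. Returns c.
inline bool swap_if(bool c, int& a, int& b) {
    int v[2] = { a, b };
    a = v[c];    // c == true: a receives the old b
    b = v[!c];   // c == true: b receives the old a
    return c;
}

// Hypothetical Lomuto-style partition built on swap_if. Elements less than
// pivot end up in data[0 .. result); the result is the count of such elements.
inline std::size_t partition_branchless(int* data, std::size_t n, int pivot) {
    std::size_t left = 0;
    for (std::size_t right = 0; right < n; ++right) {
        // The condition compares a not-yet-visited element with the fixed pivot.
        left += swap_if(data[right] < pivot, data[left], data[right]);
    }
    return left;
}
```

In this sketch the condition passed to swap_if in each iteration does not
depend on the values swapped in earlier iterations, which is the property
that matters for the dependency-chain concern above.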
