Hi Kyrill,

Because the expansion now emits straight-line code rather than conditionals and branches, it should be easier to optimise in general, so I'd expect this to be an improvement overall.
That said, I have benchmarked it on SPEC2017 on aarch64.
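
For illustration only, here is a minimal C sketch of the difference between a branching and a straight-line three-argument max. The function names max3_branchy and max3_straight are hypothetical, and this is not the code the patch actually emits; it ignores NaN handling entirely.

```c
/* Branching form: nested conditionals, so control flow depends on the data. */
static double max3_branchy(double a, double b, double c)
{
    if (a > b) {
        if (a > c)
            return a;
        return c;
    }
    if (b > c)
        return b;
    return c;
}

/* Straight-line form: a chain of selects with no explicit control flow.
   A compiler can typically map each select to a max instruction or a
   conditional move, and the chain is easier to vectorise. */
static double max3_straight(double a, double b, double c)
{
    double m = a;
    m = (m > b) ? m : b;   /* select the larger of m and b */
    m = (m > c) ? m : c;   /* select the larger of m and c */
    return m;
}
```

Both functions return the same value for ordinary (non-NaN) inputs; whether the second form is actually faster depends on the target and the optimiser.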

If you have any benchmarks of interest that you (or somebody else) can run on a target that you
care about, I would be very grateful for any results.

Well, most people currently use x86_64 for scientific computing, so I
would be concerned most about this architecture. As for the test case,
min / max performance clearly has an effect on 521.wrf, so this would
be ideal.

If you could run 521.wrf on x86_64, and find that it does not
regress measurably (or even shows an improvement), the patch is OK.
I'd be interested in the timings you get.

Regards

        Thomas
