[RFC] Aarch64: Replace nested FP min/max with conditionals for TX2

Anton Youdkevitch Wed, 09 Sep 2020 08:51:32 -0700

The ThunderX2 (TX2) chip has an odd property: nested scalar FP min and max
operations are slower than the logically equivalent sequence of compares and
branches.


Attached is a patch in which I try to implement that transformation.
Please advise whether the "combine" pass (actually, right after the pass
itself) is the appropriate place to do this.

I considered implementing this in aarch64.md (which would be much
cleaner), but could not figure out how to make fmin/fmax survive until
the later passes and replace them only then.

-- 
  Thanks,
  Anton

0001-WIP-MIN-to-conditionals-1.patch
Description: Binary data
