kbsmith1 added a comment.

In D99675#2671924 <https://reviews.llvm.org/D99675#2671924>, @efriedma wrote:

>> The expression “llvm.arith.fence(a * b) + c” means that “a * b” must happen 
>> before “+ c” and FMA guarantees that, but to prevent later optimizations 
>> from unpacking the FMA the correct transformation needs to be:
>>
>> llvm.arith.fence(a * b) + c  →  llvm.arith.fence(FMA(a, b, c))
>
> Does this actually block later transforms from unpacking the FMA?  Maybe if 
> the FMA isn't marked "fast"...

I think we could define llvm.arith.fence such that this FMA contraction 
isn't legal/correct, or it could be left as is.  In the implementation that was 
used for the Intel compiler, FMA contraction did not occur across a __fence 
boundary.  It is unclear whether that was intended as the semantics, or whether 
we just never bothered to implement that contraction.
Not allowing FMA contraction across llvm.arith.fence would mean that 
unpacking an FMA is allowed under exactly the same circumstances in which LLVM 
currently allows it.
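
To make concrete why contracting or unpacking an FMA is observable at all, here 
is a minimal sketch.  It is not LLVM code: Python is used only for 
illustration, and fused_mul_add is a hypothetical helper that emulates a 
single-rounding FMA with exact rational arithmetic.  The input values are my 
own choice, picked so that the product needs more bits than a double holds.

```python
from fractions import Fraction


def mul_then_add(a: float, b: float, c: float) -> float:
    # Two roundings: the product is rounded to double, then the sum is.
    product = a * b
    return product + c


def fused_mul_add(a: float, b: float, c: float) -> float:
    # Hypothetical helper (an assumption, not an LLVM or libm API):
    # compute a*b + c exactly, then round once, as a hardware FMA does.
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Illustrative inputs: the exact product is 1 - 2**-54, which is not
# representable as a double and rounds to 1.0 when computed on its own.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

print(mul_then_add(a, b, c))   # product is rounded before the add
print(fused_mul_add(a, b, c))  # the low bits of the product survive
```

The two functions disagree on these inputs, which is why it matters whether a 
fence permits the multiply and the add to be fused, and whether a fused 
operation may later be split again.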

> ----
>
> How is llvm.arith.fence() different from using "freeze" on a floating-point 
> value?  The goal isn't really the same, sure, but the effects seem similar at 
> first glance.

They are similar.  However, freeze is a no-op if the operand can be proven not 
to be undef or poison, and in such circumstances it could be removed by an 
optimizer.  llvm.arith.fence cannot be removed by an optimizer, because doing 
so might allow instructions that were "outside" the fence to be 
reassociated/distributed with the instructions/operands that were inside the 
fence.
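
As an illustration of what removing the fence could change, here is a minimal 
sketch.  Again this is Python rather than LLVM IR, and the function names and 
input values are hypothetical.  It compares evaluating the "fenced" 
subexpression first against one reassociation that an optimizer working under 
fast-math rules could otherwise choose.

```python
def fenced_order(big: float, small: float) -> float:
    # The subexpression (big + small) plays the role of the fenced value:
    # it is computed and rounded first, then combined with the outside term.
    fenced = big + small
    return fenced - big


def reassociated_order(big: float, small: float) -> float:
    # One possible reassociation once nothing keeps the subexpression
    # intact: (big - big) + small.
    return (big - big) + small


# Illustrative inputs: at 2**53 adjacent doubles are 2 apart, so adding
# 1.0 is lost to rounding in the fenced order.
big = 2.0**53
small = 1.0

print(fenced_order(big, small))
print(reassociated_order(big, small))
```

Both orders are algebraically equal, but they round differently, so a fence 
that could be dropped once its operand is known to be well defined would not 
give the guarantee described above.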


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99675/new/

https://reviews.llvm.org/D99675
