hfinkel added inline comments.

================
Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220
+/// \brief FP_CONTRACT mode (on/off/fast).
+ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP 
contraction type")
 LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment")
----------------
yaxunl wrote:
> anemet wrote:
> > yaxunl wrote:
> > > anemet wrote:
> > > > yaxunl wrote:
> > > > > This change seemed to cause some performance degradations in some 
> > > > > OpenCL applications.
> > > > > 
> > > > > This option used to be on by default. Now it is off by default.
> > > > > 
> > > > > There are applications do separated compile/link/codegen stages. In 
> > > > > the codegen stage, clang is invoked with .bc as input, therefore this 
> > > > > option is off by default, whereas it was on by default before this 
> > > > > change.
> > > > > 
> > > > > Is there any reason not to keep the original behavior?
> > > > > 
> > > > > Thanks.
> > > > > This change seemed to cause some performance degradations in some 
> > > > > OpenCL applications.
> > > > > 
> > > > > This option used to be on by default. Now it is off by default.
> > > > 
> > > > It's always been off.  It was set to fast for CUDA which should still 
> > > > be the case.  See the changes further down on the patch.
> > > > 
> > > > > 
> > > > > There are applications do separated compile/link/codegen stages. In 
> > > > > the codegen stage, clang is invoked with .bc as input, therefore this 
> > > > > option is off by default, whereas it was on by default before this 
> > > > > change.
> > > > > 
> > > > > Is there any reason not to keep the original behavior?
> > > > 
> > > > Sorry but I am not sure what changed, can you elaborate on what you're 
> > > > doing and how things used to work for you?
> > > Before your change, -ffp-contract was a codegen option defined as
> > > 
> > > 
> > > ```
> > > ENUM_CODEGENOPT(FPContractMode, FPContractModeKind, 2, FPC_On)
> > > ```
> > > 
> > > therefore the default value as on. After your change, it becomes off by 
> > > default.
> > No. -ffp-contract=on was a FE option and -ffp-contract=fast was a backend 
> > option.  I still don't understand your use-case.  Can you include a small 
> > testcase how this used to work before?
> -ffp-contract=on used to be a codegen option before this change. In 
> CodeGen/BackendUtil.cpp, before this change:
> 
> ```
> switch (CodeGenOpts.getFPContractMode()) {
>   switch (LangOpts.getDefaultFPContractMode()) {
>   case LangOptions::FPC_Off:
>     Options.AllowFPOpFusion = llvm::FPOpFusion::Strict;
>     break;
>   case LangOptions::FPC_On:
>     Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
>     break;
>   case LangOptions::FPC_Fast:
>     Options.AllowFPOpFusion = llvm::FPOpFusion::Fast;
>     break;
>   }
> ```
> By default, -fp-contract=on, which results in llvm::FPOpFusion::Standard in 
> the backend. This options allows backend to translate llvm.fmuladd to FMA 
> machine instructions.
> 
> After this change, it becomes:
> 
> ```
> switch (LangOpts.getDefaultFPContractMode()) {
>   case LangOptions::FPC_Off:
>     Options.AllowFPOpFusion = llvm::FPOpFusion::Strict;
>     break;
>   case LangOptions::FPC_On:
>     Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
>     break;
>   case LangOptions::FPC_Fast:
>     Options.AllowFPOpFusion = llvm::FPOpFusion::Fast;
>     break;
>   }
> ```
> now by default -ffp-contract=off, which results in  llvm::FPOpFusion::Strict 
> in the backend. This option disables backend translating llvm.fmuladd to FMA 
> machine instructions in certain situations.
> 
> A simple lit test is as follows:
> 
> 
> ```
> ; RUN: %clang_cc1 -triple amdgcn -S -o - %s| FileCheck %s
> 
> define amdgpu_kernel void @f(double addrspace(1)* nocapture %out, double %a, 
> double %b, double %c) local_unnamed_addr #0 {
> entry:
>   ; CHECK: v_fma_f64
>   %0 = tail call double @llvm.fmuladd.f64(double %b, double %c, double %a)
>   store double %0, double addrspace(1)* %out, align 8, !tbaa !6
>   ret void
> }
> 
> declare double @llvm.fmuladd.f64(double, double, double) #1
> 
> attributes #0 = { nounwind "correctly-rounded-divide-sqrt-fp-math"="false" 
> "disable-tail-calls"="false" "less-precise-fpmad"="false" 
> "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" 
> "no-jump-tables"="false" "no-nans-fp-math"="false" 
> "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" 
> "stack-protector-buffer-size"="8" 
> "target-features"="+fp64-fp16-denormals,-fp32-denormals" 
> "unsafe-fp-math"="false" "use-soft-float"="false" }
> attributes #1 = { nounwind readnone }
> 
> ```
> The test passes before this change and fails after this change.
I'm missing something here. We're calling for OpenCL:

  Opts.setDefaultFPContractMode(LangOptions::FPC_On);

which seems like it should change the result returned by 
getDefaultFPContractMode(). Why does it not?



Repository:
  rL LLVM

https://reviews.llvm.org/D31167



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to