hfinkel added inline comments.
================ Comment at: cfe/trunk/include/clang/Basic/LangOptions.def:220 +/// \brief FP_CONTRACT mode (on/off/fast). +ENUM_LANGOPT(DefaultFPContractMode, FPContractModeKind, 2, FPC_Off, "FP contraction type") LANGOPT(NoBitFieldTypeAlign , 1, 0, "bit-field type alignment") ---------------- yaxunl wrote: > anemet wrote: > > yaxunl wrote: > > > anemet wrote: > > > > yaxunl wrote: > > > > > This change seemed to cause some performance degradations in some > > > > > OpenCL applications. > > > > > > > > > > This option used to be on by default. Now it is off by default. > > > > > > > > > > There are applications do separated compile/link/codegen stages. In > > > > > the codegen stage, clang is invoked with .bc as input, therefore this > > > > > option is off by default, whereas it was on by default before this > > > > > change. > > > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > > > > > Thanks. > > > > > This change seemed to cause some performance degradations in some > > > > > OpenCL applications. > > > > > > > > > > This option used to be on by default. Now it is off by default. > > > > > > > > It's always been off. It was set to fast for CUDA which should still > > > > be the case. See the changes further down on the patch. > > > > > > > > > > > > > > There are applications do separated compile/link/codegen stages. In > > > > > the codegen stage, clang is invoked with .bc as input, therefore this > > > > > option is off by default, whereas it was on by default before this > > > > > change. > > > > > > > > > > Is there any reason not to keep the original behavior? > > > > > > > > Sorry but I am not sure what changed, can you elaborate on what you're > > > > doing and how things used to work for you? > > > Before your change, -ffp-contract was a codegen option defined as > > > > > > > > > ``` > > > ENUM_CODEGENOPT(FPContractMode, FPContractModeKind, 2, FPC_On) > > > ``` > > > > > > therefore the default value as on. After your change, it becomes off by > > > default. > > No. -ffp-contract=on was a FE option and -ffp-contract=fast was a backend > > option. I still don't understand your use-case. Can you include a small > > testcase how this used to work before? > -ffp-contract=on used to be a codegen option before this change. In > CodeGen/BackendUtil.cpp, before this change: > > ``` > switch (CodeGenOpts.getFPContractMode()) { > switch (LangOpts.getDefaultFPContractMode()) { > case LangOptions::FPC_Off: > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; > break; > case LangOptions::FPC_On: > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; > break; > case LangOptions::FPC_Fast: > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; > break; > } > ``` > By default, -fp-contract=on, which results in llvm::FPOpFusion::Standard in > the backend. This options allows backend to translate llvm.fmuladd to FMA > machine instructions. > > After this change, it becomes: > > ``` > switch (LangOpts.getDefaultFPContractMode()) { > case LangOptions::FPC_Off: > Options.AllowFPOpFusion = llvm::FPOpFusion::Strict; > break; > case LangOptions::FPC_On: > Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; > break; > case LangOptions::FPC_Fast: > Options.AllowFPOpFusion = llvm::FPOpFusion::Fast; > break; > } > ``` > now by default -ffp-contract=off, which results in llvm::FPOpFusion::Strict > in the backend. This option disables backend translating llvm.fmuladd to FMA > machine instructions in certain situations. > > A simple lit test is as follows: > > > ``` > ; RUN: %clang_cc1 -triple amdgcn -S -o - %s| FileCheck %s > > define amdgpu_kernel void @f(double addrspace(1)* nocapture %out, double %a, > double %b, double %c) local_unnamed_addr #0 { > entry: > ; CHECK: v_fma_f64 > %0 = tail call double @llvm.fmuladd.f64(double %b, double %c, double %a) > store double %0, double addrspace(1)* %out, align 8, !tbaa !6 > ret void > } > > declare double @llvm.fmuladd.f64(double, double, double) #1 > > attributes #0 = { nounwind "correctly-rounded-divide-sqrt-fp-math"="false" > "disable-tail-calls"="false" "less-precise-fpmad"="false" > "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" > "no-jump-tables"="false" "no-nans-fp-math"="false" > "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" > "stack-protector-buffer-size"="8" > "target-features"="+fp64-fp16-denormals,-fp32-denormals" > "unsafe-fp-math"="false" "use-soft-float"="false" } > attributes #1 = { nounwind readnone } > > ``` > The test passes before this change and fails after this change. I'm missing something here. We're calling for OpenCL: Opts.setDefaultFPContractMode(LangOptions::FPC_On); which seems like it should change the result returned by getDefaultFPContractMode(). Why does it not? Repository: rL LLVM https://reviews.llvm.org/D31167 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits