[clang-tools-extra] [clang-tidy] Improve `google-explicit-constructor` checks handling of `explicit(bool)` (PR #82689)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/82689

>From 6bf8d649fb78b81adbaa7de82064161e14213fc4 Mon Sep 17 00:00:00 2001
From: AMS21
Date: Thu, 22 Feb 2024 22:10:09 +0100
Subject: [PATCH] [clang-tidy] Improve `google-explicit-constructor` checks handling of `explicit(bool)`

We now treat `explicit(false)` the same way we treat `noexcept(false)` in
the noexcept checks, which is ignoring it.

Also introduced a new warning message if a constructor has an `explicit`
declaration which evaluates to false and no longer emit a faulty FixIt.

Fixes #81121
---
 .../google/ExplicitConstructorCheck.cpp | 33 ---
 clang-tools-extra/docs/ReleaseNotes.rst | 4 ++
 .../google/explicit-constructor-cxx20.cpp | 59 +++
 3 files changed, 88 insertions(+), 8 deletions(-)
 create mode 100644 clang-tools-extra/test/clang-tidy/checkers/google/explicit-constructor-cxx20.cpp

diff --git a/clang-tools-extra/clang-tidy/google/ExplicitConstructorCheck.cpp b/clang-tools-extra/clang-tidy/google/ExplicitConstructorCheck.cpp
index 34d49af9f81e23..6f26de9881357f 100644
--- a/clang-tools-extra/clang-tidy/google/ExplicitConstructorCheck.cpp
+++ b/clang-tools-extra/clang-tidy/google/ExplicitConstructorCheck.cpp
@@ -79,8 +79,10 @@ static bool isStdInitializerList(QualType Type) {
 }
 
 void ExplicitConstructorCheck::check(const MatchFinder::MatchResult &Result) {
-  constexpr char WarningMessage[] =
+  constexpr char NoExpressionWarningMessage[] =
       "%0 must be marked explicit to avoid unintentional implicit conversions";
+  constexpr char WithExpressionWarningMessage[] =
+      "%0 explicit expression evaluates to 'false'";
 
   if (const auto *Conversion =
           Result.Nodes.getNodeAs<CXXConversionDecl>("conversion")) {
@@ -91,7 +93,7 @@ void ExplicitConstructorCheck::check(const MatchFinder::MatchResult &Result) {
     // gmock to define matchers).
     if (Loc.isMacroID())
       return;
-    diag(Loc, WarningMessage)
+    diag(Loc, NoExpressionWarningMessage)
         << Conversion << FixItHint::CreateInsertion(Loc, "explicit ");
     return;
   }
@@ -101,9 +103,11 @@ void ExplicitConstructorCheck::check(const MatchFinder::MatchResult &Result) {
       Ctor->getMinRequiredArguments() > 1)
     return;
 
+  const ExplicitSpecifier ExplicitSpec = Ctor->getExplicitSpecifier();
+
   bool TakesInitializerList = isStdInitializerList(
       Ctor->getParamDecl(0)->getType().getNonReferenceType());
-  if (Ctor->isExplicit() &&
+  if (ExplicitSpec.isExplicit() &&
       (Ctor->isCopyOrMoveConstructor() || TakesInitializerList)) {
     auto IsKwExplicit = [](const Token &Tok) {
       return Tok.is(tok::raw_identifier) &&
@@ -130,18 +134,31 @@ void ExplicitConstructorCheck::check(const MatchFinder::MatchResult &Result) {
     return;
   }
 
-  if (Ctor->isExplicit() || Ctor->isCopyOrMoveConstructor() ||
+  if (ExplicitSpec.isExplicit() || Ctor->isCopyOrMoveConstructor() ||
       TakesInitializerList)
     return;
 
-  bool SingleArgument =
+  // Don't complain about explicit(false) or dependent expressions
+  const Expr *ExplicitExpr = ExplicitSpec.getExpr();
+  if (ExplicitExpr) {
+    ExplicitExpr = ExplicitExpr->IgnoreImplicit();
+    if (isa<CXXBoolLiteralExpr>(ExplicitExpr) ||
+        ExplicitExpr->isInstantiationDependent())
+      return;
+  }
+
+  const bool SingleArgument =
       Ctor->getNumParams() == 1 && !Ctor->getParamDecl(0)->isParameterPack();
   SourceLocation Loc = Ctor->getLocation();
-  diag(Loc, WarningMessage)
+  auto Diag =
+      diag(Loc, ExplicitExpr ? WithExpressionWarningMessage
+                             : NoExpressionWarningMessage)
       << (SingleArgument
               ? "single-argument constructors"
-              : "constructors that are callable with a single argument")
-      << FixItHint::CreateInsertion(Loc, "explicit ");
+              : "constructors that are callable with a single argument");
+
+  if (!ExplicitExpr)
+    Diag << FixItHint::CreateInsertion(Loc, "explicit ");
 }
 
 } // namespace clang::tidy::google
diff --git a/clang-tools-extra/docs/ReleaseNotes.rst b/clang-tools-extra/docs/ReleaseNotes.rst
index 3f90e7d63d6b23..f2df3d0d737c6b 100644
--- a/clang-tools-extra/docs/ReleaseNotes.rst
+++ b/clang-tools-extra/docs/ReleaseNotes.rst
@@ -151,6 +151,10 @@ Changes in existing checks
   ` check by replacing the local
   option `HeaderFileExtensions` by the global option of the same name.
 
+- Improved :doc:`google-explicit-constructor
+  ` check to better handle
+  ``C++-20`` `explicit(bool)`.
+
 - Improved :doc:`google-global-names-in-headers
   ` check by replacing the local
   option `HeaderFileExtensions` by the global option of the same name.
diff --git a/clang-tools-extra/test/clang-tidy/checkers/google/explicit-constructor-cxx20.cpp b/clang-tools-extra/test/clang-tidy/checkers/google/explicit-constructor-cxx20.cpp
new file mode 100644
index 00..95206f1ef420c3
--- /dev/null
+++ b/clang-tools-extra/test/clang-tidy/checkers/go
[mlir] [llvm] [clang] [AArch64] Replace LLVM IR function attributes for PSTATE.ZA. (PR #79166)
@@ -1098,11 +1098,15 @@ LogicalResult ModuleTranslation::convertOneFunction(LLVMFuncOp func) {
     llvmFunc->addFnAttr("aarch64_pstate_sm_compatible");
 
   if (func.getArmNewZa())
-    llvmFunc->addFnAttr("aarch64_pstate_za_new");
-  else if (func.getArmSharedZa())
-    llvmFunc->addFnAttr("aarch64_pstate_za_shared");
+    llvmFunc->addFnAttr("aarch64_new_za");
+  else if (func.getArmInZa())
+    llvmFunc->addFnAttr("aarch64_in_za");
+  else if (func.getArmOutZa())
+    llvmFunc->addFnAttr("aarch64_out_za");
+  else if (func.getArmInoutZa())
+    llvmFunc->addFnAttr("aarch64_inout_za");
   if (func.getArmPreservesZa())
-    llvmFunc->addFnAttr("aarch64_pstate_za_preserved");
+    llvmFunc->addFnAttr("aarch64_preserves_za");

MacDue wrote:

(it used to be that `aarch64_pstate_za_preserved` was used with `aarch64_pstate_za_shared`, which is why the import was originally like this)

https://github.com/llvm/llvm-project/pull/79166
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [mlir] Move several vector intrinsics out of experimental namespace (PR #88748)
MacDue wrote:

> @c-rhodes the remaining failure is MLIR ::
> Conversion/TosaToTensor/tosa-to-tensor.mlir but it also fails without my
> changes.

Very likely unrelated; we only added `interleave2` for use in ArmSME. It's not used within high-level dialects like tosa & tensor.

https://github.com/llvm/llvm-project/pull/88748
[clang] [clang][codegen] Fix possible crash when setting TBAA metadata on FP math libcalls (PR #108575)
https://github.com/MacDue created https://github.com/llvm/llvm-project/pull/108575

There's currently no code path that can reach this crash, but:

```
Instruction *Inst = cast<Instruction>(Call.getScalarVal());
```

fails if the call returns `void`. This could happen if a builtin for something like `void sincos(double, double*, double*)` is added to clang.

Instead, use the `llvm::CallBase` returned from `EmitCall()` to set the TBAA metadata, which should exist no matter the return type.

>From a2f1bb60ecd31e8a52e29de60d7615abbe22160f Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Fri, 13 Sep 2024 14:06:37 +
Subject: [PATCH] [clang][codegen] Fix possible crash when setting TBAA metadata on FP math libcalls

There's currently no code path that can reach this crash, but:

```
Instruction *Inst = cast<Instruction>(Call.getScalarVal());
```

fails if the call returns `void`. This could happen if a builtin for something like `void sincos(double, double*, double*)` is added to clang.

Instead, use the `llvm::CallBase` returned from `EmitCall()` to set the TBAA metadata, which should exist no matter the return type.
---
 clang/lib/CodeGen/CGBuiltin.cpp | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 27abeba92999b3..d4c7eea3d20b24 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -690,8 +690,10 @@ static RValue emitLibraryCall(CodeGenFunction &CGF, const FunctionDecl *FD,
                               const CallExpr *E, llvm::Constant *calleeValue) {
   CodeGenFunction::CGFPOptionsRAII FPOptsRAII(CGF, E);
   CGCallee callee = CGCallee::forDirect(calleeValue, GlobalDecl(FD));
+  llvm::CallBase *callOrInvoke = nullptr;
   RValue Call =
-      CGF.EmitCall(E->getCallee()->getType(), callee, E, ReturnValueSlot());
+      CGF.EmitCall(E->getCallee()->getType(), callee, E, ReturnValueSlot(),
+                   /*Chain=*/nullptr, &callOrInvoke);
 
   if (unsigned BuiltinID = FD->getBuiltinID()) {
     // Check whether a FP math builtin function, such as BI__builtin_expf
@@ -705,8 +707,7 @@ static RValue emitLibraryCall(CodeGenFunction &CGF, const FunctionDecl *FD,
       // Emit "int" TBAA metadata on FP math libcalls.
       clang::QualType IntTy = Context.IntTy;
       TBAAAccessInfo TBAAInfo = CGF.CGM.getTBAAAccessInfo(IntTy);
-      Instruction *Inst = cast<Instruction>(Call.getScalarVal());
-      CGF.CGM.DecorateInstructionWithTBAA(Inst, TBAAInfo);
+      CGF.CGM.DecorateInstructionWithTBAA(callOrInvoke, TBAAInfo);
     }
   }
   return Call;
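The shape of the fix can be illustrated without any LLVM headers. The sketch below is a toy model under stated assumptions: `CallInst`, `EmittedCall` and both helper functions are hypothetical stand-ins, not clang or LLVM types. The real code asserts inside `cast<>` in the void case, whereas this model just returns `false`.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins for illustration; these are not clang or LLVM types.
struct CallInst {
  std::string callee;
  std::string tbaaTag;  // empty until a tag is attached
};

struct EmittedCall {
  CallInst call;                 // always present, whatever the return type is
  std::optional<int> scalarVal;  // empty when the callee returns void
};

// Old approach: reach the call through its scalar result. When the callee
// returns void there is no result to go through, so tagging cannot happen
// (the real code would hit an assertion at this point).
inline bool tagViaScalarResult(EmittedCall &e, const std::string &tag) {
  if (!e.scalarVal)
    return false;
  e.call.tbaaTag = tag;
  return true;
}

// Approach described in the patch: tag the call instruction itself, which
// exists regardless of the return type.
inline void tagViaCall(EmittedCall &e, const std::string &tag) {
  e.call.tbaaTag = tag;
}
```

For a `sincos`-like callee (empty `scalarVal`), `tagViaScalarResult` cannot attach the tag while `tagViaCall` can.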
[clang] [clang][codegen] Fix possible crash when setting TBAA metadata on FP math libcalls (PR #108575)
https://github.com/MacDue closed https://github.com/llvm/llvm-project/pull/108575
[clang] [clang][codegen] Fix possible crash when setting TBAA metadata on FP math libcalls (PR #108575)
MacDue wrote:

Thanks for letting me know, feel free to revert this change if it's a non-trivial breakage.

https://github.com/llvm/llvm-project/pull/108575
[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)
https://github.com/MacDue created https://github.com/llvm/llvm-project/pull/108853

On some targets, an FP libcall with argument types such as long double will be lowered to pass arguments indirectly via pointers. When this is the case we should not mark the libcall with "int" TBAA as it may lead to incorrect optimizations.

Currently, this can be seen for long doubles on x86_64-w64-mingw32. The `load x86_fp80` after the call is (incorrectly) marked with "int" TBAA (overwriting the previous metadata for "long double").

Nothing seems to break due to this currently as the metadata is being incorrectly placed on the load and not the call. But if the metadata is moved to the call (which this patch ensures), LLVM will optimize out the setup for the arguments.

>From 6db9f6d56f0bbd56d017156f858eae68653fbd1b Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Mon, 16 Sep 2024 16:27:23 +
Subject: [PATCH 1/2] Precommit math-libcalls-tbaa-indirect-args.c

---
 .../math-libcalls-tbaa-indirect-args.c| 38 +++
 1 file changed, 38 insertions(+)
 create mode 100644 clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c

diff --git a/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c
new file mode 100644
index 00..dd013dcc8b3ca8
--- /dev/null
+++ b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c
@@ -0,0 +1,38 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
+// RUN: %clang_cc1 -triple=x86_64-w64-mingw32 -fmath-errno -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes=CHECK
+
+long double powl(long double a, long double b);
+
+// Negative test: powl is a floating-point math function that is
+// ConstWithoutErrnoAndExceptions, however, for this target long doubles are
+// passed indirectly via a pointer. Annotating the call with "int" TBAA metadata
+// will cause the setup for the BYVAL arguments to be incorrectly optimized out.
+
+// CHECK-LABEL: define dso_local void @test_powl(
+// CHECK-SAME: ptr dead_on_unwind noalias nocapture writable writeonly sret(x86_fp80) align 16 [[AGG_RESULT:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  [[ENTRY:.*:]]
+// CHECK-NEXT:    [[TMP:%.*]] = alloca x86_fp80, align 16
+// CHECK-NEXT:    [[BYVAL_TEMP:%.*]] = alloca x86_fp80, align 16
+// CHECK-NEXT:    [[BYVAL_TEMP1:%.*]] = alloca x86_fp80, align 16
+// CHECK-NEXT:    call void @llvm.lifetime.start.p0(i64 16, ptr nonnull [[BYVAL_TEMP]]) #[[ATTR3:[0-9]+]]
+// CHECK-NEXT:    store x86_fp80 0xK40008000, ptr [[BYVAL_TEMP]], align 16, !tbaa [[TBAA3:![0-9]+]]
+// CHECK-NEXT:    call void @llvm.lifetime.start.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
+// CHECK-NEXT:    store x86_fp80 0xK40008000, ptr [[BYVAL_TEMP1]], align 16, !tbaa [[TBAA3]]
+// CHECK-NEXT:    call void @powl(ptr dead_on_unwind nonnull writable sret(x86_fp80) align 16 [[TMP]], ptr noundef nonnull [[BYVAL_TEMP]], ptr noundef nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
+// CHECK-NEXT:    [[TMP0:%.*]] = load x86_fp80, ptr [[TMP]], align 16, !tbaa [[TBAA7:![0-9]+]]
+// CHECK-NEXT:    call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP]]) #[[ATTR3]]
+// CHECK-NEXT:    call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
+// CHECK-NEXT:    store x86_fp80 [[TMP0]], ptr [[AGG_RESULT]], align 16, !tbaa [[TBAA3]]
+// CHECK-NEXT:    ret void
+//
+long double test_powl() {
+  return powl(2.0L, 2.0L); // Don't emit TBAA metadata
+}
+//.
+// CHECK: [[TBAA3]] = !{[[META4:![0-9]+]], [[META4]], i64 0}
+// CHECK: [[META4]] = !{!"long double", [[META5:![0-9]+]], i64 0}
+// CHECK: [[META5]] = !{!"omnipotent char", [[META6:![0-9]+]], i64 0}
+// CHECK: [[META6]] = !{!"Simple C/C++ TBAA"}
+// CHECK: [[TBAA7]] = !{[[META8:![0-9]+]], [[META8]], i64 0}
+// CHECK: [[META8]] = !{!"int", [[META5]], i64 0}
+//.

>From 482639f9a8df7785d1b24c723571f477eb5febd7 Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Mon, 16 Sep 2024 16:14:01 +
Subject: [PATCH 2/2] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args

On some targets, an FP libcall with argument types such as long double will be lowered to pass arguments indirectly via pointers. When this is the case we should not mark the libcall with "int" TBAA as it may lead to incorrect optimizations.

Currently, this can be seen for long doubles on x86_64-w64-mingw32. The `load x86_fp80` after the call is (incorrectly) marked with "int" TBAA (overwriting the previous metadata for "long double").

Nothing seems to break due to this currently as the metadata is being incorrectly placed on the load and not the call. But if the metadata is moved to the call (which this patch ensures), LLVM will optimize out the setup for the arguments.
---
 clang/lib/CodeGen/CGBuiltin.cpp | 24 +++
 clang/lib/CodeGen/CGExpr.cpp | 6 -
 clang/
[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)
MacDue wrote:

Note the first commit (https://github.com/llvm/llvm-project/pull/108853/commits/6db9f6d56f0bbd56d017156f858eae68653fbd1b) shows a correctness issue with what's currently upstream as the `load x86_fp80` is incorrectly marked with "int" TBAA metadata (overwriting its metadata for long double).

https://github.com/llvm/llvm-project/pull/108853
[clang] [clang][codegen] Fix possible crash when setting TBAA metadata on FP math libcalls (PR #108575)
MacDue wrote:

Thanks for the reduction! I tracked this down to the "int" TBAA metadata being added to calls with indirect arguments (which seems broken even without this change). I've created a possible fix here: https://github.com/llvm/llvm-project/pull/108853#event-14276322905.

https://github.com/llvm/llvm-project/pull/108575
[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)
MacDue wrote:

> How does this interact with #107598?

I think it's solving the same problem, but a different way (looking at the final LLVM function type rather than checking the ABI information).

https://github.com/llvm/llvm-project/pull/108853
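The "look at the final LLVM function type" idea can be sketched as a small predicate. This is a simplified model under stated assumptions: `LoweredType`, `LoweredSignature` and `canTagWithIntTBAA` are hypothetical names, and the actual condition used in either PR is not visible in this thread.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical, simplified view of a lowered (IR-level) function signature.
enum class LoweredType { Void, Double, X86FP80, Pointer };

struct LoweredSignature {
  LoweredType ret;
  std::vector<LoweredType> params;
};

// Returns true only if no parameter of the lowered signature is a pointer,
// i.e. nothing is passed (or returned via sret) through memory that the
// callee must read or write. In that case the "int" TBAA tag cannot hide
// the stores that set up the arguments.
inline bool canTagWithIntTBAA(const LoweredSignature &sig) {
  return std::none_of(sig.params.begin(), sig.params.end(),
                      [](LoweredType t) { return t == LoweredType::Pointer; });
}
```

With the mingw lowering shown in the precommitted test, `void @powl(ptr sret, ptr, ptr)`, the predicate is false. For a plain `double pow(double, double)` it is true.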
[clang] [llvm] [IR] Allow fast math flags on calls with homogeneous FP struct types (PR #110506)
MacDue wrote:

It looks like this crash is not unique to struct types, I can reproduce it with array types too (which have been allowed for some time).

https://github.com/llvm/llvm-project/pull/110506
[clang] [llvm] [IR] Allow fast math flags on calls with homogeneous FP struct types (PR #110506)
MacDue wrote:

> Hi,
>
> I bisected a crash to this patch. I can't share the C reproducer but it's
> instcombine that crashes and a reduced reproducer for that is
> ```opt -passes=instcombine bbi-99792.ll -o /dev/null```
>
> [bbi-99792.ll.gz](https://github.com/user-attachments/files/17253640/bbi-99792.ll.gz)
>

Thanks for reporting this, I'll have a look into the crash today 👍

https://github.com/llvm/llvm-project/pull/110506
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
https://github.com/MacDue created https://github.com/llvm/llvm-project/pull/112580

These two veclibs are only available for AArch64 targets, and as mentioned in https://discourse.llvm.org/t/rfc-should-fveclib-imply-fno-math-errno-for-all-targets/81384, we (Arm) think that `-fveclib` should imply `-fno-math-errno`.

By setting `-fveclib` the user shows they intend to use the vector math functions, which implies they don't care about errno. However, currently, the vector mappings won't be used in many cases without setting `-fno-math-errno` separately.

Making this change would also help resolve some inconsistencies in how vector mappings are applied (see https://github.com/llvm/llvm-project/pull/108980#discussion_r176660).

Note: Both SLEEF and ArmPL state that they do not set `errno`:

- https://developer.arm.com/documentation/101004/2410/General-information/Arm-Performance-Libraries-math-functions
  * "The vector functions in libamath which are available on Linux may not set errno nor raise exceptions"
- https://sleef.org/2-references/libm/
  * "These functions do not set errno nor raise an exception."

>From d8ac47d27ad860a8b11424621ab88cd9267cf866 Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Wed, 2 Oct 2024 10:28:29 +
Subject: [PATCH] [clang] Make -fveclib={ArmPL,SLEEF} imply -fno-math-errno

These two veclibs are only available for AArch64 targets, and as mentioned in https://discourse.llvm.org/t/rfc-should-fveclib-imply-fno-math-errno-for-all-targets/81384, we (Arm) think that `-fveclib` should imply `-fno-math-errno`.

By setting `-fveclib` the user shows they intend to use the vector math functions, which implies they don't care about errno. However, currently, the vector mappings won't be used in many cases without setting `-fno-math-errno` separately.

Making this change would also help resolve some inconsistencies in how vector mappings are applied (see https://github.com/llvm/llvm-project/pull/108980#discussion_r176660).
---
 clang/include/clang/Driver/Options.td | 3 ++-
 clang/lib/Driver/ToolChains/Clang.cpp | 8 
 clang/test/Driver/fveclib.c | 7 +++
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 379e75b197cf96..7965f70e290408 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3410,7 +3410,8 @@ def fno_experimental_isel : Flag<["-"], "fno-experimental-isel">, Group;
 def fveclib : Joined<["-"], "fveclib=">, Group,
     Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>,
-    HelpText<"Use the given vector functions library">,
+    HelpText<"Use the given vector functions library."
+             "Note: -fveclib={ArmPL,SLEEF} implies -fno-math-errno">,
     Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,AMDLIBM,none">,
     NormalizedValuesScope<"llvm::driver::VectorLibrary">,
     NormalizedValues<["Accelerate", "LIBMVEC", "MASSV", "SVML", "SLEEF",
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 3fc39296f44281..7e7f3770cfb62d 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -2854,6 +2854,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
                                        bool OFastEnabled, const ArgList &Args,
                                        ArgStringList &CmdArgs,
                                        const JobAction &JA) {
+  // List of veclibs which when used with -fveclib imply -fno-math-errno.
+  constexpr std::array VecLibImpliesNoMathErrno{llvm::StringLiteral("ArmPL"),
+                                                llvm::StringLiteral("SLEEF")};
+
   // Handle various floating point optimization flags, mapping them to the
   // appropriate LLVM code generation flags. This is complicated by several
   // "umbrella" flags, so we do this by stepping through the flags incrementally
@@ -3125,6 +3129,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
       TrappingMathPresent = true;
       FPExceptionBehavior = "strict";
       break;
+    case options::OPT_fveclib:
+      if (llvm::is_contained(VecLibImpliesNoMathErrno, A->getValue()))
+        MathErrno = false;
+      break;
     case options::OPT_fno_trapping_math:
       if (!TrappingMathPresent && !FPExceptionBehavior.empty() &&
           FPExceptionBehavior != "ignore")
diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c
index 9b0f1ce13aa2bd..2a3133541e3b72 100644
--- a/clang/test/Driver/fveclib.c
+++ b/clang/test/Driver/fveclib.c
@@ -36,16 +36,23 @@
 /* Verify that the correct vector library is passed to LTO flags. */
 
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s
+// CHECK-LTO-LIBMVEC: "-fmath-errno"
 // CHECK-
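The driver logic in the `Clang.cpp` hunk steps through the flags in order, so a later flag overrides an earlier one. A minimal standalone model of that behaviour is sketched below. `mathErrnoEnabled` and `kVecLibImpliesNoMathErrno` are hypothetical names, only three flags are modelled, and the target default is passed in explicitly. This is an approximation, not the driver's actual code path.

```cpp
#include <algorithm>
#include <array>
#include <string>
#include <string_view>
#include <vector>

// Veclibs that imply -fno-math-errno, mirroring the list in the patch.
constexpr std::array<std::string_view, 2> kVecLibImpliesNoMathErrno{"ArmPL",
                                                                    "SLEEF"};

// Simplified model: walk the flags left to right; each recognised flag
// overwrites the current math-errno state, so the last one wins.
inline bool mathErrnoEnabled(const std::vector<std::string> &args,
                             bool targetDefault) {
  constexpr std::string_view veclibPrefix = "-fveclib=";
  bool mathErrno = targetDefault;
  for (const std::string &arg : args) {
    std::string_view a = arg;
    if (a == "-fmath-errno") {
      mathErrno = true;
    } else if (a == "-fno-math-errno") {
      mathErrno = false;
    } else if (a.substr(0, veclibPrefix.size()) == veclibPrefix) {
      std::string_view lib = a.substr(veclibPrefix.size());
      if (std::find(kVecLibImpliesNoMathErrno.begin(),
                    kVecLibImpliesNoMathErrno.end(),
                    lib) != kVecLibImpliesNoMathErrno.end())
        mathErrno = false;
    }
  }
  return mathErrno;
}
```

Under this model `-fveclib=ArmPL` alone turns math errno off, `-fveclib=SVML` leaves the default untouched, and `-fveclib=SLEEF -fmath-errno` ends with errno enabled again because the explicit flag comes later.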
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/112580

>From d8ac47d27ad860a8b11424621ab88cd9267cf866 Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Wed, 2 Oct 2024 10:28:29 +
Subject: [PATCH 1/6] [clang] Make -fveclib={ArmPL,SLEEF} imply -fno-math-errno

These two veclibs are only available for AArch64 targets, and as mentioned in https://discourse.llvm.org/t/rfc-should-fveclib-imply-fno-math-errno-for-all-targets/81384, we (Arm) think that `-fveclib` should imply `-fno-math-errno`.

By setting `-fveclib` the user shows they intend to use the vector math functions, which implies they don't care about errno. However, currently, the vector mappings won't be used in many cases without setting `-fno-math-errno` separately.

Making this change would also help resolve some inconsistencies in how vector mappings are applied (see https://github.com/llvm/llvm-project/pull/108980#discussion_r176660).
---
 clang/include/clang/Driver/Options.td | 3 ++-
 clang/lib/Driver/ToolChains/Clang.cpp | 8 
 clang/test/Driver/fveclib.c | 7 +++
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 379e75b197cf96..7965f70e290408 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3410,7 +3410,8 @@ def fno_experimental_isel : Flag<["-"], "fno-experimental-isel">, Group;
 def fveclib : Joined<["-"], "fveclib=">, Group,
     Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>,
-    HelpText<"Use the given vector functions library">,
+    HelpText<"Use the given vector functions library."
+             "Note: -fveclib={ArmPL,SLEEF} implies -fno-math-errno">,
     Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,AMDLIBM,none">,
     NormalizedValuesScope<"llvm::driver::VectorLibrary">,
     NormalizedValues<["Accelerate", "LIBMVEC", "MASSV", "SVML", "SLEEF",
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 3fc39296f44281..7e7f3770cfb62d 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -2854,6 +2854,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
                                        bool OFastEnabled, const ArgList &Args,
                                        ArgStringList &CmdArgs,
                                        const JobAction &JA) {
+  // List of veclibs which when used with -fveclib imply -fno-math-errno.
+  constexpr std::array VecLibImpliesNoMathErrno{llvm::StringLiteral("ArmPL"),
+                                                llvm::StringLiteral("SLEEF")};
+
   // Handle various floating point optimization flags, mapping them to the
   // appropriate LLVM code generation flags. This is complicated by several
   // "umbrella" flags, so we do this by stepping through the flags incrementally
@@ -3125,6 +3129,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
       TrappingMathPresent = true;
       FPExceptionBehavior = "strict";
       break;
+    case options::OPT_fveclib:
+      if (llvm::is_contained(VecLibImpliesNoMathErrno, A->getValue()))
+        MathErrno = false;
+      break;
     case options::OPT_fno_trapping_math:
       if (!TrappingMathPresent && !FPExceptionBehavior.empty() &&
           FPExceptionBehavior != "ignore")
diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c
index 9b0f1ce13aa2bd..2a3133541e3b72 100644
--- a/clang/test/Driver/fveclib.c
+++ b/clang/test/Driver/fveclib.c
@@ -36,16 +36,23 @@
 /* Verify that the correct vector library is passed to LTO flags. */
 
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s
+// CHECK-LTO-LIBMVEC: "-fmath-errno"
 // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC-X86"
 
 // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-MASSV %s
+// CHECK-LTO-MASSV: "-fmath-errno"
 // CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV"
 
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=SVML -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SVML %s
+// CHECK-LTO-SVML: "-fmath-errno"
 // CHECK-LTO-SVML: "-plugin-opt=-vector-library=SVML"
 
 // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=SLEEF -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SLEEF %s
+// CHECK-LTO-SLEEF-NOT: "-fmath-errno"
 // CHECK-LTO-SLEEF: "-plugin-opt=-vector-library=sleefgnuabi"
+// CHECK-LTO-SLEEF-NOT: "-fmath-errno"
 
 // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-ARMPL %s
+// CHECK-LTO-ARMPL-NOT: "-fmath-errno"
 // CHECK-LTO-ARMPL: "-plugin-opt=-vector-library=ArmPL"
+// CHECK-LTO-ARMPL-NOT: "-fmath-errno"

>From 75bcb
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/112580

>From e59c67a0fb40f1a0de96a626bf4f4fa9291436ec Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Wed, 2 Oct 2024 10:28:29 +
Subject: [PATCH 1/6] [clang] Make -fveclib={ArmPL,SLEEF} imply -fno-math-errno

These two veclibs are only available for AArch64 targets, and as mentioned in https://discourse.llvm.org/t/rfc-should-fveclib-imply-fno-math-errno-for-all-targets/81384, we (Arm) think that `-fveclib` should imply `-fno-math-errno`.

By setting `-fveclib` the user shows they intend to use the vector math functions, which implies they don't care about errno. However, currently, the vector mappings won't be used in many cases without setting `-fno-math-errno` separately.

Making this change would also help resolve some inconsistencies in how vector mappings are applied (see https://github.com/llvm/llvm-project/pull/108980#discussion_r176660).
---
 clang/include/clang/Driver/Options.td | 3 ++-
 clang/lib/Driver/ToolChains/Clang.cpp | 8 
 clang/test/Driver/fveclib.c | 7 +++
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 152c43d7908ff8..ac93cba71c6b16 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3410,7 +3410,8 @@ def fno_experimental_isel : Flag<["-"], "fno-experimental-isel">, Group;
 def fveclib : Joined<["-"], "fveclib=">, Group,
     Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>,
-    HelpText<"Use the given vector functions library">,
+    HelpText<"Use the given vector functions library."
+             "Note: -fveclib={ArmPL,SLEEF} implies -fno-math-errno">,
     Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,AMDLIBM,none">,
     NormalizedValuesScope<"llvm::driver::VectorLibrary">,
     NormalizedValues<["Accelerate", "LIBMVEC", "MASSV", "SVML", "SLEEF",
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index d032fd7a59f330..f8527035b7ae24 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -2854,6 +2854,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
                                        bool OFastEnabled, const ArgList &Args,
                                        ArgStringList &CmdArgs,
                                        const JobAction &JA) {
+  // List of veclibs which when used with -fveclib imply -fno-math-errno.
+  constexpr std::array VecLibImpliesNoMathErrno{llvm::StringLiteral("ArmPL"),
+                                                llvm::StringLiteral("SLEEF")};
+
   // Handle various floating point optimization flags, mapping them to the
   // appropriate LLVM code generation flags. This is complicated by several
   // "umbrella" flags, so we do this by stepping through the flags incrementally
@@ -3125,6 +3129,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
       TrappingMathPresent = true;
       FPExceptionBehavior = "strict";
       break;
+    case options::OPT_fveclib:
+      if (llvm::is_contained(VecLibImpliesNoMathErrno, A->getValue()))
+        MathErrno = false;
+      break;
     case options::OPT_fno_trapping_math:
       if (!TrappingMathPresent && !FPExceptionBehavior.empty() &&
           FPExceptionBehavior != "ignore")
diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c
index 9b0f1ce13aa2bd..2a3133541e3b72 100644
--- a/clang/test/Driver/fveclib.c
+++ b/clang/test/Driver/fveclib.c
@@ -36,16 +36,23 @@
 /* Verify that the correct vector library is passed to LTO flags. */
 
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s
+// CHECK-LTO-LIBMVEC: "-fmath-errno"
 // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC-X86"
 
 // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-MASSV %s
+// CHECK-LTO-MASSV: "-fmath-errno"
 // CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV"
 
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=SVML -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SVML %s
+// CHECK-LTO-SVML: "-fmath-errno"
 // CHECK-LTO-SVML: "-plugin-opt=-vector-library=SVML"
 
 // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=SLEEF -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SLEEF %s
+// CHECK-LTO-SLEEF-NOT: "-fmath-errno"
 // CHECK-LTO-SLEEF: "-plugin-opt=-vector-library=sleefgnuabi"
+// CHECK-LTO-SLEEF-NOT: "-fmath-errno"
 
 // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-ARMPL %s
+// CHECK-LTO-ARMPL-NOT: "-fmath-errno"
 // CHECK-LTO-ARMPL: "-plugin-opt=-vector-library=ArmPL"
+// CHECK-LTO-ARMPL-NOT: "-fmath-errno"

>From 99d0f
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
@@ -502,6 +502,10 @@ def err_sls_hardening_arm_not_supported : Error<
 def warn_drv_large_data_threshold_invalid_code_model: Warning<
   "'%0' only applies to medium and large code models">,
   InGroup;
+def warn_drv_math_errno_reenabled_after_veclib: Warning<
+  "math errno re-enabled by '%0' after it was implicitly disabled by '%1',"
+  " this may prevent vectorization with the specified vector library">,
+  InGroup;

MacDue wrote:

Done :+1:

https://github.com/llvm/llvm-project/pull/112580
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
@@ -36,16 +36,23 @@
 /* Verify that the correct vector library is passed to LTO flags. */
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s
+// CHECK-LTO-LIBMVEC: "-fmath-errno"
 // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC-X86"
 // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-MASSV %s
+// CHECK-LTO-MASSV: "-fmath-errno"
 // CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV"
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=SVML -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SVML %s
+// CHECK-LTO-SVML: "-fmath-errno"
 // CHECK-LTO-SVML: "-plugin-opt=-vector-library=SVML"
 // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=SLEEF -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SLEEF %s
+// CHECK-LTO-SLEEF-NOT: "-fmath-errno"
 // CHECK-LTO-SLEEF: "-plugin-opt=-vector-library=sleefgnuabi"
+// CHECK-LTO-SLEEF-NOT: "-fmath-errno"
 // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-ARMPL %s
+// CHECK-LTO-ARMPL-NOT: "-fmath-errno"

MacDue wrote:

I've followed @paulwalker-arm's suggestion and removed the vectorizer-specific phrasing from the warning, so it's now:

> math errno enabled by '-ffp-model=strict' after it was implicitly disabled by
> '-fveclib=ArmPL', this may limit the utilization of the vector library
> [-Wmath-errno-enabled-with-veclib]

Which I think addresses the "combines 2 things" issue.

I'm not sure if pt 2 exists. Is `errno` a concept that exists in the loop vectorizer, other than as a memory effect? It seems more C/C++ language specific.

That said, I did spot: https://github.com/llvm/llvm-project/blob/e13f1d1daf9b76134c3585e8250941920bdf3da6/llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp#L908-L918

https://github.com/llvm/llvm-project/pull/112580
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/112580 >From 9d378377a16798f4a866364a1c3f5d71b963cf15 Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Wed, 2 Oct 2024 10:28:29 + Subject: [PATCH 1/6] [clang] Make -fveclib={ArmPL,SLEEF} imply -fno-math-errno These two veclibs are only available for AArch64 targets, and as mentioned in https://discourse.llvm.org/t/rfc-should-fveclib-imply-fno-math-errno-for-all-targets/81384, we (Arm) think that `-fveclib` should imply `-fno-math-errno`. By setting `-fveclib` the user shows they intend to use the vector math functions, which implies they don't care about errno. However, currently, the vector mappings won't be used in many cases without setting `-fno-math-errno` separately. Making this change would also help resolve some inconsistencies in how vector mappings are applied (see https://github.com/llvm/llvm-project/pull/108980#discussion_r176660). --- clang/include/clang/Driver/Options.td | 3 ++- clang/lib/Driver/ToolChains/Clang.cpp | 8 clang/test/Driver/fveclib.c | 7 +++ 3 files changed, 17 insertions(+), 1 deletion(-) diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 152c43d7908ff8..ac93cba71c6b16 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -3410,7 +3410,8 @@ def fno_experimental_isel : Flag<["-"], "fno-experimental-isel">, Group; def fveclib : Joined<["-"], "fveclib=">, Group, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, -HelpText<"Use the given vector functions library">, +HelpText<"Use the given vector functions library." 
+ "Note: -fveclib={ArmPL,SLEEF} implies -fno-math-errno">, Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,AMDLIBM,none">, NormalizedValuesScope<"llvm::driver::VectorLibrary">, NormalizedValues<["Accelerate", "LIBMVEC", "MASSV", "SVML", "SLEEF", diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index d032fd7a59f330..f8527035b7ae24 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -2854,6 +2854,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, bool OFastEnabled, const ArgList &Args, ArgStringList &CmdArgs, const JobAction &JA) { + // List of veclibs which when used with -fveclib imply -fno-math-errno. + constexpr std::array VecLibImpliesNoMathErrno{llvm::StringLiteral("ArmPL"), +llvm::StringLiteral("SLEEF")}; + // Handle various floating point optimization flags, mapping them to the // appropriate LLVM code generation flags. This is complicated by several // "umbrella" flags, so we do this by stepping through the flags incrementally @@ -3125,6 +3129,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, TrappingMathPresent = true; FPExceptionBehavior = "strict"; break; +case options::OPT_fveclib: + if (llvm::is_contained(VecLibImpliesNoMathErrno, A->getValue())) +MathErrno = false; + break; case options::OPT_fno_trapping_math: if (!TrappingMathPresent && !FPExceptionBehavior.empty() && FPExceptionBehavior != "ignore") diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c index 9b0f1ce13aa2bd..2a3133541e3b72 100644 --- a/clang/test/Driver/fveclib.c +++ b/clang/test/Driver/fveclib.c @@ -36,16 +36,23 @@ /* Verify that the correct vector library is passed to LTO flags. 
*/ // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s +// CHECK-LTO-LIBMVEC: "-fmath-errno" // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC-X86" // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-MASSV %s +// CHECK-LTO-MASSV: "-fmath-errno" // CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV" // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=SVML -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SVML %s +// CHECK-LTO-SVML: "-fmath-errno" // CHECK-LTO-SVML: "-plugin-opt=-vector-library=SVML" // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=SLEEF -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SLEEF %s +// CHECK-LTO-SLEEF-NOT: "-fmath-errno" // CHECK-LTO-SLEEF: "-plugin-opt=-vector-library=sleefgnuabi" +// CHECK-LTO-SLEEF-NOT: "-fmath-errno" // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-ARMPL %s +// CHECK-LTO-ARMPL-NOT: "-fmath-errno" // CHECK-LTO-ARMPL: "-plugin-opt=-vector-library=ArmPL" +// CHECK-LTO-ARMPL-NOT: "-fmath-errno" >From a8495
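For readers skimming the patch above, here is a minimal standalone sketch of the lookup it adds. It is not the driver code: it swaps `llvm::StringLiteral` and `llvm::is_contained` for `std::string_view` and `std::find` so it builds without LLVM headers, and the function name `vecLibImpliesNoMathErrno` is made up for illustration.

```cpp
#include <algorithm>
#include <array>
#include <string_view>

// Veclib names which, in this sketch, imply -fno-math-errno.
// The list mirrors the two values named in the patch (ArmPL and SLEEF).
constexpr std::array<std::string_view, 2> VecLibImpliesNoMathErrno{"ArmPL",
                                                                   "SLEEF"};

// Hypothetical helper: returns true when the value given to -fveclib= is one
// of the libraries in the list above. The comparison is an exact string match.
inline bool vecLibImpliesNoMathErrno(std::string_view VecLibValue) {
  return std::find(VecLibImpliesNoMathErrno.begin(),
                   VecLibImpliesNoMathErrno.end(),
                   VecLibValue) != VecLibImpliesNoMathErrno.end();
}
```

With this helper, `ArmPL` and `SLEEF` are reported as implying `-fno-math-errno`, while values such as `LIBMVEC`, `MASSV` or `SVML` are not, which matches what the FileCheck lines in the test expect.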
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/112580 >From 88352d935b1d0aca24a296c8e086d4e63cb98dd4 Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Wed, 2 Oct 2024 10:28:29 + Subject: [PATCH 1/6] [clang] Make -fveclib={ArmPL,SLEEF} imply -fno-math-errno These two veclibs are only available for AArch64 targets, and as mentioned in https://discourse.llvm.org/t/rfc-should-fveclib-imply-fno-math-errno-for-all-targets/81384, we (Arm) think that `-fveclib` should imply `-fno-math-errno`. By setting `-fveclib` the user shows they intend to use the vector math functions, which implies they don't care about errno. However, currently, the vector mappings won't be used in many cases without setting `-fno-math-errno` separately. Making this change would also help resolve some inconsistencies in how vector mappings are applied (see https://github.com/llvm/llvm-project/pull/108980#discussion_r176660). --- clang/include/clang/Driver/Options.td | 3 ++- clang/lib/Driver/ToolChains/Clang.cpp | 8 clang/test/Driver/fveclib.c | 7 +++ 3 files changed, 17 insertions(+), 1 deletion(-) diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 152c43d7908ff8..ac93cba71c6b16 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -3410,7 +3410,8 @@ def fno_experimental_isel : Flag<["-"], "fno-experimental-isel">, Group; def fveclib : Joined<["-"], "fveclib=">, Group, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, -HelpText<"Use the given vector functions library">, +HelpText<"Use the given vector functions library." 
+ "Note: -fveclib={ArmPL,SLEEF} implies -fno-math-errno">, Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,AMDLIBM,none">, NormalizedValuesScope<"llvm::driver::VectorLibrary">, NormalizedValues<["Accelerate", "LIBMVEC", "MASSV", "SVML", "SLEEF", diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index d032fd7a59f330..f8527035b7ae24 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -2854,6 +2854,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, bool OFastEnabled, const ArgList &Args, ArgStringList &CmdArgs, const JobAction &JA) { + // List of veclibs which when used with -fveclib imply -fno-math-errno. + constexpr std::array VecLibImpliesNoMathErrno{llvm::StringLiteral("ArmPL"), +llvm::StringLiteral("SLEEF")}; + // Handle various floating point optimization flags, mapping them to the // appropriate LLVM code generation flags. This is complicated by several // "umbrella" flags, so we do this by stepping through the flags incrementally @@ -3125,6 +3129,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, TrappingMathPresent = true; FPExceptionBehavior = "strict"; break; +case options::OPT_fveclib: + if (llvm::is_contained(VecLibImpliesNoMathErrno, A->getValue())) +MathErrno = false; + break; case options::OPT_fno_trapping_math: if (!TrappingMathPresent && !FPExceptionBehavior.empty() && FPExceptionBehavior != "ignore") diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c index 9b0f1ce13aa2bd..2a3133541e3b72 100644 --- a/clang/test/Driver/fveclib.c +++ b/clang/test/Driver/fveclib.c @@ -36,16 +36,23 @@ /* Verify that the correct vector library is passed to LTO flags. 
*/ // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s +// CHECK-LTO-LIBMVEC: "-fmath-errno" // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC-X86" // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-MASSV %s +// CHECK-LTO-MASSV: "-fmath-errno" // CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV" // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=SVML -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SVML %s +// CHECK-LTO-SVML: "-fmath-errno" // CHECK-LTO-SVML: "-plugin-opt=-vector-library=SVML" // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=SLEEF -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SLEEF %s +// CHECK-LTO-SLEEF-NOT: "-fmath-errno" // CHECK-LTO-SLEEF: "-plugin-opt=-vector-library=sleefgnuabi" +// CHECK-LTO-SLEEF-NOT: "-fmath-errno" // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-ARMPL %s +// CHECK-LTO-ARMPL-NOT: "-fmath-errno" // CHECK-LTO-ARMPL: "-plugin-opt=-vector-library=ArmPL" +// CHECK-LTO-ARMPL-NOT: "-fmath-errno" >From 43809
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
@@ -3125,6 +3140,13 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
       TrappingMathPresent = true;
       FPExceptionBehavior = "strict";
       break;
+    case options::OPT_fveclib:
+      VecLibArg = A;
+      if (llvm::is_contained(VecLibImpliesNoMathErrno, A->getValue())) {
+        MathErrno = false;
+        NoMathErrnoWasImpliedByVecLib = true;
+      }

MacDue wrote:

Done (and added test cases for the two scenarios discussed in these comments) :+1:

https://github.com/llvm/llvm-project/pull/112580
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
@@ -2960,6 +2969,12 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
   }
   for (const Arg *A : Args) {
+    auto CheckMathErrnoForVecLib =
+        llvm::make_scope_exit([&, MathErrnoBeforeArg = MathErrno] {
+          if (NoMathErrnoWasImpliedByVecLib && !MathErrnoBeforeArg && MathErrno)
+            ArgThatEnabledMathErrnoAfterVecLib = A;
+        });

MacDue wrote:

This may be a little contrived, but the suggested check gives a _kinda_ incorrect warning in some cases. Given this example:

`-fveclib=ArmPL -fno-fast-math -fno-math-errno -ffp-model=strict`

The current check will warn about `-ffp-model=strict`, which is the option that in the end turns on `-fmath-errno`. The suggested check will warn about `-fno-fast-math`, which is not really correct as there's a `-fno-math-errno` after it.

https://github.com/llvm/llvm-project/pull/112580
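To make the ordering argument easier to follow, here is a small standalone model of the per-argument check. It is only a sketch under stated assumptions, not the driver code: the option-to-effect mapping is reduced to a handful of string comparisons, the default of math errno being on is assumed, treating `-fno-fast-math` as re-enabling math errno follows what the comment implies, and the scope-exit lambda is replaced by a plain check at the end of each loop iteration. The names `Result` and `renderMathErrno` are hypothetical, as is the rule that nothing is reported when math errno ends up off.

```cpp
#include <optional>
#include <string>
#include <vector>

// Outcome of walking the command line in this simplified model.
struct Result {
  bool MathErrno;                     // final math-errno state
  std::optional<std::string> Culprit; // last option that turned it back on
};

// Hypothetical model of the argument loop; handles only the few options
// needed for the example and ignores everything else.
inline Result renderMathErrno(const std::vector<std::string> &Args) {
  bool MathErrno = true; // assumption: the target defaults to math errno on
  bool NoMathErrnoWasImpliedByVecLib = false;
  std::optional<std::string> ArgThatEnabledMathErrnoAfterVecLib;

  for (const std::string &A : Args) {
    const bool MathErrnoBeforeArg = MathErrno;

    if (A == "-fveclib=ArmPL" || A == "-fveclib=SLEEF") {
      MathErrno = false;
      NoMathErrnoWasImpliedByVecLib = true;
    } else if (A == "-fmath-errno" || A == "-fno-fast-math" ||
               A == "-ffp-model=strict") {
      MathErrno = true; // assumed effect of these options in this sketch
    } else if (A == "-fno-math-errno") {
      MathErrno = false;
    }

    // Same condition as the scope-exit lambda: record the argument only when
    // it flips math errno from off to on after a veclib implied it off.
    if (NoMathErrnoWasImpliedByVecLib && !MathErrnoBeforeArg && MathErrno)
      ArgThatEnabledMathErrnoAfterVecLib = A;
  }

  // Assumption: only report a culprit when math errno is on at the end.
  if (!MathErrno)
    ArgThatEnabledMathErrnoAfterVecLib.reset();
  return {MathErrno, ArgThatEnabledMathErrnoAfterVecLib};
}
```

For `-fveclib=ArmPL -fno-fast-math -fno-math-errno -ffp-model=strict` this model first records `-fno-fast-math`, then overwrites it with `-ffp-model=strict` when that option flips the state again, so the last off-to-on transition is the one reported.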
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
MacDue wrote:

> > Have you checked the flang driver? Is it not applicable there since errno
> > is not used in Flang?
>
> We don't support the gfortran extension for checking errno in flang and I
> can't see another way of checking it portably, so I wonder if we should just
> have this flag on by default in flang in general? It shouldn't provide any
> observable change and might increase performance as far as I can tell.

As far as I can tell from looking at the `flang` driver, it already defaults to the equivalent of `-fno-math-errno` (looks like there's no way to set `-fmath-errno` at all). That's corroborated by looking at the LLVM IR for a call to `sin`, which uses the non-errno-setting LLVM intrinsic by default: https://godbolt.org/z/dvTvP3vPr.

https://github.com/llvm/llvm-project/pull/112580
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/112580 >From d8ac47d27ad860a8b11424621ab88cd9267cf866 Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Wed, 2 Oct 2024 10:28:29 + Subject: [PATCH 1/4] [clang] Make -fveclib={ArmPL,SLEEF} imply -fno-math-errno These two veclibs are only available for AArch64 targets, and as mentioned in https://discourse.llvm.org/t/rfc-should-fveclib-imply-fno-math-errno-for-all-targets/81384, we (Arm) think that `-fveclib` should imply `-fno-math-errno`. By setting `-fveclib` the user shows they intend to use the vector math functions, which implies they don't care about errno. However, currently, the vector mappings won't be used in many cases without setting `-fno-math-errno` separately. Making this change would also help resolve some inconsistencies in how vector mappings are applied (see https://github.com/llvm/llvm-project/pull/108980#discussion_r176660). --- clang/include/clang/Driver/Options.td | 3 ++- clang/lib/Driver/ToolChains/Clang.cpp | 8 clang/test/Driver/fveclib.c | 7 +++ 3 files changed, 17 insertions(+), 1 deletion(-) diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 379e75b197cf96..7965f70e290408 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -3410,7 +3410,8 @@ def fno_experimental_isel : Flag<["-"], "fno-experimental-isel">, Group; def fveclib : Joined<["-"], "fveclib=">, Group, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, -HelpText<"Use the given vector functions library">, +HelpText<"Use the given vector functions library." 
+ "Note: -fveclib={ArmPL,SLEEF} implies -fno-math-errno">, Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,AMDLIBM,none">, NormalizedValuesScope<"llvm::driver::VectorLibrary">, NormalizedValues<["Accelerate", "LIBMVEC", "MASSV", "SVML", "SLEEF", diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index 3fc39296f44281..7e7f3770cfb62d 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -2854,6 +2854,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, bool OFastEnabled, const ArgList &Args, ArgStringList &CmdArgs, const JobAction &JA) { + // List of veclibs which when used with -fveclib imply -fno-math-errno. + constexpr std::array VecLibImpliesNoMathErrno{llvm::StringLiteral("ArmPL"), +llvm::StringLiteral("SLEEF")}; + // Handle various floating point optimization flags, mapping them to the // appropriate LLVM code generation flags. This is complicated by several // "umbrella" flags, so we do this by stepping through the flags incrementally @@ -3125,6 +3129,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, TrappingMathPresent = true; FPExceptionBehavior = "strict"; break; +case options::OPT_fveclib: + if (llvm::is_contained(VecLibImpliesNoMathErrno, A->getValue())) +MathErrno = false; + break; case options::OPT_fno_trapping_math: if (!TrappingMathPresent && !FPExceptionBehavior.empty() && FPExceptionBehavior != "ignore") diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c index 9b0f1ce13aa2bd..2a3133541e3b72 100644 --- a/clang/test/Driver/fveclib.c +++ b/clang/test/Driver/fveclib.c @@ -36,16 +36,23 @@ /* Verify that the correct vector library is passed to LTO flags. 
*/ // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s +// CHECK-LTO-LIBMVEC: "-fmath-errno" // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC-X86" // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-MASSV %s +// CHECK-LTO-MASSV: "-fmath-errno" // CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV" // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=SVML -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SVML %s +// CHECK-LTO-SVML: "-fmath-errno" // CHECK-LTO-SVML: "-plugin-opt=-vector-library=SVML" // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=SLEEF -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SLEEF %s +// CHECK-LTO-SLEEF-NOT: "-fmath-errno" // CHECK-LTO-SLEEF: "-plugin-opt=-vector-library=sleefgnuabi" +// CHECK-LTO-SLEEF-NOT: "-fmath-errno" // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-ARMPL %s +// CHECK-LTO-ARMPL-NOT: "-fmath-errno" // CHECK-LTO-ARMPL: "-plugin-opt=-vector-library=ArmPL" +// CHECK-LTO-ARMPL-NOT: "-fmath-errno" >From 75bcb
[clang] [llvm] [IR] Allow fast math flags on calls with homogeneous FP struct types (PR #110506)
https://github.com/MacDue closed https://github.com/llvm/llvm-project/pull/110506
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/112580 >From ce7cac7c2fcc672abfd8ab2a49b59a73994bee64 Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Wed, 2 Oct 2024 10:28:29 + Subject: [PATCH 1/7] [clang] Make -fveclib={ArmPL,SLEEF} imply -fno-math-errno These two veclibs are only available for AArch64 targets, and as mentioned in https://discourse.llvm.org/t/rfc-should-fveclib-imply-fno-math-errno-for-all-targets/81384, we (Arm) think that `-fveclib` should imply `-fno-math-errno`. By setting `-fveclib` the user shows they intend to use the vector math functions, which implies they don't care about errno. However, currently, the vector mappings won't be used in many cases without setting `-fno-math-errno` separately. Making this change would also help resolve some inconsistencies in how vector mappings are applied (see https://github.com/llvm/llvm-project/pull/108980#discussion_r176660). --- clang/include/clang/Driver/Options.td | 3 ++- clang/lib/Driver/ToolChains/Clang.cpp | 8 clang/test/Driver/fveclib.c | 7 +++ 3 files changed, 17 insertions(+), 1 deletion(-) diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 70f2fb6bdc4db9..452746bbd66a00 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -3410,7 +3410,8 @@ def fno_experimental_isel : Flag<["-"], "fno-experimental-isel">, Group; def fveclib : Joined<["-"], "fveclib=">, Group, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, -HelpText<"Use the given vector functions library">, +HelpText<"Use the given vector functions library." 
+ "Note: -fveclib={ArmPL,SLEEF} implies -fno-math-errno">, Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,AMDLIBM,none">, NormalizedValuesScope<"llvm::driver::VectorLibrary">, NormalizedValues<["Accelerate", "LIBMVEC", "MASSV", "SVML", "SLEEF", diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index d032fd7a59f330..f8527035b7ae24 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -2854,6 +2854,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, bool OFastEnabled, const ArgList &Args, ArgStringList &CmdArgs, const JobAction &JA) { + // List of veclibs which when used with -fveclib imply -fno-math-errno. + constexpr std::array VecLibImpliesNoMathErrno{llvm::StringLiteral("ArmPL"), +llvm::StringLiteral("SLEEF")}; + // Handle various floating point optimization flags, mapping them to the // appropriate LLVM code generation flags. This is complicated by several // "umbrella" flags, so we do this by stepping through the flags incrementally @@ -3125,6 +3129,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, TrappingMathPresent = true; FPExceptionBehavior = "strict"; break; +case options::OPT_fveclib: + if (llvm::is_contained(VecLibImpliesNoMathErrno, A->getValue())) +MathErrno = false; + break; case options::OPT_fno_trapping_math: if (!TrappingMathPresent && !FPExceptionBehavior.empty() && FPExceptionBehavior != "ignore") diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c index 9b0f1ce13aa2bd..2a3133541e3b72 100644 --- a/clang/test/Driver/fveclib.c +++ b/clang/test/Driver/fveclib.c @@ -36,16 +36,23 @@ /* Verify that the correct vector library is passed to LTO flags. 
*/ // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s +// CHECK-LTO-LIBMVEC: "-fmath-errno" // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC-X86" // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-MASSV %s +// CHECK-LTO-MASSV: "-fmath-errno" // CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV" // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=SVML -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SVML %s +// CHECK-LTO-SVML: "-fmath-errno" // CHECK-LTO-SVML: "-plugin-opt=-vector-library=SVML" // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=SLEEF -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SLEEF %s +// CHECK-LTO-SLEEF-NOT: "-fmath-errno" // CHECK-LTO-SLEEF: "-plugin-opt=-vector-library=sleefgnuabi" +// CHECK-LTO-SLEEF-NOT: "-fmath-errno" // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-ARMPL %s +// CHECK-LTO-ARMPL-NOT: "-fmath-errno" // CHECK-LTO-ARMPL: "-plugin-opt=-vector-library=ArmPL" +// CHECK-LTO-ARMPL-NOT: "-fmath-errno" >From c30fe
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/112580 >From ce7cac7c2fcc672abfd8ab2a49b59a73994bee64 Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Wed, 2 Oct 2024 10:28:29 + Subject: [PATCH 1/8] [clang] Make -fveclib={ArmPL,SLEEF} imply -fno-math-errno These two veclibs are only available for AArch64 targets, and as mentioned in https://discourse.llvm.org/t/rfc-should-fveclib-imply-fno-math-errno-for-all-targets/81384, we (Arm) think that `-fveclib` should imply `-fno-math-errno`. By setting `-fveclib` the user shows they intend to use the vector math functions, which implies they don't care about errno. However, currently, the vector mappings won't be used in many cases without setting `-fno-math-errno` separately. Making this change would also help resolve some inconsistencies in how vector mappings are applied (see https://github.com/llvm/llvm-project/pull/108980#discussion_r176660). --- clang/include/clang/Driver/Options.td | 3 ++- clang/lib/Driver/ToolChains/Clang.cpp | 8 clang/test/Driver/fveclib.c | 7 +++ 3 files changed, 17 insertions(+), 1 deletion(-) diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 70f2fb6bdc4db9..452746bbd66a00 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -3410,7 +3410,8 @@ def fno_experimental_isel : Flag<["-"], "fno-experimental-isel">, Group; def fveclib : Joined<["-"], "fveclib=">, Group, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, -HelpText<"Use the given vector functions library">, +HelpText<"Use the given vector functions library." 
+ "Note: -fveclib={ArmPL,SLEEF} implies -fno-math-errno">, Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,AMDLIBM,none">, NormalizedValuesScope<"llvm::driver::VectorLibrary">, NormalizedValues<["Accelerate", "LIBMVEC", "MASSV", "SVML", "SLEEF", diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index d032fd7a59f330..f8527035b7ae24 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -2854,6 +2854,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, bool OFastEnabled, const ArgList &Args, ArgStringList &CmdArgs, const JobAction &JA) { + // List of veclibs which when used with -fveclib imply -fno-math-errno. + constexpr std::array VecLibImpliesNoMathErrno{llvm::StringLiteral("ArmPL"), +llvm::StringLiteral("SLEEF")}; + // Handle various floating point optimization flags, mapping them to the // appropriate LLVM code generation flags. This is complicated by several // "umbrella" flags, so we do this by stepping through the flags incrementally @@ -3125,6 +3129,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, TrappingMathPresent = true; FPExceptionBehavior = "strict"; break; +case options::OPT_fveclib: + if (llvm::is_contained(VecLibImpliesNoMathErrno, A->getValue())) +MathErrno = false; + break; case options::OPT_fno_trapping_math: if (!TrappingMathPresent && !FPExceptionBehavior.empty() && FPExceptionBehavior != "ignore") diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c index 9b0f1ce13aa2bd..2a3133541e3b72 100644 --- a/clang/test/Driver/fveclib.c +++ b/clang/test/Driver/fveclib.c @@ -36,16 +36,23 @@ /* Verify that the correct vector library is passed to LTO flags. 
*/ // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s +// CHECK-LTO-LIBMVEC: "-fmath-errno" // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC-X86" // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-MASSV %s +// CHECK-LTO-MASSV: "-fmath-errno" // CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV" // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=SVML -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SVML %s +// CHECK-LTO-SVML: "-fmath-errno" // CHECK-LTO-SVML: "-plugin-opt=-vector-library=SVML" // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=SLEEF -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SLEEF %s +// CHECK-LTO-SLEEF-NOT: "-fmath-errno" // CHECK-LTO-SLEEF: "-plugin-opt=-vector-library=sleefgnuabi" +// CHECK-LTO-SLEEF-NOT: "-fmath-errno" // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-ARMPL %s +// CHECK-LTO-ARMPL-NOT: "-fmath-errno" // CHECK-LTO-ARMPL: "-plugin-opt=-vector-library=ArmPL" +// CHECK-LTO-ARMPL-NOT: "-fmath-errno" >From c30fe
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
@@ -3410,7 +3410,8 @@ def fno_experimental_isel : Flag<["-"], "fno-experimental-isel">, Group;
 def fveclib : Joined<["-"], "fveclib=">, Group,
     Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>,
-    HelpText<"Use the given vector functions library">,
+    HelpText<"Use the given vector functions library."
+             "Note: In clang -fveclib={ArmPL,SLEEF} implies -fno-math-errno">,

MacDue wrote:

Ah thanks, done :+1:

https://github.com/llvm/llvm-project/pull/112580
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/112580 >From 3b84e05da76a2c9a7190095450aa4181bf199bd7 Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Wed, 2 Oct 2024 10:28:29 + Subject: [PATCH 1/7] [clang] Make -fveclib={ArmPL,SLEEF} imply -fno-math-errno These two veclibs are only available for AArch64 targets, and as mentioned in https://discourse.llvm.org/t/rfc-should-fveclib-imply-fno-math-errno-for-all-targets/81384, we (Arm) think that `-fveclib` should imply `-fno-math-errno`. By setting `-fveclib` the user shows they intend to use the vector math functions, which implies they don't care about errno. However, currently, the vector mappings won't be used in many cases without setting `-fno-math-errno` separately. Making this change would also help resolve some inconsistencies in how vector mappings are applied (see https://github.com/llvm/llvm-project/pull/108980#discussion_r176660). --- clang/include/clang/Driver/Options.td | 3 ++- clang/lib/Driver/ToolChains/Clang.cpp | 8 clang/test/Driver/fveclib.c | 7 +++ 3 files changed, 17 insertions(+), 1 deletion(-) diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td index 152c43d7908ff8..ac93cba71c6b16 100644 --- a/clang/include/clang/Driver/Options.td +++ b/clang/include/clang/Driver/Options.td @@ -3410,7 +3410,8 @@ def fno_experimental_isel : Flag<["-"], "fno-experimental-isel">, Group; def fveclib : Joined<["-"], "fveclib=">, Group, Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>, -HelpText<"Use the given vector functions library">, +HelpText<"Use the given vector functions library." 
+ "Note: -fveclib={ArmPL,SLEEF} implies -fno-math-errno">, Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,AMDLIBM,none">, NormalizedValuesScope<"llvm::driver::VectorLibrary">, NormalizedValues<["Accelerate", "LIBMVEC", "MASSV", "SVML", "SLEEF", diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp index d032fd7a59f330..f8527035b7ae24 100644 --- a/clang/lib/Driver/ToolChains/Clang.cpp +++ b/clang/lib/Driver/ToolChains/Clang.cpp @@ -2854,6 +2854,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, bool OFastEnabled, const ArgList &Args, ArgStringList &CmdArgs, const JobAction &JA) { + // List of veclibs which when used with -fveclib imply -fno-math-errno. + constexpr std::array VecLibImpliesNoMathErrno{llvm::StringLiteral("ArmPL"), +llvm::StringLiteral("SLEEF")}; + // Handle various floating point optimization flags, mapping them to the // appropriate LLVM code generation flags. This is complicated by several // "umbrella" flags, so we do this by stepping through the flags incrementally @@ -3125,6 +3129,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D, TrappingMathPresent = true; FPExceptionBehavior = "strict"; break; +case options::OPT_fveclib: + if (llvm::is_contained(VecLibImpliesNoMathErrno, A->getValue())) +MathErrno = false; + break; case options::OPT_fno_trapping_math: if (!TrappingMathPresent && !FPExceptionBehavior.empty() && FPExceptionBehavior != "ignore") diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c index 9b0f1ce13aa2bd..2a3133541e3b72 100644 --- a/clang/test/Driver/fveclib.c +++ b/clang/test/Driver/fveclib.c @@ -36,16 +36,23 @@ /* Verify that the correct vector library is passed to LTO flags. 
*/ // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s +// CHECK-LTO-LIBMVEC: "-fmath-errno" // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC-X86" // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-MASSV %s +// CHECK-LTO-MASSV: "-fmath-errno" // CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV" // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=SVML -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SVML %s +// CHECK-LTO-SVML: "-fmath-errno" // CHECK-LTO-SVML: "-plugin-opt=-vector-library=SVML" // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=SLEEF -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SLEEF %s +// CHECK-LTO-SLEEF-NOT: "-fmath-errno" // CHECK-LTO-SLEEF: "-plugin-opt=-vector-library=sleefgnuabi" +// CHECK-LTO-SLEEF-NOT: "-fmath-errno" // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-ARMPL %s +// CHECK-LTO-ARMPL-NOT: "-fmath-errno" // CHECK-LTO-ARMPL: "-plugin-opt=-vector-library=ArmPL" +// CHECK-LTO-ARMPL-NOT: "-fmath-errno" >From 263b3
[clang] [clang] Add sincos builtin using `llvm.sincos` intrinsic (PR #114086)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/114086

>From eae78e24f06ada3ebcc767319f5ad147bae27715 Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Mon, 9 Sep 2024 10:15:20 +
Subject: [PATCH] [clang] Add sincos builtin using `llvm.sincos` intrinsic

This registers `sincos[f|l]` as a clang builtin and updates CGBuiltin to emit the `llvm.sincos.*` intrinsic when `-fno-math-errno` is set.
---
 clang/include/clang/Basic/Builtins.td| 13 +++
 clang/lib/CodeGen/CGBuiltin.cpp | 43
 clang/test/CodeGen/AArch64/sincos.c | 33 ++
 clang/test/CodeGen/X86/math-builtins.c | 35 +++
 clang/test/OpenMP/declare_simd_aarch64.c | 4 +--
 5 files changed, 126 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGen/AArch64/sincos.c

diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 9bd67e0cefebc3..27eadf80d623e6 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -3562,6 +3562,19 @@ def Frexp : FPMathTemplate, LibBuiltin<"math.h"> {
   let AddBuiltinPrefixedAlias = 1;
 }

+def Sincos : FPMathTemplate, GNULibBuiltin<"math.h"> {
+  let Spellings = ["sincos"];
+  let Attributes = [NoThrow];
+  let Prototype = "void(T, T*, T*)";
+  let AddBuiltinPrefixedAlias = 1;
+}
+
+def SincosF16F128 : F16F128MathTemplate, Builtin {
+  let Spellings = ["__builtin_sincos"];
+  let Attributes = [FunctionWithBuiltinPrefix, NoThrow];
+  let Prototype = "void(T, T*, T*)";
+}
+
 def Ldexp : FPMathTemplate, LibBuiltin<"math.h"> {
   let Spellings = ["ldexp"];
   let Attributes = [NoThrow, ConstIgnoringErrnoAndExceptions];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 65d7f5c54a1913..331b367e63d91b 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -722,6 +722,38 @@ static Value *emitFrexpBuiltin(CodeGenFunction &CGF, const CallExpr *E,
   return CGF.Builder.CreateExtractValue(Call, 0);
 }

+static void
emitSincosBuiltin(CodeGenFunction &CGF, const CallExpr *E, + llvm::Intrinsic::ID IntrinsicID) { + llvm::Value *Val = CGF.EmitScalarExpr(E->getArg(0)); + llvm::Value *Dest0 = CGF.EmitScalarExpr(E->getArg(1)); + llvm::Value *Dest1 = CGF.EmitScalarExpr(E->getArg(2)); + + llvm::Function *F = CGF.CGM.getIntrinsic(IntrinsicID, {Val->getType()}); + llvm::Value *Call = CGF.Builder.CreateCall(F, Val); + + llvm::Value *SinResult = CGF.Builder.CreateExtractValue(Call, 0); + llvm::Value *CosResult = CGF.Builder.CreateExtractValue(Call, 1); + + QualType DestPtrType = E->getArg(1)->getType()->getPointeeType(); + LValue SinLV = CGF.MakeNaturalAlignAddrLValue(Dest0, DestPtrType); + LValue CosLV = CGF.MakeNaturalAlignAddrLValue(Dest1, DestPtrType); + + llvm::StoreInst *StoreSin = + CGF.Builder.CreateStore(SinResult, SinLV.getAddress()); + llvm::StoreInst *StoreCos = + CGF.Builder.CreateStore(CosResult, CosLV.getAddress()); + + // Mark the two stores as non-aliasing with eachother. The order of stores + // emitted by this builtin is arbitrary, enforcing a particular order will + // prevent optimizations later on. + llvm::MDBuilder MDHelper(CGF.getLLVMContext()); + MDNode *Domain = MDHelper.createAnonymousAliasScopeDomain(); + MDNode *AliasScope = MDHelper.createAnonymousAliasScope(Domain); + MDNode *AliasScopeList = MDNode::get(Call->getContext(), AliasScope); + StoreSin->setMetadata(LLVMContext::MD_alias_scope, AliasScopeList); + StoreCos->setMetadata(LLVMContext::MD_noalias, AliasScopeList); +} + /// EmitFAbs - Emit a call to @llvm.fabs(). 
static Value *EmitFAbs(CodeGenFunction &CGF, Value *V) { Function *F = CGF.CGM.getIntrinsic(Intrinsic::fabs, V->getType()); @@ -3094,6 +3126,17 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID, return RValue::get(emitUnaryMaybeConstrainedFPBuiltin( *this, E, Intrinsic::sinh, Intrinsic::experimental_constrained_sinh)); +case Builtin::BIsincos: +case Builtin::BIsincosf: +case Builtin::BIsincosl: +case Builtin::BI__builtin_sincos: +case Builtin::BI__builtin_sincosf: +case Builtin::BI__builtin_sincosl: +case Builtin::BI__builtin_sincosf128: +case Builtin::BI__builtin_sincosf16: + emitSincosBuiltin(*this, E, Intrinsic::sincos); + return RValue::get(nullptr); + case Builtin::BIsqrt: case Builtin::BIsqrtf: case Builtin::BIsqrtl: diff --git a/clang/test/CodeGen/AArch64/sincos.c b/clang/test/CodeGen/AArch64/sincos.c new file mode 100644 index 00..240d921b2b7034 --- /dev/null +++ b/clang/test/CodeGen/AArch64/sincos.c @@ -0,0 +1,33 @@ +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm %s -o - | FileCheck --check-prefix=NO-MATH-ERRNO %s +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -fmath-errno %s -o - | FileCheck --check-prefix=MATH-ERRN
[clang] [clang] Add sincos builtin using `llvm.sincos` intrinsic (PR #114086)
https://github.com/MacDue created https://github.com/llvm/llvm-project/pull/114086

This registers `sincos[f|l]` as a clang builtin and updates CGBuiltin to emit the `llvm.sincos.*` intrinsic when `-fno-math-errno` is set.

>From 3433ebee477c17f634fbc1b32ee7c297ff4c1942 Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Mon, 9 Sep 2024 10:15:20 +
Subject: [PATCH] [clang] Add sincos builtin using `llvm.sincos` intrinsic

This registers `sincos[f|l]` as a clang builtin and updates CGBuiltin to emit the `llvm.sincos.*` intrinsic when `-fno-math-errno` is set.
---
 clang/include/clang/Basic/Builtins.td| 13
 clang/lib/CodeGen/CGBuiltin.cpp | 41
 clang/test/CodeGen/AArch64/sincos.c | 33 +++
 clang/test/CodeGen/X86/math-builtins.c | 35
 clang/test/OpenMP/declare_simd_aarch64.c | 4 +--
 5 files changed, 124 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGen/AArch64/sincos.c

diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 9bd67e0cefebc3..27eadf80d623e6 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -3562,6 +3562,19 @@ def Frexp : FPMathTemplate, LibBuiltin<"math.h"> {
   let AddBuiltinPrefixedAlias = 1;
 }

+def Sincos : FPMathTemplate, GNULibBuiltin<"math.h"> {
+  let Spellings = ["sincos"];
+  let Attributes = [NoThrow];
+  let Prototype = "void(T, T*, T*)";
+  let AddBuiltinPrefixedAlias = 1;
+}
+
+def SincosF16F128 : F16F128MathTemplate, Builtin {
+  let Spellings = ["__builtin_sincos"];
+  let Attributes = [FunctionWithBuiltinPrefix, NoThrow];
+  let Prototype = "void(T, T*, T*)";
+}
+
 def Ldexp : FPMathTemplate, LibBuiltin<"math.h"> {
   let Spellings = ["ldexp"];
   let Attributes = [NoThrow, ConstIgnoringErrnoAndExceptions];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 65d7f5c54a1913..5bb6851c3a2702 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -722,6 +722,36 @@ static Value
*emitFrexpBuiltin(CodeGenFunction &CGF, const CallExpr *E, return CGF.Builder.CreateExtractValue(Call, 0); } +static void emitSincosBuiltin(CodeGenFunction &CGF, const CallExpr *E, + llvm::Intrinsic::ID IntrinsicID) { + llvm::Value *Val = CGF.EmitScalarExpr(E->getArg(0)); + llvm::Value *Dest0 = CGF.EmitScalarExpr(E->getArg(1)); + llvm::Value *Dest1 = CGF.EmitScalarExpr(E->getArg(2)); + + llvm::Function *F = CGF.CGM.getIntrinsic(IntrinsicID, {Val->getType()}); + llvm::Value *Call = CGF.Builder.CreateCall(F, Val); + + llvm::Value *SinResult = CGF.Builder.CreateExtractValue(Call, 0); + llvm::Value *CosResult = CGF.Builder.CreateExtractValue(Call, 1); + + QualType DestPtrType = E->getArg(1)->getType()->getPointeeType(); + LValue SinLV = CGF.MakeNaturalAlignAddrLValue(Dest0, DestPtrType); + LValue CosLV = CGF.MakeNaturalAlignAddrLValue(Dest1, DestPtrType); + + llvm::StoreInst *StoreSin = CGF.Builder.CreateStore(SinResult, SinLV.getAddress()); + llvm::StoreInst *StoreCos = CGF.Builder.CreateStore(CosResult, CosLV.getAddress()); + + // Mark the two stores as non-aliasing with eachother. The order of stores + // emitted by this builtin is arbitrary, enforcing a particular order will + // prevent optimizations later on. + llvm::MDBuilder MDHelper(CGF.getLLVMContext()); + MDNode* Domain = MDHelper.createAnonymousAliasScopeDomain(); + MDNode* AliasScope = MDHelper.createAnonymousAliasScope(Domain); + MDNode* AliasScopeList = MDNode::get(Call->getContext(), AliasScope); + StoreSin->setMetadata(LLVMContext::MD_alias_scope, AliasScopeList); + StoreCos->setMetadata(LLVMContext::MD_noalias, AliasScopeList); +} + /// EmitFAbs - Emit a call to @llvm.fabs(). 
static Value *EmitFAbs(CodeGenFunction &CGF, Value *V) { Function *F = CGF.CGM.getIntrinsic(Intrinsic::fabs, V->getType()); @@ -3094,6 +3124,17 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID, return RValue::get(emitUnaryMaybeConstrainedFPBuiltin( *this, E, Intrinsic::sinh, Intrinsic::experimental_constrained_sinh)); +case Builtin::BIsincos: +case Builtin::BIsincosf: +case Builtin::BIsincosl: +case Builtin::BI__builtin_sincos: +case Builtin::BI__builtin_sincosf: +case Builtin::BI__builtin_sincosl: +case Builtin::BI__builtin_sincosf128: +case Builtin::BI__builtin_sincosf16: + emitSincosBuiltin(*this, E, Intrinsic::sincos); + return RValue::get(nullptr); + case Builtin::BIsqrt: case Builtin::BIsqrtf: case Builtin::BIsqrtl: diff --git a/clang/test/CodeGen/AArch64/sincos.c b/clang/test/CodeGen/AArch64/sincos.c new file mode 100644 index 00..240d921b2b7034 --- /dev/null +++ b/clang/test/CodeGen/AArch64/sincos.c @@ -0,0 +1,33 @@ +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm %s -o - | FileCheck --check-prefix=
[clang] [clang] Add sincos builtin using `llvm.sincos` intrinsic (PR #114086)
@@ -722,6 +722,36 @@ static Value *emitFrexpBuiltin(CodeGenFunction &CGF, const CallExpr *E,
   return CGF.Builder.CreateExtractValue(Call, 0);
 }

+static void emitSincosBuiltin(CodeGenFunction &CGF, const CallExpr *E,
+                              llvm::Intrinsic::ID IntrinsicID) {
+  llvm::Value *Val = CGF.EmitScalarExpr(E->getArg(0));
+  llvm::Value *Dest0 = CGF.EmitScalarExpr(E->getArg(1));
+  llvm::Value *Dest1 = CGF.EmitScalarExpr(E->getArg(2));
+
+  llvm::Function *F = CGF.CGM.getIntrinsic(IntrinsicID, {Val->getType()});
+  llvm::Value *Call = CGF.Builder.CreateCall(F, Val);
+
+  llvm::Value *SinResult = CGF.Builder.CreateExtractValue(Call, 0);
+  llvm::Value *CosResult = CGF.Builder.CreateExtractValue(Call, 1);
+
+  QualType DestPtrType = E->getArg(1)->getType()->getPointeeType();
+  LValue SinLV = CGF.MakeNaturalAlignAddrLValue(Dest0, DestPtrType);
+  LValue CosLV = CGF.MakeNaturalAlignAddrLValue(Dest1, DestPtrType);
+
+  llvm::StoreInst *StoreSin = CGF.Builder.CreateStore(SinResult, SinLV.getAddress());
+  llvm::StoreInst *StoreCos = CGF.Builder.CreateStore(CosResult, CosLV.getAddress());
+
+  // Mark the two stores as non-aliasing with each other. The order of stores
+  // emitted by this builtin is arbitrary, enforcing a particular order will
+  // prevent optimizations later on.
+  llvm::MDBuilder MDHelper(CGF.getLLVMContext());
+  MDNode* Domain = MDHelper.createAnonymousAliasScopeDomain();
+  MDNode* AliasScope = MDHelper.createAnonymousAliasScope(Domain);
+  MDNode* AliasScopeList = MDNode::get(Call->getContext(), AliasScope);
+  StoreSin->setMetadata(LLVMContext::MD_alias_scope, AliasScopeList);
+  StoreCos->setMetadata(LLVMContext::MD_noalias, AliasScopeList);

MacDue wrote:

What do people think of this? Looking at various documentation for `sincos[f]`, I've never seen the store order (sin, cos) or (cos, sin) specified, and I imagine the destinations are assumed not to alias anyway, making it a moot point.

The issue is that without this `noalias` metadata, the order of the stores emitted here becomes significant, and later they're chained together in SDAG. This means an unnecessary stack slot may be created to ensure the stores follow the arbitrary order from this built-in lowering.

For example, for the store order (sin, cos) you get:

```
mov x19, x1
add x1, sp, #12
bl sincosf
ldr s0, [sp, #12]
str s0, [x19]
```

and for (cos, sin):

```
mov x19, x0
add x0, sp, #12
bl sincosf
ldr s0, [sp, #12]
str s0, [x19]
```

https://github.com/llvm/llvm-project/pull/114086
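To illustrate why the store order only matters when the two destinations alias, here is a minimal standalone sketch. It is not clang's lowering: `sincosModel`, `aliasedResult`, and `StoreOrder` are hypothetical names invented for this example, and the model simply computes both results and then writes them in a chosen order.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the builtin lowering (not clang's code): both
// results are computed first, then stored in the requested order.
enum class StoreOrder { SinThenCos, CosThenSin };

void sincosModel(double X, double *SinOut, double *CosOut, StoreOrder Order) {
  assert(SinOut && CosOut && "destinations must be non-null");
  const double S = std::sin(X);
  const double C = std::cos(X);
  if (Order == StoreOrder::SinThenCos) {
    *SinOut = S;
    *CosOut = C;
  } else {
    *CosOut = C;
    *SinOut = S;
  }
}

// Returns the value left in a single slot that receives both stores, i.e.
// the aliasing case that the noalias metadata promises cannot happen.
double aliasedResult(double X, StoreOrder Order) {
  double Slot = 0.0;
  sincosModel(X, &Slot, &Slot, Order);
  return Slot;
}
```

With distinct destinations both orders leave the same values behind, so a compiler that knows the pointers do not alias is free to pick either order. Only in the aliased case does the last store win, which is the situation a fixed store order would otherwise have to preserve.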
[clang] [clang] Add sincos builtin using `llvm.sincos` intrinsic (PR #114086)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/114086

>From 7fd19eefa1d1f61843fe1844a72e14c7f4bae03b Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Mon, 9 Sep 2024 10:15:20 +
Subject: [PATCH] [clang] Add sincos builtin using `llvm.sincos` intrinsic

This registers `sincos[f|l]` as a clang builtin and updates CGBuiltin to emit the `llvm.sincos.*` intrinsic when `-fno-math-errno` is set.
---
 clang/include/clang/Basic/Builtins.td| 13 +++
 clang/lib/CodeGen/CGBuiltin.cpp | 43
 clang/test/CodeGen/AArch64/sincos.c | 33 ++
 clang/test/CodeGen/X86/math-builtins.c | 35 +++
 clang/test/OpenMP/declare_simd_aarch64.c | 4 +--
 5 files changed, 126 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGen/AArch64/sincos.c

diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 9bd67e0cefebc3..27eadf80d623e6 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -3562,6 +3562,19 @@ def Frexp : FPMathTemplate, LibBuiltin<"math.h"> {
   let AddBuiltinPrefixedAlias = 1;
 }

+def Sincos : FPMathTemplate, GNULibBuiltin<"math.h"> {
+  let Spellings = ["sincos"];
+  let Attributes = [NoThrow];
+  let Prototype = "void(T, T*, T*)";
+  let AddBuiltinPrefixedAlias = 1;
+}
+
+def SincosF16F128 : F16F128MathTemplate, Builtin {
+  let Spellings = ["__builtin_sincos"];
+  let Attributes = [FunctionWithBuiltinPrefix, NoThrow];
+  let Prototype = "void(T, T*, T*)";
+}
+
 def Ldexp : FPMathTemplate, LibBuiltin<"math.h"> {
   let Spellings = ["ldexp"];
   let Attributes = [NoThrow, ConstIgnoringErrnoAndExceptions];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 65d7f5c54a1913..331b367e63d91b 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -722,6 +722,38 @@ static Value *emitFrexpBuiltin(CodeGenFunction &CGF, const CallExpr *E,
   return CGF.Builder.CreateExtractValue(Call, 0);
 }

+static void
emitSincosBuiltin(CodeGenFunction &CGF, const CallExpr *E, + llvm::Intrinsic::ID IntrinsicID) { + llvm::Value *Val = CGF.EmitScalarExpr(E->getArg(0)); + llvm::Value *Dest0 = CGF.EmitScalarExpr(E->getArg(1)); + llvm::Value *Dest1 = CGF.EmitScalarExpr(E->getArg(2)); + + llvm::Function *F = CGF.CGM.getIntrinsic(IntrinsicID, {Val->getType()}); + llvm::Value *Call = CGF.Builder.CreateCall(F, Val); + + llvm::Value *SinResult = CGF.Builder.CreateExtractValue(Call, 0); + llvm::Value *CosResult = CGF.Builder.CreateExtractValue(Call, 1); + + QualType DestPtrType = E->getArg(1)->getType()->getPointeeType(); + LValue SinLV = CGF.MakeNaturalAlignAddrLValue(Dest0, DestPtrType); + LValue CosLV = CGF.MakeNaturalAlignAddrLValue(Dest1, DestPtrType); + + llvm::StoreInst *StoreSin = + CGF.Builder.CreateStore(SinResult, SinLV.getAddress()); + llvm::StoreInst *StoreCos = + CGF.Builder.CreateStore(CosResult, CosLV.getAddress()); + + // Mark the two stores as non-aliasing with eachother. The order of stores + // emitted by this builtin is arbitrary, enforcing a particular order will + // prevent optimizations later on. + llvm::MDBuilder MDHelper(CGF.getLLVMContext()); + MDNode *Domain = MDHelper.createAnonymousAliasScopeDomain(); + MDNode *AliasScope = MDHelper.createAnonymousAliasScope(Domain); + MDNode *AliasScopeList = MDNode::get(Call->getContext(), AliasScope); + StoreSin->setMetadata(LLVMContext::MD_alias_scope, AliasScopeList); + StoreCos->setMetadata(LLVMContext::MD_noalias, AliasScopeList); +} + /// EmitFAbs - Emit a call to @llvm.fabs(). 
static Value *EmitFAbs(CodeGenFunction &CGF, Value *V) { Function *F = CGF.CGM.getIntrinsic(Intrinsic::fabs, V->getType()); @@ -3094,6 +3126,17 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID, return RValue::get(emitUnaryMaybeConstrainedFPBuiltin( *this, E, Intrinsic::sinh, Intrinsic::experimental_constrained_sinh)); +case Builtin::BIsincos: +case Builtin::BIsincosf: +case Builtin::BIsincosl: +case Builtin::BI__builtin_sincos: +case Builtin::BI__builtin_sincosf: +case Builtin::BI__builtin_sincosl: +case Builtin::BI__builtin_sincosf128: +case Builtin::BI__builtin_sincosf16: + emitSincosBuiltin(*this, E, Intrinsic::sincos); + return RValue::get(nullptr); + case Builtin::BIsqrt: case Builtin::BIsqrtf: case Builtin::BIsqrtl: diff --git a/clang/test/CodeGen/AArch64/sincos.c b/clang/test/CodeGen/AArch64/sincos.c new file mode 100644 index 00..240d921b2b7034 --- /dev/null +++ b/clang/test/CodeGen/AArch64/sincos.c @@ -0,0 +1,33 @@ +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm %s -o - | FileCheck --check-prefix=NO-MATH-ERRNO %s +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -fmath-errno %s -o - | FileCheck --check-prefix=MATH-ERRN
[clang] [clang] Add sincos builtin using `llvm.sincos` intrinsic (PR #114086)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/114086

>From 7fd19eefa1d1f61843fe1844a72e14c7f4bae03b Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Mon, 9 Sep 2024 10:15:20 +
Subject: [PATCH 1/2] [clang] Add sincos builtin using `llvm.sincos` intrinsic

This registers `sincos[f|l]` as a clang builtin and updates CGBuiltin to emit the `llvm.sincos.*` intrinsic when `-fno-math-errno` is set.
---
 clang/include/clang/Basic/Builtins.td| 13 +++
 clang/lib/CodeGen/CGBuiltin.cpp | 43
 clang/test/CodeGen/AArch64/sincos.c | 33 ++
 clang/test/CodeGen/X86/math-builtins.c | 35 +++
 clang/test/OpenMP/declare_simd_aarch64.c | 4 +--
 5 files changed, 126 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/CodeGen/AArch64/sincos.c

diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 9bd67e0cefebc3..27eadf80d623e6 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -3562,6 +3562,19 @@ def Frexp : FPMathTemplate, LibBuiltin<"math.h"> {
   let AddBuiltinPrefixedAlias = 1;
 }

+def Sincos : FPMathTemplate, GNULibBuiltin<"math.h"> {
+  let Spellings = ["sincos"];
+  let Attributes = [NoThrow];
+  let Prototype = "void(T, T*, T*)";
+  let AddBuiltinPrefixedAlias = 1;
+}
+
+def SincosF16F128 : F16F128MathTemplate, Builtin {
+  let Spellings = ["__builtin_sincos"];
+  let Attributes = [FunctionWithBuiltinPrefix, NoThrow];
+  let Prototype = "void(T, T*, T*)";
+}
+
 def Ldexp : FPMathTemplate, LibBuiltin<"math.h"> {
   let Spellings = ["ldexp"];
   let Attributes = [NoThrow, ConstIgnoringErrnoAndExceptions];
diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 65d7f5c54a1913..331b367e63d91b 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -722,6 +722,38 @@ static Value *emitFrexpBuiltin(CodeGenFunction &CGF, const CallExpr *E,
   return CGF.Builder.CreateExtractValue(Call, 0);
 }

+static void
emitSincosBuiltin(CodeGenFunction &CGF, const CallExpr *E, + llvm::Intrinsic::ID IntrinsicID) { + llvm::Value *Val = CGF.EmitScalarExpr(E->getArg(0)); + llvm::Value *Dest0 = CGF.EmitScalarExpr(E->getArg(1)); + llvm::Value *Dest1 = CGF.EmitScalarExpr(E->getArg(2)); + + llvm::Function *F = CGF.CGM.getIntrinsic(IntrinsicID, {Val->getType()}); + llvm::Value *Call = CGF.Builder.CreateCall(F, Val); + + llvm::Value *SinResult = CGF.Builder.CreateExtractValue(Call, 0); + llvm::Value *CosResult = CGF.Builder.CreateExtractValue(Call, 1); + + QualType DestPtrType = E->getArg(1)->getType()->getPointeeType(); + LValue SinLV = CGF.MakeNaturalAlignAddrLValue(Dest0, DestPtrType); + LValue CosLV = CGF.MakeNaturalAlignAddrLValue(Dest1, DestPtrType); + + llvm::StoreInst *StoreSin = + CGF.Builder.CreateStore(SinResult, SinLV.getAddress()); + llvm::StoreInst *StoreCos = + CGF.Builder.CreateStore(CosResult, CosLV.getAddress()); + + // Mark the two stores as non-aliasing with eachother. The order of stores + // emitted by this builtin is arbitrary, enforcing a particular order will + // prevent optimizations later on. + llvm::MDBuilder MDHelper(CGF.getLLVMContext()); + MDNode *Domain = MDHelper.createAnonymousAliasScopeDomain(); + MDNode *AliasScope = MDHelper.createAnonymousAliasScope(Domain); + MDNode *AliasScopeList = MDNode::get(Call->getContext(), AliasScope); + StoreSin->setMetadata(LLVMContext::MD_alias_scope, AliasScopeList); + StoreCos->setMetadata(LLVMContext::MD_noalias, AliasScopeList); +} + /// EmitFAbs - Emit a call to @llvm.fabs(). 
static Value *EmitFAbs(CodeGenFunction &CGF, Value *V) { Function *F = CGF.CGM.getIntrinsic(Intrinsic::fabs, V->getType()); @@ -3094,6 +3126,17 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID, return RValue::get(emitUnaryMaybeConstrainedFPBuiltin( *this, E, Intrinsic::sinh, Intrinsic::experimental_constrained_sinh)); +case Builtin::BIsincos: +case Builtin::BIsincosf: +case Builtin::BIsincosl: +case Builtin::BI__builtin_sincos: +case Builtin::BI__builtin_sincosf: +case Builtin::BI__builtin_sincosl: +case Builtin::BI__builtin_sincosf128: +case Builtin::BI__builtin_sincosf16: + emitSincosBuiltin(*this, E, Intrinsic::sincos); + return RValue::get(nullptr); + case Builtin::BIsqrt: case Builtin::BIsqrtf: case Builtin::BIsqrtl: diff --git a/clang/test/CodeGen/AArch64/sincos.c b/clang/test/CodeGen/AArch64/sincos.c new file mode 100644 index 00..240d921b2b7034 --- /dev/null +++ b/clang/test/CodeGen/AArch64/sincos.c @@ -0,0 +1,33 @@ +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm %s -o - | FileCheck --check-prefix=NO-MATH-ERRNO %s +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -fmath-errno %s -o - | FileCheck --check-prefix=MATH-
[clang] [clang] Add sincos builtin using `llvm.sincos` intrinsic (PR #114086)
MacDue wrote:

> CC @rohitaggarwal007 who added sincos vectorisation for amdlibm recently -
> hopefully we can ensure amdlibm uses the new builtin + intrinsic safely

I have another patch, #114039, that allows lowering the `llvm.sincos` intrinsic to the existing vector function mappings. So once the vectorizer can handle this intrinsic, it should work :slightly_smiling_face:

https://github.com/llvm/llvm-project/pull/114086
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
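For readers following the sincos thread, here is a minimal sketch of the source-level behaviour the builtin models: one input, two results written through two destination pointers. The function name `sincosSketch` is hypothetical and this is not Clang's lowering. It only illustrates why the order of the two stores does not matter when the destinations do not overlap, which is the assumption behind marking the stores as non-aliasing in the patch.

```cpp
#include <cmath>

// Hypothetical stand-in for a sincos(x, &s, &c) call. Not Clang's codegen;
// it only models what ends up in the two destinations.
void sincosSketch(double X, double *SinOut, double *CosOut) {
  // Two independent stores. If SinOut and CosOut do not overlap, swapping
  // these two lines produces the same final memory state.
  *SinOut = std::sin(X);
  *CosOut = std::cos(X);
}
```

A caller would pass two distinct objects, e.g. `double S, C; sincosSketch(1.0, &S, &C);`.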
[clang] [clang][doc] Add release note for changes to `-fveclib={ArmPL,SLEEF}` (PR #113673)
https://github.com/MacDue created https://github.com/llvm/llvm-project/pull/113673

Changed in #112580.

>From fa7576522c8dfc59365c6caa8407469a5a4d Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Fri, 25 Oct 2024 10:38:01 +
Subject: [PATCH] [clang][doc] Add release note for changes to `-fveclib={ArmPL,SLEEF}`

Changed in #112580.
---
 clang/docs/ReleaseNotes.rst | 5 +
 1 file changed, 5 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ed0c0e369fca74..6a9d986eb704ad 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -302,6 +302,11 @@ Modified Compiler Flags
   the ``promoted`` algorithm for complex division when possible rather than the
   less basic (limited range) algorithm.
 
+- The ``-fveclib`` option has been updated to enable ``-fno-fast-math`` for
+  ``-fveclib=ArmPL`` and ``-fveclib=SLEEF``. This gives Clang more opportunities
+  to utilize these vector libraries. The behavior for all other vector function
+  libraries remains unchanged.
+
 Removed Compiler Flags
 -
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
https://github.com/MacDue closed https://github.com/llvm/llvm-project/pull/112580
[clang] [clang][doc] Add release note for changes to `-fveclib={ArmPL,SLEEF}` (PR #113673)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/113673

>From fa7576522c8dfc59365c6caa8407469a5a4d Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Fri, 25 Oct 2024 10:38:01 +
Subject: [PATCH 1/2] [clang][doc] Add release note for changes to `-fveclib={ArmPL,SLEEF}`

Changed in #112580.
---
 clang/docs/ReleaseNotes.rst | 5 +
 1 file changed, 5 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index ed0c0e369fca74..6a9d986eb704ad 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -302,6 +302,11 @@ Modified Compiler Flags
   the ``promoted`` algorithm for complex division when possible rather than the
   less basic (limited range) algorithm.
 
+- The ``-fveclib`` option has been updated to enable ``-fno-fast-math`` for
+  ``-fveclib=ArmPL`` and ``-fveclib=SLEEF``. This gives Clang more opportunities
+  to utilize these vector libraries. The behavior for all other vector function
+  libraries remains unchanged.
+
 Removed Compiler Flags
 -

>From b4a8d8dddee174dfb183214a745900825d9b7da9 Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Fri, 25 Oct 2024 10:55:43 +
Subject: [PATCH 2/2] Fix typo

---
 clang/docs/ReleaseNotes.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 6a9d986eb704ad..9e1558d8acc99f 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -302,7 +302,7 @@ Modified Compiler Flags
   the ``promoted`` algorithm for complex division when possible rather than the
   less basic (limited range) algorithm.
 
-- The ``-fveclib`` option has been updated to enable ``-fno-fast-math`` for
+- The ``-fveclib`` option has been updated to enable ``-fno-math-errno`` for
   ``-fveclib=ArmPL`` and ``-fveclib=SLEEF``. This gives Clang more opportunities
   to utilize these vector libraries. The behavior for all other vector function
   libraries remains unchanged.
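As an illustration of the kind of code this release note is about, a loop like the one below calls a math function on every element. The kernel name `sinArray` is hypothetical. When math calls are required to set `errno`, a compiler generally cannot swap them for vector library routines. With `-fno-math-errno` in effect it is free to consider doing so. Whether vectorisation actually happens still depends on the target, the optimisation level, and the chosen `-fveclib` library, so treat this as a sketch rather than a guaranteed outcome.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical kernel: applies sin to each element. A candidate for mapping
// to a vector math library when errno does not need to be maintained.
void sinArray(const float *In, float *Out, std::size_t N) {
  for (std::size_t I = 0; I < N; ++I)
    Out[I] = std::sin(In[I]);
}
```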
[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/108853 >From ec50a22e21ab44daafe3913847bf831fdf398f79 Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Mon, 16 Sep 2024 16:27:23 + Subject: [PATCH 1/6] Precommit math-libcalls-tbaa-indirect-args.c --- .../math-libcalls-tbaa-indirect-args.c| 38 +++ 1 file changed, 38 insertions(+) create mode 100644 clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c diff --git a/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c new file mode 100644 index 00..dd013dcc8b3ca8 --- /dev/null +++ b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c @@ -0,0 +1,38 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5 +// RUN: %clang_cc1 -triple=x86_64-w64-mingw32 -fmath-errno -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes=CHECK + +long double powl(long double a, long double b); + +// Negative test: powl is a floating-point math function that is +// ConstWithoutErrnoAndExceptions, however, for this target long doubles are +// passed indirectly via a pointer. Annotating the call with "int" TBAA metadata +// will cause the setup for the BYVAL arguments to be incorrectly optimized out. 
+ +// CHECK-LABEL: define dso_local void @test_powl( +// CHECK-SAME: ptr dead_on_unwind noalias nocapture writable writeonly sret(x86_fp80) align 16 [[AGG_RESULT:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT:[[TMP:%.*]] = alloca x86_fp80, align 16 +// CHECK-NEXT:[[BYVAL_TEMP:%.*]] = alloca x86_fp80, align 16 +// CHECK-NEXT:[[BYVAL_TEMP1:%.*]] = alloca x86_fp80, align 16 +// CHECK-NEXT:call void @llvm.lifetime.start.p0(i64 16, ptr nonnull [[BYVAL_TEMP]]) #[[ATTR3:[0-9]+]] +// CHECK-NEXT:store x86_fp80 0xK40008000, ptr [[BYVAL_TEMP]], align 16, !tbaa [[TBAA3:![0-9]+]] +// CHECK-NEXT:call void @llvm.lifetime.start.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]] +// CHECK-NEXT:store x86_fp80 0xK40008000, ptr [[BYVAL_TEMP1]], align 16, !tbaa [[TBAA3]] +// CHECK-NEXT:call void @powl(ptr dead_on_unwind nonnull writable sret(x86_fp80) align 16 [[TMP]], ptr noundef nonnull [[BYVAL_TEMP]], ptr noundef nonnull [[BYVAL_TEMP1]]) #[[ATTR3]] +// CHECK-NEXT:[[TMP0:%.*]] = load x86_fp80, ptr [[TMP]], align 16, !tbaa [[TBAA7:![0-9]+]] +// CHECK-NEXT:call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP]]) #[[ATTR3]] +// CHECK-NEXT:call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]] +// CHECK-NEXT:store x86_fp80 [[TMP0]], ptr [[AGG_RESULT]], align 16, !tbaa [[TBAA3]] +// CHECK-NEXT:ret void +// +long double test_powl() { + return powl(2.0L, 2.0L); // Don't emit TBAA metadata +} +//. +// CHECK: [[TBAA3]] = !{[[META4:![0-9]+]], [[META4]], i64 0} +// CHECK: [[META4]] = !{!"long double", [[META5:![0-9]+]], i64 0} +// CHECK: [[META5]] = !{!"omnipotent char", [[META6:![0-9]+]], i64 0} +// CHECK: [[META6]] = !{!"Simple C/C++ TBAA"} +// CHECK: [[TBAA7]] = !{[[META8:![0-9]+]], [[META8]], i64 0} +// CHECK: [[META8]] = !{!"int", [[META5]], i64 0} +//. 
>From 15391d71f60f019727be88dbeef510f69852e547 Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Mon, 16 Sep 2024 16:14:01 + Subject: [PATCH 2/6] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args On some targets, an FP libcall with argument types such as long double will be lowered to pass arguments indirectly via pointers. When this is the case we should not mark the libcall with "int" TBAA as it may lead to incorrect optimizations. Currently, this can be seen for long doubles on x86_64-w64-mingw32. The `load x86_fp80` after the call is (incorrectly) marked with "int" TBAA (overwriting the previous metadata for "long double"). Nothing seems to break due to this currently as the metadata is being incorrectly placed on the load and not the call. But if the metadata is moved to the call (which this patch ensures), LLVM will optimize out the setup for the arguments. --- clang/lib/CodeGen/CGBuiltin.cpp | 24 +++ clang/lib/CodeGen/CGExpr.cpp | 6 - clang/lib/CodeGen/CodeGenFunction.h | 3 ++- .../math-libcalls-tbaa-indirect-args.c| 4 +--- 4 files changed, 27 insertions(+), 10 deletions(-) diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index 942468204f054c..7a4e9811afd605 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -691,23 +691,37 @@ static RValue emitLibraryCall(CodeGenFunction &CGF, const FunctionDecl *FD, const CallExpr *E, llvm::Constant *calleeValue) { CodeGenFunction::CGFPOptionsRAII FPOptsRAII(CGF, E); CGCallee callee = CGCallee::forDirect(calleeValue, GlobalDecl(FD)); + llvm::CallBase *
[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)
MacDue wrote:

> LGTM, but like I mentioned on #107598, it would be good if there was a test
> that requires the argument check, and the return check isn't sufficient

I've added a test case for `int ilogbl(long double a);` (which tests this in the `MINGW32` case) :+1:

https://github.com/llvm/llvm-project/pull/108853
[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)
MacDue wrote:

I'll land this later today if there are no objections :slightly_smiling_face:

https://github.com/llvm/llvm-project/pull/108853
[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)
https://github.com/MacDue closed https://github.com/llvm/llvm-project/pull/108853
[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/108853 >From 6db9f6d56f0bbd56d017156f858eae68653fbd1b Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Mon, 16 Sep 2024 16:27:23 + Subject: [PATCH 1/5] Precommit math-libcalls-tbaa-indirect-args.c --- .../math-libcalls-tbaa-indirect-args.c| 38 +++ 1 file changed, 38 insertions(+) create mode 100644 clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c diff --git a/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c new file mode 100644 index 00..dd013dcc8b3ca8 --- /dev/null +++ b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c @@ -0,0 +1,38 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5 +// RUN: %clang_cc1 -triple=x86_64-w64-mingw32 -fmath-errno -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes=CHECK + +long double powl(long double a, long double b); + +// Negative test: powl is a floating-point math function that is +// ConstWithoutErrnoAndExceptions, however, for this target long doubles are +// passed indirectly via a pointer. Annotating the call with "int" TBAA metadata +// will cause the setup for the BYVAL arguments to be incorrectly optimized out. 
+ +// CHECK-LABEL: define dso_local void @test_powl( +// CHECK-SAME: ptr dead_on_unwind noalias nocapture writable writeonly sret(x86_fp80) align 16 [[AGG_RESULT:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT:[[TMP:%.*]] = alloca x86_fp80, align 16 +// CHECK-NEXT:[[BYVAL_TEMP:%.*]] = alloca x86_fp80, align 16 +// CHECK-NEXT:[[BYVAL_TEMP1:%.*]] = alloca x86_fp80, align 16 +// CHECK-NEXT:call void @llvm.lifetime.start.p0(i64 16, ptr nonnull [[BYVAL_TEMP]]) #[[ATTR3:[0-9]+]] +// CHECK-NEXT:store x86_fp80 0xK40008000, ptr [[BYVAL_TEMP]], align 16, !tbaa [[TBAA3:![0-9]+]] +// CHECK-NEXT:call void @llvm.lifetime.start.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]] +// CHECK-NEXT:store x86_fp80 0xK40008000, ptr [[BYVAL_TEMP1]], align 16, !tbaa [[TBAA3]] +// CHECK-NEXT:call void @powl(ptr dead_on_unwind nonnull writable sret(x86_fp80) align 16 [[TMP]], ptr noundef nonnull [[BYVAL_TEMP]], ptr noundef nonnull [[BYVAL_TEMP1]]) #[[ATTR3]] +// CHECK-NEXT:[[TMP0:%.*]] = load x86_fp80, ptr [[TMP]], align 16, !tbaa [[TBAA7:![0-9]+]] +// CHECK-NEXT:call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP]]) #[[ATTR3]] +// CHECK-NEXT:call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]] +// CHECK-NEXT:store x86_fp80 [[TMP0]], ptr [[AGG_RESULT]], align 16, !tbaa [[TBAA3]] +// CHECK-NEXT:ret void +// +long double test_powl() { + return powl(2.0L, 2.0L); // Don't emit TBAA metadata +} +//. +// CHECK: [[TBAA3]] = !{[[META4:![0-9]+]], [[META4]], i64 0} +// CHECK: [[META4]] = !{!"long double", [[META5:![0-9]+]], i64 0} +// CHECK: [[META5]] = !{!"omnipotent char", [[META6:![0-9]+]], i64 0} +// CHECK: [[META6]] = !{!"Simple C/C++ TBAA"} +// CHECK: [[TBAA7]] = !{[[META8:![0-9]+]], [[META8]], i64 0} +// CHECK: [[META8]] = !{!"int", [[META5]], i64 0} +//. 
>From 482639f9a8df7785d1b24c723571f477eb5febd7 Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Mon, 16 Sep 2024 16:14:01 + Subject: [PATCH 2/5] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args On some targets, an FP libcall with argument types such as long double will be lowered to pass arguments indirectly via pointers. When this is the case we should not mark the libcall with "int" TBAA as it may lead to incorrect optimizations. Currently, this can be seen for long doubles on x86_64-w64-mingw32. The `load x86_fp80` after the call is (incorrectly) marked with "int" TBAA (overwriting the previous metadata for "long double"). Nothing seems to break due to this currently as the metadata is being incorrectly placed on the load and not the call. But if the metadata is moved to the call (which this patch ensures), LLVM will optimize out the setup for the arguments. --- clang/lib/CodeGen/CGBuiltin.cpp | 24 +++ clang/lib/CodeGen/CGExpr.cpp | 6 - clang/lib/CodeGen/CodeGenFunction.h | 3 ++- .../math-libcalls-tbaa-indirect-args.c| 4 +--- 4 files changed, 27 insertions(+), 10 deletions(-) diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index 27abeba92999b3..5730e7867a648f 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -690,23 +690,37 @@ static RValue emitLibraryCall(CodeGenFunction &CGF, const FunctionDecl *FD, const CallExpr *E, llvm::Constant *calleeValue) { CodeGenFunction::CGFPOptionsRAII FPOptsRAII(CGF, E); CGCallee callee = CGCallee::forDirect(calleeValue, GlobalDecl(FD)); + llvm::CallBase *
[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)
@@ -686,12 +686,31 @@ static Value *EmitSignBit(CodeGenFunction &CGF, Value *V) {
   return CGF.Builder.CreateICmpSLT(V, Zero);
 }
 
+/// Checks no arguments or results are passed indirectly in the ABI (i.e. via a
+/// hidden pointer). This is used to check annotating FP libcalls (that could
+/// set `errno`) with "int" TBAA metadata is safe. If any floating-point
+/// arguments are passed indirectly, setup for the call could be incorrectly
+/// optimized out.
+static bool HasNoIndirectArgumentsOrResults(CGFunctionInfo const &FnInfo) {
+  auto IsIndirect = [&](ABIArgInfo const &info) {

MacDue wrote:

If you mean on the clang function declaration, they're not a pointer there (it's just how the ABI says to pass them). If you mean why not check the LLVM function declaration, one reason is the pointers are opaque/untyped there, which may prevent setting the metadata on functions like `float frexpf(float, int*)` (which is a TODO in one of the tests).

https://github.com/llvm/llvm-project/pull/108853
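The check described in the doc comment above can be sketched outside of clang as follows. `PassKind` and `FnInfoSketch` are hypothetical stand-ins, not clang's `ABIArgInfo` or `CGFunctionInfo`, and the real patch may test more ABI kinds than this model has. The point is only the shape of the predicate: the return value and every argument must be passed directly.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical model of how a value is passed under the ABI.
enum class PassKind { Direct, Indirect };

// Hypothetical stand-in for the per-function ABI information.
struct FnInfoSketch {
  PassKind Return;
  std::vector<PassKind> Args;
};

// True only if neither the result nor any argument goes via a hidden pointer.
bool hasNoIndirectArgumentsOrResults(const FnInfoSketch &Info) {
  auto IsIndirect = [](PassKind K) { return K == PassKind::Indirect; };
  return !IsIndirect(Info.Return) &&
         std::none_of(Info.Args.begin(), Info.Args.end(), IsIndirect);
}
```

In this model the `ilogbl` case mentioned in the thread corresponds to a direct return with an indirect argument, so checking the return alone would not be enough.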
[clang] [llvm] [IR] Allow fast math flags on calls with homogeneous FP struct types (PR #110506)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/110506 >From 328357f2300ebe55b8385c01f9c655f703933736 Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Mon, 30 Sep 2024 11:07:45 + Subject: [PATCH 1/4] [IR] Allow fast math flags on calls with homogeneous FP struct types This extends FPMathOperator to allow calls that return literal structs of homogeneous floating-point or vector-of-floating-point types. The intended use case for this is to support FP intrinsics that return multiple values (such as `llvm.sincos`). --- llvm/docs/LangRef.rst | 19 ++-- llvm/include/llvm/IR/DerivedTypes.h| 4 +++ llvm/include/llvm/IR/Operator.h| 14 +++-- llvm/lib/IR/Type.cpp | 13 + llvm/test/Bitcode/compatibility.ll | 20 + llvm/unittests/IR/InstructionsTest.cpp | 40 ++ 6 files changed, 87 insertions(+), 23 deletions(-) diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst index 3f39d58b322a4f..1eb2982385fda0 100644 --- a/llvm/docs/LangRef.rst +++ b/llvm/docs/LangRef.rst @@ -12472,9 +12472,8 @@ instruction's return value on the same edge). The optional ``fast-math-flags`` marker indicates that the phi has one or more :ref:`fast-math-flags `. These are optimization hints to enable otherwise unsafe floating-point optimizations. Fast-math-flags -are only valid for phis that return a floating-point scalar or vector -type, or an array (nested to any depth) of floating-point scalar or vector -types. +are only valid for phis that return a floating-point scalar or vector type, +possibly within an array (nested to any depth), or a homogeneous struct literal. Semantics: "" @@ -12523,8 +12522,8 @@ class ` type. #. The optional ``fast-math flags`` marker indicates that the select has one or more :ref:`fast-math flags `. These are optimization hints to enable otherwise unsafe floating-point optimizations. 
Fast-math flags are only valid - for selects that return a floating-point scalar or vector type, or an array - (nested to any depth) of floating-point scalar or vector types. + for selects that return a floating-point scalar or vector type, possibly + within an array (nested to any depth), or a homogeneous struct literal. Semantics: "" @@ -12762,8 +12761,8 @@ This instruction requires several arguments: #. The optional ``fast-math flags`` marker indicates that the call has one or more :ref:`fast-math flags `, which are optimization hints to enable otherwise unsafe floating-point optimizations. Fast-math flags are only valid - for calls that return a floating-point scalar or vector type, or an array - (nested to any depth) of floating-point scalar or vector types. + for calls that return a floating-point scalar or vector type, possibly within + an array (nested to any depth), or a homogeneous struct literal. #. The optional "cconv" marker indicates which :ref:`calling convention ` the call should use. If none is @@ -20528,7 +20527,8 @@ the explicit vector length. more :ref:`fast-math flags `. These are optimization hints to enable otherwise unsafe floating-point optimizations. Fast-math flags are only valid for selects that return a floating-point scalar or vector type, - or an array (nested to any depth) of floating-point scalar or vector types. + possibly within an array (nested to any depth), or a homogeneous struct + literal. Semantics: "" @@ -20586,7 +20586,8 @@ is the pivot. more :ref:`fast-math flags `. These are optimization hints to enable otherwise unsafe floating-point optimizations. Fast-math flags are only valid for merges that return a floating-point scalar or vector type, - or an array (nested to any depth) of floating-point scalar or vector types. + possibly within an array (nested to any depth), or a homogeneous struct + literal. 
Semantics: "" diff --git a/llvm/include/llvm/IR/DerivedTypes.h b/llvm/include/llvm/IR/DerivedTypes.h index 975c142f1a4572..a24801d8bdf834 100644 --- a/llvm/include/llvm/IR/DerivedTypes.h +++ b/llvm/include/llvm/IR/DerivedTypes.h @@ -301,6 +301,10 @@ class StructType : public Type { /// {, }} bool containsHomogeneousScalableVectorTypes() const; + /// Return true if this struct is non-empty and all element types are the + /// same. + bool containsHomogeneousTypes() const; + /// Return true if this is a named struct that has a non-empty name. bool hasName() const { return SymbolTableEntry != nullptr; } diff --git a/llvm/include/llvm/IR/Operator.h b/llvm/include/llvm/IR/Operator.h index 88b9bfc0be4b15..22ffcc730e7b68 100644 --- a/llvm/include/llvm/IR/Operator.h +++ b/llvm/include/llvm/IR/Operator.h @@ -15,6 +15,7 @@ #define LLVM_IR_OPERATOR_H #include "llvm/ADT/MapVector.h" +#include "llvm/ADT/TypeSwitch.h" #include "llvm/IR/Constants.h" #include "llvm/IR/FMF.h" #include "llvm/IR/GEPNo
[clang] [llvm] [IR] Allow fast math flags on calls with homogeneous FP struct types (PR #110506)
@@ -12472,9 +12472,8 @@ instruction's return value on the same edge).
 The optional ``fast-math-flags`` marker indicates that the phi has one
 or more :ref:`fast-math-flags `. These are optimization hints
 to enable otherwise unsafe floating-point optimizations. Fast-math-flags
-are only valid for phis that return a floating-point scalar or vector
-type, or an array (nested to any depth) of floating-point scalar or vector
-types.
+are only valid for phis that return a floating-point scalar or vector type,
+possibly within an array (nested to any depth), or a homogeneous struct literal.

MacDue wrote:

I've listed the cases under the "Fast-Math Flags" section and updated the instructions to reference that. I think that's a little clearer than trying to squash all the cases into a single sentence (and makes future updates easier). I've also updated one of the struct examples to point out it's a homogeneous struct.

https://github.com/llvm/llvm-project/pull/110506
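To make "homogeneous struct" concrete, here is a small sketch of the property as the patch's doc comment states it: the struct is non-empty and all element types are the same. Element types are modelled as plain strings and the function name `isHomogeneousStructSketch` is hypothetical. This is not `llvm::StructType`, and it does not model the additional requirement that the element type be a floating-point scalar or vector.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical model: a struct is a list of element type names.
// Returns true if the list is non-empty and every element matches the first.
bool isHomogeneousStructSketch(const std::vector<std::string> &Elements) {
  if (Elements.empty())
    return false;
  return std::all_of(Elements.begin() + 1, Elements.end(),
                     [&](const std::string &T) { return T == Elements.front(); });
}
```

Under this model `{ float, float }` qualifies, while `{ float, double }` and the empty struct do not.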
[clang] [llvm] [IR] Allow fast math flags on calls with homogeneous FP struct types (PR #110506)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/110506 >From 328357f2300ebe55b8385c01f9c655f703933736 Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Mon, 30 Sep 2024 11:07:45 + Subject: [PATCH 1/5] [IR] Allow fast math flags on calls with homogeneous FP struct types This extends FPMathOperator to allow calls that return literal structs of homogeneous floating-point or vector-of-floating-point types. The intended use case for this is to support FP intrinsics that return multiple values (such as `llvm.sincos`). --- llvm/docs/LangRef.rst | 19 ++-- llvm/include/llvm/IR/DerivedTypes.h| 4 +++ llvm/include/llvm/IR/Operator.h| 14 +++-- llvm/lib/IR/Type.cpp | 13 + llvm/test/Bitcode/compatibility.ll | 20 + llvm/unittests/IR/InstructionsTest.cpp | 40 ++ 6 files changed, 87 insertions(+), 23 deletions(-) diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst index 3f39d58b322a4f..1eb2982385fda0 100644 --- a/llvm/docs/LangRef.rst +++ b/llvm/docs/LangRef.rst @@ -12472,9 +12472,8 @@ instruction's return value on the same edge). The optional ``fast-math-flags`` marker indicates that the phi has one or more :ref:`fast-math-flags `. These are optimization hints to enable otherwise unsafe floating-point optimizations. Fast-math-flags -are only valid for phis that return a floating-point scalar or vector -type, or an array (nested to any depth) of floating-point scalar or vector -types. +are only valid for phis that return a floating-point scalar or vector type, +possibly within an array (nested to any depth), or a homogeneous struct literal. Semantics: "" @@ -12523,8 +12522,8 @@ class ` type. #. The optional ``fast-math flags`` marker indicates that the select has one or more :ref:`fast-math flags `. These are optimization hints to enable otherwise unsafe floating-point optimizations. 
Fast-math flags are only valid - for selects that return a floating-point scalar or vector type, or an array - (nested to any depth) of floating-point scalar or vector types. + for selects that return a floating-point scalar or vector type, possibly + within an array (nested to any depth), or a homogeneous struct literal. Semantics: "" @@ -12762,8 +12761,8 @@ This instruction requires several arguments: #. The optional ``fast-math flags`` marker indicates that the call has one or more :ref:`fast-math flags `, which are optimization hints to enable otherwise unsafe floating-point optimizations. Fast-math flags are only valid - for calls that return a floating-point scalar or vector type, or an array - (nested to any depth) of floating-point scalar or vector types. + for calls that return a floating-point scalar or vector type, possibly within + an array (nested to any depth), or a homogeneous struct literal. #. The optional "cconv" marker indicates which :ref:`calling convention ` the call should use. If none is @@ -20528,7 +20527,8 @@ the explicit vector length. more :ref:`fast-math flags `. These are optimization hints to enable otherwise unsafe floating-point optimizations. Fast-math flags are only valid for selects that return a floating-point scalar or vector type, - or an array (nested to any depth) of floating-point scalar or vector types. + possibly within an array (nested to any depth), or a homogeneous struct + literal. Semantics: "" @@ -20586,7 +20586,8 @@ is the pivot. more :ref:`fast-math flags `. These are optimization hints to enable otherwise unsafe floating-point optimizations. Fast-math flags are only valid for merges that return a floating-point scalar or vector type, - or an array (nested to any depth) of floating-point scalar or vector types. + possibly within an array (nested to any depth), or a homogeneous struct + literal. 
Semantics: "" diff --git a/llvm/include/llvm/IR/DerivedTypes.h b/llvm/include/llvm/IR/DerivedTypes.h index 975c142f1a4572..a24801d8bdf834 100644 --- a/llvm/include/llvm/IR/DerivedTypes.h +++ b/llvm/include/llvm/IR/DerivedTypes.h @@ -301,6 +301,10 @@ class StructType : public Type { /// {, }} bool containsHomogeneousScalableVectorTypes() const; + /// Return true if this struct is non-empty and all element types are the + /// same. + bool containsHomogeneousTypes() const; + /// Return true if this is a named struct that has a non-empty name. bool hasName() const { return SymbolTableEntry != nullptr; } diff --git a/llvm/include/llvm/IR/Operator.h b/llvm/include/llvm/IR/Operator.h index 88b9bfc0be4b15..22ffcc730e7b68 100644 --- a/llvm/include/llvm/IR/Operator.h +++ b/llvm/include/llvm/IR/Operator.h @@ -15,6 +15,7 @@ #define LLVM_IR_OPERATOR_H #include "llvm/ADT/MapVector.h" +#include "llvm/ADT/TypeSwitch.h" #include "llvm/IR/Constants.h" #include "llvm/IR/FMF.h" #include "llvm/IR/GEPNo
[clang] [llvm] [IR] Allow fast math flags on calls with homogeneous FP struct types (PR #110506)
@@ -1122,6 +1122,26 @@ define void @fastMathFlagsForArrayCalls([2 x float] %f, [2 x double] %d1, [2 x <
   ret void
 }
 
+declare { float, float } @fmf_struct_f32()
+declare { double, double } @fmf_struct_f64()
+declare { <4 x double>, <4 x double> } @fmf_struct_v4f64()
+
+; CHECK-LABEL: fastMathFlagsForStructCalls(
+define void @fastMathFlagsForStructCalls({ float, float } %f, { double, double } %d1, { <4 x double>, <4 x double> } %d2) {
+  %call.fast = call fast { float, float } @fmf_struct_f32()
+  ; CHECK: %call.fast = call fast { float, float } @fmf_struct_f32()
+
+  ; Throw in some other attributes to make sure those stay in the right places.
+
+  %call.nsz.arcp = notail call nsz arcp { double, double } @fmf_struct_f64()
+  ; CHECK: %call.nsz.arcp = notail call nsz arcp { double, double } @fmf_struct_f64()
+
+  %call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()
+  ; CHECK: %call.nnan.ninf = tail call nnan ninf fastcc { <4 x double>, <4 x double> } @fmf_struct_v4f64()
+

MacDue wrote:

`nofpclass` used a separate check, so I had to update it to support struct types (in the last commit). Not sure if it should be part of this PR, or moved to a later PR though?

https://github.com/llvm/llvm-project/pull/110506
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [IR] Allow fast math flags on calls with homogeneous FP struct types (PR #110506)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/110506

>From 328357f2300ebe55b8385c01f9c655f703933736 Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Mon, 30 Sep 2024 11:07:45 +
Subject: [PATCH 1/5] [IR] Allow fast math flags on calls with homogeneous FP struct types

This extends FPMathOperator to allow calls that return literal structs of
homogeneous floating-point or vector-of-floating-point types.

The intended use case for this is to support FP intrinsics that return
multiple values (such as `llvm.sincos`).
---
 llvm/docs/LangRef.rst | 19 ++--
 llvm/include/llvm/IR/DerivedTypes.h | 4 +++
 llvm/include/llvm/IR/Operator.h | 14 +++--
 llvm/lib/IR/Type.cpp | 13 +
 llvm/test/Bitcode/compatibility.ll | 20 +
 llvm/unittests/IR/InstructionsTest.cpp | 40 ++
 6 files changed, 87 insertions(+), 23 deletions(-)

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 3f39d58b322a4f..1eb2982385fda0 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -12472,9 +12472,8 @@ instruction's return value on the same edge).
 The optional ``fast-math-flags`` marker indicates that the phi has one
 or more :ref:`fast-math-flags `. These are optimization hints
 to enable otherwise unsafe floating-point optimizations. Fast-math-flags
-are only valid for phis that return a floating-point scalar or vector
-type, or an array (nested to any depth) of floating-point scalar or vector
-types.
+are only valid for phis that return a floating-point scalar or vector type,
+possibly within an array (nested to any depth), or a homogeneous struct literal.
 
 Semantics:
 ""
@@ -12523,8 +12522,8 @@ class ` type.
 #. The optional ``fast-math flags`` marker indicates that the select has one or
    more :ref:`fast-math flags `. These are optimization hints to enable
    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
-   for selects that return a floating-point scalar or vector type, or an array
-   (nested to any depth) of floating-point scalar or vector types.
+   for selects that return a floating-point scalar or vector type, possibly
+   within an array (nested to any depth), or a homogeneous struct literal.
 
 Semantics:
 ""
@@ -12762,8 +12761,8 @@ This instruction requires several arguments:
 #. The optional ``fast-math flags`` marker indicates that the call has one or
    more :ref:`fast-math flags `, which are optimization hints to enable
    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
-   for calls that return a floating-point scalar or vector type, or an array
-   (nested to any depth) of floating-point scalar or vector types.
+   for calls that return a floating-point scalar or vector type, possibly within
+   an array (nested to any depth), or a homogeneous struct literal.
 
 #. The optional "cconv" marker indicates which :ref:`calling convention
    ` the call should use. If none is
@@ -20528,7 +20527,8 @@ the explicit vector length.
    more :ref:`fast-math flags `. These are optimization hints to enable
    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
    for selects that return a floating-point scalar or vector type,
-   or an array (nested to any depth) of floating-point scalar or vector types.
+   possibly within an array (nested to any depth), or a homogeneous struct
+   literal.
 
 Semantics:
 ""
@@ -20586,7 +20586,8 @@ is the pivot.
    more :ref:`fast-math flags `. These are optimization hints to enable
    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
    for merges that return a floating-point scalar or vector type,
-   or an array (nested to any depth) of floating-point scalar or vector types.
+   possibly within an array (nested to any depth), or a homogeneous struct
+   literal.
 
 Semantics:
 ""
diff --git a/llvm/include/llvm/IR/DerivedTypes.h b/llvm/include/llvm/IR/DerivedTypes.h
index 975c142f1a4572..a24801d8bdf834 100644
--- a/llvm/include/llvm/IR/DerivedTypes.h
+++ b/llvm/include/llvm/IR/DerivedTypes.h
@@ -301,6 +301,10 @@ class StructType : public Type {
   /// {, }}
   bool containsHomogeneousScalableVectorTypes() const;
 
+  /// Return true if this struct is non-empty and all element types are the
+  /// same.
+  bool containsHomogeneousTypes() const;
+
   /// Return true if this is a named struct that has a non-empty name.
   bool hasName() const { return SymbolTableEntry != nullptr; }
 
diff --git a/llvm/include/llvm/IR/Operator.h b/llvm/include/llvm/IR/Operator.h
index 88b9bfc0be4b15..22ffcc730e7b68 100644
--- a/llvm/include/llvm/IR/Operator.h
+++ b/llvm/include/llvm/IR/Operator.h
@@ -15,6 +15,7 @@
 #define LLVM_IR_OPERATOR_H
 
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/TypeSwitch.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/FMF.h"
 #include "llvm/IR/GEPNo
[clang] [llvm] [IR] Allow fast math flags on calls with homogeneous FP struct types (PR #110506)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/110506

>From 328357f2300ebe55b8385c01f9c655f703933736 Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Mon, 30 Sep 2024 11:07:45 +
Subject: [PATCH 1/9] [IR] Allow fast math flags on calls with homogeneous FP struct types

This extends FPMathOperator to allow calls that return literal structs of
homogeneous floating-point or vector-of-floating-point types.

The intended use case for this is to support FP intrinsics that return
multiple values (such as `llvm.sincos`).
---
 llvm/docs/LangRef.rst | 19 ++--
 llvm/include/llvm/IR/DerivedTypes.h | 4 +++
 llvm/include/llvm/IR/Operator.h | 14 +++--
 llvm/lib/IR/Type.cpp | 13 +
 llvm/test/Bitcode/compatibility.ll | 20 +
 llvm/unittests/IR/InstructionsTest.cpp | 40 ++
 6 files changed, 87 insertions(+), 23 deletions(-)

diff --git a/llvm/docs/LangRef.rst b/llvm/docs/LangRef.rst
index 3f39d58b322a4f..1eb2982385fda0 100644
--- a/llvm/docs/LangRef.rst
+++ b/llvm/docs/LangRef.rst
@@ -12472,9 +12472,8 @@ instruction's return value on the same edge).
 The optional ``fast-math-flags`` marker indicates that the phi has one
 or more :ref:`fast-math-flags `. These are optimization hints
 to enable otherwise unsafe floating-point optimizations. Fast-math-flags
-are only valid for phis that return a floating-point scalar or vector
-type, or an array (nested to any depth) of floating-point scalar or vector
-types.
+are only valid for phis that return a floating-point scalar or vector type,
+possibly within an array (nested to any depth), or a homogeneous struct literal.
 
 Semantics:
 ""
@@ -12523,8 +12522,8 @@ class ` type.
 #. The optional ``fast-math flags`` marker indicates that the select has one or
    more :ref:`fast-math flags `. These are optimization hints to enable
    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
-   for selects that return a floating-point scalar or vector type, or an array
-   (nested to any depth) of floating-point scalar or vector types.
+   for selects that return a floating-point scalar or vector type, possibly
+   within an array (nested to any depth), or a homogeneous struct literal.
 
 Semantics:
 ""
@@ -12762,8 +12761,8 @@ This instruction requires several arguments:
 #. The optional ``fast-math flags`` marker indicates that the call has one or
    more :ref:`fast-math flags `, which are optimization hints to enable
    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
-   for calls that return a floating-point scalar or vector type, or an array
-   (nested to any depth) of floating-point scalar or vector types.
+   for calls that return a floating-point scalar or vector type, possibly within
+   an array (nested to any depth), or a homogeneous struct literal.
 
 #. The optional "cconv" marker indicates which :ref:`calling convention
    ` the call should use. If none is
@@ -20528,7 +20527,8 @@ the explicit vector length.
    more :ref:`fast-math flags `. These are optimization hints to enable
    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
    for selects that return a floating-point scalar or vector type,
-   or an array (nested to any depth) of floating-point scalar or vector types.
+   possibly within an array (nested to any depth), or a homogeneous struct
+   literal.
 
 Semantics:
 ""
@@ -20586,7 +20586,8 @@ is the pivot.
    more :ref:`fast-math flags `. These are optimization hints to enable
    otherwise unsafe floating-point optimizations. Fast-math flags are only valid
    for merges that return a floating-point scalar or vector type,
-   or an array (nested to any depth) of floating-point scalar or vector types.
+   possibly within an array (nested to any depth), or a homogeneous struct
+   literal.
 
 Semantics:
 ""
diff --git a/llvm/include/llvm/IR/DerivedTypes.h b/llvm/include/llvm/IR/DerivedTypes.h
index 975c142f1a4572..a24801d8bdf834 100644
--- a/llvm/include/llvm/IR/DerivedTypes.h
+++ b/llvm/include/llvm/IR/DerivedTypes.h
@@ -301,6 +301,10 @@ class StructType : public Type {
   /// {, }}
   bool containsHomogeneousScalableVectorTypes() const;
 
+  /// Return true if this struct is non-empty and all element types are the
+  /// same.
+  bool containsHomogeneousTypes() const;
+
   /// Return true if this is a named struct that has a non-empty name.
   bool hasName() const { return SymbolTableEntry != nullptr; }
 
diff --git a/llvm/include/llvm/IR/Operator.h b/llvm/include/llvm/IR/Operator.h
index 88b9bfc0be4b15..22ffcc730e7b68 100644
--- a/llvm/include/llvm/IR/Operator.h
+++ b/llvm/include/llvm/IR/Operator.h
@@ -15,6 +15,7 @@
 #define LLVM_IR_OPERATOR_H
 
 #include "llvm/ADT/MapVector.h"
+#include "llvm/ADT/TypeSwitch.h"
 #include "llvm/IR/Constants.h"
 #include "llvm/IR/FMF.h"
 #include "llvm/IR/GEPNo
[clang] [llvm] [IR] Allow fast math flags on calls with homogeneous FP struct types (PR #110506)
@@ -326,6 +326,21 @@ class FPMathOperator : public Operator {
   /// precision.
   float getFPAccuracy() const;
 
+  /// Returns true if `Ty` is a supported floating-point type for phi, select,
+  /// or call FPMathOperators.
+  static bool isSupportedFloatingPointType(Type *Ty) {
+    if (auto *StructTy = dyn_cast<StructType>(Ty)) {
+      if (!StructTy->isLiteral() || !StructTy->containsHomogeneousTypes())
+        return false;
+      Ty = StructTy->elements().front();
+    } else if (auto *ArrayTy = dyn_cast<ArrayType>(Ty)) {
+      do {
+        Ty = ArrayTy->getElementType();
+      } while ((ArrayTy = dyn_cast<ArrayType>(Ty)));
+    }
+    return Ty->isFPOrFPVectorTy();

MacDue wrote:

Done :+1:

https://github.com/llvm/llvm-project/pull/110506
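For readers without an LLVM checkout to hand, the type rule in the hunk above can be tried out in isolation. The following is only a sketch under stated assumptions: `ToyType`, `make`, `sameType` and the free-function forms of `containsHomogeneousTypes` and `isSupportedFloatingPointType` are hypothetical stand-ins written for this illustration, not LLVM's classes or API, and type equality is structural here rather than relying on uniqued type pointers.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Toy stand-in for llvm::Type; every name in this block is hypothetical.
struct ToyType;
using TypeRef = std::shared_ptr<const ToyType>;

struct ToyType {
  enum class Kind { Float, Int, Vector, Array, Struct };
  Kind kind = Kind::Int;
  unsigned count = 0;          // element count for Vector/Array, unused otherwise
  bool isLiteral = true;       // only meaningful for Struct
  std::vector<TypeRef> elems;  // Vector/Array: one element type; Struct: fields
};

// Convenience constructor for the toy model.
TypeRef make(ToyType::Kind kind, std::vector<TypeRef> elems = {},
             unsigned count = 0, bool isLiteral = true) {
  auto t = std::make_shared<ToyType>();
  t->kind = kind;
  t->count = count;
  t->isLiteral = isLiteral;
  t->elems = std::move(elems);
  return t;
}

// Structural equality (an assumption of this sketch).
bool sameType(const ToyType &a, const ToyType &b) {
  if (a.kind != b.kind || a.count != b.count || a.isLiteral != b.isLiteral ||
      a.elems.size() != b.elems.size())
    return false;
  for (std::size_t i = 0; i < a.elems.size(); ++i)
    if (!sameType(*a.elems[i], *b.elems[i]))
      return false;
  return true;
}

// "Non-empty and all element types are the same", as in the doc comment.
bool containsHomogeneousTypes(const ToyType &s) {
  if (s.elems.empty())
    return false;
  for (const TypeRef &e : s.elems)
    if (!sameType(*s.elems.front(), *e))
      return false;
  return true;
}

bool isFPOrFPVector(const ToyType &t) {
  if (t.kind == ToyType::Kind::Float)
    return true;
  return t.kind == ToyType::Kind::Vector && !t.elems.empty() &&
         t.elems.front()->kind == ToyType::Kind::Float;
}

// Mirrors the control flow of the quoted hunk: a literal homogeneous struct
// is reduced to its first field, an array is unwrapped to its innermost
// element type, and the result must be FP or a vector of FP.
bool isSupportedFloatingPointType(const ToyType &root) {
  const ToyType *ty = &root;
  if (ty->kind == ToyType::Kind::Struct) {
    if (!ty->isLiteral || !containsHomogeneousTypes(*ty))
      return false;
    ty = ty->elems.front().get();
  } else if (ty->kind == ToyType::Kind::Array) {
    do {
      ty = ty->elems.front().get();
    } while (ty->kind == ToyType::Kind::Array);
  }
  return isFPOrFPVector(*ty);
}
```

Under this model `{ float, float }` and `[2 x [3 x float]]` are accepted, while `{ float, i32 }`, a named struct, an empty struct, and an array of structs are rejected, since the array branch never looks inside a struct and the struct branch never unwraps an array.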
[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/108853

>From 6db9f6d56f0bbd56d017156f858eae68653fbd1b Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Mon, 16 Sep 2024 16:27:23 +
Subject: [PATCH 1/5] Precommit math-libcalls-tbaa-indirect-args.c

---
 .../math-libcalls-tbaa-indirect-args.c| 38 +++
 1 file changed, 38 insertions(+)
 create mode 100644 clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c

diff --git a/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c
new file mode 100644
index 00..dd013dcc8b3ca8
--- /dev/null
+++ b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c
@@ -0,0 +1,38 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
+// RUN: %clang_cc1 -triple=x86_64-w64-mingw32 -fmath-errno -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes=CHECK
+
+long double powl(long double a, long double b);
+
+// Negative test: powl is a floating-point math function that is
+// ConstWithoutErrnoAndExceptions, however, for this target long doubles are
+// passed indirectly via a pointer. Annotating the call with "int" TBAA metadata
+// will cause the setup for the BYVAL arguments to be incorrectly optimized out.
+
+// CHECK-LABEL: define dso_local void @test_powl(
+// CHECK-SAME: ptr dead_on_unwind noalias nocapture writable writeonly sret(x86_fp80) align 16 [[AGG_RESULT:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT: [[ENTRY:.*:]]
+// CHECK-NEXT:    [[TMP:%.*]] = alloca x86_fp80, align 16
+// CHECK-NEXT:    [[BYVAL_TEMP:%.*]] = alloca x86_fp80, align 16
+// CHECK-NEXT:    [[BYVAL_TEMP1:%.*]] = alloca x86_fp80, align 16
+// CHECK-NEXT:    call void @llvm.lifetime.start.p0(i64 16, ptr nonnull [[BYVAL_TEMP]]) #[[ATTR3:[0-9]+]]
+// CHECK-NEXT:    store x86_fp80 0xK40008000, ptr [[BYVAL_TEMP]], align 16, !tbaa [[TBAA3:![0-9]+]]
+// CHECK-NEXT:    call void @llvm.lifetime.start.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
+// CHECK-NEXT:    store x86_fp80 0xK40008000, ptr [[BYVAL_TEMP1]], align 16, !tbaa [[TBAA3]]
+// CHECK-NEXT:    call void @powl(ptr dead_on_unwind nonnull writable sret(x86_fp80) align 16 [[TMP]], ptr noundef nonnull [[BYVAL_TEMP]], ptr noundef nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
+// CHECK-NEXT:    [[TMP0:%.*]] = load x86_fp80, ptr [[TMP]], align 16, !tbaa [[TBAA7:![0-9]+]]
+// CHECK-NEXT:    call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP]]) #[[ATTR3]]
+// CHECK-NEXT:    call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
+// CHECK-NEXT:    store x86_fp80 [[TMP0]], ptr [[AGG_RESULT]], align 16, !tbaa [[TBAA3]]
+// CHECK-NEXT:    ret void
+//
+long double test_powl() {
+  return powl(2.0L, 2.0L); // Don't emit TBAA metadata
+}
+//.
+// CHECK: [[TBAA3]] = !{[[META4:![0-9]+]], [[META4]], i64 0}
+// CHECK: [[META4]] = !{!"long double", [[META5:![0-9]+]], i64 0}
+// CHECK: [[META5]] = !{!"omnipotent char", [[META6:![0-9]+]], i64 0}
+// CHECK: [[META6]] = !{!"Simple C/C++ TBAA"}
+// CHECK: [[TBAA7]] = !{[[META8:![0-9]+]], [[META8]], i64 0}
+// CHECK: [[META8]] = !{!"int", [[META5]], i64 0}
+//.

>From 482639f9a8df7785d1b24c723571f477eb5febd7 Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Mon, 16 Sep 2024 16:14:01 +
Subject: [PATCH 2/5] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args

On some targets, an FP libcall with argument types such as long double
will be lowered to pass arguments indirectly via pointers. When this is
the case we should not mark the libcall with "int" TBAA as it may lead
to incorrect optimizations.

Currently, this can be seen for long doubles on x86_64-w64-mingw32. The
`load x86_fp80` after the call is (incorrectly) marked with "int" TBAA
(overwriting the previous metadata for "long double").

Nothing seems to break due to this currently as the metadata is being
incorrectly placed on the load and not the call. But if the metadata
is moved to the call (which this patch ensures), LLVM will optimize out
the setup for the arguments.
---
 clang/lib/CodeGen/CGBuiltin.cpp | 24 +++
 clang/lib/CodeGen/CGExpr.cpp | 6 -
 clang/lib/CodeGen/CodeGenFunction.h | 3 ++-
 .../math-libcalls-tbaa-indirect-args.c| 4 +---
 4 files changed, 27 insertions(+), 10 deletions(-)

diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp
index 27abeba92999b3..5730e7867a648f 100644
--- a/clang/lib/CodeGen/CGBuiltin.cpp
+++ b/clang/lib/CodeGen/CGBuiltin.cpp
@@ -690,23 +690,37 @@ static RValue emitLibraryCall(CodeGenFunction &CGF, const FunctionDecl *FD,
                               const CallExpr *E, llvm::Constant *calleeValue) {
   CodeGenFunction::CGFPOptionsRAII FPOptsRAII(CGF, E);
   CGCallee callee = CGCallee::forDirect(calleeValue, GlobalDecl(FD));
+  llvm::CallBase *
[clang] [clang][doc] Add release note for changes to `-fveclib={ArmPL,SLEEF}` (PR #113673)
https://github.com/MacDue closed https://github.com/llvm/llvm-project/pull/113673
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
@@ -36,16 +36,23 @@
 /* Verify that the correct vector library is passed to LTO flags. */
 
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s
+// CHECK-LTO-LIBMVEC: "-fmath-errno"
 // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC-X86"
 
 // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-MASSV %s
+// CHECK-LTO-MASSV: "-fmath-errno"
 // CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV"
 
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=SVML -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SVML %s
+// CHECK-LTO-SVML: "-fmath-errno"
 // CHECK-LTO-SVML: "-plugin-opt=-vector-library=SVML"
 
 // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=SLEEF -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SLEEF %s
+// CHECK-LTO-SLEEF-NOT: "-fmath-errno"
 // CHECK-LTO-SLEEF: "-plugin-opt=-vector-library=sleefgnuabi"
+// CHECK-LTO-SLEEF-NOT: "-fmath-errno"
 
 // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-ARMPL %s
+// CHECK-LTO-ARMPL-NOT: "-fmath-errno"

MacDue wrote:

I think it'll also need to track which option/arg caused `-fmath-errno` to be set (otherwise the warning would be somewhat unhelpful in some cases).

https://github.com/llvm/llvm-project/pull/112580
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
@@ -36,16 +36,23 @@
 /* Verify that the correct vector library is passed to LTO flags. */
 
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s
+// CHECK-LTO-LIBMVEC: "-fmath-errno"
 // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC-X86"
 
 // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-MASSV %s
+// CHECK-LTO-MASSV: "-fmath-errno"
 // CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV"
 
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=SVML -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SVML %s
+// CHECK-LTO-SVML: "-fmath-errno"
 // CHECK-LTO-SVML: "-plugin-opt=-vector-library=SVML"
 
 // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=SLEEF -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SLEEF %s
+// CHECK-LTO-SLEEF-NOT: "-fmath-errno"
 // CHECK-LTO-SLEEF: "-plugin-opt=-vector-library=sleefgnuabi"
+// CHECK-LTO-SLEEF-NOT: "-fmath-errno"
 
 // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-ARMPL %s
+// CHECK-LTO-ARMPL-NOT: "-fmath-errno"

MacDue wrote:

Added the extra tests for `-fmath-errno` after `-fveclib`.

If we want to add a warning, there are a few cases to consider where errno could be re-enabled:

- `-fveclib=ArmPL -fmath-errno` (the obvious case)
- `-fveclib=ArmPL -fno-fast-math`
- `-fveclib=ArmPL -ffp-model=strict`

Would a warning be expected for all of these?

https://github.com/llvm/llvm-project/pull/112580
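The ordering question raised in the comment above comes down to flags being processed left to right, with the last relevant one winning. As a rough illustration only, the sketch below models that for three flags; `resolveMathErrno` and its `defaultMathErrno` parameter are hypothetical names invented here, this is not the driver's code, and the umbrella flags (`-fno-fast-math`, `-ffp-model=strict`) are deliberately left out because their exact effect is not shown in this thread.

```cpp
#include <algorithm>
#include <array>
#include <string>
#include <vector>

// Hypothetical model of left-to-right flag handling; not clang's driver.
// defaultMathErrno stands in for whatever the toolchain would pick when no
// flag mentions errno (an assumption of this sketch).
bool resolveMathErrno(const std::vector<std::string> &args,
                      bool defaultMathErrno) {
  // Veclibs that imply -fno-math-errno, as named in the PR title.
  static const std::array<const char *, 2> impliesNoMathErrno = {
      {"ArmPL", "SLEEF"}};
  const std::string veclibPrefix = "-fveclib=";

  bool mathErrno = defaultMathErrno;
  for (const std::string &arg : args) {
    if (arg == "-fmath-errno") {
      mathErrno = true;
    } else if (arg == "-fno-math-errno") {
      mathErrno = false;
    } else if (arg.compare(0, veclibPrefix.size(), veclibPrefix) == 0) {
      const std::string lib = arg.substr(veclibPrefix.size());
      // Only the listed veclibs change the setting; others leave it alone.
      if (std::find(impliesNoMathErrno.begin(), impliesNoMathErrno.end(),
                    lib) != impliesNoMathErrno.end())
        mathErrno = false;
    }
  }
  return mathErrno;
}
```

With a default of `true`, `{"-fveclib=ArmPL"}` resolves to `false`, `{"-fveclib=ArmPL", "-fmath-errno"}` resolves back to `true` (the "obvious case" from the list), and `{"-fmath-errno", "-fveclib=ArmPL"}` resolves to `false` because the veclib flag comes last.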
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
https://github.com/MacDue edited https://github.com/llvm/llvm-project/pull/112580
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/112580

>From d8ac47d27ad860a8b11424621ab88cd9267cf866 Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Wed, 2 Oct 2024 10:28:29 +
Subject: [PATCH 1/2] [clang] Make -fveclib={ArmPL,SLEEF} imply -fno-math-errno

These two veclibs are only available for AArch64 targets, and as mentioned in
https://discourse.llvm.org/t/rfc-should-fveclib-imply-fno-math-errno-for-all-targets/81384,
we (Arm) think that `-fveclib` should imply `-fno-math-errno`. By setting
`-fveclib` the user shows they intend to use the vector math functions, which
implies they don't care about errno. However, currently, the vector mappings
won't be used in many cases without setting `-fno-math-errno` separately.

Making this change would also help resolve some inconsistencies in how vector
mappings are applied (see
https://github.com/llvm/llvm-project/pull/108980#discussion_r176660).
---
 clang/include/clang/Driver/Options.td | 3 ++-
 clang/lib/Driver/ToolChains/Clang.cpp | 8 
 clang/test/Driver/fveclib.c | 7 +++
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 379e75b197cf96..7965f70e290408 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3410,7 +3410,8 @@ def fno_experimental_isel : Flag<["-"], "fno-experimental-isel">, Group;
 def fveclib : Joined<["-"], "fveclib=">, Group,
 Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>,
-HelpText<"Use the given vector functions library">,
+HelpText<"Use the given vector functions library."
+ "Note: -fveclib={ArmPL,SLEEF} implies -fno-math-errno">,
 Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,AMDLIBM,none">,
 NormalizedValuesScope<"llvm::driver::VectorLibrary">,
 NormalizedValues<["Accelerate", "LIBMVEC", "MASSV", "SVML", "SLEEF",
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 3fc39296f44281..7e7f3770cfb62d 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -2854,6 +2854,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
 bool OFastEnabled, const ArgList &Args,
 ArgStringList &CmdArgs,
 const JobAction &JA) {
+ // List of veclibs which when used with -fveclib imply -fno-math-errno.
+ constexpr std::array VecLibImpliesNoMathErrno{llvm::StringLiteral("ArmPL"),
+llvm::StringLiteral("SLEEF")};
+
 // Handle various floating point optimization flags, mapping them to the
 // appropriate LLVM code generation flags. This is complicated by several
 // "umbrella" flags, so we do this by stepping through the flags incrementally
@@ -3125,6 +3129,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
 TrappingMathPresent = true;
 FPExceptionBehavior = "strict";
 break;
+case options::OPT_fveclib:
+ if (llvm::is_contained(VecLibImpliesNoMathErrno, A->getValue()))
+MathErrno = false;
+ break;
 case options::OPT_fno_trapping_math:
 if (!TrappingMathPresent && !FPExceptionBehavior.empty() &&
 FPExceptionBehavior != "ignore")
diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c
index 9b0f1ce13aa2bd..2a3133541e3b72 100644
--- a/clang/test/Driver/fveclib.c
+++ b/clang/test/Driver/fveclib.c
@@ -36,16 +36,23 @@

 /* Verify that the correct vector library is passed to LTO flags. */

 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s
+// CHECK-LTO-LIBMVEC: "-fmath-errno"
 // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC-X86"

 // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-MASSV %s
+// CHECK-LTO-MASSV: "-fmath-errno"
 // CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV"

 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=SVML -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SVML %s
+// CHECK-LTO-SVML: "-fmath-errno"
 // CHECK-LTO-SVML: "-plugin-opt=-vector-library=SVML"

 // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=SLEEF -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SLEEF %s
+// CHECK-LTO-SLEEF-NOT: "-fmath-errno"
 // CHECK-LTO-SLEEF: "-plugin-opt=-vector-library=sleefgnuabi"
+// CHECK-LTO-SLEEF-NOT: "-fmath-errno"

 // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-ARMPL %s
+// CHECK-LTO-ARMPL-NOT: "-fmath-errno"
 // CHECK-LTO-ARMPL: "-plugin-opt=-vector-library=ArmPL"
+// CHECK-LTO-ARMPL-NOT: "-fmath-errno"

>From 75bcb
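Editorial sketch (not part of the patch): the new `OPT_fveclib` case sits in the same loop that steps through the floating-point flags, so it behaves like any other flag that updates `MathErrno`. The standalone model below illustrates that behavior under stated assumptions: `resolveMathErrno` and its string-based argument handling are hypothetical stand-ins for the driver's `ArgList` processing, and `std::string_view` replaces `llvm::StringLiteral`. Only the two library names are taken from the patch.

```cpp
#include <array>
#include <string_view>
#include <vector>

// Veclibs that imply -fno-math-errno (names taken from the patch).
constexpr std::array<std::string_view, 2> VecLibImpliesNoMathErrno{"ArmPL",
                                                                   "SLEEF"};

// Hypothetical model of the flag loop: walks the arguments in order and
// returns the final math-errno setting. `Default` is the target's default.
bool resolveMathErrno(bool Default, const std::vector<std::string_view> &Args) {
  constexpr std::string_view VecLibPrefix = "-fveclib=";
  bool MathErrno = Default;
  for (std::string_view Arg : Args) {
    if (Arg == "-fmath-errno") {
      MathErrno = true;
    } else if (Arg == "-fno-math-errno") {
      MathErrno = false;
    } else if (Arg.substr(0, VecLibPrefix.size()) == VecLibPrefix) {
      // Only the listed veclibs change the setting; others leave it alone.
      std::string_view Lib = Arg.substr(VecLibPrefix.size());
      for (std::string_view Name : VecLibImpliesNoMathErrno)
        if (Name == Lib)
          MathErrno = false;
    }
  }
  return MathErrno;
}
```

In this model a later `-fmath-errno` wins over an earlier `-fveclib=SLEEF`, because flags are applied in command-line order. That ordering is an assumption carried over from how the loop treats the other flags, not something the patch states explicitly.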
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/112580

>From d8ac47d27ad860a8b11424621ab88cd9267cf866 Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell
Date: Wed, 2 Oct 2024 10:28:29 +
Subject: [PATCH 1/3] [clang] Make -fveclib={ArmPL,SLEEF} imply -fno-math-errno

These two veclibs are only available for AArch64 targets, and as mentioned in
https://discourse.llvm.org/t/rfc-should-fveclib-imply-fno-math-errno-for-all-targets/81384,
we (Arm) think that `-fveclib` should imply `-fno-math-errno`. By setting
`-fveclib` the user shows they intend to use the vector math functions, which
implies they don't care about errno. However, currently, the vector mappings
won't be used in many cases without setting `-fno-math-errno` separately.

Making this change would also help resolve some inconsistencies in how vector
mappings are applied (see
https://github.com/llvm/llvm-project/pull/108980#discussion_r176660).
---
 clang/include/clang/Driver/Options.td | 3 ++-
 clang/lib/Driver/ToolChains/Clang.cpp | 8 
 clang/test/Driver/fveclib.c | 7 +++
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 379e75b197cf96..7965f70e290408 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3410,7 +3410,8 @@ def fno_experimental_isel : Flag<["-"], "fno-experimental-isel">, Group;
 def fveclib : Joined<["-"], "fveclib=">, Group,
 Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>,
-HelpText<"Use the given vector functions library">,
+HelpText<"Use the given vector functions library."
+ "Note: -fveclib={ArmPL,SLEEF} implies -fno-math-errno">,
 Values<"Accelerate,libmvec,MASSV,SVML,SLEEF,Darwin_libsystem_m,ArmPL,AMDLIBM,none">,
 NormalizedValuesScope<"llvm::driver::VectorLibrary">,
 NormalizedValues<["Accelerate", "LIBMVEC", "MASSV", "SVML", "SLEEF",
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp b/clang/lib/Driver/ToolChains/Clang.cpp
index 3fc39296f44281..7e7f3770cfb62d 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -2854,6 +2854,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
 bool OFastEnabled, const ArgList &Args,
 ArgStringList &CmdArgs,
 const JobAction &JA) {
+ // List of veclibs which when used with -fveclib imply -fno-math-errno.
+ constexpr std::array VecLibImpliesNoMathErrno{llvm::StringLiteral("ArmPL"),
+llvm::StringLiteral("SLEEF")};
+
 // Handle various floating point optimization flags, mapping them to the
 // appropriate LLVM code generation flags. This is complicated by several
 // "umbrella" flags, so we do this by stepping through the flags incrementally
@@ -3125,6 +3129,10 @@ static void RenderFloatingPointOptions(const ToolChain &TC, const Driver &D,
 TrappingMathPresent = true;
 FPExceptionBehavior = "strict";
 break;
+case options::OPT_fveclib:
+ if (llvm::is_contained(VecLibImpliesNoMathErrno, A->getValue()))
+MathErrno = false;
+ break;
 case options::OPT_fno_trapping_math:
 if (!TrappingMathPresent && !FPExceptionBehavior.empty() &&
 FPExceptionBehavior != "ignore")
diff --git a/clang/test/Driver/fveclib.c b/clang/test/Driver/fveclib.c
index 9b0f1ce13aa2bd..2a3133541e3b72 100644
--- a/clang/test/Driver/fveclib.c
+++ b/clang/test/Driver/fveclib.c
@@ -36,16 +36,23 @@

 /* Verify that the correct vector library is passed to LTO flags. */

 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s
+// CHECK-LTO-LIBMVEC: "-fmath-errno"
 // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC-X86"

 // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-MASSV %s
+// CHECK-LTO-MASSV: "-fmath-errno"
 // CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV"

 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=SVML -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SVML %s
+// CHECK-LTO-SVML: "-fmath-errno"
 // CHECK-LTO-SVML: "-plugin-opt=-vector-library=SVML"

 // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=SLEEF -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SLEEF %s
+// CHECK-LTO-SLEEF-NOT: "-fmath-errno"
 // CHECK-LTO-SLEEF: "-plugin-opt=-vector-library=sleefgnuabi"
+// CHECK-LTO-SLEEF-NOT: "-fmath-errno"

 // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-ARMPL %s
+// CHECK-LTO-ARMPL-NOT: "-fmath-errno"
 // CHECK-LTO-ARMPL: "-plugin-opt=-vector-library=ArmPL"
+// CHECK-LTO-ARMPL-NOT: "-fmath-errno"

>From 75bcb
[clang] [clang] Make -fveclib={ArmPL, SLEEF} imply -fno-math-errno (PR #112580)
@@ -36,16 +36,23 @@

 /* Verify that the correct vector library is passed to LTO flags. */

 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=LIBMVEC -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBMVEC %s
+// CHECK-LTO-LIBMVEC: "-fmath-errno"
 // CHECK-LTO-LIBMVEC: "-plugin-opt=-vector-library=LIBMVEC-X86"

 // RUN: %clang -### --target=powerpc64-unknown-linux-gnu -fveclib=MASSV -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-MASSV %s
+// CHECK-LTO-MASSV: "-fmath-errno"
 // CHECK-LTO-MASSV: "-plugin-opt=-vector-library=MASSV"

 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fveclib=SVML -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SVML %s
+// CHECK-LTO-SVML: "-fmath-errno"
 // CHECK-LTO-SVML: "-plugin-opt=-vector-library=SVML"

 // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=SLEEF -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-SLEEF %s
+// CHECK-LTO-SLEEF-NOT: "-fmath-errno"
 // CHECK-LTO-SLEEF: "-plugin-opt=-vector-library=sleefgnuabi"
+// CHECK-LTO-SLEEF-NOT: "-fmath-errno"

 // RUN: %clang -### --target=aarch64-linux-gnu -fveclib=ArmPL -flto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-ARMPL %s
+// CHECK-LTO-ARMPL-NOT: "-fmath-errno"

MacDue wrote:

Ah well, I did add it to the diagnostic :sweat_smile: (I don't think it adds much complexity, and may be helpful), but I don't mind removing it :slightly_smiling_face:

https://github.com/llvm/llvm-project/pull/112580
[clang] [clang][SME] Ignore flatten for callees with mismatched streaming attributes (PR #116391)
https://github.com/MacDue ready_for_review https://github.com/llvm/llvm-project/pull/116391
[clang] [clang][SME] Ignore flatten for callees mismatched streaming attributes (PR #116391)
https://github.com/MacDue created https://github.com/llvm/llvm-project/pull/116391 If `__attribute__((flatten))` is used on a function don't inline any callees with incompatible streaming attributes. Without this check, clang may produce incorrect code when `flatten` is used in code with streaming functions. Note: The docs for flatten say it can be ignored when inlining is impossible: "causes calls within the attributed function to be inlined unless it is impossible to do so". >From 90daf9c544bcb776c8a68ad504ba5eda50eafe8a Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Fri, 15 Nov 2024 14:35:41 + Subject: [PATCH] [clang][SME] Ignore flatten for callees mismatched streaming attributes If `__attribute__((flatten))` is used on a function don't inline any callees with incompatible streaming attributes. Without this check, clang may produce incorrect code when `flatten` is used in code with streaming functions. Note: The docs for flatten say it can be ignored when inlining is impossible: "causes calls within the attributed function to be inlined unless it is impossible to do so". --- clang/lib/CodeGen/CGCall.cpp | 11 ++- clang/lib/CodeGen/TargetInfo.h| 9 +++ clang/lib/CodeGen/Targets/AArch64.cpp | 64 +--- .../AArch64/sme-flatten-streaming-attrs.c | 74 +++ 4 files changed, 143 insertions(+), 15 deletions(-) create mode 100644 clang/test/CodeGen/AArch64/sme-flatten-streaming-attrs.c diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp index 8f4f5d3ed81601..b8a968fdf4e9eb 100644 --- a/clang/lib/CodeGen/CGCall.cpp +++ b/clang/lib/CodeGen/CGCall.cpp @@ -5112,9 +5112,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo, // Some architectures (such as x86-64) have the ABI changed based on // attribute-target/features. Give them a chance to diagnose. 
- CGM.getTargetCodeGenInfo().checkFunctionCallABI( - CGM, Loc, dyn_cast_or_null(CurCodeDecl), - dyn_cast_or_null(TargetDecl), CallArgs, RetTy); + const FunctionDecl *CallerDecl = dyn_cast_or_null(CurCodeDecl); + const FunctionDecl *CalleeDecl = dyn_cast_or_null(TargetDecl); + CGM.getTargetCodeGenInfo().checkFunctionCallABI(CGM, Loc, CallerDecl, + CalleeDecl, CallArgs, RetTy); // 1. Set up the arguments. @@ -5705,7 +5706,9 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo, // FIXME: should this really take priority over __try, below? if (CurCodeDecl && CurCodeDecl->hasAttr() && !InNoInlineAttributedStmt && - !(TargetDecl && TargetDecl->hasAttr())) { + !(TargetDecl && TargetDecl->hasAttr()) && + !CGM.getTargetCodeGenInfo().wouldInliningViolateFunctionCallABI( + CallerDecl, CalleeDecl)) { Attrs = Attrs.addFnAttribute(getLLVMContext(), llvm::Attribute::AlwaysInline); } diff --git a/clang/lib/CodeGen/TargetInfo.h b/clang/lib/CodeGen/TargetInfo.h index 373f8b8a80fdb1..23ff476b0e33ce 100644 --- a/clang/lib/CodeGen/TargetInfo.h +++ b/clang/lib/CodeGen/TargetInfo.h @@ -98,6 +98,15 @@ class TargetCodeGenInfo { const CallArgList &Args, QualType ReturnType) const {} + /// Returns true if inlining the function call would produce incorrect code + /// for the current target and should be ignored (even with the always_inline + /// or flatten attributes). + virtual bool + wouldInliningViolateFunctionCallABI(const FunctionDecl *Caller, + const FunctionDecl *Callee) const { +return false; + } + /// Determines the size of struct _Unwind_Exception on this platform, /// in 8-bit units. 
The Itanium ABI defines this as: /// struct _Unwind_Exception { diff --git a/clang/lib/CodeGen/Targets/AArch64.cpp b/clang/lib/CodeGen/Targets/AArch64.cpp index 9320c6ef06efab..a9ea84b6575f92 100644 --- a/clang/lib/CodeGen/Targets/AArch64.cpp +++ b/clang/lib/CodeGen/Targets/AArch64.cpp @@ -177,6 +177,9 @@ class AArch64TargetCodeGenInfo : public TargetCodeGenInfo { const FunctionDecl *Callee, const CallArgList &Args, QualType ReturnType) const override; + bool wouldInliningViolateFunctionCallABI( + const FunctionDecl *Caller, const FunctionDecl *Callee) const override; + private: // Diagnose calls between functions with incompatible Streaming SVE // attributes. @@ -1143,12 +1146,20 @@ void AArch64TargetCodeGenInfo::checkFunctionABI( } } -void AArch64TargetCodeGenInfo::checkFunctionCallABIStreaming( -CodeGenModule &CGM, SourceLocation CallLoc, const FunctionDecl *Caller, -const FunctionDecl *Callee) const { - if (!Caller || !Callee || !Callee->hasAttr()) -return; +enum class ArmStreamingInlinability : uint8_t { + Ok = 0, + IncompatibleStreamingModes = 1, + MismatchedStreamingCompatibility = 1 << 1, +
[clang] [clang][SME] Ignore flatten for callees with mismatched streaming attributes (PR #116391)
https://github.com/MacDue edited https://github.com/llvm/llvm-project/pull/116391
[clang] [clang][SME] Ignore flatten/clang::always_inline statements for callees with mismatched streaming attributes (PR #116391)
@@ -1143,30 +1146,63 @@ void AArch64TargetCodeGenInfo::checkFunctionABI(
 }
 }

-void AArch64TargetCodeGenInfo::checkFunctionCallABIStreaming(
-CodeGenModule &CGM, SourceLocation CallLoc, const FunctionDecl *Caller,
-const FunctionDecl *Callee) const {
- if (!Caller || !Callee || !Callee->hasAttr())
-return;
+enum class ArmSMEInlinability : uint8_t {
+ Ok = 0,
+ MismatchedStreamingCompatibility = 1 << 0,
+ IncompatibleStreamingModes = 1 << 1,
+ CalleeRequiresNewZA = 1 << 2,
+ LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/CalleeRequiresNewZA),
+};
+/// Determines if there are any Arm SME ABI issues with inlining \p Callee into
+/// \p Caller. Returns the issues in the ArmSMEInlinability bit enum (multiple
+/// bits can be set).
+static ArmSMEInlinability GetArmSMEInlinability(const FunctionDecl *Caller,
+const FunctionDecl *Callee) {
 bool CallerIsStreaming =
 IsArmStreamingFunction(Caller, /*IncludeLocallyStreaming=*/true);
 bool CalleeIsStreaming =
 IsArmStreamingFunction(Callee, /*IncludeLocallyStreaming=*/true);
 bool CallerIsStreamingCompatible = isStreamingCompatible(Caller);
 bool CalleeIsStreamingCompatible = isStreamingCompatible(Callee);

+ ArmSMEInlinability Inlinability = ArmSMEInlinability::Ok;
+
 if (!CalleeIsStreamingCompatible &&
- (CallerIsStreaming != CalleeIsStreaming || CallerIsStreamingCompatible))
-CGM.getDiags().Report(
-CallLoc, CalleeIsStreaming
- ? diag::err_function_always_inline_attribute_mismatch
- : diag::warn_function_always_inline_attribute_mismatch)
-<< Caller->getDeclName() << Callee->getDeclName() << "streaming";
+ (CallerIsStreaming != CalleeIsStreaming || CallerIsStreamingCompatible)) {
+Inlinability |= ArmSMEInlinability::MismatchedStreamingCompatibility;
+if (CalleeIsStreaming)
+ Inlinability |= ArmSMEInlinability::IncompatibleStreamingModes;
+ }

MacDue wrote:

Done :+1:

https://github.com/llvm/llvm-project/pull/116391
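Editorial sketch (not part of the patch): the quoted hunk computes a bitmask of inlining problems from the streaming state of the caller and callee. The standalone model below reproduces only the condition visible in the hunk. `StreamingInfo` and `getArmSMEInlinability` are hypothetical stand-ins for the `FunctionDecl` queries, and the hand-written operators replace what `LLVM_MARK_AS_BITMASK_ENUM` provides in LLVM. The `CalleeRequiresNewZA` bit is declared but never set, since its logic is not shown in the quoted text.

```cpp
#include <cstdint>

enum class ArmSMEInlinability : std::uint8_t {
  Ok = 0,
  MismatchedStreamingCompatibility = 1 << 0,
  IncompatibleStreamingModes = 1 << 1,
  CalleeRequiresNewZA = 1 << 2, // Declared for completeness; unused here.
};

constexpr ArmSMEInlinability operator|(ArmSMEInlinability A,
                                       ArmSMEInlinability B) {
  return static_cast<ArmSMEInlinability>(static_cast<std::uint8_t>(A) |
                                         static_cast<std::uint8_t>(B));
}

constexpr ArmSMEInlinability operator&(ArmSMEInlinability A,
                                       ArmSMEInlinability B) {
  return static_cast<ArmSMEInlinability>(static_cast<std::uint8_t>(A) &
                                         static_cast<std::uint8_t>(B));
}

constexpr ArmSMEInlinability &operator|=(ArmSMEInlinability &A,
                                         ArmSMEInlinability B) {
  A = A | B;
  return A;
}

// Hypothetical summary of what the real code reads from a FunctionDecl.
struct StreamingInfo {
  bool IsStreaming;           // Streaming or locally streaming.
  bool IsStreamingCompatible; // Callable in either mode.
};

// Mirrors the condition in the quoted hunk; multiple bits may be set.
constexpr ArmSMEInlinability getArmSMEInlinability(StreamingInfo Caller,
                                                   StreamingInfo Callee) {
  ArmSMEInlinability Inlinability = ArmSMEInlinability::Ok;
  if (!Callee.IsStreamingCompatible &&
      (Caller.IsStreaming != Callee.IsStreaming ||
       Caller.IsStreamingCompatible)) {
    Inlinability |= ArmSMEInlinability::MismatchedStreamingCompatibility;
    if (Callee.IsStreaming)
      Inlinability |= ArmSMEInlinability::IncompatibleStreamingModes;
  }
  return Inlinability;
}

// True if every bit of Flag is set in Value.
constexpr bool hasIssue(ArmSMEInlinability Value, ArmSMEInlinability Flag) {
  return (Value & Flag) == Flag;
}
```

Returning a bitmask rather than reporting directly lets one query serve two callers: the existing diagnostic can keep its error/warning split by testing individual bits, while an inlining check only needs to know whether the result differs from `Ok`.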
[clang] [clang][SME] Ignore flatten/clang::always_inline statements for callees with mismatched streaming attributes (PR #116391)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/116391 >From 90daf9c544bcb776c8a68ad504ba5eda50eafe8a Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Fri, 15 Nov 2024 14:35:41 + Subject: [PATCH 1/6] [clang][SME] Ignore flatten for callees mismatched streaming attributes If `__attribute__((flatten))` is used on a function don't inline any callees with incompatible streaming attributes. Without this check, clang may produce incorrect code when `flatten` is used in code with streaming functions. Note: The docs for flatten say it can be ignored when inlining is impossible: "causes calls within the attributed function to be inlined unless it is impossible to do so". --- clang/lib/CodeGen/CGCall.cpp | 11 ++- clang/lib/CodeGen/TargetInfo.h| 9 +++ clang/lib/CodeGen/Targets/AArch64.cpp | 64 +--- .../AArch64/sme-flatten-streaming-attrs.c | 74 +++ 4 files changed, 143 insertions(+), 15 deletions(-) create mode 100644 clang/test/CodeGen/AArch64/sme-flatten-streaming-attrs.c diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp index 8f4f5d3ed81601..b8a968fdf4e9eb 100644 --- a/clang/lib/CodeGen/CGCall.cpp +++ b/clang/lib/CodeGen/CGCall.cpp @@ -5112,9 +5112,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo, // Some architectures (such as x86-64) have the ABI changed based on // attribute-target/features. Give them a chance to diagnose. - CGM.getTargetCodeGenInfo().checkFunctionCallABI( - CGM, Loc, dyn_cast_or_null(CurCodeDecl), - dyn_cast_or_null(TargetDecl), CallArgs, RetTy); + const FunctionDecl *CallerDecl = dyn_cast_or_null(CurCodeDecl); + const FunctionDecl *CalleeDecl = dyn_cast_or_null(TargetDecl); + CGM.getTargetCodeGenInfo().checkFunctionCallABI(CGM, Loc, CallerDecl, + CalleeDecl, CallArgs, RetTy); // 1. Set up the arguments. @@ -5705,7 +5706,9 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo, // FIXME: should this really take priority over __try, below? 
if (CurCodeDecl && CurCodeDecl->hasAttr() && !InNoInlineAttributedStmt && - !(TargetDecl && TargetDecl->hasAttr())) { + !(TargetDecl && TargetDecl->hasAttr()) && + !CGM.getTargetCodeGenInfo().wouldInliningViolateFunctionCallABI( + CallerDecl, CalleeDecl)) { Attrs = Attrs.addFnAttribute(getLLVMContext(), llvm::Attribute::AlwaysInline); } diff --git a/clang/lib/CodeGen/TargetInfo.h b/clang/lib/CodeGen/TargetInfo.h index 373f8b8a80fdb1..23ff476b0e33ce 100644 --- a/clang/lib/CodeGen/TargetInfo.h +++ b/clang/lib/CodeGen/TargetInfo.h @@ -98,6 +98,15 @@ class TargetCodeGenInfo { const CallArgList &Args, QualType ReturnType) const {} + /// Returns true if inlining the function call would produce incorrect code + /// for the current target and should be ignored (even with the always_inline + /// or flatten attributes). + virtual bool + wouldInliningViolateFunctionCallABI(const FunctionDecl *Caller, + const FunctionDecl *Callee) const { +return false; + } + /// Determines the size of struct _Unwind_Exception on this platform, /// in 8-bit units. The Itanium ABI defines this as: /// struct _Unwind_Exception { diff --git a/clang/lib/CodeGen/Targets/AArch64.cpp b/clang/lib/CodeGen/Targets/AArch64.cpp index 9320c6ef06efab..a9ea84b6575f92 100644 --- a/clang/lib/CodeGen/Targets/AArch64.cpp +++ b/clang/lib/CodeGen/Targets/AArch64.cpp @@ -177,6 +177,9 @@ class AArch64TargetCodeGenInfo : public TargetCodeGenInfo { const FunctionDecl *Callee, const CallArgList &Args, QualType ReturnType) const override; + bool wouldInliningViolateFunctionCallABI( + const FunctionDecl *Caller, const FunctionDecl *Callee) const override; + private: // Diagnose calls between functions with incompatible Streaming SVE // attributes. 
@@ -1143,12 +1146,20 @@ void AArch64TargetCodeGenInfo::checkFunctionABI( } } -void AArch64TargetCodeGenInfo::checkFunctionCallABIStreaming( -CodeGenModule &CGM, SourceLocation CallLoc, const FunctionDecl *Caller, -const FunctionDecl *Callee) const { - if (!Caller || !Callee || !Callee->hasAttr()) -return; +enum class ArmStreamingInlinability : uint8_t { + Ok = 0, + IncompatibleStreamingModes = 1, + MismatchedStreamingCompatibility = 1 << 1, + CalleeHasNewZA = 1 << 2, + LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/CalleeHasNewZA), +}; +/// Determines if there are any streaming ABI issues with inlining \p Callee +/// into \p Caller. Returns the issues in the ArmStreamingInlinability bit enum +/// (multiple bits can be set). +static ArmStreamingInlinability +GetArmStreamingInlinability(const FunctionDecl *Caller, +
[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)
MacDue wrote:

> > > How does this interact with #107598?
> >
> > Though this also changes things to ensure when TBAA data is set, it's always set on the call.
>
> Wasn't already doing that? (setting the TBAA on the call?).

It was setting it on `cast(Call.getScalarVal());` not the call (which you can get via an output on `EmitCall()`). At least in this case that meant it was putting the TBAA metadata on the `load x86_fp80` after the call. I'm not sure if there's other cases where something similar could happen.

https://github.com/llvm/llvm-project/pull/108853
[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/108853 >From 6db9f6d56f0bbd56d017156f858eae68653fbd1b Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Mon, 16 Sep 2024 16:27:23 + Subject: [PATCH 1/3] Precommit math-libcalls-tbaa-indirect-args.c --- .../math-libcalls-tbaa-indirect-args.c| 38 +++ 1 file changed, 38 insertions(+) create mode 100644 clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c diff --git a/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c new file mode 100644 index 00..dd013dcc8b3ca8 --- /dev/null +++ b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c @@ -0,0 +1,38 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5 +// RUN: %clang_cc1 -triple=x86_64-w64-mingw32 -fmath-errno -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes=CHECK + +long double powl(long double a, long double b); + +// Negative test: powl is a floating-point math function that is +// ConstWithoutErrnoAndExceptions, however, for this target long doubles are +// passed indirectly via a pointer. Annotating the call with "int" TBAA metadata +// will cause the setup for the BYVAL arguments to be incorrectly optimized out. 
+ +// CHECK-LABEL: define dso_local void @test_powl( +// CHECK-SAME: ptr dead_on_unwind noalias nocapture writable writeonly sret(x86_fp80) align 16 [[AGG_RESULT:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT:[[TMP:%.*]] = alloca x86_fp80, align 16 +// CHECK-NEXT:[[BYVAL_TEMP:%.*]] = alloca x86_fp80, align 16 +// CHECK-NEXT:[[BYVAL_TEMP1:%.*]] = alloca x86_fp80, align 16 +// CHECK-NEXT:call void @llvm.lifetime.start.p0(i64 16, ptr nonnull [[BYVAL_TEMP]]) #[[ATTR3:[0-9]+]] +// CHECK-NEXT:store x86_fp80 0xK40008000, ptr [[BYVAL_TEMP]], align 16, !tbaa [[TBAA3:![0-9]+]] +// CHECK-NEXT:call void @llvm.lifetime.start.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]] +// CHECK-NEXT:store x86_fp80 0xK40008000, ptr [[BYVAL_TEMP1]], align 16, !tbaa [[TBAA3]] +// CHECK-NEXT:call void @powl(ptr dead_on_unwind nonnull writable sret(x86_fp80) align 16 [[TMP]], ptr noundef nonnull [[BYVAL_TEMP]], ptr noundef nonnull [[BYVAL_TEMP1]]) #[[ATTR3]] +// CHECK-NEXT:[[TMP0:%.*]] = load x86_fp80, ptr [[TMP]], align 16, !tbaa [[TBAA7:![0-9]+]] +// CHECK-NEXT:call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP]]) #[[ATTR3]] +// CHECK-NEXT:call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]] +// CHECK-NEXT:store x86_fp80 [[TMP0]], ptr [[AGG_RESULT]], align 16, !tbaa [[TBAA3]] +// CHECK-NEXT:ret void +// +long double test_powl() { + return powl(2.0L, 2.0L); // Don't emit TBAA metadata +} +//. +// CHECK: [[TBAA3]] = !{[[META4:![0-9]+]], [[META4]], i64 0} +// CHECK: [[META4]] = !{!"long double", [[META5:![0-9]+]], i64 0} +// CHECK: [[META5]] = !{!"omnipotent char", [[META6:![0-9]+]], i64 0} +// CHECK: [[META6]] = !{!"Simple C/C++ TBAA"} +// CHECK: [[TBAA7]] = !{[[META8:![0-9]+]], [[META8]], i64 0} +// CHECK: [[META8]] = !{!"int", [[META5]], i64 0} +//. 
>From 482639f9a8df7785d1b24c723571f477eb5febd7 Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Mon, 16 Sep 2024 16:14:01 + Subject: [PATCH 2/3] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args On some targets, an FP libcall with argument types such as long double will be lowered to pass arguments indirectly via pointers. When this is the case we should not mark the libcall with "int" TBAA as it may lead to incorrect optimizations. Currently, this can be seen for long doubles on x86_64-w64-mingw32. The `load x86_fp80` after the call is (incorrectly) marked with "int" TBAA (overwriting the previous metadata for "long double"). Nothing seems to break due to this currently as the metadata is being incorrectly placed on the load and not the call. But if the metadata is moved to the call (which this patch ensures), LLVM will optimize out the setup for the arguments. --- clang/lib/CodeGen/CGBuiltin.cpp | 24 +++ clang/lib/CodeGen/CGExpr.cpp | 6 - clang/lib/CodeGen/CodeGenFunction.h | 3 ++- .../math-libcalls-tbaa-indirect-args.c| 4 +--- 4 files changed, 27 insertions(+), 10 deletions(-) diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index 27abeba92999b3..5730e7867a648f 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -690,23 +690,37 @@ static RValue emitLibraryCall(CodeGenFunction &CGF, const FunctionDecl *FD, const CallExpr *E, llvm::Constant *calleeValue) { CodeGenFunction::CGFPOptionsRAII FPOptsRAII(CGF, E); CGCallee callee = CGCallee::forDirect(calleeValue, GlobalDecl(FD)); + llvm::CallBase *
[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)
MacDue wrote: > > > > > How does this interact with #107598? > > > > > > > > > > > > Though this also changes things to ensure when TBAA data is set, it's > > > > always set on the call. > > > > > > > > > Wasn't already doing that? (setting the TBAA on the call?). > > > > > > It was setting it on `cast(Call.getScalarVal())` not > > necessarily the call (which you can get via an output on `EmitCall()`). At > > least in this case that meant it was putting the TBAA metadata on the `load > > x86_fp80` after the call. I'm not sure if there's other cases where > > something similar could happen. > > Without this patch and without (#107598) the function `pow` doesn't generate > `int TBAA` info on the call, but it does on a call to `cargl` with `-triple > aarch64-unknown-unknown`. > > `# | %call = tail call fp128 @cargl([2 x fp128] noundef alignstack(16) undef) > #1, !tbaa !2` I'm not sure I follow, but the issue I spotted was that "int" TBAA metadata was being set on the load following the `pow` call; it has the same root cause as the `cargl` issue.
See the diff from the first commit:

```diff
diff --git a/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c
index dd013dcc8b3ca8..56c6b3ec00bc7e 100644
--- a/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c
+++ b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c
@@ -19,7 +19,7 @@ long double powl(long double a, long double b);
 // CHECK-NEXT:call void @llvm.lifetime.start.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
 // CHECK-NEXT:store x86_fp80 0xK40008000, ptr [[BYVAL_TEMP1]], align 16, !tbaa [[TBAA3]]
 // CHECK-NEXT:call void @powl(ptr dead_on_unwind nonnull writable sret(x86_fp80) align 16 [[TMP]], ptr noundef nonnull [[BYVAL_TEMP]], ptr noundef nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
-// CHECK-NEXT:[[TMP0:%.*]] = load x86_fp80, ptr [[TMP]], align 16, !tbaa [[TBAA7:![0-9]+]]
+// CHECK-NEXT:[[TMP0:%.*]] = load x86_fp80, ptr [[TMP]], align 16, !tbaa [[TBAA3]]
 // CHECK-NEXT:call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP]]) #[[ATTR3]]
 // CHECK-NEXT:call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]]
 // CHECK-NEXT:store x86_fp80 [[TMP0]], ptr [[AGG_RESULT]], align 16, !tbaa [[TBAA3]]
@@ -33,6 +33,4 @@ long double test_powl() {
 // CHECK: [[META4]] = !{!"long double", [[META5:![0-9]+]], i64 0}
 // CHECK: [[META5]] = !{!"omnipotent char", [[META6:![0-9]+]], i64 0}
 // CHECK: [[META6]] = !{!"Simple C/C++ TBAA"}
-// CHECK: [[TBAA7]] = !{[[META8:![0-9]+]], [[META8]], i64 0}
-// CHECK: [[META8]] = !{!"int", [[META5]], i64 0}
 //.
```

---

> but it does on a call to cargl with `-triple aarch64-unknown-unknown`.
> `# | %call = tail call fp128 @cargl([2 x fp128] noundef alignstack(16) undef) #1, !tbaa !2`

That looks okay though? It's not passing or returning values via pointers, so it should be safe to set the "int" TBAA (which indicates the only pointer it could read is errno).
https://github.com/llvm/llvm-project/pull/108853 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)
MacDue wrote: > Not really! `int TBAA` in in our downstream compiler is interpreted as > describing the function arguments (they are not int) and the `load/store` of > the argument before the library call are begin eliminated which result in > unexpected behavior. > I am not objecting to anything; I am just wondering what difference it makes > to have the TBAA attached to the call instead of how it is now. I believe the intention of both #96025 and #100302 was to set the metadata on the call. The case I mentioned where it was attached to the load after the call just seemed like an unintended consequence of attaching the metadata to the result value. It sounds like both those changes break your downstream compiler? https://github.com/llvm/llvm-project/pull/108853
[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)
MacDue wrote: > Not really! int TBAA in in our downstream compiler is interpreted as > describing the function arguments (they are not int) and the load/store of > the argument before the library call are begin eliminated which result in > unexpected behavior. Oh wait, when you say "not really" you're only referring to the case where the arguments are passed _indirectly_ (i.e. via pointers). That is what this patch is preventing (not the case where they're passed via floating-point registers). https://github.com/llvm/llvm-project/pull/108853
[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/108853 >From 6db9f6d56f0bbd56d017156f858eae68653fbd1b Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Mon, 16 Sep 2024 16:27:23 + Subject: [PATCH 1/4] Precommit math-libcalls-tbaa-indirect-args.c --- .../math-libcalls-tbaa-indirect-args.c| 38 +++ 1 file changed, 38 insertions(+) create mode 100644 clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c diff --git a/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c new file mode 100644 index 00..dd013dcc8b3ca8 --- /dev/null +++ b/clang/test/CodeGen/math-libcalls-tbaa-indirect-args.c @@ -0,0 +1,38 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5 +// RUN: %clang_cc1 -triple=x86_64-w64-mingw32 -fmath-errno -O3 -emit-llvm -o - %s | FileCheck %s -check-prefixes=CHECK + +long double powl(long double a, long double b); + +// Negative test: powl is a floating-point math function that is +// ConstWithoutErrnoAndExceptions, however, for this target long doubles are +// passed indirectly via a pointer. Annotating the call with "int" TBAA metadata +// will cause the setup for the BYVAL arguments to be incorrectly optimized out. 
+ +// CHECK-LABEL: define dso_local void @test_powl( +// CHECK-SAME: ptr dead_on_unwind noalias nocapture writable writeonly sret(x86_fp80) align 16 [[AGG_RESULT:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: [[ENTRY:.*:]] +// CHECK-NEXT:[[TMP:%.*]] = alloca x86_fp80, align 16 +// CHECK-NEXT:[[BYVAL_TEMP:%.*]] = alloca x86_fp80, align 16 +// CHECK-NEXT:[[BYVAL_TEMP1:%.*]] = alloca x86_fp80, align 16 +// CHECK-NEXT:call void @llvm.lifetime.start.p0(i64 16, ptr nonnull [[BYVAL_TEMP]]) #[[ATTR3:[0-9]+]] +// CHECK-NEXT:store x86_fp80 0xK40008000, ptr [[BYVAL_TEMP]], align 16, !tbaa [[TBAA3:![0-9]+]] +// CHECK-NEXT:call void @llvm.lifetime.start.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]] +// CHECK-NEXT:store x86_fp80 0xK40008000, ptr [[BYVAL_TEMP1]], align 16, !tbaa [[TBAA3]] +// CHECK-NEXT:call void @powl(ptr dead_on_unwind nonnull writable sret(x86_fp80) align 16 [[TMP]], ptr noundef nonnull [[BYVAL_TEMP]], ptr noundef nonnull [[BYVAL_TEMP1]]) #[[ATTR3]] +// CHECK-NEXT:[[TMP0:%.*]] = load x86_fp80, ptr [[TMP]], align 16, !tbaa [[TBAA7:![0-9]+]] +// CHECK-NEXT:call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP]]) #[[ATTR3]] +// CHECK-NEXT:call void @llvm.lifetime.end.p0(i64 16, ptr nonnull [[BYVAL_TEMP1]]) #[[ATTR3]] +// CHECK-NEXT:store x86_fp80 [[TMP0]], ptr [[AGG_RESULT]], align 16, !tbaa [[TBAA3]] +// CHECK-NEXT:ret void +// +long double test_powl() { + return powl(2.0L, 2.0L); // Don't emit TBAA metadata +} +//. +// CHECK: [[TBAA3]] = !{[[META4:![0-9]+]], [[META4]], i64 0} +// CHECK: [[META4]] = !{!"long double", [[META5:![0-9]+]], i64 0} +// CHECK: [[META5]] = !{!"omnipotent char", [[META6:![0-9]+]], i64 0} +// CHECK: [[META6]] = !{!"Simple C/C++ TBAA"} +// CHECK: [[TBAA7]] = !{[[META8:![0-9]+]], [[META8]], i64 0} +// CHECK: [[META8]] = !{!"int", [[META5]], i64 0} +//. 
>From 482639f9a8df7785d1b24c723571f477eb5febd7 Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Mon, 16 Sep 2024 16:14:01 + Subject: [PATCH 2/4] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args On some targets, an FP libcall with argument types such as long double will be lowered to pass arguments indirectly via pointers. When this is the case we should not mark the libcall with "int" TBAA as it may lead to incorrect optimizations. Currently, this can be seen for long doubles on x86_64-w64-mingw32. The `load x86_fp80` after the call is (incorrectly) marked with "int" TBAA (overwriting the previous metadata for "long double"). Nothing seems to break due to this currently as the metadata is being incorrectly placed on the load and not the call. But if the metadata is moved to the call (which this patch ensures), LLVM will optimize out the setup for the arguments. --- clang/lib/CodeGen/CGBuiltin.cpp | 24 +++ clang/lib/CodeGen/CGExpr.cpp | 6 - clang/lib/CodeGen/CodeGenFunction.h | 3 ++- .../math-libcalls-tbaa-indirect-args.c| 4 +--- 4 files changed, 27 insertions(+), 10 deletions(-) diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index 27abeba92999b3..5730e7867a648f 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -690,23 +690,37 @@ static RValue emitLibraryCall(CodeGenFunction &CGF, const FunctionDecl *FD, const CallExpr *E, llvm::Constant *calleeValue) { CodeGenFunction::CGFPOptionsRAII FPOptsRAII(CGF, E); CGCallee callee = CGCallee::forDirect(calleeValue, GlobalDecl(FD)); + llvm::CallBase *
[clang] [clang][codegen] Don't mark "int" TBAA on FP libcalls with indirect args (PR #108853)
MacDue wrote: > Yes exactly! Have you tried running the test case in patch #107598 to see > what IR it generates? I've updated my test (based on yours), and I think you can now see the TBAA metadata is only set on the call when the arguments are not being passed via pointers. https://github.com/llvm/llvm-project/pull/108853
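For readers following the thread, the rule discussed above (only tag the libcall with "int" TBAA when nothing is passed or returned through memory) can be sketched as a small standalone predicate. This is a toy model under stated assumptions: `PassKind`, `LibcallABI`, and `canAnnotateWithIntTBAA` are hypothetical names made up for illustration and are not part of clang's CodeGen API or of the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified view of how one value crosses a call boundary.
// Direct: in a register. Indirect: through a pointer to memory.
enum class PassKind { Direct, Indirect };

// Hypothetical summary of a lowered libcall: how the return value and each
// argument are passed on the current target.
struct LibcallABI {
  PassKind Return;
  std::vector<PassKind> Args;
};

// Returns true only if no argument and no return value goes through memory.
// In that case the only memory the libcall may touch is errno, which is the
// assumption the "int" TBAA tag encodes in the discussion above.
bool canAnnotateWithIntTBAA(const LibcallABI &ABI) {
  if (ABI.Return == PassKind::Indirect)
    return false;
  for (PassKind Arg : ABI.Args)
    if (Arg == PassKind::Indirect)
      return false;
  return true;
}
```

With this model, a `powl` call whose `long double` operands and result are passed via pointers (as on x86_64-w64-mingw32 in the test) is rejected, while a call that keeps everything in registers is accepted.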
[clang] [flang] [clang][driver] When -fveclib=ArmPL flang is in use, always link against libamath (PR #116432)
MacDue wrote: Typo in PR title `flang` -> `flag` https://github.com/llvm/llvm-project/pull/116432
[clang] [clang][SME] Ignore flatten/clang::always_inline statements for callees with mismatched streaming attributes (PR #116391)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/116391 >From 90daf9c544bcb776c8a68ad504ba5eda50eafe8a Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Fri, 15 Nov 2024 14:35:41 + Subject: [PATCH 1/5] [clang][SME] Ignore flatten for callees mismatched streaming attributes If `__attribute__((flatten))` is used on a function don't inline any callees with incompatible streaming attributes. Without this check, clang may produce incorrect code when `flatten` is used in code with streaming functions. Note: The docs for flatten say it can be ignored when inlining is impossible: "causes calls within the attributed function to be inlined unless it is impossible to do so". --- clang/lib/CodeGen/CGCall.cpp | 11 ++- clang/lib/CodeGen/TargetInfo.h| 9 +++ clang/lib/CodeGen/Targets/AArch64.cpp | 64 +--- .../AArch64/sme-flatten-streaming-attrs.c | 74 +++ 4 files changed, 143 insertions(+), 15 deletions(-) create mode 100644 clang/test/CodeGen/AArch64/sme-flatten-streaming-attrs.c diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp index 8f4f5d3ed81601..b8a968fdf4e9eb 100644 --- a/clang/lib/CodeGen/CGCall.cpp +++ b/clang/lib/CodeGen/CGCall.cpp @@ -5112,9 +5112,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo, // Some architectures (such as x86-64) have the ABI changed based on // attribute-target/features. Give them a chance to diagnose. - CGM.getTargetCodeGenInfo().checkFunctionCallABI( - CGM, Loc, dyn_cast_or_null(CurCodeDecl), - dyn_cast_or_null(TargetDecl), CallArgs, RetTy); + const FunctionDecl *CallerDecl = dyn_cast_or_null(CurCodeDecl); + const FunctionDecl *CalleeDecl = dyn_cast_or_null(TargetDecl); + CGM.getTargetCodeGenInfo().checkFunctionCallABI(CGM, Loc, CallerDecl, + CalleeDecl, CallArgs, RetTy); // 1. Set up the arguments. @@ -5705,7 +5706,9 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo, // FIXME: should this really take priority over __try, below? 
if (CurCodeDecl && CurCodeDecl->hasAttr() && !InNoInlineAttributedStmt && - !(TargetDecl && TargetDecl->hasAttr())) { + !(TargetDecl && TargetDecl->hasAttr()) && + !CGM.getTargetCodeGenInfo().wouldInliningViolateFunctionCallABI( + CallerDecl, CalleeDecl)) { Attrs = Attrs.addFnAttribute(getLLVMContext(), llvm::Attribute::AlwaysInline); } diff --git a/clang/lib/CodeGen/TargetInfo.h b/clang/lib/CodeGen/TargetInfo.h index 373f8b8a80fdb1..23ff476b0e33ce 100644 --- a/clang/lib/CodeGen/TargetInfo.h +++ b/clang/lib/CodeGen/TargetInfo.h @@ -98,6 +98,15 @@ class TargetCodeGenInfo { const CallArgList &Args, QualType ReturnType) const {} + /// Returns true if inlining the function call would produce incorrect code + /// for the current target and should be ignored (even with the always_inline + /// or flatten attributes). + virtual bool + wouldInliningViolateFunctionCallABI(const FunctionDecl *Caller, + const FunctionDecl *Callee) const { +return false; + } + /// Determines the size of struct _Unwind_Exception on this platform, /// in 8-bit units. The Itanium ABI defines this as: /// struct _Unwind_Exception { diff --git a/clang/lib/CodeGen/Targets/AArch64.cpp b/clang/lib/CodeGen/Targets/AArch64.cpp index 9320c6ef06efab..a9ea84b6575f92 100644 --- a/clang/lib/CodeGen/Targets/AArch64.cpp +++ b/clang/lib/CodeGen/Targets/AArch64.cpp @@ -177,6 +177,9 @@ class AArch64TargetCodeGenInfo : public TargetCodeGenInfo { const FunctionDecl *Callee, const CallArgList &Args, QualType ReturnType) const override; + bool wouldInliningViolateFunctionCallABI( + const FunctionDecl *Caller, const FunctionDecl *Callee) const override; + private: // Diagnose calls between functions with incompatible Streaming SVE // attributes. 
@@ -1143,12 +1146,20 @@ void AArch64TargetCodeGenInfo::checkFunctionABI( } } -void AArch64TargetCodeGenInfo::checkFunctionCallABIStreaming( -CodeGenModule &CGM, SourceLocation CallLoc, const FunctionDecl *Caller, -const FunctionDecl *Callee) const { - if (!Caller || !Callee || !Callee->hasAttr()) -return; +enum class ArmStreamingInlinability : uint8_t { + Ok = 0, + IncompatibleStreamingModes = 1, + MismatchedStreamingCompatibility = 1 << 1, + CalleeHasNewZA = 1 << 2, + LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/CalleeHasNewZA), +}; +/// Determines if there are any streaming ABI issues with inlining \p Callee +/// into \p Caller. Returns the issues in the ArmStreamingInlinability bit enum +/// (multiple bits can be set). +static ArmStreamingInlinability +GetArmStreamingInlinability(const FunctionDecl *Caller, +
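The patch above is cut off before the body of `GetArmStreamingInlinability`, so the following is only a sketch of the general pattern it introduces: collect ABI issues in a bitmask enum, then refuse `always_inline` if any bit is set. The enumerator names are taken from the diff; the `FnAttrs` struct, the helper functions, and especially the classification rule are simplified stand-ins chosen for illustration, not the logic in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Enumerators mirror the names in the diff; the operators below replace
// LLVM_MARK_AS_BITMASK_ENUM so the sketch has no LLVM dependency.
enum class Inlinability : std::uint8_t {
  Ok = 0,
  IncompatibleStreamingModes = 1,
  MismatchedStreamingCompatibility = 1 << 1,
  CalleeHasNewZA = 1 << 2,
};

constexpr Inlinability operator|(Inlinability A, Inlinability B) {
  return static_cast<Inlinability>(static_cast<std::uint8_t>(A) |
                                   static_cast<std::uint8_t>(B));
}

constexpr Inlinability operator&(Inlinability A, Inlinability B) {
  return static_cast<Inlinability>(static_cast<std::uint8_t>(A) &
                                   static_cast<std::uint8_t>(B));
}

// True if the given bit is present in the set of issues.
constexpr bool hasIssue(Inlinability Set, Inlinability Bit) {
  return (Set & Bit) != Inlinability::Ok;
}

// Hypothetical stand-in for the attributes read from a FunctionDecl.
struct FnAttrs {
  bool Streaming;
  bool StreamingCompatible;
  bool NewZA;
};

// Assumed, simplified rule: a streaming-mode mismatch matters unless the
// callee is streaming-compatible, and a callee with new ZA state is always
// flagged. The real patch may distinguish more cases.
Inlinability classify(const FnAttrs &Caller, const FnAttrs &Callee) {
  Inlinability Result = Inlinability::Ok;
  if (!Callee.StreamingCompatible && Caller.Streaming != Callee.Streaming)
    Result = Result | Inlinability::IncompatibleStreamingModes;
  if (Callee.NewZA)
    Result = Result | Inlinability::CalleeHasNewZA;
  return Result;
}

// Same shape as the new TargetCodeGenInfo hook: any issue blocks inlining.
bool wouldInliningViolateABI(const FnAttrs &Caller, const FnAttrs &Callee) {
  return classify(Caller, Callee) != Inlinability::Ok;
}
```

Returning a bitmask rather than a bool lets one helper serve both the diagnostics path, which reports each issue separately, and the inlining check, which only asks whether any issue exists.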
[clang] [AArch64][SVE] Fold svrev(svrev(v)) to v (PR #116422)
MacDue wrote: Is this intentionally left as a draft? https://github.com/llvm/llvm-project/pull/116422
[clang] [clang][AArch64] Fix C++11 style initialization of typedef'd vectors (PR #118956)
https://github.com/MacDue created https://github.com/llvm/llvm-project/pull/118956 Previously, this hit an `llvm_unreachable()` assertion as the type of `vec_t` did not exactly match the return type of `svdup_s8`, as it was wrapped in a typedef. Comparing the canonical types instead allows the types to match correctly and avoids the crash. Fixes #107609 >From cb9857aad6f84e4ac473f572a828ea5db6d4fd58 Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Fri, 6 Dec 2024 11:42:11 + Subject: [PATCH] [clang][AArch64] Fix C++11 style initialization of typedef'd vectors Previously, this hit an `llvm_unreachable()` assertion as the type of `vec_t` did not exactly match the return type of `svdup_s8`, as it was wrapped in a typedef. Comparing the canonical types instead allows the types to match correctly and avoids the crash. Fixes #107609 --- clang/lib/CodeGen/CGExprScalar.cpp| 3 ++- .../aarch64-sve-vector-init-typedef.cpp | 23 +++ 2 files changed, 25 insertions(+), 1 deletion(-) create mode 100644 clang/test/CodeGenCXX/aarch64-sve-vector-init-typedef.cpp diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp index 4ae8a2b22b1bba..bbf68a4c66192a 100644 --- a/clang/lib/CodeGen/CGExprScalar.cpp +++ b/clang/lib/CodeGen/CGExprScalar.cpp @@ -2102,7 +2102,8 @@ Value *ScalarExprEmitter::VisitInitListExpr(InitListExpr *E) { Expr *InitVector = E->getInit(0); // Initialize from another scalable vector of the same type. 
- if (InitVector->getType() == E->getType()) + if (InitVector->getType().getCanonicalType() == + E->getType().getCanonicalType()) return Visit(InitVector); } diff --git a/clang/test/CodeGenCXX/aarch64-sve-vector-init-typedef.cpp b/clang/test/CodeGenCXX/aarch64-sve-vector-init-typedef.cpp new file mode 100644 index 00..3ac0fc5f39a566 --- /dev/null +++ b/clang/test/CodeGenCXX/aarch64-sve-vector-init-typedef.cpp @@ -0,0 +1,23 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 2 +// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64-none-linux-gnu -target-feature +sve -emit-llvm -o - %s | FileCheck %s + +#include + +using vec_t = svint8_t; + +/// From: https://github.com/llvm/llvm-project/issues/107609 +/// The type of `vec` is a typedef of svint8_t, while svdup_s8 returns the non-typedef'd type. + +// CHECK-LABEL: define dso_local @_Z20sve_init_dup_typedefv +// CHECK-SAME: () #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[VEC:%.*]] = alloca , align 16 +// CHECK-NEXT:[[TMP0:%.*]] = call @llvm.aarch64.sve.dup.x.nxv16i8(i8 2) +// CHECK-NEXT:store [[TMP0]], ptr [[VEC]], align 16 +// CHECK-NEXT:[[TMP1:%.*]] = load , ptr [[VEC]], align 16 +// CHECK-NEXT:ret [[TMP1]] +// +vec_t sve_init_dup_typedef() { + vec_t vec{svdup_s8(2)}; + return vec; +} ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
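The fix above swaps an exact type comparison for a comparison of canonical types. A toy model of why that matters, assuming a type representation where a typedef is a separate node pointing at the type it names: `ToyType`, `canonical`, `sameSugaredType`, and `sameCanonicalType` are hypothetical names for this sketch and do not reflect how clang's `QualType` is implemented.

```cpp
#include <cassert>
#include <string>

// Toy type node: a typedef ("sugar") points at the type it names, while a
// canonical type has no underlying type.
struct ToyType {
  std::string Name;
  const ToyType *Underlying; // nullptr for a canonical type
};

// Strip all typedef layers to reach the canonical type.
const ToyType *canonical(const ToyType *T) {
  while (T->Underlying)
    T = T->Underlying;
  return T;
}

// Analogous to the old check: the two nodes must be the very same node,
// so a typedef never matches the type it names.
bool sameSugaredType(const ToyType *A, const ToyType *B) { return A == B; }

// Analogous to the new check: compare after stripping typedefs.
bool sameCanonicalType(const ToyType *A, const ToyType *B) {
  return canonical(A) == canonical(B);
}
```

In the test from the patch, `vec_t` plays the role of the typedef node and the return type of `svdup_s8` plays the role of the node it names, so only the canonical comparison treats them as equal.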
[clang] [clang][AArch64] Fix C++11 style initialization of typedef'd vectors (PR #118956)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/118956 >From cb9857aad6f84e4ac473f572a828ea5db6d4fd58 Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Fri, 6 Dec 2024 11:42:11 + Subject: [PATCH 1/2] [clang][AArch64] Fix C++11 style initialization of typedef'd vectors Previously, this hit an `llvm_unreachable()` assertion as the type of `vec_t` did not exactly match the return type of `svdup_s8`, as it was wrapped in a typedef. Comparing the canonical types instead allows the types to match correctly and avoids the crash. Fixes #107609 --- clang/lib/CodeGen/CGExprScalar.cpp| 3 ++- .../aarch64-sve-vector-init-typedef.cpp | 23 +++ 2 files changed, 25 insertions(+), 1 deletion(-) create mode 100644 clang/test/CodeGenCXX/aarch64-sve-vector-init-typedef.cpp diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp index 4ae8a2b22b1bba..bbf68a4c66192a 100644 --- a/clang/lib/CodeGen/CGExprScalar.cpp +++ b/clang/lib/CodeGen/CGExprScalar.cpp @@ -2102,7 +2102,8 @@ Value *ScalarExprEmitter::VisitInitListExpr(InitListExpr *E) { Expr *InitVector = E->getInit(0); // Initialize from another scalable vector of the same type. 
- if (InitVector->getType() == E->getType()) + if (InitVector->getType().getCanonicalType() == + E->getType().getCanonicalType()) return Visit(InitVector); } diff --git a/clang/test/CodeGenCXX/aarch64-sve-vector-init-typedef.cpp b/clang/test/CodeGenCXX/aarch64-sve-vector-init-typedef.cpp new file mode 100644 index 00..3ac0fc5f39a566 --- /dev/null +++ b/clang/test/CodeGenCXX/aarch64-sve-vector-init-typedef.cpp @@ -0,0 +1,23 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 2 +// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64-none-linux-gnu -target-feature +sve -emit-llvm -o - %s | FileCheck %s + +#include + +using vec_t = svint8_t; + +/// From: https://github.com/llvm/llvm-project/issues/107609 +/// The type of `vec` is a typedef of svint8_t, while svdup_s8 returns the non-typedef'd type. + +// CHECK-LABEL: define dso_local @_Z20sve_init_dup_typedefv +// CHECK-SAME: () #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[VEC:%.*]] = alloca , align 16 +// CHECK-NEXT:[[TMP0:%.*]] = call @llvm.aarch64.sve.dup.x.nxv16i8(i8 2) +// CHECK-NEXT:store [[TMP0]], ptr [[VEC]], align 16 +// CHECK-NEXT:[[TMP1:%.*]] = load , ptr [[VEC]], align 16 +// CHECK-NEXT:ret [[TMP1]] +// +vec_t sve_init_dup_typedef() { + vec_t vec{svdup_s8(2)}; + return vec; +} >From 1f6b3958fc05f8630012097c89d00492675c6b9b Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Fri, 6 Dec 2024 14:05:16 + Subject: [PATCH 2/2] Fixups --- .../aarch64-sve-vector-init-typedef.cpp | 23 --- .../CodeGenCXX/aarch64-sve-vector-init.cpp| 21 + 2 files changed, 21 insertions(+), 23 deletions(-) delete mode 100644 clang/test/CodeGenCXX/aarch64-sve-vector-init-typedef.cpp diff --git a/clang/test/CodeGenCXX/aarch64-sve-vector-init-typedef.cpp b/clang/test/CodeGenCXX/aarch64-sve-vector-init-typedef.cpp deleted file mode 100644 index 3ac0fc5f39a566..00 --- a/clang/test/CodeGenCXX/aarch64-sve-vector-init-typedef.cpp +++ /dev/null @@ -1,23 +0,0 @@ -// NOTE: 
Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 2 -// RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64-none-linux-gnu -target-feature +sve -emit-llvm -o - %s | FileCheck %s - -#include - -using vec_t = svint8_t; - -/// From: https://github.com/llvm/llvm-project/issues/107609 -/// The type of `vec` is a typedef of svint8_t, while svdup_s8 returns the non-typedef'd type. - -// CHECK-LABEL: define dso_local @_Z20sve_init_dup_typedefv -// CHECK-SAME: () #[[ATTR0:[0-9]+]] { -// CHECK-NEXT: entry: -// CHECK-NEXT:[[VEC:%.*]] = alloca , align 16 -// CHECK-NEXT:[[TMP0:%.*]] = call @llvm.aarch64.sve.dup.x.nxv16i8(i8 2) -// CHECK-NEXT:store [[TMP0]], ptr [[VEC]], align 16 -// CHECK-NEXT:[[TMP1:%.*]] = load , ptr [[VEC]], align 16 -// CHECK-NEXT:ret [[TMP1]] -// -vec_t sve_init_dup_typedef() { - vec_t vec{svdup_s8(2)}; - return vec; -} diff --git a/clang/test/CodeGenCXX/aarch64-sve-vector-init.cpp b/clang/test/CodeGenCXX/aarch64-sve-vector-init.cpp index f9068364d0dcbb..407510e957f88c 100644 --- a/clang/test/CodeGenCXX/aarch64-sve-vector-init.cpp +++ b/clang/test/CodeGenCXX/aarch64-sve-vector-init.cpp @@ -1,6 +1,8 @@ // NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 2 // RUN: %clang_cc1 -fclang-abi-compat=latest -triple aarch64-none-linux-gnu -target-feature +sve -emit-llvm -o - %s | FileCheck %s +#include + // CHECK-LABEL: define dso_local void @_Z11test_localsv // CHECK-SAME: () #[[ATTR0:[0-9]+]] { // CHECK-NEXT: entry: @@ -1212,3 +1214,22 @@
[clang] [clang] Lower non-builtin sincos[f|l] calls to llvm.sincos.* when -fno-math-errno is set (PR #121763)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/121763 >From 2cadacae4359f8d67bdff850738441fba455a8bc Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Mon, 6 Jan 2025 11:49:48 + Subject: [PATCH 1/2] [clang] Lower non-builtin sincos[f|l] calls to llvm.sincos.* when -fno-math-errno is set This will allow vectorizing these calls (after a few more patches). This should not change the codegen for targets that enable the use of AA during the codegen (in `TargetSubtargetInfo::useAA()`). This includes targets such as AArch64. This notably does not include x86 but can be worked around by passing `-mllvm -combiner-global-alias-analysis=true` to clang. --- clang/lib/CodeGen/CGBuiltin.cpp | 3 +++ clang/test/CodeGen/AArch64/sincos.c | 24 +++- 2 files changed, 22 insertions(+), 5 deletions(-) diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index c419fb0cc055e0..9a859e7a22f5e3 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -3264,6 +3264,9 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID, return RValue::get(emitUnaryMaybeConstrainedFPBuiltin( *this, E, Intrinsic::sinh, Intrinsic::experimental_constrained_sinh)); +case Builtin::BIsincos: +case Builtin::BIsincosf: +case Builtin::BIsincosl: case Builtin::BI__builtin_sincos: case Builtin::BI__builtin_sincosf: case Builtin::BI__builtin_sincosf16: diff --git a/clang/test/CodeGen/AArch64/sincos.c b/clang/test/CodeGen/AArch64/sincos.c index b77d98ceab4869..fde277716b 100644 --- a/clang/test/CodeGen/AArch64/sincos.c +++ b/clang/test/CodeGen/AArch64/sincos.c @@ -1,5 +1,19 @@ -// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -O1 %s -o - | FileCheck --check-prefix=NO-MATH-ERRNO %s -// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -fmath-errno %s -o - | FileCheck --check-prefix=MATH-ERRNO %s +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -O1 %s -o - -DUSE_BUILTIN | FileCheck 
--check-prefix=NO-MATH-ERRNO %s +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -fmath-errno %s -o - -DUSE_BUILTIN | FileCheck --check-prefix=MATH-ERRNO %s +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -O1 %s -o - -DUSE_C_DECL | FileCheck --check-prefix=NO-MATH-ERRNO %s +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -fmath-errno %s -o - -DUSE_C_DECL | FileCheck --check-prefix=MATH-ERRNO %s + +#if defined(USE_BUILTIN) + #define sincos __builtin_sincos + #define sincosf __builtin_sincosf + #define sincosl __builtin_sincosl +#elif defined(USE_C_DECL) + void sincos(double, double*, double*); + void sincosf(float, float*, float*); + void sincosl(long double, long double*, long double*); +#else +#error Expected USE_BUILTIN or USE_C_DECL to be defined. +#endif // NO-MATH-ERRNO-LABEL: @sincos_f32 // NO-MATH-ERRNO:[[SINCOS:%.*]] = tail call { float, float } @llvm.sincos.f32(float {{.*}}) @@ -12,7 +26,7 @@ // MATH-ERRNO:call void @sincosf( // void sincos_f32(float x, float* fp0, float* fp1) { - __builtin_sincosf(x, fp0, fp1); + sincosf(x, fp0, fp1); } // NO-MATH-ERRNO-LABEL: @sincos_f64 @@ -26,7 +40,7 @@ void sincos_f32(float x, float* fp0, float* fp1) { // MATH-ERRNO:call void @sincos( // void sincos_f64(double x, double* dp0, double* dp1) { - __builtin_sincos(x, dp0, dp1); + sincos(x, dp0, dp1); } // NO-MATH-ERRNO-LABEL: @sincos_f128 @@ -40,5 +54,5 @@ void sincos_f64(double x, double* dp0, double* dp1) { // MATH-ERRNO:call void @sincosl( // void sincos_f128(long double x, long double* ldp0, long double* ldp1) { - __builtin_sincosl(x, ldp0, ldp1); + sincosl(x, ldp0, ldp1); } >From 46aa023e76f9cd86f6753fb306f6e26a95c98357 Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Mon, 6 Jan 2025 17:06:01 + Subject: [PATCH 2/2] Tweak tests --- clang/test/CodeGen/AArch64/sincos.c | 62 ++--- 1 file changed, 47 insertions(+), 15 deletions(-) diff --git a/clang/test/CodeGen/AArch64/sincos.c b/clang/test/CodeGen/AArch64/sincos.c index 
fde277716b..736c0892ed7418 100644 --- a/clang/test/CodeGen/AArch64/sincos.c +++ b/clang/test/CodeGen/AArch64/sincos.c @@ -1,19 +1,9 @@ -// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -O1 %s -o - -DUSE_BUILTIN | FileCheck --check-prefix=NO-MATH-ERRNO %s -// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -fmath-errno %s -o - -DUSE_BUILTIN | FileCheck --check-prefix=MATH-ERRNO %s -// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -O1 %s -o - -DUSE_C_DECL | FileCheck --check-prefix=NO-MATH-ERRNO %s -// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -fmath-errno %s -o - -DUSE_C_DECL | FileCheck --check-prefix=MATH-ERRNO %s +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -O1 %s -o - | FileCheck --check-prefix=NO-MATH-ERRNO %s +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -e
[clang] [clang] Lower non-builtin sincos[f|l] calls to llvm.sincos.* when -fno-math-errno is set (PR #121763)
@@ -1,5 +1,19 @@ -// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -O1 %s -o - | FileCheck --check-prefix=NO-MATH-ERRNO %s -// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -fmath-errno %s -o - | FileCheck --check-prefix=MATH-ERRNO %s +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -O1 %s -o - -DUSE_BUILTIN | FileCheck --check-prefix=NO-MATH-ERRNO %s +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -fmath-errno %s -o - -DUSE_BUILTIN | FileCheck --check-prefix=MATH-ERRNO %s +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -O1 %s -o - -DUSE_C_DECL | FileCheck --check-prefix=NO-MATH-ERRNO %s +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -fmath-errno %s -o - -DUSE_C_DECL | FileCheck --check-prefix=MATH-ERRNO %s + +#if defined(USE_BUILTIN) MacDue wrote: Sure, done :+1: https://github.com/llvm/llvm-project/pull/121763 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [AArch64][SME] Add diagnostics to CheckConstexprFunctionDefinition (PR #121777)
@@ -1328,4 +1328,57 @@ void SemaARM::handleInterruptAttr(Decl *D, const ParsedAttr &AL) { ARMInterruptAttr(getASTContext(), AL, Kind)); } +// Check if the function definition uses any AArch64 SME features without +// having the '+sme' feature enabled and warn user if sme locally streaming +// function returns or uses arguments with VL-based types. +void SemaARM::CheckSMEFunctionDefAttributes(const FunctionDecl *FD) { + const auto *Attr = FD->getAttr(); + bool UsesSM = FD->hasAttr(); + bool UsesZA = Attr && Attr->isNewZA(); + bool UsesZT0 = Attr && Attr->isNewZT0(); + + if (FD->hasAttr()) { +if (FD->getReturnType()->isSizelessVectorType()) + Diag(FD->getLocation(), + diag::warn_sme_locally_streaming_has_vl_args_returns) + << /*IsArg=*/false; +if (llvm::any_of(FD->parameters(), [](ParmVarDecl *P) { + return P->getOriginalType()->isSizelessVectorType(); +})) + Diag(FD->getLocation(), + diag::warn_sme_locally_streaming_has_vl_args_returns) + << /*IsArg=*/true; + } + if (const auto *FPT = FD->getType()->getAs()) { +FunctionProtoType::ExtProtoInfo EPI = FPT->getExtProtoInfo(); +UsesSM |= EPI.AArch64SMEAttributes & FunctionType::SME_PStateSMEnabledMask; +UsesZA |= FunctionType::getArmZAState(EPI.AArch64SMEAttributes) != + FunctionType::ARM_None; +UsesZT0 |= FunctionType::getArmZT0State(EPI.AArch64SMEAttributes) != + FunctionType::ARM_None; + } + + ASTContext &Context = getASTContext(); + if (UsesSM || UsesZA) { +llvm::StringMap FeatureMap; +Context.getFunctionFeatureMap(FeatureMap, FD); +if (!FeatureMap.contains("sme")) { + if (UsesSM) +Diag(FD->getLocation(), + diag::err_sme_definition_using_sm_in_non_sme_target); + else +Diag(FD->getLocation(), + diag::err_sme_definition_using_za_in_non_sme_target); +} + } + if (UsesZT0) { +llvm::StringMap FeatureMap; +Context.getFunctionFeatureMap(FeatureMap, FD); +if (!FeatureMap.contains("sme2")) { + Diag(FD->getLocation(), + diag::err_sme_definition_using_zt0_in_non_sme2_target); +} + } +} MacDue wrote: ```suggestion if (UsesSM || 
UsesZA || UsesZT0) { llvm::StringMap FeatureMap; Context.getFunctionFeatureMap(FeatureMap, FD); if (!FeatureMap.contains("sme")) { if (UsesSM) Diag(FD->getLocation(), diag::err_sme_definition_using_sm_in_non_sme_target); else Diag(FD->getLocation(), diag::err_sme_definition_using_za_in_non_sme_target); } if (!FeatureMap.contains("sme2")) { Diag(FD->getLocation(), diag::err_sme_definition_using_zt0_in_non_sme2_target); } } } ``` https://github.com/llvm/llvm-project/pull/121777 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [AArch64][SME] Add diagnostics to CheckConstexprFunctionDefinition (PR #121777)
@@ -1328,4 +1328,57 @@ void SemaARM::handleInterruptAttr(Decl *D, const ParsedAttr &AL) { ARMInterruptAttr(getASTContext(), AL, Kind)); } +// Check if the function definition uses any AArch64 SME features without +// having the '+sme' feature enabled and warn user if sme locally streaming +// function returns or uses arguments with VL-based types. +void SemaARM::CheckSMEFunctionDefAttributes(const FunctionDecl *FD) { + const auto *Attr = FD->getAttr(); + bool UsesSM = FD->hasAttr(); + bool UsesZA = Attr && Attr->isNewZA(); + bool UsesZT0 = Attr && Attr->isNewZT0(); + + if (FD->hasAttr()) { +if (FD->getReturnType()->isSizelessVectorType()) + Diag(FD->getLocation(), + diag::warn_sme_locally_streaming_has_vl_args_returns) + << /*IsArg=*/false; +if (llvm::any_of(FD->parameters(), [](ParmVarDecl *P) { + return P->getOriginalType()->isSizelessVectorType(); +})) + Diag(FD->getLocation(), + diag::warn_sme_locally_streaming_has_vl_args_returns) + << /*IsArg=*/true; + } + if (const auto *FPT = FD->getType()->getAs()) { +FunctionProtoType::ExtProtoInfo EPI = FPT->getExtProtoInfo(); +UsesSM |= EPI.AArch64SMEAttributes & FunctionType::SME_PStateSMEnabledMask; +UsesZA |= FunctionType::getArmZAState(EPI.AArch64SMEAttributes) != + FunctionType::ARM_None; +UsesZT0 |= FunctionType::getArmZT0State(EPI.AArch64SMEAttributes) != + FunctionType::ARM_None; + } + + ASTContext &Context = getASTContext(); + if (UsesSM || UsesZA) { +llvm::StringMap FeatureMap; +Context.getFunctionFeatureMap(FeatureMap, FD); +if (!FeatureMap.contains("sme")) { + if (UsesSM) +Diag(FD->getLocation(), + diag::err_sme_definition_using_sm_in_non_sme_target); + else +Diag(FD->getLocation(), + diag::err_sme_definition_using_za_in_non_sme_target); +} + } + if (UsesZT0) { +llvm::StringMap FeatureMap; +Context.getFunctionFeatureMap(FeatureMap, FD); +if (!FeatureMap.contains("sme2")) { + Diag(FD->getLocation(), + diag::err_sme_definition_using_zt0_in_non_sme2_target); +} + } +} MacDue wrote: Ignore me :sweat_smile: 
https://github.com/llvm/llvm-project/pull/121777
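The thread does not say why the suggestion was withdrawn. As one possible reading, the sketch below compares the control flow of the posted check with that of the suggested rewrite. It is a simplified model, not Clang code: `SmeUse`, `Features`, `diagsAsPosted`, `diagsAsSuggested` and the diagnostic strings are hypothetical stand-ins for the `FunctionDecl` state, the feature map and the `diag::err_sme_*` IDs.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the flags the check derives from a FunctionDecl.
struct SmeUse {
  bool SM;
  bool ZA;
  bool ZT0;
};

// Hypothetical stand-in for the function's target feature map.
using Features = std::set<std::string>;

// Shape of the check as posted: the "sme" and "sme2" tests are guarded
// separately, so each one only fires for the state that needs it.
std::vector<std::string> diagsAsPosted(SmeUse U, const Features &F) {
  std::vector<std::string> D;
  if ((U.SM || U.ZA) && !F.count("sme"))
    D.push_back(U.SM ? "sm_in_non_sme" : "za_in_non_sme");
  if (U.ZT0 && !F.count("sme2"))
    D.push_back("zt0_in_non_sme2");
  return D;
}

// Shape of the suggested rewrite: a single guard with both feature tests
// inside it, so either test can fire for any of the three kinds of state.
std::vector<std::string> diagsAsSuggested(SmeUse U, const Features &F) {
  std::vector<std::string> D;
  if (U.SM || U.ZA || U.ZT0) {
    if (!F.count("sme"))
      D.push_back(U.SM ? "sm_in_non_sme" : "za_in_non_sme");
    if (!F.count("sme2"))
      D.push_back("zt0_in_non_sme2");
  }
  return D;
}
```

In this model, a function that only uses streaming mode on a target with `sme` but without `sme2` gets no diagnostic from the posted shape, while the merged shape reports the ZT0 diagnostic for it.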
[clang] [llvm] [AArch64][SME] Disable inlining of callees with new ZT0 state (PR #121338)
https://github.com/MacDue approved this pull request. LGTM :+1: You could maybe add a test to `clang/test/CodeGen/AArch64/sme-inline-callees-streaming-attrs.c` too (which tests `flatten`/`always_inline` statements). https://github.com/llvm/llvm-project/pull/121338
[clang] [clang] Add sincos builtin using `llvm.sincos` intrinsic (PR #114086)
https://github.com/MacDue updated https://github.com/llvm/llvm-project/pull/114086 >From 5a5495d010fb1715b0e711376767d5ff3cd5cc98 Mon Sep 17 00:00:00 2001 From: Benjamin Maxwell Date: Mon, 9 Sep 2024 10:15:20 + Subject: [PATCH 1/6] [clang] Add sincos builtin using `llvm.sincos` intrinsic This registers `sincos[f|l]` as a clang builtin and updates CGBuiltin to emit the `llvm.sincos.*` intrinsic when `-fno-math-errno` is set. --- clang/include/clang/Basic/Builtins.td| 13 +++ clang/lib/CodeGen/CGBuiltin.cpp | 43 clang/test/CodeGen/AArch64/sincos.c | 33 ++ clang/test/CodeGen/X86/math-builtins.c | 35 +++ clang/test/OpenMP/declare_simd_aarch64.c | 4 +-- 5 files changed, 126 insertions(+), 2 deletions(-) create mode 100644 clang/test/CodeGen/AArch64/sincos.c diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td index b5b47ae2746011..468c16050e2bf0 100644 --- a/clang/include/clang/Basic/Builtins.td +++ b/clang/include/clang/Basic/Builtins.td @@ -3568,6 +3568,19 @@ def Frexp : FPMathTemplate, LibBuiltin<"math.h"> { let AddBuiltinPrefixedAlias = 1; } +def Sincos : FPMathTemplate, GNULibBuiltin<"math.h"> { + let Spellings = ["sincos"]; + let Attributes = [NoThrow]; + let Prototype = "void(T, T*, T*)"; + let AddBuiltinPrefixedAlias = 1; +} + +def SincosF16F128 : F16F128MathTemplate, Builtin { + let Spellings = ["__builtin_sincos"]; + let Attributes = [FunctionWithBuiltinPrefix, NoThrow]; + let Prototype = "void(T, T*, T*)"; +} + def Ldexp : FPMathTemplate, LibBuiltin<"math.h"> { let Spellings = ["ldexp"]; let Attributes = [NoThrow, ConstIgnoringErrnoAndExceptions]; diff --git a/clang/lib/CodeGen/CGBuiltin.cpp b/clang/lib/CodeGen/CGBuiltin.cpp index 4d4b7428abd505..6986cbc59e23bf 100644 --- a/clang/lib/CodeGen/CGBuiltin.cpp +++ b/clang/lib/CodeGen/CGBuiltin.cpp @@ -835,6 +835,38 @@ static Value *emitFrexpBuiltin(CodeGenFunction &CGF, const CallExpr *E, return CGF.Builder.CreateExtractValue(Call, 0); } +static void
emitSincosBuiltin(CodeGenFunction &CGF, const CallExpr *E, + llvm::Intrinsic::ID IntrinsicID) { + llvm::Value *Val = CGF.EmitScalarExpr(E->getArg(0)); + llvm::Value *Dest0 = CGF.EmitScalarExpr(E->getArg(1)); + llvm::Value *Dest1 = CGF.EmitScalarExpr(E->getArg(2)); + + llvm::Function *F = CGF.CGM.getIntrinsic(IntrinsicID, {Val->getType()}); + llvm::Value *Call = CGF.Builder.CreateCall(F, Val); + + llvm::Value *SinResult = CGF.Builder.CreateExtractValue(Call, 0); + llvm::Value *CosResult = CGF.Builder.CreateExtractValue(Call, 1); + + QualType DestPtrType = E->getArg(1)->getType()->getPointeeType(); + LValue SinLV = CGF.MakeNaturalAlignAddrLValue(Dest0, DestPtrType); + LValue CosLV = CGF.MakeNaturalAlignAddrLValue(Dest1, DestPtrType); + + llvm::StoreInst *StoreSin = + CGF.Builder.CreateStore(SinResult, SinLV.getAddress()); + llvm::StoreInst *StoreCos = + CGF.Builder.CreateStore(CosResult, CosLV.getAddress()); + + // Mark the two stores as non-aliasing with each other. The order of stores + // emitted by this builtin is arbitrary, enforcing a particular order will + // prevent optimizations later on. + llvm::MDBuilder MDHelper(CGF.getLLVMContext()); + MDNode *Domain = MDHelper.createAnonymousAliasScopeDomain(); + MDNode *AliasScope = MDHelper.createAnonymousAliasScope(Domain); + MDNode *AliasScopeList = MDNode::get(Call->getContext(), AliasScope); + StoreSin->setMetadata(LLVMContext::MD_alias_scope, AliasScopeList); + StoreCos->setMetadata(LLVMContext::MD_noalias, AliasScopeList); +} + /// EmitFAbs - Emit a call to @llvm.fabs().
static Value *EmitFAbs(CodeGenFunction &CGF, Value *V) { Function *F = CGF.CGM.getIntrinsic(Intrinsic::fabs, V->getType()); @@ -3232,6 +3264,17 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID, return RValue::get(emitUnaryMaybeConstrainedFPBuiltin( *this, E, Intrinsic::sinh, Intrinsic::experimental_constrained_sinh)); +case Builtin::BIsincos: +case Builtin::BIsincosf: +case Builtin::BIsincosl: +case Builtin::BI__builtin_sincos: +case Builtin::BI__builtin_sincosf: +case Builtin::BI__builtin_sincosl: +case Builtin::BI__builtin_sincosf128: +case Builtin::BI__builtin_sincosf16: + emitSincosBuiltin(*this, E, Intrinsic::sincos); + return RValue::get(nullptr); + case Builtin::BIsqrt: case Builtin::BIsqrtf: case Builtin::BIsqrtl: diff --git a/clang/test/CodeGen/AArch64/sincos.c b/clang/test/CodeGen/AArch64/sincos.c new file mode 100644 index 00..240d921b2b7034 --- /dev/null +++ b/clang/test/CodeGen/AArch64/sincos.c @@ -0,0 +1,33 @@ +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm %s -o - | FileCheck --check-prefix=NO-MATH-ERRNO %s +// RUN: %clang_cc1 -triple=aarch64-gnu-linux -emit-llvm -fmath-errno %s -o - | FileCheck --check-prefix=MATH-
[clang] [clang] Add sincos builtin using `llvm.sincos` intrinsic (PR #114086)
https://github.com/MacDue edited https://github.com/llvm/llvm-project/pull/114086
[clang] [clang] Add sincos builtin using `llvm.sincos` intrinsic (PR #114086)
@@ -3232,6 +3264,22 @@ RValue CodeGenFunction::EmitBuiltinExpr(const GlobalDecl GD, unsigned BuiltinID,
     return RValue::get(emitUnaryMaybeConstrainedFPBuiltin(
         *this, E, Intrinsic::sinh, Intrinsic::experimental_constrained_sinh));
 
+    case Builtin::BIsincos:
+    case Builtin::BIsincosf:
+    case Builtin::BIsincosl:
+    case Builtin::BI__builtin_sincos:
+    case Builtin::BI__builtin_sincosf:
+    case Builtin::BI__builtin_sincosl:
+      // Only use the llvm.sincos.* builtin on AArch64 with optimizations.
+      // Currently, getting codegen that is no worse than the direct call
+      // requires using AA during codegen. This is not done at optlevel=none,
+      // and not all targets support this (AArch64 is one of the few known to).
+      if (!getTarget().getTriple().isAArch64() ||
+          CGM.getCodeGenOpts().OptimizationLevel == 0)
+        break;

MacDue wrote:

Given the lack of reply, I'm going to limit this initial patch to only lower to `llvm.sincos.*` for the `__builtin` variants (similar to `frexp`). I'll consider broadening this in later patches.

https://github.com/llvm/llvm-project/pull/114086
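For readers unfamiliar with the lowering this PR discusses, here is a rough model in plain C++ of the shape `emitSincosBuiltin` produces: one call that yields a `{ sin, cos }` pair, followed by two extracts and two stores. This is only an illustration under stated assumptions; `SinCosPair`, `sincosIntrinsicModel` and `sincosLowered` are hypothetical names, and the pair is computed with `std::sin`/`std::cos` rather than the real `llvm.sincos.*` intrinsic.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical model of the { T, T } aggregate returned by llvm.sincos.*.
struct SinCosPair {
  double Sin;
  double Cos;
};

// Stand-in for the intrinsic call: a single operation producing both results.
SinCosPair sincosIntrinsicModel(double X) {
  return {std::sin(X), std::cos(X)};
}

// Mirrors the structure of the lowering in the patch: call once, then
// extract each element and store it through the matching output pointer.
void sincosLowered(double X, double *SinOut, double *CosOut) {
  SinCosPair R = sincosIntrinsicModel(X);
  *SinOut = R.Sin; // corresponds to extractvalue 0 + store
  *CosOut = R.Cos; // corresponds to extractvalue 1 + store
}
```

In the patch itself the two stores additionally carry alias-scope metadata so that later passes are free to reorder them; the model above does not attempt to express that.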
[clang] [AArch64][Clang] Add support for __arm_agnostic("sme_za_state") (PR #121788)
@@ -4593,9 +4593,14 @@ class FunctionType : public Type {
     SME_ZT0Shift = 5,
     SME_ZT0Mask = 0b111 << SME_ZT0Shift,
 
+    // A bit to tell whether a function is agnostic about sme ZA state.
+    SME_AgnosticZAStateShift = 8,
+    SME_AgnosticZAStateMask = 1 << SME_AgnosticZAStateShift,
+
     SME_AttributeMask =
-        0b111'111'11 // We can't support more than 8 bits because of
-                     // the bitmask in FunctionTypeExtraBitfields.
+        0b1'111'111'11 // We can't support more than 16 bits because of
+                       // the bitmask in FunctionTypeArmAttributes
+                       // and ExtProtoInfo.

MacDue wrote:

Looks like it's 9 bits (in FunctionTypeArmAttributes and ExtProtoInfo), not 16 bits?

https://github.com/llvm/llvm-project/pull/121788
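The width of the new mask can be checked directly from the constants in the hunk. The sketch below copies those values into standalone `constexpr` variables (the real code declares them as enumerators inside `FunctionType`); `bitWidth` is a hypothetical helper added here for illustration. It requires C++14 or later for the binary literals, digit separators and the loop in a `constexpr` function.

```cpp
#include <cassert>

// Values copied from the hunk above; plain constants instead of enumerators.
constexpr unsigned SME_ZT0Shift = 5;
constexpr unsigned SME_ZT0Mask = 0b111u << SME_ZT0Shift;
constexpr unsigned SME_AgnosticZAStateShift = 8;
constexpr unsigned SME_AgnosticZAStateMask = 1u << SME_AgnosticZAStateShift;
constexpr unsigned SME_AttributeMask = 0b1'111'111'11;

// Hypothetical helper: number of bits needed to hold any value under Mask,
// i.e. the position of the highest set bit plus one.
constexpr unsigned bitWidth(unsigned Mask) {
  unsigned N = 0;
  while (Mask != 0) {
    ++N;
    Mask >>= 1;
  }
  return N;
}

static_assert(SME_AttributeMask == 0x1FF, "mask covers bits 0 through 8");
static_assert(bitWidth(SME_AttributeMask) == 9, "nine bits in total");
static_assert((SME_AttributeMask & SME_AgnosticZAStateMask) != 0,
              "the new agnostic-ZA bit lies inside the mask");
static_assert((SME_AttributeMask & SME_ZT0Mask) == SME_ZT0Mask,
              "the ZT0 field lies inside the mask");
```

With the agnostic-ZA bit at position 8, the mask spans nine bits, which matches the figure given in the review comment.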
[clang] [AArch64][Clang] Add support for __arm_agnostic("sme_za_state") (PR #121788)
@@ -7745,6 +7746,38 @@ static bool checkMutualExclusion(TypeProcessingState &state,
   return true;
 }
 
+static bool handleArmAgnosticAttribute(Sema &S,
+                                       FunctionProtoType::ExtProtoInfo &EPI,
+                                       ParsedAttr &Attr) {
+  if (!Attr.getNumArgs()) {
+    S.Diag(Attr.getLoc(), diag::err_missing_arm_state) << Attr;
+    Attr.setInvalid();
+    return true;
+  }
+
+  for (unsigned I = 0; I < Attr.getNumArgs(); ++I) {
+    StringRef StateName;
+    SourceLocation LiteralLoc;
+    if (!S.checkStringLiteralArgumentAttr(Attr, I, StateName, &LiteralLoc))
+      return true;
+
+    if (StateName == "sme_za_state") {
+      if (EPI.AArch64SMEAttributes &
+          (FunctionType::SME_ZAMask | FunctionType::SME_ZT0Mask)) {
+        S.Diag(Attr.getLoc(), diag::err_conflicting_attributes_arm_agnostic);
+        Attr.setInvalid();
+        return true;
+      }
+      EPI.setArmSMEAttribute(FunctionType::SME_AgnosticZAStateMask);
+    } else {
+      S.Diag(LiteralLoc, diag::err_unknown_arm_state) << StateName;
+      Attr.setInvalid();
+      return true;

MacDue wrote:

nit: Flip the condition to `StateName != "sme_za_state"` and exit early (avoids one level of nesting).

https://github.com/llvm/llvm-project/pull/121788
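The early-exit shape requested in the nit might look roughly like the sketch below. It is a simplified model rather than the Sema code: `Result`, `ProtoInfo` and `handleAgnosticStates` are hypothetical stand-ins for the diagnostics, `ExtProtoInfo` and `handleArmAgnosticAttribute`, and diagnostics are reduced to return codes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the diagnostics the real handler emits.
enum class Result { Ok, UnknownState, Conflict };

// Hypothetical stand-in for the relevant bits of ExtProtoInfo.
struct ProtoInfo {
  bool HasZAOrZT0State = false; // models the SME_ZAMask | SME_ZT0Mask test
  bool AgnosticZA = false;      // models SME_AgnosticZAStateMask being set
};

Result handleAgnosticStates(ProtoInfo &EPI,
                            const std::vector<std::string> &States) {
  for (const std::string &StateName : States) {
    // Flipped condition: reject unknown names first and leave the loop body.
    if (StateName != "sme_za_state")
      return Result::UnknownState;

    // The supported case now sits at the top nesting level of the loop.
    if (EPI.HasZAOrZT0State)
      return Result::Conflict;
    EPI.AgnosticZA = true;
  }
  return Result::Ok;
}
```

Handling the error case first keeps the supported state on the main path of the loop, which removes the `else` branch and one level of indentation.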
[clang] [AArch64][Clang] Add support for __arm_agnostic("sme_za_state") (PR #121788)
@@ -7559,6 +7559,26 @@ The attributes ``__arm_in(S)``, ``__arm_out(S)``, ``__arm_inout(S)`` and
   }];
 }
 
+def ArmAgnosticDocs : Documentation {
+  let Category = DocCatArmSmeAttributes;
+  let Content = [{
+The ``__arm_agnostic`` keyword applies to prototyped function types and

MacDue wrote:

Probably just me, but "prototyped function types" reads a little oddly; I think I'd normally just say "function prototypes".

https://github.com/llvm/llvm-project/pull/121788
[clang] [AArch64][Clang] Add support for __arm_agnostic("sme_za_state") (PR #121788)
@@ -3835,6 +3835,9 @@ def err_sme_unimplemented_za_save_restore : Error<
   "call to a function that shares state other than 'za' from a "
   "function that has live 'za' state requires a spill/fill of ZA, which is not yet "
   "implemented">;
+def err_sme_unimplemented_agnostic_new : Error<
+  "support to handle __arm_agnostic(\"sme_za_state\") together with "
+  "__arm_new(\"za\") or __arm_new(\"zt0\") is not yet implemented">;

MacDue wrote:

nit: A slightly shorter, more common phrasing:

```suggestion
  "__arm_agnostic(\"sme_za_state\") is not supported together with "
  "__arm_new(\"za\") or __arm_new(\"zt0\")">;
```

https://github.com/llvm/llvm-project/pull/121788
[clang] [AArch64][Clang] Add support for __arm_agnostic("sme_za_state") (PR #121788)
@@ -7559,6 +7559,26 @@ The attributes ``__arm_in(S)``, ``__arm_out(S)``, ``__arm_inout(S)`` and
   }];
 }
 
+def ArmAgnosticDocs : Documentation {
+  let Category = DocCatArmSmeAttributes;
+  let Content = [{
+The ``__arm_agnostic`` keyword applies to prototyped function types and
+specifies that the function is agnostic about the given state S and
+returns with state S unchanged if state S exists.
+
+The attribute takes string arguments to instruct the compiler which state
+the function is agnostic about. The supported states for S are:
+
+* ``"sme_za_state"`` for any state enabled by PSTATE.ZA (including the
+  bit itself)
+
+The attributes ``__arm_agnostic("sme_za_state") cannot be used in conjunction

MacDue wrote:

Missing: \`\`

```suggestion
The attributes ``__arm_agnostic("sme_za_state")`` cannot be used in conjunction
```

https://github.com/llvm/llvm-project/pull/121788
[clang] [clang] Add sincos builtin using `llvm.sincos` intrinsic (PR #114086)
https://github.com/MacDue closed https://github.com/llvm/llvm-project/pull/114086
[clang] [llvm] [AArch64][SME] Disable inlining of callees with new ZT0 state (PR #121338)
https://github.com/MacDue edited https://github.com/llvm/llvm-project/pull/121338
[clang] [llvm] [AArch64][SME] Disable inlining of callees with new ZT0 state (PR #121338)
MacDue wrote:

> > LGTM 👍 You could maybe add a test to
> > `clang/test/CodeGen/AArch64/sme-inline-callees-streaming-attrs.c` too
> > (which tests `flatten`/`always_inline` statements).
>
> Thanks for approving the changes! I did already add some tests to
> sme-inline-callees-streaming-attrs.c using `__arm_new("zt0")` in the latest
> commit, I can add more if there is something I've missed though?

Oh sorry, I must have missed it on my look over the patch earlier :sweat_smile:
Looks ready to land to me :+1:

https://github.com/llvm/llvm-project/pull/121338