Re: r298956 - Default enable the rtm feature only on skylake and later for now because Intel disabled the feature on some haswell and broadwell processors:
SGTM Sent from my iPhone > On Mar 28, 2017, at 5:25 PM, Eric Christopher wrote: > > > >> On Tue, Mar 28, 2017 at 4:31 PM Craig Topper wrote: >> So if you use -march=hsw the backend will think rtm is enabled, but clang >> will block the intrinsics in the frontend? > > Yeah. I've come to the conclusion that I think we should remove the feature > from the backend too. Using the option/feature in the frontend (or during > codegen of any sort) will turn on the instruction in the backend and we > should just rely on the option/feature. > >> >> Not sure what you mean by split the haswell and broadwell cpu. > > Into something similar to how the skylake cpu is configured in the front end > with different feature sets. I don't really think it's a good idea here, but > I thought I'd raise it. > > Thoughts? > > -eric > >> >> ~Craig >> >> On Tue, Mar 28, 2017 at 4:18 PM, Eric Christopher wrote: >> Hi Craig, Quentin, Jim, >> >> Just bringing this patch to your attention here. I haven't turned it off in >> the backend since some processors do support it and we want to allow the >> code generation, with this change we simply make the user use -mrtm to get >> the functionality for now. >> >> Question: Do we want to split the haswell and broadwell cpu to handle this >> different feature set or just leave it as an optional "enable it yourself" >> since it's an errata fix? >> >> I went ahead and committed with this question outstanding so that users >> won't get illegal instructions if they happen to have the affected units. >> >> -eric >> >> >> On Tue, Mar 28, 2017 at 4:15 PM Eric Christopher via cfe-commits >> wrote: >> Author: echristo >> Date: Tue Mar 28 18:03:19 2017 >> New Revision: 298956 >> >> URL: http://llvm.org/viewvc/llvm-project?rev=298956&view=rev >> Log: >> Default enable the rtm feature only on skylake and later for now because >> Intel disabled the feature on some haswell and broadwell processors: >> >> http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/core-m-processor-family-spec-update.pdf >> >> the -mrtm option will still work normally. >> >> Modified: >> cfe/trunk/lib/Basic/Targets.cpp >> cfe/trunk/test/Preprocessor/predefined-arch-macros.c >> >> Modified: cfe/trunk/lib/Basic/Targets.cpp >> URL: >> http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Basic/Targets.cpp?rev=298956&r1=298955&r2=298956&view=diff >> == >> --- cfe/trunk/lib/Basic/Targets.cpp (original) >> +++ cfe/trunk/lib/Basic/Targets.cpp Tue Mar 28 18:03:19 2017 >> @@ -3194,6 +3194,7 @@ bool X86TargetInfo::initFeatureMap( >> setFeatureEnabledImpl(Features, "mpx", true); >> setFeatureEnabledImpl(Features, "sgx", true); >> setFeatureEnabledImpl(Features, "clflushopt", true); >> +setFeatureEnabledImpl(Features, "rtm", true); >> LLVM_FALLTHROUGH; >>case CK_Broadwell: >> setFeatureEnabledImpl(Features, "rdseed", true); >> @@ -3204,7 +3205,6 @@ bool X86TargetInfo::initFeatureMap( >> setFeatureEnabledImpl(Features, "lzcnt", true); >> setFeatureEnabledImpl(Features, "bmi", true); >> setFeatureEnabledImpl(Features, "bmi2", true); >> -setFeatureEnabledImpl(Features, "rtm", true); >> setFeatureEnabledImpl(Features, "fma", true); >> setFeatureEnabledImpl(Features, "movbe", true); >> LLVM_FALLTHROUGH; >> >> Modified: cfe/trunk/test/Preprocessor/predefined-arch-macros.c >> URL: >> http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Preprocessor/predefined-arch-macros.c?rev=298956&r1=298955&r2=298956&view=diff >> == >> --- cfe/trunk/test/Preprocessor/predefined-arch-macros.c (original) >> +++ cfe/trunk/test/Preprocessor/predefined-arch-macros.c Tue Mar 28 18:03:19 >> 2017 >> @@ -525,7 +525,6 @@ >> // CHECK_CORE_AVX2_M32: #define __PCLMUL__ 1 >> // CHECK_CORE_AVX2_M32: #define __POPCNT__ 1 >> // CHECK_CORE_AVX2_M32: #define __RDRND__ 1 >> -// CHECK_CORE_AVX2_M32: #define __RTM__ 1 >> // CHECK_CORE_AVX2_M32: #define __SSE2__ 1 >> // CHECK_CORE_AVX2_M32: #define __SSE3__ 1 >> // CHECK_CORE_AVX2_M32: #define __SSE4_1__ 1 >> @@ -555,7 +554,6 @@ >> // CHECK_CORE_AVX2_M64: #define __PCLMUL__ 1 >> // CHECK_CORE_AVX2_M64: #define __POPCNT__ 1 >> // CHECK_CORE_AVX2_M64: #define __RDRND__ 1 >> -// CHECK_CORE_AVX2_M64: #define __RTM__ 1 >> // CHECK_CORE_AVX2_M64: #define __SSE2_MATH__ 1 >> // CHECK_CORE_AVX2_M64: #define __SSE2__ 1 >> // CHECK_CORE_AVX2_M64: #define __SSE3__ 1 >> @@ -591,7 +589,6 @@ >> // CHECK_BROADWELL_M32: #define __POPCNT__ 1 >> // CHECK_BROADWELL_M32: #define __RDRND__ 1 >> // CHECK_BROADWELL_M32: #define __RDSEED__ 1 >> -// CHECK_BROADWELL_M32: #define __RTM__ 1 >> // CHECK_BROADWELL_M32: #define __SSE2__ 1 >> // CHECK_BROADWELL_M32: #define __SSE3__ 1 >> // CHECK_BROADWELL_M32: #define __SSE4_1__ 1 >> @@ -623,7 +620,6 @@ >> //
Re: [PATCH] D10414: Attach function attribute "arm-restrict-it" instead of passing arm-restrict-it as a backend-option
> On Sep 10, 2015, at 1:24 AM, James Molloy wrote: > > jmolloy added a subscriber: jmolloy. > jmolloy added a comment. > > Hi Akira, > > I'm sorry to be contrary (and I missed the discussion on Tuesday because I > was away on vacation) but I think there *is* a usecase for -mno-restrict-it > to work, and I would hate to see it broken. > > Non-restricted IT blocks are indeed deprecated for ARMv8 in the ARMARM. But > there are circumstances where you may still want to emit them - the biggest > example being you're compiling for a CPU microarchitecture that you *know* > doesn't have a performance penalty on non-restricted IT blocks. Restricted IT > blocks can pessimize code quite badly in some circumstances, and allowing > people to turn it off for their target if needed is very important, IMO. If such microarchitectures exist, shouldn’t they be represented properly as a CPU in the backend and get the right setting by default? > > Cheers, > > James > > > http://reviews.llvm.org/D10414 > > > ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Re: [PATCH] D10414: Attach function attribute "arm-restrict-it" instead of passing arm-restrict-it as a backend-option
grosbach added a comment. In http://reviews.llvm.org/D10414#243056, @jmolloy wrote: > Non-restricted IT blocks are indeed deprecated for ARMv8 in the ARMARM. But > there are circumstances where you may still want to emit them - the biggest > example being you're compiling for a CPU microarchitecture that you *know* > doesn't have a performance penalty on non-restricted IT blocks. Restricted IT > blocks can pessimize code quite badly in some circumstances, and allowing > people to turn it off for their target if needed is very important, IMO. Bother, email response isn't showing up in Phab. Reposting here so there's a record. Sorry for the duplication on-list. If such microarchitectures exist, shouldn’t they be represented properly as a CPU in the backend and get the right setting by default? http://reviews.llvm.org/D10414 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Re: [PATCH] D10414: Attach function attribute "arm-restrict-it" instead of passing arm-restrict-it as a backend-option
grosbach added a comment. In http://reviews.llvm.org/D10414#243302, @jmolloy wrote: > In an ideal world, yes. However there's no guarantee that all ARM > implementors will (a) be able to commit to LLVM or (b) use ToT. Perhaps > they're building a project that uses clang, or a specific version of clang, > and this tuning option makes things go faster on their architecture. > > I'm not arguing about the defaults, just about the breakage of > -mno-restrict-it. Hmmm. I'm not sure I follow? In either case, why does what we do in clang matter at all? Especially for case (a); if they're not using LLVM, clang is completely irrelevant to them, right? For (b), are you thinking of internal use by implementors when they're trying to decide what the right defaults should be for their tuning of the backend? In which case, would not an -mllvm option suffice? The clang -m[no-]restrict-it is for end-users of clang, and that really doesn't seem the right thing to me. This is not a tuning parameter. It's a "what does the architecture I'm targeting support?" parameter. We only have even the backend option available for our own testing or we could (and should) get rid of that, too. http://reviews.llvm.org/D10414 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Re: r250473 - Add an error when calling a builtin that requires features that don't
Woot! Thanks! > On Oct 15, 2015, at 4:47 PM, Eric Christopher via cfe-commits > wrote: > > Author: echristo > Date: Thu Oct 15 18:47:11 2015 > New Revision: 250473 > > URL: http://llvm.org/viewvc/llvm-project?rev=250473&view=rev > Log: > Add an error when calling a builtin that requires features that don't > match the feature set of the function that they're being called from. > > This ensures that we can effectively diagnose some[1] code that would > instead ICE in the backend with a failure to select message. > > Example: > > __m128d foo(__m128d a, __m128d b) { > return __builtin_ia32_addsubps(b, a); > } > > compiled for normal x86_64 via: > > clang -target x86_64-linux-gnu -c > > would fail to compile in the back end because the normal subtarget > features for x86_64 only include sse2 and the builtin requires sse3. > > [1] We're still not erroring on: > > __m128i bar(__m128i const *p) { return _mm_lddqu_si128(p); } > > where we should fail and error on an always_inline function being > inlined into a function that doesn't support the subtarget features > required. > > Added: >cfe/trunk/test/CodeGen/target-builtin-error-2.c >cfe/trunk/test/CodeGen/target-builtin-error.c >cfe/trunk/test/CodeGen/target-builtin-noerror.c > Modified: >cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td >cfe/trunk/lib/CodeGen/CGBuiltin.cpp >cfe/trunk/lib/CodeGen/CGCall.cpp >cfe/trunk/lib/CodeGen/CodeGenFunction.h > > Modified: cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td > URL: > http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td?rev=250473&r1=250472&r2=250473&view=diff > == > --- cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td (original) > +++ cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td Thu Oct 15 18:47:11 > 2015 > @@ -430,6 +430,7 @@ def warn_redecl_library_builtin : Warnin > def err_builtin_definition : Error<"definition of builtin function %0">; > def err_arm_invalid_specialreg : Error<"invalid special register for > builtin">; > def err_invalid_cpu_supports : Error<"invalid cpu feature string for > builtin">; > +def err_builtin_needs_feature : Error<"%0 needs target feature %1">; > def warn_builtin_unknown : Warning<"use of unknown builtin %0">, > InGroup, DefaultError; > def warn_dyn_class_memaccess : Warning< > > Modified: cfe/trunk/lib/CodeGen/CGBuiltin.cpp > URL: > http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGBuiltin.cpp?rev=250473&r1=250472&r2=250473&view=diff > == > --- cfe/trunk/lib/CodeGen/CGBuiltin.cpp (original) > +++ cfe/trunk/lib/CodeGen/CGBuiltin.cpp Thu Oct 15 18:47:11 2015 > @@ -21,6 +21,7 @@ > #include "clang/Basic/TargetBuiltins.h" > #include "clang/Basic/TargetInfo.h" > #include "clang/CodeGen/CGFunctionInfo.h" > +#include "clang/Sema/SemaDiagnostic.h" > #include "llvm/ADT/StringExtras.h" > #include "llvm/IR/CallSite.h" > #include "llvm/IR/DataLayout.h" > @@ -288,6 +289,62 @@ Value *CodeGenFunction::EmitVAStartEnd(V > return Builder.CreateCall(CGM.getIntrinsic(inst), ArgValue); > } > > +// Returns true if we have a valid set of target features. > +bool CodeGenFunction::checkBuiltinTargetFeatures( > +const FunctionDecl *TargetDecl) { > + // Early exit if this is an indirect call. > + if (!TargetDecl) > +return true; > + > + // Get the current enclosing function if it exists. > + if (const FunctionDecl *FD = dyn_cast_or_null(CurFuncDecl)) { > +unsigned BuiltinID = TargetDecl->getBuiltinID(); > +const char *FeatureList = > +CGM.getContext().BuiltinInfo.getRequiredFeatures(BuiltinID); > +if (FeatureList && StringRef(FeatureList) != "") { > + StringRef TargetCPU = Target.getTargetOpts().CPU; > + llvm::StringMap FeatureMap; > + > + if (const auto *TD = FD->getAttr()) { > +// If we have a TargetAttr build up the feature map based on that. > +TargetAttr::ParsedTargetAttr ParsedAttr = TD->parse(); > + > +// Make a copy of the features as passed on the command line into the > +// beginning of the additional features from the function to > override. > +ParsedAttr.first.insert( > +ParsedAttr.first.begin(), > +Target.getTargetOpts().FeaturesAsWritten.begin(), > +Target.getTargetOpts().FeaturesAsWritten.end()); > + > +if (ParsedAttr.second != "") > + TargetCPU = ParsedAttr.second; > + > +// Now populate the feature map, first with the TargetCPU which is > +// either > +// the default or a new one from the target attribute string. Then > we'll > +// use the passed in features (FeaturesAsWritten) along with the new > +// ones > +// from the attribute. > +Target.initFeatureMap(FeatureMap, CGM.getDiags(), TargetCPU, > +