[clang] [llvm] Reland [clang][AArch64] Add getHostCPUFeatures to query for enabled f… (PR #115467)

2024-11-11 Thread Sjoerd Meijer via cfe-commits
@@ -210,6 +210,9 @@ def have_host_clang_repl_cuda(): config.substitutions.append(("%host_cc", config.host_cc)) config.substitutions.append(("%host_cxx", config.host_cxx)) +# Determine whether the test target is compatible with execution on the host. +if config.host_arch in con

[clang] [llvm] Reland [clang][AArch64] Add getHostCPUFeatures to query for enabled f… (PR #115467)

2024-11-11 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer approved this pull request. Looks like a good fix to me. Please wait a day in case @davemgreen has comments, and it is better to commit in the morning anyway. :) For convenience, the diff with the previous version is this first line in the test: ``` // REQUIR

[clang] [Clang][Driver] Add an option to control loop-interchange (PR #125830)

2025-02-05 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer created https://github.com/llvm/llvm-project/pull/125830 This introduces options `-floop-interchange` and `-fno-loop-interchange` to enable/disable the loop-interchange pass. This is part of the work that tries to get that pass enabled by default (#124911), wher

[clang] [llvm] [Clang][Driver] Add an option to control loop-interchange (PR #125830)

2025-02-06 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer updated https://github.com/llvm/llvm-project/pull/125830 From 1a655ebd2bb01f54af6c42a373fb19e91dc56e5a Mon Sep 17 00:00:00 2001 From: Sjoerd Meijer Date: Thu, 6 Feb 2025 03:00:24 -0800 Subject: [PATCH] [Clang][Driver] Add an option to control loop-interchange T

[clang] [llvm] [Clang][Driver] Add an option to control loop-interchange (PR #125830)

2025-02-06 Thread Sjoerd Meijer via cfe-commits
@@ -316,6 +312,7 @@ PipelineTuningOptions::PipelineTuningOptions() { LoopVectorization = true; SLPVectorization = false; LoopUnrolling = true; + LoopInterchange = false; sjoerdmeijer wrote: Good idea, that's now fixed. https://github.com/llvm/llvm-proj

[clang] [llvm] [Clang][Driver] Add an option to control loop-interchange (PR #125830)

2025-02-06 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer updated https://github.com/llvm/llvm-project/pull/125830 From 72c1ccc8cbe7fa9734496a2f9b79fb9f73f126ab Mon Sep 17 00:00:00 2001 From: Sjoerd Meijer Date: Thu, 6 Feb 2025 03:00:24 -0800 Subject: [PATCH] [Clang][Driver] Add an option to control loop-interchange T

[clang] [llvm] [Clang][Driver] Add an option to control loop-interchange (PR #125830)

2025-02-06 Thread Sjoerd Meijer via cfe-commits
sjoerdmeijer wrote: > The other optimization pass options (unroll, vectorize, ...) are implemented > in `PipelineTuningOptions` and `CodeGenOptions.def`. Do it the same way? Thanks for the review and suggestion @Meinersbur, that's now implemented in the latest revision. https://github.com/ll

[clang] [llvm] [Clang][Driver] Add an option to control loop-interchange (PR #125830)

2025-02-06 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer updated https://github.com/llvm/llvm-project/pull/125830 From da944d743f9fb97ddb1a40f58d43b0262f58205a Mon Sep 17 00:00:00 2001 From: Sjoerd Meijer Date: Thu, 6 Feb 2025 03:00:24 -0800 Subject: [PATCH] [Clang][Driver] Add an option to control loop-interchange T

[clang] [llvm] [Clang][Driver] Add an option to control loop-interchange (PR #125830)

2025-02-07 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer updated https://github.com/llvm/llvm-project/pull/125830 From 45aa8d52ef8391fd15d81fb55a39c34f5aec233b Mon Sep 17 00:00:00 2001 From: Sjoerd Meijer Date: Thu, 6 Feb 2025 03:00:24 -0800 Subject: [PATCH] [Clang][Driver] Add an option to control loop-interchange T

[clang] [llvm] [Clang][Driver] Add an option to control loop-interchange (PR #125830)

2025-02-07 Thread Sjoerd Meijer via cfe-commits
sjoerdmeijer wrote: > This is a bit off topic, but do you have any opinion on adding a pragma for > interchange like other loop optimizations do? I think it can sometimes be > useful if we can enable/disable the interchange for each loop, but I think > there are a few things to consider if we

[clang] [llvm] [Clang][Driver] Add an option to control loop-interchange (PR #125830)

2025-02-07 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer closed https://github.com/llvm/llvm-project/pull/125830 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
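For readers unfamiliar with the transformation that `-floop-interchange` / `-fno-loop-interchange` toggles, here is a minimal sketch in C of the kind of loop nest where interchange matters. The function names, the array size `N`, and the helper `loop_orders_agree` are hypothetical and not taken from the PR; whether the pass actually interchanges a given nest depends on its own legality and profitability checks.

```c
#include <stddef.h>

enum { N = 64 }; /* hypothetical size, chosen only for illustration */

/* Column-order traversal: the inner loop steps a whole row per iteration,
   which gives poor locality for row-major C arrays. */
void add_column_order(int a[N][N], int b[N][N]) {
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            a[i][j] += b[i][j];
}

/* The same computation with the two loops interchanged by hand,
   so the inner loop walks contiguous memory. */
void add_row_order(int a[N][N], int b[N][N]) {
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            a[i][j] += b[i][j];
}

/* Returns 1 if both loop orders produce identical results, 0 otherwise. */
int loop_orders_agree(void) {
    static int x[N][N], y[N][N], b[N][N];
    for (size_t i = 0; i < N; i++) {
        for (size_t j = 0; j < N; j++) {
            x[i][j] = y[i][j] = (int)(i + j);
            b[i][j] = (int)(i * N + j);
        }
    }
    add_column_order(x, b);
    add_row_order(y, b);
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            if (x[i][j] != y[i][j])
                return 0;
    return 1;
}
```

Each iteration touches a distinct element and there is no dependence between iterations, so swapping the loops does not change the result. This is the situation in which a compiler may be able to perform the interchange itself.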

[clang] [llvm] [AArch64] Improve bcvtn2 and remove aarch64_neon_bfcvt intrinsics (PR #120363)

2025-01-23 Thread Sjoerd Meijer via cfe-commits
sjoerdmeijer wrote: Forgot to add that similar problems occur for another test in that same directory: `vmulh_lane_f16_1.c`. https://github.com/llvm/llvm-project/pull/120363

[clang] [llvm] [AArch64] Improve bcvtn2 and remove aarch64_neon_bfcvt intrinsics (PR #120363)

2025-01-23 Thread Sjoerd Meijer via cfe-commits
sjoerdmeijer wrote: Hey @davemgreen, we are looking at a runtime failure in a test from the GCC test-suite: `./testsuite/gcc.target/aarch64/advsimd-intrinsics/vfmash_lane_f16_1.c` I think to reproduce this, this will work: clang vfmash_lane_f16_1.c -mcpu=neoverse-v2 -O0 -lm -o ./vfmash_la

[clang] [llvm] [AArch64] Improve bcvtn2 and remove aarch64_neon_bfcvt intrinsics (PR #120363)

2025-01-20 Thread Sjoerd Meijer via cfe-commits
@@ -4064,31 +4072,59 @@ static Value *upgradeX86IntrinsicCall(StringRef Name, CallBase *CI, Function *F, static Value *upgradeAArch64IntrinsicCall(StringRef Name, CallBase *CI, Function *F, IRBuilder<> &Builder) { - Intrinsic::ID New

[clang] [llvm] [AArch64] Improve bcvtn2 and remove aarch64_neon_bfcvt intrinsics (PR #120363)

2025-01-20 Thread Sjoerd Meijer via cfe-commits
@@ -9053,22 +9053,19 @@ class SIMDThreeSameVectorBF16MatrixMul let mayRaiseFPException = 1, Uses = [FPCR] in class SIMD_BFCVTN - : BaseSIMDMixedTwoVector<0, 0, 0b10, 0b10110, V128, V128, + : BaseSIMDMixedTwoVector<0, 0, 0b10, 0b10110, V128, V64, sjoerdmeijer

[clang] [llvm] [AArch64] Improve bcvtn2 and remove aarch64_neon_bfcvt intrinsics (PR #120363)

2025-01-20 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer approved this pull request. Thanks, LGTM https://github.com/llvm/llvm-project/pull/120363

[clang] [llvm] [AArch64] Improve bcvtn2 and remove aarch64_neon_bfcvt intrinsics (PR #120363)

2025-01-16 Thread Sjoerd Meijer via cfe-commits
@@ -323,9 +321,10 @@ bfloat16x8_t test_vcvtq_low_bf16_f32(float32x4_t a) { // CHECK-A64-NEXT: entry: // CHECK-A64-NEXT:[[TMP0:%.*]] = bitcast <8 x bfloat> [[INACTIVE:%.*]] to <16 x i8> // CHECK-A64-NEXT:[[TMP1:%.*]] = bitcast <4 x float> [[A:%.*]] to <16 x i8> -// CHE

[clang] [llvm] [AArch64] Add initial support for -mcpu=olympus. (PR #132368)

2025-03-21 Thread Sjoerd Meijer via cfe-commits
sjoerdmeijer wrote: I don't think it is strictly necessary, but do we have a public NVIDIA announcement of this core? If so, a link to that would be nice to include in the description. Will take a look at the patch. https://github.com/llvm/llvm-project/pull/132368

[clang] [llvm] [AArch64] Add FEAT_FPAC to Grace (PR #133054)

2025-03-26 Thread Sjoerd Meijer via cfe-commits
@@ -1067,7 +1067,8 @@ def ProcessorFeatures { FeatureDotProd, FeatureFPARMv8, FeatureMatMulInt8, FeatureSSBS, FeatureCCIDX, FeatureJS, FeatureLSE, FeatureRAS, Featur

[clang] [llvm] [AArch64] Add FEAT_FPAC to Neoverse V2 (PR #133054)

2025-03-26 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer updated https://github.com/llvm/llvm-project/pull/133054 From b1619f7b2835acafb4d76e6a16e678b17ddbe8b3 Mon Sep 17 00:00:00 2001 From: Sjoerd Meijer Date: Wed, 26 Mar 2025 04:38:48 -0700 Subject: [PATCH] [AArch64] Add FEAT_FPAC to Neoverse V2 This feature is sup

[clang] [llvm] [AArch64] Add FEAT_FPAC to Neoverse V2 (PR #133054)

2025-03-26 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer edited https://github.com/llvm/llvm-project/pull/133054

[clang] [llvm] [AArch64] Add FEAT_FPAC to Grace (PR #133054)

2025-03-26 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer updated https://github.com/llvm/llvm-project/pull/133054 From a65f5c9b7937b21bdf43dea532e951b3ce462ec5 Mon Sep 17 00:00:00 2001 From: Sjoerd Meijer Date: Wed, 26 Mar 2025 01:38:46 -0700 Subject: [PATCH] [AArch64] Add FEAT_FPAC to Neoverse V2 This feature is sup

[clang] [llvm] [AArch64] Add FEAT_FPAC to Neoverse V2 (PR #133054)

2025-03-26 Thread Sjoerd Meijer via cfe-commits
@@ -19,6 +19,7 @@ // CHECK-NEXT: FEAT_ETE Enable Embedded Trace Extension // CHECK-NEXT: FEAT_FCMA Enable Armv8.3-A Floating-point complex number support // CHECK-NEXT: FEAT

[clang] [llvm] [AArch64] Add FEAT_FPAC to Neoverse V2 (PR #133054)

2025-03-26 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer closed https://github.com/llvm/llvm-project/pull/133054

[clang] [llvm] [AArch64] Add FEAT_FPAC to Grace (PR #133054)

2025-03-26 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer created https://github.com/llvm/llvm-project/pull/133054 This feature is supported in Grace, but wasn't specified in the CPU definition. This is not important for codegen, but is good for completeness, and good for other tools that could query the CPU definition

[clang] [llvm] [AArch64] Add FEAT_FPAC to Neoverse V2 (PR #133054)

2025-03-26 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer updated https://github.com/llvm/llvm-project/pull/133054 From e6af1a3ef41cce2a27b1f0719f58bf82ce55b0db Mon Sep 17 00:00:00 2001 From: Sjoerd Meijer Date: Wed, 26 Mar 2025 01:38:46 -0700 Subject: [PATCH] [AArch64] Add FEAT_FPAC to Neoverse V2 This feature is sup

[clang] [llvm] [AArch64] Add FEAT_FPAC to Neoverse V2 (PR #133054)

2025-03-26 Thread Sjoerd Meijer via cfe-commits
@@ -555,7 +555,8 @@ def TuneNeoverseV2 : SubtargetFeature<"neoversev2", "ARMProcFamily", "NeoverseV2 FeatureEnableSelectOptimize, FeatureUseFixedOverScalableIfEqualCost,

[clang] [llvm] [AArch64] Add FEAT_FPAC to Neoverse V2 (PR #133054)

2025-03-26 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer updated https://github.com/llvm/llvm-project/pull/133054 From 39ac7e676ced6be75d12adbae4644a232e471f6e Mon Sep 17 00:00:00 2001 From: Sjoerd Meijer Date: Wed, 26 Mar 2025 04:38:48 -0700 Subject: [PATCH] [AArch64] Add FEAT_FPAC to Neoverse V2 This feature is sup

[clang] [llvm] [AArch64] Add optional extensions enabled on Grace (PR #127620)

2025-02-19 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer approved this pull request. LGTM, but worth looking at Dave's suggestion before merging this: > It currently uses a bit of a mixture of specifying features individually > (FeatureAES and FeatureSVEAES) and relying on the dependencies > (FeatureSVE2SHA3 will impl

[clang] [llvm] [AArch64] Add optional extensions enabled on Grace (PR #127620)

2025-02-18 Thread Sjoerd Meijer via cfe-commits
@@ -92,7 +92,7 @@ // COBALT-100: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-n2" // RUN: %clang --target=aarch64 -mcpu=grace -### -c %s 2>&1 | FileCheck -check-prefix=GRACE %s -// GRACE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-v2" --

[clang] [llvm] [AArch64] Add optional extensions enabled on Grace (PR #127620)

2025-02-18 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer edited https://github.com/llvm/llvm-project/pull/127620

[clang] [llvm] [AArch64] Add optional extensions enabled on Grace (PR #127620)

2025-02-18 Thread Sjoerd Meijer via cfe-commits
@@ -92,7 +92,7 @@ // COBALT-100: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-n2" // RUN: %clang --target=aarch64 -mcpu=grace -### -c %s 2>&1 | FileCheck -check-prefix=GRACE %s -// GRACE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-v2" --

[clang] [llvm] [AArch64] Add optional extensions enabled on Grace (PR #127620)

2025-02-18 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer commented: It would be good to mention in the description: - that Grace is no longer an alias, but is a separate CPU definition. - which optional extensions are now enabled. https://github.com/llvm/llvm-project/pull/127620

[clang] [Clang][doc] -ffp-contract options and standard compliance (PR #127621)

2025-02-18 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer created https://github.com/llvm/llvm-project/pull/127621 We had an internal discussion about -ffp-contract, how it compared to GCC which defaults to fast, and standard compliance. Looking at our docs, I think most information is there, but also thought it could

[clang] [Clang][doc] -ffp-contract options and standard compliance (PR #127621)

2025-02-19 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer updated https://github.com/llvm/llvm-project/pull/127621 From d7483fc138c0834fbed84bb43521ce8caed84528 Mon Sep 17 00:00:00 2001 From: Sjoerd Meijer Date: Tue, 18 Feb 2025 03:37:08 -0800 Subject: [PATCH] [Clang][doc] -ffp-contract options and standard compliance

[clang] [Clang][doc] -ffp-contract options and standard compliance (PR #127621)

2025-02-19 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer edited https://github.com/llvm/llvm-project/pull/127621

[clang] [Clang][doc] -ffp-contract options and standard compliance (PR #127621)

2025-02-19 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer commented: Thanks for the review. I have addressed the comments, I think. https://github.com/llvm/llvm-project/pull/127621

[clang] [Clang][doc] -ffp-contract options and standard compliance (PR #127621)

2025-02-19 Thread Sjoerd Meijer via cfe-commits
@@ -1681,19 +1681,25 @@ for more details. permitted to produce more precise results than performing the same operations separately. - The C standard permits intermediate floating-point results within an + The C/C++ standard permits intermediate floating-point results

[clang] [Clang][doc] -ffp-contract options and standard compliance (PR #127621)

2025-02-19 Thread Sjoerd Meijer via cfe-commits
@@ -1681,19 +1681,25 @@ for more details. permitted to produce more precise results than performing the same operations separately. - The C standard permits intermediate floating-point results within an + The C/C++ standard permits intermediate floating-point results

[clang] [Clang][doc] -ffp-contract options and standard compliance (PR #127621)

2025-02-20 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer closed https://github.com/llvm/llvm-project/pull/127621

[clang] [Clang][doc] -ffp-contract options and standard compliance (PR #127621)

2025-02-20 Thread Sjoerd Meijer via cfe-commits
sjoerdmeijer wrote: Thanks for your reviews! I will make those changes before merging this. https://github.com/llvm/llvm-project/pull/127621
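As background for the `-ffp-contract` discussion in this thread, the following is a small sketch in C of why contraction is observable at all. The function names and input values are illustrative assumptions, not taken from the PR. With a fused multiply-add the product is not rounded before the addition, so `a * b + c` can give a different result than rounding the product first.

```c
/* May or may not be contracted into a fused multiply-add, depending on the
   -ffp-contract setting and on whether the target has an FMA instruction. */
double mul_add_one_expr(double a, double b, double c) {
    return a * b + c;
}

/* Storing the product in a volatile double forces it to be rounded to
   double before the addition, so no fusion can happen here. */
double mul_add_two_steps(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}
```

For example, with `a = 1.0 + 0x1p-27`, `b = 1.0 - 0x1p-27` and `c = -1.0`, the exact product is `1 - 2^-54`, which rounds to `1.0` in double precision. The two-step version therefore returns `0.0`, while a fused evaluation of the one-expression version would return `-0x1p-54`. Which of the two values the one-expression version yields depends on the compiler, the flags, and the target.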

[clang] [llvm] [AArch64] Add initial support for -mcpu=olympus. (PR #132368)

2025-03-24 Thread Sjoerd Meijer via cfe-commits
https://github.com/sjoerdmeijer approved this pull request. LGTM https://github.com/llvm/llvm-project/pull/132368

[clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182)

2025-05-16 Thread Sjoerd Meijer via cfe-commits
sjoerdmeijer wrote: For more context, this is part of our loop-interchange enablement story, see our RFC here: https://discourse.llvm.org/t/enabling-loop-interchange/82589. We have fixed all the compile-time issues and loop-interchange issues that we are aware of, and would like to enable this

[clang] [flang] [flang] add -floop-interchange and enable it with opt levels (PR #140182)

2025-05-17 Thread Sjoerd Meijer via cfe-commits
sjoerdmeijer wrote: > Thanks for this PR. Do you have any compilation time and performance data? This information is a bit spread out in the other tickets that I linked earlier, so to summarise that, compile times look really good and the increases are minimal after the work that Madhur did. In
