@@ -210,6 +210,9 @@ def have_host_clang_repl_cuda():
config.substitutions.append(("%host_cc", config.host_cc))
config.substitutions.append(("%host_cxx", config.host_cxx))
+# Determine whether the test target is compatible with execution on the host.
+if config.host_arch in con
https://github.com/sjoerdmeijer approved this pull request.
Looks like a good fix to me.
Please wait a day in case @davemgreen has comments, and it is better to commit
in the morning anyway. :)
For convenience, the diff with the previous version is this first line in the
test:
```
// REQUIR
```
https://github.com/sjoerdmeijer created
https://github.com/llvm/llvm-project/pull/125830
This introduces options `-floop-interchange` and `-fno-loop-interchange` to
enable/disable the loop-interchange pass. This is part of the work that tries
to get that pass enabled by default (#124911), wher
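As a rough illustration of the kind of loop nest these options are aimed at, the C++ sketch below writes the same element-wise update twice: once with the column loop outermost, which gives strided accesses over a row-major array, and once with the loops interchanged by hand. The function names are hypothetical and not taken from the patch, and whether the pass actually interchanges a given nest depends on its own legality and profitability checks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example: adds 1 to every element of a row-major matrix with
// the column loop outermost, so the inner loop steps through memory with a
// stride of `cols` elements.
std::vector<int> add_one_column_outer(const std::vector<int>& src,
                                      std::size_t rows, std::size_t cols) {
    assert(src.size() == rows * cols);
    std::vector<int> dst(src.size());
    for (std::size_t j = 0; j < cols; ++j)
        for (std::size_t i = 0; i < rows; ++i)
            dst[i * cols + j] = src[i * cols + j] + 1;
    return dst;
}

// The same nest after interchanging the loops by hand: the inner loop now
// walks consecutive elements (unit stride).
std::vector<int> add_one_row_outer(const std::vector<int>& src,
                                   std::size_t rows, std::size_t cols) {
    assert(src.size() == rows * cols);
    std::vector<int> dst(src.size());
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            dst[i * cols + j] = src[i * cols + j] + 1;
    return dst;
}
```

Both functions return the same values; only the traversal order differs. Building the first variant with `-floop-interchange` and with `-fno-loop-interchange`, at an optimization level where the pass runs, is one way to see what the option changes.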
https://github.com/sjoerdmeijer updated
https://github.com/llvm/llvm-project/pull/125830
From 1a655ebd2bb01f54af6c42a373fb19e91dc56e5a Mon Sep 17 00:00:00 2001
From: Sjoerd Meijer
Date: Thu, 6 Feb 2025 03:00:24 -0800
Subject: [PATCH] [Clang][Driver] Add an option to control loop-interchange
T
@@ -316,6 +312,7 @@ PipelineTuningOptions::PipelineTuningOptions() {
LoopVectorization = true;
SLPVectorization = false;
LoopUnrolling = true;
+ LoopInterchange = false;
sjoerdmeijer wrote:
Good idea, that's now fixed.
https://github.com/llvm/llvm-proj
https://github.com/sjoerdmeijer updated
https://github.com/llvm/llvm-project/pull/125830
From 72c1ccc8cbe7fa9734496a2f9b79fb9f73f126ab Mon Sep 17 00:00:00 2001
From: Sjoerd Meijer
Date: Thu, 6 Feb 2025 03:00:24 -0800
Subject: [PATCH] [Clang][Driver] Add an option to control loop-interchange
T
sjoerdmeijer wrote:
> The other optimization pass options (unroll, vectorize, ...) are implemented
> in `PipelineTuningOptions` and `CodeGenOptions.def`. Do it the same way?
Thanks for the review and suggestion @Meinersbur, that's now implemented in the
latest revision.
https://github.com/ll
https://github.com/sjoerdmeijer updated
https://github.com/llvm/llvm-project/pull/125830
From da944d743f9fb97ddb1a40f58d43b0262f58205a Mon Sep 17 00:00:00 2001
From: Sjoerd Meijer
Date: Thu, 6 Feb 2025 03:00:24 -0800
Subject: [PATCH] [Clang][Driver] Add an option to control loop-interchange
T
https://github.com/sjoerdmeijer updated
https://github.com/llvm/llvm-project/pull/125830
From 45aa8d52ef8391fd15d81fb55a39c34f5aec233b Mon Sep 17 00:00:00 2001
From: Sjoerd Meijer
Date: Thu, 6 Feb 2025 03:00:24 -0800
Subject: [PATCH] [Clang][Driver] Add an option to control loop-interchange
T
sjoerdmeijer wrote:
> This is a bit off topic, but do you have any opinion on adding a pragma for
> interchange like other loop optimizations do? I think it can sometimes be
> useful if we can enable/disable the interchange for each loop, but I think
> there are a few things to consider if we
https://github.com/sjoerdmeijer closed
https://github.com/llvm/llvm-project/pull/125830
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
sjoerdmeijer wrote:
Forgot to add that a similar problem occurs for another test in that same
directory: `vmulh_lane_f16_1.c`.
https://github.com/llvm/llvm-project/pull/120363
sjoerdmeijer wrote:
Hey @davemgreen, we are looking at a runtime failure in a test from the GCC
test-suite:
`./testsuite/gcc.target/aarch64/advsimd-intrinsics/vfmash_lane_f16_1.c`
I think this will reproduce it:
clang vfmash_lane_f16_1.c -mcpu=neoverse-v2 -O0 -lm -o
./vfmash_la
@@ -4064,31 +4072,59 @@ static Value *upgradeX86IntrinsicCall(StringRef Name,
CallBase *CI, Function *F,
static Value *upgradeAArch64IntrinsicCall(StringRef Name, CallBase *CI,
Function *F, IRBuilder<> &Builder) {
- Intrinsic::ID New
@@ -9053,22 +9053,19 @@ class SIMDThreeSameVectorBF16MatrixMul
let mayRaiseFPException = 1, Uses = [FPCR] in
class SIMD_BFCVTN
- : BaseSIMDMixedTwoVector<0, 0, 0b10, 0b10110, V128, V128,
+ : BaseSIMDMixedTwoVector<0, 0, 0b10, 0b10110, V128, V64,
sjoerdmeijer
https://github.com/sjoerdmeijer approved this pull request.
Thanks, LGTM
https://github.com/llvm/llvm-project/pull/120363
@@ -323,9 +321,10 @@ bfloat16x8_t test_vcvtq_low_bf16_f32(float32x4_t a) {
// CHECK-A64-NEXT: entry:
// CHECK-A64-NEXT:[[TMP0:%.*]] = bitcast <8 x bfloat> [[INACTIVE:%.*]] to <16 x i8>
// CHECK-A64-NEXT:[[TMP1:%.*]] = bitcast <4 x float> [[A:%.*]] to <16 x i8>
-// CHE
sjoerdmeijer wrote:
I don't think it is strictly necessary, but do we have a public NVIDIA
announcement of this core? If so, a link to that would be nice to include in
the description.
Will take a look at the patch.
https://github.com/llvm/llvm-project/pull/132368
@@ -1067,7 +1067,8 @@ def ProcessorFeatures {
FeatureDotProd, FeatureFPARMv8,
FeatureMatMulInt8,
FeatureSSBS, FeatureCCIDX,
FeatureJS, FeatureLSE, FeatureRAS,
Featur
https://github.com/sjoerdmeijer updated
https://github.com/llvm/llvm-project/pull/133054
From b1619f7b2835acafb4d76e6a16e678b17ddbe8b3 Mon Sep 17 00:00:00 2001
From: Sjoerd Meijer
Date: Wed, 26 Mar 2025 04:38:48 -0700
Subject: [PATCH] [AArch64] Add FEAT_FPAC to Neoverse V2
This feature is sup
https://github.com/sjoerdmeijer edited
https://github.com/llvm/llvm-project/pull/133054
https://github.com/sjoerdmeijer updated
https://github.com/llvm/llvm-project/pull/133054
From a65f5c9b7937b21bdf43dea532e951b3ce462ec5 Mon Sep 17 00:00:00 2001
From: Sjoerd Meijer
Date: Wed, 26 Mar 2025 01:38:46 -0700
Subject: [PATCH] [AArch64] Add FEAT_FPAC to Neoverse V2
This feature is sup
@@ -19,6 +19,7 @@
// CHECK-NEXT: FEAT_ETE Enable Embedded Trace Extension
// CHECK-NEXT: FEAT_FCMA Enable Armv8.3-A Floating-point complex number support
// CHECK-NEXT: FEAT
https://github.com/sjoerdmeijer closed
https://github.com/llvm/llvm-project/pull/133054
https://github.com/sjoerdmeijer created
https://github.com/llvm/llvm-project/pull/133054
This feature is supported in Grace, but wasn't specified in the CPU definition.
This is not important for codegen, but is good for completeness, and good for
other tools that could query the CPU definition
https://github.com/sjoerdmeijer updated
https://github.com/llvm/llvm-project/pull/133054
From e6af1a3ef41cce2a27b1f0719f58bf82ce55b0db Mon Sep 17 00:00:00 2001
From: Sjoerd Meijer
Date: Wed, 26 Mar 2025 01:38:46 -0700
Subject: [PATCH] [AArch64] Add FEAT_FPAC to Neoverse V2
This feature is sup
@@ -555,7 +555,8 @@ def TuneNeoverseV2 : SubtargetFeature<"neoversev2",
"ARMProcFamily", "NeoverseV2
FeatureEnableSelectOptimize,
FeatureUseFixedOverScalableIfEqualCost,
https://github.com/sjoerdmeijer updated
https://github.com/llvm/llvm-project/pull/133054
From 39ac7e676ced6be75d12adbae4644a232e471f6e Mon Sep 17 00:00:00 2001
From: Sjoerd Meijer
Date: Wed, 26 Mar 2025 04:38:48 -0700
Subject: [PATCH] [AArch64] Add FEAT_FPAC to Neoverse V2
This feature is sup
https://github.com/sjoerdmeijer approved this pull request.
LGTM, but worth looking at Dave's suggestion before merging this:
> It currently uses a bit of a mixture of specifying features individually
> (FeatureAES and FeatureSVEAES) and relying on the dependencies
> (FeatureSVE2SHA3 will impl
@@ -92,7 +92,7 @@
// COBALT-100: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-n2"
// RUN: %clang --target=aarch64 -mcpu=grace -### -c %s 2>&1 | FileCheck -check-prefix=GRACE %s
-// GRACE: "-cc1"{{.*}} "-triple" "aarch64{{.*}}" "-target-cpu" "neoverse-v2"
--
https://github.com/sjoerdmeijer edited
https://github.com/llvm/llvm-project/pull/127620
https://github.com/sjoerdmeijer commented:
It would be good to mention in the description:
- that Grace is no longer an alias, but is a separate CPU definition.
- which optional extensions are now enabled.
https://github.com/llvm/llvm-project/pull/127620
https://github.com/sjoerdmeijer created
https://github.com/llvm/llvm-project/pull/127621
We had an internal discussion about -ffp-contract, how it compares to GCC (which
defaults to fast), and standard compliance. Looking at our docs, I think most
information is there, but also thought it could
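To make the discussion concrete, here is a small C++ sketch of why contraction is observable. It compares `a * b + c` computed with the product rounded first against a fused multiply-add with a single rounding, using `std::fma` to stand in for what a contracting compiler may emit. The function and struct names are hypothetical, the inputs are chosen for illustration only, and the code assumes IEEE 754 doubles with round-to-nearest-even.

```cpp
#include <cassert>
#include <cmath>

// Computes a * b + c with two roundings. The volatile store forces the
// product to be rounded to double before the addition, so the compiler
// cannot contract the expression into a fused operation.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Computes a * b + c with a single rounding, as a contracted expression would.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}

struct ContractionDemo {
    double separate;  // result with the product rounded first
    double fused;     // result with a single rounding
};

// Uses a = 1 + 2^-27 and b = 1 - 2^-27, whose exact product is 1 - 2^-54.
// That value is a rounding tie that rounds to 1.0 in double precision, so
// the unfused result loses the 2^-54 term that the fused result keeps.
ContractionDemo contraction_demo() {
    const double eps = std::ldexp(1.0, -27);
    assert(1.0 + eps != 1.0);  // eps must be representable next to 1.0
    const double a = 1.0 + eps;
    const double b = 1.0 - eps;
    return {mul_then_add(a, b, -1.0), fused_mul_add(a, b, -1.0)};
}
```

Under the stated assumptions the two fields of `contraction_demo()` differ, which is the kind of result difference a `-ffp-contract` setting can introduce between compilers with different defaults.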
https://github.com/sjoerdmeijer updated
https://github.com/llvm/llvm-project/pull/127621
From d7483fc138c0834fbed84bb43521ce8caed84528 Mon Sep 17 00:00:00 2001
From: Sjoerd Meijer
Date: Tue, 18 Feb 2025 03:37:08 -0800
Subject: [PATCH] [Clang][doc] -ffp-contract options and standard compliance
https://github.com/sjoerdmeijer edited
https://github.com/llvm/llvm-project/pull/127621
https://github.com/sjoerdmeijer commented:
Thanks for the review. I have addressed the comments, I think.
https://github.com/llvm/llvm-project/pull/127621
@@ -1681,19 +1681,25 @@ for more details.
permitted to produce more precise results than performing the same
operations separately.
- The C standard permits intermediate floating-point results within an
+ The C/C++ standard permits intermediate floating-point results
https://github.com/sjoerdmeijer closed
https://github.com/llvm/llvm-project/pull/127621
sjoerdmeijer wrote:
Thanks for your reviews! I will make those changes before merging this.
https://github.com/llvm/llvm-project/pull/127621
https://github.com/sjoerdmeijer approved this pull request.
LGTM
https://github.com/llvm/llvm-project/pull/132368
sjoerdmeijer wrote:
For more context, this is part of our loop-interchange enablement story, see
our RFC here: https://discourse.llvm.org/t/enabling-loop-interchange/82589.
We have fixed all the compile-time issues and loop-interchange issues that we
are aware of, and would like to enable this
sjoerdmeijer wrote:
> Thanks for this PR. Do you have any compilation time and performance data?
This information is a bit spread out across the other tickets that I linked
earlier, so to summarise: compile times look really good, and the increases are
very minimal after the work that Madhur did. In