[clang] [clang-tools-extra] [mlir] [llvm] [mlir][linalg] Implement common interface for depthwise convolution ops (PR #75017)

2023-12-14 Thread via cfe-commits

https://github.com/srcarroll edited 
https://github.com/llvm/llvm-project/pull/75017
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang-tools-extra] [clang] [llvm] [CanonicalizeFreezeInLoops] fix duplicate removal (PR #74716)

2023-12-14 Thread Wei Tao via cfe-commits

https://github.com/Friedrich20 updated 
https://github.com/llvm/llvm-project/pull/74716

>From 5ef11803b597fec44b64239824a4c9d3280cfea6 Mon Sep 17 00:00:00 2001
From: Wei Tao 
Date: Thu, 7 Dec 2023 21:33:40 +0800
Subject: [PATCH] [LLVM][CanonicalizeFreezeInLoops] fix duplicate removal

---
 .../Transforms/Utils/CanonicalizeFreezeInLoops.cpp  | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp 
b/llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
index fb4d8288537725..ff1eb17e0c2488 100644
--- a/llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
+++ b/llvm/lib/Transforms/Utils/CanonicalizeFreezeInLoops.cpp
@@ -151,11 +151,18 @@ bool CanonicalizeFreezeInLoopsImpl::run() {
   }
 }
 
+bool Exist = false;
 auto Visit = [&](User *U) {
   if (auto *FI = dyn_cast<FreezeInst>(U)) {
-LLVM_DEBUG(dbgs() << "canonfr: found: " << *FI << "\n");
-Info.FI = FI;
-Candidates.push_back(Info);
+for (const auto &Candidate : Candidates) {
+  auto *FI_cand = Candidate.FI;
+  Exist = (FI_cand == FI) ? true : Exist;
+}
+if (!Exist) {
+  LLVM_DEBUG(dbgs() << "canonfr: found: " << *FI << "\n");
+  Info.FI = FI;
+  Candidates.push_back(Info);
+}
   }
 };
 for_each(PHI.users(), Visit);
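
For illustration only, the de-duplication this patch is aiming for can be sketched outside LLVM as below. In the diff as posted, `Exist` is declared outside the lambda and does not appear to be reset between users, so this sketch computes the flag locally on each call instead. `FreezeInst` and `CandidateInfo` here are hypothetical stand-ins, not the real LLVM types, and `addCandidateOnce` is a made-up helper name.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-ins for the LLVM types referenced in the patch.
struct FreezeInst {
  int Id;
};
struct CandidateInfo {
  FreezeInst *FI = nullptr;
};

// Append FI as a candidate only if no existing candidate already refers to
// it. Returns true when a new candidate was added. The "already present"
// flag is local to each call, so one duplicate cannot suppress later,
// distinct candidates.
bool addCandidateOnce(std::vector<CandidateInfo> &Candidates, FreezeInst *FI) {
  const bool Exists =
      std::any_of(Candidates.begin(), Candidates.end(),
                  [FI](const CandidateInfo &C) { return C.FI == FI; });
  if (Exists)
    return false;
  CandidateInfo Info;
  Info.FI = FI;
  Candidates.push_back(Info);
  return true;
}
```

Calling `addCandidateOnce` twice with the same pointer adds it once; a different pointer is still accepted afterwards.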



[clang] [AArch64][SME] Warn when using a streaming builtin from a non-streaming function (PR #74064)

2023-12-14 Thread Sander de Smalen via cfe-commits

sdesmalen-arm wrote:

> On my system, this increases the compilation time of SemaChecking.cpp from 7 
> seconds to 2 minutes 46 seconds (using clang as a host compiler). That seems 
> excessive. Let's please find a way to not make compilation so slow, and let's 
> consider reverting this until a faster approach is found.

I see the same with GCC. It seems that changing the generated table from:
```
case SVE::BI__builtin_sve_svaba_n_s16: 
  BuiltinType = ArmStreamingCompatible;
  break;
case SVE::BI__builtin_sve_svaba_n_s32: 
  BuiltinType = ArmStreamingCompatible;
  break;
case SVE::BI__builtin_sve_svaba_n_s64: 
  BuiltinType = ArmStreamingCompatible;
  break;
...
```
to
```
case SVE::BI__builtin_sve_svacge_n_f16:
case SVE::BI__builtin_sve_svacge_n_f32:
case SVE::BI__builtin_sve_svacge_n_f64:
...
  BuiltinType = ArmStreamingCompatible;
  break;
```
resolves most of the issue without changing behaviours.

Additionally, it might be good to make the most common streaming-mode for this 
file the default (which for arm_sve.h is streaming-compatible), so that the 
table only has to capture the intrinsics which are explicitly non-streaming or 
streaming.
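
A minimal sketch of the two suggestions combined (grouped case labels plus a default for the most common mode), assuming made-up builtin IDs and enum names rather than the real generated table:

```cpp
// Hypothetical classification of a builtin's streaming mode.
enum class StreamingKind { NonStreaming, Streaming, StreamingCompatible };

// Hypothetical builtin IDs; the real table is generated and far larger.
enum class BuiltinID {
  HypotheticalCompatibleA,
  HypotheticalCompatibleB,
  HypotheticalStreamingOnlyA,
  HypotheticalStreamingOnlyB,
  HypotheticalNonStreamingA,
};

// Only the exceptions are listed, and labels with the same result share one
// body. Everything else falls through to the most common mode.
StreamingKind classifyBuiltin(BuiltinID ID) {
  switch (ID) {
  case BuiltinID::HypotheticalStreamingOnlyA:
  case BuiltinID::HypotheticalStreamingOnlyB:
    return StreamingKind::Streaming;
  case BuiltinID::HypotheticalNonStreamingA:
    return StreamingKind::NonStreaming;
  default:
    return StreamingKind::StreamingCompatible;
  }
}
```

With this shape the switch has one body per distinct result instead of one per builtin, which is the property the comment above relies on to cut compile time.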

https://github.com/llvm/llvm-project/pull/74064


[clang] [Sema] Provide `-fno-/-fvisibility-global-new-delete` option (PR #75364)

2023-12-14 Thread via cfe-commits

https://github.com/bd1976bris updated 
https://github.com/llvm/llvm-project/pull/75364

>From 97efed8c73aed4fdca5510013c844e84953ec256 Mon Sep 17 00:00:00 2001
From: Ben Dunbobbin 
Date: Tue, 12 Dec 2023 08:07:17 +
Subject: [PATCH 1/2] [Sema] Provide `-fno-/-fvisibility-global-new-delete`
 option

By default the implicitly declared replaceable global new and delete
operators are given a `default` visibility attribute. Previous work,
see: https://reviews.llvm.org/D53787, added
`-fvisibility-global-new-delete-hidden` to change this to a `hidden`
visibility attribute.

This change adds: `-fno/-fvisibility-global-new-delete` which controls
whether or not to add a visibility attribute to the implicit
declarations for these functions. Without the attribute the replaceable
global new and delete operators behave normally (like other functions)
with respect to visibility attributes, pragmas and options.

The command line help for these options is rendered as:

  -fvisibility-global-new-delete
  Add a visibility attribute to the implicit
  global C++ operator new and delete declarations

  -fno-visibility-global-new-delete
  Do not add a visibility attribute to the implicit
  global C++ operator new and delete declarations

The motivation is to allow users to specify
`-fno-visibility-global-new-delete` when they intend to replace these
functions either for a single linkage unit or set of linkage units.

`-fno-visibility-global-new-delete` can be applied globally to the
compilations in a build where `-fvisibility-global-new-delete-hidden`
cannot; as it conflicts with a common pattern where these functions are
dynamically imported.

`-fno-visibility-global-new-delete` makes sense as the default for PS5.
Users that want the normal toolchain behaviour will be able to supply
`-fvisibility-global-new-delete`.
---
 clang/include/clang/Basic/LangOptions.def |  3 +-
 clang/include/clang/Driver/Options.td |  6 +++
 clang/lib/Driver/ToolChains/Clang.cpp | 12 +
 clang/lib/Driver/ToolChains/PS4CPU.cpp|  6 +++
 clang/lib/Sema/SemaExprCXX.cpp|  9 ++--
 .../visibility-global-new-delete.cpp  | 13 +
 .../Driver/visibility-global-new-delete.cl| 47 +++
 7 files changed, 91 insertions(+), 5 deletions(-)
 create mode 100644 clang/test/CodeGenCXX/visibility-global-new-delete.cpp
 create mode 100644 clang/test/Driver/visibility-global-new-delete.cl

diff --git a/clang/include/clang/Basic/LangOptions.def 
b/clang/include/clang/Basic/LangOptions.def
index c3d5399905a3fd..1471fc11e11663 100644
--- a/clang/include/clang/Basic/LangOptions.def
+++ b/clang/include/clang/Basic/LangOptions.def
@@ -306,7 +306,8 @@ BENIGN_LANGOPT(IgnoreXCOFFVisibility, 1, 0, "All the 
visibility attributes that
 BENIGN_LANGOPT(VisibilityInlinesHiddenStaticLocalVar, 1, 0,
"hidden visibility for static local variables in inline C++ "
"methods when -fvisibility-inlines hidden is enabled")
-LANGOPT(GlobalAllocationFunctionVisibilityHidden , 1, 0, "hidden visibility 
for global operator new and delete declaration")
+LANGOPT(GlobalAllocationFunctionVisibility, 1, 1, "add a visibility attribute 
to the implicit global operator new and delete declarations")
+LANGOPT(GlobalAllocationFunctionVisibilityHidden, 1, 0, "hidden visibility for 
global operator new and delete declarations")
 LANGOPT(NewInfallible , 1, 0, "Treats throwing global C++ operator new as 
always returning valid memory (annotates with __attribute__((returns_nonnull)) 
and throw()). This is detectable in source.")
 BENIGN_LANGOPT(ParseUnknownAnytype, 1, 0, "__unknown_anytype")
 BENIGN_LANGOPT(DebuggerSupport , 1, 0, "debugger support")
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index db2190318c931a..a9f43b18df6fbf 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3863,6 +3863,12 @@ defm visibility_inlines_hidden_static_local_var : 
BoolFOption<"visibility-inline
 def fvisibility_ms_compat : Flag<["-"], "fvisibility-ms-compat">, 
Group,
   HelpText<"Give global types 'default' visibility and global functions and "
"variables 'hidden' visibility by default">;
+defm visibility_global_new_delete : BoolFOption<"visibility-global-new-delete",
+  LangOpts<"GlobalAllocationFunctionVisibility">, DefaultTrue,
+  PosFlag,
+  NegFlag,
+  BothFlags<[], [ClangOption, CC1Option],
+  " a visibility attribute to the implicit global C++ operator new and 
delete declarations">>;
 def fvisibility_global_new_delete_hidden : Flag<["-"], 
"fvisibility-global-new-delete-hidden">, Group,
   HelpText<"Give global C++ operator new and delete declarations hidden 
visibility">,
   Visibility<[ClangOption, CC1Option]>,
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
inde
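
The three outcomes described in the commit message (no attribute, `default`, `hidden`) can be modeled with a small standalone function. This is only a sketch of the described behaviour, not the code in `SemaExprCXX.cpp`: the struct, field and function names are hypothetical, and the choice that "no attribute" wins when both flags are set is an assumption, since the relevant part of the patch is truncated above.

```cpp
#include <optional>

enum class Visibility { Default, Hidden };

// Hypothetical mirror of the two language options touched by the patch.
struct NewDeleteVisibilityOptions {
  // Models -fvisibility-global-new-delete / -fno-visibility-global-new-delete.
  bool AddVisibilityAttr = true;
  // Models -fvisibility-global-new-delete-hidden.
  bool Hidden = false;
};

// Returns the visibility attribute to attach to the implicit global
// operator new/delete declarations, or std::nullopt for "attach nothing".
// Assumption: disabling the attribute takes precedence over Hidden.
std::optional<Visibility>
implicitNewDeleteVisibility(const NewDeleteVisibilityOptions &Opts) {
  if (!Opts.AddVisibilityAttr)
    return std::nullopt;
  return Opts.Hidden ? Visibility::Hidden : Visibility::Default;
}
```

With the defaults the result is `Visibility::Default`, matching the existing behaviour the commit message describes; clearing `AddVisibilityAttr` yields no attribute at all.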

[clang] [llvm] [Clang][IR] add TBAA metadata on pointer, union and array types. (PR #75177)

2023-12-14 Thread Vlad Serebrennikov via cfe-commits


@@ -1,7 +1,7 @@
-// RUN: %clang_cc1 -triple x86_64-linux -std=c++98 %s -O3 -disable-llvm-passes 
-pedantic-errors -emit-llvm -o - | FileCheck %s

Endilll wrote:

Can you explain why `-disable-llvm-passes` is there, and why it can be removed 
now?

https://github.com/llvm/llvm-project/pull/75177


[clang] [clang][wasm] Resolve assertion errors caused by converting ComplexTy… (PR #70496)

2023-12-14 Thread Vlad Serebrennikov via cfe-commits

Endilll wrote:

Can you add a test for this?

https://github.com/llvm/llvm-project/pull/70496


[clang] [clang][wasm] Resolve assertion errors caused by converting ComplexTy… (PR #70496)

2023-12-14 Thread via cfe-commits

knightXun wrote:

> Can you add a test for this?

oh, I'll write one.

https://github.com/llvm/llvm-project/pull/70496


[clang] [clang][sema] forbid vector_size attr when specify `-mgeneral-regs-only` on x86 (PR #75350)

2023-12-14 Thread via cfe-commits


@@ -8251,6 +8251,25 @@ static void HandleVectorSizeAttr(QualType &CurType, 
const ParsedAttr &Attr,
 return;
   }
 
+  // check -mgeneral-regs-only is specified
+  const TargetInfo &targetInfo = S.getASTContext().getTargetInfo();
+  llvm::Triple::ArchType arch = targetInfo.getTriple().getArch();
+  const auto HasFeature = [](const clang::TargetOptions &targetOpts,
+ const std::string &feature) {
+return std::find(targetOpts.Features.begin(), targetOpts.Features.end(),
+ feature) != targetOpts.Features.end();
+  };
+  if (CurType->isSpecificBuiltinType(BuiltinType::LongDouble)) {

knightXun wrote:

that's great advice!

https://github.com/llvm/llvm-project/pull/75350


[llvm] [clang] [AArch64] Disable large global group relocation (PR #75445)

2023-12-14 Thread via cfe-commits

https://github.com/wc00862805aj created 
https://github.com/llvm/llvm-project/pull/75445

None

>From b1109d297690b3b162eab21dfc41ce03a898a908 Mon Sep 17 00:00:00 2001
From: wcleungaj 
Date: Thu, 14 Dec 2023 16:54:37 +0800
Subject: [PATCH] [AArch64] Disable large global group relocation

---
 clang/include/clang/Driver/Options.td |  3 +++
 clang/lib/Driver/ToolChains/Clang.cpp |  5 +
 .../AArch64/GISel/AArch64InstructionSelector.cpp  |  5 +
 .../AArch64/GlobalISel/select-blockaddress.mir| 11 +++
 4 files changed, 24 insertions(+)

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 1b02087425b751..827e14a071f436 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4663,6 +4663,9 @@ def mno_fix_cortex_a53_835769 : Flag<["-"], 
"mno-fix-cortex-a53-835769">,
 def mmark_bti_property : Flag<["-"], "mmark-bti-property">,
   Group,
   HelpText<"Add .note.gnu.property with BTI to assembly files (AArch64 only)">;
+def mno_large_group_reloc: Flag<["-"], "mno-large-global-group-reloc">, 
+  Group,
+  HelpText<"Disable group relocation type for global value and symbol when 
code model is large">;
 def mno_bti_at_return_twice : Flag<["-"], "mno-bti-at-return-twice">,
   Group,
   HelpText<"Do not add a BTI instruction after a setjmp or other"
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index de9fd5eaa1e020..8edfe00358a066 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -4977,6 +4977,11 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
   if (Args.getLastArg(options::OPT_save_temps_EQ))
 Args.AddLastArg(CmdArgs, options::OPT_save_temps_EQ);
 
+  if (Args.getLastArg(options::OPT_mno_large_global_group_reloc)){
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("-mno-large-global-group-reloc");
+  }
+
   auto *MemProfArg = Args.getLastArg(options::OPT_fmemory_profile,
  options::OPT_fmemory_profile_EQ,
  options::OPT_fno_memory_profile);
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp 
b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
index bdaae4dd724d53..95669104739db4 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
@@ -66,6 +66,10 @@ namespace {
 #include "AArch64GenGlobalISel.inc"
 #undef GET_GLOBALISEL_PREDICATE_BITSET
 
+static cl::opt<bool> DisableLargeGlobalGroupReloc(
+  "mno-large-global-group-reloc",
+  cl::desc("Disable group relocation type for global value and symbol when 
code model is large"),
+  cl::init(false));
 
 class AArch64InstructionSelector : public InstructionSelector {
 public:
@@ -2850,6 +2854,7 @@ bool AArch64InstructionSelector::select(MachineInstr &I) {
   I.setDesc(TII.get(AArch64::LOADgot));
   I.getOperand(1).setTargetFlags(OpFlags);
 } else if (TM.getCodeModel() == CodeModel::Large &&
+   !DisableLargeGlobalGroupReloc &&
!TM.isPositionIndependent()) {
   // Materialize the global using movz/movk instructions.
   materializeLargeCMVal(I, GV, OpFlags);
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir 
b/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir
index 28d279d7421642..dadde2d8f33426 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir
@@ -2,6 +2,7 @@
 # RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs 
-run-pass=instruction-select %s | FileCheck %s
 # RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs 
-run-pass=instruction-select -code-model=large %s | FileCheck %s 
--check-prefix=LARGE
 # RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs 
-run-pass=instruction-select -code-model=large -relocation-model=pic %s | 
FileCheck %s --check-prefix=LARGE-PIC
+# RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs 
-run-pass=instruction-select -code-model=large -mno-large-global-group-reloc %s 
| FileCheck %s --check-prefix=NO-LARGE-GLOBAL-GROUP-RELOC
 --- |
   source_filename = "blockaddress.ll"
   target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
@@ -62,6 +63,16 @@ body: |
   ; LARGE-PIC-NEXT:   BR [[MOVaddrBA]]
   ; LARGE-PIC-NEXT: {{  $}}
   ; LARGE-PIC-NEXT: bb.1.block (ir-block-address-taken %ir-block.block):
+  ; NO-LARGE-GLOBAL-GROUP-RELOC-LABEL: name: test_blockaddress
+  ; NO-LARGE-GLOBAL-GROUP-RELOC: bb.0 (%ir-block.0):
+  ; NO-LARGE-GLOBAL-GROUP-RELOC: [[MOVZXi:%[0-9]+]]:gpr64 = MOVZXi 
target-flags(aarch64-g0, aarch64-nc) blockaddress(@test_blockaddress, 
%ir-block.block), 0
+  ; NO-LARGE-GLOBAL-GROUP-RELOC: [[MOVKXi:%[0-9]+]]:gpr64 = MO

[clang] [llvm] [AArch64] Disable large global group relocation (PR #75445)

2023-12-14 Thread via cfe-commits

github-actions[bot] wrote:

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this 
page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using `@` followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from 
other developers.

If you have further questions, they may be answered by the [LLVM GitHub User 
Guide](https://llvm.org/docs/GitHub.html).

You can also ask questions in a comment on this PR, on the [LLVM 
Discord](https://discord.com/invite/xS7Z362) or on the 
[forums](https://discourse.llvm.org/).

https://github.com/llvm/llvm-project/pull/75445


[llvm] [clang] [AArch64] Disable large global group relocation (PR #75445)

2023-12-14 Thread via cfe-commits

llvmbot wrote:



@llvm/pr-subscribers-clang-driver

@llvm/pr-subscribers-llvm-globalisel

Author: None (wc00862805aj)


Changes



---
Full diff: https://github.com/llvm/llvm-project/pull/75445.diff


4 Files Affected:

- (modified) clang/include/clang/Driver/Options.td (+3) 
- (modified) clang/lib/Driver/ToolChains/Clang.cpp (+5) 
- (modified) llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp (+5) 
- (modified) llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir (+11) 


```diff
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 1b02087425b751..827e14a071f436 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4663,6 +4663,9 @@ def mno_fix_cortex_a53_835769 : Flag<["-"], 
"mno-fix-cortex-a53-835769">,
 def mmark_bti_property : Flag<["-"], "mmark-bti-property">,
   Group,
   HelpText<"Add .note.gnu.property with BTI to assembly files (AArch64 only)">;
+def mno_large_group_reloc: Flag<["-"], "mno-large-global-group-reloc">, 
+  Group,
+  HelpText<"Disable group relocation type for global value and symbol when 
code model is large">;
 def mno_bti_at_return_twice : Flag<["-"], "mno-bti-at-return-twice">,
   Group,
   HelpText<"Do not add a BTI instruction after a setjmp or other"
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index de9fd5eaa1e020..8edfe00358a066 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -4977,6 +4977,11 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
   if (Args.getLastArg(options::OPT_save_temps_EQ))
 Args.AddLastArg(CmdArgs, options::OPT_save_temps_EQ);
 
+  if (Args.getLastArg(options::OPT_mno_large_global_group_reloc)){
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("-mno-large-global-group-reloc");
+  }
+
   auto *MemProfArg = Args.getLastArg(options::OPT_fmemory_profile,
  options::OPT_fmemory_profile_EQ,
  options::OPT_fno_memory_profile);
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp 
b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
index bdaae4dd724d53..95669104739db4 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
@@ -66,6 +66,10 @@ namespace {
 #include "AArch64GenGlobalISel.inc"
 #undef GET_GLOBALISEL_PREDICATE_BITSET
 
+static cl::opt<bool> DisableLargeGlobalGroupReloc(
+  "mno-large-global-group-reloc",
+  cl::desc("Disable group relocation type for global value and symbol when 
code model is large"),
+  cl::init(false));
 
 class AArch64InstructionSelector : public InstructionSelector {
 public:
@@ -2850,6 +2854,7 @@ bool AArch64InstructionSelector::select(MachineInstr &I) {
   I.setDesc(TII.get(AArch64::LOADgot));
   I.getOperand(1).setTargetFlags(OpFlags);
 } else if (TM.getCodeModel() == CodeModel::Large &&
+   !DisableLargeGlobalGroupReloc &&
!TM.isPositionIndependent()) {
   // Materialize the global using movz/movk instructions.
   materializeLargeCMVal(I, GV, OpFlags);
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir 
b/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir
index 28d279d7421642..dadde2d8f33426 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir
@@ -2,6 +2,7 @@
 # RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs 
-run-pass=instruction-select %s | FileCheck %s
 # RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs 
-run-pass=instruction-select -code-model=large %s | FileCheck %s 
--check-prefix=LARGE
 # RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs 
-run-pass=instruction-select -code-model=large -relocation-model=pic %s | 
FileCheck %s --check-prefix=LARGE-PIC
+# RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs 
-run-pass=instruction-select -code-model=large -mno-large-global-group-reloc %s 
| FileCheck %s --check-prefix=NO-LARGE-GLOBAL-GROUP-RELOC
 --- |
   source_filename = "blockaddress.ll"
   target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
@@ -62,6 +63,16 @@ body: |
   ; LARGE-PIC-NEXT:   BR [[MOVaddrBA]]
   ; LARGE-PIC-NEXT: {{  $}}
   ; LARGE-PIC-NEXT: bb.1.block (ir-block-address-taken %ir-block.block):
+  ; NO-LARGE-GLOBAL-GROUP-RELOC-LABEL: name: test_blockaddress
+  ; NO-LARGE-GLOBAL-GROUP-RELOC: bb.0 (%ir-block.0):
+  ; NO-LARGE-GLOBAL-GROUP-RELOC: [[MOVZXi:%[0-9]+]]:gpr64 = MOVZXi 
target-flags(aarch64-g0, aarch64-nc) blockaddress(@test_blockaddress, 
%ir-block.block), 0
+  ; NO-LARGE-GLOBAL-GROUP-RELOC: [[MOVKXi:%[0-9]+]]:gpr64 = MOVKXi [[MOVZXi]], 
target-flags(aarch64-g1, aarch64-nc) blockaddress(@test_blockaddre

[clang] [llvm] [AArch64] Disable large global group relocation (PR #75445)

2023-12-14 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-aarch64

Author: None (wc00862805aj)


Changes



---
Full diff: https://github.com/llvm/llvm-project/pull/75445.diff


4 Files Affected:

- (modified) clang/include/clang/Driver/Options.td (+3) 
- (modified) clang/lib/Driver/ToolChains/Clang.cpp (+5) 
- (modified) llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp (+5) 
- (modified) llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir (+11) 


```diff
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 1b02087425b751..827e14a071f436 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4663,6 +4663,9 @@ def mno_fix_cortex_a53_835769 : Flag<["-"], 
"mno-fix-cortex-a53-835769">,
 def mmark_bti_property : Flag<["-"], "mmark-bti-property">,
   Group,
   HelpText<"Add .note.gnu.property with BTI to assembly files (AArch64 only)">;
+def mno_large_group_reloc: Flag<["-"], "mno-large-global-group-reloc">, 
+  Group,
+  HelpText<"Disable group relocation type for global value and symbol when 
code model is large">;
 def mno_bti_at_return_twice : Flag<["-"], "mno-bti-at-return-twice">,
   Group,
   HelpText<"Do not add a BTI instruction after a setjmp or other"
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index de9fd5eaa1e020..8edfe00358a066 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -4977,6 +4977,11 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
   if (Args.getLastArg(options::OPT_save_temps_EQ))
 Args.AddLastArg(CmdArgs, options::OPT_save_temps_EQ);
 
+  if (Args.getLastArg(options::OPT_mno_large_global_group_reloc)){
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("-mno-large-global-group-reloc");
+  }
+
   auto *MemProfArg = Args.getLastArg(options::OPT_fmemory_profile,
  options::OPT_fmemory_profile_EQ,
  options::OPT_fno_memory_profile);
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp 
b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
index bdaae4dd724d53..95669104739db4 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
@@ -66,6 +66,10 @@ namespace {
 #include "AArch64GenGlobalISel.inc"
 #undef GET_GLOBALISEL_PREDICATE_BITSET
 
+static cl::opt<bool> DisableLargeGlobalGroupReloc(
+  "mno-large-global-group-reloc",
+  cl::desc("Disable group relocation type for global value and symbol when 
code model is large"),
+  cl::init(false));
 
 class AArch64InstructionSelector : public InstructionSelector {
 public:
@@ -2850,6 +2854,7 @@ bool AArch64InstructionSelector::select(MachineInstr &I) {
   I.setDesc(TII.get(AArch64::LOADgot));
   I.getOperand(1).setTargetFlags(OpFlags);
 } else if (TM.getCodeModel() == CodeModel::Large &&
+   !DisableLargeGlobalGroupReloc &&
!TM.isPositionIndependent()) {
   // Materialize the global using movz/movk instructions.
   materializeLargeCMVal(I, GV, OpFlags);
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir 
b/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir
index 28d279d7421642..dadde2d8f33426 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir
@@ -2,6 +2,7 @@
 # RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs 
-run-pass=instruction-select %s | FileCheck %s
 # RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs 
-run-pass=instruction-select -code-model=large %s | FileCheck %s 
--check-prefix=LARGE
 # RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs 
-run-pass=instruction-select -code-model=large -relocation-model=pic %s | 
FileCheck %s --check-prefix=LARGE-PIC
+# RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs 
-run-pass=instruction-select -code-model=large -mno-large-global-group-reloc %s 
| FileCheck %s --check-prefix=NO-LARGE-GLOBAL-GROUP-RELOC
 --- |
   source_filename = "blockaddress.ll"
   target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
@@ -62,6 +63,16 @@ body: |
   ; LARGE-PIC-NEXT:   BR [[MOVaddrBA]]
   ; LARGE-PIC-NEXT: {{  $}}
   ; LARGE-PIC-NEXT: bb.1.block (ir-block-address-taken %ir-block.block):
+  ; NO-LARGE-GLOBAL-GROUP-RELOC-LABEL: name: test_blockaddress
+  ; NO-LARGE-GLOBAL-GROUP-RELOC: bb.0 (%ir-block.0):
+  ; NO-LARGE-GLOBAL-GROUP-RELOC: [[MOVZXi:%[0-9]+]]:gpr64 = MOVZXi 
target-flags(aarch64-g0, aarch64-nc) blockaddress(@test_blockaddress, 
%ir-block.block), 0
+  ; NO-LARGE-GLOBAL-GROUP-RELOC: [[MOVKXi:%[0-9]+]]:gpr64 = MOVKXi [[MOVZXi]], 
target-flags(aarch64-g1, aarch64-nc) blockaddress(@test_blockaddress, 
%ir-block.block), 16
+  ; NO-

[clang] [llvm] [AArch64] Disable large global group relocation (PR #75445)

2023-12-14 Thread via cfe-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. 
:warning:



You can test this locally with the following command:


```bash
git-clang-format --diff a2691e363232c011fdaace9fcc094f3cd210f78b \
  b1109d297690b3b162eab21dfc41ce03a898a908 -- \
  clang/lib/Driver/ToolChains/Clang.cpp \
  llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
```





View the diff from clang-format here.


```diff
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index 8edfe00358..6b01cca9d1 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -4977,7 +4977,7 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
   if (Args.getLastArg(options::OPT_save_temps_EQ))
 Args.AddLastArg(CmdArgs, options::OPT_save_temps_EQ);
 
-  if (Args.getLastArg(options::OPT_mno_large_global_group_reloc)){
+  if (Args.getLastArg(options::OPT_mno_large_global_group_reloc)) {
 CmdArgs.push_back("-mllvm");
 CmdArgs.push_back("-mno-large-global-group-reloc");
   }
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp 
b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
index 9566910473..9ffb545dac 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
@@ -67,9 +67,10 @@ namespace {
 #undef GET_GLOBALISEL_PREDICATE_BITSET
 
 static cl::opt<bool> DisableLargeGlobalGroupReloc(
-  "mno-large-global-group-reloc",
-  cl::desc("Disable group relocation type for global value and symbol when 
code model is large"),
-  cl::init(false));
+"mno-large-global-group-reloc",
+cl::desc("Disable group relocation type for global value and symbol when "
+ "code model is large"),
+cl::init(false));
 
 class AArch64InstructionSelector : public InstructionSelector {
 public:
@@ -2854,8 +2855,7 @@ bool AArch64InstructionSelector::select(MachineInstr &I) {
   I.setDesc(TII.get(AArch64::LOADgot));
   I.getOperand(1).setTargetFlags(OpFlags);
 } else if (TM.getCodeModel() == CodeModel::Large &&
-   !DisableLargeGlobalGroupReloc &&
-   !TM.isPositionIndependent()) {
+   !DisableLargeGlobalGroupReloc && !TM.isPositionIndependent()) {
   // Materialize the global using movz/movk instructions.
   materializeLargeCMVal(I, GV, OpFlags);
   I.eraseFromParent();

```




https://github.com/llvm/llvm-project/pull/75445
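
As a reading aid, the branch condition this PR modifies in `AArch64InstructionSelector::select` can be modeled as a standalone predicate. The enum, struct and function names below are hypothetical; only the three-part condition is taken from the diff, and what the selector does when the predicate is false is not visible in the excerpts quoted in this thread.

```cpp
// Hypothetical, simplified stand-ins for the target-machine state.
enum class CodeModelKind { Small, Large };

struct SelectorConfig {
  CodeModelKind CodeModel = CodeModelKind::Small;
  bool PositionIndependent = false;
  // Models the proposed -mno-large-global-group-reloc option (default off).
  bool DisableLargeGlobalGroupReloc = false;
};

// True when the global would be materialized with the movz/movk sequence,
// i.e. large code model, not PIC, and the new option not set.
bool usesMovzMovkMaterialization(const SelectorConfig &Cfg) {
  return Cfg.CodeModel == CodeModelKind::Large &&
         !Cfg.DisableLargeGlobalGroupReloc && !Cfg.PositionIndependent;
}
```

With the option left at its default, the predicate reduces to the pre-patch condition, so existing behaviour is unchanged unless the flag is passed.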


[libunwind] [libcxx] [libcxxabi] [runtimes] Don't link against compiler-rt explicitly when we use -nostdlib++ (PR #75089)

2023-12-14 Thread Petr Hosek via cfe-commits

petrhosek wrote:

> > I'm trying to implement support for building libunwind, libc++abi and 
> > libc++ against LLVM libc in which case we won't be able to rely on 
> > `-nostdlib++`, we'll need to use `-nostdlib` to avoid linking the C 
> > library. We can still use `-nostdlib++` when LLVM libc isn't being used,
> > but a lot of this logic will need to be refactored to support the new
> > use case. With that in mind, I'm fine with this change as an interim solution.
> 
> Is there a reason why `-nostdlib` also drops compiler-rt? If `-nostdlib++` 
> affects only the C++ library, it would make sense that `-nostdlib` affects 
> only the C library?

The reason is to match the behavior of GCC; given that, I don't think we can 
change the current semantics.

https://github.com/llvm/llvm-project/pull/75089


[clang] [llvm] [AArch64] Disable large global group relocation (PR #75445)

2023-12-14 Thread via cfe-commits

https://github.com/wc00862805aj updated 
https://github.com/llvm/llvm-project/pull/75445

>From e7e61ffa07a4b0ad40b91243545e9194dc217385 Mon Sep 17 00:00:00 2001
From: wcleungaj 
Date: Thu, 14 Dec 2023 16:54:37 +0800
Subject: [PATCH] [AArch64] Disable large global group relocation

---
 clang/include/clang/Driver/Options.td |  3 +++
 clang/lib/Driver/ToolChains/Clang.cpp |  5 +
 .../AArch64/GISel/AArch64InstructionSelector.cpp  |  5 +
 .../AArch64/GlobalISel/select-blockaddress.mir| 11 +++
 4 files changed, 24 insertions(+)

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 1b02087425b751..592358d0935853 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4663,6 +4663,9 @@ def mno_fix_cortex_a53_835769 : Flag<["-"], 
"mno-fix-cortex-a53-835769">,
 def mmark_bti_property : Flag<["-"], "mmark-bti-property">,
   Group,
   HelpText<"Add .note.gnu.property with BTI to assembly files (AArch64 only)">;
+def mno_large_global_group_reloc: Flag<["-"], "mno-large-global-group-reloc">, 
+  Group,
+  HelpText<"Disable group relocation type for global value and symbol when 
code model is large">;
 def mno_bti_at_return_twice : Flag<["-"], "mno-bti-at-return-twice">,
   Group,
   HelpText<"Do not add a BTI instruction after a setjmp or other"
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index de9fd5eaa1e020..8edfe00358a066 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -4977,6 +4977,11 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
   if (Args.getLastArg(options::OPT_save_temps_EQ))
 Args.AddLastArg(CmdArgs, options::OPT_save_temps_EQ);
 
+  if (Args.getLastArg(options::OPT_mno_large_global_group_reloc)){
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("-mno-large-global-group-reloc");
+  }
+
   auto *MemProfArg = Args.getLastArg(options::OPT_fmemory_profile,
  options::OPT_fmemory_profile_EQ,
  options::OPT_fno_memory_profile);
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp 
b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
index bdaae4dd724d53..95669104739db4 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
@@ -66,6 +66,10 @@ namespace {
 #include "AArch64GenGlobalISel.inc"
 #undef GET_GLOBALISEL_PREDICATE_BITSET
 
+static cl::opt<bool> DisableLargeGlobalGroupReloc(
+  "mno-large-global-group-reloc",
+  cl::desc("Disable group relocation type for global value and symbol when code model is large"),
+  cl::init(false));
 
 class AArch64InstructionSelector : public InstructionSelector {
 public:
@@ -2850,6 +2854,7 @@ bool AArch64InstructionSelector::select(MachineInstr &I) {
   I.setDesc(TII.get(AArch64::LOADgot));
   I.getOperand(1).setTargetFlags(OpFlags);
 } else if (TM.getCodeModel() == CodeModel::Large &&
+   !DisableLargeGlobalGroupReloc &&
!TM.isPositionIndependent()) {
   // Materialize the global using movz/movk instructions.
   materializeLargeCMVal(I, GV, OpFlags);
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir 
b/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir
index 28d279d7421642..dadde2d8f33426 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir
@@ -2,6 +2,7 @@
 # RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs -run-pass=instruction-select %s | FileCheck %s
 # RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs -run-pass=instruction-select -code-model=large %s | FileCheck %s --check-prefix=LARGE
 # RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs -run-pass=instruction-select -code-model=large -relocation-model=pic %s | FileCheck %s --check-prefix=LARGE-PIC
+# RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs -run-pass=instruction-select -code-model=large -mno-large-global-group-reloc %s | FileCheck %s --check-prefix=NO-LARGE-GLOBAL-GROUP-RELOC
 --- |
   source_filename = "blockaddress.ll"
   target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
@@ -62,6 +63,16 @@ body: |
   ; LARGE-PIC-NEXT:   BR [[MOVaddrBA]]
   ; LARGE-PIC-NEXT: {{  $}}
   ; LARGE-PIC-NEXT: bb.1.block (ir-block-address-taken %ir-block.block):
+  ; NO-LARGE-GLOBAL-GROUP-RELOC-LABEL: name: test_blockaddress
+  ; NO-LARGE-GLOBAL-GROUP-RELOC: bb.0 (%ir-block.0):
+  ; NO-LARGE-GLOBAL-GROUP-RELOC: [[MOVZXi:%[0-9]+]]:gpr64 = MOVZXi target-flags(aarch64-g0, aarch64-nc) blockaddress(@test_blockaddress, %ir-block.block), 0
+  ; NO-LARGE-GLOBAL-GROUP-RELOC: [[MOVKXi:%[0-9]+]]:gpr64 = M

[clang] [Clang][ARM] support arm target attribute, and warning for bad typo (PR #74812)

2023-12-14 Thread via cfe-commits

https://github.com/hstk30-hw updated 
https://github.com/llvm/llvm-project/pull/74812

>From f9c6d46e73b612c261db5fdfebf49bb28003cf0d Mon Sep 17 00:00:00 2001
From: hstk30-hw 
Date: Fri, 8 Dec 2023 14:29:33 +0800
Subject: [PATCH] feat: support arm target attribute, and warning for bad typo

---
 clang/lib/Basic/Targets/AArch64.cpp   |  6 +-
 clang/lib/Basic/Targets/ARM.cpp   | 87 +++
 clang/lib/Basic/Targets/ARM.h |  2 +
 clang/lib/Sema/SemaDeclAttr.cpp   |  3 +
 clang/test/CodeGen/arm-targetattr.c   | 13 +++
 .../arm-ignore-branch-protection-option.c |  4 +-
 .../Sema/arm-branch-protection-attr-warn.c| 10 +--
 clang/test/Sema/arm-branch-protection.c   | 32 +++
 clang/test/Sema/attr-target.c |  8 ++
 9 files changed, 141 insertions(+), 24 deletions(-)
 create mode 100644 clang/test/CodeGen/arm-targetattr.c

diff --git a/clang/lib/Basic/Targets/AArch64.cpp 
b/clang/lib/Basic/Targets/AArch64.cpp
index def16c032c869e..66f0c0e159c2ae 100644
--- a/clang/lib/Basic/Targets/AArch64.cpp
+++ b/clang/lib/Basic/Targets/AArch64.cpp
@@ -1085,13 +1085,17 @@ ParsedTargetAttr 
AArch64TargetInfo::parseTargetAttr(StringRef Features) const {
   FoundArch = true;
   std::pair<StringRef, StringRef> Split =
   Feature.split("=").second.trim().split("+");
+  if (Split.first == "")
+continue;
   const std::optional AI =
   llvm::AArch64::parseArch(Split.first);
 
   // Parse the architecture version, adding the required features to
   // Ret.Features.
-  if (!AI)
+  if (!AI) {
+Ret.Features.push_back("+UNKNOWN");
 continue;
+  }
   Ret.Features.push_back(AI->ArchFeature.str());
   // Add any extra features, after the +
   SplitAndAddFeatures(Split.second, Ret.Features);
diff --git a/clang/lib/Basic/Targets/ARM.cpp b/clang/lib/Basic/Targets/ARM.cpp
index ce7e4d4639ceac..7ed64f4f44a9b5 100644
--- a/clang/lib/Basic/Targets/ARM.cpp
+++ b/clang/lib/Basic/Targets/ARM.cpp
@@ -11,6 +11,7 @@
 
//===--===//
 
 #include "ARM.h"
+#include "clang/AST/Attr.h"
 #include "clang/Basic/Builtins.h"
 #include "clang/Basic/Diagnostic.h"
 #include "clang/Basic/TargetBuiltins.h"
@@ -639,6 +640,92 @@ bool 
ARMTargetInfo::handleTargetFeatures(std::vector &Features,
   return true;
 }
 
+// Parse ARM Target attributes, which are a comma separated list of:
+//  "arch=<arch>" - parsed to features as per -march=..
+//  "cpu=<cpu>" - parsed to features as per -mcpu=.., with CPU set to <cpu>
+//  "tune=<cpu>" - TuneCPU set to <cpu>
+//  "feature", "no-feature" - Add (or remove) feature.
+//  "+feature", "+nofeature" - Add (or remove) feature.
+ParsedTargetAttr ARMTargetInfo::parseTargetAttr(StringRef Features) const {
+  ParsedTargetAttr Ret;
+  if (Features == "default")
+return Ret;
+  SmallVector AttrFeatures;
+  Features.split(AttrFeatures, ",");
+  bool FoundArch = false;
+
+  auto SplitAndAddFeatures = [](StringRef FeatString,
+std::vector<std::string> &Features) {
+SmallVector SplitFeatures;
+FeatString.split(SplitFeatures, StringRef("+"), -1, false);
+for (StringRef Feature : SplitFeatures) {
+  StringRef FeatureName = llvm::ARM::getArchExtFeature(Feature);
+  if (!FeatureName.empty())
+Features.push_back(FeatureName.str());
+  else
+// Pushing the original feature string to give a sema error later on
+// when they get checked.
+Features.push_back(Feature.str());
+}
+  };
+
+  for (auto &Feature : AttrFeatures) {
+Feature = Feature.trim();
+if (Feature.startswith("fpmath="))
+  continue;
+
+if (Feature.startswith("branch-protection=")) {
+  Ret.BranchProtection = Feature.split('=').second.trim();
+  continue;
+}
+
+if (Feature.startswith("arch=")) {
+  if (FoundArch)
+Ret.Duplicate = "arch=";
+  FoundArch = true;
+  std::pair<StringRef, StringRef> Split =
+  Feature.split("=").second.trim().split("+");
+  if (Split.first == "")
+continue;
+  llvm::ARM::ArchKind ArchKind = llvm::ARM::parseArch(Split.first);
+
+  // Parse the architecture version, adding the required features to
+  // Ret.Features.
+  std::vector FeatureStrs;
+  if (ArchKind == llvm::ARM::ArchKind::INVALID) {
+Ret.Features.push_back("+UNKNOWN");
+continue;
+  }
+  std::string ArchFeature = ("+" + llvm::ARM::getArchName(ArchKind)).str();
+  Ret.Features.push_back(ArchFeature);
+  // Add any extra features, after the +
+  SplitAndAddFeatures(Split.second, Ret.Features);
+} else if (Feature.startswith("cpu=")) {
+  if (!Ret.CPU.empty())
+Ret.Duplicate = "cpu=";
+  else {
+// Split the cpu string into "cpu=", "cortex-a710" and any remaining
+// "+feat" features.
+std::pair<StringRef, StringRef> Split =
+Feature.split("=").second.trim().split("+");
+Ret.CPU 

[clang] [clang][Interp] IndirectMember initializers (PR #69900)

2023-12-14 Thread Timm Baeder via cfe-commits

tbaederr wrote:

Ping

https://github.com/llvm/llvm-project/pull/69900


[clang] [clang][Interp] Add inline descriptor to global variables (PR #72892)

2023-12-14 Thread Timm Baeder via cfe-commits

tbaederr wrote:

Ping

https://github.com/llvm/llvm-project/pull/72892


[clang] [clang][Interp] Use array filler expression (PR #72865)

2023-12-14 Thread Timm Baeder via cfe-commits

tbaederr wrote:

Ping

https://github.com/llvm/llvm-project/pull/72865


[llvm] [clang-tools-extra] [mlir] [clang] Make clang report garbage target versions. (PR #75373)

2023-12-14 Thread via cfe-commits

zmodem wrote:

Thanks for looking at this. Silently ignoring bad inputs is usually not a good 
idea.

However, it would seem better to emit a proper diagnostic from the driver 
rather than calling exit in getOSDefines(). That way the regular diagnostic 
mechanisms can be used to decide what to do with it (treat as error, ignore, 
etc.).
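
The suggestion can be sketched in a standalone form. This is a minimal illustration of "report through a diagnostic sink and let the configured severity decide", not Clang's actual diagnostic machinery: `Severity`, `Diagnostic`, `DiagSink`, and `parseMajorVersion` are all hypothetical names invented for the example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins. These are not Clang's diagnostic classes.
enum class Severity { Ignored, Warning, Error };

struct Diagnostic {
  Severity Level;
  std::string Message;
};

// Collects diagnostics. The configured level decides what an invalid version
// means for the caller: ignore it, warn, or treat it as an error.
class DiagSink {
public:
  explicit DiagSink(Severity Level) : InvalidVersionLevel(Level) {}

  void reportInvalidVersion(const std::string &Text) {
    if (InvalidVersionLevel == Severity::Ignored)
      return;
    Diags.push_back({InvalidVersionLevel, "invalid version '" + Text + "'"});
  }

  bool hasErrors() const {
    for (const Diagnostic &D : Diags)
      if (D.Level == Severity::Error)
        return true;
    return false;
  }

  const std::vector<Diagnostic> &diagnostics() const { return Diags; }

private:
  Severity InvalidVersionLevel;
  std::vector<Diagnostic> Diags;
};

// Parses the leading major component of a version such as "14" or "14.0".
// On malformed input it reports through the sink and returns std::nullopt
// instead of terminating the process.
std::optional<unsigned> parseMajorVersion(const std::string &Text,
                                          DiagSink &Diags) {
  unsigned Value = 0;
  std::size_t I = 0;
  while (I < Text.size() && std::isdigit(static_cast<unsigned char>(Text[I]))) {
    Value = Value * 10 + static_cast<unsigned>(Text[I] - '0');
    ++I;
  }
  // Require at least one digit, followed by end of string or a '.'.
  if (I == 0 || (I < Text.size() && Text[I] != '.')) {
    Diags.reportInvalidVersion(Text);
    return std::nullopt;
  }
  return Value;
}
```

With this shape the parsing code never decides policy itself. A caller that constructs the sink with `Severity::Error` fails the compilation, while one that uses `Severity::Warning` or `Severity::Ignored` carries on.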

https://github.com/llvm/llvm-project/pull/75373


[clang] [clang][fatlto] Don't set ThinLTO module flag with FatLTO (PR #75079)

2023-12-14 Thread Nikita Popov via cfe-commits

https://github.com/nikic approved this pull request.

LGTM -- I think this change is clearly right, independently of the ModuleID 
issue.

https://github.com/llvm/llvm-project/pull/75079


[clang] [X86] Add ABI handling for __float128 (PR #75156)

2023-12-14 Thread Simon Pilgrim via cfe-commits


@@ -0,0 +1,35 @@
+// RUN: %clang_cc1 -triple x86_64-linux -emit-llvm  -target-feature +sse2 < %s | FileCheck %s --check-prefixes=CHECK

RKSimon wrote:

Worth adding a non-SSE RUN?

https://github.com/llvm/llvm-project/pull/75156


[clang] [llvm] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-14 Thread Jannik Silvanus via cfe-commits

https://github.com/jasilvanus created 
https://github.com/llvm/llvm-project/pull/75448

Vectors are always bit-packed and don't respect the elements' alignment
requirements. This is different from arrays. This means offsets of vector GEPs
need to be computed differently than offsets of array GEPs.

This PR fixes many places that compute sequential offsets with the incorrect
pattern `DL.getTypeAllocSize(GTI.getIndexedType())`. These uses are replaced
with `GTI.getSequentialElementStride(DL)`, a new helper function added in
this PR.

This changes behavior for GEPs into vectors whose element type has a (bit)
size that differs from its alloc size. This includes two cases:

* Types with a bit size that is not a multiple of a byte, e.g. i1.
  GEPs into such vectors are questionable to begin with, as some elements
  are not even addressable.
* Overaligned types, e.g. i16 with 32-bit alignment.

Existing tests are unaffected, but a miscompilation of a new precommitted test 
is fixed.
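
The difference between the two strides can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `ElemLayout`, `allocSizeBytes`, `vectorStrideBytes`, and the two offset helpers are hypothetical names that do not exist in LLVM, and the model covers only element types whose bit size is a whole number of bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an element type and its ABI alignment as given
// by a data layout string. This is not an LLVM class.
struct ElemLayout {
  uint64_t SizeInBits;   // e.g. 16 for i16 or half
  uint64_t AbiAlignBits; // e.g. 32 for a data layout entry like "f16:32"
};

// Store size: the element's bits rounded up to whole bytes.
inline uint64_t storeSizeBytes(const ElemLayout &E) {
  return (E.SizeInBits + 7) / 8;
}

// Alloc size: the store size rounded up to the ABI alignment. Consecutive
// array elements are this many bytes apart.
inline uint64_t allocSizeBytes(const ElemLayout &E) {
  uint64_t AlignBytes = E.AbiAlignBits / 8;
  if (AlignBytes == 0)
    AlignBytes = 1;
  return (storeSizeBytes(E) + AlignBytes - 1) / AlignBytes * AlignBytes;
}

// Vectors are bit-packed, so no alignment padding is inserted between
// elements. Assumption of this sketch: the element is byte-sized, so the
// stride is simply the store size.
inline uint64_t vectorStrideBytes(const ElemLayout &E) {
  assert(E.SizeInBits % 8 == 0 && "sub-byte elements are not byte-addressable");
  return storeSizeBytes(E);
}

// Byte offset of element Index inside an array of E.
inline uint64_t arrayElemOffset(const ElemLayout &E, uint64_t Index) {
  return Index * allocSizeBytes(E);
}

// Byte offset of element Index inside a vector of E.
inline uint64_t vectorElemOffset(const ElemLayout &E, uint64_t Index) {
  return Index * vectorStrideBytes(E);
}
```

For a 16-bit element with 32-bit alignment, `arrayElemOffset(E, 1)` yields 4 while `vectorElemOffset(E, 1)` yields 2, which is the 2-byte versus 4-byte discrepancy the precommitted test describes. For types whose alloc size equals their store size the two helpers agree.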

>From 2c367fba42b716d803ee088af45c1b57fe4bcbcd Mon Sep 17 00:00:00 2001
From: Jannik Silvanus 
Date: Thu, 14 Dec 2023 09:24:51 +0100
Subject: [PATCH 1/3] [InstCombine] Precommit test exhibiting miscompile

InstCombine is determining incorrect byte offsets for GEPs
into vectors of overaligned elements.
Add a new testcase showing this behavior, serving as precommit
for a fix.
---
 .../test/Transforms/InstCombine/getelementptr.ll | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/llvm/test/Transforms/InstCombine/getelementptr.ll 
b/llvm/test/Transforms/InstCombine/getelementptr.ll
index bc7fdc9352df6c..7c0d95973d5cf3 100644
--- a/llvm/test/Transforms/InstCombine/getelementptr.ll
+++ b/llvm/test/Transforms/InstCombine/getelementptr.ll
@@ -1,7 +1,7 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt < %s -passes=instcombine -S | FileCheck %s
 
-target datalayout = "e-p:64:64-p1:16:16-p2:32:32:32-p3:64:64:64"
+target datalayout = "e-p:64:64-p1:16:16-p2:32:32:32-p3:64:64:64-f16:32"
 
 %intstruct = type { i32 }
 %pair = type { i32, i32 }
@@ -111,6 +111,20 @@ define void @test_evaluate_gep_as_ptrs_array(ptr 
addrspace(2) %B) {
   ret void
 }
 
+define void @test_overaligned_vec(i8 %B) {
+; This should be turned into a constexpr instead of being an instruction
+; CHECK-LABEL: @test_overaligned_vec(
+; TODO: In this test case, half is overaligned to 32 bits.
+;   Vectors are bit-packed and don't respect alignment.
+;   Thus, the byte offset of the second half in <2 x half> is 2 bytes, not 4 bytes:
+; CHECK-NEXT:store i8 [[B:%.*]], ptr getelementptr inbounds ([10 x i8], ptr @Global, i64 0, i64 4), align 1
+; CHECK-NEXT:ret void
+;
+  %A = getelementptr <2 x half>, ptr @Global, i64 0, i64 1
+  store i8 %B, ptr %A
+  ret void
+}
+
 define ptr @test7(ptr %I, i64 %C, i64 %D) {
 ; CHECK-LABEL: @test7(
 ; CHECK-NEXT:[[A:%.*]] = getelementptr i32, ptr [[I:%.*]], i64 [[C:%.*]]

>From 2e80e5846d23946304bbb9930d0a12098a2f16bd Mon Sep 17 00:00:00 2001
From: Jannik Silvanus 
Date: Thu, 14 Dec 2023 09:29:59 +0100
Subject: [PATCH 2/3] [IR]: Add
 generic_gep_type_iterator::getSequentialElementStride

This prepares a fix to GEP offset computations on vectors
of overaligned elements.

We have many places that analyze GEP offsets using GEP iterators
with the following pattern:

  GTI = gep_type_begin(ElemTy, Indices),
  GTE = gep_type_end(ElemTy, Indices);
  for (; GTI != GTE; ++GTI) {
    if (StructType *STy = GTI.getStructTypeOrNull()) {
      // handle struct
      [..]
    } else {
      // handle sequential (outmost index, array, vector):
      auto Stride = DL.getTypeAllocSize(GTI.getIndexedType());
      Offset += Index * Stride;
    }
  }

This is incorrect for vectors of types whose bit size does not
equal their alloc size (e.g. overaligned types), as vectors
always bit-pack their elements.

This patch introduces new functions generic_gep_type_iterator::isVector()
and generic_gep_type_iterator::getSequentialElementStride(const DataLayout &)
to fix these patterns without having to teach all these places about
the specifics of vector bit layouts. With these helpers, the pattern
above can be fixed by replacing the stride computation:

  auto Stride = GTI.getSequentialElementStride(DL);
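
The effect of moving the stride decision into the iterator can be sketched with a toy model of the loop above. Everything here is an assumption made for illustration: `StepKind`, `GepStep`, and `accumulateOffset` are hypothetical and only mimic the shape of the real iterator, and element sizes are passed in as plain numbers instead of being queried from a data layout.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of one GEP indexing step. This is not LLVM's iterator.
enum class StepKind { Struct, Array, Vector };

struct GepStep {
  StepKind Kind;
  uint64_t Index;          // constant index applied at this step
  uint64_t FieldOffset;    // Struct only: byte offset of the selected field
  uint64_t ElemAllocBytes; // Array/Vector: alloc size of the element type
  uint64_t ElemStoreBytes; // Array/Vector: store size of the element type

  bool isStruct() const { return Kind == StepKind::Struct; }

  // The step itself knows whether it indexes a bit-packed vector, so callers
  // do not need a separate vector special case.
  uint64_t sequentialElementStride() const {
    assert(!isStruct() && "stride is only defined for sequential steps");
    return Kind == StepKind::Vector ? ElemStoreBytes : ElemAllocBytes;
  }
};

// Same shape as the loop in the commit message, with the stride taken from
// the step instead of from the element's alloc size.
uint64_t accumulateOffset(const std::vector<GepStep> &Steps) {
  uint64_t Offset = 0;
  for (const GepStep &S : Steps) {
    if (S.isStruct())
      Offset += S.FieldOffset;
    else
      Offset += S.Index * S.sequentialElementStride();
  }
  return Offset;
}
```

Modelling `getelementptr <2 x half>, ptr @Global, i64 0, i64 1` with half overaligned to 32 bits as an outer step with index 0 followed by a `Vector` step with index 1, alloc size 4 and store size 2 gives an offset of 2. Marking the same step as `Array` gives 4, which is the value the old pattern produced.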
---
 .../llvm/IR/GetElementPtrTypeIterator.h   | 57 ++-
 1 file changed, 54 insertions(+), 3 deletions(-)

diff --git a/llvm/include/llvm/IR/GetElementPtrTypeIterator.h 
b/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
index f3272327c3f8b2..5b63ccb182a842 100644
--- a/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
+++ b/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
@@ -16,6 +16,7 @@
 
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/PointerUnion.h"
+#include "llvm/IR/DataLayout.h"
 #include "llvm/IR/DerivedTypes.h"
 #include "llvm/IR/Operator.h"
 #include "llvm/IR/User.h"
@@ -30,7 +31,39 @@ template 
 class generic_gep_type_iterat

[clang] [llvm] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-14 Thread via cfe-commits

llvmbot wrote:



@llvm/pr-subscribers-llvm-globalisel
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-backend-aarch64

Author: Jannik Silvanus (jasilvanus)


Changes

Vectors are always bit-packed and don't respect the elements' alignment
requirements. This is different from arrays. This means offsets of vector GEPs
need to be computed differently than offsets of array GEPs.

This PR fixes many places that compute sequential offsets with the incorrect
pattern `DL.getTypeAllocSize(GTI.getIndexedType())`. These uses are replaced
with `GTI.getSequentialElementStride(DL)`, a new helper function added in
this PR.

This changes behavior for GEPs into vectors whose element type has a (bit)
size that differs from its alloc size. This includes two cases:

* Types with a bit size that is not a multiple of a byte, e.g. i1.
  GEPs into such vectors are questionable to begin with, as some elements
  are not even addressable.
* Overaligned types, e.g. i16 with 32-bit alignment.

Existing tests are unaffected, but a miscompilation of a new precommitted test 
is fixed.

---

Patch is 28.39 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/75448.diff


28 Files Affected:

- (modified) clang/lib/CodeGen/CGExprScalar.cpp (+1-1) 
- (modified) llvm/include/llvm/Analysis/TargetTransformInfoImpl.h (+1-1) 
- (modified) llvm/include/llvm/IR/GetElementPtrTypeIterator.h (+54-3) 
- (modified) llvm/lib/Analysis/BasicAliasAnalysis.cpp (+2-2) 
- (modified) llvm/lib/Analysis/InlineCost.cpp (+1-1) 
- (modified) llvm/lib/Analysis/Local.cpp (+1-1) 
- (modified) llvm/lib/Analysis/LoopAccessAnalysis.cpp (+4-1) 
- (modified) llvm/lib/Analysis/ValueTracking.cpp (+2-2) 
- (modified) llvm/lib/CodeGen/CodeGenPrepare.cpp (+1-1) 
- (modified) llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp (+1-1) 
- (modified) llvm/lib/CodeGen/SelectionDAG/FastISel.cpp (+2-4) 
- (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+1-1) 
- (modified) llvm/lib/ExecutionEngine/Interpreter/Execution.cpp (+1-1) 
- (modified) llvm/lib/IR/DataLayout.cpp (+2-3) 
- (modified) llvm/lib/IR/Operator.cpp (+5-7) 
- (modified) llvm/lib/IR/Value.cpp (+1-1) 
- (modified) llvm/lib/Target/AArch64/AArch64FastISel.cpp (+4-6) 
- (modified) llvm/lib/Target/ARM/ARMFastISel.cpp (+1-1) 
- (modified) llvm/lib/Target/Mips/MipsFastISel.cpp (+1-1) 
- (modified) llvm/lib/Target/PowerPC/PPCFastISel.cpp (+1-1) 
- (modified) llvm/lib/Target/RISCV/RISCVGatherScatterLowering.cpp (+1-1) 
- (modified) llvm/lib/Target/WebAssembly/WebAssemblyFastISel.cpp (+1-1) 
- (modified) llvm/lib/Target/X86/X86FastISel.cpp (+1-1) 
- (modified) llvm/lib/Transforms/Scalar/SROA.cpp (+2-4) 
- (modified) llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp (+3-3) 
- (modified) llvm/lib/Transforms/Scalar/StraightLineStrengthReduce.cpp (+1-1) 
- (modified) llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp (+1-1) 
- (modified) llvm/test/Transforms/InstCombine/getelementptr.ll (+11-1) 


``diff
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp 
b/clang/lib/CodeGen/CGExprScalar.cpp
index 41ad2ddac30d2d..f3eabd1ce224b8 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -5293,7 +5293,7 @@ static GEPOffsetAndOverflow EmitGEPOffsetInBytes(Value 
*BasePtr, Value *GEPVal,
   // Otherwise this is array-like indexing. The local offset is the index
   // multiplied by the element size.
   auto *ElementSize = llvm::ConstantInt::get(
-  IntPtrTy, DL.getTypeAllocSize(GTI.getIndexedType()));
+  IntPtrTy, GTI.getSequentialElementStride(DL)));
   auto *IndexS = Builder.CreateIntCast(Index, IntPtrTy, /*isSigned=*/true);
   LocalOffset = eval(BO_Mul, ElementSize, IndexS);
 }
diff --git a/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h 
b/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
index 1d8f523e9792ba..140838ff7c7c2d 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
@@ -1041,7 +1041,7 @@ class TargetTransformInfoImplCRTPBase : public 
TargetTransformInfoImplBase {
 if (TargetType->isScalableTy())
   return TTI::TCC_Basic;
 int64_t ElementSize =
-DL.getTypeAllocSize(GTI.getIndexedType()).getFixedValue();
+GTI.getSequentialElementStride(DL).getFixedValue();
 if (ConstIdx) {
   BaseOffset +=
   ConstIdx->getValue().sextOrTrunc(PtrSizeBits) * ElementSize;
diff --git a/llvm/include/llvm/IR/GetElementPtrTypeIterator.h 
b/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
index f3272327c3f8b2..5b63ccb182a842 100644
--- a/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
+++ b/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
@@ -16,6 +16,7 @@
 
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/PointerUnion.h"
+#include "llvm/IR/DataLayout.h"
 #include "llvm/IR/DerivedTypes.h"
 #include "llvm/IR/Operator.h"
 #include "llvm/IR/Use

[clang] Revert "[AArch64][SME] Warn when using a streaming builtin from a non-streaming function" (PR #75449)

2023-12-14 Thread Sam Tebbs via cfe-commits

https://github.com/SamTebbs33 closed 
https://github.com/llvm/llvm-project/pull/75449


[clang] Revert "[AArch64][SME] Warn when using a streaming builtin from a non-streaming function" (PR #75449)

2023-12-14 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Sam Tebbs (SamTebbs33)


Changes

Reverts llvm/llvm-project#74064

---

Patch is 308.50 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/75449.diff


19 Files Affected:

- (modified) clang/include/clang/Basic/CMakeLists.txt (-6) 
- (modified) clang/include/clang/Basic/arm_sve.td (+582-582) 
- (modified) clang/include/clang/Sema/Sema.h (-1) 
- (modified) clang/lib/Sema/SemaChecking.cpp (+4-52) 
- (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_add-i32.c (+8-8) 
- (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_add-i64.c (+8-8) 
- (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_mopa-za32.c (+7-7) 
- (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_mopa-za64.c (+5-5) 
- (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_mops-za32.c (+7-7) 
- (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_mops-za64.c (+5-5) 
- (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_read.c (+96-96) 
- (modified) clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_write.c (+96-96) 
- (modified) clang/test/Sema/aarch64-incompat-sm-builtin-calls.c (-77) 
- (modified) clang/test/Sema/aarch64-sme-intrinsics/acle_sme_imm.cpp (+7-7) 
- (modified) clang/test/Sema/aarch64-sme-intrinsics/acle_sme_target.c (+4-5) 
- (modified) clang/utils/TableGen/NeonEmitter.cpp (-28) 
- (modified) clang/utils/TableGen/SveEmitter.cpp (-62) 
- (modified) clang/utils/TableGen/TableGen.cpp (-12) 
- (modified) clang/utils/TableGen/TableGenBackends.h (-2) 


``diff
diff --git a/clang/include/clang/Basic/CMakeLists.txt 
b/clang/include/clang/Basic/CMakeLists.txt
index 73fd521aeeec31..085e316fcc671d 100644
--- a/clang/include/clang/Basic/CMakeLists.txt
+++ b/clang/include/clang/Basic/CMakeLists.txt
@@ -88,9 +88,6 @@ clang_tablegen(arm_sve_typeflags.inc -gen-arm-sve-typeflags
 clang_tablegen(arm_sve_sema_rangechecks.inc -gen-arm-sve-sema-rangechecks
   SOURCE arm_sve.td
   TARGET ClangARMSveSemaRangeChecks)
-clang_tablegen(arm_sve_streaming_attrs.inc -gen-arm-sve-streaming-attrs
-  SOURCE arm_sve.td
-  TARGET ClangARMSveStreamingAttrs)
 clang_tablegen(arm_sme_builtins.inc -gen-arm-sme-builtins
   SOURCE arm_sme.td
   TARGET ClangARMSmeBuiltins)
@@ -100,9 +97,6 @@ clang_tablegen(arm_sme_builtin_cg.inc 
-gen-arm-sme-builtin-codegen
 clang_tablegen(arm_sme_sema_rangechecks.inc -gen-arm-sme-sema-rangechecks
   SOURCE arm_sme.td
   TARGET ClangARMSmeSemaRangeChecks)
-clang_tablegen(arm_sme_streaming_attrs.inc -gen-arm-sme-streaming-attrs
-  SOURCE arm_sme.td
-  TARGET ClangARMSmeStreamingAttrs)
 clang_tablegen(arm_cde_builtins.inc -gen-arm-cde-builtin-def
   SOURCE arm_cde.td
   TARGET ClangARMCdeBuiltinsDef)
diff --git a/clang/include/clang/Basic/arm_sve.td 
b/clang/include/clang/Basic/arm_sve.td
index 3df77c931998f6..db6f17d1c493af 100644
--- a/clang/include/clang/Basic/arm_sve.td
+++ b/clang/include/clang/Basic/arm_sve.td
@@ -19,27 +19,27 @@ include "arm_sve_sme_incl.td"
 // Loads
 
 // Load one vector (scalar base)
-def SVLD1   : MInst<"svld1[_{2}]", "dPc", "csilUcUsUiUlhfd", [IsLoad, 
IsStreamingCompatible],   MemEltTyDefault, "aarch64_sve_ld1">;
-def SVLD1SB : MInst<"svld1sb_{d}", "dPS", "silUsUiUl",   [IsLoad, 
IsStreamingCompatible],   MemEltTyInt8,"aarch64_sve_ld1">;
-def SVLD1UB : MInst<"svld1ub_{d}", "dPW", "silUsUiUl",   [IsLoad, 
IsZExtReturn, IsStreamingCompatible], MemEltTyInt8,"aarch64_sve_ld1">;
-def SVLD1SH : MInst<"svld1sh_{d}", "dPT", "ilUiUl",  [IsLoad, 
IsStreamingCompatible],   MemEltTyInt16,   "aarch64_sve_ld1">;
-def SVLD1UH : MInst<"svld1uh_{d}", "dPX", "ilUiUl",  [IsLoad, 
IsZExtReturn, IsStreamingCompatible], MemEltTyInt16,   "aarch64_sve_ld1">;
-def SVLD1SW : MInst<"svld1sw_{d}", "dPU", "lUl", [IsLoad, 
IsStreamingCompatible],   MemEltTyInt32,   "aarch64_sve_ld1">;
-def SVLD1UW : MInst<"svld1uw_{d}", "dPY", "lUl", [IsLoad, 
IsZExtReturn, IsStreamingCompatible], MemEltTyInt32,   "aarch64_sve_ld1">;
+def SVLD1   : MInst<"svld1[_{2}]", "dPc", "csilUcUsUiUlhfd", [IsLoad], 
  MemEltTyDefault, "aarch64_sve_ld1">;
+def SVLD1SB : MInst<"svld1sb_{d}", "dPS", "silUsUiUl",   [IsLoad], 
  MemEltTyInt8,"aarch64_sve_ld1">;
+def SVLD1UB : MInst<"svld1ub_{d}", "dPW", "silUsUiUl",   [IsLoad, 
IsZExtReturn], MemEltTyInt8,"aarch64_sve_ld1">;
+def SVLD1SH : MInst<"svld1sh_{d}", "dPT", "ilUiUl",  [IsLoad], 
  MemEltTyInt16,   "aarch64_sve_ld1">;
+def SVLD1UH : MInst<"svld1uh_{d}", "dPX", "ilUiUl",  [IsLoad, 
IsZExtReturn], MemEltTyInt16,   "aarch64_sve_ld1">;
+def SVLD1SW : MInst<"svld1sw_{d}", "dPU", "lUl", [IsLoad], 
  MemEltTyInt32,   "aarch64_sve_ld1">;
+def SVLD1UW : MInst<"svld1uw_{d}", "dPY", "lUl", [IsLoad, 
IsZExtReturn], MemEltTyInt32,   "aarch64_sv

[clang-tools-extra] [clang] [llvm] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2023-12-14 Thread David Sherwood via cfe-commits


@@ -0,0 +1,726 @@
+
+//===- AArch64LoopIdiomTransform.cpp - Loop idiom recognition -------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "AArch64LoopIdiomTransform.h"
+#include "llvm/Analysis/DomTreeUpdater.h"
+#include "llvm/Analysis/LoopPass.h"
+#include "llvm/Analysis/TargetTransformInfo.h"
+#include "llvm/IR/Dominators.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/Intrinsics.h"
+#include "llvm/IR/MDBuilder.h"
+#include "llvm/IR/PatternMatch.h"
+#include "llvm/InitializePasses.h"
+#include "llvm/Transforms/Utils/BasicBlockUtils.h"
+
+using namespace llvm;
+
+#define DEBUG_TYPE "aarch64-lit"
+
+static cl::opt<bool>
+    DisableAll("disable-aarch64-lit-all", cl::Hidden, cl::init(false),
+               cl::desc("Disable AArch64 Loop Idiom Transform Pass."));
+
+static cl::opt<bool> DisableByteCmp(
+    "disable-aarch64-lit-bytecmp", cl::Hidden, cl::init(false),
+    cl::desc("Proceed with AArch64 Loop Idiom Transform Pass, but do "
+             "not convert byte-compare loop(s)."));
+
+namespace llvm {
+
+void initializeAArch64LoopIdiomTransformLegacyPassPass(PassRegistry &);
+Pass *createAArch64LoopIdiomTransformPass();
+
+} // end namespace llvm
+
+namespace {
+
+class AArch64LoopIdiomTransform {
+  Loop *CurLoop = nullptr;
+  DominatorTree *DT;
+  LoopInfo *LI;
+  const TargetTransformInfo *TTI;
+  const DataLayout *DL;
+
+public:
+  explicit AArch64LoopIdiomTransform(DominatorTree *DT, LoopInfo *LI,
+ const TargetTransformInfo *TTI,
+ const DataLayout *DL)
+  : DT(DT), LI(LI), TTI(TTI), DL(DL) {}
+
+  bool run(Loop *L);
+
+private:
+  /// \name Countable Loop Idiom Handling
+  /// @{
+
+  bool runOnCountableLoop();
+  bool runOnLoopBlock(BasicBlock *BB, const SCEV *BECount,
+                      SmallVectorImpl<BasicBlock *> &ExitBlocks);
+
+  bool recognizeByteCompare();
+  Value *expandFindMismatch(IRBuilder<> &Builder, GetElementPtrInst *GEPA,
+GetElementPtrInst *GEPB, Value *Start,
+Value *MaxLen);
+  void transformByteCompare(GetElementPtrInst *GEPA, GetElementPtrInst *GEPB,
+Value *MaxLen, Value *Index, Value *Start,
+bool IncIdx, BasicBlock *FoundBB,
+BasicBlock *EndBB);
+  /// @}
+};
+
+class AArch64LoopIdiomTransformLegacyPass : public LoopPass {
+public:
+  static char ID;
+
+  explicit AArch64LoopIdiomTransformLegacyPass() : LoopPass(ID) {
+initializeAArch64LoopIdiomTransformLegacyPassPass(
+*PassRegistry::getPassRegistry());
+  }
+
+  StringRef getPassName() const override {
+return "Recognize AArch64-specific loop idioms";
+  }
+
+  void getAnalysisUsage(AnalysisUsage &AU) const override {
+    AU.addRequired<LoopInfoWrapperPass>();
+    AU.addRequired<DominatorTreeWrapperPass>();
+    AU.addRequired<TargetTransformInfoWrapperPass>();
+  }
+
+  bool runOnLoop(Loop *L, LPPassManager &LPM) override;
+};
+
+bool AArch64LoopIdiomTransformLegacyPass::runOnLoop(Loop *L,
+LPPassManager &LPM) {
+
+  if (skipLoop(L))
+return false;
+
+  auto *DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();
+  auto *LI = &getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
+  auto &TTI = getAnalysis<TargetTransformInfoWrapperPass>().getTTI(
+      *L->getHeader()->getParent());
+  return AArch64LoopIdiomTransform(
+             DT, LI, &TTI, &L->getHeader()->getModule()->getDataLayout())
+      .run(L);
+}
+
+} // end anonymous namespace
+
+char AArch64LoopIdiomTransformLegacyPass::ID = 0;
+
+INITIALIZE_PASS_BEGIN(
+AArch64LoopIdiomTransformLegacyPass, "aarch64-lit",
+"Transform specific loop idioms into optimised vector forms", false, false)
+INITIALIZE_PASS_DEPENDENCY(LoopInfoWrapperPass)
+INITIALIZE_PASS_DEPENDENCY(LoopSimplify)
+INITIALIZE_PASS_DEPENDENCY(LCSSAWrapperPass)
+INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
+INITIALIZE_PASS_DEPENDENCY(TargetTransformInfoWrapperPass)
+INITIALIZE_PASS_END(
+AArch64LoopIdiomTransformLegacyPass, "aarch64-lit",
+"Transform specific loop idioms into optimised vector forms", false, false)
+
+Pass *llvm::createAArch64LoopIdiomTransformPass() {
+  return new AArch64LoopIdiomTransformLegacyPass();
+}
+
+PreservedAnalyses
+AArch64LoopIdiomTransformPass::run(Loop &L, LoopAnalysisManager &AM,
+   LoopStandardAnalysisResults &AR,
+   LPMUpdater &) {
+  if (DisableAll)
+return PreservedAnalyses::all();
+
+  const auto *DL = &L.getHeader()->getModule()->getDataLayout();
+
+  AArch64LoopIdiomTransform LIT(&AR.DT, &AR.LI, &AR.TTI, DL);
+  if (!LIT.run(&L))
+return PreservedAnalyses::all();
+
+  return PreservedAnalyses::none();
+}
+
+//===-

[clang] [AArch64][SME] Warn when using a streaming builtin from a non-streaming function (PR #74064)

2023-12-14 Thread Sam Tebbs via cfe-commits

SamTebbs33 wrote:

Thanks for reporting that Nico. I've reverted the patch and will work on 
improving compile time. I like your idea Sander.

https://github.com/llvm/llvm-project/pull/74064


[clang] Revert "[AArch64][SME] Warn when using a streaming builtin from a non-streaming function" (PR #75449)

2023-12-14 Thread via cfe-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. 
:warning:



You can test this locally with the following command:


```bash
git-clang-format --diff 3e8b175eec6fef1a073fb7d0d867fbc6a7837f57 \
  242a64cb712f02851f781e3290e2bb2f1b679c19 -- clang/include/clang/Sema/Sema.h \
  clang/lib/Sema/SemaChecking.cpp \
  clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_add-i32.c \
  clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_add-i64.c \
  clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_mopa-za32.c \
  clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_mopa-za64.c \
  clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_mops-za32.c \
  clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_mops-za64.c \
  clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_read.c \
  clang/test/CodeGen/aarch64-sme-intrinsics/acle_sme_write.c \
  clang/test/Sema/aarch64-incompat-sm-builtin-calls.c \
  clang/test/Sema/aarch64-sme-intrinsics/acle_sme_imm.cpp \
  clang/test/Sema/aarch64-sme-intrinsics/acle_sme_target.c \
  clang/utils/TableGen/NeonEmitter.cpp clang/utils/TableGen/SveEmitter.cpp \
  clang/utils/TableGen/TableGen.cpp clang/utils/TableGen/TableGenBackends.h
```





View the diff from clang-format here.


```diff
diff --git a/clang/utils/TableGen/TableGen.cpp b/clang/utils/TableGen/TableGen.cpp
index 3ad46b9598..8194bc31bc 100644
--- a/clang/utils/TableGen/TableGen.cpp
+++ b/clang/utils/TableGen/TableGen.cpp
@@ -282,11 +282,14 @@ cl::opt<ActionType> Action(
                    "Generate riscv_vector_builtin_cg.inc for clang"),
         clEnumValN(GenRISCVVectorBuiltinSema, "gen-riscv-vector-builtin-sema",
                    "Generate riscv_vector_builtin_sema.inc for clang"),
-        clEnumValN(GenRISCVSiFiveVectorBuiltins, "gen-riscv-sifive-vector-builtins",
+        clEnumValN(GenRISCVSiFiveVectorBuiltins,
+                   "gen-riscv-sifive-vector-builtins",
                    "Generate riscv_sifive_vector_builtins.inc for clang"),
-        clEnumValN(GenRISCVSiFiveVectorBuiltinCG, "gen-riscv-sifive-vector-builtin-codegen",
+        clEnumValN(GenRISCVSiFiveVectorBuiltinCG,
+                   "gen-riscv-sifive-vector-builtin-codegen",
                    "Generate riscv_sifive_vector_builtin_cg.inc for clang"),
-        clEnumValN(GenRISCVSiFiveVectorBuiltinSema, "gen-riscv-sifive-vector-builtin-sema",
+        clEnumValN(GenRISCVSiFiveVectorBuiltinSema,
+                   "gen-riscv-sifive-vector-builtin-sema",
                    "Generate riscv_sifive_vector_builtin_sema.inc for clang"),
         clEnumValN(GenAttrDocs, "gen-attr-docs",
                    "Generate attribute documentation"),

```




https://github.com/llvm/llvm-project/pull/75449


[llvm] [clang] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-14 Thread Phoebe Wang via cfe-commits

phoebewang wrote:

Can this solve https://github.com/llvm/llvm-project/issues/68566 too?

https://github.com/llvm/llvm-project/pull/75448


[clang] [llvm] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-14 Thread Jannik Silvanus via cfe-commits

https://github.com/jasilvanus updated 
https://github.com/llvm/llvm-project/pull/75448

>From 2c367fba42b716d803ee088af45c1b57fe4bcbcd Mon Sep 17 00:00:00 2001
From: Jannik Silvanus 
Date: Thu, 14 Dec 2023 09:24:51 +0100
Subject: [PATCH 1/3] [InstCombine] Precommit test exhibiting miscompile

InstCombine is determining incorrect byte offsets for GEPs
into vectors of overaligned elements.
Add a new testcase showing this behavior, serving as precommit
for a fix.
---
 .../test/Transforms/InstCombine/getelementptr.ll | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/llvm/test/Transforms/InstCombine/getelementptr.ll 
b/llvm/test/Transforms/InstCombine/getelementptr.ll
index bc7fdc9352df6c..7c0d95973d5cf3 100644
--- a/llvm/test/Transforms/InstCombine/getelementptr.ll
+++ b/llvm/test/Transforms/InstCombine/getelementptr.ll
@@ -1,7 +1,7 @@
 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt < %s -passes=instcombine -S | FileCheck %s
 
-target datalayout = "e-p:64:64-p1:16:16-p2:32:32:32-p3:64:64:64"
+target datalayout = "e-p:64:64-p1:16:16-p2:32:32:32-p3:64:64:64-f16:32"
 
 %intstruct = type { i32 }
 %pair = type { i32, i32 }
@@ -111,6 +111,20 @@ define void @test_evaluate_gep_as_ptrs_array(ptr 
addrspace(2) %B) {
   ret void
 }
 
+define void @test_overaligned_vec(i8 %B) {
+; This should be turned into a constexpr instead of being an 
instruction
+; CHECK-LABEL: @test_overaligned_vec(
+; TODO: In this test case, half is overaligned to 32 bits.
+;   Vectors are bit-packed and don't respect alignment.
+;   Thus, the byte offset of the second half in <2 x half> is 2 bytes, not 
4 bytes:
+; CHECK-NEXT:store i8 [[B:%.*]], ptr getelementptr inbounds ([10 x i8], 
ptr @Global, i64 0, i64 4), align 1
+; CHECK-NEXT:ret void
+;
+  %A = getelementptr <2 x half>, ptr @Global, i64 0, i64 1
+  store i8 %B, ptr %A
+  ret void
+}
+
 define ptr @test7(ptr %I, i64 %C, i64 %D) {
 ; CHECK-LABEL: @test7(
 ; CHECK-NEXT:[[A:%.*]] = getelementptr i32, ptr [[I:%.*]], i64 [[C:%.*]]

>From 2e80e5846d23946304bbb9930d0a12098a2f16bd Mon Sep 17 00:00:00 2001
From: Jannik Silvanus 
Date: Thu, 14 Dec 2023 09:29:59 +0100
Subject: [PATCH 2/3] [IR]: Add
 generic_gep_type_iterator::getSequentialElementStride

This prepares a fix to GEP offset computations on vectors
of overaligned elements.

We have many places that analyze GEP offsets using GEP iterators
with the following pattern:

  GTI = gep_type_begin(ElemTy, Indices),
  GTE = gep_type_end(ElemTy, Indices);
  for (; GTI != GTE; ++GTI) {
    if (StructType *STy = GTI.getStructTypeOrNull()) {
      // handle struct
      [..]
    } else {
      // handle sequential (outermost index, array, vector):
      auto Stride = DL.getTypeAllocSize(GTI.getIndexedType());
      Offset += Index * Stride;
    }
  }

This is incorrect for vectors of types whose bit size does not
equal their alloc size (e.g. overaligned types), as vectors
always bit-pack their elements.

This patch introduces new functions generic_gep_type_iterator::isVector()
and generic_gep_type_iterator::getSequentialElementStride(const DataLayout &)
to fix these patterns without having to teach all these places about
the specifics of vector bit layouts. With these helpers, the pattern
above can be fixed by replacing the stride computation:

  auto Stride = GTI.getSequentialElementStride(DL);
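
To make the difference between the two strides concrete, here is a small
standalone sketch. It is not the code from this patch and does not use LLVM:
ElemLayout, storeSizeBytes, allocSizeBytes and sequentialElementStride are
hypothetical names that only model the two size queries involved, assuming
the alloc size is the store size rounded up to the ABI alignment and that
vector elements are packed at their store size.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the layout facts that matter for a GEP stride.
struct ElemLayout {
  uint64_t SizeInBits;    // value size, e.g. 16 for half
  uint64_t ABIAlignBytes; // ABI alignment, e.g. 4 under "f16:32"
};

// Bytes needed to store one element.
inline uint64_t storeSizeBytes(ElemLayout E) { return (E.SizeInBits + 7) / 8; }

// Store size rounded up to the ABI alignment: the distance between
// consecutive elements of an array (or for the outermost index).
inline uint64_t allocSizeBytes(ElemLayout E) {
  uint64_t Store = storeSizeBytes(E);
  return (Store + E.ABIAlignBytes - 1) / E.ABIAlignBytes * E.ABIAlignBytes;
}

// Stride for one sequential GEP index. Vectors pack their elements, so
// alignment padding must not be counted for them.
inline uint64_t sequentialElementStride(ElemLayout E, bool IsVector) {
  if (IsVector) {
    // This sketch only handles byte-addressable vector elements.
    assert(E.SizeInBits % 8 == 0 && "element is not a whole number of bytes");
    return storeSizeBytes(E);
  }
  return allocSizeBytes(E);
}
```

With ElemLayout{16, 4}, i.e. half overaligned to 32 bits as in the precommit
test, the array stride is 4 bytes while the vector stride is 2 bytes, so
index 1 of <2 x half> sits at byte offset 2.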
---
 .../llvm/IR/GetElementPtrTypeIterator.h   | 57 ++-
 1 file changed, 54 insertions(+), 3 deletions(-)

diff --git a/llvm/include/llvm/IR/GetElementPtrTypeIterator.h 
b/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
index f3272327c3f8b2..5b63ccb182a842 100644
--- a/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
+++ b/llvm/include/llvm/IR/GetElementPtrTypeIterator.h
@@ -16,6 +16,7 @@
 
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/PointerUnion.h"
+#include "llvm/IR/DataLayout.h"
 #include "llvm/IR/DerivedTypes.h"
 #include "llvm/IR/Operator.h"
 #include "llvm/IR/User.h"
@@ -30,7 +31,39 @@ template 
 class generic_gep_type_iterator {
 
   ItTy OpIt;
-  PointerUnion CurTy;
+  // We use two different mechanisms to store the type a GEP index applies to.
+  // In some cases, we need to know the outer aggregate type the index is
+  // applied within, e.g. a struct. In such cases, we store the aggregate type
+  // in the iterator, and derive the element type on the fly.
+  //
+  // However, this is not always possible, because for the outermost index 
there
+  // is no containing type. In such cases, or if the containing type is not
+  // relevant, e.g. for arrays, the element type is stored as Type* in CurTy.
+  //
+  // If CurTy contains a Type* value, this does not imply anything about the
+  // type itself, because it is the element type and not the outer type.
+  // In particular, Type* can be a struct type.
+  //
+  // Consider this example:
+  //
+  //%my.struct = type { i32, [ 4 x float ] }
+  //[...]
+  //%gep = 

[clang] 2eb1e75 - [clang][NFC] Inline some lambdas to their only call site

2023-12-14 Thread Timm Bäder via cfe-commits

Author: Timm Bäder
Date: 2023-12-14T10:47:32+01:00
New Revision: 2eb1e75f42d7e09e97907f535bfa749722722dbd

URL: 
https://github.com/llvm/llvm-project/commit/2eb1e75f42d7e09e97907f535bfa749722722dbd
DIFF: 
https://github.com/llvm/llvm-project/commit/2eb1e75f42d7e09e97907f535bfa749722722dbd.diff

LOG: [clang][NFC] Inline some lambdas to their only call site

Added: 


Modified: 
clang/lib/Parse/ParseExprCXX.cpp

Removed: 




diff  --git a/clang/lib/Parse/ParseExprCXX.cpp 
b/clang/lib/Parse/ParseExprCXX.cpp
index 2fc364fc811b32..ef9ea6575205cd 100644
--- a/clang/lib/Parse/ParseExprCXX.cpp
+++ b/clang/lib/Parse/ParseExprCXX.cpp
@@ -1311,18 +1311,6 @@ ExprResult Parser::ParseLambdaExpressionAfterIntroducer(
 D.takeAttributes(Attributes);
   }
 
-  // Helper to emit a warning if we see a CUDA host/device/global attribute
-  // after '(...)'. nvcc doesn't accept this.
-  auto WarnIfHasCUDATargetAttr = [&] {
-if (getLangOpts().CUDA)
-  for (const ParsedAttr &A : Attributes)
-if (A.getKind() == ParsedAttr::AT_CUDADevice ||
-A.getKind() == ParsedAttr::AT_CUDAHost ||
-A.getKind() == ParsedAttr::AT_CUDAGlobal)
-  Diag(A.getLoc(), diag::warn_cuda_attr_lambda_position)
-  << A.getAttrName()->getName();
-  };
-
   MultiParseScope TemplateParamScope(*this);
   if (Tok.is(tok::less)) {
 Diag(Tok, getLangOpts().CPlusPlus20
@@ -1377,91 +1365,6 @@ ExprResult Parser::ParseLambdaExpressionAfterIntroducer(
   bool HasSpecifiers = false;
   SourceLocation MutableLoc;
 
-  auto ParseConstexprAndMutableSpecifiers = [&] {
-// GNU-style attributes must be parsed before the mutable specifier to
-// be compatible with GCC. MSVC-style attributes must be parsed before
-// the mutable specifier to be compatible with MSVC.
-MaybeParseAttributes(PAKM_GNU | PAKM_Declspec, Attributes);
-// Parse mutable-opt and/or constexpr-opt or consteval-opt, and update
-// the DeclEndLoc.
-SourceLocation ConstexprLoc;
-SourceLocation ConstevalLoc;
-SourceLocation StaticLoc;
-
-tryConsumeLambdaSpecifierToken(*this, MutableLoc, StaticLoc, ConstexprLoc,
-   ConstevalLoc, DeclEndLoc);
-
-DiagnoseStaticSpecifierRestrictions(*this, StaticLoc, MutableLoc, Intro);
-
-addStaticToLambdaDeclSpecifier(*this, StaticLoc, DS);
-addConstexprToLambdaDeclSpecifier(*this, ConstexprLoc, DS);
-addConstevalToLambdaDeclSpecifier(*this, ConstevalLoc, DS);
-  };
-
-  auto ParseLambdaSpecifiers =
-  [&](MutableArrayRef ParamInfo,
-  SourceLocation EllipsisLoc) {
-// Parse exception-specification[opt].
-ExceptionSpecificationType ESpecType = EST_None;
-SourceRange ESpecRange;
-SmallVector DynamicExceptions;
-SmallVector DynamicExceptionRanges;
-ExprResult NoexceptExpr;
-CachedTokens *ExceptionSpecTokens;
-
-ESpecType = tryParseExceptionSpecification(
-/*Delayed=*/false, ESpecRange, DynamicExceptions,
-DynamicExceptionRanges, NoexceptExpr, ExceptionSpecTokens);
-
-if (ESpecType != EST_None)
-  DeclEndLoc = ESpecRange.getEnd();
-
-// Parse attribute-specifier[opt].
-if (MaybeParseCXX11Attributes(Attributes))
-  DeclEndLoc = Attributes.Range.getEnd();
-
-// Parse OpenCL addr space attribute.
-if (Tok.isOneOf(tok::kw___private, tok::kw___global, tok::kw___local,
-tok::kw___constant, tok::kw___generic)) {
-  ParseOpenCLQualifiers(DS.getAttributes());
-  ConsumeToken();
-}
-
-SourceLocation FunLocalRangeEnd = DeclEndLoc;
-
-// Parse trailing-return-type[opt].
-if (Tok.is(tok::arrow)) {
-  FunLocalRangeEnd = Tok.getLocation();
-  SourceRange Range;
-  TrailingReturnType = ParseTrailingReturnType(
-  Range, /*MayBeFollowedByDirectInit*/ false);
-  TrailingReturnTypeLoc = Range.getBegin();
-  if (Range.getEnd().isValid())
-DeclEndLoc = Range.getEnd();
-}
-
-SourceLocation NoLoc;
-D.AddTypeInfo(
-DeclaratorChunk::getFunction(
-/*HasProto=*/true,
-/*IsAmbiguous=*/false, LParenLoc, ParamInfo.data(),
-ParamInfo.size(), EllipsisLoc, RParenLoc,
-/*RefQualifierIsLvalueRef=*/true,
-/*RefQualifierLoc=*/NoLoc, MutableLoc, ESpecType, ESpecRange,
-DynamicExceptions.data(), DynamicExceptionRanges.data(),
-DynamicExceptions.size(),
-NoexceptExpr.isUsable() ? NoexceptExpr.get() : nullptr,
-/*ExceptionSpecTokens*/ nullptr,
-/*DeclsInPrototype=*/std::nullopt, LParenLoc, FunLocalRangeEnd,
-D, TrailingReturnType, TrailingReturnTypeLoc, &DS),
-std::move(Attri

[clang] [llvm] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-14 Thread via cfe-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. 
:warning:



You can test this locally with the following command:


```bash
git-clang-format --diff 3e8b175eec6fef1a073fb7d0d867fbc6a7837f57 \
  4649d44cece1db0d38e00425491137c44c3aab8e -- clang/lib/CodeGen/CGExprScalar.cpp \
  llvm/include/llvm/Analysis/TargetTransformInfoImpl.h \
  llvm/include/llvm/IR/GetElementPtrTypeIterator.h \
  llvm/lib/Analysis/BasicAliasAnalysis.cpp llvm/lib/Analysis/InlineCost.cpp \
  llvm/lib/Analysis/Local.cpp llvm/lib/Analysis/LoopAccessAnalysis.cpp \
  llvm/lib/Analysis/ValueTracking.cpp llvm/lib/CodeGen/CodeGenPrepare.cpp \
  llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp \
  llvm/lib/CodeGen/SelectionDAG/FastISel.cpp \
  llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp \
  llvm/lib/ExecutionEngine/Interpreter/Execution.cpp llvm/lib/IR/DataLayout.cpp \
  llvm/lib/IR/Operator.cpp llvm/lib/IR/Value.cpp \
  llvm/lib/Target/AArch64/AArch64FastISel.cpp llvm/lib/Target/ARM/ARMFastISel.cpp \
  llvm/lib/Target/Mips/MipsFastISel.cpp llvm/lib/Target/PowerPC/PPCFastISel.cpp \
  llvm/lib/Target/RISCV/RISCVGatherScatterLowering.cpp \
  llvm/lib/Target/WebAssembly/WebAssemblyFastISel.cpp \
  llvm/lib/Target/X86/X86FastISel.cpp llvm/lib/Transforms/Scalar/SROA.cpp \
  llvm/lib/Transforms/Scalar/SeparateConstOffsetFromGEP.cpp \
  llvm/lib/Transforms/Scalar/StraightLineStrengthReduce.cpp \
  llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp
```





View the diff from clang-format here.


```diff
diff --git a/clang/lib/CodeGen/CGExprScalar.cpp b/clang/lib/CodeGen/CGExprScalar.cpp
index 12da11d203..989146fd47 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -5292,8 +5292,8 @@ static GEPOffsetAndOverflow EmitGEPOffsetInBytes(Value *BasePtr, Value *GEPVal,
     } else {
       // Otherwise this is array-like indexing. The local offset is the index
       // multiplied by the element size.
-      auto *ElementSize = llvm::ConstantInt::get(
-          IntPtrTy, GTI.getSequentialElementStride(DL));
+      auto *ElementSize =
+          llvm::ConstantInt::get(IntPtrTy, GTI.getSequentialElementStride(DL));
       auto *IndexS = Builder.CreateIntCast(Index, IntPtrTy, /*isSigned=*/true);
       LocalOffset = eval(BO_Mul, ElementSize, IndexS);
     }

```




https://github.com/llvm/llvm-project/pull/75448


[clang] [Clang][AArch64]Add QCVTN builtin to SVE2.1 (PR #75454)

2023-12-14 Thread via cfe-commits

https://github.com/CarolineConcatto created 
https://github.com/llvm/llvm-project/pull/75454

 ``` c
   // All the intrinsics below are [SVE2.1 or SME2]
   // Variants are also available for _u16[_s32]_x2 and _u16[_u32]_x2
   svint16_t svqcvtn_s16[_s32_x2](svint32x2_t zn);
   ```

According to PR#257[1]

[1]https://github.com/ARM-software/acle/pull/257
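
For readers unfamiliar with the operation, the following is a scalar reference
sketch of what a "saturating extract narrow and interleave" of two 32-bit
vectors to one 16-bit vector computes. It is only a model under stated
assumptions, not the builtin and not code from this PR: saturateToS16 and
qcvtnModel are hypothetical names, the vector length is a fixed template
parameter rather than scalable, and the two-way interleaving order (even
result lanes from the first source vector, odd lanes from the second) is
assumed from the instruction's name and description.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Clamp a signed 32-bit value into the int16_t range.
inline int16_t saturateToS16(int32_t V) {
  if (V > std::numeric_limits<int16_t>::max())
    return std::numeric_limits<int16_t>::max();
  if (V < std::numeric_limits<int16_t>::min())
    return std::numeric_limits<int16_t>::min();
  return static_cast<int16_t>(V);
}

// Scalar model (assumed semantics): narrow both inputs with saturation and
// interleave them, so Out[2*I] comes from Zn0[I] and Out[2*I + 1] from Zn1[I].
template <std::size_t N>
std::array<int16_t, 2 * N> qcvtnModel(const std::array<int32_t, N> &Zn0,
                                      const std::array<int32_t, N> &Zn1) {
  std::array<int16_t, 2 * N> Out{};
  for (std::size_t I = 0; I < N; ++I) {
    Out[2 * I] = saturateToS16(Zn0[I]);
    Out[2 * I + 1] = saturateToS16(Zn1[I]);
  }
  return Out;
}
```

The unsigned variants mentioned in the comment above would clamp to the
uint16_t range instead.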

>From 3508b4fbd9b4b9b51553a590b237e443fb58e098 Mon Sep 17 00:00:00 2001
From: Caroline Concatto 
Date: Thu, 14 Dec 2023 09:50:36 +
Subject: [PATCH] [Clang][AArch64]Add QCVTN builtin to SVE2.1

 ``` c
   // All the intrinsics below are [SVE2.1 or SME2]
   // Variants are also available for _u16[_s32]_x2 and _u16[_u32]_x2
   svint16_t svqcvtn_s16[_s32_x2](svint32x2_t zn);
   ```

According to PR#257[1]

[1]https://github.com/ARM-software/acle/pull/257
---
 clang/include/clang/Basic/arm_sve.td  |  4 +-
 .../acle_sve2p1_qcvtn.c   | 78 +++
 2 files changed, 81 insertions(+), 1 deletion(-)
 create mode 100644 
clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qcvtn.c

diff --git a/clang/include/clang/Basic/arm_sve.td 
b/clang/include/clang/Basic/arm_sve.td
index db6f17d1c493af..6979e65fbf4cb4 100644
--- a/clang/include/clang/Basic/arm_sve.td
+++ b/clang/include/clang/Basic/arm_sve.td
@@ -2266,11 +2266,13 @@ let TargetGuard = "sme2" in {
 //
 // Multi-vector saturating extract narrow and interleave
 //
-let TargetGuard = "sme2" in {
+let TargetGuard = "sme2|sve2p1" in {
   def SVQCVTN_S16_S32_X2 : SInst<"svqcvtn_s16[_{d}_x2]", "h2.d", "i", 
MergeNone, "aarch64_sve_sqcvtn_x2", [IsStreamingCompatible], []>;
   def SVQCVTN_U16_U32_X2 : SInst<"svqcvtn_u16[_{d}_x2]", "e2.d", "Ui", 
MergeNone, "aarch64_sve_uqcvtn_x2", [IsStreamingCompatible], []>;
   def SVQCVTN_U16_S32_X2 : SInst<"svqcvtn_u16[_{d}_x2]", "e2.d", "i", 
MergeNone, "aarch64_sve_sqcvtun_x2", [IsStreamingCompatible], []>;
+}
 
+let TargetGuard = "sme2" in {
   def SVQCVTN_S8_S32_X4 : SInst<"svqcvtn_s8[_{d}_x4]", "q4.d", "i", MergeNone, 
"aarch64_sve_sqcvtn_x4", [IsStreaming], []>;
   def SVQCVTN_U8_U32_X4 : SInst<"svqcvtn_u8[_{d}_x4]", "b4.d", "Ui", 
MergeNone, "aarch64_sve_uqcvtn_x4", [IsStreaming], []>;
   def SVQCVTN_U8_S32_X4 : SInst<"svqcvtn_u8[_{d}_x4]", "b4.d", "i", MergeNone, 
"aarch64_sve_sqcvtun_x4", [IsStreaming], []>;
diff --git a/clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qcvtn.c 
b/clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qcvtn.c
new file mode 100644
index 00..477b7b0a08e671
--- /dev/null
+++ b/clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qcvtn.c
@@ -0,0 +1,78 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+
+// REQUIRES: aarch64-registered-target
+
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve2p1 
-target-feature +bf16 -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | 
opt -S -p mem2reg,instcombine,tailcallelim | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve1p1 
-target-feature +sme2 -target-feature +bf16 -DSME2_STANDALONE_TEST -S 
-disable-O0-optnone -Werror -Wall -emit-llvm -o - -x c++ %s | opt -S -p 
mem2reg,instcombine,tailcallelim | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1  -D__SVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve2p1 -target-feature +bf16 -S -disable-O0-optnone -Werror 
-Wall -emit-llvm -o - %s | opt -S -p mem2reg,instcombine,tailcallelim | 
FileCheck %s
+// RUN: %clang_cc1  -D__SVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve2p1 -target-feature +bf16 -S -disable-O0-optnone -Werror 
-Wall -emit-llvm -o - -x c++ %s | opt -S -p mem2reg,instcombine,tailcallelim | 
FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve2p1 
-target-feature +bf16 -S -disable-O0-optnone -Werror -Wall -o /dev/null %s
+
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-target-feature +sme2 -target-feature +bf16 -DSME2_STANDALONE_TEST -S 
-disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | opt -S -p 
mem2reg,instcombine,tailcallelim | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-target-feature +sme2 -target-feature +bf16 -DSME2_STANDALONE_TEST -S 
-disable-O0-optnone -Werror -Wall -emit-llvm -o - -x c++ %s | opt -S -p 
mem2reg,instcombine,tailcallelim | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1  -D__SVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve -target-feature +sme2 -target-feature +bf16 
-DSME2_STANDALONE_TEST -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - %s 
| opt -S -p mem2reg,instcombine,tailcallelim | FileCheck %s
+// RUN: %clang_cc1  -D__SVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve -target-feature +sme2 -target-feature +bf16 
-DSME2_STANDALONE_TEST -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - -x 
c++ %s | opt -S 

[clang] [Clang][AArch64]Add QCVTN builtin to SVE2.1 (PR #75454)

2023-12-14 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: None (CarolineConcatto)


Changes

 ``` c
   // All the intrinsics below are [SVE2.1 or SME2]
   // Variants are also available for _u16[_s32]_x2 and _u16[_u32]_x2
   svint16_t svqcvtn_s16[_s32_x2](svint32x2_t zn);
   ```

This follows ACLE PR #257 [1].

[1] https://github.com/ARM-software/acle/pull/257

---
Full diff: https://github.com/llvm/llvm-project/pull/75454.diff


2 Files Affected:

- (modified) clang/include/clang/Basic/arm_sve.td (+3-1) 
- (added) clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qcvtn.c 
(+78) 


``diff
diff --git a/clang/include/clang/Basic/arm_sve.td 
b/clang/include/clang/Basic/arm_sve.td
index db6f17d1c493af..6979e65fbf4cb4 100644
--- a/clang/include/clang/Basic/arm_sve.td
+++ b/clang/include/clang/Basic/arm_sve.td
@@ -2266,11 +2266,13 @@ let TargetGuard = "sme2" in {
 //
 // Multi-vector saturating extract narrow and interleave
 //
-let TargetGuard = "sme2" in {
+let TargetGuard = "sme2|sve2p1" in {
   def SVQCVTN_S16_S32_X2 : SInst<"svqcvtn_s16[_{d}_x2]", "h2.d", "i", 
MergeNone, "aarch64_sve_sqcvtn_x2", [IsStreamingCompatible], []>;
   def SVQCVTN_U16_U32_X2 : SInst<"svqcvtn_u16[_{d}_x2]", "e2.d", "Ui", 
MergeNone, "aarch64_sve_uqcvtn_x2", [IsStreamingCompatible], []>;
   def SVQCVTN_U16_S32_X2 : SInst<"svqcvtn_u16[_{d}_x2]", "e2.d", "i", 
MergeNone, "aarch64_sve_sqcvtun_x2", [IsStreamingCompatible], []>;
+}
 
+let TargetGuard = "sme2" in {
   def SVQCVTN_S8_S32_X4 : SInst<"svqcvtn_s8[_{d}_x4]", "q4.d", "i", MergeNone, 
"aarch64_sve_sqcvtn_x4", [IsStreaming], []>;
   def SVQCVTN_U8_U32_X4 : SInst<"svqcvtn_u8[_{d}_x4]", "b4.d", "Ui", 
MergeNone, "aarch64_sve_uqcvtn_x4", [IsStreaming], []>;
   def SVQCVTN_U8_S32_X4 : SInst<"svqcvtn_u8[_{d}_x4]", "b4.d", "i", MergeNone, 
"aarch64_sve_sqcvtun_x4", [IsStreaming], []>;
diff --git a/clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qcvtn.c 
b/clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qcvtn.c
new file mode 100644
index 00..477b7b0a08e671
--- /dev/null
+++ b/clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qcvtn.c
@@ -0,0 +1,78 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+
+// REQUIRES: aarch64-registered-target
+
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve2p1 
-target-feature +bf16 -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | 
opt -S -p mem2reg,instcombine,tailcallelim | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve1p1 
-target-feature +sme2 -target-feature +bf16 -DSME2_STANDALONE_TEST -S 
-disable-O0-optnone -Werror -Wall -emit-llvm -o - -x c++ %s | opt -S -p 
mem2reg,instcombine,tailcallelim | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1  -D__SVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve2p1 -target-feature +bf16 -S -disable-O0-optnone -Werror 
-Wall -emit-llvm -o - %s | opt -S -p mem2reg,instcombine,tailcallelim | 
FileCheck %s
+// RUN: %clang_cc1  -D__SVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve2p1 -target-feature +bf16 -S -disable-O0-optnone -Werror 
-Wall -emit-llvm -o - -x c++ %s | opt -S -p mem2reg,instcombine,tailcallelim | 
FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve2p1 
-target-feature +bf16 -S -disable-O0-optnone -Werror -Wall -o /dev/null %s
+
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-target-feature +sme2 -target-feature +bf16 -DSME2_STANDALONE_TEST -S 
-disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | opt -S -p 
mem2reg,instcombine,tailcallelim | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-target-feature +sme2 -target-feature +bf16 -DSME2_STANDALONE_TEST -S 
-disable-O0-optnone -Werror -Wall -emit-llvm -o - -x c++ %s | opt -S -p 
mem2reg,instcombine,tailcallelim | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1  -D__SVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve -target-feature +sme2 -target-feature +bf16 
-DSME2_STANDALONE_TEST -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - %s 
| opt -S -p mem2reg,instcombine,tailcallelim | FileCheck %s
+// RUN: %clang_cc1  -D__SVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve -target-feature +sme2 -target-feature +bf16 
-DSME2_STANDALONE_TEST -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - -x 
c++ %s | opt -S -p mem2reg,instcombine,tailcallelim | FileCheck %s 
-check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-target-feature +sme2 -target-feature +bf16 -DSME2_STANDALONE_TEST -S 
-disable-O0-optnone -Werror -Wall -o /dev/null %s
+
+#include 
+
+#ifdef SVE_OVERLOADED_FORMS
+// A simple used,unused... macro, long enough to represent any SVE builtin.
+#define SVE_ACLE_FUNC(A1,A2_UNUSED,A3,A4_UNUSED) A1##A3
+#else
+#define SVE_ACLE_FUNC(A1,A2

[clang] [clang] Accept recursive non-dependent calls to functions with deduced return type (PR #75456)

2023-12-14 Thread Mariya Podchishchaeva via cfe-commits

https://github.com/Fznamznon created 
https://github.com/llvm/llvm-project/pull/75456

Treat such calls as dependent since it is much easier to implement.

Fixes https://github.com/llvm/llvm-project/issues/71015

>From 0e190f131862dd8f4b07891c3ee712a0a163f936 Mon Sep 17 00:00:00 2001
From: "Podchishchaeva, Mariya" 
Date: Thu, 14 Dec 2023 01:33:17 -0800
Subject: [PATCH] [clang] Accept recursive non-dependent calls to functions
 with deduced return type

Treat such calls as dependent since it is much easier to implement.

Fixes https://github.com/llvm/llvm-project/issues/71015
---
 clang/docs/ReleaseNotes.rst   |  3 ++
 clang/lib/AST/ComputeDependence.cpp   |  2 ++
 clang/lib/Sema/SemaOverload.cpp   | 18 ++
 .../SemaCXX/deduced-return-type-cxx14.cpp | 33 +++
 4 files changed, 56 insertions(+)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 05d59d0da264f3..9ffc7500414981 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -688,6 +688,9 @@ Bug Fixes in This Version
 - Fixed false positive error emitted when templated alias inside a class
   used private members of the same class.
   Fixes (`#41693 `_)
+- Clang now accepts recursive non-dependent calls to functions with deduced return
+  type.
+  Fixes (`#71015 <https://github.com/llvm/llvm-project/issues/71015>`_)
 
 Bug Fixes to Compiler Builtins
 ^^
diff --git a/clang/lib/AST/ComputeDependence.cpp 
b/clang/lib/AST/ComputeDependence.cpp
index 097753fd3267b5..584b58473294be 100644
--- a/clang/lib/AST/ComputeDependence.cpp
+++ b/clang/lib/AST/ComputeDependence.cpp
@@ -603,6 +603,8 @@ ExprDependence clang::computeDependence(PredefinedExpr *E) {
 ExprDependence clang::computeDependence(CallExpr *E,
 llvm::ArrayRef PreArgs) {
   auto D = E->getCallee()->getDependence();
+  if (E->getType()->isDependentType())
+D |= ExprDependence::Type;
   for (auto *A : llvm::ArrayRef(E->getArgs(), E->getNumArgs())) {
 if (A)
   D |= A->getDependence();
diff --git a/clang/lib/Sema/SemaOverload.cpp b/clang/lib/Sema/SemaOverload.cpp
index 5026e1d603e5ee..9fb767101e1eb7 100644
--- a/clang/lib/Sema/SemaOverload.cpp
+++ b/clang/lib/Sema/SemaOverload.cpp
@@ -13994,6 +13994,24 @@ ExprResult Sema::BuildOverloadedCallExpr(Scope *S, 
Expr *Fn,
   OverloadCandidateSet::iterator Best;
   OverloadingResult OverloadResult =
   CandidateSet.BestViableFunction(*this, Fn->getBeginLoc(), Best);
+  FunctionDecl *FDecl = Best->Function;
+
+  // Model the case with a call to a templated function whose definition
+  // encloses the call and whose return type contains a placeholder type as if
+  // the UnresolvedLookupExpr was type-dependent.
+  if (OverloadResult == OR_Success && FDecl &&
+  FDecl->isTemplateInstantiation() &&
+  FDecl->getReturnType()->isUndeducedType()) {
+if (auto TP = FDecl->getTemplateInstantiationPattern(false)) {
+  if (TP->willHaveBody()) {
+CallExpr *CE =
+CallExpr::Create(Context, Fn, Args, Context.DependentTy, 
VK_PRValue,
+ RParenLoc, CurFPFeatureOverrides());
+result = CE;
+return result;
+  }
+}
+  }
 
   return FinishOverloadedCallExpr(*this, S, Fn, ULE, LParenLoc, Args, 
RParenLoc,
   ExecConfig, &CandidateSet, &Best,
diff --git a/clang/test/SemaCXX/deduced-return-type-cxx14.cpp 
b/clang/test/SemaCXX/deduced-return-type-cxx14.cpp
index 6344d1df3fbaeb..1da597499d34f5 100644
--- a/clang/test/SemaCXX/deduced-return-type-cxx14.cpp
+++ b/clang/test/SemaCXX/deduced-return-type-cxx14.cpp
@@ -640,3 +640,36 @@ namespace PR46637 {
   template struct Y { T x; };
   Y auto> y; // expected-error {{'auto' not allowed in template 
argument}}
 }
+
+namespace GH71015 {
+
+// Check that there is no error in case a templated function is recursive and
+// has a placeholder return type.
+struct Node {
+  int value;
+  Node* left;
+  Node* right;
+};
+
+bool parse(const char*);
+Node* parsePrimaryExpr();
+
+auto parseMulExpr(auto node) { // cxx14-error {{'auto' not allowed in function 
prototype}}
+  if (node == nullptr) node = parsePrimaryExpr();
+  if (!parse("*")) return node;
+  return parseMulExpr(new Node{.left = node, .right = parsePrimaryExpr()});
+}
+
+template <typename T>
+auto parseMulExpr2(T node) {
+  if (node == nullptr) node = parsePrimaryExpr();
+  if (!parse("*")) return node;
+  return parseMulExpr2(new Node{.left = node, .right = parsePrimaryExpr()});
+}
+
+auto f(auto x) { // cxx14-error {{'auto' not allowed in function prototype}}
+  if (x == 0) return 0;
+  return f(1) + 1;
+}
+
+}



[clang] [clang] Accept recursive non-dependent calls to functions with deduced return type (PR #75456)

2023-12-14 Thread Mariya Podchishchaeva via cfe-commits

Fznamznon wrote:

This attempts to implement the approach described by @zygoloid in 
https://github.com/llvm/llvm-project/issues/71015#issuecomment-1828745626 .

https://github.com/llvm/llvm-project/pull/75456


[clang] [clang] Accept recursive non-dependent calls to functions with deduced return type (PR #75456)

2023-12-14 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Mariya Podchishchaeva (Fznamznon)


Changes

Treat such calls as dependent since it is much easier to implement.

Fixes https://github.com/llvm/llvm-project/issues/71015

---
Full diff: https://github.com/llvm/llvm-project/pull/75456.diff


4 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+3) 
- (modified) clang/lib/AST/ComputeDependence.cpp (+2) 
- (modified) clang/lib/Sema/SemaOverload.cpp (+18) 
- (modified) clang/test/SemaCXX/deduced-return-type-cxx14.cpp (+33) 


``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 05d59d0da264f3..9ffc7500414981 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -688,6 +688,9 @@ Bug Fixes in This Version
 - Fixed false positive error emitted when templated alias inside a class
   used private members of the same class.
   Fixes (`#41693 `_)
+- Clang now accepts recursive non-dependent calls to functions with deduced return
+  type.
+  Fixes (`#71015 <https://github.com/llvm/llvm-project/issues/71015>`_)
 
 Bug Fixes to Compiler Builtins
 ^^
diff --git a/clang/lib/AST/ComputeDependence.cpp 
b/clang/lib/AST/ComputeDependence.cpp
index 097753fd3267b5..584b58473294be 100644
--- a/clang/lib/AST/ComputeDependence.cpp
+++ b/clang/lib/AST/ComputeDependence.cpp
@@ -603,6 +603,8 @@ ExprDependence clang::computeDependence(PredefinedExpr *E) {
 ExprDependence clang::computeDependence(CallExpr *E,
 llvm::ArrayRef PreArgs) {
   auto D = E->getCallee()->getDependence();
+  if (E->getType()->isDependentType())
+D |= ExprDependence::Type;
   for (auto *A : llvm::ArrayRef(E->getArgs(), E->getNumArgs())) {
 if (A)
   D |= A->getDependence();
diff --git a/clang/lib/Sema/SemaOverload.cpp b/clang/lib/Sema/SemaOverload.cpp
index 5026e1d603e5ee..9fb767101e1eb7 100644
--- a/clang/lib/Sema/SemaOverload.cpp
+++ b/clang/lib/Sema/SemaOverload.cpp
@@ -13994,6 +13994,24 @@ ExprResult Sema::BuildOverloadedCallExpr(Scope *S, 
Expr *Fn,
   OverloadCandidateSet::iterator Best;
   OverloadingResult OverloadResult =
   CandidateSet.BestViableFunction(*this, Fn->getBeginLoc(), Best);
+  FunctionDecl *FDecl = Best->Function;
+
+  // Model the case with a call to a templated function whose definition
+  // encloses the call and whose return type contains a placeholder type as if
+  // the UnresolvedLookupExpr was type-dependent.
+  if (OverloadResult == OR_Success && FDecl &&
+  FDecl->isTemplateInstantiation() &&
+  FDecl->getReturnType()->isUndeducedType()) {
+if (auto TP = FDecl->getTemplateInstantiationPattern(false)) {
+  if (TP->willHaveBody()) {
+CallExpr *CE =
+CallExpr::Create(Context, Fn, Args, Context.DependentTy, 
VK_PRValue,
+ RParenLoc, CurFPFeatureOverrides());
+result = CE;
+return result;
+  }
+}
+  }
 
   return FinishOverloadedCallExpr(*this, S, Fn, ULE, LParenLoc, Args, 
RParenLoc,
   ExecConfig, &CandidateSet, &Best,
diff --git a/clang/test/SemaCXX/deduced-return-type-cxx14.cpp 
b/clang/test/SemaCXX/deduced-return-type-cxx14.cpp
index 6344d1df3fbaeb..1da597499d34f5 100644
--- a/clang/test/SemaCXX/deduced-return-type-cxx14.cpp
+++ b/clang/test/SemaCXX/deduced-return-type-cxx14.cpp
@@ -640,3 +640,36 @@ namespace PR46637 {
   template struct Y { T x; };
   Y auto> y; // expected-error {{'auto' not allowed in template 
argument}}
 }
+
+namespace GH71015 {
+
+// Check that there is no error in case a templated function is recursive and
+// has a placeholder return type.
+struct Node {
+  int value;
+  Node* left;
+  Node* right;
+};
+
+bool parse(const char*);
+Node* parsePrimaryExpr();
+
+auto parseMulExpr(auto node) { // cxx14-error {{'auto' not allowed in function 
prototype}}
+  if (node == nullptr) node = parsePrimaryExpr();
+  if (!parse("*")) return node;
+  return parseMulExpr(new Node{.left = node, .right = parsePrimaryExpr()});
+}
+
+template <typename T>
+auto parseMulExpr2(T node) {
+  if (node == nullptr) node = parsePrimaryExpr();
+  if (!parse("*")) return node;
+  return parseMulExpr2(new Node{.left = node, .right = parsePrimaryExpr()});
+}
+
+auto f(auto x) { // cxx14-error {{'auto' not allowed in function prototype}}
+  if (x == 0) return 0;
+  return f(1) + 1;
+}
+
+}

``




https://github.com/llvm/llvm-project/pull/75456


[clang-tools-extra] Allow to pass config file to clang-tidy-diff (PR #75457)

2023-12-14 Thread Michael Lettrich via cfe-commits

https://github.com/MichaelLettrich created 
https://github.com/llvm/llvm-project/pull/75457

Adds a `-config-file` command-line option that passes the path of a 
`.clang-tidy` or custom config file through to the `clang-tidy` executable.
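
The forwarding described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the script itself: `build_clang_tidy_args` is a hypothetical helper name, and only the `-checks`, `-config-file`, and `-quiet` options shown in the patch are modelled.

```python
import argparse


def build_clang_tidy_args(argv):
    """Map a few clang-tidy-diff style options onto clang-tidy arguments.

    Hypothetical helper for illustration; the real script builds this list
    inline inside main().
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("-checks", default="")
    # The option added by the patch: empty string means "not given".
    parser.add_argument("-config-file", dest="config_file", default="")
    parser.add_argument("-quiet", action="store_true")
    args = parser.parse_args(argv)

    common_clang_tidy_args = []
    if args.checks != "":
        common_clang_tidy_args.append("-checks=" + args.checks)
    # Forward the path only when the user supplied one.
    if args.config_file != "":
        common_clang_tidy_args.append("-config-file=" + args.config_file)
    if args.quiet:
        common_clang_tidy_args.append("-quiet")
    return common_clang_tidy_args


if __name__ == "__main__":
    print(build_clang_tidy_args(["-config-file", "custom.yaml", "-quiet"]))
```

Using an empty-string default, as the patch does, keeps the new option consistent with the existing `-checks` handling: nothing is forwarded unless a value was supplied.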

>From 05aff16d9b117e7e04c5342ec1792c91ef41e48b Mon Sep 17 00:00:00 2001
From: Michael Lettrich 
Date: Thu, 14 Dec 2023 11:31:28 +0100
Subject: [PATCH] Allow to pass config file to clang-tidy-diff

Adds a `-config-file` command-line option that passes the path of a 
`.clang-tidy` or custom config file through to the `clang-tidy` executable.
---
 clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py | 4 
 1 file changed, 4 insertions(+)

diff --git a/clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py 
b/clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py
index 8817e2914f6e25..53c990f58a7edc 100755
--- a/clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py
+++ b/clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py
@@ -173,6 +173,8 @@ def main():
 help="checks filter, when not specified, use clang-tidy " "default",
 default="",
 )
+parser.add_argument("-config-file", dest="config_file",
+help="Specify the path of .clang-tidy or custom config 
file",default="")
 parser.add_argument("-use-color", action="store_true", help="Use colors in 
output")
 parser.add_argument(
 "-path", dest="build_path", help="Path used to read a compile command 
database."
@@ -313,6 +315,8 @@ def main():
 common_clang_tidy_args.append("-fix")
 if args.checks != "":
 common_clang_tidy_args.append("-checks=" + args.checks)
+if args.config_file != "":
+common_clang_tidy_args.append("-config-file=" + args.config_file)
 if args.quiet:
 common_clang_tidy_args.append("-quiet")
 if args.build_path is not None:



[clang-tools-extra] Allow to pass config file to clang-tidy-diff (PR #75457)

2023-12-14 Thread via cfe-commits

github-actions[bot] wrote:

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be
notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this 
page.

If this is not working for you, it is probably because you do not have write
permissions for the repository. In which case you can instead tag reviewers by
name in a comment by using `@` followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review
by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate
is once a week. Please remember that you are asking for valuable time from 
other developers.

If you have further questions, they may be answered by the [LLVM GitHub User 
Guide](https://llvm.org/docs/GitHub.html).

You can also ask questions in a comment on this PR, on the [LLVM 
Discord](https://discord.com/invite/xS7Z362) or on the 
[forums](https://discourse.llvm.org/).

https://github.com/llvm/llvm-project/pull/75457


[clang-tools-extra] Allow to pass config file to clang-tidy-diff (PR #75457)

2023-12-14 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-tidy

Author: Michael Lettrich (MichaelLettrich)


Changes

Adds a `-config-file` command-line option that passes the path of a 
`.clang-tidy` or custom config file through to the `clang-tidy` executable.

---
Full diff: https://github.com/llvm/llvm-project/pull/75457.diff


1 Files Affected:

- (modified) clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py (+4) 


```diff
diff --git a/clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py b/clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py
index 8817e2914f6e25..53c990f58a7edc 100755
--- a/clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py
+++ b/clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py
@@ -173,6 +173,8 @@ def main():
 help="checks filter, when not specified, use clang-tidy " "default",
 default="",
 )
+parser.add_argument("-config-file", dest="config_file",
+help="Specify the path of .clang-tidy or custom config file",default="")
 parser.add_argument("-use-color", action="store_true", help="Use colors in output")
 parser.add_argument(
 "-path", dest="build_path", help="Path used to read a compile command database."
@@ -313,6 +315,8 @@ def main():
 common_clang_tidy_args.append("-fix")
 if args.checks != "":
 common_clang_tidy_args.append("-checks=" + args.checks)
+if args.config_file != "":
+common_clang_tidy_args.append("-config-file=" + args.config_file)
 if args.quiet:
 common_clang_tidy_args.append("-quiet")
 if args.build_path is not None:

```




https://github.com/llvm/llvm-project/pull/75457


[clang] [llvm] [AArch64] Disable large global group relocation (PR #75445)

2023-12-14 Thread via cfe-commits

https://github.com/wc00862805aj updated 
https://github.com/llvm/llvm-project/pull/75445

>From ec9429cd7c13ab86189976f4f327d612183a6010 Mon Sep 17 00:00:00 2001
From: wcleungaj 
Date: Thu, 14 Dec 2023 16:54:37 +0800
Subject: [PATCH] [AArch64] Disable large global group relocation

---
 clang/include/clang/Driver/Options.td |  3 +++
 clang/lib/Driver/ToolChains/Clang.cpp |  5 +
 .../AArch64/GISel/AArch64InstructionSelector.cpp  |  6 +-
 .../AArch64/GlobalISel/select-blockaddress.mir| 11 +++
 4 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 1b02087425b751..592358d0935853 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4663,6 +4663,9 @@ def mno_fix_cortex_a53_835769 : Flag<["-"], 
"mno-fix-cortex-a53-835769">,
 def mmark_bti_property : Flag<["-"], "mmark-bti-property">,
   Group,
   HelpText<"Add .note.gnu.property with BTI to assembly files (AArch64 only)">;
+def mno_large_global_group_reloc: Flag<["-"], "mno-large-global-group-reloc">, 
+  Group,
+  HelpText<"Disable group relocation type for global value and symbol when 
code model is large">;
 def mno_bti_at_return_twice : Flag<["-"], "mno-bti-at-return-twice">,
   Group,
   HelpText<"Do not add a BTI instruction after a setjmp or other"
diff --git a/clang/lib/Driver/ToolChains/Clang.cpp 
b/clang/lib/Driver/ToolChains/Clang.cpp
index de9fd5eaa1e020..8edfe00358a066 100644
--- a/clang/lib/Driver/ToolChains/Clang.cpp
+++ b/clang/lib/Driver/ToolChains/Clang.cpp
@@ -4977,6 +4977,11 @@ void Clang::ConstructJob(Compilation &C, const JobAction 
&JA,
   if (Args.getLastArg(options::OPT_save_temps_EQ))
 Args.AddLastArg(CmdArgs, options::OPT_save_temps_EQ);
 
+  if (Args.getLastArg(options::OPT_mno_large_global_group_reloc)){
+CmdArgs.push_back("-mllvm");
+CmdArgs.push_back("-mno-large-global-group-reloc");
+  }
+
   auto *MemProfArg = Args.getLastArg(options::OPT_fmemory_profile,
  options::OPT_fmemory_profile_EQ,
  options::OPT_fno_memory_profile);
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp 
b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
index bdaae4dd724d53..2c61396422d79e 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
@@ -66,6 +66,10 @@ namespace {
 #include "AArch64GenGlobalISel.inc"
 #undef GET_GLOBALISEL_PREDICATE_BITSET
 
+static cl::opt<bool> DisableLargeGlobalGroupReloc(
+  "mno-large-global-group-reloc",
+  cl::desc("Disable group relocation type for global value and symbol when code model is large"),
+  cl::init(false));
 
 class AArch64InstructionSelector : public InstructionSelector {
 public:
@@ -2850,7 +2854,7 @@ bool AArch64InstructionSelector::select(MachineInstr &I) {
   I.setDesc(TII.get(AArch64::LOADgot));
   I.getOperand(1).setTargetFlags(OpFlags);
 } else if (TM.getCodeModel() == CodeModel::Large &&
-   !TM.isPositionIndependent()) {
+   !DisableLargeGlobalGroupReloc && !TM.isPositionIndependent()) {
   // Materialize the global using movz/movk instructions.
   materializeLargeCMVal(I, GV, OpFlags);
   I.eraseFromParent();
diff --git a/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir 
b/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir
index 28d279d7421642..dadde2d8f33426 100644
--- a/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir
+++ b/llvm/test/CodeGen/AArch64/GlobalISel/select-blockaddress.mir
@@ -2,6 +2,7 @@
 # RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs 
-run-pass=instruction-select %s | FileCheck %s
 # RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs 
-run-pass=instruction-select -code-model=large %s | FileCheck %s 
--check-prefix=LARGE
 # RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs 
-run-pass=instruction-select -code-model=large -relocation-model=pic %s | 
FileCheck %s --check-prefix=LARGE-PIC
+# RUN: llc -mtriple=aarch64-unknown-unknown -o - -verify-machineinstrs 
-run-pass=instruction-select -code-model=large -mno-large-global-group-reloc %s 
| FileCheck %s --check-prefix=NO-LARGE-GLOBAL-GROUP-RELOC
 --- |
   source_filename = "blockaddress.ll"
   target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
@@ -62,6 +63,16 @@ body: |
   ; LARGE-PIC-NEXT:   BR [[MOVaddrBA]]
   ; LARGE-PIC-NEXT: {{  $}}
   ; LARGE-PIC-NEXT: bb.1.block (ir-block-address-taken %ir-block.block):
+  ; NO-LARGE-GLOBAL-GROUP-RELOC-LABEL: name: test_blockaddress
+  ; NO-LARGE-GLOBAL-GROUP-RELOC: bb.0 (%ir-block.0):
+  ; NO-LARGE-GLOBAL-GROUP-RELOC: [[MOVZXi:%[0-9]+]]:gpr64 = MOVZXi 
target-flags(aarch64-g0, aarch64-nc) blockaddress(@test_blockaddress, 
%ir-blo

[clang-tools-extra] Allow to pass config file to clang-tidy-diff (PR #75457)

2023-12-14 Thread via cfe-commits

github-actions[bot] wrote:




:warning: Python code formatter, darker found issues in your code. :warning:



You can test this locally with the following command:


```bash
darker --check --diff -r 2952bc3384412ca67fd1dcd2eac595088d692802...05aff16d9b117e7e04c5342ec1792c91ef41e48b clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py
```





View the diff from darker here.


```diff
--- clang-tidy-diff.py  2023-12-14 10:31:28.00 +
+++ clang-tidy-diff.py  2023-12-14 10:37:50.326684 +
@@ -171,12 +171,16 @@
 parser.add_argument(
 "-checks",
 help="checks filter, when not specified, use clang-tidy " "default",
 default="",
 )
-parser.add_argument("-config-file", dest="config_file",
-help="Specify the path of .clang-tidy or custom config file",default="")
+parser.add_argument(
+"-config-file",
+dest="config_file",
+help="Specify the path of .clang-tidy or custom config file",
+default="",
+)
 parser.add_argument("-use-color", action="store_true", help="Use colors in output")
 parser.add_argument(
 "-path", dest="build_path", help="Path used to read a compile command database."
 )
 if yaml:

```




https://github.com/llvm/llvm-project/pull/75457


[llvm] [clang] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-14 Thread Jannik Silvanus via cfe-commits

jasilvanus wrote:

> Can this solve #68566 too?

I don't think it solves it, as it only fixes offset computations within GEPs 
and doesn't teach code in general about the correct vector layout. However, it 
is reducing the amount of code assuming the wrong layout.

There seem to be some clang test failures though that I'm currently looking 
into.

https://github.com/llvm/llvm-project/pull/75448


[clang-tools-extra] Allow to pass config file to clang-tidy-diff (PR #75457)

2023-12-14 Thread Michael Lettrich via cfe-commits

https://github.com/MichaelLettrich updated 
https://github.com/llvm/llvm-project/pull/75457

>From 382a8a5355b06f191941099c1eac029dbb9d4bb4 Mon Sep 17 00:00:00 2001
From: Michael Lettrich 
Date: Thu, 14 Dec 2023 11:31:28 +0100
Subject: [PATCH] Allow to pass config file to clang-tidy-diff

Adds a `-config-file` command-line option that passes the path of a 
`.clang-tidy` or custom config file through to the `clang-tidy` executable.
---
 clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py | 8 
 1 file changed, 8 insertions(+)

diff --git a/clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py 
b/clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py
index 8817e2914f6e25..d96b3450fdbe81 100755
--- a/clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py
+++ b/clang-tools-extra/clang-tidy/tool/clang-tidy-diff.py
@@ -173,6 +173,12 @@ def main():
 help="checks filter, when not specified, use clang-tidy " "default",
 default="",
 )
+parser.add_argument(
+"-config-file",
+dest="config_file",
+help="Specify the path of .clang-tidy or custom config file",
+default="",
+)
 parser.add_argument("-use-color", action="store_true", help="Use colors in 
output")
 parser.add_argument(
 "-path", dest="build_path", help="Path used to read a compile command 
database."
@@ -313,6 +319,8 @@ def main():
 common_clang_tidy_args.append("-fix")
 if args.checks != "":
 common_clang_tidy_args.append("-checks=" + args.checks)
+if args.config_file != "":
+common_clang_tidy_args.append("-config-file=" + args.config_file)
 if args.quiet:
 common_clang_tidy_args.append("-quiet")
 if args.build_path is not None:



[clang] [Driver] Remove all vendor triples (PR #75459)

2023-12-14 Thread Andreas Schwab via cfe-commits

https://github.com/andreas-schwab created 
https://github.com/llvm/llvm-project/pull/75459

None

>From 65b392b384fadc994fe0647a254d623a334723e1 Mon Sep 17 00:00:00 2001
From: Andreas Schwab 
Date: Wed, 6 Dec 2023 10:50:54 +0100
Subject: [PATCH] [Driver] Remove all vendor triples

---
 clang/lib/Driver/ToolChains/Gnu.cpp | 53 +
 1 file changed, 16 insertions(+), 37 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/Gnu.cpp 
b/clang/lib/Driver/ToolChains/Gnu.cpp
index 835215a83c4037..89b5765d92bf49 100644
--- a/clang/lib/Driver/ToolChains/Gnu.cpp
+++ b/clang/lib/Driver/ToolChains/Gnu.cpp
@@ -2316,22 +2316,18 @@ void 
Generic_GCC::GCCInstallationDetector::AddDefaultGCCPrefixes(
   // lists should shrink over time. Please don't add more elements to *Triples.
   static const char *const AArch64LibDirs[] = {"/lib64", "/lib"};
   static const char *const AArch64Triples[] = {
-  "aarch64-none-linux-gnu", "aarch64-linux-gnu", "aarch64-redhat-linux",
-  "aarch64-suse-linux"};
+  "aarch64-none-linux-gnu", "aarch64-linux-gnu"};
   static const char *const AArch64beLibDirs[] = {"/lib"};
   static const char *const AArch64beTriples[] = {"aarch64_be-none-linux-gnu",
  "aarch64_be-linux-gnu"};
 
   static const char *const ARMLibDirs[] = {"/lib"};
   static const char *const ARMTriples[] = {"arm-linux-gnueabi"};
-  static const char *const ARMHFTriples[] = {"arm-linux-gnueabihf",
- "armv7hl-redhat-linux-gnueabi",
- "armv6hl-suse-linux-gnueabi",
- "armv7hl-suse-linux-gnueabi"};
+  static const char *const ARMHFTriples[] = {"arm-linux-gnueabihf"};
   static const char *const ARMebLibDirs[] = {"/lib"};
   static const char *const ARMebTriples[] = {"armeb-linux-gnueabi"};
   static const char *const ARMebHFTriples[] = {
-  "armeb-linux-gnueabihf", "armebv7hl-redhat-linux-gnueabi"};
+  "armeb-linux-gnueabihf"};
 
   static const char *const AVRLibDirs[] = {"/lib"};
   static const char *const AVRTriples[] = {"avr"};
@@ -2342,20 +2338,13 @@ void 
Generic_GCC::GCCInstallationDetector::AddDefaultGCCPrefixes(
 
   static const char *const X86_64LibDirs[] = {"/lib64", "/lib"};
   static const char *const X86_64Triples[] = {
-  "x86_64-linux-gnu",   "x86_64-unknown-linux-gnu",
-  "x86_64-pc-linux-gnu","x86_64-redhat-linux6E",
-  "x86_64-redhat-linux","x86_64-suse-linux",
-  "x86_64-manbo-linux-gnu", "x86_64-linux-gnu",
-  "x86_64-slackware-linux", "x86_64-unknown-linux",
-  "x86_64-amazon-linux"};
-  static const char *const X32Triples[] = {"x86_64-linux-gnux32",
-   "x86_64-pc-linux-gnux32"};
+  "x86_64-linux-gnu",   "x86_64-unknown-linux-gnu", "x86_64-linux-gnu",
+  "x86_64-unknown-linux"};
+  static const char *const X32Triples[] = {"x86_64-linux-gnux32"};
   static const char *const X32LibDirs[] = {"/libx32", "/lib"};
   static const char *const X86LibDirs[] = {"/lib32", "/lib"};
   static const char *const X86Triples[] = {
-  "i586-linux-gnu",  "i686-linux-gnu","i686-pc-linux-gnu",
-  "i386-redhat-linux6E", "i686-redhat-linux", "i386-redhat-linux",
-  "i586-suse-linux", "i686-montavista-linux", "i686-gnu",
+  "i586-linux-gnu",  "i686-linux-gnu", "i686-gnu",
   };
 
   static const char *const LoongArch64LibDirs[] = {"/lib64", "/lib"};
@@ -2364,25 +2353,22 @@ void 
Generic_GCC::GCCInstallationDetector::AddDefaultGCCPrefixes(
 
   static const char *const M68kLibDirs[] = {"/lib"};
   static const char *const M68kTriples[] = {
-  "m68k-linux-gnu", "m68k-unknown-linux-gnu", "m68k-suse-linux"};
+  "m68k-linux-gnu", "m68k-unknown-linux-gnu"};
 
   static const char *const MIPSLibDirs[] = {"/libo32", "/lib"};
   static const char *const MIPSTriples[] = {
-  "mips-linux-gnu", "mips-mti-linux", "mips-mti-linux-gnu",
-  "mips-img-linux-gnu", "mipsisa32r6-linux-gnu"};
+  "mips-linux-gnu", "mipsisa32r6-linux-gnu"};
   static const char *const MIPSELLibDirs[] = {"/libo32", "/lib"};
   static const char *const MIPSELTriples[] = {
-  "mipsel-linux-gnu", "mips-img-linux-gnu", "mipsisa32r6el-linux-gnu"};
+  "mipsel-linux-gnu", "mipsisa32r6el-linux-gnu"};
 
   static const char *const MIPS64LibDirs[] = {"/lib64", "/lib"};
   static const char *const MIPS64Triples[] = {
-  "mips64-linux-gnu",  "mips-mti-linux-gnu",
-  "mips-img-linux-gnu","mips64-linux-gnuabi64",
+  "mips64-linux-gnu","mips64-linux-gnuabi64",
   "mipsisa64r6-linux-gnu", "mipsisa64r6-linux-gnuabi64"};
   static const char *const MIPS64ELLibDirs[] = {"/lib64", "/lib"};
   static const char *const MIPS64ELTriples[] = {
-  "mips64el-linux-gnu",  "mips-mti-linux-gnu",
-  "mips-img-linux-gnu",  "mips64el-linux-gnuabi64",
+  "mips64el-linux-gnu",  "mips

[clang] [Driver] Remove all vendor triples (PR #75459)

2023-12-14 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Andreas Schwab (andreas-schwab)


Changes



---
Full diff: https://github.com/llvm/llvm-project/pull/75459.diff


1 Files Affected:

- (modified) clang/lib/Driver/ToolChains/Gnu.cpp (+16-37) 


``diff
diff --git a/clang/lib/Driver/ToolChains/Gnu.cpp 
b/clang/lib/Driver/ToolChains/Gnu.cpp
index 835215a83c4037..89b5765d92bf49 100644
--- a/clang/lib/Driver/ToolChains/Gnu.cpp
+++ b/clang/lib/Driver/ToolChains/Gnu.cpp
@@ -2316,22 +2316,18 @@ void 
Generic_GCC::GCCInstallationDetector::AddDefaultGCCPrefixes(
   // lists should shrink over time. Please don't add more elements to *Triples.
   static const char *const AArch64LibDirs[] = {"/lib64", "/lib"};
   static const char *const AArch64Triples[] = {
-  "aarch64-none-linux-gnu", "aarch64-linux-gnu", "aarch64-redhat-linux",
-  "aarch64-suse-linux"};
+  "aarch64-none-linux-gnu", "aarch64-linux-gnu"};
   static const char *const AArch64beLibDirs[] = {"/lib"};
   static const char *const AArch64beTriples[] = {"aarch64_be-none-linux-gnu",
  "aarch64_be-linux-gnu"};
 
   static const char *const ARMLibDirs[] = {"/lib"};
   static const char *const ARMTriples[] = {"arm-linux-gnueabi"};
-  static const char *const ARMHFTriples[] = {"arm-linux-gnueabihf",
- "armv7hl-redhat-linux-gnueabi",
- "armv6hl-suse-linux-gnueabi",
- "armv7hl-suse-linux-gnueabi"};
+  static const char *const ARMHFTriples[] = {"arm-linux-gnueabihf"};
   static const char *const ARMebLibDirs[] = {"/lib"};
   static const char *const ARMebTriples[] = {"armeb-linux-gnueabi"};
   static const char *const ARMebHFTriples[] = {
-  "armeb-linux-gnueabihf", "armebv7hl-redhat-linux-gnueabi"};
+  "armeb-linux-gnueabihf"};
 
   static const char *const AVRLibDirs[] = {"/lib"};
   static const char *const AVRTriples[] = {"avr"};
@@ -2342,20 +2338,13 @@ void 
Generic_GCC::GCCInstallationDetector::AddDefaultGCCPrefixes(
 
   static const char *const X86_64LibDirs[] = {"/lib64", "/lib"};
   static const char *const X86_64Triples[] = {
-  "x86_64-linux-gnu",   "x86_64-unknown-linux-gnu",
-  "x86_64-pc-linux-gnu","x86_64-redhat-linux6E",
-  "x86_64-redhat-linux","x86_64-suse-linux",
-  "x86_64-manbo-linux-gnu", "x86_64-linux-gnu",
-  "x86_64-slackware-linux", "x86_64-unknown-linux",
-  "x86_64-amazon-linux"};
-  static const char *const X32Triples[] = {"x86_64-linux-gnux32",
-   "x86_64-pc-linux-gnux32"};
+  "x86_64-linux-gnu",   "x86_64-unknown-linux-gnu", "x86_64-linux-gnu",
+  "x86_64-unknown-linux"};
+  static const char *const X32Triples[] = {"x86_64-linux-gnux32"};
   static const char *const X32LibDirs[] = {"/libx32", "/lib"};
   static const char *const X86LibDirs[] = {"/lib32", "/lib"};
   static const char *const X86Triples[] = {
-  "i586-linux-gnu",  "i686-linux-gnu","i686-pc-linux-gnu",
-  "i386-redhat-linux6E", "i686-redhat-linux", "i386-redhat-linux",
-  "i586-suse-linux", "i686-montavista-linux", "i686-gnu",
+  "i586-linux-gnu",  "i686-linux-gnu", "i686-gnu",
   };
 
   static const char *const LoongArch64LibDirs[] = {"/lib64", "/lib"};
@@ -2364,25 +2353,22 @@ void 
Generic_GCC::GCCInstallationDetector::AddDefaultGCCPrefixes(
 
   static const char *const M68kLibDirs[] = {"/lib"};
   static const char *const M68kTriples[] = {
-  "m68k-linux-gnu", "m68k-unknown-linux-gnu", "m68k-suse-linux"};
+  "m68k-linux-gnu", "m68k-unknown-linux-gnu"};
 
   static const char *const MIPSLibDirs[] = {"/libo32", "/lib"};
   static const char *const MIPSTriples[] = {
-  "mips-linux-gnu", "mips-mti-linux", "mips-mti-linux-gnu",
-  "mips-img-linux-gnu", "mipsisa32r6-linux-gnu"};
+  "mips-linux-gnu", "mipsisa32r6-linux-gnu"};
   static const char *const MIPSELLibDirs[] = {"/libo32", "/lib"};
   static const char *const MIPSELTriples[] = {
-  "mipsel-linux-gnu", "mips-img-linux-gnu", "mipsisa32r6el-linux-gnu"};
+  "mipsel-linux-gnu", "mipsisa32r6el-linux-gnu"};
 
   static const char *const MIPS64LibDirs[] = {"/lib64", "/lib"};
   static const char *const MIPS64Triples[] = {
-  "mips64-linux-gnu",  "mips-mti-linux-gnu",
-  "mips-img-linux-gnu","mips64-linux-gnuabi64",
+  "mips64-linux-gnu","mips64-linux-gnuabi64",
   "mipsisa64r6-linux-gnu", "mipsisa64r6-linux-gnuabi64"};
   static const char *const MIPS64ELLibDirs[] = {"/lib64", "/lib"};
   static const char *const MIPS64ELTriples[] = {
-  "mips64el-linux-gnu",  "mips-mti-linux-gnu",
-  "mips-img-linux-gnu",  "mips64el-linux-gnuabi64",
+  "mips64el-linux-gnu",  "mips64el-linux-gnuabi64",
   "mipsisa64r6el-linux-gnu", "mipsisa64r6el-linux-gnuabi64"};
 
   static const char *const MIPSN32LibDirs[] = {"/lib3

[clang-tools-extra] Allow to pass config file to clang-tidy-diff (PR #75457)

2023-12-14 Thread Michael Lettrich via cfe-commits

MichaelLettrich wrote:

Note: This file lacks a `SPDX-FileCopyrightText:` in the header. Is this on 
purpose?

https://github.com/llvm/llvm-project/pull/75457


[clang] [Driver] Remove all vendor triples (PR #75459)

2023-12-14 Thread via cfe-commits

github-actions[bot] wrote:




:warning: C/C++ code formatter, clang-format found issues in your code. :warning:

You can test this locally with the following command:

```bash
git-clang-format --diff 2952bc3384412ca67fd1dcd2eac595088d692802 65b392b384fadc994fe0647a254d623a334723e1 -- clang/lib/Driver/ToolChains/Gnu.cpp
```





View the diff from clang-format here.


``diff
diff --git a/clang/lib/Driver/ToolChains/Gnu.cpp 
b/clang/lib/Driver/ToolChains/Gnu.cpp
index 89b5765d92..cf310c4f46 100644
--- a/clang/lib/Driver/ToolChains/Gnu.cpp
+++ b/clang/lib/Driver/ToolChains/Gnu.cpp
@@ -2315,8 +2315,8 @@ void 
Generic_GCC::GCCInstallationDetector::AddDefaultGCCPrefixes(
   // and always uses the full --target (e.g. --target=aarch64-linux-gnu).  The
   // lists should shrink over time. Please don't add more elements to *Triples.
   static const char *const AArch64LibDirs[] = {"/lib64", "/lib"};
-  static const char *const AArch64Triples[] = {
-  "aarch64-none-linux-gnu", "aarch64-linux-gnu"};
+  static const char *const AArch64Triples[] = {"aarch64-none-linux-gnu",
+   "aarch64-linux-gnu"};
   static const char *const AArch64beLibDirs[] = {"/lib"};
   static const char *const AArch64beTriples[] = {"aarch64_be-none-linux-gnu",
  "aarch64_be-linux-gnu"};
@@ -2326,8 +2326,7 @@ void 
Generic_GCC::GCCInstallationDetector::AddDefaultGCCPrefixes(
   static const char *const ARMHFTriples[] = {"arm-linux-gnueabihf"};
   static const char *const ARMebLibDirs[] = {"/lib"};
   static const char *const ARMebTriples[] = {"armeb-linux-gnueabi"};
-  static const char *const ARMebHFTriples[] = {
-  "armeb-linux-gnueabihf"};
+  static const char *const ARMebHFTriples[] = {"armeb-linux-gnueabihf"};
 
   static const char *const AVRLibDirs[] = {"/lib"};
   static const char *const AVRTriples[] = {"avr"};
@@ -2338,13 +2337,15 @@ void 
Generic_GCC::GCCInstallationDetector::AddDefaultGCCPrefixes(
 
   static const char *const X86_64LibDirs[] = {"/lib64", "/lib"};
   static const char *const X86_64Triples[] = {
-  "x86_64-linux-gnu",   "x86_64-unknown-linux-gnu", "x86_64-linux-gnu",
+  "x86_64-linux-gnu", "x86_64-unknown-linux-gnu", "x86_64-linux-gnu",
   "x86_64-unknown-linux"};
   static const char *const X32Triples[] = {"x86_64-linux-gnux32"};
   static const char *const X32LibDirs[] = {"/libx32", "/lib"};
   static const char *const X86LibDirs[] = {"/lib32", "/lib"};
   static const char *const X86Triples[] = {
-  "i586-linux-gnu",  "i686-linux-gnu", "i686-gnu",
+  "i586-linux-gnu",
+  "i686-linux-gnu",
+  "i686-gnu",
   };
 
   static const char *const LoongArch64LibDirs[] = {"/lib64", "/lib"};
@@ -2352,23 +2353,23 @@ void 
Generic_GCC::GCCInstallationDetector::AddDefaultGCCPrefixes(
   "loongarch64-linux-gnu", "loongarch64-unknown-linux-gnu"};
 
   static const char *const M68kLibDirs[] = {"/lib"};
-  static const char *const M68kTriples[] = {
-  "m68k-linux-gnu", "m68k-unknown-linux-gnu"};
+  static const char *const M68kTriples[] = {"m68k-linux-gnu",
+"m68k-unknown-linux-gnu"};
 
   static const char *const MIPSLibDirs[] = {"/libo32", "/lib"};
-  static const char *const MIPSTriples[] = {
-  "mips-linux-gnu", "mipsisa32r6-linux-gnu"};
+  static const char *const MIPSTriples[] = {"mips-linux-gnu",
+"mipsisa32r6-linux-gnu"};
   static const char *const MIPSELLibDirs[] = {"/libo32", "/lib"};
-  static const char *const MIPSELTriples[] = {
-  "mipsel-linux-gnu", "mipsisa32r6el-linux-gnu"};
+  static const char *const MIPSELTriples[] = {"mipsel-linux-gnu",
+  "mipsisa32r6el-linux-gnu"};
 
   static const char *const MIPS64LibDirs[] = {"/lib64", "/lib"};
   static const char *const MIPS64Triples[] = {
-  "mips64-linux-gnu","mips64-linux-gnuabi64",
-  "mipsisa64r6-linux-gnu", "mipsisa64r6-linux-gnuabi64"};
+  "mips64-linux-gnu", "mips64-linux-gnuabi64", "mipsisa64r6-linux-gnu",
+  "mipsisa64r6-linux-gnuabi64"};
   static const char *const MIPS64ELLibDirs[] = {"/lib64", "/lib"};
   static const char *const MIPS64ELTriples[] = {
-  "mips64el-linux-gnu",  "mips64el-linux-gnuabi64",
+  "mips64el-linux-gnu", "mips64el-linux-gnuabi64",
   "mipsisa64r6el-linux-gnu", "mipsisa64r6el-linux-gnuabi64"};
 
   static const char *const MIPSN32LibDirs[] = {"/lib32"};
@@ -2390,11 +2391,11 @@ void 
Generic_GCC::GCCInstallationDetector::AddDefaultGCCPrefixes(
  "powerpcle-linux-musl"};
 
   static const char *const PPC64LibDirs[] = {"/lib64", "/lib"};
-  static const char *const PPC64Triples[] = {
-  "powerpc64-linux-gnu", "powerpc64-unknown-linux-gnu"};
+  static const char *const PPC64Triples[] = {"powerpc64-linux-gnu",
+ 

[clang] 78accaf - [AArch64][SME2] Add builtins for SQDMULH (#75326)

2023-12-14 Thread via cfe-commits

Author: Dinar Temirbulatov
Date: 2023-12-14T10:53:04Z
New Revision: 78accaf7a06f7f72ab2f7819758f1d9bce8b8552

URL: 
https://github.com/llvm/llvm-project/commit/78accaf7a06f7f72ab2f7819758f1d9bce8b8552
DIFF: 
https://github.com/llvm/llvm-project/commit/78accaf7a06f7f72ab2f7819758f1d9bce8b8552.diff

LOG: [AArch64][SME2] Add builtins for SQDMULH (#75326)

Patch by: Kerry McLaughlin 

Added: 
clang/test/CodeGen/aarch64-sme2-intrinsics/acle_sme2_sqdmulh.c

Modified: 
clang/include/clang/Basic/arm_sve.td

Removed: 




diff  --git a/clang/include/clang/Basic/arm_sve.td 
b/clang/include/clang/Basic/arm_sve.td
index db6f17d1c493af..826ff498d83d07 100644
--- a/clang/include/clang/Basic/arm_sve.td
+++ b/clang/include/clang/Basic/arm_sve.td
@@ -2185,6 +2185,12 @@ let TargetGuard = "sme2" in {
 
   def REINTERPRET_SVBOOL_TO_SVCOUNT : Inst<"svreinterpret[_c]", "}P", "Pc", MergeNone, "", [IsStreamingCompatible], []>;
   def REINTERPRET_SVCOUNT_TO_SVBOOL : Inst<"svreinterpret[_b]", "P}", "Pc", MergeNone, "", [IsStreamingCompatible], []>;
+
+  // SQDMULH
+  def SVSQDMULH_SINGLE_X2 : SInst<"svqdmulh[_single_{d}_x2]", "22d", "csil", MergeNone, "aarch64_sve_sqdmulh_single_vgx2", [IsStreaming], []>;
+  def SVSQDMULH_SINGLE_X4 : SInst<"svqdmulh[_single_{d}_x4]", "44d", "csil", MergeNone, "aarch64_sve_sqdmulh_single_vgx4", [IsStreaming], []>;
+  def SVSQDMULH_X2 : SInst<"svqdmulh[_{d}_x2]", "222", "csil", MergeNone, "aarch64_sve_sqdmulh_vgx2", [IsStreaming], []>;
+  def SVSQDMULH_X4 : SInst<"svqdmulh[_{d}_x4]", "444", "csil", MergeNone, "aarch64_sve_sqdmulh_vgx4", [IsStreaming], []>;
 }
 
 let TargetGuard = "sve2p1" in {

diff  --git a/clang/test/CodeGen/aarch64-sme2-intrinsics/acle_sme2_sqdmulh.c 
b/clang/test/CodeGen/aarch64-sme2-intrinsics/acle_sme2_sqdmulh.c
new file mode 100644
index 00..6bbd23ccd32a52
--- /dev/null
+++ b/clang/test/CodeGen/aarch64-sme2-intrinsics/acle_sme2_sqdmulh.c
@@ -0,0 +1,584 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// REQUIRES: aarch64-registered-target
+
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme2 -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | opt -S -p mem2reg,instcombine,tailcallelim | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme2 -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - -x c++ %s | opt -S -p mem2reg,instcombine,tailcallelim | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu -target-feature +sme2 -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | opt -S -p mem2reg,instcombine,tailcallelim | FileCheck %s
+// RUN: %clang_cc1 -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu -target-feature +sme2 -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - -x c++ %s | opt -S -p mem2reg,instcombine,tailcallelim | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme2 -S -disable-O0-optnone -Werror -Wall -o /dev/null %s
+#include 
+
+#ifdef SVE_OVERLOADED_FORMS
+// A simple used,unused... macro, long enough to represent any SVE builtin.
+#define SVE_ACLE_FUNC(A1,A2_UNUSED,A3,A4_UNUSED,A5) A1##A3##A5
+#else
+#define SVE_ACLE_FUNC(A1,A2,A3,A4,A5) A1##A2##A3##A4##A5
+#endif
+
+// Single, x2
+
+// CHECK-LABEL: @test_svqdmulh_single_s8_x2(
+// CHECK-NEXT:  entry:
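+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.vector.extract.nxv16i8.nxv32i8(<vscale x 32 x i8> [[ZDN:%.*]], i64 0)
+// CHECK-NEXT:    [[TMP1:%.*]] = tail call <vscale x 16 x i8> @llvm.vector.extract.nxv16i8.nxv32i8(<vscale x 32 x i8> [[ZDN]], i64 16)
+// CHECK-NEXT:    [[TMP2:%.*]] = tail call { <vscale x 16 x i8>, <vscale x 16 x i8> } @llvm.aarch64.sve.sqdmulh.single.vgx2.nxv16i8(<vscale x 16 x i8> [[TMP0]], <vscale x 16 x i8> [[TMP1]], <vscale x 16 x i8> [[ZM:%.*]])
+// CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <vscale x 16 x i8>, <vscale x 16 x i8> } [[TMP2]], 0
+// CHECK-NEXT:    [[TMP4:%.*]] = tail call <vscale x 32 x i8> @llvm.vector.insert.nxv32i8.nxv16i8(<vscale x 32 x i8> poison, <vscale x 16 x i8> [[TMP3]], i64 0)
+// CHECK-NEXT:    [[TMP5:%.*]] = extractvalue { <vscale x 16 x i8>, <vscale x 16 x i8> } [[TMP2]], 1
+// CHECK-NEXT:    [[TMP6:%.*]] = tail call <vscale x 32 x i8> @llvm.vector.insert.nxv32i8.nxv16i8(<vscale x 32 x i8> [[TMP4]], <vscale x 16 x i8> [[TMP5]], i64 16)
+// CHECK-NEXT:    ret <vscale x 32 x i8> [[TMP6]]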
+//
+// CPP-CHECK-LABEL: @_Z26test_svqdmulh_single_s8_x210svint8x2_tu10__SVInt8_t(
+// CPP-CHECK-NEXT:  entry:
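+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.vector.extract.nxv16i8.nxv32i8(<vscale x 32 x i8> [[ZDN:%.*]], i64 0)
+// CPP-CHECK-NEXT:    [[TMP1:%.*]] = tail call <vscale x 16 x i8> @llvm.vector.extract.nxv16i8.nxv32i8(<vscale x 32 x i8> [[ZDN]], i64 16)
+// CPP-CHECK-NEXT:    [[TMP2:%.*]] = tail call { <vscale x 16 x i8>, <vscale x 16 x i8> } @llvm.aarch64.sve.sqdmulh.single.vgx2.nxv16i8(<vscale x 16 x i8> [[TMP0]], <vscale x 16 x i8> [[TMP1]], <vscale x 16 x i8> [[ZM:%.*]])
+// CPP-CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <vscale x 16 x i8>, <vscale x 16 x i8> } [[TMP2]], 0
+// CPP-CHECK-NEXT:    [[TMP4:%.*]] = tail call <vscale x 32 x i8> @llvm.vector.insert.nxv32i8.nxv16i8(<vscale x 32 x i8> poison, <vscale x 16 x i8> [[TMP3]], i64 0)
+// CPP-CHECK-NEXT:    [[TMP5:%.*]] = extractvalue { <vscale x 16 x i8>, <vscale x 16 x i8> } [[TMP2]], 1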
+// CPP-CHECK-NEXT:   
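
The `SVE_ACLE_FUNC` macro in the test above chooses between the overloaded and the fully suffixed spelling of an intrinsic by pasting either three or all five of its arguments. A minimal sketch of that token pasting follows. The names `ACLE_FUNC_OVERLOADED`, `ACLE_FUNC_EXPLICIT`, `STRINGIFY`, the two helper functions, and the way the intrinsic name is split into arguments are assumptions made for illustration; they are not taken from the test file, which defines a single macro selected by `#ifdef`.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical names: both macro bodies from the test are given distinct
// names here so that they can be compared in one translation unit.
#define ACLE_FUNC_OVERLOADED(A1, A2_UNUSED, A3, A4_UNUSED, A5) A1##A3##A5
#define ACLE_FUNC_EXPLICIT(A1, A2, A3, A4, A5) A1##A2##A3##A4##A5

// Two-level stringification so the argument is macro-expanded first.
#define STRINGIFY_IMPL(x) #x
#define STRINGIFY(x) STRINGIFY_IMPL(x)

// Overloaded form: the type suffix (second argument) is dropped.
const char *overloadedName() {
  return STRINGIFY(ACLE_FUNC_OVERLOADED(svqdmulh, _single_s8_x2, , , ));
}

// Explicit form: all five arguments are pasted together.
const char *explicitName() {
  return STRINGIFY(ACLE_FUNC_EXPLICIT(svqdmulh, _single_s8_x2, , , ));
}
```

Under these assumptions the overloaded form yields the bare name `svqdmulh`, leaving overload resolution to pick the variant, while the explicit form yields `svqdmulh_single_s8_x2`.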

[clang] [AArch64][SME2] Add builtins for SQDMULH (PR #75326)

2023-12-14 Thread Dinar Temirbulatov via cfe-commits

https://github.com/dtemirbulatov closed 
https://github.com/llvm/llvm-project/pull/75326


[clang] 797fee6 - [clang][Interp] Start supporting complex types

2023-12-14 Thread Timm Bäder via cfe-commits

Author: Timm Bäder
Date: 2023-12-14T11:57:38+01:00
New Revision: 797fee68d1cb6a4122d89880d44f8c99559c5cac

URL: 
https://github.com/llvm/llvm-project/commit/797fee68d1cb6a4122d89880d44f8c99559c5cac
DIFF: 
https://github.com/llvm/llvm-project/commit/797fee68d1cb6a4122d89880d44f8c99559c5cac.diff

LOG: [clang][Interp] Start supporting complex types

Differential Revision: https://reviews.llvm.org/D146408

Added: 
clang/test/AST/Interp/complex.cpp

Modified: 
clang/lib/AST/Interp/ByteCodeExprGen.cpp
clang/lib/AST/Interp/Context.cpp
clang/lib/AST/Interp/EvalEmitter.cpp

Removed: 




diff  --git a/clang/lib/AST/Interp/ByteCodeExprGen.cpp 
b/clang/lib/AST/Interp/ByteCodeExprGen.cpp
index f7f8e6c73d84e2..efa98c6517a2ef 100644
--- a/clang/lib/AST/Interp/ByteCodeExprGen.cpp
+++ b/clang/lib/AST/Interp/ByteCodeExprGen.cpp
@@ -671,6 +671,22 @@ bool ByteCodeExprGen<Emitter>::VisitInitListExpr(const InitListExpr *E) {
     return true;
   }
 
+  if (T->isAnyComplexType()) {
+    unsigned InitIndex = 0;
+    for (const Expr *Init : E->inits()) {
+      PrimType InitT = classifyPrim(Init->getType());
+
+      if (!this->visit(Init))
+        return false;
+
+      if (!this->emitInitElem(InitT, InitIndex, E))
+        return false;
+      ++InitIndex;
+    }
+    assert(InitIndex == 2);
+    return true;
+  }
+
   return false;
 }
 
@@ -2550,8 +2566,22 @@ bool ByteCodeExprGen<Emitter>::VisitUnaryOperator(const UnaryOperator *E) {
     if (!this->visit(SubExpr))
       return false;
     return DiscardResult ? this->emitPop(*T, E) : this->emitComp(*T, E);
-  case UO_Real:   // __real x
-  case UO_Imag:   // __imag x
+  case UO_Real: { // __real x
+    assert(!T);
+    if (!this->visit(SubExpr))
+      return false;
+    if (!this->emitConstUint8(0, E))
+      return false;
+    return this->emitArrayElemPtrPopUint8(E);
+  }
+  case UO_Imag: { // __imag x
+    assert(!T);
+    if (!this->visit(SubExpr))
+      return false;
+    if (!this->emitConstUint8(1, E))
+      return false;
+    return this->emitArrayElemPtrPopUint8(E);
+  }
   case UO_Extension:
     return this->delegate(SubExpr);
   case UO_Coawait:

diff  --git a/clang/lib/AST/Interp/Context.cpp 
b/clang/lib/AST/Interp/Context.cpp
index 4fe6d1173f427e..17abb71635839c 100644
--- a/clang/lib/AST/Interp/Context.cpp
+++ b/clang/lib/AST/Interp/Context.cpp
@@ -92,6 +92,9 @@ std::optional<PrimType> Context::classify(QualType T) const {
   if (T->isBooleanType())
     return PT_Bool;
 
+  if (T->isAnyComplexType())
+    return std::nullopt;
+
   if (T->isSignedIntegerOrEnumerationType()) {
     switch (Ctx.getIntWidth(T)) {
     case 64:

diff  --git a/clang/lib/AST/Interp/EvalEmitter.cpp 
b/clang/lib/AST/Interp/EvalEmitter.cpp
index 9bc42057c5f578..0ff0bde8fd17e8 100644
--- a/clang/lib/AST/Interp/EvalEmitter.cpp
+++ b/clang/lib/AST/Interp/EvalEmitter.cpp
@@ -208,6 +208,27 @@ bool EvalEmitter::emitRetValue(const SourceInfo &Info) {
       }
       return Ok;
     }
+
+    // Complex types.
+    if (const auto *CT = Ty->getAs<ComplexType>()) {
+      QualType ElemTy = CT->getElementType();
+      std::optional<PrimType> ElemT = Ctx.classify(ElemTy);
+      assert(ElemT);
+
+      if (ElemTy->isIntegerType()) {
+        INT_TYPE_SWITCH(*ElemT, {
+          auto V1 = Ptr.atIndex(0).deref<T>();
+          auto V2 = Ptr.atIndex(1).deref<T>();
+          Result = APValue(V1.toAPSInt(), V2.toAPSInt());
+          return true;
+        });
+      } else if (ElemTy->isFloatingType()) {
+        Result = APValue(Ptr.atIndex(0).deref<Floating>().getAPFloat(),
+                         Ptr.atIndex(1).deref<Floating>().getAPFloat());
+        return true;
+      }
+      return false;
+    }
     llvm_unreachable("invalid value to return");
   };
 

diff  --git a/clang/test/AST/Interp/complex.cpp 
b/clang/test/AST/Interp/complex.cpp
new file mode 100644
index 00..4fd2b5cfd73640
--- /dev/null
+++ b/clang/test/AST/Interp/complex.cpp
@@ -0,0 +1,33 @@
+// RUN: %clang_cc1 -fexperimental-new-constant-interpreter -verify %s
+// RUN: %clang_cc1 -verify=ref %s
+
+// expected-no-diagnostics
+// ref-no-diagnostics
+
+constexpr _Complex double z1 = {1.0, 2.0};
+static_assert(__real(z1) == 1.0, "");
+static_assert(__imag(z1) == 2.0, "");
+
+constexpr double setter() {
+  _Complex float d = {1.0, 2.0};
+
+  __imag(d) = 4.0;
+  return __imag(d);
+}
+static_assert(setter() == 4, "");
+
+constexpr _Complex double getter() {
+  return {1.0, 3.0};
+}
+constexpr _Complex double D = getter();
+static_assert(__real(D) == 1.0, "");
+static_assert(__imag(D) == 3.0, "");
+
+
+constexpr _Complex int I1 = {1, 2};
+static_assert(__real(I1) == 1, "");
+static_assert(__imag(I1) == 2, "");
+
+
+/// FIXME: This should work in the new interpreter as well.
+// constexpr _Complex _BitInt(8) A = 0;// = {4};
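
The patch above lowers `__real x` and `__imag x` to element 0 and element 1 of a two-element array (`emitConstUint8(0, E)` or `emitConstUint8(1, E)` followed by `emitArrayElemPtrPopUint8(E)`). The same layout can be sketched in standard C++ with `std::complex`, whose array-oriented access is guaranteed by the standard library. This is an analogy only, not the interpreter's code, and the helper names `realViaIndex`, `imagViaIndex`, and `setImagViaIndex` are hypothetical.

```cpp
#include <cassert>
#include <complex>

// Hypothetical helper: reads the real part as array element 0.
double realViaIndex(std::complex<double> &z) {
  return reinterpret_cast<double (&)[2]>(z)[0];
}

// Hypothetical helper: reads the imaginary part as array element 1.
double imagViaIndex(std::complex<double> &z) {
  return reinterpret_cast<double (&)[2]>(z)[1];
}

// Hypothetical helper: writes the imaginary part through element 1,
// mirroring the `__imag(d) = 4.0;` assignment in the test.
void setImagViaIndex(std::complex<double> &z, double v) {
  reinterpret_cast<double (&)[2]>(z)[1] = v;
}
```

Treating the value as an indexable pair is what lets the interpreter reuse its existing array-element operations instead of adding a dedicated primitive type, which is consistent with `Context::classify` returning `std::nullopt` for complex types.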





[clang-tools-extra] [lldb] [mlir] [libc] [lld] [clang] [libcxx] [compiler-rt] [llvm] [MLIR][LLVM] Add Continuous Loop Peeling transform to SCF (PR #71555)

2023-12-14 Thread via cfe-commits

https://github.com/muneebkhan85 updated 
https://github.com/llvm/llvm-project/pull/71555

>From 7bb2f9793b2a2cccbaa401f6e2ac850b587f2b59 Mon Sep 17 00:00:00 2001
From: Muneeb Khan 
Date: Tue, 7 Nov 2023 23:52:17 +0800
Subject: [PATCH 1/9] [MLIR][LLVM] Add Continuous Loop Peeling transform to SCF

This patch adds continuous loop peeling to scf loop transforms
in the MLIR backend. This transforms the target loop into a
chain of loops, with step sizes that are powers of two and
decrease exponentially across subsequent loops. Originally
authored by Litu Zhou litu.z...@huawei.com.
---
 .../SCF/TransformOps/SCFTransformOps.td   |  36 +
 .../SCF/TransformOps/SCFTransformOps.cpp  | 147 ++
 .../Dialect/SCF/loop-continuous-peel.mlir |  98 
 3 files changed, 281 insertions(+)
 create mode 100644 mlir/test/Dialect/SCF/loop-continuous-peel.mlir

diff --git a/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td 
b/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td
index 14df7e23a430fb..e3d79a7f0ae40f 100644
--- a/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td
+++ b/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td
@@ -147,6 +147,42 @@ def LoopPeelOp : Op {
+  let description = [{
+Transforms the loop into a chain of loops, with step sizes that are
+powers of two and decrease exponentially across subsequent loops.
+The transform is similar to loop.peel in the effect that it creates a loop
+with a step (that is power of 2) to divide the range evenly, with the
+difference that the remaining iterations are spread across similar loops
+with exponentially decreasing step sizes, with the last loop with step size
+of 2^0 = 1.
+
+ Return modes
+
+This operation consumes the `target` handles and produces the
+continuously-peeled loop.
+  }];
+
+  let arguments =
+  (ins TransformHandleTypeInterface:$target,
+   DefaultValuedAttr:$single_iter_opt);
+  // TODO: Return both the peeled loop and the remainder loop.
+  let results = (outs TransformHandleTypeInterface:$transformed);
+
+  let assemblyFormat =
+"$target attr-dict `:` functional-type(operands, results)";
+
+  let extraClassDeclaration = [{
+::mlir::DiagnosedSilenceableFailure applyToOne(
+::mlir::transform::TransformRewriter &rewriter,
+::mlir::Operation *target,
+::mlir::transform::ApplyToEachResultList &results,
+::mlir::transform::TransformState &state);
+  }];
+}
+
 def LoopPipelineOp : Op {
diff --git a/mlir/lib/Dialect/SCF/TransformOps/SCFTransformOps.cpp 
b/mlir/lib/Dialect/SCF/TransformOps/SCFTransformOps.cpp
index 62370604142cd5..dcba6a8b406b21 100644
--- a/mlir/lib/Dialect/SCF/TransformOps/SCFTransformOps.cpp
+++ b/mlir/lib/Dialect/SCF/TransformOps/SCFTransformOps.cpp
@@ -206,6 +206,153 @@ 
transform::LoopPeelOp::applyToOne(transform::TransformRewriter &rewriter,
   return DiagnosedSilenceableFailure::success();
 }
 
+//===-===//
+// LoopContinuousPeelOp
+//===-===//
+
+static LogicalResult splitLoopHelper(RewriterBase &b, scf::ForOp &forOp,
+ scf::ForOp &partialIteration,
+ Value &splitBound) {
+  RewriterBase::InsertionGuard guard(b);
+  auto lbInt = getConstantIntValue(forOp.getLowerBound());
+  auto ubInt = getConstantIntValue(forOp.getUpperBound());
+  auto stepInt = getConstantIntValue(forOp.getStep());
+
+  // No specialization necessary if step already divides upper bound evenly.
+  if (lbInt && ubInt && stepInt && (*ubInt - *lbInt) % *stepInt == 0)
+return failure();
+  // No specialization necessary if step size is 1.
+  if (stepInt == static_cast(1))
+return failure();
+
+  // Create ForOp for partial iteration.
+  b.setInsertionPointAfter(forOp);
+  partialIteration = cast(b.clone(*forOp.getOperation()));
+  partialIteration.getLowerBoundMutable().assign(splitBound);
+  forOp.replaceAllUsesWith(partialIteration->getResults());
+  partialIteration.getInitArgsMutable().assign(forOp->getResults());
+
+  // Set new upper loop bound.
+  b.updateRootInPlace(
+  forOp, [&]() { forOp.getUpperBoundMutable().assign(splitBound); });
+
+  return success();
+}
+
+static scf::IfOp convertSingleIterFor(RewriterBase &b, scf::ForOp &forOp) {
+  Location loc = forOp->getLoc();
+  IRMapping mapping;
+  mapping.map(forOp.getInductionVar(), forOp.getLowerBound());
+  for (auto [arg, operand] :
+   llvm::zip(forOp.getRegionIterArgs(), forOp.getInitsMutable())) {
+mapping.map(arg, operand.get());
+  }
+  b.setInsertionPoint(forOp);
+  auto cond =
+  b.create(loc, arith::CmpIPredicate::slt,
+  forOp.getLowerBound(), forOp.getUpperBound());
+  auto ifOp = b.create(loc, forOp->getResultTypes(), cond, true);
+  // then branch
+  b.setInsert
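
The commit message above describes continuous peeling as turning one loop into a chain of loops whose power-of-two step sizes halve until the step reaches 2^0 = 1. A minimal sketch of the resulting iteration ranges follows, assuming constant bounds and a positive power-of-two initial step. `LoopRange` and `continuousPeel` are hypothetical names for illustration, and skipping empty ranges is a choice of this sketch, not something the patch is known to do.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one loop in the chain: for (i = lb; i < ub; i += step).
struct LoopRange {
  int64_t lb;
  int64_t ub;
  int64_t step;
};

// Splits [lb, ub) into consecutive ranges so that each range length is a
// multiple of its step; the step is halved after every split.
// Assumes `step` is a positive power of two.
std::vector<LoopRange> continuousPeel(int64_t lb, int64_t ub, int64_t step) {
  std::vector<LoopRange> chain;
  int64_t cur = lb;
  while (step >= 1 && cur < ub) {
    // Largest bound reachable from `cur` in whole steps.
    int64_t split = cur + ((ub - cur) / step) * step;
    if (split > cur)
      chain.push_back({cur, split, step}); // empty ranges are skipped
    cur = split;
    step /= 2;
  }
  return chain;
}
```

For example, `continuousPeel(0, 23, 8)` produces the ranges [0, 16) with step 8, [16, 20) with step 4, [20, 22) with step 2, and [22, 23) with step 1, so every iteration runs in the widest loop that still divides its range evenly.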

[clang] [flang][driver] Don't use -whole-archive on Darwin (PR #75393)

2023-12-14 Thread Leandro Lupori via cfe-commits

https://github.com/luporl approved this pull request.

LGTM. Worked fine on my machine.

NOTE: tested by replacing `CommonArgs.cpp` in main, to avoid conflicts.

https://github.com/llvm/llvm-project/pull/75393


[lld] [clang-tools-extra] [llvm] [libcxx] [compiler-rt] [mlir] [libc] [clang] [lldb] [MLIR][LLVM] Add Continuous Loop Peeling transform to SCF (PR #71555)

2023-12-14 Thread via cfe-commits

https://github.com/muneebkhan85 updated 
https://github.com/llvm/llvm-project/pull/71555

>From 7bb2f9793b2a2cccbaa401f6e2ac850b587f2b59 Mon Sep 17 00:00:00 2001
From: Muneeb Khan 
Date: Tue, 7 Nov 2023 23:52:17 +0800
Subject: [PATCH 1/9] [MLIR][LLVM] Add Continuous Loop Peeling transform to SCF

This patch adds continuous loop peeling to scf loop transforms
in the MLIR backend. This transforms the target loop into a
chain of loops, with step sizes that are powers of two and
decrease exponetially across subsequent loops. Originally
authored by Litu Zhou litu.z...@huawei.com.
---
 .../SCF/TransformOps/SCFTransformOps.td   |  36 +
 .../SCF/TransformOps/SCFTransformOps.cpp  | 147 ++
 .../Dialect/SCF/loop-continuous-peel.mlir |  98 
 3 files changed, 281 insertions(+)
 create mode 100644 mlir/test/Dialect/SCF/loop-continuous-peel.mlir

diff --git a/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td 
b/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td
index 14df7e23a430fb..e3d79a7f0ae40f 100644
--- a/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td
+++ b/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td
@@ -147,6 +147,42 @@ def LoopPeelOp : Op {
+  let description = [{
+Transforms the loop into a chain of loops, with step sizes that are
+powers of two and decrease exponetially across subsequent loops.
+The transform is similar to loop.peel in the effect that it creates a loop
+with a step (that is power of 2) to divide the range evenly, with the
+difference that the remaining iterations are spread across similar loops
+with exponentially decreasing step sizes, with the last loop with step size
+of 2^0 = 1.
+
+ Return modes
+
+This operation consumes the `target` handles and produces the
+continuously-peeled loop.
+  }];
+
+  let arguments =
+  (ins TransformHandleTypeInterface:$target,
+   DefaultValuedAttr:$single_iter_opt);
+  // TODO: Return both the peeled loop and the remainder loop.
+  let results = (outs TransformHandleTypeInterface:$transformed);
+
+  let assemblyFormat =
+"$target attr-dict `:` functional-type(operands, results)";
+
+  let extraClassDeclaration = [{
+::mlir::DiagnosedSilenceableFailure applyToOne(
+::mlir::transform::TransformRewriter &rewriter,
+::mlir::Operation *target,
+::mlir::transform::ApplyToEachResultList &results,
+::mlir::transform::TransformState &state);
+  }];
+}
+
 def LoopPipelineOp : Op {
diff --git a/mlir/lib/Dialect/SCF/TransformOps/SCFTransformOps.cpp 
b/mlir/lib/Dialect/SCF/TransformOps/SCFTransformOps.cpp
index 62370604142cd5..dcba6a8b406b21 100644
--- a/mlir/lib/Dialect/SCF/TransformOps/SCFTransformOps.cpp
+++ b/mlir/lib/Dialect/SCF/TransformOps/SCFTransformOps.cpp
@@ -206,6 +206,153 @@ 
transform::LoopPeelOp::applyToOne(transform::TransformRewriter &rewriter,
   return DiagnosedSilenceableFailure::success();
 }
 
+//===-===//
+// LoopContinuousPeelOp
+//===-===//
+
+static LogicalResult splitLoopHelper(RewriterBase &b, scf::ForOp &forOp,
+ scf::ForOp &partialIteration,
+ Value &splitBound) {
+  RewriterBase::InsertionGuard guard(b);
+  auto lbInt = getConstantIntValue(forOp.getLowerBound());
+  auto ubInt = getConstantIntValue(forOp.getUpperBound());
+  auto stepInt = getConstantIntValue(forOp.getStep());
+
+  // No specialization necessary if step already divides upper bound evenly.
+  if (lbInt && ubInt && stepInt && (*ubInt - *lbInt) % *stepInt == 0)
+return failure();
+  // No specialization necessary if step size is 1.
+  if (stepInt == static_cast(1))
+return failure();
+
+  // Create ForOp for partial iteration.
+  b.setInsertionPointAfter(forOp);
+  partialIteration = cast(b.clone(*forOp.getOperation()));
+  partialIteration.getLowerBoundMutable().assign(splitBound);
+  forOp.replaceAllUsesWith(partialIteration->getResults());
+  partialIteration.getInitArgsMutable().assign(forOp->getResults());
+
+  // Set new upper loop bound.
+  b.updateRootInPlace(
+  forOp, [&]() { forOp.getUpperBoundMutable().assign(splitBound); });
+
+  return success();
+}
+
+static scf::IfOp convertSingleIterFor(RewriterBase &b, scf::ForOp &forOp) {
+  Location loc = forOp->getLoc();
+  IRMapping mapping;
+  mapping.map(forOp.getInductionVar(), forOp.getLowerBound());
+  for (auto [arg, operand] :
+   llvm::zip(forOp.getRegionIterArgs(), forOp.getInitsMutable())) {
+mapping.map(arg, operand.get());
+  }
+  b.setInsertionPoint(forOp);
+  auto cond =
+  b.create<arith::CmpIOp>(loc, arith::CmpIPredicate::slt,
+  forOp.getLowerBound(), forOp.getUpperBound());
+  auto ifOp = b.create<scf::IfOp>(loc, forOp->getResultTypes(), cond, true);
+  // then branch
+  b.setInsert

[clang-tools-extra] [libcxx] [llvm] [compiler-rt] [libc] [flang] [clang] [RISCV][MC] Add support for experimental Zimop extension (PR #75182)

2023-12-14 Thread Wang Pengcheng via cfe-commits

https://github.com/wangpc-pp approved this pull request.

LGTM.

https://github.com/llvm/llvm-project/pull/75182


[clang] [Clang][ARM] support arm target attribute, and warning for bad typo (PR #74812)

2023-12-14 Thread via cfe-commits

hstk30-hw wrote:

CI fails in `ninja check-clang` when building `bin/clang-format.exe` on Windows; 
it has nothing to do with my code. @gkistanova 

https://github.com/llvm/llvm-project/pull/74812


[clang] 101083e - [AArch64][SME2] Add SQRSHRN, UQRSHRN, SQRSHRUN builtins for SME2, SVE2p1 (#75325)

2023-12-14 Thread via cfe-commits

Author: Dinar Temirbulatov
Date: 2023-12-14T11:38:45Z
New Revision: 101083e4b7f7e274a42291ba39f5a122f5d9d11d

URL: 
https://github.com/llvm/llvm-project/commit/101083e4b7f7e274a42291ba39f5a122f5d9d11d
DIFF: 
https://github.com/llvm/llvm-project/commit/101083e4b7f7e274a42291ba39f5a122f5d9d11d.diff

LOG: [AArch64][SME2] Add SQRSHRN, UQRSHRN, SQRSHRUN builtins for SME2, SVE2p1 
(#75325)

Add SQRSHRN, UQRSHRN, SQRSHRUN builtins for SME2, SVE2p1.

Added: 
clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qrshr.c

Modified: 
clang/include/clang/Basic/arm_sve.td

Removed: 




diff  --git a/clang/include/clang/Basic/arm_sve.td 
b/clang/include/clang/Basic/arm_sve.td
index 826ff498d83d07..278a791ff760dc 100644
--- a/clang/include/clang/Basic/arm_sve.td
+++ b/clang/include/clang/Basic/arm_sve.td
@@ -2193,6 +2193,15 @@ let TargetGuard = "sme2" in {
   def SVSQDMULH_X4: SInst<"svqdmulh[_{d}_x4]","444", "csil", 
MergeNone, "aarch64_sve_sqdmulh_vgx4",[IsStreaming], []>;
 }
 
+let TargetGuard = "sve2p1|sme2" in {
+  // SQRSHRN / UQRSHRN
+  def SVQRSHRN_X2   : SInst<"svqrshrn[_n]_{0}[_{d}_x2]", "h2i", "i",
MergeNone, "aarch64_sve_sqrshrn_x2", [IsStreamingCompatible], [ImmCheck<1, 
ImmCheck1_16>]>;
+  def SVUQRSHRN_X2  : SInst<"svqrshrn[_n]_{0}[_{d}_x2]", "e2i", "Ui",   
MergeNone, "aarch64_sve_uqrshrn_x2", [IsStreamingCompatible], [ImmCheck<1, 
ImmCheck1_16>]>;
+
+  // SQRSHRUN
+  def SVSQRSHRUN_X2 : SInst<"svqrshrun[_n]_{0}[_{d}_x2]", "e2i", "i",  
MergeNone, "aarch64_sve_sqrshrun_x2", [IsStreamingCompatible], [ImmCheck<1, 
ImmCheck1_16>]>;
+}
+
 let TargetGuard = "sve2p1" in {
   // ZIPQ1, ZIPQ2, UZPQ1, UZPQ2
   def SVZIPQ1 : SInst<"svzipq1[_{d}]", "ddd", "cUcsUsiUilUlbhfd", MergeNone, 
"aarch64_sve_zipq1", [], []>;

diff  --git a/clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qrshr.c 
b/clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qrshr.c
new file mode 100644
index 00..8e8b7203148934
--- /dev/null
+++ b/clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qrshr.c
@@ -0,0 +1,78 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-target-feature +sme2 -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | 
opt -S  -passes=mem2reg,instcombine,tailcallelim | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve2p1 -S 
-disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | opt -S  
-passes=mem2reg,instcombine,tailcallelim | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-target-feature +sme2 -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - -x 
c++ %s | opt -S  -passes=mem2reg,instcombine,tailcallelim | FileCheck %s 
-check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve -target-feature +sme2 -S -disable-O0-optnone -Werror -Wall 
-emit-llvm -o - %s | opt -S  -passes=mem2reg,instcombine,tailcallelim | 
FileCheck %s
+// RUN: %clang_cc1 -DSVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve -target-feature +sme2 -S -disable-O0-optnone -Werror -Wall 
-emit-llvm -o - -x c++ %s | opt -S  -passes=mem2reg,instcombine,tailcallelim | 
FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-target-feature +sme2 -target-feature +sme-f64f64 -S -disable-O0-optnone 
-Werror -Wall -o /dev/null %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve2p1 -S 
-disable-O0-optnone -Werror -Wall -o /dev/null %s
+
+#include <arm_sve.h>
+
+#ifdef SVE_OVERLOADED_FORMS
+// A simple used,unused... macro, long enough to represent any SVE builtin.
+#define SVE_ACLE_FUNC(A1,A2_UNUSED,A3,A4_UNUSED,A5) A1##A3##A5
+#else
+#define SVE_ACLE_FUNC(A1,A2,A3,A4,A5) A1##A2##A3##A4##A5
+#endif
+
+
+// SQRSHRN x 2
+
+// CHECK-LABEL: @test_svqrshrn_s16_s32_x2(
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x i32> 
@llvm.vector.extract.nxv4i32.nxv8i32(<vscale x 8 x i32> [[ZN:%.*]], i64 0)
+// CHECK-NEXT:    [[TMP1:%.*]] = tail call <vscale x 4 x i32> 
@llvm.vector.extract.nxv4i32.nxv8i32(<vscale x 8 x i32> [[ZN]], i64 4)
+// CHECK-NEXT:    [[TMP2:%.*]] = tail call <vscale x 8 x i16> 
@llvm.aarch64.sve.sqrshrn.x2.nxv4i32(<vscale x 4 x i32> [[TMP0]], <vscale x 4 x i32> [[TMP1]], i32 16)
+// CHECK-NEXT:    ret <vscale x 8 x i16> [[TMP2]]
+//
+// CPP-CHECK-LABEL: @_Z24test_svqrshrn_s16_s32_x211svint32x2_t(
+// CPP-CHECK-NEXT:  entry:
+// CPP-CHECK-NEXT:    [[TMP0:%.*]] = tail call <vscale x 4 x i32> 
@llvm.vector.extract.nxv4i32.nxv8i32(<vscale x 8 x i32> [[ZN:%.*]], i64 0)
+// CPP-CHECK-NEXT:    [[TMP1:%.*]] = tail call <vscale x 4 x i32> 
@llvm.vector.extract.nxv4i32.nxv8i32(<vscale x 8 x i32> [[ZN]], i64 4)
+// CPP-CHECK-NEXT:    [[TMP2:%.*]] = tail call <vscale x 8 x i16> 
@llvm.aarch64.sve.sqrshrn.x2.nxv4i32(<vscale x 4 x i32> [[TMP0]], <vscale x 4 x i32> [[TMP1]], i32 16)
+// CPP-CHECK-NEXT:    ret <vscale x 8 x i16> [[TMP2]]
+//
+svint16_t test_svqrshrn_s16_s32_x2(svint32x2_t zn) __arm_streaming_compatible {
+  return SVE_ACLE_FU

[clang] [AArch64][SME2] Add SQRSHRN, UQRSHRN, SQRSHRUN builtins for SME2, SVE2p1 (PR #75325)

2023-12-14 Thread Dinar Temirbulatov via cfe-commits

https://github.com/dtemirbulatov closed 
https://github.com/llvm/llvm-project/pull/75325


[lld] [clang-tools-extra] [llvm] [libcxx] [compiler-rt] [mlir] [libc] [clang] [lldb] [MLIR][LLVM] Add Continuous Loop Peeling transform to SCF (PR #71555)

2023-12-14 Thread via cfe-commits

https://github.com/muneebkhan85 updated 
https://github.com/llvm/llvm-project/pull/71555

>From 7bb2f9793b2a2cccbaa401f6e2ac850b587f2b59 Mon Sep 17 00:00:00 2001
From: Muneeb Khan 
Date: Tue, 7 Nov 2023 23:52:17 +0800
Subject: [PATCH 01/10] [MLIR][LLVM] Add Continuous Loop Peeling transform to
 SCF

This patch adds continuous loop peeling to scf loop transforms
in the MLIR backend. This transforms the target loop into a
chain of loops, with step sizes that are powers of two and
decrease exponentially across subsequent loops. Originally
authored by Litu Zhou litu.z...@huawei.com.
---
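Note for reviewers: the chain of loops described above can be sketched outside
MLIR. The standalone C++ below is only an illustration and is not the code in
this patch. `LoopRange` and `peelChain` are hypothetical names, and the sketch
assumes constant bounds and a power-of-two initial step.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one loop in the chain: [lb, ub) with `step`.
struct LoopRange {
  int64_t lb, ub, step;
};

// Splits [lb, ub) into a chain of loops whose steps halve from `step` down
// to 1. Each loop covers the largest prefix of the remaining range that its
// step divides evenly; loops that would be empty are skipped.
// Assumes lb <= ub and that `step` is a positive power of two.
std::vector<LoopRange> peelChain(int64_t lb, int64_t ub, int64_t step) {
  assert(lb <= ub && step > 0 && (step & (step - 1)) == 0);
  std::vector<LoopRange> chain;
  for (int64_t s = step; s >= 1 && lb < ub; s /= 2) {
    // Largest bound reachable from lb in whole steps of size s.
    int64_t split = lb + ((ub - lb) / s) * s;
    if (split > lb)
      chain.push_back({lb, split, s});
    lb = split;
  }
  return chain;
}
```

For example, `peelChain(0, 23, 8)` yields `[0, 16)` with step 8, `[16, 20)`
with step 4, `[20, 22)` with step 2 and `[22, 23)` with step 1. Every loop
after the first runs at most once, which is the case a single-iteration
rewrite into a conditional would target.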
 .../SCF/TransformOps/SCFTransformOps.td   |  36 +
 .../SCF/TransformOps/SCFTransformOps.cpp  | 147 ++
 .../Dialect/SCF/loop-continuous-peel.mlir |  98 
 3 files changed, 281 insertions(+)
 create mode 100644 mlir/test/Dialect/SCF/loop-continuous-peel.mlir

diff --git a/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td 
b/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td
index 14df7e23a430fb..e3d79a7f0ae40f 100644
--- a/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td
+++ b/mlir/include/mlir/Dialect/SCF/TransformOps/SCFTransformOps.td
@@ -147,6 +147,42 @@ def LoopPeelOp : Op {
+  let description = [{
+Transforms the loop into a chain of loops, with step sizes that are
+powers of two and decrease exponentially across subsequent loops.
+The transform is similar to loop.peel in the effect that it creates a loop
+with a step (that is power of 2) to divide the range evenly, with the
+difference that the remaining iterations are spread across similar loops
+with exponentially decreasing step sizes, with the last loop with step size
+of 2^0 = 1.
+
+ #### Return modes
+
+This operation consumes the `target` handles and produces the
+continuously-peeled loop.
+  }];
+
+  let arguments =
+  (ins TransformHandleTypeInterface:$target,
+   DefaultValuedAttr:$single_iter_opt);
+  // TODO: Return both the peeled loop and the remainder loop.
+  let results = (outs TransformHandleTypeInterface:$transformed);
+
+  let assemblyFormat =
+"$target attr-dict `:` functional-type(operands, results)";
+
+  let extraClassDeclaration = [{
+::mlir::DiagnosedSilenceableFailure applyToOne(
+::mlir::transform::TransformRewriter &rewriter,
+::mlir::Operation *target,
+::mlir::transform::ApplyToEachResultList &results,
+::mlir::transform::TransformState &state);
+  }];
+}
+
 def LoopPipelineOp : Op {
diff --git a/mlir/lib/Dialect/SCF/TransformOps/SCFTransformOps.cpp 
b/mlir/lib/Dialect/SCF/TransformOps/SCFTransformOps.cpp
index 62370604142cd5..dcba6a8b406b21 100644
--- a/mlir/lib/Dialect/SCF/TransformOps/SCFTransformOps.cpp
+++ b/mlir/lib/Dialect/SCF/TransformOps/SCFTransformOps.cpp
@@ -206,6 +206,153 @@ 
transform::LoopPeelOp::applyToOne(transform::TransformRewriter &rewriter,
   return DiagnosedSilenceableFailure::success();
 }
 
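+//===----------------------------------------------------------------------===//
+// LoopContinuousPeelOp
+//===----------------------------------------------------------------------===//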
+
+static LogicalResult splitLoopHelper(RewriterBase &b, scf::ForOp &forOp,
+ scf::ForOp &partialIteration,
+ Value &splitBound) {
+  RewriterBase::InsertionGuard guard(b);
+  auto lbInt = getConstantIntValue(forOp.getLowerBound());
+  auto ubInt = getConstantIntValue(forOp.getUpperBound());
+  auto stepInt = getConstantIntValue(forOp.getStep());
+
+  // No specialization necessary if step already divides upper bound evenly.
+  if (lbInt && ubInt && stepInt && (*ubInt - *lbInt) % *stepInt == 0)
+return failure();
+  // No specialization necessary if step size is 1.
+  if (stepInt == static_cast<int64_t>(1))
+return failure();
+
+  // Create ForOp for partial iteration.
+  b.setInsertionPointAfter(forOp);
+  partialIteration = cast<scf::ForOp>(b.clone(*forOp.getOperation()));
+  partialIteration.getLowerBoundMutable().assign(splitBound);
+  forOp.replaceAllUsesWith(partialIteration->getResults());
+  partialIteration.getInitArgsMutable().assign(forOp->getResults());
+
+  // Set new upper loop bound.
+  b.updateRootInPlace(
+  forOp, [&]() { forOp.getUpperBoundMutable().assign(splitBound); });
+
+  return success();
+}
+
+static scf::IfOp convertSingleIterFor(RewriterBase &b, scf::ForOp &forOp) {
+  Location loc = forOp->getLoc();
+  IRMapping mapping;
+  mapping.map(forOp.getInductionVar(), forOp.getLowerBound());
+  for (auto [arg, operand] :
+   llvm::zip(forOp.getRegionIterArgs(), forOp.getInitsMutable())) {
+mapping.map(arg, operand.get());
+  }
+  b.setInsertionPoint(forOp);
+  auto cond =
+  b.create<arith::CmpIOp>(loc, arith::CmpIPredicate::slt,
+  forOp.getLowerBound(), forOp.getUpperBound());
+  auto ifOp = b.create<scf::IfOp>(loc, forOp->getResultTypes(), cond, true);
+  // then branch
+  b.setIns

[clang] [Driver] Remove all vendor triples (PR #75459)

2023-12-14 Thread Timm Baeder via cfe-commits

tbaederr wrote:

Care to explain?

https://github.com/llvm/llvm-project/pull/75459


[clang] [Clang][ARM] support arm target attribute, and warning for bad typo (PR #74812)

2023-12-14 Thread David Spickett via cfe-commits

DavidSpickett wrote:

Not so sure about that:
```
LINK: command "C:\BuildTools\VC\Tools\MSVC\14.29.30133\bin\Hostx64\x64\link.exe 
/nologo 
tools\clang\tools\clang-format\CMakeFiles\clang-format.dir\ClangFormat.cpp.obj 
tools\clang\tools\clang-format\CMakeFiles\clang-format.dir\C_\ws\src\llvm\resources\windows_version_resource.rc.res
 /out:bin\clang-format.exe /implib:lib\clang-format.lib 
/pdb:bin\clang-format.pdb /version:0.0 /machine:x64 /STACK:1000 
/INCREMENTAL:NO /subsystem:console lib\LLVMSupport.lib lib\clangBasic.lib 
lib\clangFormat.lib lib\clangRewrite.lib lib\clangToolingCore.lib 
lib\clangToolingInclusions.lib lib\clangToolingCore.lib lib\clangRewrite.lib 
lib\clangLex.lib lib\clangBasic.lib lib\LLVMFrontendOpenMP.lib 
lib\LLVMScalarOpts.lib lib\LLVMAggressiveInstCombine.lib 
lib\LLVMInstCombine.lib lib\LLVMFrontendOffloading.lib 
lib\LLVMTransformUtils.lib lib\LLVMAnalysis.lib lib\LLVMProfileData.lib 
lib\LLVMSymbolize.lib lib\LLVMDebugInfoPDB.lib C:\BuildTools\DIA 
SDK\lib\amd64\diaguids.lib lib\LLVMDebugInfoMSF.lib lib\LLVMDebugInfoBTF.lib 
lib\LLVMDebugInfoDWARF.lib lib\LLVMObject.lib lib\LLVMIRReader.lib 
lib\LLVMBitReader.lib lib\LLVMAsmParser.lib lib\LLVMCore.lib 
lib\LLVMRemarks.lib lib\LLVMBitstreamReader.lib lib\LLVMMCParser.lib 
lib\LLVMMC.lib lib\LLVMDebugInfoCodeView.lib lib\LLVMTextAPI.lib 
lib\LLVMBinaryFormat.lib lib\LLVMTargetParser.lib lib\LLVMSupport.lib psapi.lib 
shell32.lib ole32.lib uuid.lib advapi32.lib WS2_32.lib delayimp.lib 
-delayload:shell32.dll -delayload:ole32.dll lib\LLVMDemangle.lib kernel32.lib 
user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib 
comdlg32.lib advapi32.lib /MANIFEST 
/MANIFESTFILE:bin\clang-format.exe.manifest" failed (exit code 1120) with the 
following output:
clangBasic.lib(ARM.cpp.obj) : error LNK2019: unresolved external symbol 
"private: void __cdecl clang::APValue::DestroyDataAndMakeUninit(void)" 
(?DestroyDataAndMakeUninit@APValue@clang@@AEAAXXZ) referenced in function 
"public: __cdecl clang::APValue::~APValue(void)" (??1APValue@clang@@QEAA@XZ)
bin\clang-format.exe : fatal error LNK1120: 1 unresolved externals
```

```
clangBasic.lib(ARM.cpp.obj) : error LNK2019: unresolved external symbol 
"private: void __cdecl clang::APValue::DestroyDataAndMakeUninit(void)"
```
Which is the file you're modifying.

I see that you have added:
```
#include "clang/AST/Attr.h"
```
`DestroyDataAndMakeUninit` is in `clang/lib/AST/APValue.cpp`, which is built 
into the library `clangAST`.

`clang/lib/Basic/Targets/ARM.cpp` is built into `clangBasic` which does not 
depend on `clangAST`, but does `target_link_libraries` for it. I'm not sure of 
the difference there, why do one or the other. (clang/lib/Basic/CMakeLists.txt)

(if you want to know what library a cpp file ends up in, look in the same 
folder for a CMakeLists.txt, that usually has the name, one folder up if not 
and so on)

So I guess that on Linux, `target_link_libraries` does what's required but on 
Windows it does not.
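
The error pattern above can be reproduced in miniature. This is only a sketch 
with hypothetical names (`Value`, `destroyData`, `useValue`), not the real 
`APValue` code: a destructor defined inline in a header calls a member that is 
defined out of line, so any library whose code destroys such an object needs 
the defining library at link time.

```cpp
#include <cassert>

// Stand-in for a class declared in a header that another library includes.
// All names in this sketch are hypothetical.
class Value {
public:
  // Inline destructor: every translation unit that destroys a Value emits a
  // call to destroyData(), so it needs that symbol when linking.
  ~Value() {
    if (hasData)
      destroyData();
  }
  static int destroyed; // Counts destructor-driven cleanups, for the demo.

private:
  void destroyData(); // Declared here, defined out of line.
  bool hasData = true;
};

// In the layout discussed above, this definition would live in a .cpp file
// that belongs to a different library. It is kept in the same file only so
// that the sketch links on its own.
int Value::destroyed = 0;
void Value::destroyData() {
  hasData = false;
  ++destroyed;
}

// Stand-in for code in the including library that creates a temporary.
int useValue() {
  int before = Value::destroyed;
  { Value v; } // Destructor runs here and calls destroyData().
  assert(Value::destroyed == before + 1);
  return Value::destroyed;
}
```

Here both halves are in one file so that it links. If the definition of 
`destroyData` is moved into a library that is not on the link line, the linker 
reports the same kind of unresolved external as in the log.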

https://github.com/llvm/llvm-project/pull/74812


[lld] [clang-tools-extra] [llvm] [libcxx] [compiler-rt] [mlir] [libc] [clang] [lldb] [MLIR][LLVM] Add Continuous Loop Peeling transform to SCF (PR #71555)

2023-12-14 Thread via cfe-commits

https://github.com/muneebkhan85 edited 
https://github.com/llvm/llvm-project/pull/71555


[clang] [Clang][ARM] support arm target attribute, and warning for bad typo (PR #74812)

2023-12-14 Thread David Spickett via cfe-commits

DavidSpickett wrote:

Also it seems that `clangAST` links to `clangBasic` but it doesn't `DEPENDS` on 
it? Not sure what that is trying to achieve.

https://github.com/llvm/llvm-project/pull/74812


[libc] [clang] [lld] [mlir] [libcxx] [lldb] [compiler-rt] [clang-tools-extra] [llvm] [MLIR][LLVM] Add Continuous Loop Peeling transform to SCF (PR #71555)

2023-12-14 Thread via cfe-commits


@@ -105,6 +106,168 @@ static void specializeForLoopForUnrolling(ForOp op) {
   op.erase();
 }
 
+static LogicalResult splitLoopHelper(RewriterBase &b, scf::ForOp &forOp,
+ scf::ForOp &partialIteration,
+ Value &splitBound) {
+  RewriterBase::InsertionGuard guard(b);
+  auto lbInt = getConstantIntValue(forOp.getLowerBound());
+  auto ubInt = getConstantIntValue(forOp.getUpperBound());
+  auto stepInt = getConstantIntValue(forOp.getStep());
+
+  // No specialization necessary if step already divides upper bound evenly.
+  if (lbInt && ubInt && stepInt && (*ubInt - *lbInt) % *stepInt == 0)
+return failure();
+  // No specialization necessary if step size is 1.
+  if (stepInt == static_cast<int64_t>(1))
+return failure();
+
+  // Create ForOp for partial iteration.
+  b.setInsertionPointAfter(forOp);
+  partialIteration = cast<scf::ForOp>(b.clone(*forOp.getOperation()));
+  partialIteration.getLowerBoundMutable().assign(splitBound);
+  forOp.replaceAllUsesWith(partialIteration->getResults());
+  partialIteration.getInitArgsMutable().assign(forOp->getResults());
+
+  // Set new upper loop bound.
+  b.updateRootInPlace(
+  forOp, [&]() { forOp.getUpperBoundMutable().assign(splitBound); });
+
+  return success();
+}
+
+static scf::IfOp convertSingleIterFor(RewriterBase &b, scf::ForOp &forOp) {
+  Location loc = forOp->getLoc();
+  IRMapping mapping;
+  mapping.map(forOp.getInductionVar(), forOp.getLowerBound());
+  for (auto [arg, operand] :
+   llvm::zip(forOp.getRegionIterArgs(), forOp.getInitsMutable())) {
+mapping.map(arg, operand.get());
+  }
+  b.setInsertionPoint(forOp);
+  auto cond =
+  b.create<arith::CmpIOp>(loc, arith::CmpIPredicate::slt,
+  forOp.getLowerBound(), forOp.getUpperBound());
+  auto ifOp = b.create<scf::IfOp>(loc, forOp->getResultTypes(), cond, true);
+  // then branch
+  b.setInsertionPointToStart(ifOp.thenBlock());
+  for (Operation &op : forOp.getBody()->getOperations()) {
+b.clone(op, mapping);

muneebkhan85 wrote:

Done. 

https://github.com/llvm/llvm-project/pull/71555


[clang] 07e3c24 - [clang][Interp] Support empty initlist initializers for complex types

2023-12-14 Thread Timm Bäder via cfe-commits

Author: Timm Bäder
Date: 2023-12-14T12:53:40+01:00
New Revision: 07e3c245ba2c8560123cf4559678e0ac2542

URL: 
https://github.com/llvm/llvm-project/commit/07e3c245ba2c8560123cf4559678e0ac2542
DIFF: 
https://github.com/llvm/llvm-project/commit/07e3c245ba2c8560123cf4559678e0ac2542.diff

LOG: [clang][Interp] Support empty initlist initializers for complex types

Differential Revision: https://reviews.llvm.org/D147369

Added: 


Modified: 
clang/lib/AST/Interp/ByteCodeExprGen.cpp
clang/test/AST/Interp/complex.cpp

Removed: 




diff  --git a/clang/lib/AST/Interp/ByteCodeExprGen.cpp 
b/clang/lib/AST/Interp/ByteCodeExprGen.cpp
index efa98c6517a2ef..a4a00ddab65036 100644
--- a/clang/lib/AST/Interp/ByteCodeExprGen.cpp
+++ b/clang/lib/AST/Interp/ByteCodeExprGen.cpp
@@ -672,18 +672,28 @@ bool ByteCodeExprGen::VisitInitListExpr(const 
InitListExpr *E) {
   }
 
   if (T->isAnyComplexType()) {
-unsigned InitIndex = 0;
-for (const Expr *Init : E->inits()) {
-  PrimType InitT = classifyPrim(Init->getType());
-
-  if (!this->visit(Init))
-return false;
+unsigned NumInits = E->getNumInits();
+QualType ElemQT = E->getType()->getAs<ComplexType>()->getElementType();
+PrimType ElemT = classifyPrim(ElemQT);
+if (NumInits == 0) {
+  // Zero-initialize both elements.
+  for (unsigned I = 0; I < 2; ++I) {
+if (!this->visitZeroInitializer(ElemT, ElemQT, E))
+  return false;
+if (!this->emitInitElem(ElemT, I, E))
+  return false;
+  }
+} else if (NumInits == 2) {
+  unsigned InitIndex = 0;
+  for (const Expr *Init : E->inits()) {
+if (!this->visit(Init))
+  return false;
 
-  if (!this->emitInitElem(InitT, InitIndex, E))
-return false;
-  ++InitIndex;
+if (!this->emitInitElem(ElemT, InitIndex, E))
+  return false;
+++InitIndex;
+  }
 }
-assert(InitIndex == 2);
 return true;
   }
 

diff  --git a/clang/test/AST/Interp/complex.cpp 
b/clang/test/AST/Interp/complex.cpp
index 4fd2b5cfd73640..ba9fcd39fdd777 100644
--- a/clang/test/AST/Interp/complex.cpp
+++ b/clang/test/AST/Interp/complex.cpp
@@ -29,5 +29,28 @@ static_assert(__real(I1) == 1, "");
 static_assert(__imag(I1) == 2, "");
 
 
+constexpr _Complex double D1 = {};
+static_assert(__real(D1) == 0, "");
+static_assert(__imag(D1) == 0, "");
+
+constexpr _Complex int I2 = {};
+static_assert(__real(I2) == 0, "");
+static_assert(__imag(I2) == 0, "");
+
+
+#if 0
+/// FIXME: This should work in the new interpreter.
+constexpr _Complex double D2 = {12};
+static_assert(__real(D2) == 12, "");
+static_assert(__imag(D2) == 12, "");
+
+constexpr _Complex int I3 = {15};
+static_assert(__real(I3) == 15, "");
+static_assert(__imag(I3) == 15, "");
+#endif
+
+
+
+
 /// FIXME: This should work in the new interpreter as well.
 // constexpr _Complex _BitInt(8) A = 0;// = {4};





[clang] [Clang][ARM] support arm target attribute, and warning for bad typo (PR #74812)

2023-12-14 Thread David Spickett via cfe-commits

DavidSpickett wrote:

I was mistaken about the `target_link_libraries`, that's what `clangBasic` 
links to not `clangAST`.

It's possible that `clangBasic` now needs to depend on `clangAST`, assuming 
cmake and the linker are ok with that.

https://github.com/llvm/llvm-project/pull/74812


[clang] [Clang][AArch64]Add QCVTN builtin to SVE2.1 (PR #75454)

2023-12-14 Thread via cfe-commits

https://github.com/CarolineConcatto updated 
https://github.com/llvm/llvm-project/pull/75454

>From 3508b4fbd9b4b9b51553a590b237e443fb58e098 Mon Sep 17 00:00:00 2001
From: Caroline Concatto 
Date: Thu, 14 Dec 2023 09:50:36 +
Subject: [PATCH 1/2] [Clang][AArch64]Add QCVTN builtin to SVE2.1

 ``` c
   // All the intrinsics below are [SVE2.1 or SME2]
   // Variants are also available for _u16[_s32]_x2 and _u16[_u32]_x2
   svint16_t svqcvtn_s16[_s32_x2](svint32x2_t zn);
   ```

According to PR#257[1]

[1]https://github.com/ARM-software/acle/pull/257
---
 clang/include/clang/Basic/arm_sve.td  |  4 +-
 .../acle_sve2p1_qcvtn.c   | 78 +++
 2 files changed, 81 insertions(+), 1 deletion(-)
 create mode 100644 
clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qcvtn.c

diff --git a/clang/include/clang/Basic/arm_sve.td 
b/clang/include/clang/Basic/arm_sve.td
index db6f17d1c493af..6979e65fbf4cb4 100644
--- a/clang/include/clang/Basic/arm_sve.td
+++ b/clang/include/clang/Basic/arm_sve.td
@@ -2266,11 +2266,13 @@ let TargetGuard = "sme2" in {
 //
 // Multi-vector saturating extract narrow and interleave
 //
-let TargetGuard = "sme2" in {
+let TargetGuard = "sme2|sve2p1" in {
   def SVQCVTN_S16_S32_X2 : SInst<"svqcvtn_s16[_{d}_x2]", "h2.d", "i", 
MergeNone, "aarch64_sve_sqcvtn_x2", [IsStreamingCompatible], []>;
   def SVQCVTN_U16_U32_X2 : SInst<"svqcvtn_u16[_{d}_x2]", "e2.d", "Ui", 
MergeNone, "aarch64_sve_uqcvtn_x2", [IsStreamingCompatible], []>;
   def SVQCVTN_U16_S32_X2 : SInst<"svqcvtn_u16[_{d}_x2]", "e2.d", "i", 
MergeNone, "aarch64_sve_sqcvtun_x2", [IsStreamingCompatible], []>;
+}
 
+let TargetGuard = "sme2" in {
   def SVQCVTN_S8_S32_X4 : SInst<"svqcvtn_s8[_{d}_x4]", "q4.d", "i", MergeNone, 
"aarch64_sve_sqcvtn_x4", [IsStreaming], []>;
   def SVQCVTN_U8_U32_X4 : SInst<"svqcvtn_u8[_{d}_x4]", "b4.d", "Ui", 
MergeNone, "aarch64_sve_uqcvtn_x4", [IsStreaming], []>;
   def SVQCVTN_U8_S32_X4 : SInst<"svqcvtn_u8[_{d}_x4]", "b4.d", "i", MergeNone, 
"aarch64_sve_sqcvtun_x4", [IsStreaming], []>;
diff --git a/clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qcvtn.c 
b/clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qcvtn.c
new file mode 100644
index 00..477b7b0a08e671
--- /dev/null
+++ b/clang/test/CodeGen/aarch64-sve2p1-intrinsics/acle_sve2p1_qcvtn.c
@@ -0,0 +1,78 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+
+// REQUIRES: aarch64-registered-target
+
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve2p1 
-target-feature +bf16 -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | 
opt -S -p mem2reg,instcombine,tailcallelim | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve1p1 
-target-feature +sme2 -target-feature +bf16 -DSME2_STANDALONE_TEST -S 
-disable-O0-optnone -Werror -Wall -emit-llvm -o - -x c++ %s | opt -S -p 
mem2reg,instcombine,tailcallelim | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1  -D__SVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve2p1 -target-feature +bf16 -S -disable-O0-optnone -Werror 
-Wall -emit-llvm -o - %s | opt -S -p mem2reg,instcombine,tailcallelim | 
FileCheck %s
+// RUN: %clang_cc1  -D__SVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve2p1 -target-feature +bf16 -S -disable-O0-optnone -Werror 
-Wall -emit-llvm -o - -x c++ %s | opt -S -p mem2reg,instcombine,tailcallelim | 
FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve2p1 
-target-feature +bf16 -S -disable-O0-optnone -Werror -Wall -o /dev/null %s
+
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-target-feature +sme2 -target-feature +bf16 -DSME2_STANDALONE_TEST -S 
-disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | opt -S -p 
mem2reg,instcombine,tailcallelim | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-target-feature +sme2 -target-feature +bf16 -DSME2_STANDALONE_TEST -S 
-disable-O0-optnone -Werror -Wall -emit-llvm -o - -x c++ %s | opt -S -p 
mem2reg,instcombine,tailcallelim | FileCheck %s -check-prefix=CPP-CHECK
+// RUN: %clang_cc1  -D__SVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve -target-feature +sme2 -target-feature +bf16 
-DSME2_STANDALONE_TEST -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - %s 
| opt -S -p mem2reg,instcombine,tailcallelim | FileCheck %s
+// RUN: %clang_cc1  -D__SVE_OVERLOADED_FORMS -triple aarch64-none-linux-gnu 
-target-feature +sve -target-feature +sme2 -target-feature +bf16 
-DSME2_STANDALONE_TEST -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - -x 
c++ %s | opt -S -p mem2reg,instcombine,tailcallelim | FileCheck %s 
-check-prefix=CPP-CHECK
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve 
-target-feature +sme2 -target-feature +bf16 -DSME2_STANDALONE_TEST -S 
-disable-O0-optnone -Werror -Wall -o /

[libc] [clang] [lld] [mlir] [libcxx] [lldb] [compiler-rt] [clang-tools-extra] [llvm] [MLIR][LLVM] Add Continuous Loop Peeling transform to SCF (PR #71555)

2023-12-14 Thread via cfe-commits

muneebkhan85 wrote:

ping @matthias-springer 

https://github.com/llvm/llvm-project/pull/71555


[clang] [Clang][SME2] Add multi-vector zip & unzip builtins (PR #74841)

2023-12-14 Thread Dinar Temirbulatov via cfe-commits

https://github.com/dtemirbulatov approved this pull request.

LGTM, with David's request to rename the acle_sme2_* tests to acle_sme2_vector_* 
in clang/test/CodeGen/aarch64-sme2-intrinsics

https://github.com/llvm/llvm-project/pull/74841
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [Clang][AArch64]Add QCVTN builtin to SVE2.1 (PR #75454)

2023-12-14 Thread Kerry McLaughlin via cfe-commits


@@ -0,0 +1,78 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py
+
+// REQUIRES: aarch64-registered-target
+
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve2p1 
-target-feature +bf16 -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | 
opt -S -p mem2reg,instcombine,tailcallelim | FileCheck %s
+// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve1p1 
-target-feature +sme2 -target-feature +bf16 -S -disable-O0-optnone -Werror 
-Wall -emit-llvm -o - -x c++ %s | opt -S -p mem2reg,instcombine,tailcallelim | 
FileCheck %s -check-prefix=CPP-CHECK

kmclaughlin-arm wrote:

This run line uses +sme2, but I think it should only be for sve2p1?

It also contains +sve1p1, which I think is meant to be +sve2p1 :)

https://github.com/llvm/llvm-project/pull/75454


[clang] [flang][driver] Don't use -whole-archive on Darwin (PR #75393)

2023-12-14 Thread Andrzej Warzyński via cfe-commits

https://github.com/banach-space updated 
https://github.com/llvm/llvm-project/pull/75393

From 95b4db0690d5725011a741f81237f5954bc08ff8 Mon Sep 17 00:00:00 2001
From: Andrzej Warzynski 
Date: Wed, 13 Dec 2023 22:05:07 +
Subject: [PATCH] [flang][driver] Don't use -whole-archive on Darwin

Direct follow-up of #7312 - the linker on Darwin does not support
`-whole-archive`, so that needs to be removed from the linker
invocation.

For context:
  * https://github.com/llvm/llvm-project/pull/7312
---
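For reference, the selection logic after this change can be modelled outside
the driver. This is a reduced sketch, not the driver code: `fortranMainArgs`
and its two boolean parameters are hypothetical stand-ins for
`WholeArchiveActive` and `TC.getTriple().isMacOSX()`.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the arguments added for the Fortran runtime.
// wholeArchiveActive: the user's -Wl, options already left --whole-archive on.
// isMacOSX: the target linker does not accept --whole-archive.
std::vector<std::string> fortranMainArgs(bool wholeArchiveActive,
                                         bool isMacOSX) {
  std::vector<std::string> args;
  if (!wholeArchiveActive && !isMacOSX) {
    // Bracket Fortran_main so that all of its objects are pulled in.
    args.push_back("--whole-archive");
    args.push_back("-lFortran_main");
    args.push_back("--no-whole-archive");
  } else {
    args.push_back("-lFortran_main");
  }
  // Regular linkage of the remaining runtime libraries.
  args.push_back("-lFortranRuntime");
  args.push_back("-lFortranDecimal");
  return args;
}
```

With both flags false the model returns five arguments starting with
`--whole-archive`; when either flag is true it returns only the three `-l`
options.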
 clang/lib/Driver/ToolChains/CommonArgs.cpp | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp 
b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index 01fb0718b4079d..ac1abd82e49768 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -1132,24 +1132,29 @@ void tools::addFortranRuntimeLibs(const ToolChain &TC, 
const ArgList &Args,
   // --whole-archive flag to the link line.  If it's not, add a proper
   // --whole-archive/--no-whole-archive bracket to the link line.
   bool WholeArchiveActive = false;
-  for (auto *Arg : Args.filtered(options::OPT_Wl_COMMA))
-if (Arg)
+  for (auto *Arg : Args.filtered(options::OPT_Wl_COMMA)) {
+if (Arg) {
   for (StringRef ArgValue : Arg->getValues()) {
 if (ArgValue == "--whole-archive")
   WholeArchiveActive = true;
 if (ArgValue == "--no-whole-archive")
   WholeArchiveActive = false;
   }
+}
+  }
 
-  if (!WholeArchiveActive)
+  if (!WholeArchiveActive && !TC.getTriple().isMacOSX()) {
 CmdArgs.push_back("--whole-archive");
-  CmdArgs.push_back("-lFortran_main");
-  if (!WholeArchiveActive)
+CmdArgs.push_back("-lFortran_main");
 CmdArgs.push_back("--no-whole-archive");
+  } else {
+CmdArgs.push_back("-lFortran_main");
+  }
+
+  // Perform regular linkage of the remaining runtime libraries.
+  CmdArgs.push_back("-lFortranRuntime");
+  CmdArgs.push_back("-lFortranDecimal");
 }
-// Perform regular linkage of the remaining runtime libraries.
-CmdArgs.push_back("-lFortranRuntime");
-CmdArgs.push_back("-lFortranDecimal");
   } else {
 if (LinkFortranMain) {
   unsigned RTOptionID = options::OPT__SLASH_MT;
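
For readers skimming the hunk above, the control flow after the patch can be sketched as a standalone function. This is a minimal sketch and not the driver code: `fortranMainArgs`, `LinkerArgs` and `IsDarwin` are hypothetical names, with `IsDarwin` standing in for `TC.getTriple().isMacOSX()` and `LinkerArgs` for the values passed through `-Wl,`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the patched logic; not the real clang driver API.
// LinkerArgs: values forwarded via -Wl, on the command line.
// IsDarwin:   stand-in for TC.getTriple().isMacOSX() in the patch.
std::vector<std::string>
fortranMainArgs(const std::vector<std::string> &LinkerArgs, bool IsDarwin) {
  // Track whether the user left a --whole-archive bracket open.
  bool WholeArchiveActive = false;
  for (const std::string &Arg : LinkerArgs) {
    if (Arg == "--whole-archive")
      WholeArchiveActive = true;
    if (Arg == "--no-whole-archive")
      WholeArchiveActive = false;
  }

  std::vector<std::string> CmdArgs;
  if (!WholeArchiveActive && !IsDarwin) {
    // Bracket Fortran_main so the linker pulls in every member.
    CmdArgs.push_back("--whole-archive");
    CmdArgs.push_back("-lFortran_main");
    CmdArgs.push_back("--no-whole-archive");
  } else {
    // Darwin, or a bracket is already open: link it plainly.
    CmdArgs.push_back("-lFortran_main");
  }

  // Regular linkage of the remaining runtime libraries.
  CmdArgs.push_back("-lFortranRuntime");
  CmdArgs.push_back("-lFortranDecimal");
  return CmdArgs;
}
```

Calling `fortranMainArgs({}, true)` yields only the three `-l` entries, which is the Darwin behaviour the patch is after.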



[clang] [flang][driver] Don't use -whole-archive on Darwin (PR #75393)

2023-12-14 Thread Andrzej Warzyński via cfe-commits

banach-space wrote:

> LGTM. Worked fine on my machine.
> 
> NOTE: tested by replacing `CommonArgs.cpp` in main, to avoid conflicts.

Thanks for checking and apologies for the merge conflict - I thought that I was 
up to date :( I've just rebased and force-pushed. 

I will be landing this today if there are no further comments 🙏🏻 

https://github.com/llvm/llvm-project/pull/75393


[clang] 497480b - [clang][Interp] IntegralComplexToBoolean casts

2023-12-14 Thread Timm Bäder via cfe-commits

Author: Timm Bäder
Date: 2023-12-14T13:11:00+01:00
New Revision: 497480b38a49977b67c33651b3f29d5f1d151793

URL: 
https://github.com/llvm/llvm-project/commit/497480b38a49977b67c33651b3f29d5f1d151793
DIFF: 
https://github.com/llvm/llvm-project/commit/497480b38a49977b67c33651b3f29d5f1d151793.diff

LOG: [clang][Interp] IntegralComplexToBoolean casts

Differential Revision: https://reviews.llvm.org/D148426

Added: 


Modified: 
clang/lib/AST/Interp/ByteCodeExprGen.cpp
clang/lib/AST/Interp/ByteCodeExprGen.h
clang/test/AST/Interp/complex.cpp

Removed: 




diff  --git a/clang/lib/AST/Interp/ByteCodeExprGen.cpp 
b/clang/lib/AST/Interp/ByteCodeExprGen.cpp
index a4a00ddab65036..c428446386c04b 100644
--- a/clang/lib/AST/Interp/ByteCodeExprGen.cpp
+++ b/clang/lib/AST/Interp/ByteCodeExprGen.cpp
@@ -222,6 +222,52 @@ bool ByteCodeExprGen::VisitCastExpr(const 
CastExpr *CE) {
 return this->emitNE(PtrT, CE);
   }
 
+  case CK_IntegralComplexToBoolean: {
+std::optional ElemT =
+classifyComplexElementType(SubExpr->getType());
+if (!ElemT)
+  return false;
+// We emit the expression (__real(E) != 0 || __imag(E) != 0)
+// for us, that means (bool)E[0] || (bool)E[1]
+if (!this->visit(SubExpr))
+  return false;
+if (!this->emitConstUint8(0, CE))
+  return false;
+if (!this->emitArrayElemPtrUint8(CE))
+  return false;
+if (!this->emitLoadPop(*ElemT, CE))
+  return false;
+if (!this->emitCast(*ElemT, PT_Bool, CE))
+  return false;
+// We now have the bool value of E[0] on the stack.
+LabelTy LabelTrue = this->getLabel();
+if (!this->jumpTrue(LabelTrue))
+  return false;
+
+if (!this->emitConstUint8(1, CE))
+  return false;
+if (!this->emitArrayElemPtrPopUint8(CE))
+  return false;
+if (!this->emitLoadPop(*ElemT, CE))
+  return false;
+if (!this->emitCast(*ElemT, PT_Bool, CE))
+  return false;
+// Leave the boolean value of E[1] on the stack.
+LabelTy EndLabel = this->getLabel();
+this->jump(EndLabel);
+
+this->emitLabel(LabelTrue);
+if (!this->emitPopPtr(CE))
+  return false;
+if (!this->emitConstBool(true, CE))
+  return false;
+
+this->fallthrough(EndLabel);
+this->emitLabel(EndLabel);
+
+return true;
+  }
+
   case CK_ToVoid:
 return discard(SubExpr);
 
@@ -1673,7 +1719,8 @@ template  bool 
ByteCodeExprGen::visit(const Expr *E) {
 return this->discard(E);
 
   // Create local variable to hold the return value.
-  if (!E->isGLValue() && !classify(E->getType())) {
+  if (!E->isGLValue() && !E->getType()->isAnyComplexType() &&
+  !classify(E->getType())) {
 std::optional LocalIndex = allocateLocal(E, /*IsExtended=*/true);
 if (!LocalIndex)
   return false;
@@ -1859,6 +1906,9 @@ bool ByteCodeExprGen::dereference(
 return Indirect(*T);
   }
 
+  if (LV->getType()->isAnyComplexType())
+return visit(LV);
+
   return false;
 }
 

diff  --git a/clang/lib/AST/Interp/ByteCodeExprGen.h 
b/clang/lib/AST/Interp/ByteCodeExprGen.h
index bc1d5d11a11513..1c4739544454af 100644
--- a/clang/lib/AST/Interp/ByteCodeExprGen.h
+++ b/clang/lib/AST/Interp/ByteCodeExprGen.h
@@ -285,6 +285,14 @@ class ByteCodeExprGen : public 
ConstStmtVisitor, bool>,
   }
 
   bool emitPrimCast(PrimType FromT, PrimType ToT, QualType ToQT, const Expr 
*E);
+  std::optional classifyComplexElementType(QualType T) const {
+assert(T->isAnyComplexType());
+
+QualType ElemType = T->getAs()->getElementType();
+
+return this->classify(ElemType);
+  }
+
   bool emitRecordDestruction(const Descriptor *Desc);
   unsigned collectBaseOffset(const RecordType *BaseType,
  const RecordType *DerivedType);

diff  --git a/clang/test/AST/Interp/complex.cpp 
b/clang/test/AST/Interp/complex.cpp
index ba9fcd39fdd777..dbdbc2f7356e6b 100644
--- a/clang/test/AST/Interp/complex.cpp
+++ b/clang/test/AST/Interp/complex.cpp
@@ -49,8 +49,21 @@ static_assert(__real(I3) == 15, "");
 static_assert(__imag(I3) == 15, "");
 #endif
 
-
-
-
 /// FIXME: This should work in the new interpreter as well.
 // constexpr _Complex _BitInt(8) A = 0;// = {4};
+
+namespace CastToBool {
+  constexpr _Complex int F = {0, 1};
+  static_assert(F, "");
+  constexpr _Complex int F2 = {1, 0};
+  static_assert(F2, "");
+  constexpr _Complex int F3 = {0, 0};
+  static_assert(!F3, "");
+
+  constexpr _Complex unsigned char F4 = {0, 1};
+  static_assert(F4, "");
+  constexpr _Complex unsigned char F5 = {1, 0};
+  static_assert(F5, "");
+  constexpr _Complex unsigned char F6 = {0, 0};
+  static_assert(!F6, "");
+}
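
The semantics the new cast case implements, `__real(E) != 0 || __imag(E) != 0` with short-circuiting on the real part, can be illustrated outside the interpreter. This is a minimal sketch only: `ComplexInt` and `complexToBool` are hypothetical names that model a `_Complex int` as a two-element aggregate, and none of it uses the `ByteCodeExprGen` API.

```cpp
#include <cassert>

// Hypothetical two-element model of a _Complex int value (E[0], E[1]).
struct ComplexInt {
  int Real;
  int Imag;
};

// Mirrors the emitted sequence: test the real part first and only look
// at the imaginary part when the real part converts to false.
constexpr bool complexToBool(ComplexInt C) {
  return C.Real != 0 || C.Imag != 0;
}

// Same value pairs as the CastToBool tests in complex.cpp.
static_assert(complexToBool({0, 1}), "");
static_assert(complexToBool({1, 0}), "");
static_assert(!complexToBool({0, 0}), "");
```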





[clang] [flang][driver] Don't use -whole-archive on Darwin (PR #75393)

2023-12-14 Thread Michael Klemm via cfe-commits

mjklemm wrote:

> > LGTM. Worked fine on my machine.
> >
> > NOTE: tested by replacing `CommonArgs.cpp` in main, to avoid conflicts.
>
> Thanks for checking and apologies for the merge conflict - I thought that I 
> was up to date :( I've just rebased and force-pushed. 
>
> I will be landing this today if there are no further comments 🙏🏻 

This patch seems to hide the original problem of the compiler not erroring out 
when multiple definitions of main happen.  I think we should rather inject the 
proper linker equivalent of --whole-archive on Darwin, too.

I'd be OK to land this patch for now, if we agree to have a follow-up PR to 
(re-)instantiate the error handling also for Darwin.

https://github.com/llvm/llvm-project/pull/75393


[clang] [llvm] [IR] Fix GEP offset computations for vector GEPs (PR #75448)

2023-12-14 Thread Jannik Silvanus via cfe-commits

jasilvanus wrote:

> There seem to be some clang test failures though that I'm currently looking 
> into.

Solved: The clang failures were caused by me having `dxv` in `$PATH`, which 
implicitly enabled some extra `dxv`-based tests which also fail without this 
patch for me. 

Clang was invoking `dxv - -o -`, and `dxv` complained about `-` not being a 
file. Not sure whether automatically enabling extra tests based on `$PATH` is 
desirable behavior, but that's a different topic.

https://github.com/llvm/llvm-project/pull/75448


[clang] [OpenMP][USM] Adds test for -fopenmp-force-usm flag (PR #75467)

2023-12-14 Thread Jan Patrick Lehr via cfe-commits

https://github.com/jplehr created 
https://github.com/llvm/llvm-project/pull/75467

This adds a basic test to check the correct generation of double indirect 
access to declare target globals in USM mode vs non-USM mode.
I am a bit unhappy with the way this test is set up, but could not find a 
better way to do it. Happy to improve that and add more tests then.

Marked as XFAIL to first land test and then enable in subsequent patch.
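
The difference the test checks for can be pictured with a host-only model of the two access patterns. This is a rough sketch under stated assumptions: it involves no OpenMP runtime, the names `pGI` and `pGI_decl_tgt_ref_ptr` are borrowed from the CHECK lines in the test, and `Storage`, `storeDefault`, `storeUSM` and `currentValue` are hypothetical helpers.

```cpp
#include <cassert>

// Hypothetical host-side model; nothing here is OpenMP runtime API.
static int Storage = 42;
static int *pGI = &Storage;               // the declare target global
static int **pGI_decl_tgt_ref_ptr = &pGI; // reference pointer used in USM mode

// Non-USM pattern: load pGI directly, then store through it.
void storeDefault(int V) {
  int *P = pGI;
  *P = V;
}

// USM pattern: load the reference pointer, load pGI through it,
// then store. This is the "double indirect access".
void storeUSM(int V) {
  int **Ref = pGI_decl_tgt_ref_ptr;
  int *P = *Ref;
  *P = V;
}

int currentValue() { return Storage; }
```

Both functions end up writing the same location; only the number of pointer loads before the store differs, which is what the `CHECK-USM` and `CHECK-DEFAULT` prefixes distinguish.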

>From ea2a9191122c5659aac380803b381f763c816e07 Mon Sep 17 00:00:00 2001
From: JP Lehr 
Date: Wed, 12 Jul 2023 05:04:41 -0400
Subject: [PATCH] [OpenMP][USM] Adds test for -fopenmp-force-usm flag

This adds a basic test to check the correct generation of double
indirect access to declare target globals in USM mode vs non-USM mode.

Marked as XFAIL to first land test and then enable in subsequent patch.
---
 clang/test/OpenMP/force-usm.c | 73 +++
 1 file changed, 73 insertions(+)
 create mode 100644 clang/test/OpenMP/force-usm.c

diff --git a/clang/test/OpenMP/force-usm.c b/clang/test/OpenMP/force-usm.c
new file mode 100644
index 00..222705322b8976
--- /dev/null
+++ b/clang/test/OpenMP/force-usm.c
@@ -0,0 +1,73 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --include-generated-funcs --replace-value-regex 
"__omp_offloading_[0-9a-z]+_[0-9a-z]+" "pl_cond[.].+[.|,]" 
--prefix-filecheck-ir-name _ --version 3
+// XFAIL: amdgpu-registered-target
+
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown 
-fopenmp-targets=amdgcn-amd-amdhsa -include 
%S/../../lib/Headers/openmp_wrappers/usm/force_usm.h -emit-llvm-bc %s -o 
%t-ppc-host.bc
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple amdgcn-amd-amdhsa 
-fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm %s -include 
%S/../../lib/Headers/openmp_wrappers/usm/force_usm.h -fopenmp-is-device 
-fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck 
-check-prefix=CHECK-USM %s
+
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown 
-fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm-bc %s -o %t-ppc-host.bc
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple amdgcn-amd-amdhsa 
-fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm %s -fopenmp-is-device 
-fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck 
-check-prefix=CHECK-DEFAULT %s
+// expected-no-diagnostics
+
+extern "C" void *malloc(unsigned int b);
+
+int GI;
+#pragma omp declare target
+int *pGI;
+#pragma omp end declare target
+
+int main(void) {
+
+  GI = 0;
+
+  pGI = (int *) malloc(sizeof(int));
+  *pGI = 42;
+
+#pragma omp target map(pGI[:1], GI)
+  {
+GI = 1;
+*pGI = 2;
+  }
+
+  return 0;
+}
+
+// CHECK-USM-LABEL: define weak_odr protected amdgpu_kernel void 
@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l25
+// CHECK-USM-SAME: (ptr noundef nonnull align 4 dereferenceable(4) [[GI:%.*]]) 
#[[ATTR0:[0-9]+]] {
+// CHECK-USM-NEXT:  entry:
+// CHECK-USM-NEXT:[[GI_ADDR:%.*]] = alloca ptr, align 8, addrspace(5)
+// CHECK-USM-NEXT:[[GI_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) 
[[GI_ADDR]] to ptr
+// CHECK-USM-NEXT:store ptr [[GI]], ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-USM-NEXT:[[TMP0:%.*]] = load ptr, ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-USM-NEXT:[[TMP1:%.*]] = call i32 @__kmpc_target_init(ptr 
addrspacecast (ptr addrspace(1) @[[GLOB1:[0-9]+]] to ptr), i8 1, i1 true)
+// CHECK-USM-NEXT:[[EXEC_USER_CODE:%.*]] = icmp eq i32 [[TMP1]], -1
+// CHECK-USM-NEXT:br i1 [[EXEC_USER_CODE]], label [[USER_CODE_ENTRY:%.*]], 
label [[WORKER_EXIT:%.*]]
+// CHECK-USM:   user_code.entry:
+// CHECK-USM-NEXT:store i32 1, ptr [[TMP0]], align 4
+// CHECK-USM-NEXT:[[TMP2:%.*]] = load ptr, ptr @pGI_decl_tgt_ref_ptr, 
align 8
+// CHECK-USM-NEXT:[[TMP3:%.*]] = load ptr, ptr [[TMP2]], align 8
+// CHECK-USM-NEXT:store i32 2, ptr [[TMP3]], align 4
+// CHECK-USM-NEXT:call void @__kmpc_target_deinit(ptr addrspacecast (ptr 
addrspace(1) @[[GLOB1]] to ptr), i8 1)
+// CHECK-USM-NEXT:ret void
+// CHECK-USM:   worker.exit:
+// CHECK-USM-NEXT:ret void
+//
+//
+// CHECK-DEFAULT-LABEL: define weak_odr protected amdgpu_kernel void 
@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l25
+// CHECK-DEFAULT-SAME: (ptr noundef nonnull align 4 dereferenceable(4) 
[[GI:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-DEFAULT-NEXT:  entry:
+// CHECK-DEFAULT-NEXT:[[GI_ADDR:%.*]] = alloca ptr, align 8, addrspace(5)
+// CHECK-DEFAULT-NEXT:[[GI_ADDR_ASCAST:%.*]] = addrspacecast ptr 
addrspace(5) [[GI_ADDR]] to ptr
+// CHECK-DEFAULT-NEXT:store ptr [[GI]], ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-DEFAULT-NEXT:[[TMP0:%.*]] = load ptr, ptr [[GI_ADDR_ASCAST]], 
align 8
+// CHECK-DEFAULT-NEXT:[[TMP1:%.*]] = call i32 @__kmpc_target_init(ptr 
addrspacecast (ptr addrspace(1) @[[GLOB1:[0-9]+]] to ptr), i8 1, i1 true)
+// CHECK-DEFAULT-NEXT:[[EXEC_USER_CODE:%.*]] = icmp eq i32 [[TMP1]], -1
+// CHECK-DEFAULT-NEXT:   

[clang] [OpenMP][USM] Adds test for -fopenmp-force-usm flag (PR #75467)

2023-12-14 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Jan Patrick Lehr (jplehr)


Changes

This adds a basic test to check the correct generation of double indirect 
access to declare target globals in USM mode vs non-USM mode.
I am a bit unhappy with the way this test is set up, but could not find a 
better way to do it. Happy to improve that and add more tests then.

Marked as XFAIL to first land test and then enable in subsequent patch.

---
Full diff: https://github.com/llvm/llvm-project/pull/75467.diff


1 Files Affected:

- (added) clang/test/OpenMP/force-usm.c (+73) 


``diff
diff --git a/clang/test/OpenMP/force-usm.c b/clang/test/OpenMP/force-usm.c
new file mode 100644
index 00..222705322b8976
--- /dev/null
+++ b/clang/test/OpenMP/force-usm.c
@@ -0,0 +1,73 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --include-generated-funcs --replace-value-regex 
"__omp_offloading_[0-9a-z]+_[0-9a-z]+" "pl_cond[.].+[.|,]" 
--prefix-filecheck-ir-name _ --version 3
+// XFAIL: amdgpu-registered-target
+
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown 
-fopenmp-targets=amdgcn-amd-amdhsa -include 
%S/../../lib/Headers/openmp_wrappers/usm/force_usm.h -emit-llvm-bc %s -o 
%t-ppc-host.bc
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple amdgcn-amd-amdhsa 
-fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm %s -include 
%S/../../lib/Headers/openmp_wrappers/usm/force_usm.h -fopenmp-is-device 
-fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck 
-check-prefix=CHECK-USM %s
+
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown 
-fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm-bc %s -o %t-ppc-host.bc
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple amdgcn-amd-amdhsa 
-fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm %s -fopenmp-is-device 
-fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck 
-check-prefix=CHECK-DEFAULT %s
+// expected-no-diagnostics
+
+extern "C" void *malloc(unsigned int b);
+
+int GI;
+#pragma omp declare target
+int *pGI;
+#pragma omp end declare target
+
+int main(void) {
+
+  GI = 0;
+
+  pGI = (int *) malloc(sizeof(int));
+  *pGI = 42;
+
+#pragma omp target map(pGI[:1], GI)
+  {
+GI = 1;
+*pGI = 2;
+  }
+
+  return 0;
+}
+
+// CHECK-USM-LABEL: define weak_odr protected amdgpu_kernel void 
@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l25
+// CHECK-USM-SAME: (ptr noundef nonnull align 4 dereferenceable(4) [[GI:%.*]]) 
#[[ATTR0:[0-9]+]] {
+// CHECK-USM-NEXT:  entry:
+// CHECK-USM-NEXT:[[GI_ADDR:%.*]] = alloca ptr, align 8, addrspace(5)
+// CHECK-USM-NEXT:[[GI_ADDR_ASCAST:%.*]] = addrspacecast ptr addrspace(5) 
[[GI_ADDR]] to ptr
+// CHECK-USM-NEXT:store ptr [[GI]], ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-USM-NEXT:[[TMP0:%.*]] = load ptr, ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-USM-NEXT:[[TMP1:%.*]] = call i32 @__kmpc_target_init(ptr 
addrspacecast (ptr addrspace(1) @[[GLOB1:[0-9]+]] to ptr), i8 1, i1 true)
+// CHECK-USM-NEXT:[[EXEC_USER_CODE:%.*]] = icmp eq i32 [[TMP1]], -1
+// CHECK-USM-NEXT:br i1 [[EXEC_USER_CODE]], label [[USER_CODE_ENTRY:%.*]], 
label [[WORKER_EXIT:%.*]]
+// CHECK-USM:   user_code.entry:
+// CHECK-USM-NEXT:store i32 1, ptr [[TMP0]], align 4
+// CHECK-USM-NEXT:[[TMP2:%.*]] = load ptr, ptr @pGI_decl_tgt_ref_ptr, 
align 8
+// CHECK-USM-NEXT:[[TMP3:%.*]] = load ptr, ptr [[TMP2]], align 8
+// CHECK-USM-NEXT:store i32 2, ptr [[TMP3]], align 4
+// CHECK-USM-NEXT:call void @__kmpc_target_deinit(ptr addrspacecast (ptr 
addrspace(1) @[[GLOB1]] to ptr), i8 1)
+// CHECK-USM-NEXT:ret void
+// CHECK-USM:   worker.exit:
+// CHECK-USM-NEXT:ret void
+//
+//
+// CHECK-DEFAULT-LABEL: define weak_odr protected amdgpu_kernel void 
@{{__omp_offloading_[0-9a-z]+_[0-9a-z]+}}_main_l25
+// CHECK-DEFAULT-SAME: (ptr noundef nonnull align 4 dereferenceable(4) 
[[GI:%.*]]) #[[ATTR0:[0-9]+]] {
+// CHECK-DEFAULT-NEXT:  entry:
+// CHECK-DEFAULT-NEXT:[[GI_ADDR:%.*]] = alloca ptr, align 8, addrspace(5)
+// CHECK-DEFAULT-NEXT:[[GI_ADDR_ASCAST:%.*]] = addrspacecast ptr 
addrspace(5) [[GI_ADDR]] to ptr
+// CHECK-DEFAULT-NEXT:store ptr [[GI]], ptr [[GI_ADDR_ASCAST]], align 8
+// CHECK-DEFAULT-NEXT:[[TMP0:%.*]] = load ptr, ptr [[GI_ADDR_ASCAST]], 
align 8
+// CHECK-DEFAULT-NEXT:[[TMP1:%.*]] = call i32 @__kmpc_target_init(ptr 
addrspacecast (ptr addrspace(1) @[[GLOB1:[0-9]+]] to ptr), i8 1, i1 true)
+// CHECK-DEFAULT-NEXT:[[EXEC_USER_CODE:%.*]] = icmp eq i32 [[TMP1]], -1
+// CHECK-DEFAULT-NEXT:br i1 [[EXEC_USER_CODE]], label 
[[USER_CODE_ENTRY:%.*]], label [[WORKER_EXIT:%.*]]
+// CHECK-DEFAULT:   user_code.entry:
+// CHECK-DEFAULT-NEXT:store i32 1, ptr [[TMP0]], align 4
+// CHECK-DEFAULT-NEXT:[[TMP2:%.*]] = load ptr, ptr addrspacecast (ptr 
addrspace(1) @pGI to ptr), align 8
+// CHECK-DEFAULT-NEXT:store i32 2, ptr [[TMP2]], align 4
+// CHECK-DEFAULT-NEXT:call void @__km

[clang] [OpenMP] Introduce -fopenmp-force-usm flag (PR #75468)

2023-12-14 Thread Jan Patrick Lehr via cfe-commits

https://github.com/jplehr created 
https://github.com/llvm/llvm-project/pull/75468

The new flag implements logic to include `#pragma omp requires 
unified_shared_memory` in every translation unit.
This enables a straightforward way to enable USM for an application without the 
need to modify sources.

This is the flag mentioned in https://github.com/llvm/llvm-project/pull/75467
Once the test has landed, I'll rebase and enable the test with this patch.
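
The driver-side effect described here, forwarding `-include <header>` to cc1 when the flag is present, can be sketched in isolation. This is a minimal sketch, not the toolchain code: `addForceUSMInclude`, `ForceUSM` and `HeaderPath` are hypothetical names, with `ForceUSM` standing in for the `hasArg` check on the new option, and the header location is passed in by the caller rather than assumed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical simplification of the driver hook; not the clang API.
// ForceUSM:   whether -fopenmp-force-usm was given.
// HeaderPath: full path of the header that contains
//             "#pragma omp requires unified_shared_memory".
void addForceUSMInclude(bool ForceUSM, const std::string &HeaderPath,
                        std::vector<std::string> &CC1Args) {
  if (!ForceUSM)
    return;
  // Force-include the header ahead of every translation unit.
  CC1Args.push_back("-include");
  CC1Args.push_back(HeaderPath);
}
```

Because the pragma arrives through a forced include, application sources stay untouched, which is the stated goal of the flag.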

>From bc912bf0a63e6d10b60655d26846731d961021f3 Mon Sep 17 00:00:00 2001
From: JP Lehr 
Date: Thu, 6 Jul 2023 16:47:21 -0400
Subject: [PATCH] [OpenMP] Introduce -fopenmp-force-usm flag

The new flag implements logic to include #pragma omp requires
unified_shared_memory in every translation unit.
This enables a straightforward way to enable USM for an application
without the need to modify sources.
---
 clang/include/clang/Driver/Options.td |  2 ++
 clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp  | 14 ++
 clang/lib/Headers/CMakeLists.txt  |  1 +
 clang/lib/Headers/openmp_wrappers/usm/force_usm.h |  6 ++
 4 files changed, 23 insertions(+)
 create mode 100644 clang/lib/Headers/openmp_wrappers/usm/force_usm.h

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 1b02087425b751..b9cd3043a13a9a 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3381,6 +3381,8 @@ def fopenmp_cuda_blocks_per_sm_EQ : Joined<["-"], 
"fopenmp-cuda-blocks-per-sm=">
   Flags<[NoArgumentUnused, HelpHidden]>, Visibility<[ClangOption, CC1Option]>;
 def fopenmp_cuda_teams_reduction_recs_num_EQ : Joined<["-"], 
"fopenmp-cuda-teams-reduction-recs-num=">, Group,
   Flags<[NoArgumentUnused, HelpHidden]>, Visibility<[ClangOption, CC1Option]>;
+def fopenmp_force_usm : Flag<["-"], "fopenmp-force-usm">, Group,
+  Flags<[NoArgumentUnused, HelpHidden]>, Visibility<[CC1Option]>;
 
 
//===--===//
 // Shared cc1 + fc1 OpenMP Target Options
diff --git a/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp 
b/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
index b012b7cb729378..2484a59085c276 100644
--- a/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
+++ b/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
@@ -129,6 +129,20 @@ AMDGPUOpenMPToolChain::GetCXXStdlibType(const ArgList 
&Args) const {
 void AMDGPUOpenMPToolChain::AddClangSystemIncludeArgs(
 const ArgList &DriverArgs, ArgStringList &CC1Args) const {
   HostTC.AddClangSystemIncludeArgs(DriverArgs, CC1Args);
+
+  CC1Args.push_back("-internal-isystem");
+  SmallString<128> P(HostTC.getDriver().ResourceDir);
+  llvm::sys::path::append(P, "include/cuda_wrappers");
+  CC1Args.push_back(DriverArgs.MakeArgString(P));
+
+  // Force APU mode will forcefully include #pragma omp requires
+  // unified_shared_memory via the force_usm header
+  if (DriverArgs.hasArg(options::OPT_fopenmp_force_usm)) {
+CC1Args.push_back("-include");
+CC1Args.push_back(
+DriverArgs.MakeArgString(HostTC.getDriver().ResourceDir +
+ "/include/openmp_wrappers/force_usm.h"));
+  }
 }
 
 void AMDGPUOpenMPToolChain::AddIAMCUIncludeArgs(const ArgList &Args,
diff --git a/clang/lib/Headers/CMakeLists.txt b/clang/lib/Headers/CMakeLists.txt
index f8fdd402777e48..aac232fa8b4405 100644
--- a/clang/lib/Headers/CMakeLists.txt
+++ b/clang/lib/Headers/CMakeLists.txt
@@ -319,6 +319,7 @@ set(openmp_wrapper_files
   openmp_wrappers/__clang_openmp_device_functions.h
   openmp_wrappers/complex_cmath.h
   openmp_wrappers/new
+  openmp_wrappers/usm/force_usm.h
 )
 
 set(llvm_libc_wrapper_files
diff --git a/clang/lib/Headers/openmp_wrappers/usm/force_usm.h 
b/clang/lib/Headers/openmp_wrappers/usm/force_usm.h
new file mode 100644
index 00..15c394e27ce9c2
--- /dev/null
+++ b/clang/lib/Headers/openmp_wrappers/usm/force_usm.h
@@ -0,0 +1,6 @@
+#ifndef __CLANG_FORCE_OPENMP_USM
+#define __CLANG_FORCE_OPENMP_USM
+
+#pragma omp requires unified_shared_memory
+
+#endif



[clang] [OpenMP] Introduce -fopenmp-force-usm flag (PR #75468)

2023-12-14 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-amdgpu

Author: Jan Patrick Lehr (jplehr)


Changes

The new flag implements logic to include `#pragma omp requires 
unified_shared_memory` in every translation unit.
This enables a straightforward way to enable USM for an application without the 
need to modify sources.

This is the flag mentioned in https://github.com/llvm/llvm-project/pull/75467
Once the test has landed, I'll rebase and enable the test with this patch.

---
Full diff: https://github.com/llvm/llvm-project/pull/75468.diff


4 Files Affected:

- (modified) clang/include/clang/Driver/Options.td (+2) 
- (modified) clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp (+14) 
- (modified) clang/lib/Headers/CMakeLists.txt (+1) 
- (added) clang/lib/Headers/openmp_wrappers/usm/force_usm.h (+6) 


``diff
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 1b02087425b751..b9cd3043a13a9a 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3381,6 +3381,8 @@ def fopenmp_cuda_blocks_per_sm_EQ : Joined<["-"], 
"fopenmp-cuda-blocks-per-sm=">
   Flags<[NoArgumentUnused, HelpHidden]>, Visibility<[ClangOption, CC1Option]>;
 def fopenmp_cuda_teams_reduction_recs_num_EQ : Joined<["-"], 
"fopenmp-cuda-teams-reduction-recs-num=">, Group,
   Flags<[NoArgumentUnused, HelpHidden]>, Visibility<[ClangOption, CC1Option]>;
+def fopenmp_force_usm : Flag<["-"], "fopenmp-force-usm">, Group,
+  Flags<[NoArgumentUnused, HelpHidden]>, Visibility<[CC1Option]>;
 
 
//===--===//
 // Shared cc1 + fc1 OpenMP Target Options
diff --git a/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp 
b/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
index b012b7cb729378..2484a59085c276 100644
--- a/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
+++ b/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
@@ -129,6 +129,20 @@ AMDGPUOpenMPToolChain::GetCXXStdlibType(const ArgList 
&Args) const {
 void AMDGPUOpenMPToolChain::AddClangSystemIncludeArgs(
 const ArgList &DriverArgs, ArgStringList &CC1Args) const {
   HostTC.AddClangSystemIncludeArgs(DriverArgs, CC1Args);
+
+  CC1Args.push_back("-internal-isystem");
+  SmallString<128> P(HostTC.getDriver().ResourceDir);
+  llvm::sys::path::append(P, "include/cuda_wrappers");
+  CC1Args.push_back(DriverArgs.MakeArgString(P));
+
+  // Force APU mode will forcefully include #pragma omp requires
+  // unified_shared_memory via the force_usm header
+  if (DriverArgs.hasArg(options::OPT_fopenmp_force_usm)) {
+CC1Args.push_back("-include");
+CC1Args.push_back(
+DriverArgs.MakeArgString(HostTC.getDriver().ResourceDir +
+ "/include/openmp_wrappers/force_usm.h"));
+  }
 }
 
 void AMDGPUOpenMPToolChain::AddIAMCUIncludeArgs(const ArgList &Args,
diff --git a/clang/lib/Headers/CMakeLists.txt b/clang/lib/Headers/CMakeLists.txt
index f8fdd402777e48..aac232fa8b4405 100644
--- a/clang/lib/Headers/CMakeLists.txt
+++ b/clang/lib/Headers/CMakeLists.txt
@@ -319,6 +319,7 @@ set(openmp_wrapper_files
   openmp_wrappers/__clang_openmp_device_functions.h
   openmp_wrappers/complex_cmath.h
   openmp_wrappers/new
+  openmp_wrappers/usm/force_usm.h
 )
 
 set(llvm_libc_wrapper_files
diff --git a/clang/lib/Headers/openmp_wrappers/usm/force_usm.h 
b/clang/lib/Headers/openmp_wrappers/usm/force_usm.h
new file mode 100644
index 00..15c394e27ce9c2
--- /dev/null
+++ b/clang/lib/Headers/openmp_wrappers/usm/force_usm.h
@@ -0,0 +1,6 @@
+#ifndef __CLANG_FORCE_OPENMP_USM
+#define __CLANG_FORCE_OPENMP_USM
+
+#pragma omp requires unified_shared_memory
+
+#endif

``




https://github.com/llvm/llvm-project/pull/75468


[clang] [OpenMP] Introduce -fopenmp-force-usm flag (PR #75468)

2023-12-14 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-driver

Author: Jan Patrick Lehr (jplehr)


Changes

The new flag implements logic to include `#pragma omp requires 
unified_shared_memory` in every translation unit.
This enables a straightforward way to enable USM for an application without the 
need to modify sources.

This is the flag mentioned in https://github.com/llvm/llvm-project/pull/75467
Once the test has landed, I'll rebase and enable the test with this patch.

---
Full diff: https://github.com/llvm/llvm-project/pull/75468.diff


4 Files Affected:

- (modified) clang/include/clang/Driver/Options.td (+2) 
- (modified) clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp (+14) 
- (modified) clang/lib/Headers/CMakeLists.txt (+1) 
- (added) clang/lib/Headers/openmp_wrappers/usm/force_usm.h (+6) 






https://github.com/llvm/llvm-project/pull/75468


[clang] [OpenMP] Introduce -fopenmp-force-usm flag (PR #75468)

2023-12-14 Thread via cfe-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-x86

Author: Jan Patrick Lehr (jplehr)


Changes

The new flag implements logic to include `#pragma omp requires 
unified_shared_memory` in every translation unit.
This enables a straightforward way to enable USM for an application without the 
need to modify sources.

This is the flag mentioned in https://github.com/llvm/llvm-project/pull/75467
Once the test has landed, I'll rebase and enable the test with this patch.

---
Full diff: https://github.com/llvm/llvm-project/pull/75468.diff


4 Files Affected:

- (modified) clang/include/clang/Driver/Options.td (+2) 
- (modified) clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp (+14) 
- (modified) clang/lib/Headers/CMakeLists.txt (+1) 
- (added) clang/lib/Headers/openmp_wrappers/usm/force_usm.h (+6) 






https://github.com/llvm/llvm-project/pull/75468


[lld] [clang-tools-extra] [llvm] [libcxx] [compiler-rt] [mlir] [libc] [clang] [lldb] [MLIR][LLVM] Add Continuous Loop Peeling transform to SCF (PR #71555)

2023-12-14 Thread via cfe-commits

https://github.com/muneebkhan85 edited 
https://github.com/llvm/llvm-project/pull/71555


[clang] [OpenMP] Introduce -fopenmp-force-usm flag (PR #75468)

2023-12-14 Thread Jan Patrick Lehr via cfe-commits

https://github.com/jplehr updated 
https://github.com/llvm/llvm-project/pull/75468

>From 9809ba1ec31cb1a4a066f709ae8bd3e965e1 Mon Sep 17 00:00:00 2001
From: JP Lehr 
Date: Thu, 6 Jul 2023 16:47:21 -0400
Subject: [PATCH] [OpenMP] Introduce -fopenmp-force-usm flag

The new flag implements logic to include #pragma omp requires
unified_shared_memory in every translation unit.
This enables a straightforward way to enable USM for an application
without the need to modify sources.
---
 clang/include/clang/Driver/Options.td|  2 ++
 clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp | 16 
 clang/lib/Headers/CMakeLists.txt |  1 +
 .../lib/Headers/openmp_wrappers/usm/force_usm.h  |  6 ++
 4 files changed, 25 insertions(+)
 create mode 100644 clang/lib/Headers/openmp_wrappers/usm/force_usm.h

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 1b02087425b751..b9cd3043a13a9a 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -3381,6 +3381,8 @@ def fopenmp_cuda_blocks_per_sm_EQ : Joined<["-"], 
"fopenmp-cuda-blocks-per-sm=">
   Flags<[NoArgumentUnused, HelpHidden]>, Visibility<[ClangOption, CC1Option]>;
 def fopenmp_cuda_teams_reduction_recs_num_EQ : Joined<["-"], 
"fopenmp-cuda-teams-reduction-recs-num=">, Group,
   Flags<[NoArgumentUnused, HelpHidden]>, Visibility<[ClangOption, CC1Option]>;
+def fopenmp_force_usm : Flag<["-"], "fopenmp-force-usm">, Group,
+  Flags<[NoArgumentUnused, HelpHidden]>, Visibility<[CC1Option]>;
 
 
//===--===//
 // Shared cc1 + fc1 OpenMP Target Options
diff --git a/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp 
b/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
index b012b7cb729378..a077f2f06d7728 100644
--- a/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
+++ b/clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
@@ -129,6 +129,22 @@ AMDGPUOpenMPToolChain::GetCXXStdlibType(const ArgList 
&Args) const {
 void AMDGPUOpenMPToolChain::AddClangSystemIncludeArgs(
 const ArgList &DriverArgs, ArgStringList &CC1Args) const {
   HostTC.AddClangSystemIncludeArgs(DriverArgs, CC1Args);
+
+  CC1Args.push_back("-internal-isystem");
+  SmallString<128> P(HostTC.getDriver().ResourceDir);
+  llvm::sys::path::append(P, "include/cuda_wrappers");
+  CC1Args.push_back(DriverArgs.MakeArgString(P));
+
+  // Force USM mode will forcefully include #pragma omp requires
+  // unified_shared_memory via the force_usm header
+  // XXX This may result in a compilation error if the source
+  // file already includes that pragma.
+  if (DriverArgs.hasArg(options::OPT_fopenmp_force_usm)) {
+CC1Args.push_back("-include");
+CC1Args.push_back(
+DriverArgs.MakeArgString(HostTC.getDriver().ResourceDir +
+ "/include/openmp_wrappers/force_usm.h"));
+  }
 }
 
 void AMDGPUOpenMPToolChain::AddIAMCUIncludeArgs(const ArgList &Args,
diff --git a/clang/lib/Headers/CMakeLists.txt b/clang/lib/Headers/CMakeLists.txt
index f8fdd402777e48..aac232fa8b4405 100644
--- a/clang/lib/Headers/CMakeLists.txt
+++ b/clang/lib/Headers/CMakeLists.txt
@@ -319,6 +319,7 @@ set(openmp_wrapper_files
   openmp_wrappers/__clang_openmp_device_functions.h
   openmp_wrappers/complex_cmath.h
   openmp_wrappers/new
+  openmp_wrappers/usm/force_usm.h
 )
 
 set(llvm_libc_wrapper_files
diff --git a/clang/lib/Headers/openmp_wrappers/usm/force_usm.h 
b/clang/lib/Headers/openmp_wrappers/usm/force_usm.h
new file mode 100644
index 00..15c394e27ce9c2
--- /dev/null
+++ b/clang/lib/Headers/openmp_wrappers/usm/force_usm.h
@@ -0,0 +1,6 @@
+#ifndef __CLANG_FORCE_OPENMP_USM
+#define __CLANG_FORCE_OPENMP_USM
+
+#pragma omp requires unified_shared_memory
+
+#endif
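
For illustration only, here is a minimal sketch of what the forced include means for user code. The function `scale` and its loop are hypothetical and not part of the patch; the only line taken from the patch is the `requires` pragma, which `-fopenmp-force-usm` would supply through `force_usm.h` instead of the source file. With unified shared memory required, the target region should be able to use the host pointer directly, without `map` clauses.

```cpp
#include <vector>

// With -fopenmp-force-usm this line would come from the force-included
// force_usm.h header rather than from the source file itself.
#pragma omp requires unified_shared_memory

// Hypothetical user code: scales every element of v in place.
// No map clause is written; under unified shared memory the device is
// expected to dereference the host pointer directly.
void scale(std::vector<double> &v, double factor) {
  double *p = v.data();
  const int n = static_cast<int>(v.size());
#pragma omp target teams distribute parallel for
  for (int i = 0; i < n; ++i)
    p[i] *= factor;
}
```

Built without OpenMP support the pragmas are ignored and the loop runs serially on the host, so the sketch stays valid C++ either way.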



[clang] Multilib support for libraries with exceptions (PR #75031)

2023-12-14 Thread via cfe-commits

https://github.com/pwprzybyla updated 
https://github.com/llvm/llvm-project/pull/75031

>From 536e2f694f662d688cdbb8a0c5487a5a0d8d3aaf Mon Sep 17 00:00:00 2001
From: Piotr Przybyla 
Date: Wed, 29 Nov 2023 14:05:00 +
Subject: [PATCH] Multilib support for libraries with exceptions

---
 clang/include/clang/Driver/ToolChain.h | 10 ++
 clang/lib/Driver/ToolChain.cpp | 22 +-
 2 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/clang/include/clang/Driver/ToolChain.h 
b/clang/include/clang/Driver/ToolChain.h
index 2d0c1f826c1728..fbe2e8fe8e88d8 100644
--- a/clang/include/clang/Driver/ToolChain.h
+++ b/clang/include/clang/Driver/ToolChain.h
@@ -120,6 +120,11 @@ class ToolChain {
 RM_Disabled,
   };
 
+  enum ExceptionsMode {
+EM_Enabled,
+EM_Disabled,
+  };
+
   struct BitCodeLibraryInfo {
 std::string Path;
 bool ShouldInternalize;
@@ -141,6 +146,8 @@ class ToolChain {
 
   const RTTIMode CachedRTTIMode;
 
+  const ExceptionsMode CachedExceptionsMode;
+
   /// The list of toolchain specific path prefixes to search for libraries.
   path_list LibraryPaths;
 
@@ -318,6 +325,9 @@ class ToolChain {
   // Returns the RTTIMode for the toolchain with the current arguments.
   RTTIMode getRTTIMode() const { return CachedRTTIMode; }
 
+  // Returns the ExceptionsMode for the toolchain with the current arguments.
+  ExceptionsMode getExceptionsMode() const { return CachedExceptionsMode; }
+
   /// Return any implicit target and/or mode flag for an invocation of
   /// the compiler driver as `ProgName`.
   ///
diff --git a/clang/lib/Driver/ToolChain.cpp b/clang/lib/Driver/ToolChain.cpp
index ab19166f18c2dc..e4afaa20130856 100644
--- a/clang/lib/Driver/ToolChain.cpp
+++ b/clang/lib/Driver/ToolChain.cpp
@@ -77,10 +77,21 @@ static ToolChain::RTTIMode CalculateRTTIMode(const ArgList 
&Args,
   return NoRTTI ? ToolChain::RM_Disabled : ToolChain::RM_Enabled;
 }
 
+static ToolChain::ExceptionsMode CalculateExceptionsMode(const ArgList &Args) {
+
+  Arg *exceptionsArg = Args.getLastArg(options::OPT_fno_exceptions);
+  if (exceptionsArg &&
+  exceptionsArg->getOption().matches(options::OPT_fno_exceptions)) {
+return ToolChain::EM_Disabled;
+  }
+  return ToolChain::EM_Enabled;
+}
+
 ToolChain::ToolChain(const Driver &D, const llvm::Triple &T,
  const ArgList &Args)
 : D(D), Triple(T), Args(Args), CachedRTTIArg(GetRTTIArgument(Args)),
-  CachedRTTIMode(CalculateRTTIMode(Args, Triple, CachedRTTIArg)) {
+  CachedRTTIMode(CalculateRTTIMode(Args, Triple, CachedRTTIArg)),
+  CachedExceptionsMode(CalculateExceptionsMode(Args)) {
   auto addIfExists = [this](path_list &List, const std::string &Path) {
 if (getVFS().exists(Path))
   List.push_back(Path);
@@ -264,6 +275,15 @@ ToolChain::getMultilibFlags(const llvm::opt::ArgList 
&Args) const {
 break;
   }
 
+  // Include fno-exceptions and fno-rtti
+  // to improve multilib selection
+  if (getRTTIMode() == ToolChain::RTTIMode::RM_Disabled) {
+Result.push_back("-fno-rtti");
+  }
+  if (getExceptionsMode() == ToolChain::ExceptionsMode::EM_Disabled) {
+Result.push_back("-fno-exceptions");
+  }
+
   // Sort and remove duplicates.
   std::sort(Result.begin(), Result.end());
   Result.erase(std::unique(Result.begin(), Result.end()), Result.end());
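
As a rough sketch of the flag handling in the last hunk, the standalone model below appends the two flags and then sorts and de-duplicates, as `getMultilibFlags` does. `ToolChainModes` and `multilibFlags` are hypothetical stand-ins invented for this example, not clang APIs, and the input flag in the usage note is arbitrary.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the cached RTTI/exceptions modes of a ToolChain.
struct ToolChainModes {
  bool RTTIDisabled;
  bool ExceptionsDisabled;
};

// Appends -fno-rtti / -fno-exceptions when the corresponding mode is
// disabled, then sorts and removes duplicates like the patched function.
std::vector<std::string> multilibFlags(std::vector<std::string> Result,
                                       const ToolChainModes &Modes) {
  if (Modes.RTTIDisabled)
    Result.push_back("-fno-rtti");
  if (Modes.ExceptionsDisabled)
    Result.push_back("-fno-exceptions");

  std::sort(Result.begin(), Result.end());
  Result.erase(std::unique(Result.begin(), Result.end()), Result.end());
  return Result;
}
```

Calling `multilibFlags({"-mfpu=none"}, ToolChainModes{true, true})` yields three sorted flags, with `-fno-exceptions` ordered before `-fno-rtti`; a multilib configuration can then match on either flag.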



[clang] 935f5ee - [clang][Interp] ComplexFloatingToBoolean casts

2023-12-14 Thread Timm Bäder via cfe-commits

Author: Timm Bäder
Date: 2023-12-14T13:17:40+01:00
New Revision: 935f5ee9c9fd6ff358b07fb4ff8e21b77c1a5ce8

URL: 
https://github.com/llvm/llvm-project/commit/935f5ee9c9fd6ff358b07fb4ff8e21b77c1a5ce8
DIFF: 
https://github.com/llvm/llvm-project/commit/935f5ee9c9fd6ff358b07fb4ff8e21b77c1a5ce8.diff

LOG: [clang][Interp] ComplexFloatingToBoolean casts

Differential Revision: https://reviews.llvm.org/D150654

Added: 


Modified: 
clang/lib/AST/Interp/ByteCodeExprGen.cpp
clang/test/AST/Interp/complex.cpp

Removed: 




diff  --git a/clang/lib/AST/Interp/ByteCodeExprGen.cpp 
b/clang/lib/AST/Interp/ByteCodeExprGen.cpp
index c428446386c04b..fdc84d0a0da005 100644
--- a/clang/lib/AST/Interp/ByteCodeExprGen.cpp
+++ b/clang/lib/AST/Interp/ByteCodeExprGen.cpp
@@ -222,7 +222,8 @@ bool ByteCodeExprGen::VisitCastExpr(const CastExpr 
*CE) {
 return this->emitNE(PtrT, CE);
   }
 
-  case CK_IntegralComplexToBoolean: {
+  case CK_IntegralComplexToBoolean:
+  case CK_FloatingComplexToBoolean: {
 std::optional ElemT =
 classifyComplexElementType(SubExpr->getType());
 if (!ElemT)
@@ -237,8 +238,14 @@ bool ByteCodeExprGen::VisitCastExpr(const 
CastExpr *CE) {
   return false;
 if (!this->emitLoadPop(*ElemT, CE))
   return false;
-if (!this->emitCast(*ElemT, PT_Bool, CE))
-  return false;
+if (*ElemT == PT_Float) {
+  if (!this->emitCastFloatingIntegral(PT_Bool, CE))
+return false;
+} else {
+  if (!this->emitCast(*ElemT, PT_Bool, CE))
+return false;
+}
+
 // We now have the bool value of E[0] on the stack.
 LabelTy LabelTrue = this->getLabel();
 if (!this->jumpTrue(LabelTrue))
@@ -250,8 +257,13 @@ bool ByteCodeExprGen::VisitCastExpr(const 
CastExpr *CE) {
   return false;
 if (!this->emitLoadPop(*ElemT, CE))
   return false;
-if (!this->emitCast(*ElemT, PT_Bool, CE))
-  return false;
+if (*ElemT == PT_Float) {
+  if (!this->emitCastFloatingIntegral(PT_Bool, CE))
+return false;
+} else {
+  if (!this->emitCast(*ElemT, PT_Bool, CE))
+return false;
+}
 // Leave the boolean value of E[1] on the stack.
 LabelTy EndLabel = this->getLabel();
 this->jump(EndLabel);

diff  --git a/clang/test/AST/Interp/complex.cpp 
b/clang/test/AST/Interp/complex.cpp
index dbdbc2f7356e6b..084a63d4701c23 100644
--- a/clang/test/AST/Interp/complex.cpp
+++ b/clang/test/AST/Interp/complex.cpp
@@ -66,4 +66,11 @@ namespace CastToBool {
   static_assert(F5, "");
   constexpr _Complex unsigned char F6 = {0, 0};
   static_assert(!F6, "");
+
+  constexpr _Complex float F7 = {0, 1};
+  static_assert(F7, "");
+  constexpr _Complex float F8 = {1, 0};
+  static_assert(F8, "");
+  constexpr _Complex double F9 = {0, 0};
+  static_assert(!F9, "");
 }
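
The semantics that the new test cases check can be modelled in plain C++ as below. This is only a sketch of the cast's observable behaviour, not of the bytecode emitted by `ByteCodeExprGen`; `ComplexModel` and `complexToBool` are hypothetical names. The short-circuit `||` plays the role of the jump taken once the first element is known to be true.

```cpp
// Hypothetical model of a _Complex floating-point value.
struct ComplexModel {
  double Real;
  double Imag;
};

// A complex value converts to true if either element is nonzero.
// The imaginary part is only inspected when the real part is zero.
constexpr bool complexToBool(ComplexModel C) {
  return C.Real != 0.0 || C.Imag != 0.0;
}

// Mirrors F7, F8 and F9 from the test above.
static_assert(complexToBool(ComplexModel{0.0, 1.0}), "");
static_assert(complexToBool(ComplexModel{1.0, 0.0}), "");
static_assert(!complexToBool(ComplexModel{0.0, 0.0}), "");
```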





[clang] [llvm] [ValueTracking] Add dominating condition support in computeKnownBits() (PR #73662)

2023-12-14 Thread via cfe-commits

XChy wrote:

> Optimization pipeline is doing simplifications and canonicalizations. If you 
> for example use `-target amdgcn`, then I think you will see that the codegen 
> is impacted negatively when not simplifying the control flow. So it depends 
> on the backend if one form is profitable or not. I don't know really which 
> form that should be considered best (simplest and easiest for most backends 
> to deal with) here. Just saying that it changed. And that could indeed be one 
> reason for regressions (as for our backend).

You're right, it depends on the backend. For GPUs, it sounds good to hoist 
common operations and express the choice as selects. But I'm not sure whether 
such a transformation should happen in the simplifycfg pass with specified 
target info (and option), or just in the backend. If in simplifycfg, it may 
hinder other 
optimizations between basic blocks.
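
One possible reading of "hoisting common operations as selects", written at the source level purely for illustration: both functions below are hypothetical and compute the same value, the first with control flow and the second with a single selected operand feeding the shared operation. Whether a given backend prefers one form over the other is exactly the open question in this thread.

```cpp
// Branchy form: the common operation (+ 1) appears in both arms.
int branchy(bool Cond, int A, int B) {
  int Result;
  if (Cond)
    Result = A + 1;
  else
    Result = B + 1;
  return Result;
}

// Hoisted form: the operand is chosen first (a select in IR terms),
// then the common operation is performed once, with no control flow.
int hoisted(bool Cond, int A, int B) {
  int Selected = Cond ? A : B;
  return Selected + 1;
}
```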

https://github.com/llvm/llvm-project/pull/73662


[clang] [OpenMP][USM] Adds test for -fopenmp-force-usm flag (PR #75467)

2023-12-14 Thread via cfe-commits

https://github.com/ronlieb approved this pull request.


https://github.com/llvm/llvm-project/pull/75467


[clang] [flang][driver] Don't use -whole-archive on Darwin (PR #75393)

2023-12-14 Thread Andrzej Warzyński via cfe-commits

banach-space wrote:

> > > LGTM. Worked fine on my machine.
> > > NOTE: tested by replacing `CommonArgs.cpp` in main, to avoid conflicts.
> > 
> > Thanks for checking and apologies for the merge conflict - I thought that I 
> > was up to date :( I've just rebased and force-pushed.
> > I will be landing this today if there are no further comments 🙏🏻
> 
> This patch seems to hide the original problem of the compiler not erroring 
> out when multiple definitions of main happen.

To clarify - this patch is a workaround to avoid reverting 
https://github.com/llvm/llvm-project/pull/73124 and to buy ourselves some 
time. I did a bit of research and couldn't find any equivalent of 
`-whole-archive` for Darwin :( I am a bit concerned that this might be the 
only option TBH, but hopefully I am wrong.

> I'd be OK to land this patch for now, if we agree to have a follow up PR to 
> (re-)instantiate the error handling also for Darwin.

That would be ideal, but I won't have the bandwidth for that. We'll need a 
volunteer. Another option is to revert #73124. Either way, we need to unblock 
our Darwin users ASAP :)

https://github.com/llvm/llvm-project/pull/75393


[clang] [flang][driver] Don't use -whole-archive on Darwin (PR #75393)

2023-12-14 Thread Andrzej Warzyński via cfe-commits

https://github.com/banach-space updated 
https://github.com/llvm/llvm-project/pull/75393

From 95b4db0690d5725011a741f81237f5954bc08ff8 Mon Sep 17 00:00:00 2001
From: Andrzej Warzynski 
Date: Wed, 13 Dec 2023 22:05:07 +
Subject: [PATCH 1/2] [flang][driver] Don't use -whole-archive on Darwin

Direct follow-up of #7312 - the linker on Darwin does not support
`-whole-archive`, so that needs to be removed from the linker
invocation.

For context:
  * https://github.com/llvm/llvm-project/pull/7312
---
 clang/lib/Driver/ToolChains/CommonArgs.cpp | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp 
b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index 01fb0718b4079d..ac1abd82e49768 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -1132,24 +1132,29 @@ void tools::addFortranRuntimeLibs(const ToolChain &TC, 
const ArgList &Args,
   // --whole-archive flag to the link line.  If it's not, add a proper
   // --whole-archive/--no-whole-archive bracket to the link line.
   bool WholeArchiveActive = false;
-  for (auto *Arg : Args.filtered(options::OPT_Wl_COMMA))
-if (Arg)
+  for (auto *Arg : Args.filtered(options::OPT_Wl_COMMA)) {
+if (Arg) {
   for (StringRef ArgValue : Arg->getValues()) {
 if (ArgValue == "--whole-archive")
   WholeArchiveActive = true;
 if (ArgValue == "--no-whole-archive")
   WholeArchiveActive = false;
   }
+}
+  }
 
-  if (!WholeArchiveActive)
+  if (!WholeArchiveActive && !TC.getTriple().isMacOSX()) {
 CmdArgs.push_back("--whole-archive");
-  CmdArgs.push_back("-lFortran_main");
-  if (!WholeArchiveActive)
+CmdArgs.push_back("-lFortran_main");
 CmdArgs.push_back("--no-whole-archive");
+  } else {
+CmdArgs.push_back("-lFortran_main");
+  }
+
+  // Perform regular linkage of the remaining runtime libraries.
+  CmdArgs.push_back("-lFortranRuntime");
+  CmdArgs.push_back("-lFortranDecimal");
 }
-// Perform regular linkage of the remaining runtime libraries.
-CmdArgs.push_back("-lFortranRuntime");
-CmdArgs.push_back("-lFortranDecimal");
   } else {
 if (LinkFortranMain) {
   unsigned RTOptionID = options::OPT__SLASH_MT;

From fd2c65b26ad8233cf686af84359f9a3c88cbe3ac Mon Sep 17 00:00:00 2001
From: Andrzej Warzynski 
Date: Thu, 14 Dec 2023 12:40:39 +
Subject: [PATCH 2/2] fixup! [flang][driver] Don't use -whole-archive on Darwin

Add an extra comment
---
 clang/lib/Driver/ToolChains/CommonArgs.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/clang/lib/Driver/ToolChains/CommonArgs.cpp 
b/clang/lib/Driver/ToolChains/CommonArgs.cpp
index ac1abd82e49768..3d1df58190ce05 100644
--- a/clang/lib/Driver/ToolChains/CommonArgs.cpp
+++ b/clang/lib/Driver/ToolChains/CommonArgs.cpp
@@ -1143,6 +1143,7 @@ void tools::addFortranRuntimeLibs(const ToolChain &TC, 
const ArgList &Args,
 }
   }
 
+  // TODO: Find an equivalent of `--whole-archive` for Darwin.
   if (!WholeArchiveActive && !TC.getTriple().isMacOSX()) {
 CmdArgs.push_back("--whole-archive");
 CmdArgs.push_back("-lFortran_main");
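
The resulting link-line logic can be sketched as a small standalone function. This is a simplified model under stated assumptions, not the driver code: `fortranMainLinkArgs` is a hypothetical name, and the `IsDarwin` flag stands in for the `TC.getTriple().isMacOSX()` check used in the patch.

```cpp
#include <string>
#include <vector>

// Simplified model of the patched branch in addFortranRuntimeLibs.
// WholeArchiveActive: the user already opened a --whole-archive bracket.
// IsDarwin: stands in for TC.getTriple().isMacOSX() in the patch.
std::vector<std::string> fortranMainLinkArgs(bool WholeArchiveActive,
                                             bool IsDarwin) {
  std::vector<std::string> CmdArgs;
  if (!WholeArchiveActive && !IsDarwin) {
    // Bracket Fortran_main so that all of its members are pulled in.
    CmdArgs.push_back("--whole-archive");
    CmdArgs.push_back("-lFortran_main");
    CmdArgs.push_back("--no-whole-archive");
  } else {
    // Darwin's linker has no --whole-archive, so link normally.
    CmdArgs.push_back("-lFortran_main");
  }
  // Regular linkage of the remaining runtime libraries.
  CmdArgs.push_back("-lFortranRuntime");
  CmdArgs.push_back("-lFortranDecimal");
  return CmdArgs;
}
```

On a non-Darwin target with no active bracket this produces five arguments starting with `--whole-archive`; on Darwin it produces only the three `-l` arguments.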



[clang] [flang][driver] Don't use -whole-archive on Darwin (PR #75393)

2023-12-14 Thread Leandro Lupori via cfe-commits

luporl wrote:

I've run `flang/test/Driver/no-duplicate-main.f90` manually on Darwin and 
noticed that the third RUN line fails, because no error happens:
`! RUN: not %flang -o %t.exe %t %t.c-object 2>&1`

So maybe `-whole-archive` is just not needed on it?


https://github.com/llvm/llvm-project/pull/75393
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [flang][driver] Don't use -whole-archive on Darwin (PR #75393)

2023-12-14 Thread Leandro Lupori via cfe-commits

luporl wrote:

Some sources 
(https://stackoverflow.com/questions/16082470/osx-how-do-i-convert-a-static-library-to-a-dynamic-one)
 suggest the use of `-force_load`:

```
 -force_load path_to_archive
     Loads all members of the specified static archive library.  Note:
     -all_load forces all members of all archives to be loaded.  This
     option allows you to target a specific archive.
```
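
If `-force_load` turned out to be suitable, a driver-side choice between the two spellings might look roughly like the sketch below. This is speculative and not part of the PR: `wholeArchiveArgs` is a hypothetical helper, and it assumes the full path to the archive is known, since `-force_load` takes a path rather than a `-l` name.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns linker arguments that force every member of
// the given static archive to be loaded.
std::vector<std::string> wholeArchiveArgs(bool IsDarwin,
                                          const std::string &ArchivePath) {
  if (IsDarwin) {
    // Darwin linker spelling, per the man page excerpt quoted above.
    return {"-force_load", ArchivePath};
  }
  // GNU-style spelling: bracket the archive explicitly.
  return {"--whole-archive", ArchivePath, "--no-whole-archive"};
}
```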

https://github.com/llvm/llvm-project/pull/75393

