[llvm-branch-commits] [IR] Introduce `llvm.experimental.hot()` (PR #84850)

2024-03-12 Thread Dmitry Vyukov via llvm-branch-commits


@@ -1722,6 +1722,11 @@ def int_debugtrap : Intrinsic<[]>,
 def int_ubsantrap : Intrinsic<[], [llvm_i8_ty],
   [IntrNoReturn, IntrCold, ImmArg>]>;
 
+// Return true if profile counter for containing block is hot.
+def int_experimental_hot : Intrinsic<[llvm_i1_ty], [],
+  [IntrInaccessibleMemOnly, IntrWriteMem,

dvyukov wrote:

Can't IntrWriteMem have a significant effect on the performance of the generated 
code? Why exactly do we need it? A comment would be useful. It doesn't write to 
memory, so is there a more precise attribute that captures what we need?

https://github.com/llvm/llvm-project/pull/84850
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang-tools-extra] [clangd] Add clangd 18 release notes (PR #84436)

2024-03-12 Thread kadir çetinkaya via llvm-branch-commits

kadircet wrote:

I believe @tstellar should have that power :rocket: 

Do you mind merging this into the branch, or advising on how we should do that 
instead?

https://github.com/llvm/llvm-project/pull/84436


[llvm-branch-commits] [llvm-objcopy] Simplify --[de]compress-debug-sections and don't compress SHF_ALLOC sections (PR #84885)

2024-03-12 Thread Fangrui Song via llvm-branch-commits

https://github.com/MaskRay created 
https://github.com/llvm/llvm-project/pull/84885

Make it easier to add custom section [de]compression. In GNU ld,
--compress-debug-sections doesn't compress SHF_ALLOC sections. Match its
behavior.
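
As a rough illustration of the policy described above, the following standalone sketch decides whether a section would be compressed. The `Section` struct and `shouldCompress` helper are hypothetical stand-ins, not llvm-objcopy code; the patch itself recognizes already-compressed sections by their class rather than by flag. The flag values are the standard ELF ones.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Section flag values from the ELF specification.
const uint64_t SHF_ALLOC = 0x2;
const uint64_t SHF_COMPRESSED = 0x800;

// Hypothetical stand-in for llvm-objcopy's SectionBase.
struct Section {
  std::string Name;
  uint64_t Flags;
};

// True if the section is a .debug* section that is neither already
// compressed nor allocated (SHF_ALLOC) at run time.
bool shouldCompress(const Section &Sec) {
  if (Sec.Name.compare(0, 6, ".debug") != 0)
    return false;
  if (Sec.Flags & SHF_COMPRESSED)
    return false;
  return (Sec.Flags & SHF_ALLOC) == 0;
}
```

Under this sketch, a section such as `.debug_alloc` with `SHF_ALLOC` set is left untouched, which is the case the new test input exercises.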





[llvm-branch-commits] [llvm-objcopy] Simplify --[de]compress-debug-sections and don't compress SHF_ALLOC sections (PR #84885)

2024-03-12 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-binary-utilities

Author: Fangrui Song (MaskRay)


Changes

Make it easier to add custom section [de]compression. In GNU ld,
--compress-debug-sections doesn't compress SHF_ALLOC sections. Match its
behavior.


---
Full diff: https://github.com/llvm/llvm-project/pull/84885.diff


4 Files Affected:

- (modified) llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp (+25-40)
- (modified) llvm/lib/ObjCopy/ELF/ELFObject.h (+1)
- (modified) llvm/test/tools/llvm-objcopy/ELF/Inputs/compress-debug-sections.yaml (+4)
- (modified) llvm/test/tools/llvm-objcopy/ELF/compress-debug-sections-zlib.test (+2)


``diff
diff --git a/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp b/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
index f52bcb74938d15..36826e02a62712 100644
--- a/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
+++ b/llvm/lib/ObjCopy/ELF/ELFObjcopy.cpp
@@ -214,33 +214,34 @@ static Error dumpSectionToFile(StringRef SecName, StringRef Filename,
SecName.str().c_str());
 }
 
-static bool isCompressable(const SectionBase &Sec) {
-  return !(Sec.Flags & ELF::SHF_COMPRESSED) &&
- StringRef(Sec.Name).starts_with(".debug");
-}
-
-static Error replaceDebugSections(
-Object &Obj, function_ref ShouldReplace,
-function_ref(const SectionBase *)> AddSection) {
+Error Object::compressOrDecompressSections(const CommonConfig &Config) {
   // Build a list of the debug sections we are going to replace.
   // We can't call `AddSection` while iterating over sections,
   // because it would mutate the sections array.
-  SmallVector ToReplace;
-  for (auto &Sec : Obj.sections())
-if (ShouldReplace(Sec))
-  ToReplace.push_back(&Sec);
-
-  // Build a mapping from original section to a new one.
-  DenseMap FromTo;
-  for (SectionBase *S : ToReplace) {
-Expected NewSection = AddSection(S);
-if (!NewSection)
-  return NewSection.takeError();
-
-FromTo[S] = *NewSection;
+  SmallVector>, 0>
+  ToReplace;
+  for (SectionBase &Sec : sections()) {
+if (!StringRef(Sec.Name).starts_with(".debug"))
+  continue;
+if (auto *CS = dyn_cast(&Sec)) {
+  if (Config.DecompressDebugSections) {
+ToReplace.emplace_back(
+&Sec, [=] { return &addSection(*CS); });
+  }
+} else if (!(Sec.Flags & SHF_ALLOC) &&
+   Config.CompressionType != DebugCompressionType::None) {
+  auto *S = &Sec;
+  ToReplace.emplace_back(S, [=] {
+return &addSection(
+CompressedSection(*S, Config.CompressionType, Is64Bits));
+  });
+}
   }
 
-  return Obj.replaceSections(FromTo);
+  DenseMap FromTo;
+  for (auto [S, Func] : ToReplace)
+FromTo[S] = Func();
+  return replaceSections(FromTo);
 }
 
 static bool isAArch64MappingSymbol(const Symbol &Sym) {
@@ -534,24 +535,8 @@ static Error replaceAndRemoveSections(const CommonConfig &Config,
   if (Error E = Obj.removeSections(ELFConfig.AllowBrokenLinks, RemovePred))
 return E;
 
-  if (Config.CompressionType != DebugCompressionType::None) {
-if (Error Err = replaceDebugSections(
-Obj, isCompressable,
-[&Config, &Obj](const SectionBase *S) -> Expected {
-  return &Obj.addSection(
-  CompressedSection(*S, Config.CompressionType, Obj.Is64Bits));
-}))
-  return Err;
-  } else if (Config.DecompressDebugSections) {
-if (Error Err = replaceDebugSections(
-Obj,
-[](const SectionBase &S) { return isa(&S); },
-[&Obj](const SectionBase *S) {
-  const CompressedSection *CS = cast(S);
-  return &Obj.addSection(*CS);
-}))
-  return Err;
-  }
+  if (Error E = Obj.compressOrDecompressSections(Config))
+return E;
 
   return Error::success();
 }
diff --git a/llvm/lib/ObjCopy/ELF/ELFObject.h b/llvm/lib/ObjCopy/ELF/ELFObject.h
index 7a2e20d82d1150..f72c109b6009e8 100644
--- a/llvm/lib/ObjCopy/ELF/ELFObject.h
+++ b/llvm/lib/ObjCopy/ELF/ELFObject.h
@@ -1210,6 +1210,7 @@ class Object {
 
   Error removeSections(bool AllowBrokenLinks,
std::function ToRemove);
+  Error compressOrDecompressSections(const CommonConfig &Config);
   Error replaceSections(const DenseMap &FromTo);
   Error removeSymbols(function_ref ToRemove);
   template  T &addSection(Ts &&...Args) {
diff --git a/llvm/test/tools/llvm-objcopy/ELF/Inputs/compress-debug-sections.yaml b/llvm/test/tools/llvm-objcopy/ELF/Inputs/compress-debug-sections.yaml
index 67d8435fa486c1..e2dfee9163a2b8 100644
--- a/llvm/test/tools/llvm-objcopy/ELF/Inputs/compress-debug-sections.yaml
+++ b/llvm/test/tools/llvm-objcopy/ELF/Inputs/compress-debug-sections.yaml
@@ -43,6 +43,10 @@ Sections:
 Type:SHT_PROGBITS
 Flags:   [ SHF_GROUP ]
 Content: '00'
+  - Name:.debug_alloc
+Type:SHT_PROGBITS
+Flags:   [ SHF_ALLOC ]
+Content: 000102030405060708090a0b0c0d0e0f000102030405060708090a0b0c

[llvm-branch-commits] [llvm-objcopy] Simplify --[de]compress-debug-sections and don't compress SHF_ALLOC sections (PR #84885)

2024-03-12 Thread James Henderson via llvm-branch-commits


@@ -214,33 +214,34 @@ static Error dumpSectionToFile(StringRef SecName, StringRef Filename,
SecName.str().c_str());
 }
 
-static bool isCompressable(const SectionBase &Sec) {
-  return !(Sec.Flags & ELF::SHF_COMPRESSED) &&
- StringRef(Sec.Name).starts_with(".debug");
-}
-
-static Error replaceDebugSections(
-Object &Obj, function_ref ShouldReplace,
-function_ref(const SectionBase *)> AddSection) {
+Error Object::compressOrDecompressSections(const CommonConfig &Config) {
   // Build a list of the debug sections we are going to replace.
   // We can't call `AddSection` while iterating over sections,
   // because it would mutate the sections array.
-  SmallVector ToReplace;
-  for (auto &Sec : Obj.sections())
-if (ShouldReplace(Sec))
-  ToReplace.push_back(&Sec);
-
-  // Build a mapping from original section to a new one.
-  DenseMap FromTo;
-  for (SectionBase *S : ToReplace) {
-Expected NewSection = AddSection(S);
-if (!NewSection)
-  return NewSection.takeError();
-
-FromTo[S] = *NewSection;
+  SmallVector>, 0>

jh7370 wrote:

I had this idea in my head that `0` was the default for `SmallVector`?
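
For context, the line under review declares the list of deferred replacements. The collect-then-apply pattern it supports can be sketched without LLVM types as below. This is only an illustration: `std::vector` and `std::function` stand in for `SmallVector` and the stored callback, and `Object`, `Section`, and the `.z` suffix are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

struct Section {
  std::string Name;
};

struct Object {
  std::vector<Section> Sections;

  // Appends a section and returns its index. This may reallocate Sections,
  // so it must not run while Sections is being scanned.
  std::size_t addSection(const std::string &Name) {
    Sections.push_back(Section{Name});
    return Sections.size() - 1;
  }

  // Phase 1 records which sections to replace and how to build each
  // replacement. Phase 2 runs the recorded callbacks once the scan is done.
  // Returns (old index, new index) pairs.
  std::vector<std::pair<std::size_t, std::size_t>> replaceDebugSections() {
    std::vector<std::pair<std::size_t, std::function<std::size_t()>>> ToReplace;
    for (std::size_t I = 0; I != Sections.size(); ++I) {
      if (Sections[I].Name.compare(0, 6, ".debug") != 0)
        continue;
      std::string NewName = Sections[I].Name + ".z";
      ToReplace.emplace_back(I, [this, NewName] { return addSection(NewName); });
    }

    std::vector<std::pair<std::size_t, std::size_t>> FromTo;
    for (auto &Entry : ToReplace)
      FromTo.emplace_back(Entry.first, Entry.second());
    return FromTo;
  }
};
```

Deferring the `addSection` calls keeps the scan from observing, or being invalidated by, the sections it is about to add.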

https://github.com/llvm/llvm-project/pull/84885


[llvm-branch-commits] [llvm-objcopy] Simplify --[de]compress-debug-sections and don't compress SHF_ALLOC sections (PR #84885)

2024-03-12 Thread James Henderson via llvm-branch-commits

https://github.com/jh7370 commented:

Looks basically fine to me.

https://github.com/llvm/llvm-project/pull/84885


[llvm-branch-commits] [llvm-objcopy] Simplify --[de]compress-debug-sections and don't compress SHF_ALLOC sections (PR #84885)

2024-03-12 Thread James Henderson via llvm-branch-commits

https://github.com/jh7370 edited https://github.com/llvm/llvm-project/pull/84885


[llvm-branch-commits] [llvm-objcopy] Simplify --[de]compress-debug-sections and don't compress SHF_ALLOC sections (PR #84885)

2024-03-12 Thread James Henderson via llvm-branch-commits


@@ -12,8 +12,10 @@
 # CHECK: Name  TypeAddress  Off
Size   ES Flg Lk Inf Al
 # COMPRESSED:.debug_fooPROGBITS 40 
{{.*}} 00   C  0   0  8
 # COMPRESSED-NEXT:   .notdebug_foo PROGBITS {{.*}} 
08 00  0   0  0
+# COMPRESSED:.debug_alloc  PROGBITS {{.*}} 
40 00   A  0   0  0

jh7370 wrote:

You presumably need changes to the zstd version of this test too?

https://github.com/llvm/llvm-project/pull/84885


[llvm-branch-commits] [IR] Introduce `llvm.experimental.hot()` (PR #84850)

2024-03-12 Thread Nikita Popov via llvm-branch-commits


@@ -27639,6 +27639,54 @@ constant `true`. However it is always correct to replace
 it with any other `i1` value. Any pass can
 freely do it if it can benefit from non-default lowering.
 
+'``llvm.experimental.hot``' Intrinsic
+^
+
+Syntax:
+"""
+
+::
+
+  declare i1 @llvm.experimental.hot()
+
+Overview:
+"
+
+This intrinsic returns true iff it's known that containing basic block is hot in
+profile.
+
+When used with profile based optimization allows to change program behaviour
+deppending on the code hotness.

nikic wrote:

```suggestion
depending on the code hotness.
```

https://github.com/llvm/llvm-project/pull/84850


[llvm-branch-commits] [IR] Introduce `llvm.experimental.hot()` (PR #84850)

2024-03-12 Thread Nikita Popov via llvm-branch-commits


@@ -27639,6 +27639,54 @@ constant `true`. However it is always correct to replace
 it with any other `i1` value. Any pass can
 freely do it if it can benefit from non-default lowering.
 
+'``llvm.experimental.hot``' Intrinsic
+^
+
+Syntax:
+"""
+
+::
+
+  declare i1 @llvm.experimental.hot()
+
+Overview:
+"
+
+This intrinsic returns true iff it's known that containing basic block is hot in
+profile.
+
+When used with profile based optimization allows to change program behaviour
+deppending on the code hotness.
+
+Arguments:
+""
+
+None.
+
+Semantics:
+""
+
+The intrinsic ``@llvm.experimental.hot()`` returns either `true` or `false`,
+deppending on profile used. Expresion is evaluated as `true` iff profile and
+summary are availible and profile counter for the block reach hotness threshold.
+For each evaluation of a call to this intrinsic, the program must be valid and
+correct both if it returns `true` and if it returns `false`.
+
+When used in a branch condition, it allows us to choose between
+two alternative correct solutions for the same problem, like
+in example below:
+
+.. code-block:: text
+
+%cond = call i1 @llvm.experimental.hot()
+br i1 %cond, label %fast_path, label %slow_path
+
+  label %fast_path:
+; Omit diagnostics.
+
+  label %slow_path:

nikic wrote:

```suggestion
  slow_path:
```
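
For reference, the semantics quoted in this hunk can be modeled outside LLVM roughly as follows. This is a sketch under assumptions: `BlockProfile` and `isHot` are hypothetical names, and "reach hotness threshold" is read here as greater than or equal.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the inputs the intrinsic's result depends on.
struct BlockProfile {
  bool HasProfile;       // a profile counter exists for the containing block
  bool HasSummary;       // a profile summary (and so a threshold) exists
  uint64_t Count;        // profile counter of the containing block
  uint64_t HotThreshold; // hotness threshold taken from the summary
};

// Models the documented result: true only when both the profile and the
// summary are available and the block counter reaches the threshold.
bool isHot(const BlockProfile &P) {
  if (!P.HasProfile || !P.HasSummary)
    return false;
  return P.Count >= P.HotThreshold;
}
```

Because the result is `false` whenever profile data is missing, code guarded by it has to be correct on both branches, as the quoted documentation requires.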

https://github.com/llvm/llvm-project/pull/84850


[llvm-branch-commits] [IR] Introduce `llvm.experimental.hot()` (PR #84850)

2024-03-12 Thread Nikita Popov via llvm-branch-commits


@@ -27639,6 +27639,54 @@ constant `true`. However it is always correct to replace
 it with any other `i1` value. Any pass can
 freely do it if it can benefit from non-default lowering.
 
+'``llvm.experimental.hot``' Intrinsic
+^
+
+Syntax:
+"""
+
+::
+
+  declare i1 @llvm.experimental.hot()
+
+Overview:
+"
+
+This intrinsic returns true iff it's known that containing basic block is hot in
+profile.
+
+When used with profile based optimization allows to change program behaviour
+deppending on the code hotness.
+
+Arguments:
+""
+
+None.
+
+Semantics:
+""
+
+The intrinsic ``@llvm.experimental.hot()`` returns either `true` or `false`,
+deppending on profile used. Expresion is evaluated as `true` iff profile and
+summary are availible and profile counter for the block reach hotness threshold.
+For each evaluation of a call to this intrinsic, the program must be valid and
+correct both if it returns `true` and if it returns `false`.
+
+When used in a branch condition, it allows us to choose between
+two alternative correct solutions for the same problem, like
+in example below:
+
+.. code-block:: text
+
+%cond = call i1 @llvm.experimental.hot()
+br i1 %cond, label %fast_path, label %slow_path
+
+  label %fast_path:

nikic wrote:

```suggestion
  fast_path:
```

https://github.com/llvm/llvm-project/pull/84850


[llvm-branch-commits] [IR] Introduce `llvm.experimental.hot()` (PR #84850)

2024-03-12 Thread Nikita Popov via llvm-branch-commits


@@ -27639,6 +27639,54 @@ constant `true`. However it is always correct to replace
 it with any other `i1` value. Any pass can
 freely do it if it can benefit from non-default lowering.
 
+'``llvm.experimental.hot``' Intrinsic
+^
+
+Syntax:
+"""
+
+::
+
+  declare i1 @llvm.experimental.hot()
+
+Overview:
+"
+
+This intrinsic returns true iff it's known that containing basic block is hot in
+profile.
+
+When used with profile based optimization allows to change program behaviour
+deppending on the code hotness.
+
+Arguments:
+""
+
+None.
+
+Semantics:
+""
+
+The intrinsic ``@llvm.experimental.hot()`` returns either `true` or `false`,

nikic wrote:

```suggestion
The intrinsic ``@llvm.experimental.hot()`` returns either ``true`` or ``false``,
```
Here and elsewhere.

https://github.com/llvm/llvm-project/pull/84850


[llvm-branch-commits] [IR] Introduce `llvm.experimental.hot()` (PR #84850)

2024-03-12 Thread Nikita Popov via llvm-branch-commits


@@ -27639,6 +27639,54 @@ constant `true`. However it is always correct to replace
 it with any other `i1` value. Any pass can
 freely do it if it can benefit from non-default lowering.
 
+'``llvm.experimental.hot``' Intrinsic
+^
+
+Syntax:
+"""
+
+::
+
+  declare i1 @llvm.experimental.hot()
+
+Overview:
+"
+
+This intrinsic returns true iff it's known that containing basic block is hot in
+profile.
+
+When used with profile based optimization allows to change program behaviour
+deppending on the code hotness.
+
+Arguments:
+""
+
+None.
+
+Semantics:
+""
+
+The intrinsic ``@llvm.experimental.hot()`` returns either `true` or `false`,
+deppending on profile used. Expresion is evaluated as `true` iff profile and

nikic wrote:

```suggestion
depending on profile used. Expresion is evaluated as `true` iff profile and
```

https://github.com/llvm/llvm-project/pull/84850


[llvm-branch-commits] [IR] Introduce `llvm.experimental.hot()` (PR #84850)

2024-03-12 Thread Nikita Popov via llvm-branch-commits


@@ -27639,6 +27639,54 @@ constant `true`. However it is always correct to replace
 it with any other `i1` value. Any pass can
 freely do it if it can benefit from non-default lowering.
 
+'``llvm.experimental.hot``' Intrinsic
+^
+
+Syntax:
+"""
+
+::
+
+  declare i1 @llvm.experimental.hot()
+
+Overview:
+"
+
+This intrinsic returns true iff it's known that containing basic block is hot in
+profile.
+
+When used with profile based optimization allows to change program behaviour
+deppending on the code hotness.
+
+Arguments:
+""
+
+None.
+
+Semantics:
+""
+
+The intrinsic ``@llvm.experimental.hot()`` returns either `true` or `false`,
+deppending on profile used. Expresion is evaluated as `true` iff profile and
+summary are availible and profile counter for the block reach hotness threshold.

nikic wrote:

```suggestion
summary are available and profile counter for the block reach hotness threshold.
```

https://github.com/llvm/llvm-project/pull/84850


[llvm-branch-commits] [IR] Introduce `llvm.experimental.hot()` (PR #84850)

2024-03-12 Thread Nikita Popov via llvm-branch-commits

nikic wrote:

Please submit an RFC on discourse for this change.

https://github.com/llvm/llvm-project/pull/84850


[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)

2024-03-12 Thread Wang Pengcheng via llvm-branch-commits

https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/84455

>From 35d0ea085b43a67c092e6263e6ec9d34e66e1453 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Tue, 12 Mar 2024 17:31:47 +0800
Subject: [PATCH] Reduce copies

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp |  89 +-
 llvm/test/CodeGen/RISCV/rvv/vmv-copy.mir |  30 +---
 llvm/test/CodeGen/RISCV/rvv/zvlsseg-copy.mir | 175 +++
 3 files changed, 106 insertions(+), 188 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 7895e87702c711..9fe5666d6a81f4 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -302,58 +302,38 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB,
 RISCVII::VLMUL LMul, unsigned NF) const {
   const TargetRegisterInfo *TRI = STI.getRegisterInfo();
 
-  int I = 0, End = NF, Incr = 1;
   unsigned SrcEncoding = TRI->getEncodingValue(SrcReg);
   unsigned DstEncoding = TRI->getEncodingValue(DstReg);
   unsigned LMulVal;
   bool Fractional;
   std::tie(LMulVal, Fractional) = RISCVVType::decodeVLMUL(LMul);
   assert(!Fractional && "It is impossible be fractional lmul here.");
-  if (forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NF * LMulVal)) {
-I = NF - 1;
-End = -1;
-Incr = -1;
-  }
+  unsigned NumRegs = NF * LMulVal;
+  bool ReversedCopy =
+  forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NumRegs);
 
-  for (; I != End; I += Incr) {
+  unsigned I = 0;
+  while (I != NumRegs) {
 auto GetCopyInfo =
-[](RISCVII::VLMUL LMul,unsigned NF) -> std::tuple {
-  unsigned Opc;
-  unsigned SubRegIdx;
-  unsigned VVOpc, VIOpc;
-  switch (LMul) {
-  default:
-llvm_unreachable("Impossible LMUL for vector register copy.");
-  case RISCVII::LMUL_1:
-Opc = RISCV::VMV1R_V;
-SubRegIdx = RISCV::sub_vrm1_0;
-VVOpc = RISCV::PseudoVMV_V_V_M1;
-VIOpc = RISCV::PseudoVMV_V_I_M1;
-break;
-  case RISCVII::LMUL_2:
-Opc = RISCV::VMV2R_V;
-SubRegIdx = RISCV::sub_vrm2_0;
-VVOpc = RISCV::PseudoVMV_V_V_M2;
-VIOpc = RISCV::PseudoVMV_V_I_M2;
-break;
-  case RISCVII::LMUL_4:
-Opc = RISCV::VMV4R_V;
-SubRegIdx = RISCV::sub_vrm4_0;
-VVOpc = RISCV::PseudoVMV_V_V_M4;
-VIOpc = RISCV::PseudoVMV_V_I_M4;
-break;
-  case RISCVII::LMUL_8:
-assert(NF == 1);
-Opc = RISCV::VMV8R_V;
-SubRegIdx = RISCV::sub_vrm1_0; // There is no sub_vrm8_0.
-VVOpc = RISCV::PseudoVMV_V_V_M8;
-VIOpc = RISCV::PseudoVMV_V_I_M8;
-break;
-  }
-  return {SubRegIdx, Opc, VVOpc, VIOpc};
+[&](unsigned SrcReg,
+unsigned DstReg) -> std::tuple {
+  unsigned SrcEncoding = TRI->getEncodingValue(SrcReg);
+  unsigned DstEncoding = TRI->getEncodingValue(DstReg);
+  if (!(SrcEncoding & 0b111) && !(DstEncoding & 0b111) && I + 8 <= NumRegs)
+return {8, RISCV::VRM8RegClass, RISCV::VMV8R_V, RISCV::PseudoVMV_V_V_M8,
+RISCV::PseudoVMV_V_I_M8};
+  if (!(SrcEncoding & 0b11) && !(DstEncoding & 0b11) && I + 4 <= NumRegs)
+return {4, RISCV::VRM4RegClass, RISCV::VMV4R_V, RISCV::PseudoVMV_V_V_M4,
+RISCV::PseudoVMV_V_I_M4};
+  if (!(SrcEncoding & 0b1) && !(DstEncoding & 0b1) && I + 2 <= NumRegs)
+return {2, RISCV::VRM2RegClass, RISCV::VMV2R_V, RISCV::PseudoVMV_V_V_M2,
+RISCV::PseudoVMV_V_I_M2};
+  return {1, RISCV::VRRegClass, RISCV::VMV1R_V, RISCV::PseudoVMV_V_V_M1,
+  RISCV::PseudoVMV_V_I_M1};
 };
 
-auto [SubRegIdx, Opc, VVOpc, VIOpc] = GetCopyInfo(LMul, NF);
+auto [NumCopied, RegClass, Opc, VVOpc, VIOpc] = GetCopyInfo(SrcReg, DstReg);
 
 MachineBasicBlock::const_iterator DefMBBI;
 if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) {
@@ -364,6 +344,20 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB,
   }
 }
 
+for (MCPhysReg Reg : RegClass.getRegisters()) {
+  if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(SrcReg)) {
+SrcReg = Reg;
+break;
+  }
+}
+
+for (MCPhysReg Reg : RegClass.getRegisters()) {
+  if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(DstReg)) {
+DstReg = Reg;
+break;
+  }
+}
+
 auto EmitCopy = [&](MCRegister SrcReg, MCRegister DstReg, unsigned Opcode) {
   auto MIB = BuildMI(MBB, MBBI, DL, get(Opcode), DstReg);
   bool UseVMV_V_I = RISCV::getRVVMCOpcode(Opcode) == RISCV::VMV_V_I;
@@ -385,13 +379,10 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB,
   }
 };
 
-if (NF == 1) {
-  EmitCopy(SrcReg, DstReg, Opc);
-  return;
-}
-
-EmitCopy(TRI->getSubReg(SrcReg, SubRegIdx + I),
- TRI->g
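
The chunk selection in the `GetCopyInfo` lambda above can be sketched in isolation as follows. This is a simplified model, not the patch itself: it covers only the forward copy direction, works on raw register encodings, and `planForwardCopy` is a hypothetical name. Each step picks the widest whole-register move (8, 4, 2, or 1 registers) for which both encodings are aligned and enough registers remain.

```cpp
#include <cassert>
#include <vector>

// Returns the width, in registers, of each copy needed to move NumRegs
// consecutive vector registers starting at the given encodings.
std::vector<unsigned> planForwardCopy(unsigned SrcEncoding,
                                      unsigned DstEncoding,
                                      unsigned NumRegs) {
  std::vector<unsigned> Chunks;
  unsigned I = 0;
  while (I != NumRegs) {
    unsigned Src = SrcEncoding + I;
    unsigned Dst = DstEncoding + I;
    unsigned Width = 1;
    // A register group of width W must start at an encoding divisible by W.
    if (!(Src & 7) && !(Dst & 7) && I + 8 <= NumRegs)
      Width = 8;
    else if (!(Src & 3) && !(Dst & 3) && I + 4 <= NumRegs)
      Width = 4;
    else if (!(Src & 1) && !(Dst & 1) && I + 2 <= NumRegs)
      Width = 2;
    Chunks.push_back(Width);
    I += Width;
  }
  return Chunks;
}
```

For example, copying six registers from encoding 2 to encoding 10 takes one 2-register move followed by one 4-register move in this model, instead of six single-register moves.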

[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)

2024-03-12 Thread Wang Pengcheng via llvm-branch-commits

https://github.com/wangpc-pp edited 
https://github.com/llvm/llvm-project/pull/84455


[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)

2024-03-12 Thread Wang Pengcheng via llvm-branch-commits

https://github.com/wangpc-pp ready_for_review 
https://github.com/llvm/llvm-project/pull/84455


[llvm-branch-commits] [RISCV] Store VLMul/NF into RegisterClass's TSFlags (PR #84894)

2024-03-12 Thread Wang Pengcheng via llvm-branch-commits

https://github.com/wangpc-pp created 
https://github.com/llvm/llvm-project/pull/84894

This TSFlags field was introduced by https://reviews.llvm.org/D108815.

Store VLMul/NF in TSFlags and add helpers to read them.

This removes some code, and I expect more uses to follow.
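
A minimal sketch of the idea, packing LMUL and NF into a small flags byte and reading them back with helpers. The bit layout, `makeTSFlags`, and the local `VLMul` enum are assumptions made for illustration; the real layout is whatever the patch defines in `RISCVRegisterInfo.td` and `RISCVRegisterInfo.h`.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-in for the LMUL values relevant to whole-register copies.
enum VLMul : uint8_t { LMUL_1 = 0, LMUL_2 = 1, LMUL_4 = 2, LMUL_8 = 3 };

// Assumed layout, for illustration only:
//   bits [2:0] hold the LMUL encoding, bits [5:3] hold NF - 1.
const unsigned VLMulShift = 0;
const unsigned NFShift = 3;
const uint8_t FieldMask = 0x7;

// Packs LMUL and NF (1..8) into one flags byte.
uint8_t makeTSFlags(VLMul LMul, unsigned NF) {
  return static_cast<uint8_t>((LMul << VLMulShift) | ((NF - 1) << NFShift));
}

// Helpers in the spirit of the getLMul/getNF calls used by the patch.
VLMul getLMul(uint8_t TSFlags) {
  return static_cast<VLMul>((TSFlags >> VLMulShift) & FieldMask);
}

unsigned getNF(uint8_t TSFlags) {
  return ((TSFlags >> NFShift) & FieldMask) + 1;
}
```

With the values stored on the register class, a caller can pass the class alone and derive both numbers from it, which is what lets the long chain of per-class `if` blocks collapse into a loop.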





[llvm-branch-commits] [RISCV] Store VLMul/NF into RegisterClass's TSFlags (PR #84894)

2024-03-12 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-risc-v

Author: Wang Pengcheng (wangpc-pp)


Changes

This TSFlags field was introduced by https://reviews.llvm.org/D108815.

Store VLMul/NF in TSFlags and add helpers to read them.

This removes some code, and I expect more uses to follow.


---
Full diff: https://github.com/llvm/llvm-project/pull/84894.diff


4 Files Affected:

- (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+20-92) 
- (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/RISCV/RISCVRegisterInfo.h (+22-1) 
- (modified) llvm/lib/Target/RISCV/RISCVRegisterInfo.td (+15-11) 


``diff
diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 9fe5666d6a81f4..3e52583ec8ad82 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -295,18 +295,17 @@ static bool isConvertibleToVMV_V_V(const RISCVSubtarget &STI,
   return false;
 }
 
-void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB,
-   MachineBasicBlock::iterator MBBI,
-   const DebugLoc &DL, MCRegister DstReg,
-   MCRegister SrcReg, bool KillSrc,
-   RISCVII::VLMUL LMul, unsigned NF) const {
+void RISCVInstrInfo::copyPhysRegVector(
+MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
+const DebugLoc &DL, MCRegister DstReg, MCRegister SrcReg, bool KillSrc,
+const TargetRegisterClass &RegClass) const {
   const TargetRegisterInfo *TRI = STI.getRegisterInfo();
+  RISCVII::VLMUL LMul = getLMul(RegClass.TSFlags);
+  unsigned NF = getNF(RegClass.TSFlags);
 
   unsigned SrcEncoding = TRI->getEncodingValue(SrcReg);
   unsigned DstEncoding = TRI->getEncodingValue(DstReg);
-  unsigned LMulVal;
-  bool Fractional;
-  std::tie(LMulVal, Fractional) = RISCVVType::decodeVLMUL(LMul);
+  auto [LMulVal, Fractional] = RISCVVType::decodeVLMUL(LMul);
   assert(!Fractional && "It is impossible be fractional lmul here.");
   unsigned NumRegs = NF * LMulVal;
   bool ReversedCopy =
@@ -489,90 +488,19 @@ void RISCVInstrInfo::copyPhysReg(MachineBasicBlock &MBB,
   }
 
   // VR->VR copies.
-  if (RISCV::VRRegClass.contains(DstReg, SrcReg)) {
-copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RISCVII::LMUL_1);
-return;
-  }
-
-  if (RISCV::VRM2RegClass.contains(DstReg, SrcReg)) {
-copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RISCVII::LMUL_2);
-return;
-  }
-
-  if (RISCV::VRM4RegClass.contains(DstReg, SrcReg)) {
-copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RISCVII::LMUL_4);
-return;
-  }
-
-  if (RISCV::VRM8RegClass.contains(DstReg, SrcReg)) {
-copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RISCVII::LMUL_8);
-return;
-  }
-
-  if (RISCV::VRN2M1RegClass.contains(DstReg, SrcReg)) {
-copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RISCVII::LMUL_1,
-  /*NF=*/2);
-return;
-  }
-
-  if (RISCV::VRN2M2RegClass.contains(DstReg, SrcReg)) {
-copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RISCVII::LMUL_2,
-  /*NF=*/2);
-return;
-  }
-
-  if (RISCV::VRN2M4RegClass.contains(DstReg, SrcReg)) {
-copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RISCVII::LMUL_4,
-  /*NF=*/2);
-return;
-  }
-
-  if (RISCV::VRN3M1RegClass.contains(DstReg, SrcReg)) {
-copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RISCVII::LMUL_1,
-  /*NF=*/3);
-return;
-  }
-
-  if (RISCV::VRN3M2RegClass.contains(DstReg, SrcReg)) {
-copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RISCVII::LMUL_2,
-  /*NF=*/3);
-return;
-  }
-
-  if (RISCV::VRN4M1RegClass.contains(DstReg, SrcReg)) {
-copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RISCVII::LMUL_1,
-  /*NF=*/4);
-return;
-  }
-
-  if (RISCV::VRN4M2RegClass.contains(DstReg, SrcReg)) {
-copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RISCVII::LMUL_2,
-  /*NF=*/4);
-return;
-  }
-
-  if (RISCV::VRN5M1RegClass.contains(DstReg, SrcReg)) {
-copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RISCVII::LMUL_1,
-  /*NF=*/5);
-return;
-  }
-
-  if (RISCV::VRN6M1RegClass.contains(DstReg, SrcReg)) {
-copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RISCVII::LMUL_1,
-  /*NF=*/6);
-return;
-  }
-
-  if (RISCV::VRN7M1RegClass.contains(DstReg, SrcReg)) {
-copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RISCVII::LMUL_1,
-  /*NF=*/7);
-return;
-  }
-
-  if (RISCV::VRN8M1RegClass.contains(DstReg, SrcReg)) {
-copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RISCVII::LMUL_1,
-  /*NF=*/8);
-return;
+  

[llvm-branch-commits] [llvm] [RISCV] Store VLMul/NF into RegisterClass's TSFlags (PR #84894)

2024-03-12 Thread Wang Pengcheng via llvm-branch-commits

https://github.com/wangpc-pp updated 
https://github.com/llvm/llvm-project/pull/84894

>From 951478b16d8aa834bff4494dc6d05c5f1175d59f Mon Sep 17 00:00:00 2001
From: Wang Pengcheng 
Date: Tue, 12 Mar 2024 18:41:50 +0800
Subject: [PATCH] Fix wrong arguments

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 3e52583ec8ad82..1b3e6cf10189c5 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -495,10 +495,7 @@ void RISCVInstrInfo::copyPhysReg(MachineBasicBlock &MBB,
 RISCV::VRN4M1RegClass, RISCV::VRN4M2RegClass, RISCV::VRN5M1RegClass,
 RISCV::VRN6M1RegClass, RISCV::VRN7M1RegClass, RISCV::VRN8M1RegClass}) {
 if (RegClass.contains(DstReg, SrcReg)) {
-  copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc,
-getLMul(RegClass.TSFlags),
-/*NF=*/
-getNF(RegClass.TSFlags));
+  copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RegClass);
   return;
 }
   }



[llvm-branch-commits] [llvm] [RISCV] Store VLMul/NF into RegisterClass's TSFlags (PR #84894)

2024-03-12 Thread Wang Pengcheng via llvm-branch-commits

https://github.com/wangpc-pp edited 
https://github.com/llvm/llvm-project/pull/84894


[llvm-branch-commits] [lld] [llvm] Backport fixes for ARM64EC import libraries (PR #84590)

2024-03-12 Thread Jacek Caban via llvm-branch-commits

cjacek wrote:

It's great to hear that it's enough for Rust. I re-reviewed those commits for 
their impact on non-ARM64EC targets, and I think it's safe enough.

One thing we could consider is skipping the .def file parser part of it. The main 
downside is that it would require skipping some tests too. Other 
than that, it's unlikely to matter, and I think that Rust calls 
`writeImportLibrary` directly anyway.

https://github.com/llvm/llvm-project/pull/84590


[llvm-branch-commits] [clang] [clang] Avoid -Wshadow warning when init-capture named same as class … (PR #84912)

2024-03-12 Thread Neil Henning via llvm-branch-commits

https://github.com/sheredom milestoned 
https://github.com/llvm/llvm-project/pull/84912


[llvm-branch-commits] [clang] [clang] Avoid -Wshadow warning when init-capture named same as class … (PR #84912)

2024-03-12 Thread Neil Henning via llvm-branch-commits

https://github.com/sheredom created 
https://github.com/llvm/llvm-project/pull/84912

…field (#74512)

A shadowing warning doesn't make much sense here, since the field is not available in 
the lambda's body without capturing `this`.

Fixes https://github.com/llvm/llvm-project/issues/71976
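
A small example of the pattern the description refers to. The `Counter` type is hypothetical: the init-capture `value` has the same name as a class field, but the lambda does not capture `this`, so inside its body the name can only refer to the capture.

```cpp
#include <cassert>

struct Counter {
  int value = 21;

  int doubled() const {
    // The initializer reads the field; the body sees only the init-capture,
    // because the lambda has no capture of `this`.
    auto f = [value = value * 2] { return value; };
    return f();
  }
};
```

Per the release note in this patch, `-Wshadow` is expected to stay quiet for a lambda like `f`, while the warning still applies when the lambda can capture `this`.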

>From d2740d74a34982f21ec18e2cc494d67187d5c7f2 Mon Sep 17 00:00:00 2001
From: Mariya Podchishchaeva 
Date: Mon, 12 Feb 2024 12:44:20 +0300
Subject: [PATCH] [clang] Avoid -Wshadow warning when init-capture named same
 as class field (#74512)

Shadowing warning doesn't make much sense since field is not available
in lambda's body without capturing this.

Fixes https://github.com/llvm/llvm-project/issues/71976
---
 clang/docs/ReleaseNotes.rst   |  3 +
 clang/include/clang/Sema/ScopeInfo.h  |  4 +-
 clang/lib/Sema/SemaDecl.cpp   | 73 +--
 clang/test/SemaCXX/warn-shadow-in-lambdas.cpp | 92 ++-
 4 files changed, 141 insertions(+), 31 deletions(-)

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index fc27297aea2d6c..14703e23ba0be0 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -899,6 +899,9 @@ Bug Fixes in This Version
 - Clang now doesn't produce false-positive warning `-Wconstant-logical-operand`
   for logical operators in C23.
   Fixes (`#64356 `_).
+- Clang's ``-Wshadow`` no longer warns when an init-capture is named the same as
+  a class field unless the lambda can capture this.
+  Fixes (`#71976 `_)
 
 Bug Fixes to Compiler Builtins
 ^^
diff --git a/clang/include/clang/Sema/ScopeInfo.h 
b/clang/include/clang/Sema/ScopeInfo.h
index 6eaa74382685ba..06e47eed4e93b6 100644
--- a/clang/include/clang/Sema/ScopeInfo.h
+++ b/clang/include/clang/Sema/ScopeInfo.h
@@ -925,8 +925,8 @@ class LambdaScopeInfo final :
   /// that were defined in parent contexts. Used to avoid warnings when the
   /// shadowed variables are uncaptured by this lambda.
   struct ShadowedOuterDecl {
-const VarDecl *VD;
-const VarDecl *ShadowedDecl;
+const NamedDecl *VD;
+const NamedDecl *ShadowedDecl;
   };
   llvm::SmallVector ShadowingDecls;
 
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index a300badc6d0260..f5bb3e0b42e26c 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -8396,28 +8396,40 @@ void Sema::CheckShadow(NamedDecl *D, NamedDecl 
*ShadowedDecl,
 
   unsigned WarningDiag = diag::warn_decl_shadow;
   SourceLocation CaptureLoc;
-  if (isa(D) && isa(ShadowedDecl) && NewDC &&
-  isa(NewDC)) {
+  if (isa(D) && NewDC && isa(NewDC)) {
 if (const auto *RD = dyn_cast(NewDC->getParent())) {
   if (RD->isLambda() && OldDC->Encloses(NewDC->getLexicalParent())) {
-if (RD->getLambdaCaptureDefault() == LCD_None) {
-  // Try to avoid warnings for lambdas with an explicit capture list.
+if (const auto *VD = dyn_cast(ShadowedDecl)) {
   const auto *LSI = cast(getCurFunction());
-  // Warn only when the lambda captures the shadowed decl explicitly.
-  CaptureLoc = getCaptureLocation(LSI, cast(ShadowedDecl));
-  if (CaptureLoc.isInvalid())
-WarningDiag = diag::warn_decl_shadow_uncaptured_local;
-} else {
-  // Remember that this was shadowed so we can avoid the warning if the
-  // shadowed decl isn't captured and the warning settings allow it.
+  if (RD->getLambdaCaptureDefault() == LCD_None) {
+// Try to avoid warnings for lambdas with an explicit capture
+// list. Warn only when the lambda captures the shadowed decl
+// explicitly.
+CaptureLoc = getCaptureLocation(LSI, VD);
+if (CaptureLoc.isInvalid())
+  WarningDiag = diag::warn_decl_shadow_uncaptured_local;
+  } else {
+// Remember that this was shadowed so we can avoid the warning if
+// the shadowed decl isn't captured and the warning settings allow
+// it.
+cast(getCurFunction())
+->ShadowingDecls.push_back({D, VD});
+return;
+  }
+}
+if (isa(ShadowedDecl)) {
+  // If lambda can capture this, then emit default shadowing warning,
+  // Otherwise it is not really a shadowing case since field is not
+  // available in lambda's body.
+  // At this point we don't know that lambda can capture this, so
+  // remember that this was shadowed and delay until we know.
   cast(getCurFunction())
-  ->ShadowingDecls.push_back(
-  {cast(D), cast(ShadowedDecl)});
+  ->ShadowingDecls.push_back({D, ShadowedDecl});
   return;
 }
   }
-
-  if (cast(ShadowedDecl)->hasLocalStorage()) {
+  if (const auto *VD = dyn_cast(Shadowed

[llvm-branch-commits] [clang] [clang] Avoid -Wshadow warning when init-capture named same as class … (PR #84912)

2024-03-12 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Neil Henning (sheredom)


Changes

…field (#74512)

A shadowing warning doesn't make much sense here, since the field is not 
available in the lambda's body without capturing `this`.

Fixes https://github.com/llvm/llvm-project/issues/71976

---
Full diff: https://github.com/llvm/llvm-project/pull/84912.diff


4 Files Affected:

- (modified) clang/docs/ReleaseNotes.rst (+3) 
- (modified) clang/include/clang/Sema/ScopeInfo.h (+2-2) 
- (modified) clang/lib/Sema/SemaDecl.cpp (+47-26) 
- (modified) clang/test/SemaCXX/warn-shadow-in-lambdas.cpp (+89-3) 


``diff
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index fc27297aea2d6c..14703e23ba0be0 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -899,6 +899,9 @@ Bug Fixes in This Version
 - Clang now doesn't produce false-positive warning `-Wconstant-logical-operand`
   for logical operators in C23.
   Fixes (`#64356 `_).
+- Clang's ``-Wshadow`` no longer warns when an init-capture is named the same as
+  a class field unless the lambda can capture this.
+  Fixes (`#71976 `_)
 
 Bug Fixes to Compiler Builtins
 ^^
diff --git a/clang/include/clang/Sema/ScopeInfo.h 
b/clang/include/clang/Sema/ScopeInfo.h
index 6eaa74382685ba..06e47eed4e93b6 100644
--- a/clang/include/clang/Sema/ScopeInfo.h
+++ b/clang/include/clang/Sema/ScopeInfo.h
@@ -925,8 +925,8 @@ class LambdaScopeInfo final :
   /// that were defined in parent contexts. Used to avoid warnings when the
   /// shadowed variables are uncaptured by this lambda.
   struct ShadowedOuterDecl {
-const VarDecl *VD;
-const VarDecl *ShadowedDecl;
+const NamedDecl *VD;
+const NamedDecl *ShadowedDecl;
   };
   llvm::SmallVector ShadowingDecls;
 
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index a300badc6d0260..f5bb3e0b42e26c 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -8396,28 +8396,40 @@ void Sema::CheckShadow(NamedDecl *D, NamedDecl 
*ShadowedDecl,
 
   unsigned WarningDiag = diag::warn_decl_shadow;
   SourceLocation CaptureLoc;
-  if (isa(D) && isa(ShadowedDecl) && NewDC &&
-  isa(NewDC)) {
+  if (isa(D) && NewDC && isa(NewDC)) {
 if (const auto *RD = dyn_cast(NewDC->getParent())) {
   if (RD->isLambda() && OldDC->Encloses(NewDC->getLexicalParent())) {
-if (RD->getLambdaCaptureDefault() == LCD_None) {
-  // Try to avoid warnings for lambdas with an explicit capture list.
+if (const auto *VD = dyn_cast(ShadowedDecl)) {
   const auto *LSI = cast(getCurFunction());
-  // Warn only when the lambda captures the shadowed decl explicitly.
-  CaptureLoc = getCaptureLocation(LSI, cast(ShadowedDecl));
-  if (CaptureLoc.isInvalid())
-WarningDiag = diag::warn_decl_shadow_uncaptured_local;
-} else {
-  // Remember that this was shadowed so we can avoid the warning if the
-  // shadowed decl isn't captured and the warning settings allow it.
+  if (RD->getLambdaCaptureDefault() == LCD_None) {
+// Try to avoid warnings for lambdas with an explicit capture
+// list. Warn only when the lambda captures the shadowed decl
+// explicitly.
+CaptureLoc = getCaptureLocation(LSI, VD);
+if (CaptureLoc.isInvalid())
+  WarningDiag = diag::warn_decl_shadow_uncaptured_local;
+  } else {
+// Remember that this was shadowed so we can avoid the warning if
+// the shadowed decl isn't captured and the warning settings allow
+// it.
+cast(getCurFunction())
+->ShadowingDecls.push_back({D, VD});
+return;
+  }
+}
+if (isa(ShadowedDecl)) {
+  // If lambda can capture this, then emit default shadowing warning,
+  // Otherwise it is not really a shadowing case since field is not
+  // available in lambda's body.
+  // At this point we don't know that lambda can capture this, so
+  // remember that this was shadowed and delay until we know.
   cast(getCurFunction())
-  ->ShadowingDecls.push_back(
-  {cast(D), cast(ShadowedDecl)});
+  ->ShadowingDecls.push_back({D, ShadowedDecl});
   return;
 }
   }
-
-  if (cast(ShadowedDecl)->hasLocalStorage()) {
+  if (const auto *VD = dyn_cast(ShadowedDecl);
+  VD && VD->hasLocalStorage()) {
 // A variable can't shadow a local variable in an enclosing scope, if
 // they are separated by a non-capturing declaration context.
 for (DeclContext *ParentDC = NewDC;
@@ -8468,19 +8480,28 @@ void Sema::CheckShadow(NamedDecl *D, NamedDecl 
*ShadowedDecl,
 /// when these variables are captured by the lambda.

[llvm-branch-commits] [clang] [clang] Avoid -Wshadow warning when init-capture named same as class … (PR #84912)

2024-03-12 Thread Neil Henning via llvm-branch-commits

https://github.com/sheredom edited 
https://github.com/llvm/llvm-project/pull/84912


[llvm-branch-commits] [llvm] 8a7f465 - Revert "[NVPTX] Add support for atomic add for f16 type (#84295)"

2024-03-12 Thread via llvm-branch-commits

Author: Danial Klimkin
Date: 2024-03-12T14:59:07+01:00
New Revision: 8a7f465b59913d84f454a366b017a780ad1172a9

URL: 
https://github.com/llvm/llvm-project/commit/8a7f465b59913d84f454a366b017a780ad1172a9
DIFF: 
https://github.com/llvm/llvm-project/commit/8a7f465b59913d84f454a366b017a780ad1172a9.diff

LOG: Revert "[NVPTX] Add support for atomic add for f16 type (#84295)"

This reverts commit 8e0f4b943fee13afc970ca8277a8e76b9da63b96.

Added: 


Modified: 
llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
llvm/test/CodeGen/NVPTX/atomics.ll

Removed: 
llvm/test/CodeGen/NVPTX/atomics-sm70.ll



diff  --git a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp 
b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
index c411c8ef9528d7..c979c03dc1b835 100644
--- a/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
@@ -6100,9 +6100,6 @@ 
NVPTXTargetLowering::shouldExpandAtomicRMWInIR(AtomicRMWInst *AI) const {
 
   if (AI->isFloatingPointOperation()) {
 if (AI->getOperation() == AtomicRMWInst::BinOp::FAdd) {
-  if (Ty->isHalfTy() && STI.getSmVersion() >= 70 &&
-  STI.getPTXVersion() >= 63)
-return AtomicExpansionKind::None;
   if (Ty->isFloatTy())
 return AtomicExpansionKind::None;
   if (Ty->isDoubleTy() && STI.hasAtomAddF64())

diff  --git a/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td 
b/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
index 869b13369e87e1..477789a164ead2 100644
--- a/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
+++ b/llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
@@ -1630,13 +1630,6 @@ defm INT_PTX_ATOM_ADD_GEN_64 : F_ATOMIC_2;
 
-defm INT_PTX_ATOM_ADD_G_F16 : F_ATOMIC_2, hasPTX<63>]>;
-defm INT_PTX_ATOM_ADD_S_F16 : F_ATOMIC_2, hasPTX<63>]>;
-defm INT_PTX_ATOM_ADD_GEN_F16 : F_ATOMIC_2, hasPTX<63>]>;
-
 defm INT_PTX_ATOM_ADD_G_F32 : F_ATOMIC_2;
 defm INT_PTX_ATOM_ADD_S_F32 : F_ATOMIC_2 Preds> {
   let AddedComplexity = 1 in {
-def : ATOM23_impl;
 def : ATOM23_impl;
@@ -2027,9 +2017,6 @@ multiclass ATOM2P_impl;
   def : ATOM23_impl;
@@ -2149,8 +2136,6 @@ multiclass ATOM2_add_impl {
defm _s32  : ATOM2S_impl;
defm _u32  : ATOM2S_impl;
defm _u64  : ATOM2S_impl;
-   defm _f16  : ATOM2S_impl, hasPTX<63>]>;
defm _f32  : ATOM2S_impl;
defm _f64  : ATOM2S_impl;
-; CHECK-NEXT:.reg .b32 %r<4>;
-; CHECK-EMPTY:
-; CHECK-NEXT:  // %bb.0:
-; CHECK-NEXT:ld.param.u32 %r1, [test_param_0];
-; CHECK-NEXT:ld.param.b16 %rs1, [test_param_3];
-; CHECK-NEXT:atom.add.noftz.f16 %rs2, [%r1], %rs1;
-; CHECK-NEXT:ld.param.u32 %r2, [test_param_1];
-; CHECK-NEXT:atom.global.add.noftz.f16 %rs3, [%r2], %rs1;
-; CHECK-NEXT:ld.param.u32 %r3, [test_param_2];
-; CHECK-NEXT:atom.shared.add.noftz.f16 %rs4, [%r3], %rs1;
-; CHECK-NEXT:ret;
-;
-; CHECK64-LABEL: test(
-; CHECK64:   {
-; CHECK64-NEXT:.reg .b16 %rs<5>;
-; CHECK64-NEXT:.reg .b64 %rd<4>;
-; CHECK64-EMPTY:
-; CHECK64-NEXT:  // %bb.0:
-; CHECK64-NEXT:ld.param.u64 %rd1, [test_param_0];
-; CHECK64-NEXT:ld.param.b16 %rs1, [test_param_3];
-; CHECK64-NEXT:atom.add.noftz.f16 %rs2, [%rd1], %rs1;
-; CHECK64-NEXT:ld.param.u64 %rd2, [test_param_1];
-; CHECK64-NEXT:atom.global.add.noftz.f16 %rs3, [%rd2], %rs1;
-; CHECK64-NEXT:ld.param.u64 %rd3, [test_param_2];
-; CHECK64-NEXT:atom.shared.add.noftz.f16 %rs4, [%rd3], %rs1;
-; CHECK64-NEXT:ret;
-;
-; CHECKPTX62-LABEL: test(
-; CHECKPTX62:   {
-; CHECKPTX62-NEXT:.reg .pred %p<4>;
-; CHECKPTX62-NEXT:.reg .b16 %rs<14>;
-; CHECKPTX62-NEXT:.reg .b32 %r<49>;
-; CHECKPTX62-EMPTY:
-; CHECKPTX62-NEXT:  // %bb.0:
-; CHECKPTX62-NEXT:ld.param.b16 %rs1, [test_param_3];
-; CHECKPTX62-NEXT:ld.param.u32 %r20, [test_param_2];
-; CHECKPTX62-NEXT:ld.param.u32 %r19, [test_param_1];
-; CHECKPTX62-NEXT:ld.param.u32 %r21, [test_param_0];
-; CHECKPTX62-NEXT:and.b32 %r1, %r21, -4;
-; CHECKPTX62-NEXT:and.b32 %r22, %r21, 3;
-; CHECKPTX62-NEXT:shl.b32 %r2, %r22, 3;
-; CHECKPTX62-NEXT:mov.b32 %r23, 65535;
-; CHECKPTX62-NEXT:shl.b32 %r24, %r23, %r2;
-; CHECKPTX62-NEXT:not.b32 %r3, %r24;
-; CHECKPTX62-NEXT:ld.u32 %r46, [%r1];
-; CHECKPTX62-NEXT:  $L__BB0_1: // %atomicrmw.start
-; CHECKPTX62-NEXT:// =>This Inner Loop Header: Depth=1
-; CHECKPTX62-NEXT:shr.u32 %r25, %r46, %r2;
-; CHECKPTX62-NEXT:cvt.u16.u32 %rs2, %r25;
-; CHECKPTX62-NEXT:add.rn.f16 %rs4, %rs2, %rs1;
-; CHECKPTX62-NEXT:cvt.u32.u16 %r26, %rs4;
-; CHECKPTX62-NEXT:shl.b32 %r27, %r26, %r2;
-; CHECKPTX62-NEXT:and.b32 %r28, %r46, %r3;
-; CHECKPTX62-NEXT:or.b32 %r29, %r28, %r27;
-; CHECKPTX62-NEXT:atom.cas.b32 %r6, [%r1], %r46, %r29;
-; CHECKPTX62-NEXT:setp.ne.s32 %p1, %r6, %r46;
-; CHECKPTX62-NEXT:mov.u32 %r46, %r6;
-; CHECKPTX62-NEXT:@%p1 bra $L__BB0_1;
-; CHECKPTX62-NEXT:  // %bb.2: // %atomicrmw.end
-; CHECKPTX62-NEXT:a

[llvm-branch-commits] [flang] [flang][OpenMP] Convert unique clauses in ClauseProcessor (PR #81622)

2024-03-12 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak approved this pull request.

LGTM.

https://github.com/llvm/llvm-project/pull/81622


[llvm-branch-commits] [flang] [flang][OpenMP] Convert repeatable clauses (except Map) in ClauseProc… (PR #81623)

2024-03-12 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak edited 
https://github.com/llvm/llvm-project/pull/81623


[llvm-branch-commits] [flang] [flang][OpenMP] Convert repeatable clauses (except Map) in ClauseProc… (PR #81623)

2024-03-12 Thread Sergio Afonso via llvm-branch-commits


@@ -28,9 +29,27 @@ namespace Fortran {
 namespace lower {
 namespace omp {
 
-void genObjectList(const Fortran::parser::OmpObjectList &objectList,
+void genObjectList(const ObjectList &objects,
Fortran::lower::AbstractConverter &converter,
llvm::SmallVectorImpl &operands) {
+  for (const Object &object : objects) {
+const Fortran::semantics::Symbol *sym = object.id();
+assert(sym && "Expected Symbol");
+if (mlir::Value variable = converter.getSymbolAddress(*sym)) {
+  operands.push_back(variable);
+} else {

skatrak wrote:

Nit: I know this follows the previous implementation, but I think it'd be 
better to collapse this into an `else if` and get rid of one nesting level. 
Feel free to ignore if you disagree.
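
For readers without the full diff at hand, the shape of the suggestion can be
sketched as follows. This is only an illustration: `Table`, `lookup`,
`collectNested` and `collectCollapsed` are hypothetical stand-ins, not the
flang functions under review, and the body of the real `else` branch is not
shown in this message.

```cpp
#include <map>
#include <string>
#include <vector>

using Table = std::map<std::string, int>;

// Hypothetical helper: returns a pointer to the mapped value, or nullptr.
const int *lookup(const Table &table, const std::string &name) {
  auto it = table.find(name);
  return it == table.end() ? nullptr : &it->second;
}

// Nested form: the `else` block contains nothing but another `if`.
void collectNested(const Table &primary, const Table &fallback,
                   const std::string &name, std::vector<int> &out) {
  if (const int *v = lookup(primary, name)) {
    out.push_back(*v);
  } else {
    if (const int *fb = lookup(fallback, name)) {
      out.push_back(*fb);
    }
  }
}

// Collapsed form: same behavior with one nesting level less.
void collectCollapsed(const Table &primary, const Table &fallback,
                      const std::string &name, std::vector<int> &out) {
  if (const int *v = lookup(primary, name)) {
    out.push_back(*v);
  } else if (const int *fb = lookup(fallback, name)) {
    out.push_back(*fb);
  }
}
```

The two functions are equivalent; the collapse only applies when the `else`
block holds a single `if` and nothing else.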

https://github.com/llvm/llvm-project/pull/81623


[llvm-branch-commits] [flang] [flang][OpenMP] Convert repeatable clauses (except Map) in ClauseProc… (PR #81623)

2024-03-12 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak approved this pull request.

Thank you for the clarification, LGTM.

https://github.com/llvm/llvm-project/pull/81623


[llvm-branch-commits] [mlir] release/18.x: [mlir][NFC] Apply rule of five to *Pass classes (#80998) (PR #83971)

2024-03-12 Thread Andrei Golubev via llvm-branch-commits

andrey-golubev wrote:

Ping @joker-eph: do you mind (back)porting this change to 18.x? If yes, let's 
close this; otherwise, please approve / merge. Thank you!

https://github.com/llvm/llvm-project/pull/83971


[llvm-branch-commits] [llvm] b6a29f4 - Revert "[analyzer] Accept C library functions from the `std` namespace (#84469)"

2024-03-12 Thread via llvm-branch-commits

Author: NagyDonat
Date: 2024-03-12T16:00:19+01:00
New Revision: b6a29f40aebde93977368979710ca1903482d4b4

URL: 
https://github.com/llvm/llvm-project/commit/b6a29f40aebde93977368979710ca1903482d4b4
DIFF: 
https://github.com/llvm/llvm-project/commit/b6a29f40aebde93977368979710ca1903482d4b4.diff

LOG: Revert "[analyzer] Accept C library functions from the `std` namespace (#84469)"

This reverts commit 80ab8234ac309418637488b97e0a62d8377b2ecf.

Added: 


Modified: 
clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h
clang/lib/StaticAnalyzer/Core/CheckerContext.cpp
clang/unittests/StaticAnalyzer/CMakeLists.txt
llvm/utils/gn/secondary/clang/unittests/StaticAnalyzer/BUILD.gn

Removed: 
clang/unittests/StaticAnalyzer/IsCLibraryFunctionTest.cpp



diff  --git 
a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h 
b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h
index b4e1636130ca7c..3432d2648633c2 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CallDescription.h
@@ -41,8 +41,12 @@ class CallDescription {
 ///  - We also accept calls where the number of arguments or parameters is
 ///greater than the specified value.
 /// For the exact heuristics, see CheckerContext::isCLibraryFunction().
-/// (This mode only matches functions that are declared either directly
-/// within a TU or in the namespace `std`.)
+/// Note that functions whose declaration context is not a TU (e.g.
+/// methods, functions in namespaces) are not accepted as C library
+/// functions.
+/// FIXME: If I understand it correctly, this discards calls where C++ code
+/// refers a C library function through the namespace `std::` via headers
+/// like .
 CLibrary,
 
 /// Matches "simple" functions that are not methods. (Static methods are

diff  --git a/clang/lib/StaticAnalyzer/Core/CheckerContext.cpp 
b/clang/lib/StaticAnalyzer/Core/CheckerContext.cpp
index 1a9bff529e9bb1..d6d4cec9dd3d4d 100644
--- a/clang/lib/StaticAnalyzer/Core/CheckerContext.cpp
+++ b/clang/lib/StaticAnalyzer/Core/CheckerContext.cpp
@@ -87,11 +87,9 @@ bool CheckerContext::isCLibraryFunction(const FunctionDecl 
*FD,
   if (!II)
 return false;
 
-  // C library functions are either declared directly within a TU (the common
-  // case) or they are accessed through the namespace `std` (when they are used
-  // in C++ via headers like ).
-  const DeclContext *DC = FD->getDeclContext()->getRedeclContext();
-  if (!(DC->isTranslationUnit() || DC->isStdNamespace()))
+  // Look through 'extern "C"' and anything similar invented in the future.
+  // If this function is not in TU directly, it is not a C library function.
+  if (!FD->getDeclContext()->getRedeclContext()->isTranslationUnit())
 return false;
 
   // If this function is not externally visible, it is not a C library function.

diff  --git a/clang/unittests/StaticAnalyzer/CMakeLists.txt 
b/clang/unittests/StaticAnalyzer/CMakeLists.txt
index db56e77331b821..775f0f8486b8f9 100644
--- a/clang/unittests/StaticAnalyzer/CMakeLists.txt
+++ b/clang/unittests/StaticAnalyzer/CMakeLists.txt
@@ -11,7 +11,6 @@ add_clang_unittest(StaticAnalysisTests
   CallEventTest.cpp
   ConflictingEvalCallsTest.cpp
   FalsePositiveRefutationBRVisitorTest.cpp
-  IsCLibraryFunctionTest.cpp
   NoStateChangeFuncVisitorTest.cpp
   ParamRegionTest.cpp
   RangeSetTest.cpp

diff  --git a/clang/unittests/StaticAnalyzer/IsCLibraryFunctionTest.cpp 
b/clang/unittests/StaticAnalyzer/IsCLibraryFunctionTest.cpp
deleted file mode 100644
index 19c66cc6bee1eb..00
--- a/clang/unittests/StaticAnalyzer/IsCLibraryFunctionTest.cpp
+++ /dev/null
@@ -1,89 +0,0 @@
-#include "clang/ASTMatchers/ASTMatchFinder.h"
-#include "clang/ASTMatchers/ASTMatchers.h"
-#include "clang/Analysis/AnalysisDeclContext.h"
-#include "clang/Frontend/ASTUnit.h"
-#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
-#include "clang/Tooling/Tooling.h"
-#include "gtest/gtest.h"
-
-#include 
-
-using namespace clang;
-using namespace ento;
-using namespace ast_matchers;
-
-testing::AssertionResult extractFunctionDecl(StringRef Code,
- const FunctionDecl *&Result) {
-  auto ASTUnit = tooling::buildASTFromCode(Code);
-  if (!ASTUnit)
-return testing::AssertionFailure() << "AST construction failed";
-
-  ASTContext &Context = ASTUnit->getASTContext();
-  if (Context.getDiagnostics().hasErrorOccurred())
-return testing::AssertionFailure() << "Compilation error";
-
-  auto Matches = ast_matchers::match(functionDecl().bind("fn"), Context);
-  if (Matches.empty())
-return testing::AssertionFailure() << "No function declaration found";
-
-  if (Matches.size() > 1)
-return testing::AssertionFailure()
-   

[llvm-branch-commits] [flang] [flang][Lower] Convert OMP Map and related functions to evaluate::Expr (PR #81626)

2024-03-12 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak commented:

Changes to the `ClauseProcessor` and OpenMP lowering look good to me, but I'm 
not familiar enough with the mapping work and OpenACC lowering to comment on 
these changes.

https://github.com/llvm/llvm-project/pull/81626


[llvm-branch-commits] [flang] [flang][OpenMP] Convert processTODO and remove unused objects (PR #81627)

2024-03-12 Thread Sergio Afonso via llvm-branch-commits

https://github.com/skatrak approved this pull request.


https://github.com/llvm/llvm-project/pull/81627


[llvm-branch-commits] [llvm] Backport ARM64EC variadic args fixes to LLVM 18 (PR #81800)

2024-03-12 Thread Daniel Paoliello via llvm-branch-commits

dpaoliello wrote:

@tstellar can we please get this merged into the v18 release branch?

https://github.com/llvm/llvm-project/pull/81800


[llvm-branch-commits] [llvm] release/18.x: [ArgPromotion] Remove incorrect TranspBlocks set for loads. (#84835) (PR #84945)

2024-03-12 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/84945


[llvm-branch-commits] [llvm] release/18.x: [ArgPromotion] Remove incorrect TranspBlocks set for loads. (#84835) (PR #84945)

2024-03-12 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/84945

Backport 31ffdb56b4df9b772d763dccabbfde542545d695 
bba4a1daff6ee09941f1369a4e56b4af95efdc5c

Requested by: @nikic

>From 4e79eba64e1c340691e9f565e81fef9e3fe005c3 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Mon, 11 Mar 2024 21:06:03 +
Subject: [PATCH 1/2] [ArgPromotion] Add test case for #84807.

Test case for https://github.com/llvm/llvm-project/issues/84807,
showing a mis-compile in ArgPromotion.

(cherry picked from commit 31ffdb56b4df9b772d763dccabbfde542545d695)
---
 ...ing-and-non-aliasing-loads-with-clobber.ll | 100 ++
 1 file changed, 100 insertions(+)
 create mode 100644 
llvm/test/Transforms/ArgumentPromotion/aliasing-and-non-aliasing-loads-with-clobber.ll

diff --git 
a/llvm/test/Transforms/ArgumentPromotion/aliasing-and-non-aliasing-loads-with-clobber.ll
 
b/llvm/test/Transforms/ArgumentPromotion/aliasing-and-non-aliasing-loads-with-clobber.ll
new file mode 100644
index 00..69385a7ea51a74
--- /dev/null
+++ 
b/llvm/test/Transforms/ArgumentPromotion/aliasing-and-non-aliasing-loads-with-clobber.ll
@@ -0,0 +1,100 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -p argpromotion -S %s | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+
+@f = dso_local global { i16, i64 } { i16 1, i64 0 }, align 8
+
+; Test case for https://github.com/llvm/llvm-project/issues/84807.
+
+; FIXME: Currently the loads from @callee are moved to @caller, even though
+;the store in %then may aliases to load from %q.
+
+define i32 @caller1(i1 %c) {
+; CHECK-LABEL: define i32 @caller1(
+; CHECK-SAME: i1 [[C:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[F_VAL:%.*]] = load i16, ptr @f, align 8
+; CHECK-NEXT:[[TMP0:%.*]] = getelementptr i8, ptr @f, i64 8
+; CHECK-NEXT:[[F_VAL1:%.*]] = load i64, ptr [[TMP0]], align 8
+; CHECK-NEXT:call void @callee1(i16 [[F_VAL]], i64 [[F_VAL1]], i1 [[C]])
+; CHECK-NEXT:ret i32 0
+;
+entry:
+  call void @callee1(ptr noundef nonnull @f, i1 %c)
+  ret i32 0
+}
+
+define internal void @callee1(ptr nocapture noundef readonly %q, i1 %c) {
+; CHECK-LABEL: define internal void @callee1(
+; CHECK-SAME: i16 [[Q_0_VAL:%.*]], i64 [[Q_8_VAL:%.*]], i1 [[C:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:br i1 [[C]], label [[THEN:%.*]], label [[EXIT:%.*]]
+; CHECK:   then:
+; CHECK-NEXT:store i16 123, ptr @f, align 8
+; CHECK-NEXT:br label [[EXIT]]
+; CHECK:   exit:
+; CHECK-NEXT:call void @use(i16 [[Q_0_VAL]], i64 [[Q_8_VAL]])
+; CHECK-NEXT:ret void
+;
+entry:
+  br i1 %c, label %then, label %exit
+
+then:
+  store i16 123, ptr @f, align 8
+  br label %exit
+
+exit:
+  %l.0 = load i16, ptr %q, align 8
+  %gep.8  = getelementptr inbounds i8, ptr %q, i64 8
+  %l.1 = load i64, ptr %gep.8, align 8
+  call void @use(i16 %l.0, i64 %l.1)
+  ret void
+
+  uselistorder ptr %q, { 1, 0 }
+}
+
+; Same as @caller1/callee2, but with default uselist order.
+define i32 @caller2(i1 %c) {
+; CHECK-LABEL: define i32 @caller2(
+; CHECK-SAME: i1 [[C:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:call void @callee2(ptr noundef nonnull @f, i1 [[C]])
+; CHECK-NEXT:ret i32 0
+;
+entry:
+  call void @callee2(ptr noundef nonnull @f, i1 %c)
+  ret i32 0
+}
+
+define internal void @callee2(ptr nocapture noundef readonly %q, i1 %c) {
+; CHECK-LABEL: define internal void @callee2(
+; CHECK-SAME: ptr nocapture noundef readonly [[Q:%.*]], i1 [[C:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:br i1 [[C]], label [[THEN:%.*]], label [[EXIT:%.*]]
+; CHECK:   then:
+; CHECK-NEXT:store i16 123, ptr @f, align 8
+; CHECK-NEXT:br label [[EXIT]]
+; CHECK:   exit:
+; CHECK-NEXT:[[Q_0_VAL:%.*]] = load i16, ptr [[Q]], align 8
+; CHECK-NEXT:[[GEP_8:%.*]] = getelementptr inbounds i8, ptr [[Q]], i64 8
+; CHECK-NEXT:[[Q_8_VAL:%.*]] = load i64, ptr [[GEP_8]], align 8
+; CHECK-NEXT:call void @use(i16 [[Q_0_VAL]], i64 [[Q_8_VAL]])
+; CHECK-NEXT:ret void
+;
+entry:
+  br i1 %c, label %then, label %exit
+
+then:
+  store i16 123, ptr @f, align 8
+  br label %exit
+
+exit:
+  %l.0 = load i16, ptr %q, align 8
+  %gep.8  = getelementptr inbounds i8, ptr %q, i64 8
+  %l.1 = load i64, ptr %gep.8, align 8
+  call void @use(i16 %l.0, i64 %l.1)
+  ret void
+}
+
+declare void @use(i16, i64)

>From a22faddda48bf793b75cac0f287998ffac20 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Tue, 12 Mar 2024 09:47:42 +
Subject: [PATCH 2/2] [ArgPromotion] Remove incorrect TranspBlocks set for
 loads. (#84835)

The TranspBlocks set was used to cache aliasing decision for all
processed loads in the parent loop. This is incorrect, because each load
can access a different location, which means one load not being modified
in a block doesn't translate to another load not being modified in the
same block.
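
The problem described above can be sketched with a toy model. Everything in
it is an assumption made for illustration: `Block`, `clobberedBefore`,
`promotableSharedCache`, `promotableFreshWalk` and `exampleCfg` are made-up
names, and the "alias query" is reduced to comparing integer offsets. It is
not the ArgumentPromotion code, only the same caching pattern in miniature.

```cpp
#include <cassert>
#include <set>
#include <vector>

struct Block {
  std::vector<int> preds; // indices of predecessor blocks
  int writes;             // offset this block stores to, or -1 for none
};

// Returns true if any block reachable backwards from the predecessors of
// `loadBlock` writes `loc`. Blocks already in `visited` are skipped, i.e.
// treated as already known to be transparent.
bool clobberedBefore(const std::vector<Block> &cfg, int loadBlock, int loc,
                     std::set<int> &visited) {
  assert(loadBlock >= 0 && loadBlock < static_cast<int>(cfg.size()));
  std::vector<int> stack(cfg[loadBlock].preds);
  while (!stack.empty()) {
    int b = stack.back();
    stack.pop_back();
    if (!visited.insert(b).second)
      continue;
    if (cfg[b].writes == loc)
      return true;
    for (int p : cfg[b].preds)
      stack.push_back(p);
  }
  return false;
}

// One visited set shared by all loads: a block found transparent for one
// location is wrongly assumed transparent for every other location.
bool promotableSharedCache(const std::vector<Block> &cfg, int loadBlock,
                           const std::vector<int> &locs) {
  std::set<int> visited;
  for (int loc : locs)
    if (clobberedBefore(cfg, loadBlock, loc, visited))
      return false;
  return true;
}

// A fresh visited set per load: each location gets its own walk.
bool promotableFreshWalk(const std::vector<Block> &cfg, int loadBlock,
                         const std::vector<int> &locs) {
  for (int loc : locs) {
    std::set<int> visited;
    if (clobberedBefore(cfg, loadBlock, loc, visited))
      return false;
  }
  return true;
}

// entry (0) -> then (1, stores to offset 0) -> exit (2); entry -> exit.
std::vector<Block> exampleCfg() {
  return {Block{{}, -1}, Block{{0}, 0}, Block{{0, 1}, -1}};
}
```

With loads of offsets 8 and 0 in the exit block, processed in that order, the
shared-cache variant reports no clobber because the walk for offset 8 already
marked the `then` block as visited. The fresh-walk variant finds the store to
offset 0. Reversing the order hides the difference, which resembles the way
the test in patch 1/2 relies on a non-default `uselistorder`.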

All loads access the 

[llvm-branch-commits] [llvm] release/18.x: [ArgPromotion] Remove incorrect TranspBlocks set for loads. (#84835) (PR #84945)

2024-03-12 Thread via llvm-branch-commits

llvmbot wrote:

@nikic What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/84945


[llvm-branch-commits] [llvm] release/18.x: [ArgPromotion] Remove incorrect TranspBlocks set for loads. (#84835) (PR #84945)

2024-03-12 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: None (llvmbot)


Changes

Backport 31ffdb56b4df9b772d763dccabbfde542545d695 
bba4a1daff6ee09941f1369a4e56b4af95efdc5c

Requested by: @nikic

---
Full diff: https://github.com/llvm/llvm-project/pull/84945.diff


2 Files Affected:

- (modified) llvm/lib/Transforms/IPO/ArgumentPromotion.cpp (+1-5) 
- (added) 
llvm/test/Transforms/ArgumentPromotion/aliasing-and-non-aliasing-loads-with-clobber.ll
 (+100) 


``diff
diff --git a/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp 
b/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
index 8058282c422503..062a3d341007ce 100644
--- a/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
+++ b/llvm/lib/Transforms/IPO/ArgumentPromotion.cpp
@@ -652,10 +652,6 @@ static bool findArgParts(Argument *Arg, const DataLayout 
&DL, AAResults &AAR,
   // check to see if the pointer is guaranteed to not be modified from entry of
   // the function to each of the load instructions.
 
-  // Because there could be several/many load instructions, remember which
-  // blocks we know to be transparent to the load.
-  df_iterator_default_set TranspBlocks;
-
   for (LoadInst *Load : Loads) {
 // Check to see if the load is invalidated from the start of the block to
 // the load itself.
@@ -669,7 +665,7 @@ static bool findArgParts(Argument *Arg, const DataLayout 
&DL, AAResults &AAR,
 // To do this, we perform a depth first search on the inverse CFG from the
 // loading block.
 for (BasicBlock *P : predecessors(BB)) {
-  for (BasicBlock *TranspBB : inverse_depth_first_ext(P, TranspBlocks))
+  for (BasicBlock *TranspBB : inverse_depth_first(P))
 if (AAR.canBasicBlockModify(*TranspBB, Loc))
   return false;
 }
diff --git 
a/llvm/test/Transforms/ArgumentPromotion/aliasing-and-non-aliasing-loads-with-clobber.ll
 
b/llvm/test/Transforms/ArgumentPromotion/aliasing-and-non-aliasing-loads-with-clobber.ll
new file mode 100644
index 00..1e1669b29b0db6
--- /dev/null
+++ 
b/llvm/test/Transforms/ArgumentPromotion/aliasing-and-non-aliasing-loads-with-clobber.ll
@@ -0,0 +1,100 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -p argpromotion -S %s | FileCheck %s
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+
+@f = dso_local global { i16, i64 } { i16 1, i64 0 }, align 8
+
+; Test case for https://github.com/llvm/llvm-project/issues/84807.
+
+; Make sure the loads from @callee are not moved to @caller, as the store
+; in %then may aliases to load from %q.
+
+define i32 @caller1(i1 %c) {
+; CHECK-LABEL: define i32 @caller1(
+; CHECK-SAME: i1 [[C:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:call void @callee1(ptr noundef nonnull @f, i1 [[C]])
+; CHECK-NEXT:ret i32 0
+;
+entry:
+  call void @callee1(ptr noundef nonnull @f, i1 %c)
+  ret i32 0
+}
+
+define internal void @callee1(ptr nocapture noundef readonly %q, i1 %c) {
+; CHECK-LABEL: define internal void @callee1(
+; CHECK-SAME: ptr nocapture noundef readonly [[Q:%.*]], i1 [[C:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:br i1 [[C]], label [[THEN:%.*]], label [[EXIT:%.*]]
+; CHECK:   then:
+; CHECK-NEXT:store i16 123, ptr @f, align 8
+; CHECK-NEXT:br label [[EXIT]]
+; CHECK:   exit:
+; CHECK-NEXT:[[Q_0_VAL:%.*]] = load i16, ptr [[Q]], align 8
+; CHECK-NEXT:[[GEP_8:%.*]] = getelementptr inbounds i8, ptr [[Q]], i64 8
+; CHECK-NEXT:[[Q_8_VAL:%.*]] = load i64, ptr [[GEP_8]], align 8
+; CHECK-NEXT:call void @use(i16 [[Q_0_VAL]], i64 [[Q_8_VAL]])
+; CHECK-NEXT:ret void
+;
+entry:
+  br i1 %c, label %then, label %exit
+
+then:
+  store i16 123, ptr @f, align 8
+  br label %exit
+
+exit:
+  %l.0 = load i16, ptr %q, align 8
+  %gep.8  = getelementptr inbounds i8, ptr %q, i64 8
+  %l.1 = load i64, ptr %gep.8, align 8
+  call void @use(i16 %l.0, i64 %l.1)
+  ret void
+
+  uselistorder ptr %q, { 1, 0 }
+}
+
+; Same as @caller1/@callee1, but with default uselist order.
+define i32 @caller2(i1 %c) {
+; CHECK-LABEL: define i32 @caller2(
+; CHECK-SAME: i1 [[C:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:call void @callee2(ptr noundef nonnull @f, i1 [[C]])
+; CHECK-NEXT:ret i32 0
+;
+entry:
+  call void @callee2(ptr noundef nonnull @f, i1 %c)
+  ret i32 0
+}
+
+define internal void @callee2(ptr nocapture noundef readonly %q, i1 %c) {
+; CHECK-LABEL: define internal void @callee2(
+; CHECK-SAME: ptr nocapture noundef readonly [[Q:%.*]], i1 [[C:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:br i1 [[C]], label [[THEN:%.*]], label [[EXIT:%.*]]
+; CHECK:   then:
+; CHECK-NEXT:store i16 123, ptr @f, align 8
+; CHECK-NEXT:br label [[EXIT]]
+; CHECK:   exit:
+; CHECK-NEXT:[[Q_0_VAL:%.*]] = load i16, ptr [[Q]], align 8
+; CHECK-NEXT:[[GEP_8:%.*]] = getelementptr inbounds i8, ptr [[Q]], i64 8
+; CHECK-NEXT:[[Q_8_VAL:%.*]] = load i64, ptr [[GEP_8]], align 8
+;

[llvm-branch-commits] [llvm] release/18.x: [DSE] Remove malloc from EarliestEscapeInfo before removing. (#84157) (PR #84946)

2024-03-12 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/84946


[llvm-branch-commits] [llvm] release/18.x: [DSE] Remove malloc from EarliestEscapeInfo before removing. (#84157) (PR #84946)

2024-03-12 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/84946

Backport eb8f379567e8d014194faefe02ce92813e237afc

Requested by: @nikic

>From b1850085ed8561ed051f17d6eb1bdcdbc1c5c8a7 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Wed, 6 Mar 2024 20:08:00 +
Subject: [PATCH] [DSE] Remove malloc from EarliestEscapeInfo before removing.
 (#84157)

Not removing the malloc from earliest escape info leaves stale entries
in the cache.
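
The stale-entry problem described above can be illustrated with a small standalone sketch. It does not use any LLVM types: `Inst`, `EscapeCache`, and both erase helpers are hypothetical names introduced only for illustration, and the cache is assumed to be keyed by object address.

```cpp
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-in for an IR instruction; not an LLVM class.
struct Inst {
  int id;
};

// Hypothetical cache keyed by object address, loosely modelled on the idea
// of remembering the earliest escape point for an allocation.
class EscapeCache {
  std::unordered_map<const Inst *, int> earliestEscape;

public:
  void record(const Inst *obj, int escapePoint) {
    earliestEscape[obj] = escapePoint;
  }
  // Drops the entry; intended to be called before the object is destroyed.
  void removeInstruction(const Inst *obj) { earliestEscape.erase(obj); }
  std::size_t size() const { return earliestEscape.size(); }
};

// Destroys the object but forgets the cache: the key now names freed memory,
// and a later allocation at the same address would inherit the old entry.
inline void eraseWithoutInvalidation(EscapeCache &, Inst *inst) { delete inst; }

// Drops the cache entry first, then destroys the object.
inline void eraseWithInvalidation(EscapeCache &cache, Inst *inst) {
  cache.removeInstruction(inst);
  delete inst;
}
```

After `eraseWithoutInvalidation` the cache still holds one entry for the dead object; after `eraseWithInvalidation` it holds none. Routing the deletion through a single helper that also updates every dependent cache is the general shape of the fix.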

Fixes https://github.com/llvm/llvm-project/issues/84051.

PR: https://github.com/llvm/llvm-project/pull/84157
(cherry picked from commit eb8f379567e8d014194faefe02ce92813e237afc)
---
 .../Scalar/DeadStoreElimination.cpp   |   4 +-
 ...alloc-earliest-escape-info-invalidation.ll | 302 ++
 2 files changed, 304 insertions(+), 2 deletions(-)
 create mode 100644 llvm/test/Transforms/DeadStoreElimination/malloc-earliest-escape-info-invalidation.ll

diff --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
index 340fba4fb9c5a2..380d6583655367 100644
--- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
@@ -1907,15 +1907,15 @@ struct DSEState {
   Malloc->getArgOperand(0), IRB, TLI);
 if (!Calloc)
   return false;
+
 MemorySSAUpdater Updater(&MSSA);
 auto *NewAccess =
   Updater.createMemoryAccessAfter(cast(Calloc), nullptr,
   MallocDef);
 auto *NewAccessMD = cast(NewAccess);
 Updater.insertDef(NewAccessMD, /*RenameUses=*/true);
-Updater.removeMemoryAccess(Malloc);
 Malloc->replaceAllUsesWith(Calloc);
-Malloc->eraseFromParent();
+deleteDeadInstruction(Malloc);
 return true;
   }
 
diff --git a/llvm/test/Transforms/DeadStoreElimination/malloc-earliest-escape-info-invalidation.ll b/llvm/test/Transforms/DeadStoreElimination/malloc-earliest-escape-info-invalidation.ll
new file mode 100644
index 00..60a010cd49ceda
--- /dev/null
+++ b/llvm/test/Transforms/DeadStoreElimination/malloc-earliest-escape-info-invalidation.ll
@@ -0,0 +1,302 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -p dse -S %s | FileCheck %s
+
+target datalayout = "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64"
+
+define void @widget(ptr %a) {
+; CHECK-LABEL: define void @widget(
+; CHECK-SAME: ptr [[A:%.*]]) {
+; CHECK-NEXT:  bb:
+; CHECK-NEXT:[[CALL1:%.*]] = tail call noalias ptr @malloc(i64 0)
+; CHECK-NEXT:store ptr [[CALL1]], ptr [[A]], align 8
+; CHECK-NEXT:[[LOAD:%.*]] = load ptr, ptr [[A]], align 8
+; CHECK-NEXT:[[LOAD2:%.*]] = load i32, ptr [[LOAD]], align 8
+; CHECK-NEXT:[[GETELEMENTPTR:%.*]] = getelementptr i8, ptr [[CALL1]], i64 0
+; CHECK-NEXT:[[GETELEMENTPTR3:%.*]] = getelementptr i8, ptr [[GETELEMENTPTR]], i64 1
+; CHECK-NEXT:[[GETELEMENTPTR4:%.*]] = getelementptr i8, ptr [[GETELEMENTPTR]], i64 8
+; CHECK-NEXT:store i16 0, ptr [[GETELEMENTPTR4]], align 4
+; CHECK-NEXT:[[GETELEMENTPTR5:%.*]] = getelementptr i8, ptr [[GETELEMENTPTR]], i64 12
+; CHECK-NEXT:store i32 0, ptr [[CALL1]], align 4
+; CHECK-NEXT:[[LOAD6:%.*]] = load i32, ptr inttoptr (i64 4 to ptr), align 4
+; CHECK-NEXT:br label [[BB48:%.*]]
+; CHECK:   bb7:
+; CHECK-NEXT:br label [[BB9:%.*]]
+; CHECK:   bb8:
+; CHECK-NEXT:br label [[BB53:%.*]]
+; CHECK:   bb9:
+; CHECK-NEXT:[[PHI:%.*]] = phi ptr [ [[CALL1]], [[BB7:%.*]] ], [ [[A]], [[BB43:%.*]] ]
+; CHECK-NEXT:[[GETELEMENTPTR10:%.*]] = getelementptr i8, ptr [[PHI]], i64 0
+; CHECK-NEXT:[[GETELEMENTPTR11:%.*]] = getelementptr i8, ptr [[PHI]], i64 0
+; CHECK-NEXT:[[GETELEMENTPTR12:%.*]] = getelementptr i8, ptr [[PHI]], i64 0
+; CHECK-NEXT:[[GETELEMENTPTR13:%.*]] = getelementptr i8, ptr [[GETELEMENTPTR12]], i64 1
+; CHECK-NEXT:store i8 0, ptr [[CALL1]], align 1
+; CHECK-NEXT:br label [[BB29:%.*]]
+; CHECK:   bb14:
+; CHECK-NEXT:[[GETELEMENTPTR15:%.*]] = getelementptr i8, ptr [[GETELEMENTPTR10]], i64 8
+; CHECK-NEXT:[[LOAD16:%.*]] = load i16, ptr [[CALL1]], align 4
+; CHECK-NEXT:br i1 false, label [[BB22:%.*]], label [[BB17:%.*]]
+; CHECK:   bb17:
+; CHECK-NEXT:[[GETELEMENTPTR18:%.*]] = getelementptr i8, ptr [[GETELEMENTPTR11]], i64 8
+; CHECK-NEXT:[[LOAD19:%.*]] = load i16, ptr [[CALL1]], align 4
+; CHECK-NEXT:[[GETELEMENTPTR20:%.*]] = getelementptr i8, ptr [[GETELEMENTPTR12]], i64 8
+; CHECK-NEXT:store i16 0, ptr [[CALL1]], align 4
+; CHECK-NEXT:[[GETELEMENTPTR21:%.*]] = getelementptr i8, ptr [[PHI]], i64 0
+; CHECK-NEXT:br label [[BB25:%.*]]
+; CHECK:   bb22:
+; CHECK-NEXT:[[GETELEMENTPTR23:%.*]] = getelementptr i8, ptr [[PHI]], i64 0
+; CHECK-NEXT:[[GETELEMENTPTR24:%.*]] = getelementptr i8, ptr [[GETELEMENTPTR23]], i64 12
+; CHECK-NEXT:br label [[BB25]]
+; CHECK:  

[llvm-branch-commits] [llvm] release/18.x: [DSE] Remove malloc from EarliestEscapeInfo before removing. (#84157) (PR #84946)

2024-03-12 Thread via llvm-branch-commits

llvmbot wrote:

@nikic What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/84946


[llvm-branch-commits] [llvm] release/18.x: [DSE] Remove malloc from EarliestEscapeInfo before removing. (#84157) (PR #84946)

2024-03-12 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: None (llvmbot)


Changes

Backport eb8f379567e8d014194faefe02ce92813e237afc

Requested by: @nikic

---
Full diff: https://github.com/llvm/llvm-project/pull/84946.diff


2 Files Affected:

- (modified) llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp (+2-2) 
- (added) llvm/test/Transforms/DeadStoreElimination/malloc-earliest-escape-info-invalidation.ll (+302)


``diff
diff --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
index 340fba4fb9c5a2..380d6583655367 100644
--- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
@@ -1907,15 +1907,15 @@ struct DSEState {
   Malloc->getArgOperand(0), IRB, TLI);
 if (!Calloc)
   return false;
+
 MemorySSAUpdater Updater(&MSSA);
 auto *NewAccess =
   Updater.createMemoryAccessAfter(cast(Calloc), nullptr,
   MallocDef);
 auto *NewAccessMD = cast(NewAccess);
 Updater.insertDef(NewAccessMD, /*RenameUses=*/true);
-Updater.removeMemoryAccess(Malloc);
 Malloc->replaceAllUsesWith(Calloc);
-Malloc->eraseFromParent();
+deleteDeadInstruction(Malloc);
 return true;
   }
 
diff --git a/llvm/test/Transforms/DeadStoreElimination/malloc-earliest-escape-info-invalidation.ll b/llvm/test/Transforms/DeadStoreElimination/malloc-earliest-escape-info-invalidation.ll
new file mode 100644
index 00..60a010cd49ceda
--- /dev/null
+++ b/llvm/test/Transforms/DeadStoreElimination/malloc-earliest-escape-info-invalidation.ll
@@ -0,0 +1,302 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -p dse -S %s | FileCheck %s
+
+target datalayout = "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64"
+
+define void @widget(ptr %a) {
+; CHECK-LABEL: define void @widget(
+; CHECK-SAME: ptr [[A:%.*]]) {
+; CHECK-NEXT:  bb:
+; CHECK-NEXT:[[CALL1:%.*]] = tail call noalias ptr @malloc(i64 0)
+; CHECK-NEXT:store ptr [[CALL1]], ptr [[A]], align 8
+; CHECK-NEXT:[[LOAD:%.*]] = load ptr, ptr [[A]], align 8
+; CHECK-NEXT:[[LOAD2:%.*]] = load i32, ptr [[LOAD]], align 8
+; CHECK-NEXT:[[GETELEMENTPTR:%.*]] = getelementptr i8, ptr [[CALL1]], i64 0
+; CHECK-NEXT:[[GETELEMENTPTR3:%.*]] = getelementptr i8, ptr [[GETELEMENTPTR]], i64 1
+; CHECK-NEXT:[[GETELEMENTPTR4:%.*]] = getelementptr i8, ptr [[GETELEMENTPTR]], i64 8
+; CHECK-NEXT:store i16 0, ptr [[GETELEMENTPTR4]], align 4
+; CHECK-NEXT:[[GETELEMENTPTR5:%.*]] = getelementptr i8, ptr [[GETELEMENTPTR]], i64 12
+; CHECK-NEXT:store i32 0, ptr [[CALL1]], align 4
+; CHECK-NEXT:[[LOAD6:%.*]] = load i32, ptr inttoptr (i64 4 to ptr), align 4
+; CHECK-NEXT:br label [[BB48:%.*]]
+; CHECK:   bb7:
+; CHECK-NEXT:br label [[BB9:%.*]]
+; CHECK:   bb8:
+; CHECK-NEXT:br label [[BB53:%.*]]
+; CHECK:   bb9:
+; CHECK-NEXT:[[PHI:%.*]] = phi ptr [ [[CALL1]], [[BB7:%.*]] ], [ [[A]], [[BB43:%.*]] ]
+; CHECK-NEXT:[[GETELEMENTPTR10:%.*]] = getelementptr i8, ptr [[PHI]], i64 0
+; CHECK-NEXT:[[GETELEMENTPTR11:%.*]] = getelementptr i8, ptr [[PHI]], i64 0
+; CHECK-NEXT:[[GETELEMENTPTR12:%.*]] = getelementptr i8, ptr [[PHI]], i64 0
+; CHECK-NEXT:[[GETELEMENTPTR13:%.*]] = getelementptr i8, ptr [[GETELEMENTPTR12]], i64 1
+; CHECK-NEXT:store i8 0, ptr [[CALL1]], align 1
+; CHECK-NEXT:br label [[BB29:%.*]]
+; CHECK:   bb14:
+; CHECK-NEXT:[[GETELEMENTPTR15:%.*]] = getelementptr i8, ptr [[GETELEMENTPTR10]], i64 8
+; CHECK-NEXT:[[LOAD16:%.*]] = load i16, ptr [[CALL1]], align 4
+; CHECK-NEXT:br i1 false, label [[BB22:%.*]], label [[BB17:%.*]]
+; CHECK:   bb17:
+; CHECK-NEXT:[[GETELEMENTPTR18:%.*]] = getelementptr i8, ptr [[GETELEMENTPTR11]], i64 8
+; CHECK-NEXT:[[LOAD19:%.*]] = load i16, ptr [[CALL1]], align 4
+; CHECK-NEXT:[[GETELEMENTPTR20:%.*]] = getelementptr i8, ptr [[GETELEMENTPTR12]], i64 8
+; CHECK-NEXT:store i16 0, ptr [[CALL1]], align 4
+; CHECK-NEXT:[[GETELEMENTPTR21:%.*]] = getelementptr i8, ptr [[PHI]], i64 0
+; CHECK-NEXT:br label [[BB25:%.*]]
+; CHECK:   bb22:
+; CHECK-NEXT:[[GETELEMENTPTR23:%.*]] = getelementptr i8, ptr [[PHI]], i64 0
+; CHECK-NEXT:[[GETELEMENTPTR24:%.*]] = getelementptr i8, ptr [[GETELEMENTPTR23]], i64 12
+; CHECK-NEXT:br label [[BB25]]
+; CHECK:   bb25:
+; CHECK-NEXT:[[PHI26:%.*]] = phi ptr [ [[A]], [[BB17]] ], [ [[CALL1]], [[BB22]] ]
+; CHECK-NEXT:[[PHI27:%.*]] = phi ptr [ [[CALL1]], [[BB17]] ], [ [[CALL1]], [[BB22]] ]
+; CHECK-NEXT:[[PHI28:%.*]] = phi ptr [ [[CALL1]], [[BB17]] ], [ [[CALL1]], [[BB22]] ]
+; CHECK-NEXT:store i32 0, ptr [[CALL1]], align 4
+; CHECK-NEXT:br label [[BB29]]
+; CHECK:   bb29:
+; CHECK-NEXT:[[PHI30:%.*]] = phi ptr [ [[CALL1]], [[BB9]] ], [ [[CALL1]], 
[[BB25]] 

[llvm-branch-commits] [llvm] release/18.x: [ValueTracking] Treat phi as underlying obj when not decomposing further (#84339) (PR #84950)

2024-03-12 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/84950

Backport 4cfd4a7896b5fd50274ec8573c259d7ad41741de 
b274b23665dec30f3ae4fb83ccca8b77e6d3ada3

Requested by: @nikic

>From 78be28fbce89bd342f5f7b0d5402f2d9c0213fe0 Mon Sep 17 00:00:00 2001
From: Florian Hahn 
Date: Thu, 7 Mar 2024 13:53:02 +
Subject: [PATCH 1/2] [LAA] Add test case for #82665.

Test case for https://github.com/llvm/llvm-project/issues/82665.

(cherry picked from commit 4cfd4a7896b5fd50274ec8573c259d7ad41741de)
---
 .../underlying-object-loop-varying-phi.ll | 175 ++
 1 file changed, 175 insertions(+)
 create mode 100644 llvm/test/Analysis/LoopAccessAnalysis/underlying-object-loop-varying-phi.ll

diff --git a/llvm/test/Analysis/LoopAccessAnalysis/underlying-object-loop-varying-phi.ll b/llvm/test/Analysis/LoopAccessAnalysis/underlying-object-loop-varying-phi.ll
new file mode 100644
index 00..1a5a6ac08d4045
--- /dev/null
+++ b/llvm/test/Analysis/LoopAccessAnalysis/underlying-object-loop-varying-phi.ll
@@ -0,0 +1,175 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -passes='print' -disable-output %s 2>&1 | FileCheck %s
+
+target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
+
+; Test case for https://github.com/llvm/llvm-project/issues/82665.
+define void @indirect_ptr_recurrences_read_write(ptr %A, ptr %B) {
+; CHECK-LABEL: 'indirect_ptr_recurrences_read_write'
+; CHECK-NEXT:loop:
+; CHECK-NEXT:  Memory dependences are safe
+; CHECK-NEXT:  Dependences:
+; CHECK-NEXT:  Run-time memory checks:
+; CHECK-NEXT:  Grouped accesses:
+; CHECK-EMPTY:
+; CHECK-NEXT:  Non vectorizable stores to invariant address were not found in loop.
+; CHECK-NEXT:  SCEV assumptions:
+; CHECK-EMPTY:
+; CHECK-NEXT:  Expressions re-written:
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi i64 [ 1, %entry ], [ %iv.next, %loop ]
+  %ptr.recur = phi ptr [ %A, %entry ], [ %ptr.next, %loop ]
+  %gep.B = getelementptr inbounds ptr, ptr %B, i64 %iv
+  %ptr.next = load ptr, ptr %gep.B, align 8, !tbaa !6
+  %l = load i32, ptr %ptr.recur, align 4, !tbaa !10
+  %xor = xor i32 %l, 1
+  store i32 %xor, ptr %ptr.recur, align 4, !tbaa !10
+  %iv.next = add nuw nsw i64 %iv, 1
+  %ec = icmp eq i64 %iv.next, 5
+  br i1 %ec, label %exit, label %loop
+
+exit:
+  ret void
+}
+
+define i32 @indirect_ptr_recurrences_read_only_loop(ptr %A, ptr %B) {
+; CHECK-LABEL: 'indirect_ptr_recurrences_read_only_loop'
+; CHECK-NEXT:loop:
+; CHECK-NEXT:  Memory dependences are safe
+; CHECK-NEXT:  Dependences:
+; CHECK-NEXT:  Run-time memory checks:
+; CHECK-NEXT:  Grouped accesses:
+; CHECK-EMPTY:
+; CHECK-NEXT:  Non vectorizable stores to invariant address were not found in loop.
+; CHECK-NEXT:  SCEV assumptions:
+; CHECK-EMPTY:
+; CHECK-NEXT:  Expressions re-written:
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi i64 [ 1, %entry ], [ %iv.next, %loop ]
+  %ptr.recur = phi ptr [ %A, %entry ], [ %ptr.next, %loop ]
+  %red = phi i32 [ 0, %entry ], [ %xor, %loop ]
+  %gep.B = getelementptr inbounds ptr, ptr %B, i64 %iv
+  %ptr.next = load ptr, ptr %gep.B, align 8, !tbaa !6
+  %l = load i32, ptr %ptr.recur, align 4, !tbaa !10
+  %xor = xor i32 %l, 1
+  %iv.next = add nuw nsw i64 %iv, 1
+  %ec = icmp eq i64 %iv.next, 5
+  br i1 %ec, label %exit, label %loop
+
+exit:
+  ret i32 %xor
+}
+
+define void @indirect_ptr_recurrences_read_write_may_alias_no_tbaa(ptr %A, ptr %B) {
+; CHECK-LABEL: 'indirect_ptr_recurrences_read_write_may_alias_no_tbaa'
+; CHECK-NEXT:loop:
+; CHECK-NEXT:  Report: cannot identify array bounds
+; CHECK-NEXT:  Dependences:
+; CHECK-NEXT:  Run-time memory checks:
+; CHECK-NEXT:  Grouped accesses:
+; CHECK-EMPTY:
+; CHECK-NEXT:  Non vectorizable stores to invariant address were not found in loop.
+; CHECK-NEXT:  SCEV assumptions:
+; CHECK-EMPTY:
+; CHECK-NEXT:  Expressions re-written:
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi i64 [ 1, %entry ], [ %iv.next, %loop ]
+  %ptr.recur = phi ptr [ %A, %entry ], [ %ptr.next, %loop ]
+  %gep.B = getelementptr inbounds ptr, ptr %B, i64 %iv
+  %ptr.next = load ptr, ptr %gep.B, align 8, !tbaa !6
+  %l = load i32, ptr %ptr.recur, align 4
+  %xor = xor i32 %l, 1
+  store i32 %xor, ptr %ptr.recur, align 4
+  %iv.next = add nuw nsw i64 %iv, 1
+  %ec = icmp eq i64 %iv.next, 5
+  br i1 %ec, label %exit, label %loop
+
+exit:
+  ret void
+}
+
+define void @indirect_ptr_recurrences_read_write_may_alias_different_obj(ptr %A, ptr %B, ptr %C) {
+; CHECK-LABEL: 'indirect_ptr_recurrences_read_write_may_alias_different_obj'
+; CHECK-NEXT:loop:
+; CHECK-NEXT:  Report: cannot identify array bounds
+; CHECK-NEXT:  Dependences:
+; CHECK-NEXT:  Run-time memory checks:
+; CHECK-NEXT:  Grouped accesses:
+; CHECK-EMPTY:
+; CHECK-NEXT:  Non vectorizable stores to invariant address were not found 
in

[llvm-branch-commits] [llvm] release/18.x: [ValueTracking] Treat phi as underlying obj when not decomposing further (#84339) (PR #84950)

2024-03-12 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/84950


[llvm-branch-commits] [llvm] release/18.x: [ValueTracking] Treat phi as underlying obj when not decomposing further (#84339) (PR #84950)

2024-03-12 Thread via llvm-branch-commits

llvmbot wrote:

@nikic What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/84950


[llvm-branch-commits] [lld] [llvm][lld][RISCV] Support x3_reg_usage (PR #84598)

2024-03-12 Thread Fangrui Song via llvm-branch-commits


@@ -50,6 +54,13 @@ Error RISCVAttributeParser::atomicAbi(unsigned Tag) {
   return Error::success();
 }
 
+Error RISCVAttributeParser::x3RegUsage(unsigned Tag) {
+  uint64_t Value = de.getULEB128(cursor);
+  std::string Description = "X3 reg usage is " + utostr(Value);

MaskRay wrote:

`printAttribute(..., "..." + Twine(Value).str())`

https://github.com/llvm/llvm-project/pull/84598


[llvm-branch-commits] [llvm] release/18.x: [ValueTracking] Treat phi as underlying obj when not decomposing further (#84339) (PR #84950)

2024-03-12 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-analysis

Author: None (llvmbot)


Changes

Backport 4cfd4a7896b5fd50274ec8573c259d7ad41741de 
b274b23665dec30f3ae4fb83ccca8b77e6d3ada3

Requested by: @nikic

---
Full diff: https://github.com/llvm/llvm-project/pull/84950.diff


2 Files Affected:

- (modified) llvm/lib/Analysis/ValueTracking.cpp (+2) 
- (added) llvm/test/Analysis/LoopAccessAnalysis/underlying-object-loop-varying-phi.ll (+180)


``diff
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 412115eb649c2f..9f9451e4e814ac 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -5986,6 +5986,8 @@ void llvm::getUnderlyingObjects(const Value *V,
   if (!LI || !LI->isLoopHeader(PN->getParent()) ||
   isSameUnderlyingObjectInLoop(PN, LI))
 append_range(Worklist, PN->incoming_values());
+  else
+Objects.push_back(P);
   continue;
 }
 
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/underlying-object-loop-varying-phi.ll b/llvm/test/Analysis/LoopAccessAnalysis/underlying-object-loop-varying-phi.ll
new file mode 100644
index 00..106dc8c13a49fa
--- /dev/null
+++ b/llvm/test/Analysis/LoopAccessAnalysis/underlying-object-loop-varying-phi.ll
@@ -0,0 +1,180 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py UTC_ARGS: --version 4
+; RUN: opt -passes='print' -disable-output %s 2>&1 | FileCheck %s
+
+target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
+
+; Test case for https://github.com/llvm/llvm-project/issues/82665.
+define void @indirect_ptr_recurrences_read_write(ptr %A, ptr %B) {
+; CHECK-LABEL: 'indirect_ptr_recurrences_read_write'
+; CHECK-NEXT:loop:
+; CHECK-NEXT:  Report: unsafe dependent memory operations in loop. Use #pragma clang loop distribute(enable) to allow loop distribution to attempt to isolate the offending operations into a separate loop
+; CHECK-NEXT:  Unsafe indirect dependence.
+; CHECK-NEXT:  Dependences:
+; CHECK-NEXT:IndidrectUnsafe:
+; CHECK-NEXT:%l = load i32, ptr %ptr.recur, align 4, !tbaa !4 ->
+; CHECK-NEXT:store i32 %xor, ptr %ptr.recur, align 4, !tbaa !4
+; CHECK-EMPTY:
+; CHECK-NEXT:  Run-time memory checks:
+; CHECK-NEXT:  Grouped accesses:
+; CHECK-EMPTY:
+; CHECK-NEXT:  Non vectorizable stores to invariant address were not found in loop.
+; CHECK-NEXT:  SCEV assumptions:
+; CHECK-EMPTY:
+; CHECK-NEXT:  Expressions re-written:
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi i64 [ 1, %entry ], [ %iv.next, %loop ]
+  %ptr.recur = phi ptr [ %A, %entry ], [ %ptr.next, %loop ]
+  %gep.B = getelementptr inbounds ptr, ptr %B, i64 %iv
+  %ptr.next = load ptr, ptr %gep.B, align 8, !tbaa !6
+  %l = load i32, ptr %ptr.recur, align 4, !tbaa !10
+  %xor = xor i32 %l, 1
+  store i32 %xor, ptr %ptr.recur, align 4, !tbaa !10
+  %iv.next = add nuw nsw i64 %iv, 1
+  %ec = icmp eq i64 %iv.next, 5
+  br i1 %ec, label %exit, label %loop
+
+exit:
+  ret void
+}
+
+define i32 @indirect_ptr_recurrences_read_only_loop(ptr %A, ptr %B) {
+; CHECK-LABEL: 'indirect_ptr_recurrences_read_only_loop'
+; CHECK-NEXT:loop:
+; CHECK-NEXT:  Memory dependences are safe
+; CHECK-NEXT:  Dependences:
+; CHECK-NEXT:  Run-time memory checks:
+; CHECK-NEXT:  Grouped accesses:
+; CHECK-EMPTY:
+; CHECK-NEXT:  Non vectorizable stores to invariant address were not found in loop.
+; CHECK-NEXT:  SCEV assumptions:
+; CHECK-EMPTY:
+; CHECK-NEXT:  Expressions re-written:
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi i64 [ 1, %entry ], [ %iv.next, %loop ]
+  %ptr.recur = phi ptr [ %A, %entry ], [ %ptr.next, %loop ]
+  %red = phi i32 [ 0, %entry ], [ %xor, %loop ]
+  %gep.B = getelementptr inbounds ptr, ptr %B, i64 %iv
+  %ptr.next = load ptr, ptr %gep.B, align 8, !tbaa !6
+  %l = load i32, ptr %ptr.recur, align 4, !tbaa !10
+  %xor = xor i32 %l, 1
+  %iv.next = add nuw nsw i64 %iv, 1
+  %ec = icmp eq i64 %iv.next, 5
+  br i1 %ec, label %exit, label %loop
+
+exit:
+  ret i32 %xor
+}
+
+define void @indirect_ptr_recurrences_read_write_may_alias_no_tbaa(ptr %A, ptr %B) {
+; CHECK-LABEL: 'indirect_ptr_recurrences_read_write_may_alias_no_tbaa'
+; CHECK-NEXT:loop:
+; CHECK-NEXT:  Report: cannot identify array bounds
+; CHECK-NEXT:  Dependences:
+; CHECK-NEXT:  Run-time memory checks:
+; CHECK-NEXT:  Grouped accesses:
+; CHECK-EMPTY:
+; CHECK-NEXT:  Non vectorizable stores to invariant address were not found in loop.
+; CHECK-NEXT:  SCEV assumptions:
+; CHECK-EMPTY:
+; CHECK-NEXT:  Expressions re-written:
+;
+entry:
+  br label %loop
+
+loop:
+  %iv = phi i64 [ 1, %entry ], [ %iv.next, %loop ]
+  %ptr.recur = phi ptr [ %A, %entry ], [ %ptr.next, %loop ]
+  %gep.B = getelementptr inbounds ptr, ptr %B, i64 %iv
+  %ptr.next = load ptr, ptr %gep.B, align 8, !tbaa !6
+  %l = load i32, ptr %ptr.recur, align 4
+  %xor = 

[llvm-branch-commits] [lld] [llvm][lld][RISCV] Support x3_reg_usage (PR #84598)

2024-03-12 Thread Fangrui Song via llvm-branch-commits

https://github.com/MaskRay edited 
https://github.com/llvm/llvm-project/pull/84598


[llvm-branch-commits] [lld] [llvm][lld][RISCV] Support x3_reg_usage (PR #84598)

2024-03-12 Thread Fangrui Song via llvm-branch-commits


@@ -1135,11 +1135,34 @@ static void mergeAtomic(DenseMap::iterator it,
   };
 }
 
+static void mergeX3RegUse(DenseMap::iterator it,
+  const InputSectionBase *oldSection,
+  const InputSectionBase *newSection,
+  unsigned int oldTag, unsigned int newTag) {
+  // X3/GP register usages are incompatible and cannot be merged, with the
+  // exception of the UNKNOWN (0) value.
+  using RISCVAttrs::RISCVX3RegUse::X3RegUsage;
+  if (newTag == X3RegUsage::UNKNOWN)
+return;
+  if (oldTag == X3RegUsage::UNKNOWN) {
+it->getSecond() = newTag;
+return;
+  }
+  if (oldTag != newTag) {
+errorOrWarn(toString(oldSection) + " has x3_reg_usage=" + Twine(oldTag) +

MaskRay wrote:

We seem to use errors for parsing errors. This is probably a warning.
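
The merge rule in the hunk above can be sketched as a standalone function that reports a conflict and leaves the choice of severity (warning or error) to the caller. This is only an illustration, not lld code: `MergeResult` and `mergeX3RegUsage` are hypothetical names, and the enum values are taken from the test inputs in this PR (0 = unknown, 1 = gp, 2 = scs, 3 = tmp).

```cpp
#include <string>

// Values as used by the `.attribute x3_reg_usage, N` test inputs.
enum class X3RegUsage : unsigned { Unknown = 0, GP = 1, SCS = 2, Tmp = 3 };

// Hypothetical result type: the merged value plus an optional diagnostic.
struct MergeResult {
  X3RegUsage merged;
  bool conflict;
  std::string diagnostic; // empty when there is no conflict
};

// Unknown is compatible with everything; any other mismatch is a conflict.
MergeResult mergeX3RegUsage(X3RegUsage oldTag, X3RegUsage newTag) {
  if (newTag == X3RegUsage::Unknown)
    return {oldTag, false, ""};
  if (oldTag == X3RegUsage::Unknown)
    return {newTag, false, ""};
  if (oldTag != newTag)
    return {oldTag, true,
            "x3_reg_usage=" + std::to_string(static_cast<unsigned>(oldTag)) +
                " is incompatible with x3_reg_usage=" +
                std::to_string(static_cast<unsigned>(newTag))};
  return {oldTag, false, ""};
}
```

Returning the conflict as data rather than emitting it directly keeps the merge logic independent of whether the linker ultimately treats the mismatch as a warning or an error.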

https://github.com/llvm/llvm-project/pull/84598


[llvm-branch-commits] [lld] [llvm][lld][RISCV] Support x3_reg_usage (PR #84598)

2024-03-12 Thread Fangrui Song via llvm-branch-commits


@@ -412,6 +429,36 @@
 # A6S_A7:   }
 # A6S_A7: }
 
+#--- x3_reg_usage_unknown.s
+.attribute x3_reg_usage, 0
+
+#--- x3_reg_usage_gp.s
+.attribute x3_reg_usage, 1
+
+#--- x3_reg_usage_scs.s
+.attribute x3_reg_usage, 2
+
+#--- x3_reg_usage_tmp.s
+.attribute x3_reg_usage, 3
+
+# X3_REG_SCS_UKNOWN: BuildAttributes {
+# X3_REG_SCS_UKNOWN:   FormatVersion: 0x41

MaskRay wrote:

Add `-NEXT:` whenever appropriate

https://github.com/llvm/llvm-project/pull/84598


[llvm-branch-commits] [llvm-objcopy] Simplify --[de]compress-debug-sections and don't compress SHF_ALLOC sections (PR #84885)

2024-03-12 Thread Fangrui Song via llvm-branch-commits


@@ -214,33 +214,34 @@ static Error dumpSectionToFile(StringRef SecName, StringRef Filename,
SecName.str().c_str());
 }
 
-static bool isCompressable(const SectionBase &Sec) {
-  return !(Sec.Flags & ELF::SHF_COMPRESSED) &&
- StringRef(Sec.Name).starts_with(".debug");
-}
-
-static Error replaceDebugSections(
-Object &Obj, function_ref ShouldReplace,
-function_ref(const SectionBase *)> AddSection) {
+Error Object::compressOrDecompressSections(const CommonConfig &Config) {
   // Build a list of the debug sections we are going to replace.
   // We can't call `AddSection` while iterating over sections,
   // because it would mutate the sections array.
-  SmallVector ToReplace;
-  for (auto &Sec : Obj.sections())
-if (ShouldReplace(Sec))
-  ToReplace.push_back(&Sec);
-
-  // Build a mapping from original section to a new one.
-  DenseMap FromTo;
-  for (SectionBase *S : ToReplace) {
-Expected NewSection = AddSection(S);
-if (!NewSection)
-  return NewSection.takeError();
-
-FromTo[S] = *NewSection;
+  SmallVector>, 0>

MaskRay wrote:

The default is a smart value that keeps `sizeof(SmallVector)` around 64 bytes. 
Someone might send an RFC to change the default to 0 and provide another 
mechanism to opt in to the smart default. I believe `, 0` is used much more 
often than the smart default: in many cases people don't care about extra heap 
allocations at runtime, but I do care about code size and the smaller 
`SmallVector` structure.
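
The size trade-off can be seen with a toy container. This is a rough sketch, not `llvm::SmallVector`: `InlineVec` and its members are hypothetical, and it only models the object layout (a header plus an optional inline buffer), not the container's behaviour.

```cpp
#include <cstddef>

// Toy layout: a three-word header followed by room for N inline elements.
template <typename T, std::size_t N> struct InlineVec {
  T *data = nullptr;
  std::size_t size = 0;
  std::size_t capacity = N;
  alignas(T) unsigned char inlineStorage[N * sizeof(T)];
};

// With N == 0 there is no inline buffer: the object is just the header and
// every element would have to live on the heap.
template <typename T> struct InlineVec<T, 0> {
  T *data = nullptr;
  std::size_t size = 0;
  std::size_t capacity = 0;
};

constexpr std::size_t headerOnlySize = sizeof(InlineVec<void *, 0>);
constexpr std::size_t withEightInline = sizeof(InlineVec<void *, 8>);

// The inline buffer is paid for in every instance, used or not.
static_assert(withEightInline >= headerOnlySize + 8 * sizeof(void *),
              "inline elements enlarge the object itself");
```

An inline capacity of zero trades a possible heap allocation for a smaller object, which is the preference expressed in the comment above.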

https://github.com/llvm/llvm-project/pull/84885


[llvm-branch-commits] [flang] [flang] support fir.alloca operations inside of omp reduction ops (PR #84952)

2024-03-12 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah created https://github.com/llvm/llvm-project/pull/84952

Advise to place the alloca at the start of the first block of whichever region 
(init or combiner) we are currently inside.

It probably isn't safe to put an alloca inside of a combiner region because 
this will be executed multiple times. But that would be a bug to fix in 
Lower/OpenMP.cpp, not here.

OpenMP array reductions 1/6

>From 63c7fe3c522047b4df964a137ffdf0bbc38acec8 Mon Sep 17 00:00:00 2001
From: Tom Eccles 
Date: Wed, 14 Feb 2024 15:22:02 +
Subject: [PATCH] [flang] support fir.alloca operations inside of omp reduction
 ops

Advise to place the alloca at the start of the first block of whichever
region (init or combiner) we are currently inside.

It probably isn't safe to put an alloca inside of a combiner region
because this will be executed multiple times. But that would be a bug to
fix in Lower/OpenMP.cpp, not here.
---
 flang/lib/Optimizer/Builder/FIRBuilder.cpp |  2 ++
 flang/lib/Optimizer/CodeGen/CodeGen.cpp| 11 +--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/flang/lib/Optimizer/Builder/FIRBuilder.cpp b/flang/lib/Optimizer/Builder/FIRBuilder.cpp
index 12da7412888a3b..f7327a299d9a5e 100644
--- a/flang/lib/Optimizer/Builder/FIRBuilder.cpp
+++ b/flang/lib/Optimizer/Builder/FIRBuilder.cpp
@@ -208,6 +208,8 @@ mlir::Block *fir::FirOpBuilder::getAllocaBlock() {
   .getParentOfType()) {
 return ompOutlineableIface.getAllocaBlock();
   }
+  if (mlir::isa(getRegion().getParentOp()))
+return &getRegion().front();
   if (auto accRecipeIface =
   getRegion().getParentOfType()) {
 return accRecipeIface.getAllocaBlock(getRegion());
diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp
index f81a08388da722..123eb6e4e6a255 100644
--- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp
+++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp
@@ -410,8 +410,15 @@ class FIROpConversion : public mlir::ConvertOpToLLVMPattern {
   mlir::ConversionPatternRewriter &rewriter) const {
 auto thisPt = rewriter.saveInsertionPoint();
 mlir::Operation *parentOp = rewriter.getInsertionBlock()->getParentOp();
-mlir::Block *insertBlock = getBlockForAllocaInsert(parentOp);
-rewriter.setInsertionPointToStart(insertBlock);
+if (mlir::isa(parentOp)) {
+  // ReductionDeclareOp has multiple child regions. We want to get the first
+  // block of whichever of those regions we are currently in
+  mlir::Region *parentRegion = rewriter.getInsertionBlock()->getParent();
+  rewriter.setInsertionPointToStart(&parentRegion->front());
+} else {
+  mlir::Block *insertBlock = getBlockForAllocaInsert(parentOp);
+  rewriter.setInsertionPointToStart(insertBlock);
+}
 auto size = genI32Constant(loc, rewriter, 1);
 unsigned allocaAs = getAllocaAddressSpace(rewriter);
 unsigned programAs = getProgramAddressSpace(rewriter);



[llvm-branch-commits] [flang] [flang] support fir.alloca operations inside of omp reduction ops (PR #84952)

2024-03-12 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-flang-codegen

Author: Tom Eccles (tblah)


Changes

Advise to place the alloca at the start of the first block of whichever region 
(init or combiner) we are currently inside.

It probably isn't safe to put an alloca inside of a combiner region because 
this will be executed multiple times. But that would be a bug to fix in 
Lower/OpenMP.cpp, not here.

OpenMP array reductions 1/6

---
Full diff: https://github.com/llvm/llvm-project/pull/84952.diff


2 Files Affected:

- (modified) flang/lib/Optimizer/Builder/FIRBuilder.cpp (+2) 
- (modified) flang/lib/Optimizer/CodeGen/CodeGen.cpp (+9-2) 


```diff
diff --git a/flang/lib/Optimizer/Builder/FIRBuilder.cpp b/flang/lib/Optimizer/Builder/FIRBuilder.cpp
index 12da7412888a3b..f7327a299d9a5e 100644
--- a/flang/lib/Optimizer/Builder/FIRBuilder.cpp
+++ b/flang/lib/Optimizer/Builder/FIRBuilder.cpp
@@ -208,6 +208,8 @@ mlir::Block *fir::FirOpBuilder::getAllocaBlock() {
   .getParentOfType()) {
 return ompOutlineableIface.getAllocaBlock();
   }
+  if (mlir::isa(getRegion().getParentOp()))
+return &getRegion().front();
   if (auto accRecipeIface =
   getRegion().getParentOfType()) {
 return accRecipeIface.getAllocaBlock(getRegion());
diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp
index f81a08388da722..123eb6e4e6a255 100644
--- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp
+++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp
@@ -410,8 +410,15 @@ class FIROpConversion : public mlir::ConvertOpToLLVMPattern {
   mlir::ConversionPatternRewriter &rewriter) const {
 auto thisPt = rewriter.saveInsertionPoint();
 mlir::Operation *parentOp = rewriter.getInsertionBlock()->getParentOp();
-mlir::Block *insertBlock = getBlockForAllocaInsert(parentOp);
-rewriter.setInsertionPointToStart(insertBlock);
+if (mlir::isa(parentOp)) {
+  // ReductionDeclareOp has multiple child regions. We want to get the first
+  // block of whichever of those regions we are currently in
+  mlir::Region *parentRegion = rewriter.getInsertionBlock()->getParent();
+  rewriter.setInsertionPointToStart(&parentRegion->front());
+} else {
+  mlir::Block *insertBlock = getBlockForAllocaInsert(parentOp);
+  rewriter.setInsertionPointToStart(insertBlock);
+}
 auto size = genI32Constant(loc, rewriter, 1);
 unsigned allocaAs = getAllocaAddressSpace(rewriter);
 unsigned programAs = getProgramAddressSpace(rewriter);

```




https://github.com/llvm/llvm-project/pull/84952


[llvm-branch-commits] [flang] [flang] run CFG conversion on omp reduction declare ops (PR #84953)

2024-03-12 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah created https://github.com/llvm/llvm-project/pull/84953

Most FIR passes only look for FIR operations inside of functions (either 
because they run only on func.func or they run on the module but iterate over 
functions internally). But there can also be FIR operations inside of 
fir.global, some OpenMP and OpenACC container operations.

This has worked so far for fir.global and OpenMP reductions because they only 
contained very simple FIR code which doesn't need most passes to be lowered 
into LLVM IR. I am not sure how OpenACC works.

In the long run, I hope to see a more systematic approach to making sure that 
every pass runs on all of these container operations. I will write an RFC for 
this soon.

In the meantime, this patch duplicates the CFG conversion pass so that it also 
runs on omp reduction operations. This is similar to how the AbstractResult 
pass is already duplicated for fir.global operations.
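
To make the "duplicate the pass per container operation" idea concrete, here is a small standalone sketch. It is not the flang implementation: `FuncOp`, `ReductionDeclareOp`, `convertToCfg`, and `CfgConversionSketch` are hypothetical toy types invented for this example, operations are plain strings, and the "conversion" simply renames structured control-flow ops to a placeholder. It only shows one shared rewrite being instantiated once per container kind.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical container operations; each just holds a list of op names.
struct FuncOp { std::vector<std::string> body; };
struct ReductionDeclareOp { std::vector<std::string> body; };

// Shared rewrite logic: replace structured control-flow op names with a
// placeholder branch op and report how many ops were rewritten.
inline int convertToCfg(std::vector<std::string> &ops) {
  int rewritten = 0;
  for (std::string &op : ops) {
    if (op == "fir.if" || op == "fir.do_loop" || op == "fir.iterate_while") {
      op = "cf.br"; // placeholder only; a real lowering creates new blocks
      ++rewritten;
    }
  }
  return rewritten;
}

// One pass template, instantiated once per kind of container operation.
template <typename ContainerOp>
struct CfgConversionSketch {
  int run(ContainerOp &op) const { return convertToCfg(op.body); }
};

using CfgConversionOnFunc = CfgConversionSketch<FuncOp>;
using CfgConversionOnReduction = CfgConversionSketch<ReductionDeclareOp>;
```

Both instantiations share `convertToCfg`, so the rewrite itself exists once and only the anchor operation differs.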

OpenMP array reductions 2/6

From 192da3c05fd8c0759f280e0895ffc2f09b2203e4 Mon Sep 17 00:00:00 2001
From: Tom Eccles 
Date: Thu, 15 Feb 2024 12:12:29 +
Subject: [PATCH] [flang] run CFG conversion on omp reduction declare ops

Most FIR passes only look for FIR operations inside of functions (either
because they run only on func.func or they run on the module but iterate
over functions internally). But there can also be FIR operations inside
of fir.global, some OpenMP and OpenACC container operations.

This has worked so far for fir.global and OpenMP reductions because they
only contained very simple FIR code which doesn't need most passes to be
lowered into LLVM IR. I am not sure how OpenACC works.

In the long run, I hope to see a more systematic approach to making sure
that every pass runs on all of these container operations. I will write
an RFC for this soon.

In the meantime, this pass duplicates the CFG conversion pass to also
run on omp reduction operations. This is similar to how the
AbstractResult pass is already duplicated for fir.global operations.

Co-authored-by: Mats Petersson 
---
 .../flang/Optimizer/Transforms/Passes.h   |  7 +++--
 .../flang/Optimizer/Transforms/Passes.td  | 12 ++--
 flang/include/flang/Tools/CLOptions.inc   |  4 ++-
 .../Transforms/ControlFlowConverter.cpp   | 30 ++-
 flang/test/Driver/bbc-mlir-pass-pipeline.f90  |  5 +++-
 .../test/Driver/mlir-debug-pass-pipeline.f90  | 16 +-
 flang/test/Driver/mlir-pass-pipeline.f90  | 16 ++
 flang/test/Fir/array-value-copy-2.fir |  4 +--
 flang/test/Fir/basic-program.fir  |  5 +++-
 .../Fir/convert-to-llvm-openmp-and-fir.fir|  2 +-
 flang/test/Fir/loop01.fir |  2 +-
 flang/test/Fir/loop02.fir |  4 +--
 flang/test/Lower/OpenMP/FIR/flush.f90 |  2 +-
 flang/test/Lower/OpenMP/FIR/master.f90|  2 +-
 .../Lower/OpenMP/FIR/parallel-sections.f90|  2 +-
 15 files changed, 76 insertions(+), 37 deletions(-)

diff --git a/flang/include/flang/Optimizer/Transforms/Passes.h b/flang/include/flang/Optimizer/Transforms/Passes.h
index e1d22c8c986da7..adf747ebdb400b 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.h
+++ b/flang/include/flang/Optimizer/Transforms/Passes.h
@@ -11,6 +11,7 @@
 
 #include "flang/Optimizer/Dialect/FIROps.h"
 #include "mlir/Dialect/LLVMIR/LLVMAttrs.h"
+#include "mlir/Dialect/OpenMP/OpenMPDialect.h"
 #include "mlir/Pass/Pass.h"
 #include "mlir/Pass/PassRegistry.h"
 #include 
@@ -37,7 +38,8 @@ namespace fir {
 #define GEN_PASS_DECL_ANNOTATECONSTANTOPERANDS
 #define GEN_PASS_DECL_ARRAYVALUECOPY
 #define GEN_PASS_DECL_CHARACTERCONVERSION
-#define GEN_PASS_DECL_CFGCONVERSION
+#define GEN_PASS_DECL_CFGCONVERSIONONFUNC
+#define GEN_PASS_DECL_CFGCONVERSIONONREDUCTION
 #define GEN_PASS_DECL_EXTERNALNAMECONVERSION
 #define GEN_PASS_DECL_MEMREFDATAFLOWOPT
 #define GEN_PASS_DECL_SIMPLIFYINTRINSICS
@@ -53,7 +55,8 @@ std::unique_ptr createAbstractResultOnGlobalOptPass();
 std::unique_ptr createAffineDemotionPass();
 std::unique_ptr
 createArrayValueCopyPass(fir::ArrayValueCopyOptions options = {});
-std::unique_ptr createFirToCfgPass();
+std::unique_ptr createFirToCfgOnFuncPass();
+std::unique_ptr createFirToCfgOnReductionPass();
 std::unique_ptr createCharacterConversionPass();
 std::unique_ptr createExternalNameConversionPass();
 std::unique_ptr
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index 5fb576fd876254..e6ea92d814400f 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -145,7 +145,8 @@ def CharacterConversion : Pass<"character-conversion"> {
   ];
 }
 
-def CFGConversion : Pass<"cfg-conversion", "::mlir::func::FuncOp"> {
+class CFGConversionBase
+  : Pass<"cfg-conversion-on-" # optExt # "-opt", operation> {
   let summary = "Convert FIR structured control flow ops to CFG ops.";
   let description = [{
 Trans

[llvm-branch-commits] [flang] [flang] run CFG conversion on omp reduction declare ops (PR #84953)

2024-03-12 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-flang-driver

Author: Tom Eccles (tblah)


Changes

Most FIR passes only look for FIR operations inside of functions (either 
because they run only on func.func or they run on the module but iterate over 
functions internally). But there can also be FIR operations inside of 
fir.global, some OpenMP and OpenACC container operations.

This has worked so far for fir.global and OpenMP reductions because they only 
contained very simple FIR code which doesn't need most passes to be lowered 
into LLVM IR. I am not sure how OpenACC works.

In the long run, I hope to see a more systematic approach to making sure that 
every pass runs on all of these container operations. I will write an RFC for 
this soon.

In the meantime, this patch duplicates the CFG conversion pass so that it also 
runs on omp reduction operations. This is similar to how the AbstractResult 
pass is already duplicated for fir.global operations.

OpenMP array reductions 2/6

---
Full diff: https://github.com/llvm/llvm-project/pull/84953.diff


15 Files Affected:

- (modified) flang/include/flang/Optimizer/Transforms/Passes.h (+5-2) 
- (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+10-2) 
- (modified) flang/include/flang/Tools/CLOptions.inc (+3-1) 
- (modified) flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp (+22-8) 
- (modified) flang/test/Driver/bbc-mlir-pass-pipeline.f90 (+4-1) 
- (modified) flang/test/Driver/mlir-debug-pass-pipeline.f90 (+9-7) 
- (modified) flang/test/Driver/mlir-pass-pipeline.f90 (+10-6) 
- (modified) flang/test/Fir/array-value-copy-2.fir (+2-2) 
- (modified) flang/test/Fir/basic-program.fir (+4-1) 
- (modified) flang/test/Fir/convert-to-llvm-openmp-and-fir.fir (+1-1) 
- (modified) flang/test/Fir/loop01.fir (+1-1) 
- (modified) flang/test/Fir/loop02.fir (+2-2) 
- (modified) flang/test/Lower/OpenMP/FIR/flush.f90 (+1-1) 
- (modified) flang/test/Lower/OpenMP/FIR/master.f90 (+1-1) 
- (modified) flang/test/Lower/OpenMP/FIR/parallel-sections.f90 (+1-1) 


``diff
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.h b/flang/include/flang/Optimizer/Transforms/Passes.h
index e1d22c8c986da7..adf747ebdb400b 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.h
+++ b/flang/include/flang/Optimizer/Transforms/Passes.h
@@ -11,6 +11,7 @@
 
 #include "flang/Optimizer/Dialect/FIROps.h"
 #include "mlir/Dialect/LLVMIR/LLVMAttrs.h"
+#include "mlir/Dialect/OpenMP/OpenMPDialect.h"
 #include "mlir/Pass/Pass.h"
 #include "mlir/Pass/PassRegistry.h"
 #include 
@@ -37,7 +38,8 @@ namespace fir {
 #define GEN_PASS_DECL_ANNOTATECONSTANTOPERANDS
 #define GEN_PASS_DECL_ARRAYVALUECOPY
 #define GEN_PASS_DECL_CHARACTERCONVERSION
-#define GEN_PASS_DECL_CFGCONVERSION
+#define GEN_PASS_DECL_CFGCONVERSIONONFUNC
+#define GEN_PASS_DECL_CFGCONVERSIONONREDUCTION
 #define GEN_PASS_DECL_EXTERNALNAMECONVERSION
 #define GEN_PASS_DECL_MEMREFDATAFLOWOPT
 #define GEN_PASS_DECL_SIMPLIFYINTRINSICS
@@ -53,7 +55,8 @@ std::unique_ptr createAbstractResultOnGlobalOptPass();
 std::unique_ptr createAffineDemotionPass();
 std::unique_ptr
 createArrayValueCopyPass(fir::ArrayValueCopyOptions options = {});
-std::unique_ptr createFirToCfgPass();
+std::unique_ptr createFirToCfgOnFuncPass();
+std::unique_ptr createFirToCfgOnReductionPass();
 std::unique_ptr createCharacterConversionPass();
 std::unique_ptr createExternalNameConversionPass();
 std::unique_ptr
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index 5fb576fd876254..e6ea92d814400f 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -145,7 +145,8 @@ def CharacterConversion : Pass<"character-conversion"> {
   ];
 }
 
-def CFGConversion : Pass<"cfg-conversion", "::mlir::func::FuncOp"> {
+class CFGConversionBase
+  : Pass<"cfg-conversion-on-" # optExt # "-opt", operation> {
   let summary = "Convert FIR structured control flow ops to CFG ops.";
   let description = [{
 Transform the `fir.do_loop`, `fir.if`, `fir.iterate_while` and
@@ -154,7 +155,6 @@ def CFGConversion : Pass<"cfg-conversion", "::mlir::func::FuncOp"> {
 
 This pass is required before code gen to the LLVM IR dialect.
   }];
-  let constructor = "::fir::createFirToCfgPass()";
   let dependentDialects = [
 "fir::FIROpsDialect", "mlir::func::FuncDialect"
   ];
@@ -165,6 +165,14 @@ def CFGConversion : Pass<"cfg-conversion", "::mlir::func::FuncOp"> {
   ];
 }
 
+def CFGConversionOnFunc : CFGConversionBase<"func", "mlir::func::FuncOp"> {
+  let constructor = "::fir::createFirToCfgOnFuncPass()";
+}
+
+def CFGConversionOnReduction : CFGConversionBase<"reduce", "mlir::omp::ReductionDeclareOp"> {
+  let constructor = "::fir::createFirToCfgOnReductionPass()";
+}
+
 def ExternalNameConversion : Pass<"external-name-interop", "mlir::ModuleOp"> {
   let summary = "Convert name for external interoperability";
   let description = [{
diff --git a/f

[llvm-branch-commits] [flang] [flang][CodeGen] Run PreCGRewrite on omp reduction declare ops (PR #84954)

2024-03-12 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah created https://github.com/llvm/llvm-project/pull/84954

OpenMP reduction declare operations can contain FIR code which needs to be 
lowered to LLVM. With array reductions, these regions can contain more 
complicated operations which need PreCGRewriting. A similar extra case was 
already needed for fir::GlobalOp.
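
The shape of this change can be sketched without MLIR. In the toy model below, `Module`, `Func`, `Global`, `ReductionDecl`, and `regionsToRewrite` are hypothetical names made up for illustration, and a region is just a list of strings. The sketch only shows that a module-level rewrite has to enumerate the reduction declaration's initializer and combiner regions in addition to function bodies and globals.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy region: a flat list of operation names (not a real MLIR region).
using Region = std::vector<std::string>;

struct Func { Region body; };
struct Global { Region region; };
struct ReductionDecl { Region initializer; Region combiner; };

struct Module {
  std::vector<Func> funcs;
  std::vector<Global> globals;
  std::vector<ReductionDecl> reductions;
};

// Collect every region that a module-level rewrite should visit:
// function bodies, global initializers, and both reduction regions.
inline std::vector<Region *> regionsToRewrite(Module &mod) {
  std::vector<Region *> out;
  for (Func &f : mod.funcs)
    out.push_back(&f.body);
  for (Global &g : mod.globals)
    out.push_back(&g.region);
  for (ReductionDecl &r : mod.reductions) {
    out.push_back(&r.initializer);
    out.push_back(&r.combiner);
  }
  return out;
}
```

If the loop over `reductions` were missing, any operation inside those two regions would reach code generation without having been rewritten.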

OpenMP array reductions 3/6

From f951d16cf6cb1ab221f47ca2e712020b9af0af87 Mon Sep 17 00:00:00 2001
From: Tom Eccles 
Date: Fri, 1 Mar 2024 16:59:09 +
Subject: [PATCH] [flang][CodeGen] Run PreCGRewrite on omp reduction declare
 ops

OpenMP reduction declare operations can contain FIR code which needs to
be lowered to LLVM. With array reductions, these regions can contain
more complicated operations which need PreCGRewriting. A similar extra
case was already needed for fir::GlobalOp.
---
 flang/lib/Optimizer/CodeGen/PreCGRewrite.cpp | 5 +
 1 file changed, 5 insertions(+)

diff --git a/flang/lib/Optimizer/CodeGen/PreCGRewrite.cpp b/flang/lib/Optimizer/CodeGen/PreCGRewrite.cpp
index 0170b56367cf3c..dd935e71762355 100644
--- a/flang/lib/Optimizer/CodeGen/PreCGRewrite.cpp
+++ b/flang/lib/Optimizer/CodeGen/PreCGRewrite.cpp
@@ -22,6 +22,7 @@
 #include "mlir/Transforms/RegionUtils.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/Support/Debug.h"
+#include 
 
 namespace fir {
 #define GEN_PASS_DEF_CODEGENREWRITE
@@ -319,6 +320,10 @@ class CodeGenRewrite : public fir::impl::CodeGenRewriteBase {
   runOn(func, func.getBody());
 for (auto global : mod.getOps())
   runOn(global, global.getRegion());
+for (auto omp : mod.getOps()) {
+  runOn(omp, omp.getInitializerRegion());
+  runOn(omp, omp.getReductionRegion());
+}
   }
 };
 



[llvm-branch-commits] [flang] [flang][CodeGen] Run PreCGRewrite on omp reduction declare ops (PR #84954)

2024-03-12 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-flang-codegen

Author: Tom Eccles (tblah)


Changes

OpenMP reduction declare operations can contain FIR code which needs to be 
lowered to LLVM. With array reductions, these regions can contain more 
complicated operations which need PreCGRewriting. A similar extra case was 
already needed for fir::GlobalOp.

OpenMP array reductions 3/6

---
Full diff: https://github.com/llvm/llvm-project/pull/84954.diff


1 File Affected:

- (modified) flang/lib/Optimizer/CodeGen/PreCGRewrite.cpp (+5) 


```diff
diff --git a/flang/lib/Optimizer/CodeGen/PreCGRewrite.cpp b/flang/lib/Optimizer/CodeGen/PreCGRewrite.cpp
index 0170b56367cf3c..dd935e71762355 100644
--- a/flang/lib/Optimizer/CodeGen/PreCGRewrite.cpp
+++ b/flang/lib/Optimizer/CodeGen/PreCGRewrite.cpp
@@ -22,6 +22,7 @@
 #include "mlir/Transforms/RegionUtils.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/Support/Debug.h"
+#include 
 
 namespace fir {
 #define GEN_PASS_DEF_CODEGENREWRITE
@@ -319,6 +320,10 @@ class CodeGenRewrite : public fir::impl::CodeGenRewriteBase {
   runOn(func, func.getBody());
 for (auto global : mod.getOps())
   runOn(global, global.getRegion());
+for (auto omp : mod.getOps()) {
+  runOn(omp, omp.getInitializerRegion());
+  runOn(omp, omp.getReductionRegion());
+}
   }
 };
 

```




https://github.com/llvm/llvm-project/pull/84954


[llvm-branch-commits] [mlir] [mlir][LLVM] erase call mappings in forgetMapping() (PR #84955)

2024-03-12 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah created https://github.com/llvm/llvm-project/pull/84955

It looks like the mappings for call instructions were forgotten here. This 
fixes a bug in OpenMP when inlining a region containing call operations 
multiple times.
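
A minimal standalone sketch of why a stale entry matters, assuming a translation state that keeps one lookup table per kind of operation. The `Op` and `TranslationState` types below are hypothetical and much simpler than the real `ModuleTranslation`; the integer values stand in for translated instructions.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Toy operation: only its kind matters in this sketch.
struct Op { std::string kind; };

// Hypothetical translation state with one table per operation kind.
struct TranslationState {
  std::unordered_map<const Op *, int> branchMapping;
  std::unordered_map<const Op *, int> callMapping;

  // Drop every entry that belongs to the given region so that the region
  // can be translated again from a clean state.
  void forgetMapping(const std::vector<Op> &regionOps) {
    for (const Op &op : regionOps) {
      if (op.kind == "branch")
        branchMapping.erase(&op);
      if (op.kind == "call")
        callMapping.erase(&op); // without this, the old entry would linger
    }
  }
};
```

After `forgetMapping`, both tables are empty for the region's operations, so translating the same region a second time can record fresh entries instead of colliding with leftovers from the first translation.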

OpenMP array reductions 4/6

From c62b31262bc619145866a304e10925a35462f5bf Mon Sep 17 00:00:00 2001
From: Tom Eccles 
Date: Wed, 21 Feb 2024 14:22:39 +
Subject: [PATCH] [mlir][LLVM] erase call mappings in forgetMapping()

It looks like the mappings for call instructions were forgotten here.
This fixes a bug in OpenMP when inlining a region containing call
operations multiple times.
---
 mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
index c00628a420a000..995544238e4a3c 100644
--- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
@@ -716,6 +716,8 @@ void ModuleTranslation::forgetMapping(Region &region) {
   branchMapping.erase(&op);
 if (isa(op))
   globalsMapping.erase(&op);
+if (isa(op))
+  callMapping.erase(&op);
 llvm::append_range(
 toProcess,
 llvm::map_range(op.getRegions(), [](Region &r) { return &r; }));



[llvm-branch-commits] [mlir] [mlir][LLVM] erase call mappings in forgetMapping() (PR #84955)

2024-03-12 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-mlir

Author: Tom Eccles (tblah)


Changes

It looks like the mappings for call instructions were forgotten here. This 
fixes a bug in OpenMP when inlining a region containing call operations 
multiple times.

OpenMP array reductions 4/6

---
Full diff: https://github.com/llvm/llvm-project/pull/84955.diff


1 File Affected:

- (modified) mlir/lib/Target/LLVMIR/ModuleTranslation.cpp (+2) 


```diff
diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
index c00628a420a000..995544238e4a3c 100644
--- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
@@ -716,6 +716,8 @@ void ModuleTranslation::forgetMapping(Region &region) {
   branchMapping.erase(&op);
 if (isa(op))
   globalsMapping.erase(&op);
+if (isa(op))
+  callMapping.erase(&op);
 llvm::append_range(
 toProcess,
 llvm::map_range(op.getRegions(), [](Region &r) { return &r; }));

```




https://github.com/llvm/llvm-project/pull/84955


[llvm-branch-commits] [flang] [flang][NFC] move extractSequenceType helper out of OpenACC to share code (PR #84957)

2024-03-12 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah created https://github.com/llvm/llvm-project/pull/84957

Move extractSequenceType to FIRType.h so that it can also be used from 
OpenMP.
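
For readers unfamiliar with the helper, the recursion it performs can be sketched on a toy type model. Everything below is hypothetical and invented for this illustration: `Kind`, `Type`, and `extractSequenceTypeSketch` are not FIR types or functions, and the wrapper kinds are inferred from the variable names in the patch (box, heap, pointer). The idea is to peel wrapper types until a sequence type is found, and to return a null result otherwise.

```cpp
#include <cassert>
#include <memory>

// Toy type model; the real helper works on MLIR types, not on this struct.
enum class Kind { Sequence, Box, Heap, Pointer, Scalar };

struct Type {
  Kind kind;
  std::shared_ptr<Type> element; // wrapped or element type, may be null
};

// Peel box/heap/pointer wrappers until a sequence type is reached.
// Returns nullptr when no sequence type is nested inside `ty`.
inline const Type *extractSequenceTypeSketch(const Type *ty) {
  if (!ty)
    return nullptr;
  switch (ty->kind) {
  case Kind::Sequence:
    return ty;
  case Kind::Box:
  case Kind::Heap:
  case Kind::Pointer:
    return extractSequenceTypeSketch(ty->element.get());
  case Kind::Scalar:
    return nullptr;
  }
  return nullptr;
}
```

Callers are expected to check for the null result, which mirrors the `if (!innerTy)` checks visible at the call sites in the patch.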

OpenMP array reductions 5/6

From 2ff12fa0a580cb060f208d173d9af72bfa49d3b2 Mon Sep 17 00:00:00 2001
From: Tom Eccles 
Date: Fri, 1 Mar 2024 15:47:24 +
Subject: [PATCH] [flang][NFC] move extractSequenceType helper out of OpenACC
 to share code

Moving extractSequenceType to FIRType.h so that this can also be used
from OpenMP.
---
 .../include/flang/Optimizer/Dialect/FIRType.h |  3 +++
 flang/lib/Lower/OpenACC.cpp   | 21 ---
 flang/lib/Optimizer/Dialect/FIRType.cpp   | 12 +++
 3 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/flang/include/flang/Optimizer/Dialect/FIRType.h b/flang/include/flang/Optimizer/Dialect/FIRType.h
index a526b4ddf3b98c..7fcd9c1babf24f 100644
--- a/flang/include/flang/Optimizer/Dialect/FIRType.h
+++ b/flang/include/flang/Optimizer/Dialect/FIRType.h
@@ -237,6 +237,9 @@ inline mlir::Type unwrapSequenceType(mlir::Type t) {
   return t;
 }
 
+/// Return the nested sequence type if any.
+mlir::Type extractSequenceType(mlir::Type ty);
+
 inline mlir::Type unwrapRefType(mlir::Type t) {
   if (auto eleTy = dyn_cast_ptrEleTy(t))
 return eleTy;
diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index d2c6006ecf914a..6539de4d88304c 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -406,19 +406,6 @@ fir::ShapeOp genShapeOp(mlir::OpBuilder &builder, fir::SequenceType seqTy,
   return builder.create(loc, extents);
 }
 
-/// Return the nested sequence type if any.
-static mlir::Type extractSequenceType(mlir::Type ty) {
-  if (mlir::isa(ty))
-return ty;
-  if (auto boxTy = mlir::dyn_cast(ty))
-return extractSequenceType(boxTy.getEleTy());
-  if (auto heapTy = mlir::dyn_cast(ty))
-return extractSequenceType(heapTy.getEleTy());
-  if (auto ptrTy = mlir::dyn_cast(ty))
-return extractSequenceType(ptrTy.getEleTy());
-  return mlir::Type{};
-}
-
 template 
 static void genPrivateLikeInitRegion(mlir::OpBuilder &builder, RecipeOp recipe,
  mlir::Type ty, mlir::Location loc) {
@@ -454,7 +441,7 @@ static void genPrivateLikeInitRegion(mlir::OpBuilder &builder, RecipeOp recipe,
   }
 }
   } else if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
-mlir::Type innerTy = extractSequenceType(boxTy);
+mlir::Type innerTy = fir::extractSequenceType(boxTy);
 if (!innerTy)
   TODO(loc, "Unsupported boxed type in OpenACC privatization");
 fir::FirOpBuilder firBuilder{builder, recipe.getOperation()};
@@ -688,7 +675,7 @@ mlir::acc::FirstprivateRecipeOp Fortran::lower::createOrGetFirstprivateRecipe(
   } else if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
 fir::FirOpBuilder firBuilder{builder, recipe.getOperation()};
 llvm::SmallVector tripletArgs;
-mlir::Type innerTy = extractSequenceType(boxTy);
+mlir::Type innerTy = fir::extractSequenceType(boxTy);
 fir::SequenceType seqTy =
 mlir::dyn_cast_or_null(innerTy);
 if (!seqTy)
@@ -1018,7 +1005,7 @@ static mlir::Value genReductionInitRegion(fir::FirOpBuilder &builder,
   return declareOp.getBase();
 }
   } else if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
-mlir::Type innerTy = extractSequenceType(boxTy);
+mlir::Type innerTy = fir::extractSequenceType(boxTy);
 if (!mlir::isa(innerTy))
   TODO(loc, "Unsupported boxed type for reduction");
 // Create the private copy from the initial fir.box.
@@ -1230,7 +1217,7 @@ static void genCombiner(fir::FirOpBuilder &builder, mlir::Location loc,
 builder.create(loc, res, addr1);
 builder.setInsertionPointAfter(loops[0]);
   } else if (auto boxTy = mlir::dyn_cast(ty)) {
-mlir::Type innerTy = extractSequenceType(boxTy);
+mlir::Type innerTy = fir::extractSequenceType(boxTy);
 fir::SequenceType seqTy =
 mlir::dyn_cast_or_null(innerTy);
 if (!seqTy)
diff --git a/flang/lib/Optimizer/Dialect/FIRType.cpp b/flang/lib/Optimizer/Dialect/FIRType.cpp
index 8a2c681d958609..5c4cad6d208344 100644
--- a/flang/lib/Optimizer/Dialect/FIRType.cpp
+++ b/flang/lib/Optimizer/Dialect/FIRType.cpp
@@ -254,6 +254,18 @@ bool hasDynamicSize(mlir::Type t) {
   return false;
 }
 
+mlir::Type extractSequenceType(mlir::Type ty) {
+  if (mlir::isa(ty))
+return ty;
+  if (auto boxTy = mlir::dyn_cast(ty))
+return extractSequenceType(boxTy.getEleTy());
+  if (auto heapTy = mlir::dyn_cast(ty))
+return extractSequenceType(heapTy.getEleTy());
+  if (auto ptrTy = mlir::dyn_cast(ty))
+return extractSequenceType(ptrTy.getEleTy());
+  return mlir::Type{};
+}
+
 bool isPointerType(mlir::Type ty) {
   if (auto refTy = fir::dyn_cast_ptrEleTy(ty))
 ty = refTy;


[llvm-branch-commits] [flang] [flang][NFC] move extractSequenceType helper out of OpenACC to share code (PR #84957)

2024-03-12 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-openacc

Author: Tom Eccles (tblah)


Changes

Move extractSequenceType to FIRType.h so that it can also be used from 
OpenMP.

OpenMP array reductions 5/6

---
Full diff: https://github.com/llvm/llvm-project/pull/84957.diff


3 Files Affected:

- (modified) flang/include/flang/Optimizer/Dialect/FIRType.h (+3) 
- (modified) flang/lib/Lower/OpenACC.cpp (+4-17) 
- (modified) flang/lib/Optimizer/Dialect/FIRType.cpp (+12) 


``diff
diff --git a/flang/include/flang/Optimizer/Dialect/FIRType.h b/flang/include/flang/Optimizer/Dialect/FIRType.h
index a526b4ddf3b98c..7fcd9c1babf24f 100644
--- a/flang/include/flang/Optimizer/Dialect/FIRType.h
+++ b/flang/include/flang/Optimizer/Dialect/FIRType.h
@@ -237,6 +237,9 @@ inline mlir::Type unwrapSequenceType(mlir::Type t) {
   return t;
 }
 
+/// Return the nested sequence type if any.
+mlir::Type extractSequenceType(mlir::Type ty);
+
 inline mlir::Type unwrapRefType(mlir::Type t) {
   if (auto eleTy = dyn_cast_ptrEleTy(t))
 return eleTy;
diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index d2c6006ecf914a..6539de4d88304c 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -406,19 +406,6 @@ fir::ShapeOp genShapeOp(mlir::OpBuilder &builder, fir::SequenceType seqTy,
   return builder.create(loc, extents);
 }
 
-/// Return the nested sequence type if any.
-static mlir::Type extractSequenceType(mlir::Type ty) {
-  if (mlir::isa(ty))
-return ty;
-  if (auto boxTy = mlir::dyn_cast(ty))
-return extractSequenceType(boxTy.getEleTy());
-  if (auto heapTy = mlir::dyn_cast(ty))
-return extractSequenceType(heapTy.getEleTy());
-  if (auto ptrTy = mlir::dyn_cast(ty))
-return extractSequenceType(ptrTy.getEleTy());
-  return mlir::Type{};
-}
-
 template 
 static void genPrivateLikeInitRegion(mlir::OpBuilder &builder, RecipeOp recipe,
  mlir::Type ty, mlir::Location loc) {
@@ -454,7 +441,7 @@ static void genPrivateLikeInitRegion(mlir::OpBuilder &builder, RecipeOp recipe,
   }
 }
   } else if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
-mlir::Type innerTy = extractSequenceType(boxTy);
+mlir::Type innerTy = fir::extractSequenceType(boxTy);
 if (!innerTy)
   TODO(loc, "Unsupported boxed type in OpenACC privatization");
 fir::FirOpBuilder firBuilder{builder, recipe.getOperation()};
@@ -688,7 +675,7 @@ mlir::acc::FirstprivateRecipeOp Fortran::lower::createOrGetFirstprivateRecipe(
   } else if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
 fir::FirOpBuilder firBuilder{builder, recipe.getOperation()};
 llvm::SmallVector tripletArgs;
-mlir::Type innerTy = extractSequenceType(boxTy);
+mlir::Type innerTy = fir::extractSequenceType(boxTy);
 fir::SequenceType seqTy =
 mlir::dyn_cast_or_null(innerTy);
 if (!seqTy)
@@ -1018,7 +1005,7 @@ static mlir::Value genReductionInitRegion(fir::FirOpBuilder &builder,
   return declareOp.getBase();
 }
   } else if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
-mlir::Type innerTy = extractSequenceType(boxTy);
+mlir::Type innerTy = fir::extractSequenceType(boxTy);
 if (!mlir::isa(innerTy))
   TODO(loc, "Unsupported boxed type for reduction");
 // Create the private copy from the initial fir.box.
@@ -1230,7 +1217,7 @@ static void genCombiner(fir::FirOpBuilder &builder, mlir::Location loc,
 builder.create(loc, res, addr1);
 builder.setInsertionPointAfter(loops[0]);
   } else if (auto boxTy = mlir::dyn_cast(ty)) {
-mlir::Type innerTy = extractSequenceType(boxTy);
+mlir::Type innerTy = fir::extractSequenceType(boxTy);
 fir::SequenceType seqTy =
 mlir::dyn_cast_or_null(innerTy);
 if (!seqTy)
diff --git a/flang/lib/Optimizer/Dialect/FIRType.cpp b/flang/lib/Optimizer/Dialect/FIRType.cpp
index 8a2c681d958609..5c4cad6d208344 100644
--- a/flang/lib/Optimizer/Dialect/FIRType.cpp
+++ b/flang/lib/Optimizer/Dialect/FIRType.cpp
@@ -254,6 +254,18 @@ bool hasDynamicSize(mlir::Type t) {
   return false;
 }
 
+mlir::Type extractSequenceType(mlir::Type ty) {
+  if (mlir::isa(ty))
+return ty;
+  if (auto boxTy = mlir::dyn_cast(ty))
+return extractSequenceType(boxTy.getEleTy());
+  if (auto heapTy = mlir::dyn_cast(ty))
+return extractSequenceType(heapTy.getEleTy());
+  if (auto ptrTy = mlir::dyn_cast(ty))
+return extractSequenceType(ptrTy.getEleTy());
+  return mlir::Type{};
+}
+
 bool isPointerType(mlir::Type ty) {
   if (auto refTy = fir::dyn_cast_ptrEleTy(ty))
 ty = refTy;

``




https://github.com/llvm/llvm-project/pull/84957


[llvm-branch-commits] [flang] [flang][NFC] move extractSequenceType helper out of OpenACC to share code (PR #84957)

2024-03-12 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-flang-fir-hlfir

Author: Tom Eccles (tblah)


Changes

Moving extractSequenceType to FIRType.h so that this can also be used from 
OpenMP.

OpenMP array reductions 5/6

---
Full diff: https://github.com/llvm/llvm-project/pull/84957.diff


3 Files Affected:

- (modified) flang/include/flang/Optimizer/Dialect/FIRType.h (+3) 
- (modified) flang/lib/Lower/OpenACC.cpp (+4-17) 
- (modified) flang/lib/Optimizer/Dialect/FIRType.cpp (+12) 


``diff
diff --git a/flang/include/flang/Optimizer/Dialect/FIRType.h 
b/flang/include/flang/Optimizer/Dialect/FIRType.h
index a526b4ddf3b98c..7fcd9c1babf24f 100644
--- a/flang/include/flang/Optimizer/Dialect/FIRType.h
+++ b/flang/include/flang/Optimizer/Dialect/FIRType.h
@@ -237,6 +237,9 @@ inline mlir::Type unwrapSequenceType(mlir::Type t) {
   return t;
 }
 
+/// Return the nested sequence type if any.
+mlir::Type extractSequenceType(mlir::Type ty);
+
 inline mlir::Type unwrapRefType(mlir::Type t) {
   if (auto eleTy = dyn_cast_ptrEleTy(t))
 return eleTy;
diff --git a/flang/lib/Lower/OpenACC.cpp b/flang/lib/Lower/OpenACC.cpp
index d2c6006ecf914a..6539de4d88304c 100644
--- a/flang/lib/Lower/OpenACC.cpp
+++ b/flang/lib/Lower/OpenACC.cpp
@@ -406,19 +406,6 @@ fir::ShapeOp genShapeOp(mlir::OpBuilder &builder, 
fir::SequenceType seqTy,
   return builder.create(loc, extents);
 }
 
-/// Return the nested sequence type if any.
-static mlir::Type extractSequenceType(mlir::Type ty) {
-  if (mlir::isa(ty))
-return ty;
-  if (auto boxTy = mlir::dyn_cast(ty))
-return extractSequenceType(boxTy.getEleTy());
-  if (auto heapTy = mlir::dyn_cast(ty))
-return extractSequenceType(heapTy.getEleTy());
-  if (auto ptrTy = mlir::dyn_cast(ty))
-return extractSequenceType(ptrTy.getEleTy());
-  return mlir::Type{};
-}
-
 template 
 static void genPrivateLikeInitRegion(mlir::OpBuilder &builder, RecipeOp recipe,
  mlir::Type ty, mlir::Location loc) {
@@ -454,7 +441,7 @@ static void genPrivateLikeInitRegion(mlir::OpBuilder 
&builder, RecipeOp recipe,
   }
 }
   } else if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
-mlir::Type innerTy = extractSequenceType(boxTy);
+mlir::Type innerTy = fir::extractSequenceType(boxTy);
 if (!innerTy)
   TODO(loc, "Unsupported boxed type in OpenACC privatization");
 fir::FirOpBuilder firBuilder{builder, recipe.getOperation()};
@@ -688,7 +675,7 @@ mlir::acc::FirstprivateRecipeOp 
Fortran::lower::createOrGetFirstprivateRecipe(
   } else if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
 fir::FirOpBuilder firBuilder{builder, recipe.getOperation()};
 llvm::SmallVector tripletArgs;
-mlir::Type innerTy = extractSequenceType(boxTy);
+mlir::Type innerTy = fir::extractSequenceType(boxTy);
 fir::SequenceType seqTy =
 mlir::dyn_cast_or_null(innerTy);
 if (!seqTy)
@@ -1018,7 +1005,7 @@ static mlir::Value genReductionInitRegion(fir::FirOpBuilder &builder,
   return declareOp.getBase();
 }
   } else if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
-mlir::Type innerTy = extractSequenceType(boxTy);
+mlir::Type innerTy = fir::extractSequenceType(boxTy);
 if (!mlir::isa(innerTy))
   TODO(loc, "Unsupported boxed type for reduction");
 // Create the private copy from the initial fir.box.
@@ -1230,7 +1217,7 @@ static void genCombiner(fir::FirOpBuilder &builder, mlir::Location loc,
 builder.create(loc, res, addr1);
 builder.setInsertionPointAfter(loops[0]);
   } else if (auto boxTy = mlir::dyn_cast(ty)) {
-mlir::Type innerTy = extractSequenceType(boxTy);
+mlir::Type innerTy = fir::extractSequenceType(boxTy);
 fir::SequenceType seqTy =
 mlir::dyn_cast_or_null(innerTy);
 if (!seqTy)
diff --git a/flang/lib/Optimizer/Dialect/FIRType.cpp b/flang/lib/Optimizer/Dialect/FIRType.cpp
index 8a2c681d958609..5c4cad6d208344 100644
--- a/flang/lib/Optimizer/Dialect/FIRType.cpp
+++ b/flang/lib/Optimizer/Dialect/FIRType.cpp
@@ -254,6 +254,18 @@ bool hasDynamicSize(mlir::Type t) {
   return false;
 }
 
+mlir::Type extractSequenceType(mlir::Type ty) {
+  if (mlir::isa(ty))
+return ty;
+  if (auto boxTy = mlir::dyn_cast(ty))
+return extractSequenceType(boxTy.getEleTy());
+  if (auto heapTy = mlir::dyn_cast(ty))
+return extractSequenceType(heapTy.getEleTy());
+  if (auto ptrTy = mlir::dyn_cast(ty))
+return extractSequenceType(ptrTy.getEleTy());
+  return mlir::Type{};
+}
+
 bool isPointerType(mlir::Type ty) {
   if (auto refTy = fir::dyn_cast_ptrEleTy(ty))
 ty = refTy;

``




https://github.com/llvm/llvm-project/pull/84957
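
The helper moved by this patch peels box, heap and pointer wrappers until it reaches a sequence type, and returns a null type otherwise. The following is a minimal, self-contained sketch of that recursion. It uses a toy type model (`ToyType`, `Kind`, `makeType` and `extractSequence` are all hypothetical names, not flang or MLIR APIs), so it only illustrates the shape of the algorithm, not the real FIR type classes.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy stand-in for a type hierarchy with wrapper types.
// None of these names exist in flang; they are assumptions for illustration.
enum class Kind { Scalar, Sequence, Box, Heap, Pointer };

struct ToyType {
  Kind kind;
  std::shared_ptr<const ToyType> element; // wrapped type; null for Scalar
};

using TypePtr = std::shared_ptr<const ToyType>;

// Build a toy type. Scalars have no element type; every other kind wraps one.
TypePtr makeType(Kind kind, TypePtr element = nullptr) {
  assert((kind == Kind::Scalar) == (element == nullptr));
  return std::make_shared<ToyType>(ToyType{kind, std::move(element)});
}

// Return the nested sequence type if any, otherwise a null pointer.
TypePtr extractSequence(const TypePtr &ty) {
  if (!ty)
    return nullptr;
  switch (ty->kind) {
  case Kind::Sequence:
    return ty; // found the array type itself
  case Kind::Box:
  case Kind::Heap:
  case Kind::Pointer:
    return extractSequence(ty->element); // peel one wrapper and recurse
  case Kind::Scalar:
    return nullptr; // no sequence inside
  }
  return nullptr;
}
```

Callers are expected to check the result for null before using it, as the call sites in the diff do with `if (!innerTy)`.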
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][OpenMP] lower simple array reductions (PR #84958)

2024-03-12 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah created https://github.com/llvm/llvm-project/pull/84958

This has been tested with arrays with compile-time constant bounds. Allocatable 
arrays and arrays with non-constant bounds are not yet supported. User-defined 
reduction functions are also not yet supported.

The design is intended to work for arrays with non-constant bounds too without 
a lot of extra work (mostly there are bugs in OpenMPIRBuilder I haven't fixed 
yet).

We need some way to get these runtime bounds into the reduction init and 
combiner regions. To keep things simple for now I opted to always box the array 
arguments so the box can be passed as one argument and the lower bounds and 
extents read from the box. This has the disadvantage of resulting in 
fir.box_dim operations inside of the critical section. If these prove to be a 
performance issue, we could follow OpenACC reading box lower bounds and extents 
before the reduction and passing them as block arguments to the reduction init 
and combiner regions. I would prefer to keep things simple for now.

Note: this implementation only works when the HLFIR lowering is used. I don't 
think it is worth supporting FIR-only lowering because the plan is for that to 
be removed soon.
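
To make the boxing trade-off concrete, here is a rough sketch of a combiner that receives each array as a single descriptor and reads the extents from it. Everything in it is an assumption for illustration: `ToyBox`, `elementCount` and `combineAdd` are hypothetical names, the descriptor models only the fields discussed above, and the real lowering emits FIR/HLFIR operations rather than C++ loops.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical array descriptor: one argument that carries the bounds
// together with the data, so the combiner needs no extra block arguments.
struct ToyBox {
  std::vector<std::int64_t> lowerBounds; // one entry per dimension
  std::vector<std::int64_t> extents;     // one entry per dimension
  std::vector<int> data;                 // elements in flattened order
};

// Number of elements described by the box, read from its extents.
std::size_t elementCount(const ToyBox &box) {
  std::size_t count = 1;
  for (std::int64_t extent : box.extents)
    count *= static_cast<std::size_t>(extent);
  return count;
}

// Combine rhs into lhs element by element (an addition reduction).
// The extents are read inside the combiner, which mirrors the cost the
// description mentions: the bound queries happen in the combining step.
void combineAdd(ToyBox &lhs, const ToyBox &rhs) {
  assert(lhs.extents == rhs.extents);
  const std::size_t count = elementCount(lhs);
  assert(lhs.data.size() == count && rhs.data.size() == count);
  for (std::size_t i = 0; i < count; ++i)
    lhs.data[i] += rhs.data[i];
}
```

The alternative mentioned above would compute `count` once outside the reduction and pass it in as an extra argument.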

OpenMP array reductions 6/6

From bd668cd95d95d1e5b9c8436875c14878c98902ff Mon Sep 17 00:00:00 2001
From: Tom Eccles 
Date: Mon, 12 Feb 2024 14:03:00 +
Subject: [PATCH] [flang][OpenMP] lower simple array reductions

This has been tested with arrays with compile-time constant bounds.
Allocatable arrays and arrays with non-constant bounds are not yet
supported. User-defined reduction functions are also not yet supported.

The design is intended to work for arrays with non-constant bounds too
without a lot of extra work (mostly there are bugs in OpenMPIRBuilder I
haven't fixed yet).

We need some way to get these runtime bounds into the reduction init and
combiner regions. To keep things simple for now I opted to always box the
array arguments so the box can be passed as one argument and the lower
bounds and extents read from the box. This has the disadvantage of
resulting in fir.box_dim operations inside of the critical section. If
these prove to be a performance issue, we could follow OpenACC reading
box lower bounds and extents before the reduction and passing them as
block arguments to the reduction init and combiner regions. I would
prefer to keep things simple for now.

Note: this implementation only works when the HLFIR lowering is used. I
don't think it is worth supporting FIR-only lowering because the plan is
for that to be removed soon.
---
 flang/lib/Lower/OpenMP/ReductionProcessor.cpp | 283 ++
 flang/lib/Lower/OpenMP/ReductionProcessor.h   |   2 +-
 .../Lower/OpenMP/Todo/reduction-arrays.f90|  15 -
 .../Lower/OpenMP/parallel-reduction-array.f90 |  74 +
 .../Lower/OpenMP/wsloop-reduction-array.f90   |  84 ++
 5 files changed, 389 insertions(+), 69 deletions(-)
 delete mode 100644 flang/test/Lower/OpenMP/Todo/reduction-arrays.f90
 create mode 100644 flang/test/Lower/OpenMP/parallel-reduction-array.f90
 create mode 100644 flang/test/Lower/OpenMP/wsloop-reduction-array.f90

diff --git a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
index e6a63dd4b939ce..de2f11f5b9512e 100644
--- a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
@@ -13,6 +13,7 @@
 #include "ReductionProcessor.h"
 
 #include "flang/Lower/AbstractConverter.h"
+#include "flang/Optimizer/Builder/HLFIRTools.h"
 #include "flang/Optimizer/Builder/Todo.h"
 #include "flang/Optimizer/Dialect/FIRType.h"
 #include "flang/Optimizer/HLFIR/HLFIROps.h"
@@ -92,10 +93,42 @@ std::string ReductionProcessor::getReductionName(llvm::StringRef name,
   if (isByRef)
 byrefAddition = "_byref";
 
-  return (llvm::Twine(name) +
-  (ty.isIntOrIndex() ? llvm::Twine("_i_") : llvm::Twine("_f_")) +
-  llvm::Twine(ty.getIntOrFloatBitWidth()) + byrefAddition)
-  .str();
+  if (fir::isa_trivial(ty))
+return (llvm::Twine(name) +
+(ty.isIntOrIndex() ? llvm::Twine("_i_") : llvm::Twine("_f_")) +
+llvm::Twine(ty.getIntOrFloatBitWidth()) + byrefAddition)
+.str();
+
+  // creates a name like reduction_i_64_box_ux4x3
+  if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
+// TODO: support for allocatable boxes:
+// !fir.box>>
+fir::SequenceType seqTy = fir::unwrapRefType(boxTy.getEleTy())
+  .dyn_cast_or_null();
+if (!seqTy)
+  return {};
+
+std::string prefix = getReductionName(
+name, fir::unwrapSeqOrBoxedSeqType(ty), /*isByRef=*/false);
+if (prefix.empty())
+  return {};
+std::stringstream tyStr;
+tyStr << prefix << "_box_";
+bool first = true;
+for (std::int64_t extent : seqTy.getShape()) {
+  if (first)
+first = false;
+  else
+tyStr << "x";
+  if (exte

[llvm-branch-commits] [flang] [flang][OpenMP] lower simple array reductions (PR #84958)

2024-03-12 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-flang-fir-hlfir

Author: Tom Eccles (tblah)


Changes

This has been tested with arrays with compile-time constant bounds. Allocatable 
arrays and arrays with non-constant bounds are not yet supported. User-defined 
reduction functions are also not yet supported.

The design is intended to work for arrays with non-constant bounds too without 
a lot of extra work (mostly there are bugs in OpenMPIRBuilder I haven't fixed 
yet).

We need some way to get these runtime bounds into the reduction init and 
combiner regions. To keep things simple for now I opted to always box the array 
arguments so the box can be passed as one argument and the lower bounds and 
extents read from the box. This has the disadvantage of resulting in 
fir.box_dim operations inside of the critical section. If these prove to be a 
performance issue, we could follow OpenACC reading box lower bounds and extents 
before the reduction and passing them as block arguments to the reduction init 
and combiner regions. I would prefer to keep things simple for now.

Note: this implementation only works when the HLFIR lowering is used. I don't 
think it is worth supporting FIR-only lowering because the plan is for that to 
be removed soon.

OpenMP array reductions 6/6
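
The `getReductionName` hunk in this patch encodes the array shape into the reduction name, producing names such as `reduction_i_64_box_ux4x3`, with `u` standing for an unknown extent. A standalone sketch of that naming scheme might look like the following. The function name `boxedReductionName` and the sentinel `kUnknownExtent` are assumptions made for this example; the patch itself uses `seqTy.getUnknownExtent()` and works on MLIR types.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sentinel for "extent not known at compile time".
constexpr std::int64_t kUnknownExtent = -1;

// Append "_box_" and the extents joined by 'x' to an element-type prefix.
// Unknown extents are written as 'u', following the patch's comment that
// '?' may not be safe in symbol names.
std::string boxedReductionName(const std::string &prefix,
                               const std::vector<std::int64_t> &shape,
                               bool isByRef) {
  assert(!prefix.empty());
  std::ostringstream name;
  name << prefix << "_box_";
  bool first = true;
  for (std::int64_t extent : shape) {
    if (!first)
      name << 'x';
    first = false;
    if (extent == kUnknownExtent)
      name << 'u';
    else
      name << extent;
  }
  if (isByRef)
    name << "_byref";
  return name.str();
}
```

With these assumptions, a prefix of `reduction_i_64` and a shape of unknown by 4 by 3 yields the name quoted in the patch comment.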

---

Patch is 29.34 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/84958.diff


5 Files Affected:

- (modified) flang/lib/Lower/OpenMP/ReductionProcessor.cpp (+230-53) 
- (modified) flang/lib/Lower/OpenMP/ReductionProcessor.h (+1-1) 
- (removed) flang/test/Lower/OpenMP/Todo/reduction-arrays.f90 (-15) 
- (added) flang/test/Lower/OpenMP/parallel-reduction-array.f90 (+74) 
- (added) flang/test/Lower/OpenMP/wsloop-reduction-array.f90 (+84) 


``diff
diff --git a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
index e6a63dd4b939ce..de2f11f5b9512e 100644
--- a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
@@ -13,6 +13,7 @@
 #include "ReductionProcessor.h"
 
 #include "flang/Lower/AbstractConverter.h"
+#include "flang/Optimizer/Builder/HLFIRTools.h"
 #include "flang/Optimizer/Builder/Todo.h"
 #include "flang/Optimizer/Dialect/FIRType.h"
 #include "flang/Optimizer/HLFIR/HLFIROps.h"
@@ -92,10 +93,42 @@ std::string ReductionProcessor::getReductionName(llvm::StringRef name,
   if (isByRef)
 byrefAddition = "_byref";
 
-  return (llvm::Twine(name) +
-  (ty.isIntOrIndex() ? llvm::Twine("_i_") : llvm::Twine("_f_")) +
-  llvm::Twine(ty.getIntOrFloatBitWidth()) + byrefAddition)
-  .str();
+  if (fir::isa_trivial(ty))
+return (llvm::Twine(name) +
+(ty.isIntOrIndex() ? llvm::Twine("_i_") : llvm::Twine("_f_")) +
+llvm::Twine(ty.getIntOrFloatBitWidth()) + byrefAddition)
+.str();
+
+  // creates a name like reduction_i_64_box_ux4x3
+  if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
+// TODO: support for allocatable boxes:
+// !fir.box>>
+fir::SequenceType seqTy = fir::unwrapRefType(boxTy.getEleTy())
+  .dyn_cast_or_null();
+if (!seqTy)
+  return {};
+
+std::string prefix = getReductionName(
+name, fir::unwrapSeqOrBoxedSeqType(ty), /*isByRef=*/false);
+if (prefix.empty())
+  return {};
+std::stringstream tyStr;
+tyStr << prefix << "_box_";
+bool first = true;
+for (std::int64_t extent : seqTy.getShape()) {
+  if (first)
+first = false;
+  else
+tyStr << "x";
+  if (extent == seqTy.getUnknownExtent())
+tyStr << 'u'; // I'm not sure that '?' is safe in symbol names
+  else
+tyStr << extent;
+}
+return (tyStr.str() + byrefAddition).str();
+  }
+
+  return {};
 }
 
 std::string ReductionProcessor::getReductionName(
@@ -283,6 +316,156 @@ mlir::Value ReductionProcessor::createScalarCombiner(
   return reductionOp;
 }
 
+/// Create reduction combiner region for reduction variables which are boxed
+/// arrays
+static void genBoxCombiner(fir::FirOpBuilder &builder, mlir::Location loc,
+   ReductionProcessor::ReductionIdentifier redId,
+   fir::BaseBoxType boxTy, mlir::Value lhs,
+   mlir::Value rhs) {
+  fir::SequenceType seqTy =
+  mlir::dyn_cast_or_null(boxTy.getEleTy());
+  // TODO: support allocatable arrays: !fir.box>>
+  if (!seqTy || seqTy.hasUnknownShape())
+TODO(loc, "Unsupported boxed type in OpenMP reduction");
+
+  // load fir.ref>
+  mlir::Value lhsAddr = lhs;
+  lhs = builder.create(loc, lhs);
+  rhs = builder.create(loc, rhs);
+
+  const unsigned rank = seqTy.getDimension();
+  llvm::SmallVector extents;
+  extents.reserve(rank);
+  llvm::SmallVector lbAndExtents;
+  lbAndExtents.reserve(rank * 2);
+
+  // Get box lowerbounds and extents:
+  mlir::Type idxTy = builder.getIndexType();
+  for (unsigned i = 0; i < rank; ++i) {
+/

[llvm-branch-commits] [mlir] [mlir][LLVM] erase call mappings in forgetMapping() (PR #84955)

2024-03-12 Thread Kiran Chandramohan via llvm-branch-commits

https://github.com/kiranchandramohan approved this pull request.

LGTM. Please wait a day in case others have comments.

https://github.com/llvm/llvm-project/pull/84955


[llvm-branch-commits] [flang] [flang] support fir.alloca operations inside of omp reduction ops (PR #84952)

2024-03-12 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/84952


[llvm-branch-commits] [flang] [flang][NFC] move extractSequenceType helper out of OpenACC to share code (PR #84957)

2024-03-12 Thread Valentin Clement バレンタイン クレメン via llvm-branch-commits

https://github.com/clementval approved this pull request.


https://github.com/llvm/llvm-project/pull/84957


[llvm-branch-commits] [flang] [flang] run CFG conversion on omp reduction declare ops (PR #84953)

2024-03-12 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/84953


[llvm-branch-commits] [flang] [flang][CodeGen] Run PreCGRewrite on omp reduction declare ops (PR #84954)

2024-03-12 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/84954


[llvm-branch-commits] [mlir] [mlir][LLVM] erase call mappings in forgetMapping() (PR #84955)

2024-03-12 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/84955


[llvm-branch-commits] [flang] [flang][NFC] move extractSequenceType helper out of OpenACC to share code (PR #84957)

2024-03-12 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/84957


[llvm-branch-commits] [flang] [flang][OpenMP] lower simple array reductions (PR #84958)

2024-03-12 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah edited https://github.com/llvm/llvm-project/pull/84958


[llvm-branch-commits] [flang] [flang][OpenMP] Convert unique clauses in ClauseProcessor (PR #81622)

2024-03-12 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/81622


[llvm-branch-commits] [flang] [flang][OpenMP] Convert repeatable clauses (except Map) in ClauseProc… (PR #81623)

2024-03-12 Thread Tom Eccles via llvm-branch-commits


@@ -87,50 +87,44 @@ getSimdModifier(const omp::clause::Schedule &clause) {
 
 static void
 genAllocateClause(Fortran::lower::AbstractConverter &converter,
-  const Fortran::parser::OmpAllocateClause &ompAllocateClause,
+  const omp::clause::Allocate &clause,
   llvm::SmallVectorImpl &allocatorOperands,
   llvm::SmallVectorImpl &allocateOperands) {
   fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
   mlir::Location currentLocation = converter.getCurrentLocation();
   Fortran::lower::StatementContext stmtCtx;
 
   mlir::Value allocatorOperand;
-  const Fortran::parser::OmpObjectList &ompObjectList =
-  std::get(ompAllocateClause.t);
-  const auto &allocateModifier = std::get<
-  std::optional>(
-  ompAllocateClause.t);
+  const omp::ObjectList &objectList = std::get(clause.t);
+  const auto &modifier =
+  std::get>(clause.t);
 
   // If the allocate modifier is present, check if we only use the allocator
   // submodifier.  ALIGN in this context is unimplemented
   const bool onlyAllocator =
-  allocateModifier &&
-  std::holds_alternative<
-  Fortran::parser::OmpAllocateClause::AllocateModifier::Allocator>(
-  allocateModifier->u);
+  modifier &&
+  std::holds_alternative(
+  modifier->u);
 
-  if (allocateModifier && !onlyAllocator) {
+  if (modifier && !onlyAllocator) {
 TODO(currentLocation, "OmpAllocateClause ALIGN modifier");
   }
 
   // Check if allocate clause has allocator specified. If so, add it
   // to list of allocators, otherwise, add default allocator to
   // list of allocators.
   if (onlyAllocator) {
-const auto &allocatorValue = std::get<
-Fortran::parser::OmpAllocateClause::AllocateModifier::Allocator>(
-allocateModifier->u);
-allocatorOperand = fir::getBase(converter.genExprValue(
-*Fortran::semantics::GetExpr(allocatorValue.v), stmtCtx));
-allocatorOperands.insert(allocatorOperands.end(), ompObjectList.v.size(),
- allocatorOperand);
+const auto &value =
+std::get(modifier->u);
+mlir::Value operand =
+fir::getBase(converter.genExprValue(value.v, stmtCtx));

tblah wrote:

Why did you drop `Fortran::semantics::GetExpr`?

https://github.com/llvm/llvm-project/pull/81623


[llvm-branch-commits] [flang] [flang][OpenMP] Convert repeatable clauses (except Map) in ClauseProc… (PR #81623)

2024-03-12 Thread Tom Eccles via llvm-branch-commits


@@ -181,45 +172,41 @@ genDependKindAttr(fir::FirOpBuilder &firOpBuilder,
   pbKind);
 }
 
-static mlir::Value getIfClauseOperand(
-Fortran::lower::AbstractConverter &converter,
-const Fortran::parser::OmpClause::If *ifClause,
-Fortran::parser::OmpIfClause::DirectiveNameModifier directiveName,
-mlir::Location clauseLocation) {
+static mlir::Value
+getIfClauseOperand(Fortran::lower::AbstractConverter &converter,
+   const omp::clause::If &clause,
+   omp::clause::If::DirectiveNameModifier directiveName,
+   mlir::Location clauseLocation) {
   // Only consider the clause if it's intended for the given directive.
-  auto &directive = std::get<
-  std::optional>(
-  ifClause->v.t);
+  auto &directive =
+  
std::get>(clause.t);
   if (directive && directive.value() != directiveName)
 return nullptr;
 
   Fortran::lower::StatementContext stmtCtx;
   fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
-  auto &expr = std::get(ifClause->v.t);
   mlir::Value ifVal = fir::getBase(
-  converter.genExprValue(*Fortran::semantics::GetExpr(expr), stmtCtx));
+  converter.genExprValue(std::get(clause.t), stmtCtx));
   return firOpBuilder.createConvert(clauseLocation, firOpBuilder.getI1Type(),
 ifVal);
 }
 
 static void
 addUseDeviceClause(Fortran::lower::AbstractConverter &converter,
-   const Fortran::parser::OmpObjectList &useDeviceClause,
+   const omp::ObjectList &objects,
llvm::SmallVectorImpl &operands,
llvm::SmallVectorImpl &useDeviceTypes,
llvm::SmallVectorImpl &useDeviceLocs,
llvm::SmallVectorImpl
&useDeviceSymbols) {
-  genObjectList(useDeviceClause, converter, operands);
+  genObjectList(objects, converter, operands);
   for (mlir::Value &operand : operands) {
 checkMapType(operand.getLoc(), operand.getType());
 useDeviceTypes.push_back(operand.getType());
 useDeviceLocs.push_back(operand.getLoc());
   }
-  for (const Fortran::parser::OmpObject &ompObject : useDeviceClause.v) {
-Fortran::semantics::Symbol *sym = getOmpObjectSymbol(ompObject);
-useDeviceSymbols.push_back(sym);
-  }
+  for (const omp::Object &object : objects)
+useDeviceSymbols.push_back(object.id());

tblah wrote:

Why have you dropped the call to `getOmpObjectSymbol`?

https://github.com/llvm/llvm-project/pull/81623


[llvm-branch-commits] [flang] [flang][OpenMP] Convert repeatable clauses (except Map) in ClauseProc… (PR #81623)

2024-03-12 Thread Tom Eccles via llvm-branch-commits


@@ -181,45 +172,41 @@ genDependKindAttr(fir::FirOpBuilder &firOpBuilder,
   pbKind);
 }
 
-static mlir::Value getIfClauseOperand(
-Fortran::lower::AbstractConverter &converter,
-const Fortran::parser::OmpClause::If *ifClause,
-Fortran::parser::OmpIfClause::DirectiveNameModifier directiveName,
-mlir::Location clauseLocation) {
+static mlir::Value
+getIfClauseOperand(Fortran::lower::AbstractConverter &converter,
+   const omp::clause::If &clause,
+   omp::clause::If::DirectiveNameModifier directiveName,
+   mlir::Location clauseLocation) {
   // Only consider the clause if it's intended for the given directive.
-  auto &directive = std::get<
-  std::optional>(
-  ifClause->v.t);
+  auto &directive =
+  
std::get>(clause.t);
   if (directive && directive.value() != directiveName)
 return nullptr;
 
   Fortran::lower::StatementContext stmtCtx;
   fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
-  auto &expr = std::get(ifClause->v.t);
   mlir::Value ifVal = fir::getBase(
-  converter.genExprValue(*Fortran::semantics::GetExpr(expr), stmtCtx));
+  converter.genExprValue(std::get(clause.t), stmtCtx));

tblah wrote:

Here too

https://github.com/llvm/llvm-project/pull/81623


[llvm-branch-commits] [mlir] [mlir][LLVM] erase call mappings in forgetMapping() (PR #84955)

2024-03-12 Thread Christian Ulmann via llvm-branch-commits

https://github.com/Dinistro approved this pull request.


https://github.com/llvm/llvm-project/pull/84955


[llvm-branch-commits] [mlir] release/18.x: [mlir][NFC] Apply rule of five to *Pass classes (#80998) (PR #83971)

2024-03-12 Thread Mehdi Amini via llvm-branch-commits

https://github.com/joker-eph approved this pull request.


https://github.com/llvm/llvm-project/pull/83971


[llvm-branch-commits] [mlir] [mlir][LLVM] erase call mappings in forgetMapping() (PR #84955)

2024-03-12 Thread Mehdi Amini via llvm-branch-commits

joker-eph wrote:

We need a test here

https://github.com/llvm/llvm-project/pull/84955


[llvm-branch-commits] [llvm] release/18.x: [DSE] Remove malloc from EarliestEscapeInfo before removing. (#84157) (PR #84946)

2024-03-12 Thread Nikita Popov via llvm-branch-commits

https://github.com/nikic approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/84946


[llvm-branch-commits] [clang] release/18x: [clang] Avoid -Wshadow warning when init-capture named same as class … (PR #84912)

2024-03-12 Thread Nikita Popov via llvm-branch-commits

https://github.com/nikic edited https://github.com/llvm/llvm-project/pull/84912


[llvm-branch-commits] [llvm] release/18.x: [X86] Add missing subvector_subreg_lowering for BF16 (#83720) (PR #83834)

2024-03-12 Thread Nikita Popov via llvm-branch-commits

https://github.com/nikic edited https://github.com/llvm/llvm-project/pull/83834


[llvm-branch-commits] [llvm] release/18.x: [X86] Add missing subvector_subreg_lowering for BF16 (#83720) (PR #84491)

2024-03-12 Thread Nikita Popov via llvm-branch-commits

https://github.com/nikic closed https://github.com/llvm/llvm-project/pull/84491


[llvm-branch-commits] [clang] release/18x: [clang] Avoid -Wshadow warning when init-capture named same as class … (PR #84912)

2024-03-12 Thread Erich Keane via llvm-branch-commits

erichkeane wrote:

As far as I can tell, this isn't fixing a regression in Clang17, and thus isn't 
really a candidate for inclusion into the 18.x branches.

https://github.com/llvm/llvm-project/pull/84912


[llvm-branch-commits] [clang] release/18x: [clang] Avoid -Wshadow warning when init-capture named same as class … (PR #84912)

2024-03-12 Thread Neil Henning via llvm-branch-commits

sheredom wrote:

Yeah indeed - it's not a clang 17 regression. It's a _clang 16_ regression. At 
the minute we're holding our internal compiler toolchain on clang 16, and for 
our users we've got to special case disable the warning entirely for clang 17.

https://github.com/llvm/llvm-project/pull/84912


[llvm-branch-commits] [flang] [flang][OpenMP] Convert repeatable clauses (except Map) in ClauseProc… (PR #81623)

2024-03-12 Thread Krzysztof Parzyszek via llvm-branch-commits


@@ -181,45 +172,41 @@ genDependKindAttr(fir::FirOpBuilder &firOpBuilder,
   pbKind);
 }
 
-static mlir::Value getIfClauseOperand(
-Fortran::lower::AbstractConverter &converter,
-const Fortran::parser::OmpClause::If *ifClause,
-Fortran::parser::OmpIfClause::DirectiveNameModifier directiveName,
-mlir::Location clauseLocation) {
+static mlir::Value
+getIfClauseOperand(Fortran::lower::AbstractConverter &converter,
+   const omp::clause::If &clause,
+   omp::clause::If::DirectiveNameModifier directiveName,
+   mlir::Location clauseLocation) {
   // Only consider the clause if it's intended for the given directive.
-  auto &directive = std::get<
-  std::optional>(
-  ifClause->v.t);
+  auto &directive =
+  
std::get>(clause.t);
   if (directive && directive.value() != directiveName)
 return nullptr;
 
   Fortran::lower::StatementContext stmtCtx;
   fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
-  auto &expr = std::get(ifClause->v.t);
   mlir::Value ifVal = fir::getBase(
-  converter.genExprValue(*Fortran::semantics::GetExpr(expr), stmtCtx));
+  converter.genExprValue(std::get(clause.t), stmtCtx));
   return firOpBuilder.createConvert(clauseLocation, firOpBuilder.getI1Type(),
 ifVal);
 }
 
 static void
 addUseDeviceClause(Fortran::lower::AbstractConverter &converter,
-   const Fortran::parser::OmpObjectList &useDeviceClause,
+   const omp::ObjectList &objects,
llvm::SmallVectorImpl &operands,
llvm::SmallVectorImpl &useDeviceTypes,
llvm::SmallVectorImpl &useDeviceLocs,
llvm::SmallVectorImpl
&useDeviceSymbols) {
-  genObjectList(useDeviceClause, converter, operands);
+  genObjectList(objects, converter, operands);
   for (mlir::Value &operand : operands) {
 checkMapType(operand.getLoc(), operand.getType());
 useDeviceTypes.push_back(operand.getType());
 useDeviceLocs.push_back(operand.getLoc());
   }
-  for (const Fortran::parser::OmpObject &ompObject : useDeviceClause.v) {
-Fortran::semantics::Symbol *sym = getOmpObjectSymbol(ompObject);
-useDeviceSymbols.push_back(sym);
-  }
+  for (const omp::Object &object : objects)
+useDeviceSymbols.push_back(object.id());

kparzysz wrote:

Because `object.id()` returns `Symbol*`.  Also, same reason as above.

https://github.com/llvm/llvm-project/pull/81623


[llvm-branch-commits] [flang] [flang][OpenMP] Convert repeatable clauses (except Map) in ClauseProc… (PR #81623)

2024-03-12 Thread Krzysztof Parzyszek via llvm-branch-commits


@@ -87,50 +87,44 @@ getSimdModifier(const omp::clause::Schedule &clause) {
 
 static void
 genAllocateClause(Fortran::lower::AbstractConverter &converter,
-  const Fortran::parser::OmpAllocateClause &ompAllocateClause,
+  const omp::clause::Allocate &clause,
   llvm::SmallVectorImpl &allocatorOperands,
   llvm::SmallVectorImpl &allocateOperands) {
   fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
   mlir::Location currentLocation = converter.getCurrentLocation();
   Fortran::lower::StatementContext stmtCtx;
 
   mlir::Value allocatorOperand;
-  const Fortran::parser::OmpObjectList &ompObjectList =
-  std::get(ompAllocateClause.t);
-  const auto &allocateModifier = std::get<
-  std::optional>(
-  ompAllocateClause.t);
+  const omp::ObjectList &objectList = std::get(clause.t);
+  const auto &modifier =
+  std::get>(clause.t);
 
   // If the allocate modifier is present, check if we only use the allocator
   // submodifier.  ALIGN in this context is unimplemented
   const bool onlyAllocator =
-  allocateModifier &&
-  std::holds_alternative<
-  Fortran::parser::OmpAllocateClause::AllocateModifier::Allocator>(
-  allocateModifier->u);
+  modifier &&
+  std::holds_alternative(
+  modifier->u);
 
-  if (allocateModifier && !onlyAllocator) {
+  if (modifier && !onlyAllocator) {
 TODO(currentLocation, "OmpAllocateClause ALIGN modifier");
   }
 
   // Check if allocate clause has allocator specified. If so, add it
   // to list of allocators, otherwise, add default allocator to
   // list of allocators.
   if (onlyAllocator) {
-const auto &allocatorValue = std::get<
-Fortran::parser::OmpAllocateClause::AllocateModifier::Allocator>(
-allocateModifier->u);
-allocatorOperand = fir::getBase(converter.genExprValue(
-*Fortran::semantics::GetExpr(allocatorValue.v), stmtCtx));
-allocatorOperands.insert(allocatorOperands.end(), ompObjectList.v.size(),
- allocatorOperand);
+const auto &value =
+std::get(modifier->u);
+mlir::Value operand =
+fir::getBase(converter.genExprValue(value.v, stmtCtx));

kparzysz wrote:

Because it works on `parser` objects, which we no longer have here.
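
The check in the quoted hunk follows a common pattern: an optional modifier holding one of several alternatives, where only the allocator alternative is handled and anything else is reported as unimplemented. The sketch below shows that pattern with plain standard-library types. All names in it (`Allocator`, `Align`, `Modifier`, `AllocatorChoice`, `classify`) are hypothetical stand-ins, not the `omp::clause::Allocate` definitions, and the real code reads the variant through a `.u` member rather than using `std::variant` directly.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <variant>

// Hypothetical modifier alternatives for an allocate-style clause.
struct Allocator {
  std::string expr; // stand-in for an already-evaluated expression
};
struct Align {
  int bytes;
};
using Modifier = std::variant<Allocator, Align>;

enum class AllocatorChoice { Default, Explicit, Unsupported };

// Decide how to pick the allocator:
//   no modifier        -> use the default allocator
//   allocator only     -> use the one the clause names
//   any other modifier -> not handled (the diff emits a TODO here)
AllocatorChoice classify(const std::optional<Modifier> &modifier) {
  if (!modifier)
    return AllocatorChoice::Default;
  if (std::holds_alternative<Allocator>(*modifier))
    return AllocatorChoice::Explicit;
  return AllocatorChoice::Unsupported;
}
```

Because the alternative already carries a usable expression in this model, no separate lookup step is needed before evaluating it, which is the point made in the reply above.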

https://github.com/llvm/llvm-project/pull/81623


[llvm-branch-commits] [flang] [flang][OpenMP] Convert repeatable clauses (except Map) in ClauseProc… (PR #81623)

2024-03-12 Thread Krzysztof Parzyszek via llvm-branch-commits

https://github.com/kparzysz edited https://github.com/llvm/llvm-project/pull/81623


[llvm-branch-commits] [flang] [flang][OpenMP] Convert repeatable clauses (except Map) in ClauseProc… (PR #81623)

2024-03-12 Thread Krzysztof Parzyszek via llvm-branch-commits


@@ -181,45 +172,41 @@ genDependKindAttr(fir::FirOpBuilder &firOpBuilder,
   pbKind);
 }
 
-static mlir::Value getIfClauseOperand(
-Fortran::lower::AbstractConverter &converter,
-const Fortran::parser::OmpClause::If *ifClause,
-Fortran::parser::OmpIfClause::DirectiveNameModifier directiveName,
-mlir::Location clauseLocation) {
+static mlir::Value
+getIfClauseOperand(Fortran::lower::AbstractConverter &converter,
+   const omp::clause::If &clause,
+   omp::clause::If::DirectiveNameModifier directiveName,
+   mlir::Location clauseLocation) {
   // Only consider the clause if it's intended for the given directive.
-  auto &directive = std::get<
-  std::optional>(
-  ifClause->v.t);
+  auto &directive =
+  
std::get>(clause.t);
   if (directive && directive.value() != directiveName)
 return nullptr;
 
   Fortran::lower::StatementContext stmtCtx;
   fir::FirOpBuilder &firOpBuilder = converter.getFirOpBuilder();
-  auto &expr = std::get(ifClause->v.t);
   mlir::Value ifVal = fir::getBase(
-  converter.genExprValue(*Fortran::semantics::GetExpr(expr), stmtCtx));
+  converter.genExprValue(std::get(clause.t), stmtCtx));
   return firOpBuilder.createConvert(clauseLocation, firOpBuilder.getI1Type(),
 ifVal);
 }
 
 static void
 addUseDeviceClause(Fortran::lower::AbstractConverter &converter,
-   const Fortran::parser::OmpObjectList &useDeviceClause,
+   const omp::ObjectList &objects,
llvm::SmallVectorImpl &operands,
llvm::SmallVectorImpl &useDeviceTypes,
llvm::SmallVectorImpl &useDeviceLocs,
llvm::SmallVectorImpl
&useDeviceSymbols) {
-  genObjectList(useDeviceClause, converter, operands);
+  genObjectList(objects, converter, operands);
   for (mlir::Value &operand : operands) {
 checkMapType(operand.getLoc(), operand.getType());
 useDeviceTypes.push_back(operand.getType());
 useDeviceLocs.push_back(operand.getLoc());
   }
-  for (const Fortran::parser::OmpObject &ompObject : useDeviceClause.v) {
-Fortran::semantics::Symbol *sym = getOmpObjectSymbol(ompObject);
-useDeviceSymbols.push_back(sym);
-  }
+  for (const omp::Object &object : objects)
+useDeviceSymbols.push_back(object.id());

kparzysz wrote:

Everything that was previously in `parser::Something` has been translated 
into the new clauses: `OmpObject` is now `Symbol*` plus a designator 
(`MaybeExpr`), and most other things are `SomeExpr`.  The old code that computed 
these is no longer needed in the places where it used to be.
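
As a rough illustration of the shape described above, here is a minimal standalone sketch. It is not the flang code: `Symbol`, `MaybeExpr`, `Object`, `ObjectList`, and `collectSymbols` are hypothetical stand-ins, and the expression type is reduced to a string placeholder. It only shows why the loop in the diff can call `object.id()` directly instead of recomputing the symbol from a parser node.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the real semantics types (assumptions).
struct Symbol {
  std::string name;
};
using MaybeExpr = std::optional<std::string>; // placeholder for an expression

// An object is a symbol plus an optional designator expression.
class Object {
public:
  explicit Object(const Symbol *sym, MaybeExpr dsg = std::nullopt)
      : symbol(sym), designator(std::move(dsg)) {}

  // The symbol is stored when the clause is built, so no lookup is needed.
  const Symbol *id() const { return symbol; }
  const MaybeExpr &ref() const { return designator; }

private:
  const Symbol *symbol;
  MaybeExpr designator;
};

using ObjectList = std::vector<Object>;

// Same shape as the new loop in the diff: read the stored symbol directly.
inline std::vector<const Symbol *> collectSymbols(const ObjectList &objects) {
  std::vector<const Symbol *> symbols;
  for (const Object &object : objects)
    symbols.push_back(object.id());
  return symbols;
}
```

With this layout, the per-object work that the removed code did (resolving a symbol from an `OmpObject`) happens once, when the clause is constructed.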

https://github.com/llvm/llvm-project/pull/81623




[llvm-branch-commits] [NFC] [hwasan] factor get[PC|FP] out of HWASan class (PR #84404)

2024-03-12 Thread Vitaly Buka via llvm-branch-commits


@@ -1246,32 +1244,16 @@ Value *HWAddressSanitizer::getHwasanThreadSlotPtr(IRBuilder<> &IRB, Type *Ty) {
   return nullptr;
 }
 
-Value *HWAddressSanitizer::getPC(IRBuilder<> &IRB) {
-  if (TargetTriple.getArch() == Triple::aarch64)
-return readRegister(IRB, "pc");
-  return IRB.CreatePtrToInt(IRB.GetInsertBlock()->getParent(), IntptrTy);
-}
-
-Value *HWAddressSanitizer::getFP(IRBuilder<> &IRB) {
-  if (!CachedSP) {
-// FIXME: use addressofreturnaddress (but implement it in aarch64 backend
-// first).
-Function *F = IRB.GetInsertBlock()->getParent();
-Module *M = F->getParent();
-auto *GetStackPointerFn = Intrinsic::getDeclaration(
-M, Intrinsic::frameaddress,
-IRB.getPtrTy(M->getDataLayout().getAllocaAddrSpace()));
-CachedSP = IRB.CreatePtrToInt(
-IRB.CreateCall(GetStackPointerFn, {Constant::getNullValue(Int32Ty)}),
-IntptrTy);
-  }
+Value *HWAddressSanitizer::getCachedSP(IRBuilder<> &IRB) {
+  if (!CachedSP)
+CachedSP = memtag::getSP(IRB);

vitalybuka wrote:

It's not NFC, it's FP (frameaddress) -> SP

https://github.com/llvm/llvm-project/pull/84404


[llvm-branch-commits] [NFC] [hwasan] factor get[PC|FP] out of HWASan class (PR #84404)

2024-03-12 Thread Vitaly Buka via llvm-branch-commits


@@ -236,5 +238,37 @@ void alignAndPadAlloca(memtag::AllocaInfo &Info, llvm::Align Alignment) {
   Info.AI = NewAI;
 }
 
+Value *readRegister(IRBuilder<> &IRB, StringRef Name) {
+  Module *M = IRB.GetInsertBlock()->getParent()->getParent();
+  Function *ReadRegister = Intrinsic::getDeclaration(
+  M, Intrinsic::read_register, IRB.getIntPtrTy(M->getDataLayout()));
+  MDNode *MD =
+  MDNode::get(M->getContext(), {MDString::get(M->getContext(), Name)});
+  Value *Args[] = {MetadataAsValue::get(M->getContext(), MD)};
+  return IRB.CreateCall(ReadRegister, Args);
+}
+
+Value *getPC(const Triple &TargetTriple, IRBuilder<> &IRB) {
+  Module *M = IRB.GetInsertBlock()->getParent()->getParent();
+  if (TargetTriple.getArch() == Triple::aarch64)
+return memtag::readRegister(IRB, "pc");
+  return IRB.CreatePtrToInt(IRB.GetInsertBlock()->getParent(),
+IRB.getIntPtrTy(M->getDataLayout()));
+}
+
+Value *getSP(IRBuilder<> &IRB) {

vitalybuka wrote:

so it should be getFP

https://github.com/llvm/llvm-project/pull/84404


[llvm-branch-commits] [NFC] [hwasan] factor get[PC|FP] out of HWASan class (PR #84404)

2024-03-12 Thread Vitaly Buka via llvm-branch-commits


@@ -236,5 +238,37 @@ void alignAndPadAlloca(memtag::AllocaInfo &Info, llvm::Align Alignment) {
   Info.AI = NewAI;
 }
 
+Value *readRegister(IRBuilder<> &IRB, StringRef Name) {
+  Module *M = IRB.GetInsertBlock()->getParent()->getParent();
+  Function *ReadRegister = Intrinsic::getDeclaration(
+  M, Intrinsic::read_register, IRB.getIntPtrTy(M->getDataLayout()));
+  MDNode *MD =
+  MDNode::get(M->getContext(), {MDString::get(M->getContext(), Name)});
+  Value *Args[] = {MetadataAsValue::get(M->getContext(), MD)};
+  return IRB.CreateCall(ReadRegister, Args);
+}
+
+Value *getPC(const Triple &TargetTriple, IRBuilder<> &IRB) {
+  Module *M = IRB.GetInsertBlock()->getParent()->getParent();
+  if (TargetTriple.getArch() == Triple::aarch64)
+return memtag::readRegister(IRB, "pc");
+  return IRB.CreatePtrToInt(IRB.GetInsertBlock()->getParent(),
+IRB.getIntPtrTy(M->getDataLayout()));
+}
+
+Value *getSP(IRBuilder<> &IRB) {
+  // FIXME: use addressofreturnaddress (but implement it in aarch64 backend

vitalybuka wrote:

Please remove the FIXME;
it looks like we already rely on the FP behaviour.

https://github.com/llvm/llvm-project/pull/84404


[llvm-branch-commits] [lld] [llvm] Backport fixes for ARM64EC import libraries (PR #84590)

2024-03-12 Thread Daniel Paoliello via llvm-branch-commits

https://github.com/dpaoliello updated https://github.com/llvm/llvm-project/pull/84590

>From e84cbe53501ae106c25ca7233e48ad3c5daf539a Mon Sep 17 00:00:00 2001
From: Jacek Caban 
Date: Tue, 6 Feb 2024 13:47:58 +0100
Subject: [PATCH 1/4] [llvm-readobj][Object][COFF] Print COFF import library
 symbol export name. (#78769)

getExportName implementation is based on lld-link. In its current form,
it's mostly about convenience, but it will be more useful for EXPORTAS
support, for which export name is not possible to deduce from other
printed properties.
---
 lld/test/COFF/def-export-cpp.s|  1 +
 lld/test/COFF/def-export-stdcall.s| 13 ++
 lld/test/COFF/dllexport.s |  4 +++
 llvm/include/llvm/Object/COFFImportFile.h |  1 +
 llvm/lib/Object/COFFImportFile.cpp| 26 +++
 .../tools/llvm-dlltool/coff-decorated.def |  7 +
 llvm/test/tools/llvm-dlltool/coff-exports.def |  3 +++
 llvm/test/tools/llvm-dlltool/coff-noname.def  |  1 +
 .../llvm-dlltool/no-leading-underscore.def|  2 ++
 llvm/test/tools/llvm-lib/arm64ec-implib.test  |  2 ++
 .../tools/llvm-readobj/COFF/file-headers.test |  1 +
 llvm/tools/llvm-readobj/COFFImportDumper.cpp  |  3 +++
 12 files changed, 64 insertions(+)

diff --git a/lld/test/COFF/def-export-cpp.s b/lld/test/COFF/def-export-cpp.s
index e00b35b1c5b39b..370b8ddba4104b 100644
--- a/lld/test/COFF/def-export-cpp.s
+++ b/lld/test/COFF/def-export-cpp.s
@@ -10,6 +10,7 @@
 
 # IMPLIB: File: foo.dll
 # IMPLIB: Name type: undecorate
+# IMPLIB-NEXT: Export name: GetPathOnDisk
 # IMPLIB-NEXT: Symbol: __imp_?GetPathOnDisk@@YA_NPEA_W@Z
 # IMPLIB-NEXT: Symbol: ?GetPathOnDisk@@YA_NPEA_W@Z
 
diff --git a/lld/test/COFF/def-export-stdcall.s b/lld/test/COFF/def-export-stdcall.s
index f015e205c74a33..7e4e04c77cbe7a 100644
--- a/lld/test/COFF/def-export-stdcall.s
+++ b/lld/test/COFF/def-export-stdcall.s
@@ -6,15 +6,19 @@
 # RUN: llvm-readobj --coff-exports %t.dll | FileCheck -check-prefix UNDECORATED-EXPORTS %s
 
 # UNDECORATED-IMPLIB: Name type: noprefix
+# UNDECORATED-IMPLIB-NEXT: Export name: _underscored
 # UNDECORATED-IMPLIB-NEXT: __imp___underscored
 # UNDECORATED-IMPLIB-NEXT: __underscored
 # UNDECORATED-IMPLIB: Name type: undecorate
+# UNDECORATED-IMPLIB-NEXT: Export name: fastcall
 # UNDECORATED-IMPLIB-NEXT: __imp_@fastcall@8
 # UNDECORATED-IMPLIB-NEXT: fastcall@8
 # UNDECORATED-IMPLIB: Name type: undecorate
+# UNDECORATED-IMPLIB-NEXT: Export name: stdcall
 # UNDECORATED-IMPLIB-NEXT: __imp__stdcall@8
 # UNDECORATED-IMPLIB-NEXT: _stdcall@8
 # UNDECORATED-IMPLIB: Name type: undecorate
+# UNDECORATED-IMPLIB-NEXT: Export name: vectorcall
 # UNDECORATED-IMPLIB-NEXT: __imp_vectorcall@@8
 # UNDECORATED-IMPLIB-NEXT: vectorcall@@8
 
@@ -30,12 +34,15 @@
 # RUN: llvm-readobj --coff-exports %t.dll | FileCheck -check-prefix DECORATED-EXPORTS %s
 
 # DECORATED-IMPLIB: Name type: name
+# DECORATED-IMPLIB-NEXT: Export name: @fastcall@8
 # DECORATED-IMPLIB-NEXT: __imp_@fastcall@8
 # DECORATED-IMPLIB-NEXT: @fastcall@8
 # DECORATED-IMPLIB: Name type: name
+# DECORATED-IMPLIB-NEXT: Export name: _stdcall@8
 # DECORATED-IMPLIB-NEXT: __imp__stdcall@8
 # DECORATED-IMPLIB-NEXT: _stdcall@8
 # DECORATED-IMPLIB: Name type: name
+# DECORATED-IMPLIB-NEXT: Export name: vectorcall@@8
 # DECORATED-IMPLIB-NEXT: __imp_vectorcall@@8
 # DECORATED-IMPLIB-NEXT: vectorcall@@8
 
@@ -51,14 +58,17 @@
 # RUN: llvm-readobj --coff-exports %t.dll | FileCheck -check-prefix DECORATED-MINGW-EXPORTS %s
 
 # DECORATED-MINGW-IMPLIB: Name type: name
+# DECORATED-MINGW-IMPLIB-NEXT: Export name: @fastcall@8
 # DECORATED-MINGW-IMPLIB-NEXT: __imp_@fastcall@8
 # DECORATED-MINGW-IMPLIB-NEXT: fastcall@8
 # DECORATED-MINGW-IMPLIB: Name type: noprefix
+# DECORATED-MINGW-IMPLIB-NEXT: Export name: stdcall@8
 # DECORATED-MINGW-IMPLIB-NEXT: __imp__stdcall@8
 # DECORATED-MINGW-IMPLIB-NEXT: _stdcall@8
 # GNU tools don't support vectorcall, but this test is just to track that
 # lld's behaviour remains consistent over time.
 # DECORATED-MINGW-IMPLIB: Name type: name
+# DECORATED-MINGW-IMPLIB-NEXT: Export name: vectorcall@@8
 # DECORATED-MINGW-IMPLIB-NEXT: __imp_vectorcall@@8
 # DECORATED-MINGW-IMPLIB-NEXT: vectorcall@@8
 
@@ -75,14 +85,17 @@
 # RUN: llvm-readobj --coff-exports %t.dll | FileCheck -check-prefix MINGW-KILL-AT-EXPORTS %s
 
 # MINGW-KILL-AT-IMPLIB: Name type: noprefix
+# MINGW-KILL-AT-IMPLIB: Export name: fastcall
 # MINGW-KILL-AT-IMPLIB: __imp__fastcall
 # MINGW-KILL-AT-IMPLIB-NEXT: _fastcall
 # MINGW-KILL-AT-IMPLIB: Name type: noprefix
+# MINGW-KILL-AT-IMPLIB-NEXT: Export name: stdcall
 # MINGW-KILL-AT-IMPLIB-NEXT: __imp__stdcall
 # MINGW-KILL-AT-IMPLIB-NEXT: _stdcall
 # GNU tools don't support vectorcall, but this test is just to track that
 # lld's behaviour remains consistent over time.
 # MINGW-KILL-AT-IMPLIB: Name type: noprefix
+# MINGW-KILL-AT-IMPLIB-NEXT: Export name: vectorcall
 # MINGW-KILL-AT-IMPLIB-NEXT: __imp__vectorcall
 # MINGW-KILL-AT-IMPLIB-NEXT
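
The `Export name:` lines checked in the tests above depend on the import's name type. The following is a minimal standalone sketch of that mapping, inferred from the expected output in these tests rather than copied from the patch: `NameType`, `ltrim1`, and `exportName` are hypothetical names, and the rules (strip at most one leading `?`, `@`, or `_`; for `undecorate`, also cut at the first `@`) are an assumption that happens to reproduce the quoted expectations.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical mirror of the COFF import name types (assumption).
enum class NameType { Ordinal, Name, NoPrefix, Undecorate };

// Drop at most one leading character if it is one of `chars`.
inline std::string_view ltrim1(std::string_view s, std::string_view chars) {
  if (!s.empty() && chars.find(s.front()) != std::string_view::npos)
    s.remove_prefix(1);
  return s;
}

// Derive the export name from the import symbol name and its name type.
inline std::string_view exportName(std::string_view symbol, NameType type) {
  switch (type) {
  case NameType::Ordinal:
    return {}; // imported by ordinal: no export name
  case NameType::NoPrefix:
    return ltrim1(symbol, "?@_");
  case NameType::Undecorate: {
    std::string_view name = ltrim1(symbol, "?@_");
    return name.substr(0, name.find('@')); // cut at the first '@', if any
  }
  case NameType::Name:
    return symbol; // used verbatim
  }
  return symbol;
}
```

For example, under these assumed rules `?GetPathOnDisk@@YA_NPEA_W@Z` with name type `undecorate` yields `GetPathOnDisk`, and `__underscored` with `noprefix` yields `_underscored`, matching the `IMPLIB` and `UNDECORATED-IMPLIB` checks.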

[llvm-branch-commits] [lld] [llvm] Backport fixes for ARM64EC import libraries (PR #84590)

2024-03-12 Thread Daniel Paoliello via llvm-branch-commits

dpaoliello wrote:

 has been merged into master 
and cherry-picked here, so this is ready to land.

> One thing we could consider is to skip .def file parser part of it. 

I'd rather not modify the commits, since they've already been tested in master 
and modifying them makes subsequent cherry-picks more difficult.

https://github.com/llvm/llvm-project/pull/84590


[llvm-branch-commits] [NFC] [hwasan] factor get[PC|FP] out of HWASan class (PR #84404)

2024-03-12 Thread Florian Mayer via llvm-branch-commits


@@ -1246,32 +1244,16 @@ Value 
*HWAddressSanitizer::getHwasanThreadSlotPtr(IRBuilder<> &IRB, Type *Ty) {
   return nullptr;
 }
 
-Value *HWAddressSanitizer::getPC(IRBuilder<> &IRB) {
-  if (TargetTriple.getArch() == Triple::aarch64)
-return readRegister(IRB, "pc");
-  return IRB.CreatePtrToInt(IRB.GetInsertBlock()->getParent(), IntptrTy);
-}
-
-Value *HWAddressSanitizer::getFP(IRBuilder<> &IRB) {
-  if (!CachedSP) {
-// FIXME: use addressofreturnaddress (but implement it in aarch64 backend
-// first).
-Function *F = IRB.GetInsertBlock()->getParent();
-Module *M = F->getParent();
-auto *GetStackPointerFn = Intrinsic::getDeclaration(
-M, Intrinsic::frameaddress,
-IRB.getPtrTy(M->getDataLayout().getAllocaAddrSpace()));
-CachedSP = IRB.CreatePtrToInt(
-IRB.CreateCall(GetStackPointerFn, {Constant::getNullValue(Int32Ty)}),
-IntptrTy);
-  }
+Value *HWAddressSanitizer::getCachedSP(IRBuilder<> &IRB) {
+  if (!CachedSP)
+CachedSP = memtag::getSP(IRB);

fmayer wrote:

it's NFC, i just made the name consistent.

before, we had this

```
Value *SP = getFP(IRB);
```
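
For readers following the FP/SP naming question, here is a small standalone sketch of the pattern under discussion: a value obtained from the frame address and cached on first use. It is not the HWASan code. `currentFrameAddress`, `FrameInfo`, and `getCachedFP` are hypothetical names, and it relies on the GCC/Clang builtin `__builtin_frame_address(0)`, which Clang lowers to `llvm.frameaddress`, so it assumes one of those compilers.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper: frame address of this helper's own frame, as an integer.
// Assumes a GCC- or Clang-compatible compiler.
__attribute__((noinline)) inline std::uintptr_t currentFrameAddress() {
  return reinterpret_cast<std::uintptr_t>(__builtin_frame_address(0));
}

// Compute the value once and reuse it, like the `if (!CachedSP)` check above.
class FrameInfo {
public:
  std::uintptr_t getCachedFP() {
    if (!Cached) {
      Cached = currentFrameAddress();
      ++Computations;
    }
    return *Cached;
  }

  int computations() const { return Computations; }

private:
  std::optional<std::uintptr_t> Cached;
  int Computations = 0;
};
```

Naming the accessor after what it actually returns (a frame address here) avoids the mismatch raised in the review, where a function called `getFP` fed a variable called `SP`.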

https://github.com/llvm/llvm-project/pull/84404

