[llvm-branch-commits] [libcxx] [libc++][chrono] Adds the sys_info class. (PR #85619)

2024-03-18 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-libcxx

Author: Mark de Wever (mordante)


Changes

Adds the sys_info class and time_zone::get_info(). The code still has a few 
quirks and has not been optimized for performance yet.

The returned sys_info is compared against the output of the zdump tool in the 
test giving confidence the implementation is correct.

Implements parts of:
- P0355 Extending  to Calendars and Time Zones

Implements:
- LWG The sys_info range should be affected by save

---

Patch is 119.62 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/85619.diff


17 Files Affected:

- (modified) libcxx/docs/Status/Cxx2cIssues.csv (+1) 
- (modified) libcxx/include/CMakeLists.txt (+1) 
- (added) libcxx/include/__chrono/sys_info.h (+53) 
- (modified) libcxx/include/__chrono/time_zone.h (+9) 
- (modified) libcxx/include/chrono (+13) 
- (modified) libcxx/include/libcxx.imp (+1) 
- (modified) libcxx/modules/std/chrono.inc (+2) 
- (modified) libcxx/src/include/tzdb/types_private.h (+13-2) 
- (modified) libcxx/src/include/tzdb/tzdb_list_private.h (+7) 
- (modified) libcxx/src/time_zone.cpp (+855) 
- (modified) 
libcxx/test/libcxx/diagnostics/chrono.nodiscard_extensions.compile.pass.cpp 
(+1) 
- (modified) 
libcxx/test/libcxx/diagnostics/chrono.nodiscard_extensions.verify.cpp (+2) 
- (added) 
libcxx/test/libcxx/time/time.zone/time.zone.timezone/time.zone.members/get_info.sys_time.pass.cpp
 (+140) 
- (added) 
libcxx/test/std/time/time.zone/time.zone.info/time.zone.info.sys/sys_info.members.compile.pass.cpp
 (+33) 
- (added) 
libcxx/test/std/time/time.zone/time.zone.timezone/time.zone.members/get_info.sys_time.pass.cpp
 (+1374) 
- (added) 
libcxx/test/std/time/time.zone/time.zone.timezone/time.zone.members/sys_info.zdump.pass.cpp
 (+124) 
- (modified) libcxx/utils/libcxx/test/features.py (+6) 


``diff
diff --git a/libcxx/docs/Status/Cxx2cIssues.csv 
b/libcxx/docs/Status/Cxx2cIssues.csv
index 58e995809777c1..a6aa70ecc866e0 100644
--- a/libcxx/docs/Status/Cxx2cIssues.csv
+++ b/libcxx/docs/Status/Cxx2cIssues.csv
@@ -41,4 +41,5 @@
 "`4001 `__","``iota_view`` should provide 
``empty``","Kona November 2023","","","|ranges|"
 "","","","","",""
 "`3343 `__","Ordering of calls to ``unlock()`` and 
``notify_all()`` in Effects element of ``notify_all_at_thread_exit()`` should 
be reversed","Not Yet Adopted","|Complete|","16.0",""
+"","","The sys_info range should be affected by save","Not Yet 
Adopted","|Complete|","19.0"
 "","","","","",""
diff --git a/libcxx/include/CMakeLists.txt b/libcxx/include/CMakeLists.txt
index 5d9a693d152e4a..24cef9cd877c62 100644
--- a/libcxx/include/CMakeLists.txt
+++ b/libcxx/include/CMakeLists.txt
@@ -289,6 +289,7 @@ set(files
   __chrono/ostream.h
   __chrono/parser_std_format_spec.h
   __chrono/statically_widen.h
+  __chrono/sys_info.h
   __chrono/steady_clock.h
   __chrono/system_clock.h
   __chrono/time_point.h
diff --git a/libcxx/include/__chrono/sys_info.h 
b/libcxx/include/__chrono/sys_info.h
new file mode 100644
index 00..16ec5dd254d59a
--- /dev/null
+++ b/libcxx/include/__chrono/sys_info.h
@@ -0,0 +1,53 @@
+// -*- C++ -*-
+//===--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM 
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+
+// For information see https://libcxx.llvm.org/DesignDocs/TimeZone.html
+
+#ifndef _LIBCPP___CHRONO_SYS_INFO_H
+#define _LIBCPP___CHRONO_SYS_INFO_H
+
+#include 
+// Enable the contents of the header only when libc++ was built with 
experimental features enabled.
+#if !defined(_LIBCPP_HAS_NO_INCOMPLETE_TZDB)
+
+#  include <__chrono/duration.h>
+#  include <__chrono/system_clock.h>
+#  include <__chrono/time_point.h>
+#  include <__config>
+#  include 
+
+#  if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
+#pragma GCC system_header
+#  endif
+
+_LIBCPP_BEGIN_NAMESPACE_STD
+
+#  if _LIBCPP_STD_VER >= 20 && !defined(_LIBCPP_HAS_NO_TIME_ZONE_DATABASE) && 
!defined(_LIBCPP_HAS_NO_FILESYSTEM) &&   \
+  !defined(_LIBCPP_HAS_NO_LOCALIZATION)
+
+namespace chrono {
+
+struct sys_info {
+  sys_seconds begin;
+  sys_seconds end;
+  seconds offset;
+  minutes save;
+  string abbrev;
+};
+
+} // namespace chrono
+
+#  endif // _LIBCPP_STD_VER >= 20 && 
!defined(_LIBCPP_HAS_NO_TIME_ZONE_DATABASE) && 
!defined(_LIBCPP_HAS_NO_FILESYSTEM)
+ // && !defined(_LIBCPP_HAS_NO_LOCALIZATION)
+
+_LIBCPP_END_NAMESPACE_STD
+
+#endif // !defined(_LIBCPP_HAS_NO_INCOMPLETE_TZDB)
+
+#endif // _LIBCPP___CHRONO_SYS_INFO_H
diff --git a/libcxx/include/__chrono/time_zone.h 
b/libcxx/include/__chrono/time_zone.h
index 7d97327a6c8e99..a5af6297a4cf92 100644
--- a/libcxx/include/__chro

[llvm-branch-commits] [libcxx] [libc++][chrono] Adds the sys_info class. (PR #85619)

2024-03-18 Thread Mark de Wever via llvm-branch-commits


@@ -41,4 +41,5 @@
 "`4001 `__","``iota_view`` should provide 
``empty``","Kona November 2023","","","|ranges|"
 "","","","","",""
 "`3343 `__","Ordering of calls to ``unlock()`` and 
``notify_all()`` in Effects element of ``notify_all_at_thread_exit()`` should 
be reversed","Not Yet Adopted","|Complete|","16.0",""
+"","","The sys_info range should be affected by save","Not Yet 
Adopted","|Complete|","19.0"

mordante wrote:

This has been discussed with Howard and I'll file an LWG issue.

https://github.com/llvm/llvm-project/pull/85619
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][OpenMP] lower simple array reductions (PR #84958)

2024-03-18 Thread Tom Eccles via llvm-branch-commits


@@ -283,13 +316,166 @@ mlir::Value ReductionProcessor::createScalarCombiner(
   return reductionOp;
 }
 
+/// Create reduction combiner region for reduction variables which are boxed
+/// arrays
+static void genBoxCombiner(fir::FirOpBuilder &builder, mlir::Location loc,
+   ReductionProcessor::ReductionIdentifier redId,
+   fir::BaseBoxType boxTy, mlir::Value lhs,
+   mlir::Value rhs) {
+  fir::SequenceType seqTy =
+  mlir::dyn_cast_or_null(boxTy.getEleTy());
+  // TODO: support allocatable arrays: !fir.box>>
+  if (!seqTy || seqTy.hasUnknownShape())
+TODO(loc, "Unsupported boxed type in OpenMP reduction");
+
+  // load fir.ref>
+  mlir::Value lhsAddr = lhs;
+  lhs = builder.create(loc, lhs);
+  rhs = builder.create(loc, rhs);
+
+  const unsigned rank = seqTy.getDimension();
+  llvm::SmallVector extents;
+  extents.reserve(rank);
+  llvm::SmallVector lbAndExtents;
+  lbAndExtents.reserve(rank * 2);
+
+  // Get box lowerbounds and extents:
+  mlir::Type idxTy = builder.getIndexType();
+  for (unsigned i = 0; i < rank; ++i) {
+// TODO: ideally we want to hoist box reads out of the critical section.
+// We could do this by having box dimensions in block arguments like
+// OpenACC does
+mlir::Value dim = builder.createIntegerConstant(loc, idxTy, i);
+auto dimInfo =
+builder.create(loc, idxTy, idxTy, idxTy, lhs, dim);
+extents.push_back(dimInfo.getExtent());
+lbAndExtents.push_back(dimInfo.getLowerBound());
+lbAndExtents.push_back(dimInfo.getExtent());
+  }
+
+  auto shapeShiftTy = fir::ShapeShiftType::get(builder.getContext(), rank);
+  auto shapeShift =
+  builder.create(loc, shapeShiftTy, lbAndExtents);
+
+  // Iterate over array elements, applying the equivalent scalar reduction:
+
+  // A hlfir::elemental here gets inlined with a temporary so create the
+  // loop nest directly.
+  // This function already controls all of the code in this region so we
+  // know this won't miss any opportuinties for clever elemental inlining
+  hlfir::LoopNest nest =
+  hlfir::genLoopNest(loc, builder, extents, /*isUnordered=*/true);
+  builder.setInsertionPointToStart(nest.innerLoop.getBody());
+  mlir::Type refTy = fir::ReferenceType::get(seqTy.getEleTy());
+  auto lhsEleAddr = builder.create(
+  loc, refTy, lhs, shapeShift, /*slice=*/mlir::Value{},
+  nest.oneBasedIndices, /*typeparms=*/mlir::ValueRange{});
+  auto rhsEleAddr = builder.create(
+  loc, refTy, rhs, shapeShift, /*slice=*/mlir::Value{},
+  nest.oneBasedIndices, /*typeparms=*/mlir::ValueRange{});
+  auto lhsEle = builder.create(loc, lhsEleAddr);
+  auto rhsEle = builder.create(loc, rhsEleAddr);
+  mlir::Value scalarReduction = ReductionProcessor::createScalarCombiner(
+  builder, loc, redId, refTy, lhsEle, rhsEle);
+  builder.create(loc, scalarReduction, lhsEleAddr);
+
+  builder.setInsertionPointAfter(nest.outerLoop);
+  builder.create(loc, lhsAddr);
+}
+
+// generate combiner region for reduction operations
+static void genCombiner(fir::FirOpBuilder &builder, mlir::Location loc,
+ReductionProcessor::ReductionIdentifier redId,
+mlir::Type ty, mlir::Value lhs, mlir::Value rhs,
+bool isByRef) {
+  ty = fir::unwrapRefType(ty);
+
+  if (fir::isa_trivial(ty)) {
+mlir::Value lhsLoaded = builder.loadIfRef(loc, lhs);
+mlir::Value rhsLoaded = builder.loadIfRef(loc, rhs);
+
+mlir::Value result = ReductionProcessor::createScalarCombiner(
+builder, loc, redId, ty, lhsLoaded, rhsLoaded);
+if (isByRef) {
+  builder.create(loc, result, lhs);
+  builder.create(loc, lhs);
+} else {
+  builder.create(loc, result);
+}
+return;
+  }
+  // all arrays should have been boxed
+  if (auto boxTy = mlir::dyn_cast(ty)) {
+genBoxCombiner(builder, loc, redId, boxTy, lhs, rhs);
+return;
+  }
+
+  TODO(loc, "OpenMP genCombiner for unsupported reduction variable type");
+}
+
+static mlir::Value
+createReductionInitRegion(fir::FirOpBuilder &builder, mlir::Location loc,
+  const ReductionProcessor::ReductionIdentifier redId,
+  mlir::Type type, bool isByRef) {
+  mlir::Type ty = fir::unwrapRefType(type);
+  mlir::Value initValue = ReductionProcessor::getReductionInitValue(
+  loc, fir::unwrapSeqOrBoxedSeqType(ty), redId, builder);
+
+  if (fir::isa_trivial(ty)) {
+if (isByRef) {
+  mlir::Value alloca = builder.create(loc, ty);
+  builder.createStoreWithConvert(loc, initValue, alloca);
+  return alloca;
+}
+// by val
+return initValue;
+  }
+
+  // all arrays are boxed
+  if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
+assert(isByRef && "passing arrays by value is unsupported");
+// TODO: support allocatable arrays: !fir.box>>
+mlir::Type innerTy = fir::extractSequenceType(boxTy);
+if (!mlir::isa(innerTy))
+  

[llvm-branch-commits] [flang] [flang][OpenMP] lower simple array reductions (PR #84958)

2024-03-18 Thread Tom Eccles via llvm-branch-commits


@@ -0,0 +1,74 @@
+! RUN: bbc -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s
+! RUN: %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s
+
+program reduce
+integer, dimension(3) :: i = 0
+
+!$omp parallel reduction(+:i)
+i(1) = 1
+i(2) = 2
+i(3) = 3

tblah wrote:

I was just trying to create a test with a very simple body. Do you want a test 
with the reduction operation in the body as well?

https://github.com/llvm/llvm-project/pull/84958
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][OpenMP] lower simple array reductions (PR #84958)

2024-03-18 Thread Tom Eccles via llvm-branch-commits


@@ -92,10 +93,42 @@ std::string 
ReductionProcessor::getReductionName(llvm::StringRef name,
   if (isByRef)
 byrefAddition = "_byref";
 
-  return (llvm::Twine(name) +
-  (ty.isIntOrIndex() ? llvm::Twine("_i_") : llvm::Twine("_f_")) +
-  llvm::Twine(ty.getIntOrFloatBitWidth()) + byrefAddition)
-  .str();
+  if (fir::isa_trivial(ty))
+return (llvm::Twine(name) +
+(ty.isIntOrIndex() ? llvm::Twine("_i_") : llvm::Twine("_f_")) +
+llvm::Twine(ty.getIntOrFloatBitWidth()) + byrefAddition)
+.str();
+
+  // creates a name like reduction_i_64_box_ux4x3
+  if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
+// TODO: support for allocatable boxes:
+// !fir.box>>
+fir::SequenceType seqTy = fir::unwrapRefType(boxTy.getEleTy())
+  .dyn_cast_or_null();
+if (!seqTy)
+  return {};
+
+std::string prefix = getReductionName(
+name, fir::unwrapSeqOrBoxedSeqType(ty), /*isByRef=*/false);
+if (prefix.empty())
+  return {};
+std::stringstream tyStr;
+tyStr << prefix << "_box_";
+bool first = true;
+for (std::int64_t extent : seqTy.getShape()) {
+  if (first)
+first = false;
+  else
+tyStr << "x";
+  if (extent == seqTy.getUnknownExtent())
+tyStr << 'u'; // I'm not sure that '?' is safe in symbol names
+  else
+tyStr << extent;
+}
+return (tyStr.str() + byrefAddition).str();
+  }
+
+  return {};

tblah wrote:

Yeah that looks much better, thanks! I will do it in a separate patch because I 
think it might change existing names.

https://github.com/llvm/llvm-project/pull/84958
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][OpenMP] lower simple array reductions (PR #84958)

2024-03-18 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah updated https://github.com/llvm/llvm-project/pull/84958

>From bd668cd95d95d1e5b9c8436875c14878c98902ff Mon Sep 17 00:00:00 2001
From: Tom Eccles 
Date: Mon, 12 Feb 2024 14:03:00 +
Subject: [PATCH 1/4] [flang][OpenMP] lower simple array reductions

This has been tested with arrays with compile-time constant bounds.
Allocatable arrays and arrays with non-constant bounds are not yet
supported. User-defined reduction functions are also not yet supported.

The design is intended to work for arrays with non-constant bounds too
without a lot of extra work (mostly there are bugs in OpenMPIRBuilder I
haven't fixed yet).

We need some way to get these runtime bounds into the reduction init and
combiner regions. To keep things simple for now I opted to always box the
array arguments so the box can be passed as one argument and the lower
bounds and extents read from the box. This has the disadvantage of
resulting in fir.box_dim operations inside of the critical section. If
these prove to be a performance issue, we could follow OpenACC reading
box lower bounds and extents before the reduction and passing them as
block arguments to the reduction init and combiner regions. I would
prefer to keep things simple for now.

Note: this implementation only works when the HLFIR lowering is used. I
don't think it is worth supporting FIR-only lowering because the plan is
for that to be removed soon.
---
 flang/lib/Lower/OpenMP/ReductionProcessor.cpp | 283 ++
 flang/lib/Lower/OpenMP/ReductionProcessor.h   |   2 +-
 .../Lower/OpenMP/Todo/reduction-arrays.f90|  15 -
 .../Lower/OpenMP/parallel-reduction-array.f90 |  74 +
 .../Lower/OpenMP/wsloop-reduction-array.f90   |  84 ++
 5 files changed, 389 insertions(+), 69 deletions(-)
 delete mode 100644 flang/test/Lower/OpenMP/Todo/reduction-arrays.f90
 create mode 100644 flang/test/Lower/OpenMP/parallel-reduction-array.f90
 create mode 100644 flang/test/Lower/OpenMP/wsloop-reduction-array.f90

diff --git a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp 
b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
index e6a63dd4b939ce..de2f11f5b9512e 100644
--- a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
@@ -13,6 +13,7 @@
 #include "ReductionProcessor.h"
 
 #include "flang/Lower/AbstractConverter.h"
+#include "flang/Optimizer/Builder/HLFIRTools.h"
 #include "flang/Optimizer/Builder/Todo.h"
 #include "flang/Optimizer/Dialect/FIRType.h"
 #include "flang/Optimizer/HLFIR/HLFIROps.h"
@@ -92,10 +93,42 @@ std::string 
ReductionProcessor::getReductionName(llvm::StringRef name,
   if (isByRef)
 byrefAddition = "_byref";
 
-  return (llvm::Twine(name) +
-  (ty.isIntOrIndex() ? llvm::Twine("_i_") : llvm::Twine("_f_")) +
-  llvm::Twine(ty.getIntOrFloatBitWidth()) + byrefAddition)
-  .str();
+  if (fir::isa_trivial(ty))
+return (llvm::Twine(name) +
+(ty.isIntOrIndex() ? llvm::Twine("_i_") : llvm::Twine("_f_")) +
+llvm::Twine(ty.getIntOrFloatBitWidth()) + byrefAddition)
+.str();
+
+  // creates a name like reduction_i_64_box_ux4x3
+  if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
+// TODO: support for allocatable boxes:
+// !fir.box>>
+fir::SequenceType seqTy = fir::unwrapRefType(boxTy.getEleTy())
+  .dyn_cast_or_null();
+if (!seqTy)
+  return {};
+
+std::string prefix = getReductionName(
+name, fir::unwrapSeqOrBoxedSeqType(ty), /*isByRef=*/false);
+if (prefix.empty())
+  return {};
+std::stringstream tyStr;
+tyStr << prefix << "_box_";
+bool first = true;
+for (std::int64_t extent : seqTy.getShape()) {
+  if (first)
+first = false;
+  else
+tyStr << "x";
+  if (extent == seqTy.getUnknownExtent())
+tyStr << 'u'; // I'm not sure that '?' is safe in symbol names
+  else
+tyStr << extent;
+}
+return (tyStr.str() + byrefAddition).str();
+  }
+
+  return {};
 }
 
 std::string ReductionProcessor::getReductionName(
@@ -283,6 +316,156 @@ mlir::Value ReductionProcessor::createScalarCombiner(
   return reductionOp;
 }
 
+/// Create reduction combiner region for reduction variables which are boxed
+/// arrays
+static void genBoxCombiner(fir::FirOpBuilder &builder, mlir::Location loc,
+   ReductionProcessor::ReductionIdentifier redId,
+   fir::BaseBoxType boxTy, mlir::Value lhs,
+   mlir::Value rhs) {
+  fir::SequenceType seqTy =
+  mlir::dyn_cast_or_null(boxTy.getEleTy());
+  // TODO: support allocatable arrays: !fir.box>>
+  if (!seqTy || seqTy.hasUnknownShape())
+TODO(loc, "Unsupported boxed type in OpenMP reduction");
+
+  // load fir.ref>
+  mlir::Value lhsAddr = lhs;
+  lhs = builder.create(loc, lhs);
+  rhs = builder.create(loc, rhs);
+
+  const unsigned rank = seqTy.getDimension();
+  llvm::SmallVector extents;
+  extents.reserve(

[llvm-branch-commits] [flang] [flang][OpenMP] lower simple array reductions (PR #84958)

2024-03-18 Thread Kiran Chandramohan via llvm-branch-commits


@@ -92,10 +93,42 @@ std::string 
ReductionProcessor::getReductionName(llvm::StringRef name,
   if (isByRef)
 byrefAddition = "_byref";
 
-  return (llvm::Twine(name) +
-  (ty.isIntOrIndex() ? llvm::Twine("_i_") : llvm::Twine("_f_")) +
-  llvm::Twine(ty.getIntOrFloatBitWidth()) + byrefAddition)
-  .str();
+  if (fir::isa_trivial(ty))
+return (llvm::Twine(name) +
+(ty.isIntOrIndex() ? llvm::Twine("_i_") : llvm::Twine("_f_")) +
+llvm::Twine(ty.getIntOrFloatBitWidth()) + byrefAddition)
+.str();
+
+  // creates a name like reduction_i_64_box_ux4x3
+  if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
+// TODO: support for allocatable boxes:
+// !fir.box>>
+fir::SequenceType seqTy = fir::unwrapRefType(boxTy.getEleTy())
+  .dyn_cast_or_null();
+if (!seqTy)
+  return {};
+
+std::string prefix = getReductionName(
+name, fir::unwrapSeqOrBoxedSeqType(ty), /*isByRef=*/false);
+if (prefix.empty())
+  return {};
+std::stringstream tyStr;
+tyStr << prefix << "_box_";
+bool first = true;
+for (std::int64_t extent : seqTy.getShape()) {
+  if (first)
+first = false;
+  else
+tyStr << "x";
+  if (extent == seqTy.getUnknownExtent())
+tyStr << 'u'; // I'm not sure that '?' is safe in symbol names
+  else
+tyStr << extent;
+}
+return (tyStr.str() + byrefAddition).str();
+  }
+
+  return {};

kiranchandramohan wrote:

OK, that is fine.

https://github.com/llvm/llvm-project/pull/84958
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][OpenMP] lower simple array reductions (PR #84958)

2024-03-18 Thread Kiran Chandramohan via llvm-branch-commits


@@ -0,0 +1,74 @@
+! RUN: bbc -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s
+! RUN: %flang_fc1 -emit-hlfir -fopenmp -o - %s 2>&1 | FileCheck %s
+
+program reduce
+integer, dimension(3) :: i = 0
+
+!$omp parallel reduction(+:i)
+i(1) = 1
+i(2) = 2
+i(3) = 3

kiranchandramohan wrote:

Yes, it would be good to have the test match the default way that it will be 
used.

https://github.com/llvm/llvm-project/pull/84958
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][OpenMP] lower simple array reductions (PR #84958)

2024-03-18 Thread Kiran Chandramohan via llvm-branch-commits


@@ -283,13 +316,166 @@ mlir::Value ReductionProcessor::createScalarCombiner(
   return reductionOp;
 }
 
+/// Create reduction combiner region for reduction variables which are boxed
+/// arrays
+static void genBoxCombiner(fir::FirOpBuilder &builder, mlir::Location loc,
+   ReductionProcessor::ReductionIdentifier redId,
+   fir::BaseBoxType boxTy, mlir::Value lhs,
+   mlir::Value rhs) {
+  fir::SequenceType seqTy =
+  mlir::dyn_cast_or_null(boxTy.getEleTy());
+  // TODO: support allocatable arrays: !fir.box>>
+  if (!seqTy || seqTy.hasUnknownShape())
+TODO(loc, "Unsupported boxed type in OpenMP reduction");
+
+  // load fir.ref>
+  mlir::Value lhsAddr = lhs;
+  lhs = builder.create(loc, lhs);
+  rhs = builder.create(loc, rhs);
+
+  const unsigned rank = seqTy.getDimension();
+  llvm::SmallVector extents;
+  extents.reserve(rank);
+  llvm::SmallVector lbAndExtents;
+  lbAndExtents.reserve(rank * 2);
+
+  // Get box lowerbounds and extents:
+  mlir::Type idxTy = builder.getIndexType();
+  for (unsigned i = 0; i < rank; ++i) {
+// TODO: ideally we want to hoist box reads out of the critical section.
+// We could do this by having box dimensions in block arguments like
+// OpenACC does
+mlir::Value dim = builder.createIntegerConstant(loc, idxTy, i);
+auto dimInfo =
+builder.create(loc, idxTy, idxTy, idxTy, lhs, dim);
+extents.push_back(dimInfo.getExtent());
+lbAndExtents.push_back(dimInfo.getLowerBound());
+lbAndExtents.push_back(dimInfo.getExtent());
+  }
+
+  auto shapeShiftTy = fir::ShapeShiftType::get(builder.getContext(), rank);
+  auto shapeShift =
+  builder.create(loc, shapeShiftTy, lbAndExtents);
+
+  // Iterate over array elements, applying the equivalent scalar reduction:
+
+  // A hlfir::elemental here gets inlined with a temporary so create the
+  // loop nest directly.
+  // This function already controls all of the code in this region so we
+  // know this won't miss any opportuinties for clever elemental inlining
+  hlfir::LoopNest nest =
+  hlfir::genLoopNest(loc, builder, extents, /*isUnordered=*/true);
+  builder.setInsertionPointToStart(nest.innerLoop.getBody());
+  mlir::Type refTy = fir::ReferenceType::get(seqTy.getEleTy());
+  auto lhsEleAddr = builder.create(
+  loc, refTy, lhs, shapeShift, /*slice=*/mlir::Value{},
+  nest.oneBasedIndices, /*typeparms=*/mlir::ValueRange{});
+  auto rhsEleAddr = builder.create(
+  loc, refTy, rhs, shapeShift, /*slice=*/mlir::Value{},
+  nest.oneBasedIndices, /*typeparms=*/mlir::ValueRange{});
+  auto lhsEle = builder.create(loc, lhsEleAddr);
+  auto rhsEle = builder.create(loc, rhsEleAddr);
+  mlir::Value scalarReduction = ReductionProcessor::createScalarCombiner(
+  builder, loc, redId, refTy, lhsEle, rhsEle);
+  builder.create(loc, scalarReduction, lhsEleAddr);
+
+  builder.setInsertionPointAfter(nest.outerLoop);
+  builder.create(loc, lhsAddr);
+}
+
+// generate combiner region for reduction operations
+static void genCombiner(fir::FirOpBuilder &builder, mlir::Location loc,
+ReductionProcessor::ReductionIdentifier redId,
+mlir::Type ty, mlir::Value lhs, mlir::Value rhs,
+bool isByRef) {
+  ty = fir::unwrapRefType(ty);
+
+  if (fir::isa_trivial(ty)) {
+mlir::Value lhsLoaded = builder.loadIfRef(loc, lhs);
+mlir::Value rhsLoaded = builder.loadIfRef(loc, rhs);
+
+mlir::Value result = ReductionProcessor::createScalarCombiner(
+builder, loc, redId, ty, lhsLoaded, rhsLoaded);
+if (isByRef) {
+  builder.create(loc, result, lhs);
+  builder.create(loc, lhs);
+} else {
+  builder.create(loc, result);
+}
+return;
+  }
+  // all arrays should have been boxed
+  if (auto boxTy = mlir::dyn_cast(ty)) {
+genBoxCombiner(builder, loc, redId, boxTy, lhs, rhs);
+return;
+  }
+
+  TODO(loc, "OpenMP genCombiner for unsupported reduction variable type");
+}
+
+static mlir::Value
+createReductionInitRegion(fir::FirOpBuilder &builder, mlir::Location loc,
+  const ReductionProcessor::ReductionIdentifier redId,
+  mlir::Type type, bool isByRef) {
+  mlir::Type ty = fir::unwrapRefType(type);
+  mlir::Value initValue = ReductionProcessor::getReductionInitValue(
+  loc, fir::unwrapSeqOrBoxedSeqType(ty), redId, builder);
+
+  if (fir::isa_trivial(ty)) {
+if (isByRef) {
+  mlir::Value alloca = builder.create(loc, ty);
+  builder.createStoreWithConvert(loc, initValue, alloca);
+  return alloca;
+}
+// by val
+return initValue;
+  }
+
+  // all arrays are boxed
+  if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
+assert(isByRef && "passing arrays by value is unsupported");
+// TODO: support allocatable arrays: !fir.box>>
+mlir::Type innerTy = fir::extractSequenceType(boxTy);
+if (!mlir::isa(innerTy))
+  

[llvm-branch-commits] [flang] [flang][OpenMP] lower simple array reductions (PR #84958)

2024-03-18 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah updated https://github.com/llvm/llvm-project/pull/84958

>From bd668cd95d95d1e5b9c8436875c14878c98902ff Mon Sep 17 00:00:00 2001
From: Tom Eccles 
Date: Mon, 12 Feb 2024 14:03:00 +
Subject: [PATCH 1/5] [flang][OpenMP] lower simple array reductions

This has been tested with arrays with compile-time constant bounds.
Allocatable arrays and arrays with non-constant bounds are not yet
supported. User-defined reduction functions are also not yet supported.

The design is intended to work for arrays with non-constant bounds too
without a lot of extra work (mostly there are bugs in OpenMPIRBuilder I
haven't fixed yet).

We need some way to get these runtime bounds into the reduction init and
combiner regions. To keep things simple for now I opted to always box the
array arguments so the box can be passed as one argument and the lower
bounds and extents read from the box. This has the disadvantage of
resulting in fir.box_dim operations inside of the critical section. If
these prove to be a performance issue, we could follow OpenACC reading
box lower bounds and extents before the reduction and passing them as
block arguments to the reduction init and combiner regions. I would
prefer to keep things simple for now.

Note: this implementation only works when the HLFIR lowering is used. I
don't think it is worth supporting FIR-only lowering because the plan is
for that to be removed soon.
---
 flang/lib/Lower/OpenMP/ReductionProcessor.cpp | 283 ++
 flang/lib/Lower/OpenMP/ReductionProcessor.h   |   2 +-
 .../Lower/OpenMP/Todo/reduction-arrays.f90|  15 -
 .../Lower/OpenMP/parallel-reduction-array.f90 |  74 +
 .../Lower/OpenMP/wsloop-reduction-array.f90   |  84 ++
 5 files changed, 389 insertions(+), 69 deletions(-)
 delete mode 100644 flang/test/Lower/OpenMP/Todo/reduction-arrays.f90
 create mode 100644 flang/test/Lower/OpenMP/parallel-reduction-array.f90
 create mode 100644 flang/test/Lower/OpenMP/wsloop-reduction-array.f90

diff --git a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp 
b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
index e6a63dd4b939ce..de2f11f5b9512e 100644
--- a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
@@ -13,6 +13,7 @@
 #include "ReductionProcessor.h"
 
 #include "flang/Lower/AbstractConverter.h"
+#include "flang/Optimizer/Builder/HLFIRTools.h"
 #include "flang/Optimizer/Builder/Todo.h"
 #include "flang/Optimizer/Dialect/FIRType.h"
 #include "flang/Optimizer/HLFIR/HLFIROps.h"
@@ -92,10 +93,42 @@ std::string 
ReductionProcessor::getReductionName(llvm::StringRef name,
   if (isByRef)
 byrefAddition = "_byref";
 
-  return (llvm::Twine(name) +
-  (ty.isIntOrIndex() ? llvm::Twine("_i_") : llvm::Twine("_f_")) +
-  llvm::Twine(ty.getIntOrFloatBitWidth()) + byrefAddition)
-  .str();
+  if (fir::isa_trivial(ty))
+return (llvm::Twine(name) +
+(ty.isIntOrIndex() ? llvm::Twine("_i_") : llvm::Twine("_f_")) +
+llvm::Twine(ty.getIntOrFloatBitWidth()) + byrefAddition)
+.str();
+
+  // creates a name like reduction_i_64_box_ux4x3
+  if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
+// TODO: support for allocatable boxes:
+// !fir.box>>
+fir::SequenceType seqTy = fir::unwrapRefType(boxTy.getEleTy())
+  .dyn_cast_or_null();
+if (!seqTy)
+  return {};
+
+std::string prefix = getReductionName(
+name, fir::unwrapSeqOrBoxedSeqType(ty), /*isByRef=*/false);
+if (prefix.empty())
+  return {};
+std::stringstream tyStr;
+tyStr << prefix << "_box_";
+bool first = true;
+for (std::int64_t extent : seqTy.getShape()) {
+  if (first)
+first = false;
+  else
+tyStr << "x";
+  if (extent == seqTy.getUnknownExtent())
+tyStr << 'u'; // I'm not sure that '?' is safe in symbol names
+  else
+tyStr << extent;
+}
+return (tyStr.str() + byrefAddition).str();
+  }
+
+  return {};
 }
 
 std::string ReductionProcessor::getReductionName(
@@ -283,6 +316,156 @@ mlir::Value ReductionProcessor::createScalarCombiner(
   return reductionOp;
 }
 
+/// Create reduction combiner region for reduction variables which are boxed
+/// arrays
+static void genBoxCombiner(fir::FirOpBuilder &builder, mlir::Location loc,
+   ReductionProcessor::ReductionIdentifier redId,
+   fir::BaseBoxType boxTy, mlir::Value lhs,
+   mlir::Value rhs) {
+  fir::SequenceType seqTy =
+  mlir::dyn_cast_or_null(boxTy.getEleTy());
+  // TODO: support allocatable arrays: !fir.box>>
+  if (!seqTy || seqTy.hasUnknownShape())
+TODO(loc, "Unsupported boxed type in OpenMP reduction");
+
+  // load fir.ref>
+  mlir::Value lhsAddr = lhs;
+  lhs = builder.create(loc, lhs);
+  rhs = builder.create(loc, rhs);
+
+  const unsigned rank = seqTy.getDimension();
+  llvm::SmallVector extents;
+  extents.reserve(

[llvm-branch-commits] [clang] release/18.x: Reland Print library module manifest path again (#84881) (PR #85637)

2024-03-18 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/85637
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/18.x: Reland Print library module manifest path again (#84881) (PR #85637)

2024-03-18 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/85637

Backport 0e1e1fc8f0e

Requested by: @ChuanqiXu9

>From a2e558cdad8855eab19aa90854d29c6510a22eb0 Mon Sep 17 00:00:00 2001
From: Chuanqi Xu 
Date: Sun, 17 Mar 2024 21:14:53 +0800
Subject: [PATCH] Reland Print library module manifest path again  (#84881)

Following of https://github.com/llvm/llvm-project/pull/82160

The reason why the above PR fails is that the `--sysroot` has lower
priority than the libc++ built from the same source. On the one hand, it
matches the codes behavior. We will add the built libc++ project paths
in the ToolChain class. But we will only add the path related to sysroot
in Linux class, which is derived from the ToolChain classes. So the
paths of just built libc++ is in the front of the paths relative to
sysroot. On the other hand, the behavior should be good from the higher
level. Since the just built libc++ has the same version number with the
just built clang, so it makes sense that these 2 compilers just matches.

So for patch it self, I hacked it by using resource dir in the test
since the resource dir has the higher priority, which is not strongly
correct since we won't do that in practice.

@kaz7 would you like to test on your environment to avoid this get
reverted again?

On the libc++ side, it shows that it lacks a `modules.json` file for the
just built libc++ directory. If we don't have that, it will be
problematic to use std modules from the just built clang and libc++
pair. Then it is not good. And I feel it may be problematic for future
compiler/standard library developers. So I feel this is somewhat a
libc++ issue that need to be fixed.

Also if we don't like the hacked test in the current patch, we must wait
for libc++ to fix this to proceed. But I feel this is somewhat odd since
the test of clang shouldn't dependent on libc++.

CC: @mordante

-

Co-authored-by: Mark de Wever 
(cherry picked from commit 0e1e1fc8f0eae6ebdce40ef2154aa90e35e5bed7)
---
 clang/include/clang/Driver/Driver.h   | 10 +
 clang/include/clang/Driver/Options.td |  3 ++
 clang/lib/Driver/Driver.cpp   | 44 +++
 ...les-print-library-module-manifest-path.cpp | 36 +++
 4 files changed, 93 insertions(+)
 create mode 100644 
clang/test/Driver/modules-print-library-module-manifest-path.cpp

diff --git a/clang/include/clang/Driver/Driver.h 
b/clang/include/clang/Driver/Driver.h
index 3ee1bcf2a69c9b..595f104e406d37 100644
--- a/clang/include/clang/Driver/Driver.h
+++ b/clang/include/clang/Driver/Driver.h
@@ -602,6 +602,16 @@ class Driver {
   // FIXME: This should be in CompilationInfo.
   std::string GetProgramPath(StringRef Name, const ToolChain &TC) const;
 
+  /// Lookup the path to the Standard library module manifest.
+  ///
+  /// \param C - The compilation.
+  /// \param TC - The tool chain for additional information on
+  /// directories to search.
+  //
+  // FIXME: This should be in CompilationInfo.
+  std::string GetStdModuleManifestPath(const Compilation &C,
+   const ToolChain &TC) const;
+
   /// HandleAutocompletions - Handle --autocomplete by searching and printing
   /// possible flags, descriptions, and its arguments.
   void HandleAutocompletions(StringRef PassedFlags) const;
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 175bedbfb4d01c..9b1c7cf6b8f35c 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -5320,6 +5320,9 @@ def print_resource_dir : Flag<["-", "--"], 
"print-resource-dir">,
 def print_search_dirs : Flag<["-", "--"], "print-search-dirs">,
   HelpText<"Print the paths used for finding libraries and programs">,
   Visibility<[ClangOption, CLOption]>;
+def print_std_module_manifest_path : Flag<["-", "--"], 
"print-library-module-manifest-path">,
+  HelpText<"Print the path for the C++ Standard library module manifest">,
+  Visibility<[ClangOption, CLOption]>;
 def print_targets : Flag<["-", "--"], "print-targets">,
   HelpText<"Print the registered targets">,
   Visibility<[ClangOption, CLOption]>;
diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp
index 93cddf742d521d..dee5446a50ea59 100644
--- a/clang/lib/Driver/Driver.cpp
+++ b/clang/lib/Driver/Driver.cpp
@@ -2194,6 +2194,12 @@ bool Driver::HandleImmediateArgs(const Compilation &C) {
 return false;
   }
 
+  if (C.getArgs().hasArg(options::OPT_print_std_module_manifest_path)) {
+llvm::outs() << GetStdModuleManifestPath(C, C.getDefaultToolChain())
+ << '\n';
+return false;
+  }
+
   if (C.getArgs().hasArg(options::OPT_print_runtime_dir)) {
 if (std::optional RuntimePath = TC.getRuntimePath())
   llvm::outs() << *RuntimePath << '\n';
@@ -6166,6 +6172,44 @@ std::string Driver::GetProgramPath(StringRef Name, const 
ToolChain &TC) const {
   return std::string(Name);
 }
 
+std::string Driver

[llvm-branch-commits] [clang] release/18.x: Reland Print library module manifest path again (#84881) (PR #85637)

2024-03-18 Thread via llvm-branch-commits

llvmbot wrote:

@mordante What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/85637
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/18.x: Reland Print library module manifest path again (#84881) (PR #85637)

2024-03-18 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang-driver

Author: None (llvmbot)


Changes

Backport 0e1e1fc8f0e

Requested by: @ChuanqiXu9

---
Full diff: https://github.com/llvm/llvm-project/pull/85637.diff


4 Files Affected:

- (modified) clang/include/clang/Driver/Driver.h (+10) 
- (modified) clang/include/clang/Driver/Options.td (+3) 
- (modified) clang/lib/Driver/Driver.cpp (+44) 
- (added) clang/test/Driver/modules-print-library-module-manifest-path.cpp 
(+36) 


``diff
diff --git a/clang/include/clang/Driver/Driver.h 
b/clang/include/clang/Driver/Driver.h
index 3ee1bcf2a69c9b..595f104e406d37 100644
--- a/clang/include/clang/Driver/Driver.h
+++ b/clang/include/clang/Driver/Driver.h
@@ -602,6 +602,16 @@ class Driver {
   // FIXME: This should be in CompilationInfo.
   std::string GetProgramPath(StringRef Name, const ToolChain &TC) const;
 
+  /// Lookup the path to the Standard library module manifest.
+  ///
+  /// \param C - The compilation.
+  /// \param TC - The tool chain for additional information on
+  /// directories to search.
+  //
+  // FIXME: This should be in CompilationInfo.
+  std::string GetStdModuleManifestPath(const Compilation &C,
+   const ToolChain &TC) const;
+
   /// HandleAutocompletions - Handle --autocomplete by searching and printing
   /// possible flags, descriptions, and its arguments.
   void HandleAutocompletions(StringRef PassedFlags) const;
diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 175bedbfb4d01c..9b1c7cf6b8f35c 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -5320,6 +5320,9 @@ def print_resource_dir : Flag<["-", "--"], 
"print-resource-dir">,
 def print_search_dirs : Flag<["-", "--"], "print-search-dirs">,
   HelpText<"Print the paths used for finding libraries and programs">,
   Visibility<[ClangOption, CLOption]>;
+def print_std_module_manifest_path : Flag<["-", "--"], 
"print-library-module-manifest-path">,
+  HelpText<"Print the path for the C++ Standard library module manifest">,
+  Visibility<[ClangOption, CLOption]>;
 def print_targets : Flag<["-", "--"], "print-targets">,
   HelpText<"Print the registered targets">,
   Visibility<[ClangOption, CLOption]>;
diff --git a/clang/lib/Driver/Driver.cpp b/clang/lib/Driver/Driver.cpp
index 93cddf742d521d..dee5446a50ea59 100644
--- a/clang/lib/Driver/Driver.cpp
+++ b/clang/lib/Driver/Driver.cpp
@@ -2194,6 +2194,12 @@ bool Driver::HandleImmediateArgs(const Compilation &C) {
 return false;
   }
 
+  if (C.getArgs().hasArg(options::OPT_print_std_module_manifest_path)) {
+llvm::outs() << GetStdModuleManifestPath(C, C.getDefaultToolChain())
+ << '\n';
+return false;
+  }
+
   if (C.getArgs().hasArg(options::OPT_print_runtime_dir)) {
 if (std::optional RuntimePath = TC.getRuntimePath())
   llvm::outs() << *RuntimePath << '\n';
@@ -6166,6 +6172,44 @@ std::string Driver::GetProgramPath(StringRef Name, const 
ToolChain &TC) const {
   return std::string(Name);
 }
 
+std::string Driver::GetStdModuleManifestPath(const Compilation &C,
+ const ToolChain &TC) const {
+  std::string error = "";
+
+  switch (TC.GetCXXStdlibType(C.getArgs())) {
+  case ToolChain::CST_Libcxx: {
+std::string lib = GetFilePath("libc++.so", TC);
+
+// Note when there are multiple flavours of libc++ the module json needs to
+// look at the command-line arguments for the proper json.
+// These flavours do not exist at the moment, but there are plans to
+// provide a variant that is built with sanitizer instrumentation enabled.
+
+// For example
+//  StringRef modules = [&] {
+//const SanitizerArgs &Sanitize = TC.getSanitizerArgs(C.getArgs());
+//if (Sanitize.needsAsanRt())
+//  return "modules-asan.json";
+//return "modules.json";
+//  }();
+
+SmallString<128> path(lib.begin(), lib.end());
+llvm::sys::path::remove_filename(path);
+llvm::sys::path::append(path, "modules.json");
+if (TC.getVFS().exists(path))
+  return static_cast(path);
+
+return error;
+  }
+
+  case ToolChain::CST_Libstdcxx:
+// libstdc++ does not provide Standard library modules yet.
+return error;
+  }
+
+  return error;
+}
+
 std::string Driver::GetTemporaryPath(StringRef Prefix, StringRef Suffix) const 
{
   SmallString<128> Path;
   std::error_code EC = llvm::sys::fs::createTemporaryFile(Prefix, Suffix, 
Path);
diff --git a/clang/test/Driver/modules-print-library-module-manifest-path.cpp 
b/clang/test/Driver/modules-print-library-module-manifest-path.cpp
new file mode 100644
index 00..24797002b80f53
--- /dev/null
+++ b/clang/test/Driver/modules-print-library-module-manifest-path.cpp
@@ -0,0 +1,36 @@
+// Test that -print-library-module-manifest-path finds the correct file.
+
+// RUN: rm -rf %t && split-file %s %t && cd %t
+// RUN: mkdir -p %t/Inputs/usr/li

[llvm-branch-commits] [flang] [flang][OpenMP] lower simple array reductions (PR #84958)

2024-03-18 Thread Kiran Chandramohan via llvm-branch-commits

https://github.com/kiranchandramohan approved this pull request.

For compile-time constant bounds we should switch to a non-box version sometime 
later.

Thanks for the changes.

https://github.com/llvm/llvm-project/pull/84958
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][OpenMP] lower simple array reductions (PR #84958)

2024-03-18 Thread Kiran Chandramohan via llvm-branch-commits


@@ -1108,6 +1108,35 @@ hlfir::createTempFromMold(mlir::Location loc, 
fir::FirOpBuilder &builder,
   return {hlfir::Entity{declareOp.getBase()}, isHeapAlloc};
 }
 
+hlfir::Entity hlfir::createStackTempFromMold(mlir::Location loc,

kiranchandramohan wrote:

@ergawy FYI: This patch has a `createStackTempFromMold` which might be helpful.

https://github.com/llvm/llvm-project/pull/84958
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][OpenMP] lower simple array reductions (PR #84958)

2024-03-18 Thread Kiran Chandramohan via llvm-branch-commits

https://github.com/kiranchandramohan edited 
https://github.com/llvm/llvm-project/pull/84958
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [PowerPC] Update chain uses when emitting lxsizx (#84892) (PR #85648)

2024-03-18 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/85648

Backport e5b20c83e5ba25e6e0650df30352ce54c2f6ea2f

Requested by: @nikic

>From 480314c4fc429df0b6c0496aa92bf91a8552a2f3 Mon Sep 17 00:00:00 2001
From: Qiu Chaofan 
Date: Mon, 18 Mar 2024 22:31:05 +0800
Subject: [PATCH] [PowerPC] Update chain uses when emitting lxsizx (#84892)

(cherry picked from commit e5b20c83e5ba25e6e0650df30352ce54c2f6ea2f)
---
 llvm/lib/Target/PowerPC/PPCISelLowering.cpp   |  1 +
 .../CodeGen/PowerPC/scalar-double-ldst.ll | 58 +++
 .../test/CodeGen/PowerPC/scalar-float-ldst.ll | 58 +++
 3 files changed, 117 insertions(+)

diff --git a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp 
b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
index aec58d1c0dcb9f..85f1e670045b92 100644
--- a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -14942,6 +14942,7 @@ SDValue PPCTargetLowering::combineFPToIntToFP(SDNode *N,
 SDValue Ld = DAG.getMemIntrinsicNode(PPCISD::LXSIZX, dl,
  DAG.getVTList(MVT::f64, MVT::Other),
  Ops, MVT::i8, LDN->getMemOperand());
+DAG.makeEquivalentMemoryOrdering(LDN, Ld);
 
 // For signed conversion, we need to sign-extend the value in the VSR
 if (Signed) {
diff --git a/llvm/test/CodeGen/PowerPC/scalar-double-ldst.ll 
b/llvm/test/CodeGen/PowerPC/scalar-double-ldst.ll
index 6f68679325c579..798637b6840f1e 100644
--- a/llvm/test/CodeGen/PowerPC/scalar-double-ldst.ll
+++ b/llvm/test/CodeGen/PowerPC/scalar-double-ldst.ll
@@ -7281,3 +7281,61 @@ entry:
   store double %str, ptr inttoptr (i64 1 to ptr), align 4096
   ret void
 }
+
+define dso_local void @st_reversed_double_from_i8(ptr %ptr) {
+; CHECK-P10-LABEL: st_reversed_double_from_i8:
+; CHECK-P10:   # %bb.0: # %entry
+; CHECK-P10-NEXT:li r4, 8
+; CHECK-P10-NEXT:lxsibzx f0, 0, r3
+; CHECK-P10-NEXT:xxspltidp vs2, -1023410176
+; CHECK-P10-NEXT:lxsibzx f1, r3, r4
+; CHECK-P10-NEXT:xscvuxddp f0, f0
+; CHECK-P10-NEXT:xscvuxddp f1, f1
+; CHECK-P10-NEXT:xsadddp f0, f0, f2
+; CHECK-P10-NEXT:xsadddp f1, f1, f2
+; CHECK-P10-NEXT:stfd f1, 0(r3)
+; CHECK-P10-NEXT:stfd f0, 8(r3)
+; CHECK-P10-NEXT:blr
+;
+; CHECK-P9-LABEL: st_reversed_double_from_i8:
+; CHECK-P9:   # %bb.0: # %entry
+; CHECK-P9-NEXT:li r4, 8
+; CHECK-P9-NEXT:lxsibzx f0, 0, r3
+; CHECK-P9-NEXT:lxsibzx f1, r3, r4
+; CHECK-P9-NEXT:addis r4, r2, .LCPI300_0@toc@ha
+; CHECK-P9-NEXT:lfs f2, .LCPI300_0@toc@l(r4)
+; CHECK-P9-NEXT:xscvuxddp f0, f0
+; CHECK-P9-NEXT:xscvuxddp f1, f1
+; CHECK-P9-NEXT:xsadddp f0, f0, f2
+; CHECK-P9-NEXT:xsadddp f1, f1, f2
+; CHECK-P9-NEXT:stfd f0, 8(r3)
+; CHECK-P9-NEXT:stfd f1, 0(r3)
+; CHECK-P9-NEXT:blr
+;
+; CHECK-P8-LABEL: st_reversed_double_from_i8:
+; CHECK-P8:   # %bb.0: # %entry
+; CHECK-P8-NEXT:lbz r4, 0(r3)
+; CHECK-P8-NEXT:lbz r5, 8(r3)
+; CHECK-P8-NEXT:mtfprwz f0, r4
+; CHECK-P8-NEXT:mtfprwz f1, r5
+; CHECK-P8-NEXT:addis r4, r2, .LCPI300_0@toc@ha
+; CHECK-P8-NEXT:lfs f2, .LCPI300_0@toc@l(r4)
+; CHECK-P8-NEXT:xscvuxddp f0, f0
+; CHECK-P8-NEXT:xscvuxddp f1, f1
+; CHECK-P8-NEXT:xsadddp f0, f0, f2
+; CHECK-P8-NEXT:xsadddp f1, f1, f2
+; CHECK-P8-NEXT:stfd f1, 0(r3)
+; CHECK-P8-NEXT:stfd f0, 8(r3)
+; CHECK-P8-NEXT:blr
+entry:
+  %idx = getelementptr inbounds i8, ptr %ptr, i64 8
+  %i0 = load i8, ptr %ptr, align 1
+  %i1 = load i8, ptr %idx, align 1
+  %f0 = uitofp i8 %i0 to double
+  %f1 = uitofp i8 %i1 to double
+  %a0 = fadd double %f0, -1.28e+02
+  %a1 = fadd double %f1, -1.28e+02
+  store double %a1, ptr %ptr, align 8
+  store double %a0, ptr %idx, align 8
+  ret void
+}
diff --git a/llvm/test/CodeGen/PowerPC/scalar-float-ldst.ll 
b/llvm/test/CodeGen/PowerPC/scalar-float-ldst.ll
index 824dd4c4db6cb7..f3960573421298 100644
--- a/llvm/test/CodeGen/PowerPC/scalar-float-ldst.ll
+++ b/llvm/test/CodeGen/PowerPC/scalar-float-ldst.ll
@@ -7271,3 +7271,61 @@ entry:
   store double %conv, ptr inttoptr (i64 1 to ptr), align 4096
   ret void
 }
+
+define dso_local void @st_reversed_float_from_i8(ptr %ptr) {
+; CHECK-P10-LABEL: st_reversed_float_from_i8:
+; CHECK-P10:   # %bb.0: # %entry
+; CHECK-P10-NEXT:li r4, 8
+; CHECK-P10-NEXT:lxsibzx f0, 0, r3
+; CHECK-P10-NEXT:xxspltidp vs2, -1023410176
+; CHECK-P10-NEXT:lxsibzx f1, r3, r4
+; CHECK-P10-NEXT:xscvuxdsp f0, f0
+; CHECK-P10-NEXT:xscvuxdsp f1, f1
+; CHECK-P10-NEXT:xsaddsp f0, f0, f2
+; CHECK-P10-NEXT:xsaddsp f1, f1, f2
+; CHECK-P10-NEXT:stfs f0, 8(r3)
+; CHECK-P10-NEXT:stfs f1, 0(r3)
+; CHECK-P10-NEXT:blr
+;
+; CHECK-P9-LABEL: st_reversed_float_from_i8:
+; CHECK-P9:   # %bb.0: # %entry
+; CHECK-P9-NEXT:li r4, 8
+; CHECK-P9-NEXT:lxsibzx f0, 0, r3
+; CHECK-P9-NEXT:lxsibzx f1, r3, r4
+; CHECK-P9-NEXT:addi

[llvm-branch-commits] [llvm] release/18.x: [PowerPC] Update chain uses when emitting lxsizx (#84892) (PR #85648)

2024-03-18 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/85648
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [PowerPC] Update chain uses when emitting lxsizx (#84892) (PR #85648)

2024-03-18 Thread via llvm-branch-commits

llvmbot wrote:

@cuviper What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/85648
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [llvm][AArch64] Autoupgrade function attributes from Module attributes. (#82763) (PR #84039)

2024-03-18 Thread Nick Desaulniers via llvm-branch-commits

nickdesaulniers wrote:

We'll have a 3rd iteration of the patch to backport. This version should not be.

https://github.com/llvm/llvm-project/pull/84039
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [PowerPC] Update chain uses when emitting lxsizx (#84892) (PR #85648)

2024-03-18 Thread Josh Stone via llvm-branch-commits

cuviper wrote:

Fine by me -- I even tested the fix back on LLVM 17 for my original reproducer 
-- but I think the bot really wants to hear from the PR author @ecnelises.

https://github.com/llvm/llvm-project/pull/85648
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [PowerPC] Update chain uses when emitting lxsizx (#84892) (PR #85648)

2024-03-18 Thread Qiu Chaofan via llvm-branch-commits

https://github.com/ecnelises approved this pull request.

LGTM, thanks.

https://github.com/llvm/llvm-project/pull/85648
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [PowerPC] Update chain uses when emitting lxsizx (#84892) (PR #85648)

2024-03-18 Thread Josh Stone via llvm-branch-commits

https://github.com/cuviper approved this pull request.


https://github.com/llvm/llvm-project/pull/85648
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][OpenMP] simplify getReductionName (PR #85666)

2024-03-18 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-flang-fir-hlfir

Author: Tom Eccles (tblah)


Changes

Re-use fir::getTypeAsString instead of creating something new here. This spells 
integer names like i32 instead of i_32 so there is a lot of test churn.

The idea was suggested by Kiran: 
https://github.com/llvm/llvm-project/pull/84958#discussion_r1527604388

---

Patch is 113.34 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/85666.diff


37 Files Affected:

- (modified) flang/lib/Lower/OpenMP/ReductionProcessor.cpp (+21-52) 
- (modified) flang/lib/Lower/OpenMP/ReductionProcessor.h (+4-3) 
- (modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-add-byref.f90 
(+11-11) 
- (modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-add.f90 (+11-11) 
- (modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-iand-byref.f90 (+2-2) 
- (modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-ieor-byref.f90 (+2-2) 
- (modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-ior-byref.f90 (+2-2) 
- (modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-max-byref.f90 (+4-4) 
- (modified) flang/test/Lower/OpenMP/FIR/wsloop-reduction-min-byref.f90 (+4-4) 
- (modified) flang/test/Lower/OpenMP/default-clause-byref.f90 (+1-1) 
- (modified) flang/test/Lower/OpenMP/default-clause.f90 (+1-1) 
- (modified) flang/test/Lower/OpenMP/parallel-reduction-array.f90 (+2-2) 
- (modified) flang/test/Lower/OpenMP/parallel-reduction-array2.f90 (+2-2) 
- (modified) flang/test/Lower/OpenMP/parallel-wsloop-reduction-byref.f90 (+1-1) 
- (modified) flang/test/Lower/OpenMP/parallel-wsloop-reduction.f90 (+1-1) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-add-byref.f90 (+11-11) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-add-hlfir-byref.f90 
(+2-2) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-add-hlfir.f90 (+2-2) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-add.f90 (+11-11) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-array.f90 (+2-2) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-array2.f90 (+2-2) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-iand-byref.f90 (+2-2) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-iand.f90 (+2-2) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-ieor-byref.f90 (+2-2) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-ior-byref.f90 (+2-4) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-ior.f90 (+2-2) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-max-2-byref.f90 (+1-1) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-max-2.f90 (+1-1) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-max-byref.f90 (+5-5) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-max-hlfir-byref.f90 
(+2-2) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-max-hlfir.f90 (+2-2) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-max.f90 (+5-5) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-min-byref.f90 (+5-5) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-min.f90 (+5-5) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-min2.f90 (+2) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-mul-byref.f90 (+11-11) 
- (modified) flang/test/Lower/OpenMP/wsloop-reduction-mul.f90 (+11-11) 


``diff
diff --git a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp 
b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
index 7f737dc8aa31db..701f1f5da95526 100644
--- a/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ReductionProcessor.cpp
@@ -83,8 +83,10 @@ bool ReductionProcessor::supportedIntrinsicProcReduction(
   return redType;
 }
 
-std::string ReductionProcessor::getReductionName(llvm::StringRef name,
- mlir::Type ty, bool isByRef) {
+std::string
+ReductionProcessor::getReductionName(llvm::StringRef name,
+ const fir::KindMapping &kindMap,
+ mlir::Type ty, bool isByRef) {
   ty = fir::unwrapRefType(ty);
 
   // extra string to distinguish reduction functions for variables passed by
@@ -93,47 +95,12 @@ std::string 
ReductionProcessor::getReductionName(llvm::StringRef name,
   if (isByRef)
 byrefAddition = "_byref";
 
-  if (fir::isa_trivial(ty))
-return (llvm::Twine(name) +
-(ty.isIntOrIndex() ? llvm::Twine("_i_") : llvm::Twine("_f_")) +
-llvm::Twine(ty.getIntOrFloatBitWidth()) + byrefAddition)
-.str();
-
-  // creates a name like reduction_i_64_box_ux4x3
-  if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
-// TODO: support for allocatable boxes:
-// !fir.box>>
-fir::SequenceType seqTy = fir::unwrapRefType(boxTy.getEleTy())
-  .dyn_cast_or_null();
-if (!seqTy)
-  return {};
-
-std::string prefix = getReductionName(
-name, fir::unwrapSeqOrBoxedSeqType(ty), /*isByRef=*/false);
-if (prefix.empty())
-  return {};
-std::strin

[llvm-branch-commits] [flang] [flang][OpenMP] lower simple array reductions (PR #84958)

2024-03-18 Thread Tom Eccles via llvm-branch-commits


@@ -92,10 +93,42 @@ std::string 
ReductionProcessor::getReductionName(llvm::StringRef name,
   if (isByRef)
 byrefAddition = "_byref";
 
-  return (llvm::Twine(name) +
-  (ty.isIntOrIndex() ? llvm::Twine("_i_") : llvm::Twine("_f_")) +
-  llvm::Twine(ty.getIntOrFloatBitWidth()) + byrefAddition)
-  .str();
+  if (fir::isa_trivial(ty))
+return (llvm::Twine(name) +
+(ty.isIntOrIndex() ? llvm::Twine("_i_") : llvm::Twine("_f_")) +
+llvm::Twine(ty.getIntOrFloatBitWidth()) + byrefAddition)
+.str();
+
+  // creates a name like reduction_i_64_box_ux4x3
+  if (auto boxTy = mlir::dyn_cast_or_null(ty)) {
+// TODO: support for allocatable boxes:
+// !fir.box>>
+fir::SequenceType seqTy = fir::unwrapRefType(boxTy.getEleTy())
+  .dyn_cast_or_null();
+if (!seqTy)
+  return {};
+
+std::string prefix = getReductionName(
+name, fir::unwrapSeqOrBoxedSeqType(ty), /*isByRef=*/false);
+if (prefix.empty())
+  return {};
+std::stringstream tyStr;
+tyStr << prefix << "_box_";
+bool first = true;
+for (std::int64_t extent : seqTy.getShape()) {
+  if (first)
+first = false;
+  else
+tyStr << "x";
+  if (extent == seqTy.getUnknownExtent())
+tyStr << 'u'; // I'm not sure that '?' is safe in symbol names
+  else
+tyStr << extent;
+}
+return (tyStr.str() + byrefAddition).str();
+  }
+
+  return {};

tblah wrote:

https://github.com/llvm/llvm-project/pull/85666

https://github.com/llvm/llvm-project/pull/84958
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][OpenMP] simplify getReductionName (PR #85666)

2024-03-18 Thread Kiran Chandramohan via llvm-branch-commits

https://github.com/kiranchandramohan approved this pull request.

> The idea was suggested by Kiran: 
> https://github.com/llvm/llvm-project/pull/84958#discussion_r1527604388

It was implement by @clementval and recommended by him in other patches. I was 
just forwarding the suggestion.

Thanks for the patch. LGTM.

https://github.com/llvm/llvm-project/pull/85666
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [flang] [flang][OpenMP] simplify getReductionName (PR #85666)

2024-03-18 Thread Valentin Clement バレンタイン クレメン via llvm-branch-commits

clementval wrote:

Great to see this re-used!

https://github.com/llvm/llvm-project/pull/85666
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [mlir][LLVM] erase call mappings in forgetMapping() (PR #84955)

2024-03-18 Thread Tom Eccles via llvm-branch-commits

https://github.com/tblah updated https://github.com/llvm/llvm-project/pull/84955

>From 0b2f5fee61d170b0a2197fd5da92f0e84b3b14f4 Mon Sep 17 00:00:00 2001
From: Tom Eccles 
Date: Wed, 21 Feb 2024 14:22:39 +
Subject: [PATCH 1/3] [mlir][LLVM] erase call mappings in forgetMapping()

It looks like the mappings for call instructions were forgotten here.
This fixes a bug in OpenMP when inlining a region containing call
operations multiple times.
---
 mlir/lib/Target/LLVMIR/ModuleTranslation.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp 
b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
index c00628a420a000..995544238e4a3c 100644
--- a/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/ModuleTranslation.cpp
@@ -716,6 +716,8 @@ void ModuleTranslation::forgetMapping(Region ®ion) {
   branchMapping.erase(&op);
 if (isa(op))
   globalsMapping.erase(&op);
+if (isa(op))
+  callMapping.erase(&op);
 llvm::append_range(
 toProcess,
 llvm::map_range(op.getRegions(), [](Region &r) { return &r; }));

>From d6955d43534d23aaf58c10090c1a5a2167018796 Mon Sep 17 00:00:00 2001
From: Tom Eccles 
Date: Fri, 15 Mar 2024 14:05:53 +
Subject: [PATCH 2/3] Add test

---
 .../Target/LLVMIR/openmp-reduction-call.mlir  | 47 +++
 1 file changed, 47 insertions(+)
 create mode 100644 mlir/test/Target/LLVMIR/openmp-reduction-call.mlir

diff --git a/mlir/test/Target/LLVMIR/openmp-reduction-call.mlir 
b/mlir/test/Target/LLVMIR/openmp-reduction-call.mlir
new file mode 100644
index 00..60419a85f66aa2
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-reduction-call.mlir
@@ -0,0 +1,47 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+// Test that we don't crash when there is a call operation in the combiner
+
+omp.reduction.declare @add_f32 : f32
+init {
+^bb0(%arg: f32):
+  %0 = llvm.mlir.constant(0.0 : f32) : f32
+  omp.yield (%0 : f32)
+}
+combiner {
+^bb1(%arg0: f32, %arg1: f32):
+// test this call here:
+  llvm.call @test_call() : () -> ()
+  %1 = llvm.fadd %arg0, %arg1 : f32
+  omp.yield (%1 : f32)
+}
+atomic {
+^bb2(%arg2: !llvm.ptr, %arg3: !llvm.ptr):
+  %2 = llvm.load %arg3 : !llvm.ptr -> f32
+  llvm.atomicrmw fadd %arg2, %2 monotonic : !llvm.ptr, f32
+  omp.yield
+}
+
+llvm.func @simple_reduction(%lb : i64, %ub : i64, %step : i64) {
+  %c1 = llvm.mlir.constant(1 : i32) : i32
+  %0 = llvm.alloca %c1 x i32 : (i32) -> !llvm.ptr
+  omp.parallel {
+omp.wsloop reduction(@add_f32 %0 -> %prv : !llvm.ptr)
+for (%iv) : i64 = (%lb) to (%ub) step (%step) {
+  %1 = llvm.mlir.constant(2.0 : f32) : f32
+  %2 = llvm.load %prv : !llvm.ptr -> f32
+  %3 = llvm.fadd %1, %2 : f32
+  llvm.store %3, %prv : f32, !llvm.ptr
+  omp.yield
+}
+omp.terminator
+  }
+  llvm.return
+}
+
+llvm.func @test_call() -> ()
+
+// Call to the troublesome function will have been inlined twice: once into
+// main and once into the outlined function
+// CHECK: call void @test_call()
+// CHECK: call void @test_call()

>From 02550e13fc8e9d3e653b8e689a4479cfdd0bfcc8 Mon Sep 17 00:00:00 2001
From: Tom Eccles 
Date: Mon, 18 Mar 2024 17:16:55 +
Subject: [PATCH 3/3] Simplify test

---
 .../Target/LLVMIR/openmp-reduction-call.mlir  | 20 +--
 1 file changed, 5 insertions(+), 15 deletions(-)

diff --git a/mlir/test/Target/LLVMIR/openmp-reduction-call.mlir 
b/mlir/test/Target/LLVMIR/openmp-reduction-call.mlir
index 60419a85f66aa2..9aa9ac7e4b5a52 100644
--- a/mlir/test/Target/LLVMIR/openmp-reduction-call.mlir
+++ b/mlir/test/Target/LLVMIR/openmp-reduction-call.mlir
@@ -15,25 +15,15 @@ combiner {
   %1 = llvm.fadd %arg0, %arg1 : f32
   omp.yield (%1 : f32)
 }
-atomic {
-^bb2(%arg2: !llvm.ptr, %arg3: !llvm.ptr):
-  %2 = llvm.load %arg3 : !llvm.ptr -> f32
-  llvm.atomicrmw fadd %arg2, %2 monotonic : !llvm.ptr, f32
-  omp.yield
-}
 
 llvm.func @simple_reduction(%lb : i64, %ub : i64, %step : i64) {
   %c1 = llvm.mlir.constant(1 : i32) : i32
   %0 = llvm.alloca %c1 x i32 : (i32) -> !llvm.ptr
-  omp.parallel {
-omp.wsloop reduction(@add_f32 %0 -> %prv : !llvm.ptr)
-for (%iv) : i64 = (%lb) to (%ub) step (%step) {
-  %1 = llvm.mlir.constant(2.0 : f32) : f32
-  %2 = llvm.load %prv : !llvm.ptr -> f32
-  %3 = llvm.fadd %1, %2 : f32
-  llvm.store %3, %prv : f32, !llvm.ptr
-  omp.yield
-}
+  omp.parallel reduction(@add_f32 %0 -> %prv : !llvm.ptr) {
+%1 = llvm.mlir.constant(2.0 : f32) : f32
+%2 = llvm.load %prv : !llvm.ptr -> f32
+%3 = llvm.fadd %1, %2 : f32
+llvm.store %3, %prv : f32, !llvm.ptr
 omp.terminator
   }
   llvm.return

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [mlir][LLVM] erase call mappings in forgetMapping() (PR #84955)

2024-03-18 Thread Tom Eccles via llvm-branch-commits

tblah wrote:

Sorry about the force push. Github wasn't showing the new commit (02550e1). It 
is just a rebase onto the target branch.

https://github.com/llvm/llvm-project/pull/84955
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [mlir][LLVM] erase call mappings in forgetMapping() (PR #84955)

2024-03-18 Thread Mehdi Amini via llvm-branch-commits


@@ -0,0 +1,47 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+// Test that we don't crash when there is a call operation in the combiner
+
+omp.reduction.declare @add_f32 : f32
+init {
+^bb0(%arg: f32):
+  %0 = llvm.mlir.constant(0.0 : f32) : f32
+  omp.yield (%0 : f32)
+}
+combiner {
+^bb1(%arg0: f32, %arg1: f32):
+// test this call here:
+  llvm.call @test_call() : () -> ()
+  %1 = llvm.fadd %arg0, %arg1 : f32
+  omp.yield (%1 : f32)
+}
+atomic {
+^bb2(%arg2: !llvm.ptr, %arg3: !llvm.ptr):
+  %2 = llvm.load %arg3 : !llvm.ptr -> f32
+  llvm.atomicrmw fadd %arg2, %2 monotonic : !llvm.ptr, f32
+  omp.yield
+}
+
+llvm.func @simple_reduction(%lb : i64, %ub : i64, %step : i64) {
+  %c1 = llvm.mlir.constant(1 : i32) : i32
+  %0 = llvm.alloca %c1 x i32 : (i32) -> !llvm.ptr
+  omp.parallel {
+omp.wsloop reduction(@add_f32 %0 -> %prv : !llvm.ptr)
+for (%iv) : i64 = (%lb) to (%ub) step (%step) {
+  %1 = llvm.mlir.constant(2.0 : f32) : f32
+  %2 = llvm.load %prv : !llvm.ptr -> f32
+  %3 = llvm.fadd %1, %2 : f32
+  llvm.store %3, %prv : f32, !llvm.ptr
+  omp.yield
+}
+omp.terminator
+  }
+  llvm.return
+}
+
+llvm.func @test_call() -> ()

joker-eph wrote:

The OpenMP translation predates the LLVMTranslationInterface, but it really 
should evolve to this model: the OpenMP specific component should be better 
identified and isolated really.
Anyway that's not a thing for this PR, thank for clarifying!

https://github.com/llvm/llvm-project/pull/84955
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [cmake] Reenable FatLTO for Fuchsia toolchains (PR #85709)

2024-03-18 Thread Paul Kirth via llvm-branch-commits

https://github.com/ilovepi created 
https://github.com/llvm/llvm-project/pull/85709

We can reenable this now that FatLTO won't be enabled for Mac platforms



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [cmake] Reenable FatLTO for Fuchsia toolchains (PR #85709)

2024-03-18 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: Paul Kirth (ilovepi)


Changes

We can reenable this now that FatLTO won't be enabled for Mac platforms


---
Full diff: https://github.com/llvm/llvm-project/pull/85709.diff


1 Files Affected:

- (modified) clang/cmake/caches/Fuchsia-stage2.cmake (+1-1) 


``diff
diff --git a/clang/cmake/caches/Fuchsia-stage2.cmake 
b/clang/cmake/caches/Fuchsia-stage2.cmake
index bc1a05345ce237..d5546e20873b3c 100644
--- a/clang/cmake/caches/Fuchsia-stage2.cmake
+++ b/clang/cmake/caches/Fuchsia-stage2.cmake
@@ -11,7 +11,7 @@ set(LLVM_ENABLE_RUNTIMES 
"compiler-rt;libcxx;libcxxabi;libunwind" CACHE STRING "
 
 set(LLVM_ENABLE_BACKTRACES OFF CACHE BOOL "")
 set(LLVM_ENABLE_DIA_SDK OFF CACHE BOOL "")
-set(LLVM_ENABLE_FATLTO OFF CACHE BOOL "")
+set(LLVM_ENABLE_FATLTO ON CACHE BOOL "")
 set(LLVM_ENABLE_HTTPLIB ON CACHE BOOL "")
 set(LLVM_ENABLE_LIBCXX ON CACHE BOOL "")
 set(LLVM_ENABLE_LIBEDIT OFF CACHE BOOL "")

``




https://github.com/llvm/llvm-project/pull/85709
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [cmake] Reenable FatLTO for Fuchsia toolchains (PR #85709)

2024-03-18 Thread Paul Kirth via llvm-branch-commits

ilovepi wrote:

Note this is stacked on top of #85708

https://github.com/llvm/llvm-project/pull/85709
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] 7fd9979 - [InstCombine] Drop UB-implying attrs/metadata after speculating an instruction (#85542)

2024-03-18 Thread via llvm-branch-commits

Author: Yingwei Zheng
Date: 2024-03-17T06:20:57Z
New Revision: 7fd9979eb9cf9d93d76e12e85aa1ad847794ebbc

URL: 
https://github.com/llvm/llvm-project/commit/7fd9979eb9cf9d93d76e12e85aa1ad847794ebbc
DIFF: 
https://github.com/llvm/llvm-project/commit/7fd9979eb9cf9d93d76e12e85aa1ad847794ebbc.diff

LOG: [InstCombine] Drop UB-implying attrs/metadata after speculating an 
instruction (#85542)

When speculating an instruction in `InstCombinerImpl::FoldOpIntoSelect`,
the call may result in undefined behavior. This patch drops all
UB-implying attrs/metadata to fix this.

Fixes #85536.

(cherry picked from commit 252d01952c087cf0d141f7f281cf60efeb98be41)

Added: 


Modified: 
llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
llvm/test/Transforms/InstCombine/intrinsic-select.ll

Removed: 




diff  --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp 
b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index 5d207dcfd18dd4..6f0cf9d9c8f187 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -1455,6 +1455,7 @@ static Value *foldOperationIntoSelectOperand(Instruction 
&I, SelectInst *SI,
  Value *NewOp, InstCombiner &IC) {
   Instruction *Clone = I.clone();
   Clone->replaceUsesOfWith(SI, NewOp);
+  Clone->dropUBImplyingAttrsAndMetadata();
   IC.InsertNewInstBefore(Clone, SI->getIterator());
   return Clone;
 }

diff  --git a/llvm/test/Transforms/InstCombine/intrinsic-select.ll 
b/llvm/test/Transforms/InstCombine/intrinsic-select.ll
index a203b28bcb82a8..f37226bbd5b09c 100644
--- a/llvm/test/Transforms/InstCombine/intrinsic-select.ll
+++ b/llvm/test/Transforms/InstCombine/intrinsic-select.ll
@@ -240,3 +240,43 @@ define i32 @vec_to_scalar_select_vector(<2 x i1> %b) {
   %c = call i32 @llvm.vector.reduce.add.v2i32(<2 x i32> %s)
   ret i32 %c
 }
+
+define i8 @test_drop_noundef(i1 %cond, i8 %val) {
+; CHECK-LABEL: @test_drop_noundef(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[TMP0:%.*]] = call i8 @llvm.smin.i8(i8 [[VAL:%.*]], i8 0)
+; CHECK-NEXT:[[RET:%.*]] = select i1 [[COND:%.*]], i8 -1, i8 [[TMP0]]
+; CHECK-NEXT:ret i8 [[RET]]
+;
+entry:
+  %sel = select i1 %cond, i8 -1, i8 %val
+  %ret = call noundef i8 @llvm.smin.i8(i8 %sel, i8 0)
+  ret i8 %ret
+}
+
+define i1 @pr85536(i32 %a) {
+; CHECK-LABEL: @pr85536(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[CMP1:%.*]] = icmp ult i32 [[A:%.*]], 31
+; CHECK-NEXT:[[SHL1:%.*]] = shl nsw i32 -1, [[A]]
+; CHECK-NEXT:[[ZEXT:%.*]] = zext i32 [[SHL1]] to i64
+; CHECK-NEXT:[[SHL2:%.*]] = shl i64 [[ZEXT]], 48
+; CHECK-NEXT:[[SHR:%.*]] = ashr exact i64 [[SHL2]], 48
+; CHECK-NEXT:[[TMP0:%.*]] = call i64 @llvm.smin.i64(i64 [[SHR]], i64 0)
+; CHECK-NEXT:[[TMP1:%.*]] = and i64 [[TMP0]], 65535
+; CHECK-NEXT:[[RET1:%.*]] = icmp eq i64 [[TMP1]], 0
+; CHECK-NEXT:[[RET:%.*]] = select i1 [[CMP1]], i1 [[RET1]], i1 false
+; CHECK-NEXT:ret i1 [[RET]]
+;
+entry:
+  %cmp1 = icmp ugt i32 %a, 30
+  %shl1 = shl nsw i32 -1, %a
+  %zext = zext i32 %shl1 to i64
+  %shl2 = shl i64 %zext, 48
+  %shr = ashr exact i64 %shl2, 48
+  %sel = select i1 %cmp1, i64 -1, i64 %shr
+  %smin = call noundef i64 @llvm.smin.i64(i64 %sel, i64 0)
+  %masked = and i64 %smin, 65535
+  %ret = icmp eq i64 %masked, 0
+  ret i1 %ret
+}



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/18.x: [InstCombine] Drop UB-implying attrs/metadata after speculating an instruction (#85542) (PR #85562)

2024-03-18 Thread Tom Stellard via llvm-branch-commits

https://github.com/tstellar closed 
https://github.com/llvm/llvm-project/pull/85562
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-03-18 Thread Craig Topper via llvm-branch-commits


@@ -101,6 +101,11 @@ static cl::opt EnableMISchedLoadClustering(
 cl::desc("Enable load clustering in the machine scheduler"),
 cl::init(false));
 
+static cl::opt
+EnableSelectOpt("riscv-select-opt", cl::Hidden,

topperc wrote:

> I think the impact won't be large, since the pass is early out before these 
> analysises actully run when enableSelectOptimize returns false .

The pass manager will run the analysis passes before the runOnFunction in the 
select optimize pass gets called. Unless those analysis passes do lazy updates 
and only compute something when they are queried, they will run before the 
early out.

https://github.com/llvm/llvm-project/pull/80124
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] Don't check COMPILER_RT_STANDALONE_BUILD for test deps (PR #83651)

2024-03-18 Thread Alexander Richardson via llvm-branch-commits

https://github.com/arichardson updated 
https://github.com/llvm/llvm-project/pull/83651


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [compiler-rt] Don't check COMPILER_RT_STANDALONE_BUILD for test deps (PR #83651)

2024-03-18 Thread Alexander Richardson via llvm-branch-commits

https://github.com/arichardson updated 
https://github.com/llvm/llvm-project/pull/83651


___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-03-18 Thread Craig Topper via llvm-branch-commits


@@ -1046,6 +1046,14 @@ def FeatureFastUnalignedAccess
 def FeaturePostRAScheduler : SubtargetFeature<"use-postra-scheduler",
 "UsePostRAScheduler", "true", "Schedule again after register allocation">;
 
+def FeaturePredictableSelectIsExpensive
+  : SubtargetFeature<"predictable-select-expensive", 
"PredictableSelectIsExpensive",
+ "true", "Prefer likely predicted branches over selects">;
+
+def FeatureEnableSelectOptimize
+  : SubtargetFeature<"enable-select-opt", "EnableSelectOptimize", "true",
+"Enable the select optimize pass for select loop 
heuristics">;

topperc wrote:

This needs to be indented 1 more space

https://github.com/llvm/llvm-project/pull/80124
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)

2024-03-18 Thread Craig Topper via llvm-branch-commits

topperc wrote:

> > JFYI, I don't find the AArch64 data particularly convincing for RISCV. The 
> > magnitude of the change even on AArch64 is small, and could easily be swung 
> > one direction or the other by differences in implementation between the 
> > backends.
> 
> Yeah! The result will differ for different targets/CPUs. One RISCV data for 
> SPEC 2006 (which is not universal I think) on an OoO RISCV CPU, options: 
> `-march=rv64gc_zba_zbb_zicond -O3`:
> 
> ```
> 400.perlbench0.538%
> 401.bzip20.018%
> 403.gcc  0.105%
> 429.mcf  1.028%
> 445.gobmk-0.221%
> 456.hmmer1.582%
> 458.sjeng-0.026%
> 462.libquantum   -0.090%
> 464.h264ref  0.905%
> 471.omnetpp  -0.776%
> 473.astar0.205%
> ```
> 
> The geomean is: 0.295%. The result can be better with PGO I think (haven't 
> tried it). Some related discussions: 
> https://discourse.llvm.org/t/rfc-cmov-vs-branch-optimization. So I think we 
> can be just like AArch64, make it a tune feature and processors can add it if 
> needed.

Do we have any data without Zicond? The worst case Zicond sequence is 
czero.eqz+czero.nez+or which is kind of expensive. Curious if this is pointing 
to Zicond being used too aggressively.

https://github.com/llvm/llvm-project/pull/80124
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [AArch64][PAC] Lower authenticated calls with ptrauth bundles. (PR #85736)

2024-03-18 Thread Ahmed Bougacha via llvm-branch-commits

https://github.com/ahmedbougacha created 
https://github.com/llvm/llvm-project/pull/85736

This adds codegen support for the "ptrauth" operand bundles, which can
be used to augment indirect calls with the equivalent of an
`@llvm.ptrauth.auth` intrinsic call on the call target (possibly
preceded by an `@llvm.ptrauth.blend` on the auth discriminator if
applicable.)

This allows the generation of combined authenticating calls
on AArch64 (in the BLRA* PAuth instructions), while avoiding
the raw just-authenticated function pointer from being
exposed to attackers.

This is done by threading a PtrAuthInfo descriptor through
the call lowering infrastructure.

Note that this also applies to the other forms of indirect calls,
notably invokes, rvmarker, and tail calls.  Tail-calls in particular
bring some additional complexity, with the intersecting register
constraints of BTI and PAC discriminator computation.

This also adopts an x8+ allocation order for GPR64noip, matching GPR64.

>From ad3c074aa9a8d527edb34dd69b690afb5e4f6011 Mon Sep 17 00:00:00 2001
From: Ahmed Bougacha 
Date: Mon, 27 Sep 2021 08:00:00 -0700
Subject: [PATCH 1/2] [AArch64] Adopt x8+ allocation order for GPR64noip.

73078ecd381 added GPR64noip for hwasan pseudos.
Give it an allocation order that prefers allocating from x8 and up,
to match GPR64: this allows for easier regalloc, as x0-x7 are
likely to be used for parameter passing.
---
 llvm/lib/Target/AArch64/AArch64RegisterInfo.td | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/AArch64/AArch64RegisterInfo.td 
b/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
index fef1748021b07c..9240ad479cbd63 100644
--- a/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
@@ -234,7 +234,10 @@ def tcGPRnotx16 : RegisterClass<"AArch64", [i64], 64, (sub 
tcGPR64, X16)>;
 // Register set that excludes registers that are reserved for procedure calls.
 // This is used for pseudo-instructions that are actually implemented using a
 // procedure call.
-def GPR64noip : RegisterClass<"AArch64", [i64], 64, (sub GPR64, X16, X17, LR)>;
+def GPR64noip : RegisterClass<"AArch64", [i64], 64, (sub GPR64, X16, X17, LR)> 
{
+  let AltOrders = [(rotl GPR64noip, 8)];
+  let AltOrderSelect = [{ return 1; }];
+}
 
 // GPR register classes for post increment amount of vector load/store that
 // has alternate printing when Rm=31 and prints a constant immediate value

>From 25cb02dc4441fac54b7a69af833ba0bf18b1696c Mon Sep 17 00:00:00 2001
From: Ahmed Bougacha 
Date: Wed, 24 Jan 2024 15:03:49 -0800
Subject: [PATCH 2/2] [AArch64][PAC] Lower authenticated calls with ptrauth
 bundles.

This adds codegen support for the "ptrauth" operand bundles, which can
be used to augment indirect calls with the equivalent of an
`@llvm.ptrauth.auth` intrinsic call on the call target (possibly
preceded by an `@llvm.ptrauth.blend` on the auth discriminator if
applicable.)

This allows the generation of combined authenticating calls
on AArch64 (in the BLRA* PAuth instructions), while avoiding
the raw just-authenticated function pointer from being
exposed to attackers.

This is done by threading a PtrAuthInfo descriptor through
the call lowering infrastructure.

Note that this also applies to the other forms of indirect calls,
notably invokes, rvmarker, and tail calls.  Tail-calls in particular
bring some additional complexity, with the intersecting register
constraints of BTI and PAC discriminator computation.
---
 .../llvm/CodeGen/GlobalISel/CallLowering.h|   7 +
 llvm/include/llvm/CodeGen/TargetLowering.h|  18 ++
 llvm/lib/CodeGen/GlobalISel/CallLowering.cpp  |   2 +
 llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp  |  16 +-
 .../SelectionDAG/SelectionDAGBuilder.cpp  |  51 -
 .../SelectionDAG/SelectionDAGBuilder.h|   6 +-
 llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp |  90 
 .../AArch64/AArch64ExpandPseudoInsts.cpp  |  43 +++-
 .../Target/AArch64/AArch64ISelLowering.cpp| 103 ++---
 llvm/lib/Target/AArch64/AArch64ISelLowering.h |  12 ++
 llvm/lib/Target/AArch64/AArch64InstrInfo.cpp  |   2 +
 llvm/lib/Target/AArch64/AArch64InstrInfo.td   |  80 +++
 .../AArch64/GISel/AArch64CallLowering.cpp |  88 ++--
 .../AArch64/GISel/AArch64GlobalISelUtils.cpp  |   2 +-
 .../AArch64/GlobalISel/ptrauth-invoke.ll  | 183 
 ...ranch-target-enforcement-indirect-calls.ll |   4 +-
 llvm/test/CodeGen/AArch64/ptrauth-bti-call.ll | 105 ++
 .../CodeGen/AArch64/ptrauth-call-rv-marker.ll | 154 ++
 llvm/test/CodeGen/AArch64/ptrauth-call.ll | 195 ++
 llvm/test/CodeGen/AArch64/ptrauth-invoke.ll   | 189 +
 20 files changed, 1291 insertions(+), 59 deletions(-)
 create mode 100644 llvm/test/CodeGen/AArch64/GlobalISel/ptrauth-invoke.ll
 create mode 100644 llvm/test/CodeGen/AArch64/ptrauth-bti-call.ll
 create mode 100644 llvm/test/CodeGen/AArch64/ptrauth-call-rv-marker.ll
 create 

[llvm-branch-commits] [llvm] [AArch64][PAC] Lower authenticated calls with ptrauth bundles. (PR #85736)

2024-03-18 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-aarch64

Author: Ahmed Bougacha (ahmedbougacha)


Changes

This adds codegen support for the "ptrauth" operand bundles, which can
be used to augment indirect calls with the equivalent of an
`@llvm.ptrauth.auth` intrinsic call on the call target (possibly
preceded by an `@llvm.ptrauth.blend` on the auth discriminator if
applicable.)

This allows the generation of combined authenticating calls
on AArch64 (in the BLRA* PAuth instructions), while avoiding
the raw just-authenticated function pointer from being
exposed to attackers.

This is done by threading a PtrAuthInfo descriptor through
the call lowering infrastructure.

Note that this also applies to the other forms of indirect calls,
notably invokes, rvmarker, and tail calls.  Tail-calls in particular
bring some additional complexity, with the intersecting register
constraints of BTI and PAC discriminator computation.

This also adopts an x8+ allocation order for GPR64noip, matching GPR64.

---

Patch is 72.74 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/85736.diff


21 Files Affected:

- (modified) llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h (+7) 
- (modified) llvm/include/llvm/CodeGen/TargetLowering.h (+18) 
- (modified) llvm/lib/CodeGen/GlobalISel/CallLowering.cpp (+2) 
- (modified) llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp (+15-1) 
- (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+46-5) 
- (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h (+5-1) 
- (modified) llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp (+90) 
- (modified) llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp (+39-4) 
- (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+79-24) 
- (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.h (+12) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+2) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+80) 
- (modified) llvm/lib/Target/AArch64/AArch64RegisterInfo.td (+4-1) 
- (modified) llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp (+68-20) 
- (modified) llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.cpp (+1-1) 
- (added) llvm/test/CodeGen/AArch64/GlobalISel/ptrauth-invoke.ll (+183) 
- (modified) 
llvm/test/CodeGen/AArch64/branch-target-enforcement-indirect-calls.ll (+1-3) 
- (added) llvm/test/CodeGen/AArch64/ptrauth-bti-call.ll (+105) 
- (added) llvm/test/CodeGen/AArch64/ptrauth-call-rv-marker.ll (+154) 
- (added) llvm/test/CodeGen/AArch64/ptrauth-call.ll (+195) 
- (added) llvm/test/CodeGen/AArch64/ptrauth-invoke.ll (+189) 


``diff
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h 
b/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
index 4c187a3068d823..e02c7435a1b6d2 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
@@ -46,6 +46,10 @@ class CallLowering {
 
   virtual void anchor();
 public:
+  struct PointerAuthInfo {
+Register Discriminator;
+uint64_t Key;
+  };
   struct BaseArgInfo {
 Type *Ty;
 SmallVector Flags;
@@ -125,6 +129,8 @@ class CallLowering {
 
 MDNode *KnownCallees = nullptr;
 
+std::optional PAI;
+
 /// True if the call must be tail call optimized.
 bool IsMustTailCall = false;
 
@@ -587,6 +593,7 @@ class CallLowering {
   bool lowerCall(MachineIRBuilder &MIRBuilder, const CallBase &Call,
  ArrayRef ResRegs,
  ArrayRef> ArgRegs, Register SwiftErrorVReg,
+ std::optional PAI,
  Register ConvergenceCtrlToken,
  std::function GetCalleeReg) const;
 
diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h 
b/llvm/include/llvm/CodeGen/TargetLowering.h
index 2f164a460db843..e47a4e12d33cbc 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -4290,6 +4290,9 @@ class TargetLowering : public TargetLoweringBase {
   /// Return true if the target supports kcfi operand bundles.
   virtual bool supportKCFIBundles() const { return false; }
 
+  /// Return true if the target supports ptrauth operand bundles.
+  virtual bool supportPtrAuthBundles() const { return false; }
+
   /// Perform necessary initialization to handle a subset of CSRs explicitly
   /// via copies. This function is called at the beginning of instruction
   /// selection.
@@ -4401,6 +4404,14 @@ class TargetLowering : public TargetLoweringBase {
 llvm_unreachable("Not Implemented");
   }
 
+  /// This structure contains the information necessary for lowering
+  /// pointer-authenticating indirect calls.  It is equivalent to the "ptrauth"
+  /// operand bundle found on the call instruction, if any.
+  struct PtrAuthInfo {
+uint64_t Key;
+SDValue Discriminator;
+  };
+
   /// This structure contains all information that is necessary for lowering
   /// calls. It is passed to TLI::LowerCallTo when the SelectionDAG builder

[llvm-branch-commits] [llvm] [AArch64][PAC] Lower authenticated calls with ptrauth bundles. (PR #85736)

2024-03-18 Thread via llvm-branch-commits

llvmbot wrote:



@llvm/pr-subscribers-llvm-selectiondag

@llvm/pr-subscribers-llvm-globalisel

Author: Ahmed Bougacha (ahmedbougacha)


Changes

This adds codegen support for the "ptrauth" operand bundles, which can
be used to augment indirect calls with the equivalent of an
`@llvm.ptrauth.auth` intrinsic call on the call target (possibly
preceded by an `@llvm.ptrauth.blend` on the auth discriminator if
applicable.)

This allows the generation of combined authenticating calls
on AArch64 (in the BLRA* PAuth instructions), while avoiding
the raw just-authenticated function pointer from being
exposed to attackers.

This is done by threading a PtrAuthInfo descriptor through
the call lowering infrastructure.

Note that this also applies to the other forms of indirect calls,
notably invokes, rvmarker, and tail calls.  Tail-calls in particular
bring some additional complexity, with the intersecting register
constraints of BTI and PAC discriminator computation.

This also adopts an x8+ allocation order for GPR64noip, matching GPR64.

---

Patch is 72.74 KiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/85736.diff


21 Files Affected:

- (modified) llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h (+7) 
- (modified) llvm/include/llvm/CodeGen/TargetLowering.h (+18) 
- (modified) llvm/lib/CodeGen/GlobalISel/CallLowering.cpp (+2) 
- (modified) llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp (+15-1) 
- (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (+46-5) 
- (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h (+5-1) 
- (modified) llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp (+90) 
- (modified) llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp (+39-4) 
- (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+79-24) 
- (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.h (+12) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+2) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+80) 
- (modified) llvm/lib/Target/AArch64/AArch64RegisterInfo.td (+4-1) 
- (modified) llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp (+68-20) 
- (modified) llvm/lib/Target/AArch64/GISel/AArch64GlobalISelUtils.cpp (+1-1) 
- (added) llvm/test/CodeGen/AArch64/GlobalISel/ptrauth-invoke.ll (+183) 
- (modified) 
llvm/test/CodeGen/AArch64/branch-target-enforcement-indirect-calls.ll (+1-3) 
- (added) llvm/test/CodeGen/AArch64/ptrauth-bti-call.ll (+105) 
- (added) llvm/test/CodeGen/AArch64/ptrauth-call-rv-marker.ll (+154) 
- (added) llvm/test/CodeGen/AArch64/ptrauth-call.ll (+195) 
- (added) llvm/test/CodeGen/AArch64/ptrauth-invoke.ll (+189) 


``diff
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h 
b/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
index 4c187a3068d823..e02c7435a1b6d2 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
@@ -46,6 +46,10 @@ class CallLowering {
 
   virtual void anchor();
 public:
+  struct PointerAuthInfo {
+Register Discriminator;
+uint64_t Key;
+  };
   struct BaseArgInfo {
 Type *Ty;
 SmallVector Flags;
@@ -125,6 +129,8 @@ class CallLowering {
 
 MDNode *KnownCallees = nullptr;
 
+std::optional PAI;
+
 /// True if the call must be tail call optimized.
 bool IsMustTailCall = false;
 
@@ -587,6 +593,7 @@ class CallLowering {
   bool lowerCall(MachineIRBuilder &MIRBuilder, const CallBase &Call,
  ArrayRef ResRegs,
  ArrayRef> ArgRegs, Register SwiftErrorVReg,
+ std::optional PAI,
  Register ConvergenceCtrlToken,
  std::function GetCalleeReg) const;
 
diff --git a/llvm/include/llvm/CodeGen/TargetLowering.h 
b/llvm/include/llvm/CodeGen/TargetLowering.h
index 2f164a460db843..e47a4e12d33cbc 100644
--- a/llvm/include/llvm/CodeGen/TargetLowering.h
+++ b/llvm/include/llvm/CodeGen/TargetLowering.h
@@ -4290,6 +4290,9 @@ class TargetLowering : public TargetLoweringBase {
   /// Return true if the target supports kcfi operand bundles.
   virtual bool supportKCFIBundles() const { return false; }
 
+  /// Return true if the target supports ptrauth operand bundles.
+  virtual bool supportPtrAuthBundles() const { return false; }
+
   /// Perform necessary initialization to handle a subset of CSRs explicitly
   /// via copies. This function is called at the beginning of instruction
   /// selection.
@@ -4401,6 +4404,14 @@ class TargetLowering : public TargetLoweringBase {
 llvm_unreachable("Not Implemented");
   }
 
+  /// This structure contains the information necessary for lowering
+  /// pointer-authenticating indirect calls.  It is equivalent to the "ptrauth"
+  /// operand bundle found on the call instruction, if any.
+  struct PtrAuthInfo {
+uint64_t Key;
+SDValue Discriminator;
+  };
+
   /// This structure contains all information that is necessary for lowering
   /// calls. It is passed to TLI::Low