[llvm-branch-commits] [clang] clang/AMDGPU: Set noalias.addrspace metadata on atomicrmw (PR #102462)

2024-08-08 Thread Harald van Dijk via llvm-branch-commits


@@ -647,6 +647,14 @@ class LangOptions : public LangOptionsBase {
 return ConvergentFunctions;
   }
 
+  /// Return true if atomicrmw operations targeting allocations in private
+  /// memory are undefined.
+  bool threadPrivateMemoryAtomicsAreUndefined() const {
+// Should be false for OpenMP.
+// TODO: Should this be true for SYCL?

hvdijk wrote:

I think this is meant to be true for SYCL in the future, but not just yet? 
[SYCL 2020 
4.15.3](https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:atomic-references)
 specifies

> The `sycl::atomic_ref` class also has a template parameter `AddressSpace`, 
> which allows the application to make an assertion about the address space of 
> the object of type `T` that it references. The default value for this 
> parameter is `access::address_space::generic_space`, which indicates that the 
> object could be in either the global or local address spaces. If the 
> application knows the address space, it can set this template parameter to 
> either `access::address_space::global_space` or 
> `access::address_space::local_space` as an assertion to the implementation. 
> Specifying the address space via this template parameter may allow the 
> implementation to perform certain optimizations. Specifying an address space 
> that does not match the object’s actual address space results in undefined 
> behavior.

It does not specifically call out the private address space as being undefined,
but it says that specifying an address space that does not match the object's
actual address space is undefined, and it provides no way to specify the
private address space, so I think the end result is the same.

However, at the moment, the [legacy atomic 
types](https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:atom-types-depr)
 are also still available and the same logic cannot be applied to those, so 
barring any explicit statement that the private address space is undefined, I 
think it will be necessary to assume for now that this is well-defined.
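
For reference, a minimal sketch (hypothetical code, not from this patch) of the
`sycl::atomic_ref` usage the quoted text describes; the `AddressSpace` template
argument is purely an assertion to the implementation:

```c++
#include <sycl/sycl.hpp>

// 'counter' is assumed to point into global memory (e.g. obtained from a
// buffer accessor). Passing global_space asserts that to the implementation;
// asserting an address space that does not match the object's actual address
// space is undefined behaviour, and there is no way to assert private memory.
void bump(int *counter) {
  sycl::atomic_ref<int, sycl::memory_order::relaxed,
                   sycl::memory_scope::device,
                   sycl::access::address_space::global_space>
      ref(*counter);
  ref.fetch_add(1);
}
```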

https://github.com/llvm/llvm-project/pull/102462


[llvm-branch-commits] [clang] f453793 - Suppress non-conforming GNU paste extension in all standard-conforming modes

2021-01-24 Thread Harald van Dijk via llvm-branch-commits

Author: Harald van Dijk
Date: 2021-01-25T00:56:45Z
New Revision: f4537935dcdbf390c863591cf556e76c3abab9c1

URL: 
https://github.com/llvm/llvm-project/commit/f4537935dcdbf390c863591cf556e76c3abab9c1
DIFF: 
https://github.com/llvm/llvm-project/commit/f4537935dcdbf390c863591cf556e76c3abab9c1.diff

LOG: Suppress non-conforming GNU paste extension in all standard-conforming 
modes

The GNU token paste extension that removes the comma in , ## __VA_ARGS__
conflicts with C99/C++11's requirements when a variadic macro has no
named parameters: according to the standard, an invocation as FOO()
gives it a single empty argument, and concatenation of anything with an
empty argument is well-defined. For this reason, the GNU extension was
already disabled in C99 standard-conforming mode. It was not yet
disabled in C++11 standard-conforming mode.

The associated comment suggested that GCC keeps this extension enabled
in C90/C++03 standard-conforming mode, but it actually does not, so
rather than adding a check for C++ language version, this change simply
removes the check for C language version.
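
To illustrate the conflict (example for exposition only, not part of the patch):

  #define FOO(...) a , ## __VA_ARGS__ b

  FOO()  // C99/C++11: FOO() receives a single empty argument, pasting with an
         // empty argument is well-defined, so this must expand to "a , b".
         // With the GNU extension the comma is swallowed, giving "a b".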

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D91913

Added: 


Modified: 
clang/lib/Lex/TokenLexer.cpp
clang/test/Preprocessor/macro_fn_comma_swallow2.c

Removed: 




diff  --git a/clang/lib/Lex/TokenLexer.cpp b/clang/lib/Lex/TokenLexer.cpp
index da5681aaf478..6e962dfa2c34 100644
--- a/clang/lib/Lex/TokenLexer.cpp
+++ b/clang/lib/Lex/TokenLexer.cpp
@@ -148,12 +148,11 @@ bool TokenLexer::MaybeRemoveCommaBeforeVaArgs(
 return false;
 
   // GCC removes the comma in the expansion of " ... , ## __VA_ARGS__ " if
-  // __VA_ARGS__ is empty, but not in strict C99 mode where there are no
-  // named arguments, where it remains.  In all other modes, including C99
-  // with GNU extensions, it is removed regardless of named arguments.
+  // __VA_ARGS__ is empty, but not in strict mode where there are no
+  // named arguments, where it remains.  With GNU extensions, it is removed
+  // regardless of named arguments.
   // Microsoft also appears to support this extension, unofficially.
-  if (PP.getLangOpts().C99 && !PP.getLangOpts().GNUMode
-&& Macro->getNumParams() < 2)
+  if (!PP.getLangOpts().GNUMode && Macro->getNumParams() < 2)
 return false;
 
   // Is a comma available to be removed?

diff  --git a/clang/test/Preprocessor/macro_fn_comma_swallow2.c 
b/clang/test/Preprocessor/macro_fn_comma_swallow2.c
index 93ab2b83664a..89ef8c0579c4 100644
--- a/clang/test/Preprocessor/macro_fn_comma_swallow2.c
+++ b/clang/test/Preprocessor/macro_fn_comma_swallow2.c
@@ -1,9 +1,12 @@
 // Test the __VA_ARGS__ comma swallowing extensions of various compiler 
dialects.
 
 // RUN: %clang_cc1 -E %s | FileCheck -check-prefix=GCC -strict-whitespace %s
+// RUN: %clang_cc1 -E -std=c90 %s | FileCheck -check-prefix=C99 
-strict-whitespace %s
 // RUN: %clang_cc1 -E -std=c99 %s | FileCheck -check-prefix=C99 
-strict-whitespace %s
 // RUN: %clang_cc1 -E -std=c11 %s | FileCheck -check-prefix=C99 
-strict-whitespace %s
 // RUN: %clang_cc1 -E -x c++ %s | FileCheck -check-prefix=GCC 
-strict-whitespace %s
+// RUN: %clang_cc1 -E -x c++ -std=c++03 %s | FileCheck -check-prefix=C99 
-strict-whitespace %s
+// RUN: %clang_cc1 -E -x c++ -std=c++11 %s | FileCheck -check-prefix=C99 
-strict-whitespace %s
 // RUN: %clang_cc1 -E -std=gnu99 %s | FileCheck -check-prefix=GCC 
-strict-whitespace %s
 // RUN: %clang_cc1 -E -fms-compatibility %s | FileCheck -check-prefix=MS 
-strict-whitespace %s
 // RUN: %clang_cc1 -E -DNAMED %s | FileCheck -check-prefix=GCC 
-strict-whitespace %s





[llvm-branch-commits] [llvm] 9eac818 - [X86] Fix variadic argument handling for x32

2020-12-14 Thread Harald van Dijk via llvm-branch-commits

Author: Harald van Dijk
Date: 2020-12-14T23:47:27Z
New Revision: 9eac818370fe4b50a167627593bfe53e61c216bc

URL: 
https://github.com/llvm/llvm-project/commit/9eac818370fe4b50a167627593bfe53e61c216bc
DIFF: 
https://github.com/llvm/llvm-project/commit/9eac818370fe4b50a167627593bfe53e61c216bc.diff

LOG: [X86] Fix variadic argument handling for x32

The X86-64 ABI defines va_list as

  typedef struct {
unsigned int gp_offset;
unsigned int fp_offset;
void *overflow_arg_area;
void *reg_save_area;
  } va_list[1];

This means the size, alignment, and reg_save_area offset depend on whether we
are in LP64 or in ILP32 mode, so this commit adds the necessary checks.
Additionally, the VAARG_64 pseudo-instruction assumed 64-bit pointers, so this
commit adds a VAARG_X32 pseudo-instruction that behaves just like VAARG_64
except that it assumes 32-bit pointers.
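
For reference, with the usual LP64/ILP32 type sizes the va_list layout works
out as follows (my arithmetic, not part of the original message):

  LP64  (x86-64): 4 + 4 + 8 + 8 -> sizeof 24, align 8, reg_save_area at offset 16
  ILP32 (x32):    4 + 4 + 4 + 4 -> sizeof 16, align 4, reg_save_area at offset 12

which matches the 24-vs-16-byte memcpy and the 8-vs-4 alignment in the
LowerVACOPY change below.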

Some of these changes were originally done by Michael Liao.

Fixes https://bugs.llvm.org/show_bug.cgi?id=48428.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D93160

Added: 


Modified: 
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/lib/Target/X86/X86ISelLowering.h
llvm/lib/Target/X86/X86InstrCompiler.td
llvm/lib/Target/X86/X86InstrInfo.td
llvm/test/CodeGen/X86/x86-64-varargs.ll

Removed: 




diff  --git a/llvm/lib/Target/X86/X86ISelLowering.cpp 
b/llvm/lib/Target/X86/X86ISelLowering.cpp
index e4fcafc3352f..21af1a5aad00 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -24360,15 +24360,16 @@ SDValue X86TargetLowering::LowerVAARG(SDValue Op, 
SelectionDAG &DAG) const {
Subtarget.hasSSE1());
   }
 
-  // Insert VAARG_64 node into the DAG
-  // VAARG_64 returns two values: Variable Argument Address, Chain
+  // Insert VAARG node into the DAG
+  // VAARG returns two values: Variable Argument Address, Chain
   SDValue InstOps[] = {Chain, SrcPtr,
DAG.getTargetConstant(ArgSize, dl, MVT::i32),
DAG.getTargetConstant(ArgMode, dl, MVT::i8),
DAG.getTargetConstant(Align, dl, MVT::i32)};
   SDVTList VTs = DAG.getVTList(getPointerTy(DAG.getDataLayout()), MVT::Other);
   SDValue VAARG = DAG.getMemIntrinsicNode(
-  X86ISD::VAARG_64, dl, VTs, InstOps, MVT::i64, MachinePointerInfo(SV),
+  Subtarget.isTarget64BitLP64() ? X86ISD::VAARG_64 : X86ISD::VAARG_X32, dl,
+  VTs, InstOps, MVT::i64, MachinePointerInfo(SV),
   /*Alignment=*/None,
   MachineMemOperand::MOLoad | MachineMemOperand::MOStore);
   Chain = VAARG.getValue(1);
@@ -24394,9 +24395,11 @@ static SDValue LowerVACOPY(SDValue Op, const 
X86Subtarget &Subtarget,
   const Value *SrcSV = cast<SrcValueSDNode>(Op.getOperand(4))->getValue();
   SDLoc DL(Op);
 
-  return DAG.getMemcpy(Chain, DL, DstPtr, SrcPtr, DAG.getIntPtrConstant(24, 
DL),
-   Align(8), /*isVolatile*/ false, false, false,
-   MachinePointerInfo(DstSV), MachinePointerInfo(SrcSV));
+  return DAG.getMemcpy(
+  Chain, DL, DstPtr, SrcPtr,
+  DAG.getIntPtrConstant(Subtarget.isTarget64BitLP64() ? 24 : 16, DL),
+  Align(Subtarget.isTarget64BitLP64() ? 8 : 4), /*isVolatile*/ false, 
false,
+  false, MachinePointerInfo(DstSV), MachinePointerInfo(SrcSV));
 }
 
 // Helper to get immediate/variable SSE shift opcode from other shift opcodes.
@@ -30959,6 +30962,7 @@ const char 
*X86TargetLowering::getTargetNodeName(unsigned Opcode) const {
   NODE_NAME_CASE(DBPSADBW)
   NODE_NAME_CASE(VASTART_SAVE_XMM_REGS)
   NODE_NAME_CASE(VAARG_64)
+  NODE_NAME_CASE(VAARG_X32)
   NODE_NAME_CASE(WIN_ALLOCA)
   NODE_NAME_CASE(MEMBARRIER)
   NODE_NAME_CASE(MFENCE)
@@ -31548,11 +31552,9 @@ static MachineBasicBlock *emitXBegin(MachineInstr &MI, 
MachineBasicBlock *MBB,
   return sinkMBB;
 }
 
-
-
 MachineBasicBlock *
-X86TargetLowering::EmitVAARG64WithCustomInserter(MachineInstr &MI,
- MachineBasicBlock *MBB) const 
{
+X86TargetLowering::EmitVAARGWithCustomInserter(MachineInstr &MI,
+   MachineBasicBlock *MBB) const {
   // Emit va_arg instruction on X86-64.
 
   // Operands to this pseudo-instruction:
@@ -31563,9 +31565,8 @@ 
X86TargetLowering::EmitVAARG64WithCustomInserter(MachineInstr &MI,
   // 8  ) Align : Alignment of type
   // 9  ) EFLAGS (implicit-def)
 
-  assert(MI.getNumOperands() == 10 && "VAARG_64 should have 10 operands!");
-  static_assert(X86::AddrNumOperands == 5,
-"VAARG_64 assumes 5 address operands");
+  assert(MI.getNumOperands() == 10 && "VAARG should have 10 operands!");
+  static_assert(X86::AddrNumOperands == 5, "VAARG assumes 5 address operands");
 
   Register DestReg = MI.getOperand(0).getReg();
   MachineOperand &Base = MI.getOperand(1);
@@ -31580,7 +31581,7 @@ 
X86TargetLowering::EmitVAARG64WithCustomInserter(MachineInstr 

[llvm-branch-commits] [llvm] 2aae213 - [X86] Add REX prefix for GOTTPOFF/TLSDESC relocs in x32 mode

2020-12-15 Thread Harald van Dijk via llvm-branch-commits

Author: Harald van Dijk
Date: 2020-12-15T23:07:34Z
New Revision: 2aae2136d5c6b2da69787934f5963a6b3486e5fe

URL: 
https://github.com/llvm/llvm-project/commit/2aae2136d5c6b2da69787934f5963a6b3486e5fe
DIFF: 
https://github.com/llvm/llvm-project/commit/2aae2136d5c6b2da69787934f5963a6b3486e5fe.diff

LOG: [X86] Add REX prefix for GOTTPOFF/TLSDESC relocs in x32 mode

The REX prefix is needed to allow linker relaxations: even if the
instruction we emit may not need it, the linker may change it to a
different instruction which does need it.

Added: 
llvm/test/MC/X86/tlsdesc-x32.s

Modified: 
llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp

Removed: 




diff  --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp 
b/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
index 4e5e45f03925..59860cad01f7 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
@@ -93,7 +93,8 @@ class X86MCCodeEmitter : public MCCodeEmitter {
   bool emitOpcodePrefix(int MemOperand, const MCInst &MI,
 const MCSubtargetInfo &STI, raw_ostream &OS) const;
 
-  bool emitREXPrefix(int MemOperand, const MCInst &MI, raw_ostream &OS) const;
+  bool emitREXPrefix(int MemOperand, const MCInst &MI,
+ const MCSubtargetInfo &STI, raw_ostream &OS) const;
 };
 
 } // end anonymous namespace
@@ -1201,6 +1202,7 @@ void X86MCCodeEmitter::emitVEXOpcodePrefix(int 
MemOperand, const MCInst &MI,
 ///
 /// \returns true if REX prefix is used, otherwise returns false.
 bool X86MCCodeEmitter::emitREXPrefix(int MemOperand, const MCInst &MI,
+ const MCSubtargetInfo &STI,
  raw_ostream &OS) const {
   uint8_t REX = [&, MemOperand]() {
 uint8_t REX = 0;
@@ -1221,15 +1223,28 @@ bool X86MCCodeEmitter::emitREXPrefix(int MemOperand, 
const MCInst &MI,
 // If it accesses SPL, BPL, SIL, or DIL, then it requires a 0x40 REX 
prefix.
 for (unsigned i = CurOp; i != NumOps; ++i) {
   const MCOperand &MO = MI.getOperand(i);
-  if (!MO.isReg())
-continue;
-  unsigned Reg = MO.getReg();
-  if (Reg == X86::AH || Reg == X86::BH || Reg == X86::CH || Reg == X86::DH)
-UsesHighByteReg = true;
-  if (X86II::isX86_64NonExtLowByteReg(Reg))
-// FIXME: The caller of determineREXPrefix slaps this prefix onto
-// anything that returns non-zero.
-REX |= 0x40; // REX fixed encoding prefix
+  if (MO.isReg()) {
+unsigned Reg = MO.getReg();
+if (Reg == X86::AH || Reg == X86::BH || Reg == X86::CH ||
+Reg == X86::DH)
+  UsesHighByteReg = true;
+if (X86II::isX86_64NonExtLowByteReg(Reg))
+  // FIXME: The caller of determineREXPrefix slaps this prefix onto
+  // anything that returns non-zero.
+  REX |= 0x40; // REX fixed encoding prefix
+  } else if (MO.isExpr() &&
+ STI.getTargetTriple().getEnvironment() == Triple::GNUX32) {
+// GOTTPOFF and TLSDESC relocations require a REX prefix to allow
+// linker optimizations: even if the instructions we see may not 
require
+// any prefix, they may be replaced by instructions that do. This is
+// handled as a special case here so that it also works for 
hand-written
+// assembly without the user needing to write REX, as with GNU as.
+const auto *Ref = dyn_cast<MCSymbolRefExpr>(MO.getExpr());
+if (Ref && (Ref->getKind() == MCSymbolRefExpr::VK_GOTTPOFF ||
+Ref->getKind() == MCSymbolRefExpr::VK_TLSDESC)) {
+  REX |= 0x40; // REX fixed encoding prefix
+}
+  }
 }
 
 switch (TSFlags & X86II::FormMask) {
@@ -1352,7 +1367,7 @@ bool X86MCCodeEmitter::emitOpcodePrefix(int MemOperand, 
const MCInst &MI,
   assert((STI.hasFeature(X86::Mode64Bit) || !(TSFlags & X86II::REX_W)) &&
  "REX.W requires 64bit mode.");
   bool HasREX = STI.hasFeature(X86::Mode64Bit)
-? emitREXPrefix(MemOperand, MI, OS)
+? emitREXPrefix(MemOperand, MI, STI, OS)
 : false;
 
   // 0x0F escape code must be emitted just before the opcode.

diff  --git a/llvm/test/MC/X86/tlsdesc-x32.s b/llvm/test/MC/X86/tlsdesc-x32.s
new file mode 100644
index ..a9884fb5e2ee
--- /dev/null
+++ b/llvm/test/MC/X86/tlsdesc-x32.s
@@ -0,0 +1,20 @@
+# RUN: llvm-mc -triple x86_64-pc-linux-gnux32 %s | FileCheck 
--check-prefix=PRINT %s
+
+# RUN: llvm-mc -filetype=obj -triple x86_64-pc-linux-gnux32 %s -o %t
+# RUN: llvm-readelf -s %t | FileCheck --check-prefix=SYM %s
+# RUN: llvm-objdump -d -r %t | FileCheck --match-full-lines %s
+
+# PRINT:  leal a@tlsdesc(%rip), %eax
+# PRINT-NEXT: callq *a@tlscall(%eax)
+
+# SYM: TLS GLOBAL DEFAULT UND a
+
+# CHECK:  0: 40 8d 05 00 00 00 00  leal (%rip), %eax  # 7 <{{.*}}>

[llvm-branch-commits] [llvm] 09d0e7a - [X86] Avoid %fs:(%eax) references in x32 mode

2020-12-16 Thread Harald van Dijk via llvm-branch-commits

Author: Harald van Dijk
Date: 2020-12-16T22:39:57Z
New Revision: 09d0e7a7c153820f66597ac431d4453e272f204e

URL: 
https://github.com/llvm/llvm-project/commit/09d0e7a7c153820f66597ac431d4453e272f204e
DIFF: 
https://github.com/llvm/llvm-project/commit/09d0e7a7c153820f66597ac431d4453e272f204e.diff

LOG: [X86] Avoid %fs:(%eax) references in x32 mode

The ABI explains that %fs:(%eax) zero-extends %eax to 64 bits and adds that to
the TLS base address, but the TLS base address need not be at the start of the
TLS block: TLS references may use negative offsets.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D93158

Added: 


Modified: 
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
llvm/test/CodeGen/X86/pic.ll
llvm/test/CodeGen/X86/tls-pie.ll

Removed: 




diff  --git a/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp 
b/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
index 5d197e4d5f76..d7c8e88640af 100644
--- a/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
+++ b/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
@@ -207,7 +207,8 @@ namespace {
 void Select(SDNode *N) override;
 
 bool foldOffsetIntoAddress(uint64_t Offset, X86ISelAddressMode &AM);
-bool matchLoadInAddress(LoadSDNode *N, X86ISelAddressMode &AM);
+bool matchLoadInAddress(LoadSDNode *N, X86ISelAddressMode &AM,
+bool AllowSegmentRegForX32 = false);
 bool matchWrapper(SDValue N, X86ISelAddressMode &AM);
 bool matchAddress(SDValue N, X86ISelAddressMode &AM);
 bool matchVectorAddress(SDValue N, X86ISelAddressMode &AM);
@@ -1613,20 +1614,26 @@ bool X86DAGToDAGISel::foldOffsetIntoAddress(uint64_t 
Offset,
 
 }
 
-bool X86DAGToDAGISel::matchLoadInAddress(LoadSDNode *N, X86ISelAddressMode 
&AM){
+bool X86DAGToDAGISel::matchLoadInAddress(LoadSDNode *N, X86ISelAddressMode &AM,
+ bool AllowSegmentRegForX32) {
   SDValue Address = N->getOperand(1);
 
   // load gs:0 -> GS segment register.
   // load fs:0 -> FS segment register.
   //
-  // This optimization is valid because the GNU TLS model defines that
-  // gs:0 (or fs:0 on X86-64) contains its own address.
+  // This optimization is generally valid because the GNU TLS model defines 
that
+  // gs:0 (or fs:0 on X86-64) contains its own address. However, for X86-64 
mode
+  // with 32-bit registers, as we get in ILP32 mode, those registers are first
+  // zero-extended to 64 bits and then added it to the base address, which 
gives
+  // unwanted results when the register holds a negative value.
   // For more information see http://people.redhat.com/drepper/tls.pdf
-  if (ConstantSDNode *C = dyn_cast<ConstantSDNode>(Address))
+  if (ConstantSDNode *C = dyn_cast<ConstantSDNode>(Address)) {
 if (C->getSExtValue() == 0 && AM.Segment.getNode() == nullptr &&
 !IndirectTlsSegRefs &&
 (Subtarget->isTargetGlibc() || Subtarget->isTargetAndroid() ||
- Subtarget->isTargetFuchsia()))
+ Subtarget->isTargetFuchsia())) {
+  if (Subtarget->isTarget64BitILP32() && !AllowSegmentRegForX32)
+return true;
   switch (N->getPointerInfo().getAddrSpace()) {
   case X86AS::GS:
 AM.Segment = CurDAG->getRegister(X86::GS, MVT::i16);
@@ -1637,6 +1644,8 @@ bool X86DAGToDAGISel::matchLoadInAddress(LoadSDNode *N, 
X86ISelAddressMode &AM){
   // Address space X86AS::SS is not handled here, because it is not used to
   // address TLS areas.
   }
+}
+  }
 
   return true;
 }
@@ -1720,6 +1729,21 @@ bool X86DAGToDAGISel::matchAddress(SDValue N, 
X86ISelAddressMode &AM) {
   if (matchAddressRecursively(N, AM, 0))
 return true;
 
+  // Post-processing: Make a second attempt to fold a load, if we now know
+  // that there will not be any other register. This is only performed for
+  // 64-bit ILP32 mode since 32-bit mode and 64-bit LP64 mode will have folded
+  // any foldable load the first time.
+  if (Subtarget->isTarget64BitILP32() &&
+  AM.BaseType == X86ISelAddressMode::RegBase &&
+  AM.Base_Reg.getNode() != nullptr && AM.IndexReg.getNode() == nullptr) {
+SDValue Save_Base_Reg = AM.Base_Reg;
+if (auto *LoadN = dyn_cast<LoadSDNode>(Save_Base_Reg)) {
+  AM.Base_Reg = SDValue();
+  if (matchLoadInAddress(LoadN, AM, /*AllowSegmentRegForX32=*/true))
+AM.Base_Reg = Save_Base_Reg;
+}
+  }
+
   // Post-processing: Convert lea(,%reg,2) to lea(%reg,%reg), which has
   // a smaller encoding and avoids a scaled-index.
   if (AM.Scale == 2 &&

diff  --git a/llvm/test/CodeGen/X86/pic.ll b/llvm/test/CodeGen/X86/pic.ll
index 101c749633bc..b7d63dce8626 100644
--- a/llvm/test/CodeGen/X86/pic.ll
+++ b/llvm/test/CodeGen/X86/pic.ll
@@ -336,17 +336,18 @@ entry:
 ; CHECK-I686-DAG:  movl%gs:0,
 ; CHECK-X32-DAG:   movltlsdstie@GOTTPOFF(%rip),
 ; CHECK-X32-DAG:   movl%fs:0,
-; CHECK:   addl
+; CHECK-I686:  addl
+; CHECK-X32:   leal({{%.*,%.*}}),
 ; CHECK-I686:  movltlsptrie@GOTN

[llvm-branch-commits] [llvm] adc55b5 - [X86] Avoid generating invalid R_X86_64_GOTPCRELX relocations

2020-12-18 Thread Harald van Dijk via llvm-branch-commits

Author: Harald van Dijk
Date: 2020-12-18T23:38:38Z
New Revision: adc55b5a5ae49f1fe3a04f7f79b1c08f508b4307

URL: 
https://github.com/llvm/llvm-project/commit/adc55b5a5ae49f1fe3a04f7f79b1c08f508b4307
DIFF: 
https://github.com/llvm/llvm-project/commit/adc55b5a5ae49f1fe3a04f7f79b1c08f508b4307.diff

LOG: [X86] Avoid generating invalid R_X86_64_GOTPCRELX relocations

We need to make sure not to emit R_X86_64_GOTPCRELX relocations for
instructions that use a REX prefix. If a REX prefix is present, we need to
instead use an R_X86_64_REX_GOTPCRELX relocation. The existing logic for
CALL64m, JMP64m, etc. already handles this by checking the HasREX parameter
and using it to determine which relocation type to use. Do this for all
instructions that can use relaxed relocations.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D93561

Added: 


Modified: 
lld/test/ELF/x86-64-gotpc-relax-nopic.s
llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
llvm/test/MC/X86/gotpcrelx.s

Removed: 
llvm/test/MC/ELF/got-relaxed-rex.s



diff  --git a/lld/test/ELF/x86-64-gotpc-relax-nopic.s 
b/lld/test/ELF/x86-64-gotpc-relax-nopic.s
index 501414f7bdde..81d25f9ecafb 100644
--- a/lld/test/ELF/x86-64-gotpc-relax-nopic.s
+++ b/lld/test/ELF/x86-64-gotpc-relax-nopic.s
@@ -23,8 +23,8 @@
 # DISASM-NEXT: orl   {{.*}}(%rip), %edi  # 202240
 # DISASM-NEXT: sbbl  {{.*}}(%rip), %esi  # 202240
 # DISASM-NEXT: subl  {{.*}}(%rip), %ebp  # 202240
-# DISASM-NEXT: xorl  {{.*}}(%rip), %r8d  # 202240
-# DISASM-NEXT: testl %r15d, {{.*}}(%rip) # 202240
+# DISASM-NEXT: xorl  $0x203248, %r8d
+# DISASM-NEXT: testl $0x203248, %r15d
 # DISASM-NEXT:   201200:   adcq  $0x203248, %rax
 # DISASM-NEXT: addq  $0x203248, %rbx
 # DISASM-NEXT: andq  $0x203248, %rcx

diff  --git a/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp 
b/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
index 59860cad01f7..260253a5302d 100644
--- a/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
+++ b/llvm/lib/Target/X86/MCTargetDesc/X86MCCodeEmitter.cpp
@@ -409,6 +409,12 @@ void X86MCCodeEmitter::emitMemModRMByte(const MCInst &MI, 
unsigned Op,
   switch (Opcode) {
   default:
 return X86::reloc_riprel_4byte;
+  case X86::MOV64rm:
+// movq loads is a subset of reloc_riprel_4byte_relax_rex. It is a
+// special case because COFF and Mach-O don't support ELF's more
+// flexible R_X86_64_REX_GOTPCRELX relaxation.
+assert(HasREX);
+return X86::reloc_riprel_4byte_movq_load;
   case X86::ADC32rm:
   case X86::ADD32rm:
   case X86::AND32rm:
@@ -419,13 +425,6 @@ void X86MCCodeEmitter::emitMemModRMByte(const MCInst &MI, 
unsigned Op,
   case X86::SUB32rm:
   case X86::TEST32mr:
   case X86::XOR32rm:
-return X86::reloc_riprel_4byte_relax;
-  case X86::MOV64rm:
-// movq loads is a subset of reloc_riprel_4byte_relax_rex. It is a
-// special case because COFF and Mach-O don't support ELF's more
-// flexible R_X86_64_REX_GOTPCRELX relaxation.
-assert(HasREX);
-return X86::reloc_riprel_4byte_movq_load;
   case X86::CALL64m:
   case X86::JMP64m:
   case X86::TAILJMPm64:

diff  --git a/llvm/test/MC/ELF/got-relaxed-rex.s 
b/llvm/test/MC/ELF/got-relaxed-rex.s
deleted file mode 100644
index 1924bddc473e..
--- a/llvm/test/MC/ELF/got-relaxed-rex.s
+++ /dev/null
@@ -1,36 +0,0 @@
-// RUN: llvm-mc -filetype=obj -triple x86_64-pc-linux %s -o - | llvm-readobj 
-r - | FileCheck %s
-
-// these should produce R_X86_64_REX_GOTPCRELX
-
-movq mov@GOTPCREL(%rip), %rax
-test %rax, test@GOTPCREL(%rip)
-adc adc@GOTPCREL(%rip), %rax
-add add@GOTPCREL(%rip), %rax
-and and@GOTPCREL(%rip), %rax
-cmp cmp@GOTPCREL(%rip), %rax
-or  or@GOTPCREL(%rip), %rax
-sbb sbb@GOTPCREL(%rip), %rax
-sub sub@GOTPCREL(%rip), %rax
-xor xor@GOTPCREL(%rip), %rax
-
-.section .norelax,"ax"
-## This expression loads the GOT entry with an offset.
-## Don't emit R_X86_64_REX_GOTPCRELX.
-movq mov@GOTPCREL+1(%rip), %rax
-
-// CHECK:  Relocations [
-// CHECK-NEXT:   Section ({{.*}}) .rela.text {
-// CHECK-NEXT: R_X86_64_REX_GOTPCRELX mov
-// CHECK-NEXT: R_X86_64_REX_GOTPCRELX test
-// CHECK-NEXT: R_X86_64_REX_GOTPCRELX adc
-// CHECK-NEXT: R_X86_64_REX_GOTPCRELX add
-// CHECK-NEXT: R_X86_64_REX_GOTPCRELX and
-// CHECK-NEXT: R_X86_64_REX_GOTPCRELX cmp
-// CHECK-NEXT: R_X86_64_REX_GOTPCRELX or
-// CHECK-NEXT: R_X86_64_REX_GOTPCRELX sbb
-// CHECK-NEXT: R_X86_64_REX_GOTPCRELX sub
-// CHECK-NEXT: R_X86_64_REX_GOTPCRELX xor
-// CHECK-NEXT:   }
-// CHECK-NEXT:   Section ({{.*}}) .rela.norelax {
-// CHECK-N

[llvm-branch-commits] [llvm] 47c902b - [X86] Have indirect calls take 64-bit operands in 64-bit modes

2020-11-28 Thread Harald van Dijk via llvm-branch-commits

Author: Harald van Dijk
Date: 2020-11-28T16:46:30Z
New Revision: 47c902ba8479fc1faed73b86f59d58830df06644

URL: 
https://github.com/llvm/llvm-project/commit/47c902ba8479fc1faed73b86f59d58830df06644
DIFF: 
https://github.com/llvm/llvm-project/commit/47c902ba8479fc1faed73b86f59d58830df06644.diff

LOG: [X86] Have indirect calls take 64-bit operands in 64-bit modes

The build bots caught two additional pre-existing problems, exposed by the test
changes that are part of my change https://reviews.llvm.org/D91339 when
expensive checks are enabled. This fixes one of them.

X86 has CALL64r and CALL32r opcodes, where CALL64r takes a 64-bit register and
CALL32r takes a 32-bit register. CALL64r can only be used in 64-bit mode, and
CALL32r can only be used in 32-bit mode. LLVM assumed that after picking the
appropriate CALLr opcode, a pointer-sized register would be a valid operand,
but x32 mode is a 64-bit mode in which pointers are only 32 bits. In this mode
it is invalid to pass a pointer register directly to CALL64r; it needs to be
extended to 64 bits first.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D91924

Added: 


Modified: 
llvm/lib/Target/X86/X86FastISel.cpp

Removed: 




diff  --git a/llvm/lib/Target/X86/X86FastISel.cpp 
b/llvm/lib/Target/X86/X86FastISel.cpp
index 15b04c0c7357..a8db3d416c2e 100644
--- a/llvm/lib/Target/X86/X86FastISel.cpp
+++ b/llvm/lib/Target/X86/X86FastISel.cpp
@@ -1082,13 +1082,35 @@ bool X86FastISel::X86SelectCallAddress(const Value *V, 
X86AddressMode &AM) {
 
   // If all else fails, try to materialize the value in a register.
   if (!AM.GV || !Subtarget->isPICStyleRIPRel()) {
+auto GetCallRegForValue = [this](const Value *V) {
+  Register Reg = getRegForValue(V);
+
+  // In 64-bit mode, we need a 64-bit register even if pointers are 32 
bits.
+  if (Reg && Subtarget->isTarget64BitILP32()) {
+Register CopyReg = createResultReg(&X86::GR32RegClass);
+BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc, 
TII.get(X86::MOV32rr),
+CopyReg)
+.addReg(Reg);
+
+Register ExtReg = createResultReg(&X86::GR64RegClass);
+BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc,
+TII.get(TargetOpcode::SUBREG_TO_REG), ExtReg)
+.addImm(0)
+.addReg(CopyReg)
+.addImm(X86::sub_32bit);
+Reg = ExtReg;
+  }
+
+  return Reg;
+};
+
 if (AM.Base.Reg == 0) {
-  AM.Base.Reg = getRegForValue(V);
+  AM.Base.Reg = GetCallRegForValue(V);
   return AM.Base.Reg != 0;
 }
 if (AM.IndexReg == 0) {
   assert(AM.Scale == 1 && "Scale with no index!");
-  AM.IndexReg = getRegForValue(V);
+  AM.IndexReg = GetCallRegForValue(V);
   return AM.IndexReg != 0;
 }
   }





[llvm-branch-commits] [llvm] 47e2faf - [X86] Do not allow FixupSetCC to relax constraints

2020-11-28 Thread Harald van Dijk via llvm-branch-commits

Author: Harald van Dijk
Date: 2020-11-28T17:46:56Z
New Revision: 47e2fafbf3d933532f46ef6e8515e7005df52758

URL: 
https://github.com/llvm/llvm-project/commit/47e2fafbf3d933532f46ef6e8515e7005df52758
DIFF: 
https://github.com/llvm/llvm-project/commit/47e2fafbf3d933532f46ef6e8515e7005df52758.diff

LOG: [X86] Do not allow FixupSetCC to relax constraints

The build bots caught two additional pre-existing problems, exposed by the test
changes that are part of my change https://reviews.llvm.org/D91339 when
expensive checks are enabled. https://reviews.llvm.org/D91924 fixes one of
them; this fixes the other.

FixupSetCC will change code in the form of

  %setcc = SETCCr ...
  %ext1 = MOVZX32rr8 %setcc

to

  %zero = MOV32r0
  %setcc = SETCCr ...
  %ext2 = INSERT_SUBREG %zero, %setcc, %subreg.sub_8bit

and replace uses of %ext1 with %ext2.

The register class for %ext2 did not take into account any constraints on
%ext1 that may have been required by its uses. This change ensures that the
original constraints are honoured by reusing %ext1 and further constraining it
as needed, instead of creating a new %ext2 register. This requires a slight
reorganisation to account for the fact that the constraining can fail, in which
case no changes should be made.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D91933

Added: 


Modified: 
llvm/lib/Target/X86/X86FixupSetCC.cpp

Removed: 




diff  --git a/llvm/lib/Target/X86/X86FixupSetCC.cpp 
b/llvm/lib/Target/X86/X86FixupSetCC.cpp
index 09668d7c5468..269f8ce6bd7a 100644
--- a/llvm/lib/Target/X86/X86FixupSetCC.cpp
+++ b/llvm/lib/Target/X86/X86FixupSetCC.cpp
@@ -97,28 +97,31 @@ bool 
X86FixupSetCCPass::runOnMachineFunction(MachineFunction &MF) {
   if (FlagsDefMI->readsRegister(X86::EFLAGS))
 continue;
 
-  ++NumSubstZexts;
-  Changed = true;
-
   // On 32-bit, we need to be careful to force an ABCD register.
   const TargetRegisterClass *RC = MF.getSubtarget<X86Subtarget>().is64Bit()
   ? &X86::GR32RegClass
   : &X86::GR32_ABCDRegClass;
-  Register ZeroReg = MRI->createVirtualRegister(RC);
-  Register InsertReg = MRI->createVirtualRegister(RC);
+  if (!MRI->constrainRegClass(ZExt->getOperand(0).getReg(), RC)) {
+// If we cannot constrain the register, we would need an additional 
copy
+// and are better off keeping the MOVZX32rr8 we have now.
+continue;
+  }
+
+  ++NumSubstZexts;
+  Changed = true;
 
   // Initialize a register with 0. This must go before the eflags def
+  Register ZeroReg = MRI->createVirtualRegister(RC);
   BuildMI(MBB, FlagsDefMI, MI.getDebugLoc(), TII->get(X86::MOV32r0),
   ZeroReg);
 
   // X86 setcc only takes an output GR8, so fake a GR32 input by inserting
   // the setcc result into the low byte of the zeroed register.
   BuildMI(*ZExt->getParent(), ZExt, ZExt->getDebugLoc(),
-  TII->get(X86::INSERT_SUBREG), InsertReg)
+  TII->get(X86::INSERT_SUBREG), ZExt->getOperand(0).getReg())
   .addReg(ZeroReg)
   .addReg(MI.getOperand(0).getReg())
   .addImm(X86::sub_8bit);
-  MRI->replaceRegWith(ZExt->getOperand(0).getReg(), InsertReg);
   ToErase.push_back(ZExt);
 }
   }





[llvm-branch-commits] [llvm] 78a30c8 - [X86] Add -verify-machineinstrs to pic.ll

2020-11-28 Thread Harald van Dijk via llvm-branch-commits

Author: Harald van Dijk
Date: 2020-11-28T17:54:44Z
New Revision: 78a30c830b53dcce32e8d20a966448862106

URL: 
https://github.com/llvm/llvm-project/commit/78a30c830b53dcce32e8d20a966448862106
DIFF: 
https://github.com/llvm/llvm-project/commit/78a30c830b53dcce32e8d20a966448862106.diff

LOG: [X86] Add -verify-machineinstrs to pic.ll

This ensures that failures show up in regular builds, rather than only
when expensive checks are enabled.

Differential Revision: https://reviews.llvm.org/D91339

Added: 


Modified: 
llvm/test/CodeGen/X86/pic.ll

Removed: 




diff  --git a/llvm/test/CodeGen/X86/pic.ll b/llvm/test/CodeGen/X86/pic.ll
index 1de4ca0059d0..34165d47bcb1 100644
--- a/llvm/test/CodeGen/X86/pic.ll
+++ b/llvm/test/CodeGen/X86/pic.ll
@@ -1,6 +1,6 @@
-; RUN: llc < %s -mcpu=generic -mtriple=i686-pc-linux-gnu -relocation-model=pic 
-asm-verbose=false -post-RA-scheduler=false | FileCheck %s 
-check-prefixes=CHECK,CHECK-I686
-; RUN: llc < %s -mcpu=generic -mtriple=x86_64-pc-linux-gnux32 
-relocation-model=pic -asm-verbose=false -post-RA-scheduler=false | FileCheck 
%s -check-prefixes=CHECK,CHECK-X32
-; RUN: llc < %s -mcpu=generic -mtriple=x86_64-pc-linux-gnux32 
-relocation-model=pic -asm-verbose=false -post-RA-scheduler=false -fast-isel | 
FileCheck %s -check-prefixes=CHECK,CHECK-X32
+; RUN: llc < %s -mcpu=generic -mtriple=i686-pc-linux-gnu -relocation-model=pic 
-asm-verbose=false -post-RA-scheduler=false -verify-machineinstrs | FileCheck 
%s -check-prefixes=CHECK,CHECK-I686
+; RUN: llc < %s -mcpu=generic -mtriple=x86_64-pc-linux-gnux32 
-relocation-model=pic -asm-verbose=false -post-RA-scheduler=false 
-verify-machineinstrs | FileCheck %s -check-prefixes=CHECK,CHECK-X32
+; RUN: llc < %s -mcpu=generic -mtriple=x86_64-pc-linux-gnux32 
-relocation-model=pic -asm-verbose=false -post-RA-scheduler=false -fast-isel 
-verify-machineinstrs | FileCheck %s -check-prefixes=CHECK,CHECK-X32
 
 @ptr = external global i32* 
 @dst = external global i32 





[llvm-branch-commits] [libcxx] fba0b65 - [libc++] hash: adjust for x86-64 ILP32

2020-11-29 Thread Harald van Dijk via llvm-branch-commits

Author: Harald van Dijk
Date: 2020-11-29T13:52:28Z
New Revision: fba0b65f727134e8d05c785b04b7b574f852d49e

URL: 
https://github.com/llvm/llvm-project/commit/fba0b65f727134e8d05c785b04b7b574f852d49e
DIFF: 
https://github.com/llvm/llvm-project/commit/fba0b65f727134e8d05c785b04b7b574f852d49e.diff

LOG: [libc++] hash: adjust for x86-64 ILP32

x86-64 ILP32 mode (x32) uses a 32-bit size_t, so share the padding-bit zeroing
code with ix86 rather than with x86-64 LP64 mode.
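
For context, a rough sketch of why the padding matters and why the size_t width
decides which branch applies (illustrative only; the actual libc++ code uses a
union of size_t fields rather than memcpy):

  #include <cstddef>
  #include <cstring>

  // x86 long double stores an 80-bit value in 12 bytes (i386) or 16 bytes
  // (x86-64 LP64 and x32) of storage; the remaining bytes are indeterminate,
  // so they must be cleared or equal values could hash differently.
  std::size_t hash_long_double(long double v) {
    static_assert(sizeof(long double) >= 10, "assumes x87 80-bit long double");
    if (v == 0.0L)
      return 0;                       // -0.0L and 0.0L must hash the same
    unsigned char buf[sizeof(long double)] = {};  // zeroed, padding included
    std::memcpy(buf, &v, 10);         // copy only the 80-bit value (little endian)
    std::size_t h = 0;
    for (std::size_t i = 0; i + sizeof(std::size_t) <= sizeof buf;
         i += sizeof(std::size_t)) {
      std::size_t chunk = 0;          // 4 bytes on i386 and x32, 8 on LP64
      std::memcpy(&chunk, buf + i, sizeof chunk);
      h ^= chunk;
    }
    return h;
  }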

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D91349

Added: 


Modified: 
libcxx/include/utility

Removed: 




diff  --git a/libcxx/include/utility b/libcxx/include/utility
index 13489de22c95..5c9e2b6ddef2 100644
--- a/libcxx/include/utility
+++ b/libcxx/include/utility
@@ -1506,7 +1506,7 @@ struct _LIBCPP_TEMPLATE_VIS hash<long double>
 // -0.0 and 0.0 should return same hash
 if (__v == 0.0L)
 return 0;
-#if defined(__i386__)
+#if defined(__i386__) || (defined(__x86_64__) && defined(__ILP32__))
 // Zero out padding bits
 union
 {





[llvm-branch-commits] [llvm] cdac34b - [X86] Zero-extend pointers to i64 for x86_64

2020-11-30 Thread Harald van Dijk via llvm-branch-commits

Author: Harald van Dijk
Date: 2020-11-30T18:51:23Z
New Revision: cdac34bd47a34337579e50dedc119548b379f20e

URL: 
https://github.com/llvm/llvm-project/commit/cdac34bd47a34337579e50dedc119548b379f20e
DIFF: 
https://github.com/llvm/llvm-project/commit/cdac34bd47a34337579e50dedc119548b379f20e.diff

LOG: [X86] Zero-extend pointers to i64 for x86_64

For LP64 mode, this has no effect as pointers are already 64 bits.
For ILP32 mode (x32), this extension is specified by the ABI.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D91338

Added: 


Modified: 
llvm/lib/Target/X86/X86CallingConv.cpp
llvm/lib/Target/X86/X86CallingConv.td
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/musttail-varargs.ll
llvm/test/CodeGen/X86/pr38865-2.ll
llvm/test/CodeGen/X86/pr38865-3.ll
llvm/test/CodeGen/X86/pr38865.ll
llvm/test/CodeGen/X86/sibcall.ll
llvm/test/CodeGen/X86/x32-function_pointer-2.ll
llvm/test/CodeGen/X86/x86-64-sret-return.ll

Removed: 




diff  --git a/llvm/lib/Target/X86/X86CallingConv.cpp 
b/llvm/lib/Target/X86/X86CallingConv.cpp
index c899db60e016..c80a5d5bb332 100644
--- a/llvm/lib/Target/X86/X86CallingConv.cpp
+++ b/llvm/lib/Target/X86/X86CallingConv.cpp
@@ -330,5 +330,15 @@ static bool CC_X86_Intr(unsigned &ValNo, MVT &ValVT, MVT 
&LocVT,
   return true;
 }
 
+static bool CC_X86_64_Pointer(unsigned &ValNo, MVT &ValVT, MVT &LocVT,
+  CCValAssign::LocInfo &LocInfo,
+  ISD::ArgFlagsTy &ArgFlags, CCState &State) {
+  if (LocVT != MVT::i64) {
+LocVT = MVT::i64;
+LocInfo = CCValAssign::ZExt;
+  }
+  return false;
+}
+
 // Provides entry points of CC_X86 and RetCC_X86.
 #include "X86GenCallingConv.inc"

diff  --git a/llvm/lib/Target/X86/X86CallingConv.td 
b/llvm/lib/Target/X86/X86CallingConv.td
index 802e694999b6..9e414ceeb781 100644
--- a/llvm/lib/Target/X86/X86CallingConv.td
+++ b/llvm/lib/Target/X86/X86CallingConv.td
@@ -336,6 +336,9 @@ def RetCC_X86_64_C : CallingConv<[
   // MMX vector types are always returned in XMM0.
   CCIfType<[x86mmx], CCAssignToReg<[XMM0, XMM1]>>,
 
+  // Pointers are always returned in full 64-bit registers.
+  CCIfPtr<CCCustom<"CC_X86_64_Pointer">>,
+
   CCIfSwiftError>>,
 
   CCDelegateTo
@@ -518,6 +521,9 @@ def CC_X86_64_C : CallingConv<[
   CCIfCC<"CallingConv::Swift",
 CCIfSRet>>>,
 
+  // Pointers are always passed in full 64-bit registers.
+  CCIfPtr<CCCustom<"CC_X86_64_Pointer">>,
+
   // The first 6 integer arguments are passed in integer registers.
   CCIfType<[i32], CCAssignToReg<[EDI, ESI, EDX, ECX, R8D, R9D]>>,
   CCIfType<[i64], CCAssignToReg<[RDI, RSI, RDX, RCX, R8 , R9 ]>>,

diff  --git a/llvm/lib/Target/X86/X86ISelLowering.cpp 
b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 6f5f198544c8..1274582614ed 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -3067,8 +3067,9 @@ SDValue X86TargetLowering::LowerCallResult(
 // This truncation won't change the value.
 DAG.getIntPtrConstant(1, dl));
 
-if (VA.isExtInLoc() && (VA.getValVT().getScalarType() == MVT::i1)) {
+if (VA.isExtInLoc()) {
   if (VA.getValVT().isVector() &&
+  VA.getValVT().getScalarType() == MVT::i1 &&
   ((VA.getLocVT() == MVT::i64) || (VA.getLocVT() == MVT::i32) ||
(VA.getLocVT() == MVT::i16) || (VA.getLocVT() == MVT::i8))) {
 // promoting a mask type (v*i1) into a register of type i64/i32/i16/i8

diff  --git a/llvm/test/CodeGen/X86/musttail-varargs.ll 
b/llvm/test/CodeGen/X86/musttail-varargs.ll
index 6e293935911d..f99bdeca019a 100644
--- a/llvm/test/CodeGen/X86/musttail-varargs.ll
+++ b/llvm/test/CodeGen/X86/musttail-varargs.ll
@@ -136,7 +136,7 @@ define void @f_thunk(i8* %this, ...) {
 ; LINUX-X32-NEXT:movq %rcx, %r13
 ; LINUX-X32-NEXT:movq %rdx, %rbp
 ; LINUX-X32-NEXT:movq %rsi, %rbx
-; LINUX-X32-NEXT:movl %edi, %r14d
+; LINUX-X32-NEXT:movq %rdi, %r14
 ; LINUX-X32-NEXT:movb %al, {{[-0-9]+}}(%e{{[sb]}}p) # 1-byte Spill
 ; LINUX-X32-NEXT:testb %al, %al
 ; LINUX-X32-NEXT:je .LBB0_2
@@ -161,7 +161,7 @@ define void @f_thunk(i8* %this, ...) {
 ; LINUX-X32-NEXT:movl %eax, {{[0-9]+}}(%esp)
 ; LINUX-X32-NEXT:movabsq $206158430216, %rax # imm = 0x38
 ; LINUX-X32-NEXT:movq %rax, {{[0-9]+}}(%esp)
-; LINUX-X32-NEXT:movl %r14d, %edi
+; LINUX-X32-NEXT:movq %r14, %rdi
 ; LINUX-X32-NEXT:movaps %xmm7, {{[-0-9]+}}(%e{{[sb]}}p) # 16-byte Spill
 ; LINUX-X32-NEXT:movaps %xmm6, {{[-0-9]+}}(%e{{[sb]}}p) # 16-byte Spill
 ; LINUX-X32-NEXT:movaps %xmm5, {{[-0-9]+}}(%e{{[sb]}}p) # 16-byte Spill
@@ -172,7 +172,7 @@ define void @f_thunk(i8* %this, ...) {
 ; LINUX-X32-NEXT:movaps %xmm0, {{[-0-9]+}}(%e{{[sb]}}p) # 16-byte Spill
 ; LINUX-X32-NEXT:callq get_f
 ; LINUX-X32-NEXT:movl %eax, %r11d
-; LINUX-X32-NEXT:movl %r14d, %edi
+; LINUX-X3

[llvm-branch-commits] [llvm] 18ce612 - Use PC-relative address for x32 TLS address

2020-12-02 Thread Harald van Dijk via llvm-branch-commits

Author: H.J. Lu
Date: 2020-12-02T22:20:36Z
New Revision: 18ce612353795da6838aade2b933503cbe3cf9b9

URL: 
https://github.com/llvm/llvm-project/commit/18ce612353795da6838aade2b933503cbe3cf9b9
DIFF: 
https://github.com/llvm/llvm-project/commit/18ce612353795da6838aade2b933503cbe3cf9b9.diff

LOG: Use PC-relative address for x32 TLS address

Since x32 supports PC-relative addressing, it shouldn't use EBX for the TLS
address.  Instead of checking N.getValueType(), we should check
Subtarget->is32Bit().  This fixes PR 22676.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D16474

Added: 


Modified: 
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp

Removed: 




diff  --git a/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp 
b/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
index 9a16dc17ba61..de8c2f345fb5 100644
--- a/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
+++ b/llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
@@ -2694,12 +2694,12 @@ bool X86DAGToDAGISel::selectTLSADDRAddr(SDValue N, 
SDValue &Base,
   AM.Disp += GA->getOffset();
   AM.SymbolFlags = GA->getTargetFlags();
 
-  MVT VT = N.getSimpleValueType();
-  if (VT == MVT::i32) {
+  if (Subtarget->is32Bit()) {
 AM.Scale = 1;
 AM.IndexReg = CurDAG->getRegister(X86::EBX, MVT::i32);
   }
 
+  MVT VT = N.getSimpleValueType();
   getAddressOperands(AM, SDLoc(N), VT, Base, Scale, Index, Disp, Segment);
   return true;
 }





[llvm-branch-commits] [llvm] c9be4ef - [X86] Add TLS_(base_)addrX32 for X32 mode

2020-12-02 Thread Harald van Dijk via llvm-branch-commits

Author: Harald van Dijk
Date: 2020-12-02T22:20:36Z
New Revision: c9be4ef184c1a8cb042ae846f9f1818b3ffcddb0

URL: 
https://github.com/llvm/llvm-project/commit/c9be4ef184c1a8cb042ae846f9f1818b3ffcddb0
DIFF: 
https://github.com/llvm/llvm-project/commit/c9be4ef184c1a8cb042ae846f9f1818b3ffcddb0.diff

LOG: [X86] Add TLS_(base_)addrX32 for X32 mode

LLVM has TLS_(base_)addr32 for 32-bit TLS addresses in 32-bit mode, and
TLS_(base_)addr64 for 64-bit TLS addresses in 64-bit mode. x32 mode wants 32-bit
TLS addresses in 64-bit mode, which were not yet handled. This adds
TLS_(base_)addrX32 as copies of TLS_(base_)addr64, except that they use
tls32(base)addr rather than tls64(base)addr, and then restricts
TLS_(base_)addr64 to 64-bit LP64 mode and TLS_(base_)addrX32 to 64-bit ILP32
mode.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D92346

Added: 


Modified: 
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/lib/Target/X86/X86InstrCompiler.td
llvm/lib/Target/X86/X86MCInstLower.cpp
llvm/test/CodeGen/X86/pic.ll

Removed: 




diff  --git a/llvm/lib/Target/X86/X86ISelLowering.cpp 
b/llvm/lib/Target/X86/X86ISelLowering.cpp
index 56e098a48dd5..caf9e690f5b8 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -19123,7 +19123,7 @@ LowerToTLSGeneralDynamicModel32(GlobalAddressSDNode 
*GA, SelectionDAG &DAG,
   return GetTLSADDR(DAG, Chain, GA, &InFlag, PtrVT, X86::EAX, X86II::MO_TLSGD);
 }
 
-// Lower ISD::GlobalTLSAddress using the "general dynamic" model, 64 bit
+// Lower ISD::GlobalTLSAddress using the "general dynamic" model, 64 bit LP64
 static SDValue
 LowerToTLSGeneralDynamicModel64(GlobalAddressSDNode *GA, SelectionDAG &DAG,
 const EVT PtrVT) {
@@ -19131,6 +19131,14 @@ LowerToTLSGeneralDynamicModel64(GlobalAddressSDNode 
*GA, SelectionDAG &DAG,
 X86::RAX, X86II::MO_TLSGD);
 }
 
+// Lower ISD::GlobalTLSAddress using the "general dynamic" model, 64 bit ILP32
+static SDValue
+LowerToTLSGeneralDynamicModelX32(GlobalAddressSDNode *GA, SelectionDAG &DAG,
+ const EVT PtrVT) {
+  return GetTLSADDR(DAG, DAG.getEntryNode(), GA, nullptr, PtrVT,
+X86::EAX, X86II::MO_TLSGD);
+}
+
 static SDValue LowerToTLSLocalDynamicModel(GlobalAddressSDNode *GA,
SelectionDAG &DAG,
const EVT PtrVT,
@@ -19241,8 +19249,11 @@ X86TargetLowering::LowerGlobalTLSAddress(SDValue Op, 
SelectionDAG &DAG) const {
 TLSModel::Model model = DAG.getTarget().getTLSModel(GV);
 switch (model) {
   case TLSModel::GeneralDynamic:
-if (Subtarget.is64Bit())
-  return LowerToTLSGeneralDynamicModel64(GA, DAG, PtrVT);
+if (Subtarget.is64Bit()) {
+  if (Subtarget.isTarget64BitLP64())
+return LowerToTLSGeneralDynamicModel64(GA, DAG, PtrVT);
+  return LowerToTLSGeneralDynamicModelX32(GA, DAG, PtrVT);
+}
 return LowerToTLSGeneralDynamicModel32(GA, DAG, PtrVT);
   case TLSModel::LocalDynamic:
 return LowerToTLSLocalDynamicModel(GA, DAG, PtrVT,
@@ -33511,8 +33522,10 @@ 
X86TargetLowering::EmitInstrWithCustomInserter(MachineInstr &MI,
   default: llvm_unreachable("Unexpected instr type to insert");
   case X86::TLS_addr32:
   case X86::TLS_addr64:
+  case X86::TLS_addrX32:
   case X86::TLS_base_addr32:
   case X86::TLS_base_addr64:
+  case X86::TLS_base_addrX32:
 return EmitLoweredTLSAddr(MI, BB);
   case X86::INDIRECT_THUNK_CALL32:
   case X86::INDIRECT_THUNK_CALL64:

diff  --git a/llvm/lib/Target/X86/X86InstrCompiler.td 
b/llvm/lib/Target/X86/X86InstrCompiler.td
index 9f180c4c91aa..0c9f972cf225 100644
--- a/llvm/lib/Target/X86/X86InstrCompiler.td
+++ b/llvm/lib/Target/X86/X86InstrCompiler.td
@@ -467,11 +467,19 @@ let Defs = [RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11,
 def TLS_addr64 : I<0, Pseudo, (outs), (ins i64mem:$sym),
"# TLS_addr64",
   [(X86tlsaddr tls64addr:$sym)]>,
-  Requires<[In64BitMode]>;
+  Requires<[In64BitMode, IsLP64]>;
 def TLS_base_addr64 : I<0, Pseudo, (outs), (ins i64mem:$sym),
"# TLS_base_addr64",
   [(X86tlsbaseaddr tls64baseaddr:$sym)]>,
-  Requires<[In64BitMode]>;
+  Requires<[In64BitMode, IsLP64]>;
+def TLS_addrX32 : I<0, Pseudo, (outs), (ins i32mem:$sym),
+   "# TLS_addrX32",
+  [(X86tlsaddr tls32addr:$sym)]>,
+  Requires<[In64BitMode, NotLP64]>;
+def TLS_base_addrX32 : I<0, Pseudo, (outs), (ins i32mem:$sym),
+   "# TLS_base_addrX32",
+  [(X86tlsbaseaddr tls32baseaddr:$sym)]>,
+  Requires<[In64BitMode, NotLP64]>;
 }
 
 // Darwin TLS Support

diff  --git a/llvm/lib/Target/X

[llvm-branch-commits] [llvm] 29c8ea6 - [X86] Handle localdynamic TLS model in x32 mode

2020-12-08 Thread Harald van Dijk via llvm-branch-commits

Author: Harald van Dijk
Date: 2020-12-08T21:06:00Z
New Revision: 29c8ea6f1abd6606b65dafd3db8f15c8104c2593

URL: 
https://github.com/llvm/llvm-project/commit/29c8ea6f1abd6606b65dafd3db8f15c8104c2593
DIFF: 
https://github.com/llvm/llvm-project/commit/29c8ea6f1abd6606b65dafd3db8f15c8104c2593.diff

LOG: [X86] Handle localdynamic TLS model in x32 mode

D92346 added TLS_(base_)addrX32 to handle TLS in x32 mode, but missed the
different TLS models. This diff fixes the logic for the local dynamic model,
where `RAX` was used when `EAX` should have been, and extends the tests to
cover all four TLS models.

Fixes https://bugs.llvm.org/show_bug.cgi?id=26472.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D92737

Added: 


Modified: 
llvm/lib/Target/X86/X86ISelLowering.cpp
llvm/test/CodeGen/X86/pic.ll

Removed: 




diff  --git a/llvm/lib/Target/X86/X86ISelLowering.cpp 
b/llvm/lib/Target/X86/X86ISelLowering.cpp
index e97f4b12323d6..e437b9291148d 100644
--- a/llvm/lib/Target/X86/X86ISelLowering.cpp
+++ b/llvm/lib/Target/X86/X86ISelLowering.cpp
@@ -19142,9 +19142,8 @@ LowerToTLSGeneralDynamicModelX32(GlobalAddressSDNode 
*GA, SelectionDAG &DAG,
 }
 
 static SDValue LowerToTLSLocalDynamicModel(GlobalAddressSDNode *GA,
-   SelectionDAG &DAG,
-   const EVT PtrVT,
-   bool is64Bit) {
+   SelectionDAG &DAG, const EVT PtrVT,
+   bool Is64Bit, bool Is64BitLP64) {
   SDLoc dl(GA);
 
   // Get the start address of the TLS block for this module.
@@ -19153,8 +19152,9 @@ static SDValue 
LowerToTLSLocalDynamicModel(GlobalAddressSDNode *GA,
   MFI->incNumLocalDynamicTLSAccesses();
 
   SDValue Base;
-  if (is64Bit) {
-Base = GetTLSADDR(DAG, DAG.getEntryNode(), GA, nullptr, PtrVT, X86::RAX,
+  if (Is64Bit) {
+unsigned ReturnReg = Is64BitLP64 ? X86::RAX : X86::EAX;
+Base = GetTLSADDR(DAG, DAG.getEntryNode(), GA, nullptr, PtrVT, ReturnReg,
   X86II::MO_TLSLD, /*LocalDynamic=*/true);
   } else {
 SDValue InFlag;
@@ -19258,8 +19258,8 @@ X86TargetLowering::LowerGlobalTLSAddress(SDValue Op, 
SelectionDAG &DAG) const {
 }
 return LowerToTLSGeneralDynamicModel32(GA, DAG, PtrVT);
   case TLSModel::LocalDynamic:
-return LowerToTLSLocalDynamicModel(GA, DAG, PtrVT,
-   Subtarget.is64Bit());
+return LowerToTLSLocalDynamicModel(GA, DAG, PtrVT, Subtarget.is64Bit(),
+   Subtarget.isTarget64BitLP64());
   case TLSModel::InitialExec:
   case TLSModel::LocalExec:
 return LowerToTLSExecModel(GA, DAG, PtrVT, model, Subtarget.is64Bit(),

diff  --git a/llvm/test/CodeGen/X86/pic.ll b/llvm/test/CodeGen/X86/pic.ll
index c936333a5726a..3f3417e89ad81 100644
--- a/llvm/test/CodeGen/X86/pic.ll
+++ b/llvm/test/CodeGen/X86/pic.ll
@@ -254,15 +254,24 @@ declare void @foo4(...)
 declare void @foo5(...)
 
 ;; Check TLS references
-@tlsptr = external thread_local global i32*
-@tlsdst = external thread_local global i32
-@tlssrc = external thread_local global i32
+@tlsptrgd = thread_local global i32* null
+@tlsdstgd = thread_local global i32 0
+@tlssrcgd = thread_local global i32 0
+@tlsptrld = thread_local(localdynamic) global i32* null
+@tlsdstld = thread_local(localdynamic) global i32 0
+@tlssrcld = thread_local(localdynamic) global i32 0
+@tlsptrie = thread_local(initialexec) global i32* null
+@tlsdstie = thread_local(initialexec) global i32 0
+@tlssrcie = thread_local(initialexec) global i32 0
+@tlsptrle = thread_local(localexec) global i32* null
+@tlsdstle = thread_local(localexec) global i32 0
+@tlssrcle = thread_local(localexec) global i32 0
 
 define void @test8() nounwind {
 entry:
-store i32* @tlsdst, i32** @tlsptr
-%tmp.s = load i32, i32* @tlssrc
-store i32 %tmp.s, i32* @tlsdst
+store i32* @tlsdstgd, i32** @tlsptrgd
+%tmp.s = load i32, i32* @tlssrcgd
+store i32 %tmp.s, i32* @tlsdstgd
 ret void
 
 ; CHECK-LABEL: test8:
@@ -270,18 +279,95 @@ entry:
 ; CHECK-I686-NEXT: .L8$pb:
 ; CHECK-I686-NEXT: popl
 ; CHECK-I686:  addl$_GLOBAL_OFFSET_TABLE_+(.L{{.*}}-.L8$pb), %ebx
-; CHECK-I686-DAG:  lealtlsdst@TLSGD(,%ebx), %eax
+; CHECK-I686-DAG:  lealtlsdstgd@TLSGD(,%ebx), %eax
 ; CHECK-I686-DAG:  calll   ___tls_get_addr@PLT
-; CHECK-I686-DAG:  lealtlsptr@TLSGD(,%ebx), %eax
+; CHECK-I686-DAG:  lealtlsptrgd@TLSGD(,%ebx), %eax
 ; CHECK-I686-DAG:  calll   ___tls_get_addr@PLT
-; CHECK-I686-DAG:  lealtlssrc@TLSGD(,%ebx), %eax
+; CHECK-I686-DAG:  lealtlssrcgd@TLSGD(,%ebx), %eax
 ; CHECK-I686-DAG:  calll   ___tls_get_addr@PLT
-; CHECK-X32-DAG:   leaqtlsdst@TLSGD(%rip), %rdi
+; CHECK-X32-DAG:   leaq   

[llvm-branch-commits] [llvm] f61e5ec - [X86] Avoid data16 prefix for lea in x32 mode

2020-12-12 Thread Harald van Dijk via llvm-branch-commits

Author: Harald van Dijk
Date: 2020-12-12T17:05:24Z
New Revision: f61e5ecb919b3901590328e69d3e4a557eefd788

URL: 
https://github.com/llvm/llvm-project/commit/f61e5ecb919b3901590328e69d3e4a557eefd788
DIFF: 
https://github.com/llvm/llvm-project/commit/f61e5ecb919b3901590328e69d3e4a557eefd788.diff

LOG: [X86] Avoid data16 prefix for lea in x32 mode

The ABI demands a data16 prefix for lea in 64-bit LP64 mode, but not in
64-bit ILP32 mode. In both modes this prefix would ordinarily be
ignored, but the instructions may be changed by the linker to
instructions that are affected by the prefix.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D93157

Added: 


Modified: 
llvm/lib/Target/X86/X86MCInstLower.cpp
llvm/test/CodeGen/X86/pic.ll

Removed: 




diff  --git a/llvm/lib/Target/X86/X86MCInstLower.cpp 
b/llvm/lib/Target/X86/X86MCInstLower.cpp
index 6602d819929e..29faaa2dad36 100644
--- a/llvm/lib/Target/X86/X86MCInstLower.cpp
+++ b/llvm/lib/Target/X86/X86MCInstLower.cpp
@@ -979,6 +979,8 @@ void X86AsmPrinter::LowerTlsAddr(X86MCInstLower 
&MCInstLowering,
   NoAutoPaddingScope NoPadScope(*OutStreamer);
   bool Is64Bits = MI.getOpcode() != X86::TLS_addr32 &&
   MI.getOpcode() != X86::TLS_base_addr32;
+  bool Is64BitsLP64 = MI.getOpcode() == X86::TLS_addr64 ||
+  MI.getOpcode() == X86::TLS_base_addr64;
   MCContext &Ctx = OutStreamer->getContext();
 
   MCSymbolRefExpr::VariantKind SRVK;
@@ -1012,7 +1014,7 @@ void X86AsmPrinter::LowerTlsAddr(X86MCInstLower 
&MCInstLowering,
 
   if (Is64Bits) {
 bool NeedsPadding = SRVK == MCSymbolRefExpr::VK_TLSGD;
-if (NeedsPadding)
+if (NeedsPadding && Is64BitsLP64)
   EmitAndCountInstruction(MCInstBuilder(X86::DATA16_PREFIX));
 EmitAndCountInstruction(MCInstBuilder(X86::LEA64r)
 .addReg(X86::RDI)

diff  --git a/llvm/test/CodeGen/X86/pic.ll b/llvm/test/CodeGen/X86/pic.ll
index 3f3417e89ad8..101c749633bc 100644
--- a/llvm/test/CodeGen/X86/pic.ll
+++ b/llvm/test/CodeGen/X86/pic.ll
@@ -285,6 +285,7 @@ entry:
 ; CHECK-I686-DAG:  calll   ___tls_get_addr@PLT
 ; CHECK-I686-DAG:  lealtlssrcgd@TLSGD(,%ebx), %eax
 ; CHECK-I686-DAG:  calll   ___tls_get_addr@PLT
+; CHECK-X32-NOT:   data16
 ; CHECK-X32-DAG:   leaqtlsdstgd@TLSGD(%rip), %rdi
 ; CHECK-X32-DAG:   callq   __tls_get_addr@PLT
 ; CHECK-X32-DAG:   leaqtlsptrgd@TLSGD(%rip), %rdi





[llvm-branch-commits] [llvm] 67c97ed - [UpdateTestChecks] Add --(no-)x86_scrub_sp option.

2020-12-12 Thread Harald van Dijk via llvm-branch-commits

Author: Harald van Dijk
Date: 2020-12-12T17:11:13Z
New Revision: 67c97ed4a5a99315b305750a7fc0aaa6744e3d37

URL: 
https://github.com/llvm/llvm-project/commit/67c97ed4a5a99315b305750a7fc0aaa6744e3d37
DIFF: 
https://github.com/llvm/llvm-project/commit/67c97ed4a5a99315b305750a7fc0aaa6744e3d37.diff

LOG: [UpdateTestChecks] Add --(no-)x86_scrub_sp option.

This makes it possible to use update_llc_test_checks to manage tests
that check for incorrect x86 stack offsets. It does not yet modify any
test to make use of this new option.
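
A test can opt in to exact stack-offset checking by regenerating its check
lines with the new flag; a usage sketch (the test file named here is only an
example):

  $ llvm/utils/update_llc_test_checks.py --no_x86_scrub_sp llvm/test/CodeGen/X86/x86-64-varargs.ll

The script records non-default arguments in the test's UTC_ARGS note, so later
regenerations keep the setting.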

Added: 


Modified: 
llvm/utils/UpdateTestChecks/asm.py
llvm/utils/update_llc_test_checks.py

Removed: 




diff  --git a/llvm/utils/UpdateTestChecks/asm.py 
b/llvm/utils/UpdateTestChecks/asm.py
index 24090fc2ea7e..476e7f1c75c9 100644
--- a/llvm/utils/UpdateTestChecks/asm.py
+++ b/llvm/utils/UpdateTestChecks/asm.py
@@ -177,8 +177,9 @@ def scrub_asm_x86(asm, args):
   # Detect stack spills and reloads and hide their exact offset and whether
   # they used the stack pointer or frame pointer.
   asm = SCRUB_X86_SPILL_RELOAD_RE.sub(r'{{[-0-9]+}}(%\1{{[sb]}}p)\2', asm)
-  # Generically match the stack offset of a memory operand.
-  asm = SCRUB_X86_SP_RE.sub(r'{{[0-9]+}}(%\1)', asm)
+  if getattr(args, 'x86_scrub_sp', True):
+# Generically match the stack offset of a memory operand.
+asm = SCRUB_X86_SP_RE.sub(r'{{[0-9]+}}(%\1)', asm)
   if getattr(args, 'x86_scrub_rip', False):
 # Generically match a RIP-relative memory operand.
 asm = SCRUB_X86_RIP_RE.sub(r'{{.*}}(%rip)', asm)

diff  --git a/llvm/utils/update_llc_test_checks.py 
b/llvm/utils/update_llc_test_checks.py
index b5422bd18791..2826b16fea2c 100755
--- a/llvm/utils/update_llc_test_checks.py
+++ b/llvm/utils/update_llc_test_checks.py
@@ -27,9 +27,14 @@ def main():
   parser.add_argument(
   '--extra_scrub', action='store_true',
   help='Always use additional regex to further reduce 
diff s between various subtargets')
+  parser.add_argument(
+  '--x86_scrub_sp', action='store_true', default=True,
+  help='Use regex for x86 sp matching to reduce 
diff s between various subtargets')
+  parser.add_argument(
+  '--no_x86_scrub_sp', action='store_false', dest='x86_scrub_sp')
   parser.add_argument(
   '--x86_scrub_rip', action='store_true', default=True,
-  help='Use more regex for x86 matching to reduce 
diff s between various subtargets')
+  help='Use more regex for x86 rip matching to reduce 
diff s between various subtargets')
   parser.add_argument(
   '--no_x86_scrub_rip', action='store_false', dest='x86_scrub_rip')
   parser.add_argument(





[llvm-branch-commits] [llvm] f99b4f5 - [X86] Extend varargs test

2020-12-13 Thread Harald van Dijk via llvm-branch-commits

Author: Harald van Dijk
Date: 2020-12-13T18:33:10Z
New Revision: f99b4f5241a3b3436b05355f5ea8588274254f8b

URL: 
https://github.com/llvm/llvm-project/commit/f99b4f5241a3b3436b05355f5ea8588274254f8b
DIFF: 
https://github.com/llvm/llvm-project/commit/f99b4f5241a3b3436b05355f5ea8588274254f8b.diff

LOG: [X86] Extend varargs test

This extends the existing x86-64-varargs test by passing enough arguments that
some need to be passed in memory, by passing them in reverse order and using
va_arg for each argument to retrieve them and restore them to the correct
order, and by using va_copy to obtain two va_lists to use with va_arg.

Added: 


Modified: 
llvm/test/CodeGen/X86/x86-64-varargs.ll

Removed: 




diff  --git a/llvm/test/CodeGen/X86/x86-64-varargs.ll 
b/llvm/test/CodeGen/X86/x86-64-varargs.ll
index 58f7c82c2123..48b757be2645 100644
--- a/llvm/test/CodeGen/X86/x86-64-varargs.ll
+++ b/llvm/test/CodeGen/X86/x86-64-varargs.ll
@@ -1,30 +1,326 @@
-; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc < %s -mtriple=x86_64-apple-darwin -code-model=large -relocation-model=static | FileCheck %s
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --no_x86_scrub_sp
+; RUN: llc < %s -mtriple=x86_64-apple-darwin -code-model=large -relocation-model=static | FileCheck --check-prefix=CHECK-X64 %s
 
-@.str = internal constant [26 x i8] c"%d, %f, %d, %lld, %d, %f\0A\00"  ; <[26 x i8]*> [#uses=1]
+@.str = internal constant [38 x i8] c"%d, %f, %d, %lld, %d, %f, %d, %d, %d\0A\00"  ; <[38 x i8]*> [#uses=1]
 
 declare i32 @printf(i8*, ...) nounwind
 
-define i32 @main() nounwind  {
-; CHECK-LABEL: main:
-; CHECK:   ## %bb.0: ## %entry
-; CHECK-NEXT:pushq %rax
-; CHECK-NEXT:movabsq $_.str, %rdi
-; CHECK-NEXT:movabsq $_printf, %r9
-; CHECK-NEXT:movabsq $LCPI0_0, %rax
-; CHECK-NEXT:movsd {{.*#+}} xmm0 = mem[0],zero
-; CHECK-NEXT:movabsq $LCPI0_1, %rax
-; CHECK-NEXT:movsd {{.*#+}} xmm1 = mem[0],zero
-; CHECK-NEXT:movabsq $123456677890, %rcx ## imm = 0x1CBE976802
-; CHECK-NEXT:movl $12, %esi
-; CHECK-NEXT:movl $120, %edx
-; CHECK-NEXT:movl $-10, %r8d
-; CHECK-NEXT:movb $2, %al
-; CHECK-NEXT:callq *%r9
-; CHECK-NEXT:xorl %eax, %eax
-; CHECK-NEXT:popq %rcx
-; CHECK-NEXT:retq
+declare void @llvm.va_start(i8*)
+declare void @llvm.va_copy(i8*, i8*)
+declare void @llvm.va_end(i8*)
+
+%struct.va_list = type { i32, i32, i8*, i8* }
+
+define void @func(...) nounwind {
+; CHECK-X64-LABEL: func:
+; CHECK-X64:   ## %bb.0: ## %entry
+; CHECK-X64-NEXT:pushq %rbx
+; CHECK-X64-NEXT:subq $224, %rsp
+; CHECK-X64-NEXT:testb %al, %al
+; CHECK-X64-NEXT:je LBB0_2
+; CHECK-X64-NEXT:  ## %bb.1: ## %entry
+; CHECK-X64-NEXT:movaps %xmm0, 96(%rsp)
+; CHECK-X64-NEXT:movaps %xmm1, 112(%rsp)
+; CHECK-X64-NEXT:movaps %xmm2, 128(%rsp)
+; CHECK-X64-NEXT:movaps %xmm3, 144(%rsp)
+; CHECK-X64-NEXT:movaps %xmm4, 160(%rsp)
+; CHECK-X64-NEXT:movaps %xmm5, 176(%rsp)
+; CHECK-X64-NEXT:movaps %xmm6, 192(%rsp)
+; CHECK-X64-NEXT:movaps %xmm7, 208(%rsp)
+; CHECK-X64-NEXT:  LBB0_2: ## %entry
+; CHECK-X64-NEXT:movq %rdi, 48(%rsp)
+; CHECK-X64-NEXT:movq %rsi, 56(%rsp)
+; CHECK-X64-NEXT:movq %rdx, 64(%rsp)
+; CHECK-X64-NEXT:movq %rcx, 72(%rsp)
+; CHECK-X64-NEXT:movq %r8, 80(%rsp)
+; CHECK-X64-NEXT:movq %r9, 88(%rsp)
+; CHECK-X64-NEXT:movabsq $206158430208, %rax ## imm = 0x3000000000
+; CHECK-X64-NEXT:movq %rax, (%rsp)
+; CHECK-X64-NEXT:leaq 240(%rsp), %rax
+; CHECK-X64-NEXT:movq %rax, 8(%rsp)
+; CHECK-X64-NEXT:leaq 48(%rsp), %rax
+; CHECK-X64-NEXT:movq %rax, 16(%rsp)
+; CHECK-X64-NEXT:movl (%rsp), %ecx
+; CHECK-X64-NEXT:cmpl $48, %ecx
+; CHECK-X64-NEXT:jae LBB0_4
+; CHECK-X64-NEXT:  ## %bb.3: ## %entry
+; CHECK-X64-NEXT:movq 16(%rsp), %rax
+; CHECK-X64-NEXT:addq %rcx, %rax
+; CHECK-X64-NEXT:addl $8, %ecx
+; CHECK-X64-NEXT:movl %ecx, (%rsp)
+; CHECK-X64-NEXT:jmp LBB0_5
+; CHECK-X64-NEXT:  LBB0_4: ## %entry
+; CHECK-X64-NEXT:movq 8(%rsp), %rax
+; CHECK-X64-NEXT:movq %rax, %rcx
+; CHECK-X64-NEXT:addq $8, %rcx
+; CHECK-X64-NEXT:movq %rcx, 8(%rsp)
+; CHECK-X64-NEXT:  LBB0_5: ## %entry
+; CHECK-X64-NEXT:movl (%rax), %r10d
+; CHECK-X64-NEXT:movl (%rsp), %ecx
+; CHECK-X64-NEXT:cmpl $48, %ecx
+; CHECK-X64-NEXT:jae LBB0_7
+; CHECK-X64-NEXT:  ## %bb.6: ## %entry
+; CHECK-X64-NEXT:movq 16(%rsp), %rax
+; CHECK-X64-NEXT:addq %rcx, %rax
+; CHECK-X64-NEXT:addl $8, %ecx
+; CHECK-X64-NEXT:movl %ecx, (%rsp)
+; CHECK-X64-NEXT:jmp LBB0_8
+; CHECK-X64-NEXT:  LBB0_7: ## %entry
+; CHECK-X64-NEXT:movq 8(%rsp), %rax
+; CHECK-X64-NEXT:movq %rax, %rcx
+; CHECK-X64-NEXT:addq $8, %rcx
+; CHECK-X64-NEXT:movq %rcx, 8(%rsp)
+; CHECK-X64-NEXT:  LBB0_8: ## %entry
+; 

[llvm-branch-commits] [llvm] release/19.x: [llvm] Fix __builtin_object_size interaction between Negative Offset … (#111827) (PR #114786)

2024-11-04 Thread Harald van Dijk via llvm-branch-commits

hvdijk wrote:

The `version_check` failure is because the repo was expected to be updated 
after the last release, but it has not yet been. Based on 
https://discourse.llvm.org/t/potential-abi-break-in-19-1-3/82865 I assume this 
is because it is not yet decided what the next version number is supposed to 
be? Either way, I think we can ignore that error.

But based on that discussion: this PR is also technically an ABI break. It
breaks symbols that are public only for technical reasons and are meant to be
used only by LLVM internally. What is the policy on that? Is that something
that can go into `19.1.y`, or does it have to wait until `19.2.y`? (No
preference either way from me.)

https://github.com/llvm/llvm-project/pull/114786
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/19.x: [llvm] Fix __builtin_object_size interaction between Negative Offset … (#111827) (PR #114786)

2024-11-12 Thread Harald van Dijk via llvm-branch-commits

https://github.com/hvdijk updated 
https://github.com/llvm/llvm-project/pull/114786

>From fb70629e196c516ebd5083f8624ba614f746ef67 Mon Sep 17 00:00:00 2001
From: serge-sans-paille 
Date: Sat, 2 Nov 2024 09:14:35 +
Subject: [PATCH 1/2] =?UTF-8?q?[llvm]=20Fix=20=5F=5Fbuiltin=5Fobject=5Fsiz?=
 =?UTF-8?q?e=20interaction=20between=20Negative=20Offset=20=E2=80=A6=20(#1?=
 =?UTF-8?q?11827)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

…and Select/Phi

When picking a SizeOffsetAPInt through combineSizeOffset, the behavior
differs if we're going to apply a constant offset that's positive or
negative: If it's positive, then we need to compare the remaining bytes
(i.e. Size - Offset), but if it's negative, we need to compare the preceding
bytes (i.e. Offset).

Fix #111709

(cherry picked from commit 01a103b0b9c449e8dec17950835991757d1c4f88)
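
For illustration only (the buffer size and offsets below are invented, not
taken from issue #111709), the affected pattern is a select/phi of pointers at
different offsets into the same object, followed by a negative constant offset
before __builtin_object_size is queried:

#include <cstddef>
#include <cstdio>

char buf[64];

std::size_t min_remaining(bool c) {
  // p comes out of a select: either 48 or 16 bytes into buf.
  char *p = c ? buf + 48 : buf + 16;
  // A negative constant offset follows, so picking the incoming value by
  // the remaining bytes (Size - Offset) alone, as before this fix, is the
  // wrong comparison; the bytes preceding p (Offset) are what matter here.
  char *q = p - 16;
  // Type 2 asks for the minimum number of bytes remaining at q.
  return __builtin_object_size(q, 2);
}

int main() {
  std::printf("%zu %zu\n", min_remaining(true), min_remaining(false));
  return 0;
}

On patterns like this, a too-small answer from the old merging is what could
make -fsanitize=object-size flag valid accesses, the false positives mentioned
later in this thread.
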
---
 llvm/include/llvm/Analysis/MemoryBuiltins.h   |  71 +++--
 llvm/lib/Analysis/MemoryBuiltins.cpp  | 145 +-
 .../builtin-object-size-phi.ll| 254 ++
 .../objectsize_basic.ll   |  24 ++
 4 files changed, 405 insertions(+), 89 deletions(-)

diff --git a/llvm/include/llvm/Analysis/MemoryBuiltins.h b/llvm/include/llvm/Analysis/MemoryBuiltins.h
index bb282a1b73d320..a21f116db7e70d 100644
--- a/llvm/include/llvm/Analysis/MemoryBuiltins.h
+++ b/llvm/include/llvm/Analysis/MemoryBuiltins.h
@@ -222,21 +222,43 @@ struct SizeOffsetAPInt : public SizeOffsetType {
   static bool known(const APInt &V) { return V.getBitWidth() > 1; }
 };
 
+/// OffsetSpan - Used internally by \p ObjectSizeOffsetVisitor. Represents a
+/// point in memory as a pair of allocated bytes before and after it.
+struct OffsetSpan {
+  APInt Before; /// Number of allocated bytes before this point.
+  APInt After;  /// Number of allocated bytes after this point.
+
+  OffsetSpan() = default;
+  OffsetSpan(APInt Before, APInt After) : Before(Before), After(After) {}
+
+  bool knownBefore() const { return known(Before); }
+  bool knownAfter() const { return known(After); }
+  bool anyKnown() const { return knownBefore() || knownAfter(); }
+  bool bothKnown() const { return knownBefore() && knownAfter(); }
+
+  bool operator==(const OffsetSpan &RHS) const {
+return Before == RHS.Before && After == RHS.After;
+  }
+  bool operator!=(const OffsetSpan &RHS) const { return !(*this == RHS); }
+
+  static bool known(const APInt &V) { return V.getBitWidth() > 1; }
+};
+
 /// Evaluate the size and offset of an object pointed to by a Value*
 /// statically. Fails if size or offset are not known at compile time.
 class ObjectSizeOffsetVisitor
-: public InstVisitor {
+: public InstVisitor {
   const DataLayout &DL;
   const TargetLibraryInfo *TLI;
   ObjectSizeOpts Options;
   unsigned IntTyBits;
   APInt Zero;
-  SmallDenseMap SeenInsts;
+  SmallDenseMap SeenInsts;
   unsigned InstructionsVisited;
 
   APInt align(APInt Size, MaybeAlign Align);
 
-  static SizeOffsetAPInt unknown() { return SizeOffsetAPInt(); }
+  static OffsetSpan unknown() { return OffsetSpan(); }
 
 public:
   ObjectSizeOffsetVisitor(const DataLayout &DL, const TargetLibraryInfo *TLI,
@@ -246,29 +268,30 @@ class ObjectSizeOffsetVisitor
 
   // These are "private", except they can't actually be made private. Only
   // compute() should be used by external users.
-  SizeOffsetAPInt visitAllocaInst(AllocaInst &I);
-  SizeOffsetAPInt visitArgument(Argument &A);
-  SizeOffsetAPInt visitCallBase(CallBase &CB);
-  SizeOffsetAPInt visitConstantPointerNull(ConstantPointerNull &);
-  SizeOffsetAPInt visitExtractElementInst(ExtractElementInst &I);
-  SizeOffsetAPInt visitExtractValueInst(ExtractValueInst &I);
-  SizeOffsetAPInt visitGlobalAlias(GlobalAlias &GA);
-  SizeOffsetAPInt visitGlobalVariable(GlobalVariable &GV);
-  SizeOffsetAPInt visitIntToPtrInst(IntToPtrInst &);
-  SizeOffsetAPInt visitLoadInst(LoadInst &I);
-  SizeOffsetAPInt visitPHINode(PHINode &);
-  SizeOffsetAPInt visitSelectInst(SelectInst &I);
-  SizeOffsetAPInt visitUndefValue(UndefValue &);
-  SizeOffsetAPInt visitInstruction(Instruction &I);
+  OffsetSpan visitAllocaInst(AllocaInst &I);
+  OffsetSpan visitArgument(Argument &A);
+  OffsetSpan visitCallBase(CallBase &CB);
+  OffsetSpan visitConstantPointerNull(ConstantPointerNull &);
+  OffsetSpan visitExtractElementInst(ExtractElementInst &I);
+  OffsetSpan visitExtractValueInst(ExtractValueInst &I);
+  OffsetSpan visitGlobalAlias(GlobalAlias &GA);
+  OffsetSpan visitGlobalVariable(GlobalVariable &GV);
+  OffsetSpan visitIntToPtrInst(IntToPtrInst &);
+  OffsetSpan visitLoadInst(LoadInst &I);
+  OffsetSpan visitPHINode(PHINode &);
+  OffsetSpan visitSelectInst(SelectInst &I);
+  OffsetSpan visitUndefValue(UndefValue &);
+  OffsetSpan visitInstruction(Instruction &I);
 
 private:
-  SizeOffsetAPInt findLoadSizeOffset(
-  LoadInst &LoadFrom, BasicBlock &BB, BasicBlock::iterator From,
-  SmallDenseMap &VisitedBlocks,
- 

[llvm-branch-commits] [llvm] release/19.x: [llvm] Fix __builtin_object_size interaction between Negative Offset … (#111827) (PR #114786)

2024-11-12 Thread Harald van Dijk via llvm-branch-commits

hvdijk wrote:

Also cc @serge-sans-paille for the chance to comment since this now includes 
changes of mine and is no longer a straightforward backport.

https://github.com/llvm/llvm-project/pull/114786
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/19.x: [llvm] Fix __builtin_object_size interaction between Negative Offset … (#111827) (PR #114786)

2024-11-12 Thread Harald van Dijk via llvm-branch-commits

https://github.com/hvdijk updated 
https://github.com/llvm/llvm-project/pull/114786


[llvm-branch-commits] [llvm] release/19.x: [llvm] Fix __builtin_object_size interaction between Negative Offset … (#111827) (PR #114786)

2024-11-12 Thread Harald van Dijk via llvm-branch-commits

hvdijk wrote:

Disabling the phi handling entirely would work, but it has been in for a few
releases already, so I worry that disabling it again would cause more damage,
especially if it is only for a single release.

I think I can change this PR to avoid changes to public headers.

https://github.com/llvm/llvm-project/pull/114786
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/19.x: [llvm] Fix __builtin_object_size interaction between Negative Offset … (#111827) (PR #114786)

2024-11-12 Thread Harald van Dijk via llvm-branch-commits

https://github.com/hvdijk updated 
https://github.com/llvm/llvm-project/pull/114786


[llvm-branch-commits] [llvm] release/19.x: [llvm] Fix __builtin_object_size interaction between Negative Offset … (#111827) (PR #114786)

2024-11-12 Thread Harald van Dijk via llvm-branch-commits

hvdijk wrote:

This more limited version is not enough to completely fix the incorrect results
inside LLVM, but it should be a strict improvement over what is currently on
`release/19.x`: it passes the tests that were added, and in cases where the old
API cannot return the correct result, it opts to return the larger of the two
possible correct results, which should be enough to avoid the false positives
in UBSAN. Is this version suitable for the release branch?

https://github.com/llvm/llvm-project/pull/114786
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/19.x: [llvm] Fix __builtin_object_size interaction between Negative Offset … (#111827) (PR #114786)

2025-01-27 Thread Harald van Dijk via llvm-branch-commits

hvdijk wrote:

I do still think it should be fixed, but if any fix needs to be reviewed by
@serge-sans-paille and he is not going to review it, there is very little I can
do, so we might as well leave it closed.

https://github.com/llvm/llvm-project/pull/114786
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/19.x: [llvm] Fix __builtin_object_size interaction between Negative Offset … (#111827) (PR #114786)

2025-01-27 Thread Harald van Dijk via llvm-branch-commits

hvdijk wrote:

The reason for the backport of the minimal fix was that LLVM 19 was throwing
false `-fsanitize` errors that broke our code. I guess we now have confirmation
that we will not be able to use LLVM 19.

https://github.com/llvm/llvm-project/pull/114786
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] [llvm] release/20.x: [reland][DebugInfo] Update DIBuilder insertion to take InsertPosition (#126967) (PR #127124)

2025-02-13 Thread Harald van Dijk via llvm-branch-commits

hvdijk wrote:

I'm hoping that, since we are before the initial 20.1 release, we are still
early enough in the LLVM 20 release cycle that we can cherry-pick it. It
restores a certain level of API compatibility with earlier LLVM releases, but
it would be an ABI break, so it would be risky to pick up after the 20.1
release.

https://github.com/llvm/llvm-project/pull/127124
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits