[llvm-branch-commits] [clang] 9723fc1 - [OpenCL][Docs] Release 13 notes.

2021-09-10 Thread Anastasia Stulova via llvm-branch-commits

Author: Anastasia Stulova
Date: 2021-09-10T11:41:27+01:00
New Revision: 9723fc15338e83737d0c5f7cbf415e7f1d9d1ec3

URL: 
https://github.com/llvm/llvm-project/commit/9723fc15338e83737d0c5f7cbf415e7f1d9d1ec3
DIFF: 
https://github.com/llvm/llvm-project/commit/9723fc15338e83737d0c5f7cbf415e7f1d9d1ec3.diff

LOG: [OpenCL][Docs] Release 13 notes.

Major OpenCL functionality added in release 13.

Differential Revision: https://reviews.llvm.org/D109327

Added: 


Modified: 
clang/docs/ReleaseNotes.rst

Removed: 




diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 820253348a194..e7fdd69542e16 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -150,10 +150,85 @@ C++2b Feature Support
 Objective-C Language Changes in Clang
 -------------------------------------
 
-OpenCL C Language Changes in Clang
-----------------------------------
+OpenCL Kernel Language Changes in Clang
+---------------------------------------
 
-...
+
+Command-line interface changes:
+
+- All builtin types, macros and function declarations are now added by default
+  without any command-line flags. A flag is provided ``-cl-no-stdinc`` to
+  suppress the default declarations non-native to the compiler.
+
+- Clang now compiles using OpenCL C version 1.2 by default if no version is
+  specified explicitly from the command line.
+
+- Clang now supports ``.clcpp`` file extension for sources written in
+  C++ for OpenCL.
+
+- Clang now accepts ``-cl-std=clc++1.0`` that sets C++ for OpenCL to
+  the version 1.0 explicitly.
+
+Misc common changes:
+
+- Added ``NULL`` definition in internal headers for standards prior to the
+  version 2.0.
+
+- Simplified use of pragma in extensions for ``double``, images, atomics,
+  subgroups, Arm dot product extension. There are less cases where extension
+  pragma is now required by clang to compile kernel sources.
+
+- Added missing ``as_size``/``as_ptrdiff``/``as_intptr``/``as_uintptr_t``
+  operators to internal headers.
+
+- Added new builtin function for ndrange, ``cl_khr_subgroup_extended_types``,
+  ``cl_khr_subgroup_non_uniform_vote``, ``cl_khr_subgroup_ballot``,
+  ``cl_khr_subgroup_non_uniform_arithmetic``, ``cl_khr_subgroup_shuffle``,
+  ``cl_khr_subgroup_shuffle_relative``, ``cl_khr_subgroup_clustered_reduce``
+  into the default Tablegen-based header.
+
+- Added online documentation for Tablegen-based header, OpenCL 3.0 support,
+  new clang extensions.
+
+- Fixed OpenCL C language version and SPIR address space reporting in DWARF.
+
+New extensions:
+
+- ``cl_khr_integer_dot_product`` for dedicated support of dot product.
+
+- ``cl_khr_extended_bit_ops`` for dedicated support of extra binary operations.
+
+- ``__cl_clang_bitfields`` for use of bit-fields in the kernel code.
+
+- ``__cl_clang_non_portable_kernel_param_types`` for relaxing some restrictions
+  to types of kernel parameters.
+
+OpenCL C 3.0 related changes:
+
+- Added parsing support for the optionality of generic address space, images 
+  (including 3d writes and ``read_write`` access qualifier), pipes, program
+  scope variables, double-precision floating-point support. 
+
+- Added optionality support for builtin functions (in ``opencl-c.h`` header)
+  for generic address space, C11 atomics.  
+
+- Added ``memory_scope_all_devices`` enum for the atomics in internal headers.
+
+- Enabled use of ``.rgba`` vector components.
+
+C++ for OpenCL related changes:
+
+- Added ``__remove_address_space`` metaprogramming utility in internal headers
+  to allow removing address spaces from types.
+
+- Improved overloads resolution logic for constructors wrt address spaces.
+
+- Improved diagnostics of OpenCL specific types and address space qualified
+  types in ``reinterpret_cast`` and template functions.
+
+- Fixed ``NULL`` macro in internal headers to be compatible with C++.
+
+- Fixed use of ``half`` type.
 
 ABI Changes in Clang
 



___
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] bd8cc85 - [OpenCL][Docs] Update OpenCL 3.0 implementation status.

2021-09-10 Thread Anastasia Stulova via llvm-branch-commits

Author: Anastasia Stulova
Date: 2021-09-10T12:33:09+01:00
New Revision: bd8cc8543fdc91a571d92bb004dbc38c9e008249

URL: 
https://github.com/llvm/llvm-project/commit/bd8cc8543fdc91a571d92bb004dbc38c9e008249
DIFF: 
https://github.com/llvm/llvm-project/commit/bd8cc8543fdc91a571d92bb004dbc38c9e008249.diff

LOG: [OpenCL][Docs] Update OpenCL 3.0 implementation status.

Update a section of OpenCLSupport page to reflect the latest
development in OpenCL 3.0 support for release 13.

Differential Revision: https://reviews.llvm.org/D109320

(cherry picked from commit cff03d5fc48700b73ae863d4f25e780e74dff33e)

Added: 


Modified: 
clang/docs/OpenCLSupport.rst

Removed: 




diff  --git a/clang/docs/OpenCLSupport.rst b/clang/docs/OpenCLSupport.rst
index 047c73f2834a..006579855215 100644
--- a/clang/docs/OpenCLSupport.rst
+++ b/clang/docs/OpenCLSupport.rst
@@ -362,41 +362,43 @@ OpenCL C 3.0 Implementation Status
 The following table provides an overview of features in OpenCL C 3.0 and their
 implementation status.
 
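-+------------------------+----------------------------------------------------------+-------------------+----------------------------------------------------+
-| Category               | Feature                                                  | Status            | Reviews                                            |
-+========================+==========================================================+===================+====================================================+
-| Command line interface | New value for ``-cl-std`` flag                           | :good:`done`      | https://reviews.llvm.org/D88300                    |
-+------------------------+----------------------------------------------------------+-------------------+----------------------------------------------------+
-| Predefined macros      | New version macro                                        | :good:`done`      | https://reviews.llvm.org/D88300                    |
-+------------------------+----------------------------------------------------------+-------------------+----------------------------------------------------+
-| Predefined macros      | Feature macros                                           | :good:`done`      | https://reviews.llvm.org/D95776                    |
-+------------------------+----------------------------------------------------------+-------------------+----------------------------------------------------+
-| Feature optionality    | Generic address space                                    | :none:`worked on` | https://reviews.llvm.org/D95778 (partial frontend) |
-+------------------------+----------------------------------------------------------+-------------------+----------------------------------------------------+
-| Feature optionality    | Builtin function overloads with generic address space    | :part:`worked on` | https://reviews.llvm.org/D92004                    |
-+------------------------+----------------------------------------------------------+-------------------+----------------------------------------------------+
-| Feature optionality    | Program scope variables in global memory                 | :none:`unclaimed` |                                                    |
-+------------------------+----------------------------------------------------------+-------------------+----------------------------------------------------+
-| Feature optionality    | 3D image writes including builtin functions              | :none:`unclaimed` |                                                    |
-+------------------------+----------------------------------------------------------+-------------------+----------------------------------------------------+
-| Feature optionality    | read_write images including builtin functions            | :none:`unclaimed` |                                                    |
-+------------------------+----------------------------------------------------------+-------------------+----------------------------------------------------+
-| Feature optionality    | C11 atomics memory scopes, ordering and builtin function | :part:`worked on` | https://reviews.llvm.org/D9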

[llvm-branch-commits] [llvm] 84a3be8 - [SimplifyCFG] performBranchToCommonDestFolding(): require block-closed SSA form for bonus instructions (PR51125)

2021-09-10 Thread Tom Stellard via llvm-branch-commits

Author: Roman Lebedev
Date: 2021-09-10T09:02:26-07:00
New Revision: 84a3be829686268ce033b034bfa5f2c9d1a83b60

URL: 
https://github.com/llvm/llvm-project/commit/84a3be829686268ce033b034bfa5f2c9d1a83b60
DIFF: 
https://github.com/llvm/llvm-project/commit/84a3be829686268ce033b034bfa5f2c9d1a83b60.diff

LOG: [SimplifyCFG] performBranchToCommonDestFolding(): require block-closed SSA form for bonus instructions (PR51125)

I can't seem to wrap my head around the proper fix here,
we should be fine without this requirement, iff we can form this form,
but the naive attempt (https://reviews.llvm.org/D106317) has failed.
So just to unblock the release, put up a restriction.

Fixes https://bugs.llvm.org/show_bug.cgi?id=51125

(cherry picked from commit 909cba969981032c5740774ca84a34b7f76b909b)
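
The restriction can be illustrated outside of LLVM with a small model. This is
only a sketch under stated assumptions: `UseInfo`, `isBlockClosedUse` and
`allUsesBlockClosed` are hypothetical names, and blocks are represented by
plain integers. The predicate mirrors the check the patch adds: a use is
acceptable if it is a PHI operand incoming from the defining block, or a
non-PHI user in the same block that comes after the definition.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical, simplified description of one use of a bonus instruction.
struct UseInfo {
  bool UserIsPHI;     // True if the user is a PHI node.
  int IncomingBlock;  // PHI users only: block the value flows in from.
  int UserBlock;      // Non-PHI users only: block that contains the user.
  bool ComesAfterDef; // Non-PHI users only: user follows the definition.
};

// A use is in block-closed form if it never needs rewriting after the
// defining instruction is cloned into a predecessor block.
bool isBlockClosedUse(const UseInfo &U, int DefBlock) {
  if (U.UserIsPHI)
    return U.IncomingBlock == DefBlock;
  return U.UserBlock == DefBlock && U.ComesAfterDef;
}

// The fold is only attempted when every use passes the check.
bool allUsesBlockClosed(const std::vector<UseInfo> &Uses, int DefBlock) {
  return std::all_of(Uses.begin(), Uses.end(), [DefBlock](const UseInfo &U) {
    return isBlockClosedUse(U, DefBlock);
  });
}
```

If any use fails the check, the sketch returns false, which corresponds to the
patch bailing out of the fold instead of trying to repair SSA form.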

Added: 


Modified: 
llvm/lib/Transforms/Utils/SimplifyCFG.cpp
llvm/test/CodeGen/Thumb2/mve-float16regloops.ll
llvm/test/CodeGen/Thumb2/mve-float32regloops.ll
llvm/test/CodeGen/Thumb2/mve-postinc-lsr.ll
llvm/test/Transforms/LoopUnroll/peel-loop-inner.ll
llvm/test/Transforms/PhaseOrdering/X86/vector-reductions-logical.ll
llvm/test/Transforms/SimplifyCFG/fold-branch-to-common-dest.ll

Removed: 




diff  --git a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
index 583bb379488e6..d86ecbb6db004 100644
--- a/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyCFG.cpp
@@ -1094,17 +1094,24 @@ static void CloneInstructionsIntoPredecessorBlockAndUpdateSSAUses(
 
 // Update (liveout) uses of bonus instructions,
 // now that the bonus instruction has been cloned into predecessor.
-SSAUpdater SSAUpdate;
-SSAUpdate.Initialize(BonusInst.getType(),
- (NewBonusInst->getName() + ".merge").str());
-SSAUpdate.AddAvailableValue(BB, &BonusInst);
-SSAUpdate.AddAvailableValue(PredBlock, NewBonusInst);
+// Note that we expect to be in a block-closed SSA form for this to work!
 for (Use &U : make_early_inc_range(BonusInst.uses())) {
   auto *UI = cast<Instruction>(U.getUser());
-  if (UI->getParent() != PredBlock)
-SSAUpdate.RewriteUseAfterInsertions(U);
-  else // Use is in the same block as, and comes before, NewBonusInst.
-SSAUpdate.RewriteUse(U);
+  auto *PN = dyn_cast<PHINode>(UI);
+  if (!PN) {
+assert(UI->getParent() == BB && BonusInst.comesBefore(UI) &&
+   "If the user is not a PHI node, then it should be in the same "
+   "block as, and come after, the original bonus instruction.");
+continue; // Keep using the original bonus instruction.
+  }
+  // Is this the block-closed SSA form PHI node?
+  if (PN->getIncomingBlock(U) == BB)
+continue; // Great, keep using the original bonus instruction.
+  // The only other alternative is an "use" when coming from
+  // the predecessor block - here we should refer to the cloned bonus instr.
+  assert(PN->getIncomingBlock(U) == PredBlock &&
+ "Not in block-closed SSA form?");
+  U.set(NewBonusInst);
 }
   }
 }
@@ -3207,6 +3214,17 @@ bool llvm::FoldBranchToCommonDest(BranchInst *BI, DomTreeUpdater *DTU,
 // Early exits once we reach the limit.
 if (NumBonusInsts > BonusInstThreshold)
   return false;
+
+auto IsBCSSAUse = [BB, &I](Use &U) {
+  auto *UI = cast<Instruction>(U.getUser());
+  if (auto *PN = dyn_cast<PHINode>(UI))
+return PN->getIncomingBlock(U) == BB;
+  return UI->getParent() == BB && I.comesBefore(UI);
+};
+
+// Does this instruction require rewriting of uses?
+if (!all_of(I.uses(), IsBCSSAUse))
+  return false;
   }
 
   // Ok, we have the budget. Perform the transformation.

diff  --git a/llvm/test/CodeGen/Thumb2/mve-float16regloops.ll b/llvm/test/CodeGen/Thumb2/mve-float16regloops.ll
index cc8a3b36c8305..51cc1ec01a7fd 100644
--- a/llvm/test/CodeGen/Thumb2/mve-float16regloops.ll
+++ b/llvm/test/CodeGen/Thumb2/mve-float16regloops.ll
@@ -1054,7 +1054,7 @@ define void @fir(%struct.arm_fir_instance_f32* nocapture readonly %S, half* noca
 ; CHECK-NEXT:cmp r3, #8
 ; CHECK-NEXT:str r1, [sp, #20] @ 4-byte Spill
 ; CHECK-NEXT:blo.w .LBB16_12
-; CHECK-NEXT:  @ %bb.1: @ %entry
+; CHECK-NEXT:  @ %bb.1: @ %if.then
 ; CHECK-NEXT:lsrs.w r12, r3, #2
 ; CHECK-NEXT:beq.w .LBB16_12
 ; CHECK-NEXT:  @ %bb.2: @ %while.body.lr.ph

diff  --git a/llvm/test/CodeGen/Thumb2/mve-float32regloops.ll b/llvm/test/CodeGen/Thumb2/mve-float32regloops.ll
index 2d73c7531fe69..ae209e7d5688c 100644
--- a/llvm/test/CodeGen/Thumb2/mve-float32regloops.ll
+++ b/llvm/test/CodeGen/Thumb2/mve-float32regloops.ll
@@ -1048,7 +1048,7 @@ define void @fir(%struct.arm_fir_instance_f32* nocapture readonly %S, float* noc
 ; CHECK-NEXT:sub sp, #32
 ; CHECK-NEXT:cmp r3, #8
 ; CHECK-NEXT:blo.w .LBB16_12
-; CHECK-NEXT:  @ %bb.1: @ %entry
+;

[llvm-branch-commits] [llvm] 1ff9aa2 - [IR] Handle constant expressions in containsUndefinedElement()

2021-09-10 Thread Tom Stellard via llvm-branch-commits

Author: Nikita Popov
Date: 2021-09-10T09:04:21-07:00
New Revision: 1ff9aa2bfe19dc8cefe7ced1a277cc58477b5dcb

URL: 
https://github.com/llvm/llvm-project/commit/1ff9aa2bfe19dc8cefe7ced1a277cc58477b5dcb
DIFF: 
https://github.com/llvm/llvm-project/commit/1ff9aa2bfe19dc8cefe7ced1a277cc58477b5dcb.diff

LOG: [IR] Handle constant expressions in containsUndefinedElement()

If the constant is a constant expression, then getAggregateElement()
will return null. Guard against this before calling HasFn().

(cherry picked from commit af382b93831ae6a58bce8bc075458cfd056e3976)
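
The guard can be shown in isolation with a small self-contained model. This is
a sketch, not LLVM code: `Elem`, `VectorConstant` and
`containsMatchingElement` are hypothetical stand-ins, and a null slot models
an element that cannot be extracted, as happens for a constant expression.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for one element of a vector constant.
struct Elem {
  bool IsUndef;
};

// Hypothetical stand-in for a vector constant. A null slot models an
// element that the accessor cannot produce.
struct VectorConstant {
  std::vector<const Elem *> Slots;

  const Elem *getAggregateElement(std::size_t I) const { return Slots[I]; }
  std::size_t getNumElements() const { return Slots.size(); }
};

// Returns true if any extractable element satisfies HasFn. Null elements
// are skipped instead of being handed to the predicate.
bool containsMatchingElement(const VectorConstant &C,
                             const std::function<bool(const Elem *)> &HasFn) {
  for (std::size_t I = 0, E = C.getNumElements(); I != E; ++I) {
    if (const Elem *El = C.getAggregateElement(I))
      if (HasFn(El))
        return true;
  }
  return false;
}
```

Without the inner null check, a predicate that dereferences its argument would
be called with a null pointer for the unextractable slot.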

Added: 


Modified: 
llvm/lib/IR/Constants.cpp
llvm/test/Transforms/InstSimplify/ConstProp/vecreduce.ll

Removed: 




diff  --git a/llvm/lib/IR/Constants.cpp b/llvm/lib/IR/Constants.cpp
index 6c75085a6678d..1e72cb4d3a668 100644
--- a/llvm/lib/IR/Constants.cpp
+++ b/llvm/lib/IR/Constants.cpp
@@ -315,9 +315,11 @@ containsUndefinedElement(const Constant *C,
   return false;
 
 for (unsigned i = 0, e = cast<FixedVectorType>(VTy)->getNumElements();
- i != e; ++i)
-  if (HasFn(C->getAggregateElement(i)))
-return true;
+ i != e; ++i) {
+  if (Constant *Elem = C->getAggregateElement(i))
+if (HasFn(Elem))
+  return true;
+}
   }
 
   return false;

diff  --git a/llvm/test/Transforms/InstSimplify/ConstProp/vecreduce.ll b/llvm/test/Transforms/InstSimplify/ConstProp/vecreduce.ll
index e27180b1a8909..b407c908cf2a2 100644
--- a/llvm/test/Transforms/InstSimplify/ConstProp/vecreduce.ll
+++ b/llvm/test/Transforms/InstSimplify/ConstProp/vecreduce.ll
@@ -87,6 +87,15 @@ define i32 @add_poison1() {
   ret i32 %x
 }
 
+define i32 @add_constexpr() {
+; CHECK-LABEL: @add_constexpr(
+; CHECK-NEXT:[[X:%.*]] = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> 
bitcast (<4 x i64>  to <8 x i32>))
+; CHECK-NEXT:ret i32 [[X]]
+;
+  %x = call i32 @llvm.vector.reduce.add.v8i32(<8 x i32> bitcast (<4 x i64> 
 to <8 x i32>))
+  ret i32 %x
+}
+
 define i32 @mul_0() {
 ; CHECK-LABEL: @mul_0(
 ; CHECK-NEXT:ret i32 0





[llvm-branch-commits] [llvm] 77f2430 - [X86] Don't clobber EBX in stackprobes

2021-09-10 Thread Tom Stellard via llvm-branch-commits

Author: Elliot Saba
Date: 2021-09-10T09:30:52-07:00
New Revision: 77f24308fe7890ee5094dd3c84441ae8a45adb20

URL: 
https://github.com/llvm/llvm-project/commit/77f24308fe7890ee5094dd3c84441ae8a45adb20
DIFF: 
https://github.com/llvm/llvm-project/commit/77f24308fe7890ee5094dd3c84441ae8a45adb20.diff

LOG: [X86] Don't clobber EBX in stackprobes

On X86, the stackprobe emission code chooses the `R11D` register, which
is illegal on i686.  This ends up wrapping around to `EBX`, which does
not get properly callee-saved within the stack probing prologue,
clobbering the register for the callers.

We fix this by explicitly using `EAX` as the stack probe register.

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D109203

(cherry picked from commit ae8507b0df738205a6b9e3795ad34672b7499381)
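
The register choice can be sketched as a standalone function. This is a
minimal illustration, not the code in X86FrameLowering.cpp: the `ProbeReg`
enum and the `selectStackProbeReg` name are hypothetical, and only the
three-way selection mirrors the patch.

```cpp
// Hypothetical register identifiers; the patch itself uses X86::R11,
// X86::R11D and X86::EAX.
enum class ProbeReg { R11, R11D, EAX };

// Picks the scratch register that holds the final probed stack address.
//  - 64-bit frame pointer: the full 64-bit R11.
//  - 64-bit mode with 32-bit pointers (for example x32): R11D.
//  - 32-bit mode (i686): EAX, because R8-R15 are not available there.
ProbeReg selectStackProbeReg(bool Uses64BitFramePtr, bool Is64Bit) {
  return Uses64BitFramePtr ? ProbeReg::R11
         : Is64Bit         ? ProbeReg::R11D
                           : ProbeReg::EAX;
}
```

Before the patch the second and third cases were not distinguished, so a
32-bit target was handed R11D.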

Added: 


Modified: 
llvm/lib/Target/X86/X86FrameLowering.cpp
llvm/test/CodeGen/X86/stack-clash-large.ll

Removed: 




diff  --git a/llvm/lib/Target/X86/X86FrameLowering.cpp b/llvm/lib/Target/X86/X86FrameLowering.cpp
index 4cde7971e597e..86cb86b19d629 100644
--- a/llvm/lib/Target/X86/X86FrameLowering.cpp
+++ b/llvm/lib/Target/X86/X86FrameLowering.cpp
@@ -671,7 +671,9 @@ void X86FrameLowering::emitStackProbeInlineGenericLoop(
   MF.insert(MBBIter, testMBB);
   MF.insert(MBBIter, tailMBB);
 
-  Register FinalStackProbed = Uses64BitFramePtr ? X86::R11 : X86::R11D;
+  Register FinalStackProbed = Uses64BitFramePtr ? X86::R11
+  : Is64Bit ? X86::R11D
+: X86::EAX;
   BuildMI(MBB, MBBI, DL, TII.get(TargetOpcode::COPY), FinalStackProbed)
   .addReg(StackPtr)
   .setMIFlag(MachineInstr::FrameSetup);
@@ -1092,7 +1094,9 @@ void X86FrameLowering::BuildStackAlignAND(MachineBasicBlock &MBB,
   MF.insert(MBBIter, bodyMBB);
   MF.insert(MBBIter, footMBB);
   const unsigned MovMIOpc = Is64Bit ? X86::MOV64mi32 : X86::MOV32mi;
-  Register FinalStackProbed = Uses64BitFramePtr ? X86::R11 : X86::R11D;
+  Register FinalStackProbed = Uses64BitFramePtr ? X86::R11
+  : Is64Bit ? X86::R11D
+: X86::EAX;
 
   // Setup entry block
   {

diff  --git a/llvm/test/CodeGen/X86/stack-clash-large.ll b/llvm/test/CodeGen/X86/stack-clash-large.ll
index 9129e4ed40fda..00c7843b54f55 100644
--- a/llvm/test/CodeGen/X86/stack-clash-large.ll
+++ b/llvm/test/CodeGen/X86/stack-clash-large.ll
@@ -1,45 +1,64 @@
-; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --no_x86_scrub_sp
-; RUN: llc -mtriple=x86_64-linux-android < %s | FileCheck -check-prefix=CHECK-X86-64 %s
-; RUN: llc -mtriple=i686-linux-android < %s | FileCheck -check-prefix=CHECK-X86-32 %s
+; RUN: llc -mtriple=x86_64-linux-android < %s | FileCheck -check-prefix=CHECK-X64 %s
+; RUN: llc -mtriple=i686-linux-android < %s | FileCheck -check-prefix=CHECK-X86 %s
+; RUN: llc -mtriple=x86_64-linux-gnux32 < %s | FileCheck -check-prefix=CHECK-X32 %s
 
 define i32 @foo() local_unnamed_addr #0 {
-; CHECK-X86-64-LABEL: foo:
-; CHECK-X86-64:   # %bb.0:
-; CHECK-X86-64-NEXT:movq %rsp, %r11
-; CHECK-X86-64-NEXT:subq $69632, %r11 # imm = 0x11000
-; CHECK-X86-64-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
-; CHECK-X86-64-NEXT:subq $4096, %rsp # imm = 0x1000
-; CHECK-X86-64-NEXT:movq $0, (%rsp)
-; CHECK-X86-64-NEXT:cmpq %r11, %rsp
-; CHECK-X86-64-NEXT:jne .LBB0_1
-; CHECK-X86-64-NEXT:  # %bb.2:
-; CHECK-X86-64-NEXT:subq $2248, %rsp # imm = 0x8C8
-; CHECK-X86-64-NEXT:.cfi_def_cfa_offset 71888
-; CHECK-X86-64-NEXT:movl $1, 264(%rsp)
-; CHECK-X86-64-NEXT:movl $1, 28664(%rsp)
-; CHECK-X86-64-NEXT:movl -128(%rsp), %eax
-; CHECK-X86-64-NEXT:addq $71880, %rsp # imm = 0x118C8
-; CHECK-X86-64-NEXT:.cfi_def_cfa_offset 8
-; CHECK-X86-64-NEXT:retq
+; CHECK-X64-LABEL: foo:
+; CHECK-X64:   # %bb.0:
+; CHECK-X64-NEXT:movq %rsp, %r11
+; CHECK-X64-NEXT:subq $69632, %r11 # imm = 0x11000
+; CHECK-X64-NEXT:  .LBB0_1: # =>This Inner Loop Header: Depth=1
+; CHECK-X64-NEXT:subq $4096, %rsp # imm = 0x1000
+; CHECK-X64-NEXT:movq $0, (%rsp)
+; CHECK-X64-NEXT:cmpq %r11, %rsp
+; CHECK-X64-NEXT:jne .LBB0_1
+; CHECK-X64-NEXT:  # %bb.2:
+; CHECK-X64-NEXT:subq $2248, %rsp # imm = 0x8C8
+; CHECK-X64-NEXT:.cfi_def_cfa_offset 71888
+; CHECK-X64-NEXT:movl $1, 264(%rsp)
+; CHECK-X64-NEXT:movl $1, 28664(%rsp)
+; CHECK-X64-NEXT:movl -128(%rsp), %eax
+; CHECK-X64-NEXT:addq $71880, %rsp # imm = 0x118C8
+; CHECK-X64-NEXT:.cfi_def_cfa_offset 8
+; CHECK-X64-NEXT:retq
 ;
-; CHECK-X86-32-LABEL: foo:
-; CHECK-X86-32:   # %bb.0:
-; CHECK-X86-32-NEXT:movl %esp, %r11d
-; CHECK-X86-32-NEXT:subl $69632, %r11d # imm = 0x11000
-; CHECK-X86-32-NEXT:  .LBB0_1: # =>This Inner Loop Header:

[llvm-branch-commits] [lld] 4728892 - [LLD] Support compressed input sections on big-endian targets

2021-09-10 Thread Tom Stellard via llvm-branch-commits

Author: Simon Atanasyan
Date: 2021-09-10T15:36:33-07:00
New Revision: 4728892cd336dfd29a2d18dee06c1f2e46b8f73d

URL: 
https://github.com/llvm/llvm-project/commit/4728892cd336dfd29a2d18dee06c1f2e46b8f73d
DIFF: 
https://github.com/llvm/llvm-project/commit/4728892cd336dfd29a2d18dee06c1f2e46b8f73d.diff

LOG: [LLD] Support compressed input sections on big-endian targets

This patch enables compressed input sections on big-endian targets by
checking the target endianness and selecting an appropriate `Chdr`
structure.

Fixes PR51369

Differential Revision: https://reviews.llvm.org/D107635

(cherry picked from commit c6ebc651b6fac9cf1d9f8c00ea49d29093003f85)
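
Why the byte order matters can be seen with a small standalone example. This
is a sketch under stated assumptions: `read32` and `isZlibHeader` are
hypothetical helpers written for illustration, whereas lld relies on the
endian-aware field types of `ELFT::Chdr`. The only facts taken from the ELF
format are that `ch_type` is the first 32-bit field of the compression header
and that `ELFCOMPRESS_ZLIB` is 1.

```cpp
#include <cstddef>
#include <cstdint>

// Reads a 32-bit unsigned integer from Bytes using the given byte order.
// Hypothetical helper for illustration only.
std::uint32_t read32(const unsigned char *Bytes, bool IsBigEndian) {
  std::uint32_t V = 0;
  for (std::size_t I = 0; I < 4; ++I) {
    // Visit the most significant byte first in either byte order.
    std::size_t Idx = IsBigEndian ? I : 3 - I;
    V = (V << 8) | Bytes[Idx];
  }
  return V;
}

constexpr std::uint32_t ELFCOMPRESS_ZLIB = 1;

// Returns true if the ch_type field at the start of the header says ZLIB.
bool isZlibHeader(const unsigned char *Hdr, bool IsBigEndian) {
  return read32(Hdr, IsBigEndian) == ELFCOMPRESS_ZLIB;
}
```

A big-endian header whose first four bytes are 00 00 00 01 is recognized only
when it is decoded as big-endian; decoded as little-endian, the same bytes
give 0x01000000 and the section would be rejected as using an unsupported
compression type.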

Added: 


Modified: 
lld/ELF/InputSection.cpp
lld/ELF/InputSection.h
lld/test/ELF/compressed-debug-input-err.s
lld/test/ELF/compressed-debug-input.s

Removed: 




diff  --git a/lld/ELF/InputSection.cpp b/lld/ELF/InputSection.cpp
index 1f9fa961fc268..7d952e9037f16 100644
--- a/lld/ELF/InputSection.cpp
+++ b/lld/ELF/InputSection.cpp
@@ -88,7 +88,22 @@ InputSectionBase::InputSectionBase(InputFile *file, uint64_t flags,
 if (!zlib::isAvailable())
   error(toString(file) + ": contains a compressed section, " +
 "but zlib is not available");
-parseCompressedHeader();
+switch (config->ekind) {
+case ELF32LEKind:
+  parseCompressedHeader<ELF32LE>();
+  break;
+case ELF32BEKind:
+  parseCompressedHeader<ELF32BE>();
+  break;
+case ELF64LEKind:
+  parseCompressedHeader<ELF64LE>();
+  break;
+case ELF64BEKind:
+  parseCompressedHeader<ELF64BE>();
+  break;
+default:
+  llvm_unreachable("unknown ELFT");
+}
   }
 }
 
@@ -210,10 +225,7 @@ OutputSection *SectionBase::getOutputSection() {
 // When a section is compressed, `rawData` consists with a header followed
 // by zlib-compressed data. This function parses a header to initialize
 // `uncompressedSize` member and remove the header from `rawData`.
-void InputSectionBase::parseCompressedHeader() {
-  using Chdr64 = typename ELF64LE::Chdr;
-  using Chdr32 = typename ELF32LE::Chdr;
-
+template <typename ELFT> void InputSectionBase::parseCompressedHeader() {
   // Old-style header
   if (name.startswith(".zdebug")) {
 if (!toStringRef(rawData).startswith("ZLIB")) {
@@ -239,32 +251,13 @@ void InputSectionBase::parseCompressedHeader() {
   assert(flags & SHF_COMPRESSED);
   flags &= ~(uint64_t)SHF_COMPRESSED;
 
-  // New-style 64-bit header
-  if (config->is64) {
-if (rawData.size() < sizeof(Chdr64)) {
-  error(toString(this) + ": corrupted compressed section");
-  return;
-}
-
-auto *hdr = reinterpret_cast<const Chdr64 *>(rawData.data());
-if (hdr->ch_type != ELFCOMPRESS_ZLIB) {
-  error(toString(this) + ": unsupported compression type");
-  return;
-}
-
-uncompressedSize = hdr->ch_size;
-alignment = std::max<uint32_t>(hdr->ch_addralign, 1);
-rawData = rawData.slice(sizeof(*hdr));
-return;
-  }
-
-  // New-style 32-bit header
-  if (rawData.size() < sizeof(Chdr32)) {
+  // New-style header
+  if (rawData.size() < sizeof(typename ELFT::Chdr)) {
 error(toString(this) + ": corrupted compressed section");
 return;
   }
 
-  auto *hdr = reinterpret_cast<const Chdr32 *>(rawData.data());
+  auto *hdr = reinterpret_cast<const typename ELFT::Chdr *>(rawData.data());
   if (hdr->ch_type != ELFCOMPRESS_ZLIB) {
 error(toString(this) + ": unsupported compression type");
 return;

diff  --git a/lld/ELF/InputSection.h b/lld/ELF/InputSection.h
index 5b91c1c90bd2c..c914d0b421552 100644
--- a/lld/ELF/InputSection.h
+++ b/lld/ELF/InputSection.h
@@ -238,6 +238,7 @@ class InputSectionBase : public SectionBase {
   }
 
 protected:
+  template <typename ELFT>
   void parseCompressedHeader();
   void uncompress() const;
 

diff  --git a/lld/test/ELF/compressed-debug-input-err.s b/lld/test/ELF/compressed-debug-input-err.s
index 89773eca59d71..0495a9eaa08e9 100644
--- a/lld/test/ELF/compressed-debug-input-err.s
+++ b/lld/test/ELF/compressed-debug-input-err.s
@@ -3,6 +3,9 @@
 # RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t.o
 # RUN: not ld.lld %t.o -o /dev/null -shared 2>&1 | FileCheck %s
 
+# RUN: llvm-mc -filetype=obj -triple=powerpc64-unknown-unknown %s -o %t-be.o
+# RUN: not ld.lld %t-be.o -o /dev/null -shared 2>&1 | FileCheck %s
+
 ## Check we are able to report zlib uncompress errors.
 # CHECK: error: {{.*}}.o:(.debug_str): uncompress failed: zlib error: Z_DATA_ERROR
 

diff  --git a/lld/test/ELF/compressed-debug-input.s b/lld/test/ELF/compressed-debug-input.s
index c9bfd3e516209..5b61ea8b384e0 100644
--- a/lld/test/ELF/compressed-debug-input.s
+++ b/lld/test/ELF/compressed-debug-input.s
@@ -1,7 +1,9 @@
 # REQUIRES: zlib, x86
 
 # RUN: llvm-mc -compress-debug-sections=zlib -filetype=obj -triple=x86_64-unknown-linux %s -o %t
+# RUN: llvm-mc -compress-debug-sections=zlib -filetype=obj -triple=powerpc64-unknown-unknown %s -o %t-be
 # RUN: llvm-readobj --sections %t | FileCheck -check-prefix=ZLIB %s

[llvm-branch-commits] [llvm] 1c198b3 - Revert [MC][ELF] Emit separate unique sections for different flags

2021-09-10 Thread Tom Stellard via llvm-branch-commits

Author: Nikita Popov
Date: 2021-09-10T16:55:29-07:00
New Revision: 1c198b3032e899003b5c1ffaa15c7648f24f1e69

URL: 
https://github.com/llvm/llvm-project/commit/1c198b3032e899003b5c1ffaa15c7648f24f1e69
DIFF: 
https://github.com/llvm/llvm-project/commit/1c198b3032e899003b5c1ffaa15c7648f24f1e69.diff

LOG: Revert [MC][ELF] Emit separate unique sections for different flags

Commit Message from @MaskRay:

Rust has a fragile embed-bitcode implementation
(https://github.com/rust-lang/rust/blob/bddb59cf07efcf6e606f16b87f85e3ecd2c1ca69/compiler/rustc_codegen_llvm/src/back/write.rs#L970-L1017)
which relied on the previous LLVM MC behavior.  Rust's LLVM fork
has carried a revert. This commit made the similar revert to help
distributions since they would otherwise probably carry a similar patch
(as they ship rust linked against system LLVM).

Fixes https://bugs.llvm.org/show_bug.cgi?id=51207.

Differential Revision: https://reviews.llvm.org/D107216

Added: 


Modified: 
llvm/include/llvm/MC/MCContext.h
llvm/lib/MC/MCContext.cpp
llvm/test/CodeGen/Mips/gpopt-explict-section.ll
llvm/test/CodeGen/X86/explicit-section-mergeable.ll
llvm/unittests/ExecutionEngine/Orc/RTDyldObjectLinkingLayerTest.cpp

Removed: 
llvm/test/CodeGen/X86/elf-unique-sections-by-flags.ll



diff  --git a/llvm/include/llvm/MC/MCContext.h b/llvm/include/llvm/MC/MCContext.h
index 877b2dc4ac92..2ff9c967e848 100644
--- a/llvm/include/llvm/MC/MCContext.h
+++ b/llvm/include/llvm/MC/MCContext.h
@@ -374,17 +374,17 @@ namespace llvm {
   bool operator<(const ELFEntrySizeKey &Other) const {
 if (SectionName != Other.SectionName)
   return SectionName < Other.SectionName;
-if (Flags != Other.Flags)
-  return Flags < Other.Flags;
+if ((Flags & ELF::SHF_STRINGS) != (Other.Flags & ELF::SHF_STRINGS))
+  return Other.Flags & ELF::SHF_STRINGS;
 return EntrySize < Other.EntrySize;
   }
 };
 
-// Symbols must be assigned to a section with a compatible entry size and
-// flags. This map is used to assign unique IDs to sections to distinguish
-// between sections with identical names but incompatible entry sizes and/or
-// flags. This can occur when a symbol is explicitly assigned to a section,
-// e.g. via __attribute__((section("myname"))).
+// Symbols must be assigned to a section with a compatible entry
+// size. This map is used to assign unique IDs to sections to
+// distinguish between sections with identical names but incompatible entry
+// sizes. This can occur when a symbol is explicitly assigned to a
+// section, e.g. via __attribute__((section("myname"))).
 std::map<ELFEntrySizeKey, unsigned> ELFEntrySizeMap;
 
 // This set is used to record the generic mergeable section names seen.
@@ -592,8 +592,6 @@ namespace llvm {
 
 bool isELFGenericMergeableSection(StringRef Name);
 
-/// Return the unique ID of the section with the given name, flags and entry
-/// size, if it exists.
 Optional<unsigned> getELFUniqueIDForEntsize(StringRef SectionName,
 unsigned Flags,
 unsigned EntrySize);

diff  --git a/llvm/lib/MC/MCContext.cpp b/llvm/lib/MC/MCContext.cpp
index aa4051aa2400..cc349af6393b 100644
--- a/llvm/lib/MC/MCContext.cpp
+++ b/llvm/lib/MC/MCContext.cpp
@@ -586,7 +586,7 @@ void MCContext::recordELFMergeableSectionInfo(StringRef SectionName,
   unsigned Flags, unsigned UniqueID,
   unsigned EntrySize) {
   bool IsMergeable = Flags & ELF::SHF_MERGE;
-  if (UniqueID == GenericSectionID)
+  if (IsMergeable && (UniqueID == GenericSectionID))
 ELFSeenGenericMergeableSections.insert(SectionName);
 
   // For mergeable sections or non-mergeable sections with a generic mergeable

diff  --git a/llvm/test/CodeGen/Mips/gpopt-explict-section.ll b/llvm/test/CodeGen/Mips/gpopt-explict-section.ll
index d546895749ca..f1518ba562c7 100644
--- a/llvm/test/CodeGen/Mips/gpopt-explict-section.ll
+++ b/llvm/test/CodeGen/Mips/gpopt-explict-section.ll
@@ -7,7 +7,7 @@
 ; small data section. Also test that explicitly placing something in the small
 ; data section uses %gp_rel addressing mode.
 
-@a = constant [2 x i32] zeroinitializer, section ".rodata", align 4
+@a = global [2 x i32] zeroinitializer, section ".rodata", align 4
 @b = global [4 x i32] zeroinitializer, section ".sdata", align 4
 @c = global [4 x i32] zeroinitializer, section ".sbss", align 4
 

diff  --git a/llvm/test/CodeGen/X86/elf-unique-sections-by-flags.ll b/llvm/test/CodeGen/X86/elf-unique-sections-by-flags.ll
deleted file mode 100644
index 0ca08d4b3fea..
--- a/llvm/test/CodeGen/X86/elf-unique-sections-by-flags.ll
+++ /dev/null
@@ -1,140 +0,0 @@
-; Test that global values with the same specified section produces multip

[llvm-branch-commits] [llvm] fbb8b41 - Revert "[AArch64][GlobalISel] Legalize bswap <2 x i16>"

2021-09-10 Thread Tom Stellard via llvm-branch-commits

Author: Tom Stellard
Date: 2021-09-10T21:09:59-07:00
New Revision: fbb8b41588be5012584f14255beb1b0da5d9c12a

URL: 
https://github.com/llvm/llvm-project/commit/fbb8b41588be5012584f14255beb1b0da5d9c12a
DIFF: 
https://github.com/llvm/llvm-project/commit/fbb8b41588be5012584f14255beb1b0da5d9c12a.diff

LOG: Revert "[AArch64][GlobalISel] Legalize bswap <2 x i16>"

This reverts commit 5cd63e9ec2a385de2682949c0bbe928afaf35c91.

https://bugs.llvm.org/show_bug.cgi?id=51707

Added: 


Modified: 
llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.h
llvm/test/CodeGen/AArch64/GlobalISel/legalize-bswap.mir
llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir

Removed: 




diff  --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
index 08e4a119127c9..edf4d06d4d591 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
@@ -103,8 +103,7 @@ AArch64LegalizerInfo::AArch64LegalizerInfo(const AArch64Subtarget &ST)
   getActionDefinitionsBuilder(G_BSWAP)
   .legalFor({s32, s64, v4s32, v2s32, v2s64})
   .clampScalar(0, s32, s64)
-  .widenScalarToNextPow2(0)
-  .customIf(typeIs(0, v2s16)); // custom lower as G_REV32 + G_LSHR
+  .widenScalarToNextPow2(0);
 
   getActionDefinitionsBuilder({G_ADD, G_SUB, G_MUL, G_AND, G_OR, G_XOR})
   .legalFor({s32, s64, v2s32, v4s32, v4s16, v8s16, v16s8, v8s8})
@@ -799,8 +798,6 @@ bool AArch64LegalizerInfo::legalizeCustom(LegalizerHelper &Helper,
   case TargetOpcode::G_LOAD:
   case TargetOpcode::G_STORE:
 return legalizeLoadStore(MI, MRI, MIRBuilder, Observer);
-  case TargetOpcode::G_BSWAP:
-return legalizeBSwap(MI, MRI, MIRBuilder);
   case TargetOpcode::G_SHL:
   case TargetOpcode::G_ASHR:
   case TargetOpcode::G_LSHR:
@@ -1015,46 +1012,6 @@ bool AArch64LegalizerInfo::legalizeLoadStore(
   return true;
 }
 
-bool AArch64LegalizerInfo::legalizeBSwap(MachineInstr &MI,
- MachineRegisterInfo &MRI,
- MachineIRBuilder &MIRBuilder) const {
-  assert(MI.getOpcode() == TargetOpcode::G_BSWAP);
-
-  // The <2 x half> case needs special lowering because there isn't an
-  // instruction that does that directly. Instead, we widen to <8 x i8>
-  // and emit a G_REV32 followed by a G_LSHR knowing that instruction selection
-  // will later match them as:
-  //
-  //   rev32.8b v0, v0
-  //   ushr.2s v0, v0, #16
-  //
-  // We could emit those here directly, but it seems better to keep things as
-  // generic as possible through legalization, and avoid committing layering
-  // violations by legalizing & selecting here at the same time.
-
-  Register ValReg = MI.getOperand(1).getReg();
-  assert(LLT::fixed_vector(2, 16) == MRI.getType(ValReg));
-  const LLT v2s32 = LLT::fixed_vector(2, 32);
-  const LLT v8s8 = LLT::fixed_vector(8, 8);
-  const LLT s32 = LLT::scalar(32);
-
-  auto Undef = MIRBuilder.buildUndef(v8s8);
-  auto Insert =
-  MIRBuilder
-  .buildInstr(TargetOpcode::INSERT_SUBREG, {v8s8}, {Undef, ValReg})
-  .addImm(AArch64::ssub);
-  auto Rev32 = MIRBuilder.buildInstr(AArch64::G_REV32, {v8s8}, {Insert});
-  auto Bitcast = MIRBuilder.buildBitcast(v2s32, Rev32);
-  auto Amt = MIRBuilder.buildConstant(v2s32, 16);
-  auto UShr =
-  MIRBuilder.buildInstr(TargetOpcode::G_LSHR, {v2s32}, {Bitcast, Amt});
-  auto Zero = MIRBuilder.buildConstant(s32, 0);
-  auto Extract = MIRBuilder.buildExtractVectorElement(s32, UShr, Zero);
-  MIRBuilder.buildBitcast({MI.getOperand(0).getReg()}, Extract);
-  MI.eraseFromParent();
-  return true;
-}
-
 bool AArch64LegalizerInfo::legalizeVaArg(MachineInstr &MI,
  MachineRegisterInfo &MRI,
  MachineIRBuilder &MIRBuilder) const {

diff  --git a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.h b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.h
index 78fc24559d71c..35456d95dc2b6 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.h
+++ b/llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.h
@@ -35,8 +35,6 @@ class AArch64LegalizerInfo : public LegalizerInfo {
  MachineInstr &MI) const override;
 
 private:
-  bool legalizeBSwap(MachineInstr &MI, MachineRegisterInfo &MRI,
- MachineIRBuilder &MIRBuilder) const;
   bool legalizeVaArg(MachineInstr &MI, MachineRegisterInfo &MRI,
  MachineIRBuilder &MIRBuilder) const;
   bool legalizeLoadStore(MachineInstr &MI, MachineRegisterInfo &MRI,

diff  --git a/llvm/test/CodeGen/AArch64/GlobalISel/legalize-bswap.mir b/llvm/test/CodeGen/AArch64/GlobalISel/legalize-bswap.mir
index 6ffed0a20600c..5e5cbf63bb073 100644
--- a/llvm/test/CodeGen/A