date:20201120

[llvm-branch-commits] [clang] d50044e - [CUDA] Improve clang's ability to detect recent CUDA versions.

2020-11-20 Thread Tom Stellard via llvm-branch-commits


Author: Artem Belevich
Date: 2020-11-20T22:02:57-08:00
New Revision: d50044e809d2c15c56df0ea808f047a2c81d7344

URL: https://github.com/llvm/llvm-project/commit/d50044e809d2c15c56df0ea808f047a2c81d7344
DIFF: https://github.com/llvm/llvm-project/commit/d50044e809d2c15c56df0ea808f047a2c81d7344.diff

LOG: [CUDA] Improve clang's ability to detect recent CUDA versions.

CUDA-11.1 does not carry version.txt which causes clang to assume that it's
CUDA-7.0, which used to be the only CUDA version w/o version.txt.

In order to tell CUDA-7.0 apart from the new versions, clang now probes for the
presence of libdevice.10.bc which is not present in the old CUDA versions.

This should keep Clang working for CUDA-11.1.
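
The probing order described above can be sketched in isolation. This is a minimal illustration, not the driver code: `GuessedCuda`, `guessCudaVersion`, and the `Exists` callback are hypothetical names, and the callback merely stands in for the file-system query the driver performs. The relative path `nvvm/libdevice/libdevice.10.bc` is taken from the test inputs added by this commit.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical stand-in for the driver's version enum; only the three
// outcomes relevant to this commit are modelled.
enum class GuessedCuda { FromVersionTxt, Cuda70, Newer };

// Classify an installation rooted at InstallPath. Exists is an assumed
// callback answering "does this path exist?".
GuessedCuda guessCudaVersion(
    const std::string &InstallPath,
    const std::function<bool(const std::string &)> &Exists) {
  assert(Exists && "a path-probing callback is required");
  // version.txt, when present, remains the primary source of truth.
  if (Exists(InstallPath + "/version.txt"))
    return GuessedCuda::FromVersionTxt;
  // No version.txt: libdevice.10.bc is absent from CUDA 7.0, so its
  // presence indicates a newer release such as 11.1.
  if (Exists(InstallPath + "/nvvm/libdevice/libdevice.10.bc"))
    return GuessedCuda::Newer;
  return GuessedCuda::Cuda70;
}
```

Passing the existence check as a callback keeps the sketch testable without touching the disk.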

PR47332: https://bugs.llvm.org/show_bug.cgi?id=47332

Differential Revision: https://reviews.llvm.org/D89752

(cherry picked from commit 65d206484c54177641d4b11d42cab1f1acc8c0c7)

Added: 
clang/test/Driver/Inputs/CUDA_111/usr/local/cuda/bin/.keep
clang/test/Driver/Inputs/CUDA_111/usr/local/cuda/include/.keep
clang/test/Driver/Inputs/CUDA_111/usr/local/cuda/lib/.keep
clang/test/Driver/Inputs/CUDA_111/usr/local/cuda/lib64/.keep
clang/test/Driver/Inputs/CUDA_111/usr/local/cuda/nvvm/libdevice/libdevice.10.bc

Modified: 
clang/lib/Driver/ToolChains/Cuda.cpp
clang/test/Driver/cuda-version-check.cu

Removed: 




diff --git a/clang/lib/Driver/ToolChains/Cuda.cpp b/clang/lib/Driver/ToolChains/Cuda.cpp
index 110a0bca9bc1..cfd9dae0fa91 100644
--- a/clang/lib/Driver/ToolChains/Cuda.cpp
+++ b/clang/lib/Driver/ToolChains/Cuda.cpp
@@ -155,9 +155,14 @@ CudaInstallationDetector::CudaInstallationDetector(
 llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> VersionFile =
 FS.getBufferForFile(InstallPath + "/version.txt");
 if (!VersionFile) {
-  // CUDA 7.0 doesn't have a version.txt, so guess that's our version if
-  // version.txt isn't present.
-  Version = CudaVersion::CUDA_70;
+  // CUDA 7.0 and CUDA 11.1+ do not have version.txt file.
+  // Use libdevice file to distinguish 7.0 from the new versions.
+  if (FS.exists(LibDevicePath + "/libdevice.10.bc")) {
+    Version = CudaVersion::LATEST;
+    DetectedVersionIsNotSupported = Version > CudaVersion::LATEST_SUPPORTED;
+  } else {
+    Version = CudaVersion::CUDA_70;
+  }
 } else {
   ParseCudaVersionFile((*VersionFile)->getBuffer());
 }

diff --git a/clang/test/Driver/Inputs/CUDA_111/usr/local/cuda/bin/.keep b/clang/test/Driver/Inputs/CUDA_111/usr/local/cuda/bin/.keep
new file mode 100644
index 000000000000..e69de29bb2d1

diff --git a/clang/test/Driver/Inputs/CUDA_111/usr/local/cuda/include/.keep b/clang/test/Driver/Inputs/CUDA_111/usr/local/cuda/include/.keep
new file mode 100644
index 000000000000..e69de29bb2d1

diff --git a/clang/test/Driver/Inputs/CUDA_111/usr/local/cuda/lib/.keep b/clang/test/Driver/Inputs/CUDA_111/usr/local/cuda/lib/.keep
new file mode 100644
index 000000000000..e69de29bb2d1

diff --git a/clang/test/Driver/Inputs/CUDA_111/usr/local/cuda/lib64/.keep b/clang/test/Driver/Inputs/CUDA_111/usr/local/cuda/lib64/.keep
new file mode 100644
index 000000000000..e69de29bb2d1

diff --git a/clang/test/Driver/Inputs/CUDA_111/usr/local/cuda/nvvm/libdevice/libdevice.10.bc b/clang/test/Driver/Inputs/CUDA_111/usr/local/cuda/nvvm/libdevice/libdevice.10.bc
new file mode 100644
index 000000000000..e69de29bb2d1

diff --git a/clang/test/Driver/cuda-version-check.cu b/clang/test/Driver/cuda-version-check.cu
index a09b248304f2..1e6af029202f 100644
--- a/clang/test/Driver/cuda-version-check.cu
+++ b/clang/test/Driver/cuda-version-check.cu
@@ -10,6 +10,11 @@
 // RUN:    FileCheck %s --check-prefix=OK
 // RUN: %clang --target=x86_64-linux -v -### --cuda-gpu-arch=sm_60 --cuda-path=%S/Inputs/CUDA-unknown/usr/local/cuda 2>&1 %s | \
 // RUN:    FileCheck %s --check-prefix=UNKNOWN_VERSION
+// CUDA versions after 11.0 (update 1) do not carry version.txt file. Make sure
+// we still detect them as a new version and handle them the same as we handle
+// other new CUDA versions.
+// RUN: %clang --target=x86_64-linux -v -### --cuda-gpu-arch=sm_60 --cuda-path=%S/Inputs/CUDA_111/usr/local/cuda 2>&1 %s | \
+// RUN:    FileCheck %s --check-prefix=UNKNOWN_VERSION
 // Make sure that we don't warn about CUDA version during C++ compilation.
 // RUN: %clang --target=x86_64-linux -v -### -x c++ --cuda-gpu-arch=sm_60 \
 // RUN:    --cuda-path=%S/Inputs/CUDA-unknown/usr/local/cuda 2>&1 %s | \
@@ -65,5 +70,5 @@
 // ERR_SM61: error: GPU arch sm_61 {{.*}}
 // ERR_SM61-NOT: error: GPU arch sm_61
 
-// UNKNOWN_VERSION: Unknown CUDA version 999.999. Assuming the latest supported version
+// UNKNOWN_VERSION: Unknown CUDA version {{.*}}. Assuming the latest supported version
 // UNKNOWN_VERSION_CXX-NOT: Unknown CUDA version



___
llvm-branch-commits mailing list
llvm-branch-commits@l

[llvm-branch-commits] [clang] 06f479c - [CUDA] Extract CUDA version from cuda.h if version.txt is not found

2020-11-20 Thread Tom Stellard via llvm-branch-commits


Author: Artem Belevich
Date: 2020-11-20T22:02:57-08:00
New Revision: 06f479cba3a09ef47326ea69e719d2aa1c0fba4c

URL: https://github.com/llvm/llvm-project/commit/06f479cba3a09ef47326ea69e719d2aa1c0fba4c
DIFF: https://github.com/llvm/llvm-project/commit/06f479cba3a09ef47326ea69e719d2aa1c0fba4c.diff

LOG: [CUDA] Extract CUDA version from cuda.h if version.txt is not found

If the CUDA version cannot be determined from the version.txt file, attempt to
find the CUDA_VERSION macro in cuda.h.
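
As a rough illustration of that fallback, the sketch below scans header text for `#define CUDA_VERSION <integer>` and decodes the value. It is not the parser added by this commit: `findCudaVersionMacro`, `decodeCudaVersion`, and `CudaMajorMinor` are hypothetical names, and the scan assumes the directive is written as a single `#define` token on one line. The decoding relies on the CUDA convention that CUDA_VERSION equals major * 1000 + minor * 10.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>

struct CudaMajorMinor {
  unsigned Major;
  unsigned Minor;
};

// Return the integer from the first "#define CUDA_VERSION <integer>" line,
// or std::nullopt if the header text has no such line.
std::optional<unsigned> findCudaVersionMacro(const std::string &HeaderText) {
  std::istringstream Lines(HeaderText);
  std::string Line;
  while (std::getline(Lines, Line)) {
    std::istringstream Words(Line);
    std::string Directive, Name;
    unsigned Value = 0;
    // Extraction fails (and the line is skipped) unless the third token
    // is a number.
    if (Words >> Directive >> Name >> Value && Directive == "#define" &&
        Name == "CUDA_VERSION")
      return Value;
  }
  return std::nullopt;
}

// Split the raw macro value into major and minor parts.
CudaMajorMinor decodeCudaVersion(unsigned Raw) {
  assert(Raw >= 1000 && "CUDA_VERSION encodes major * 1000 + minor * 10");
  return {Raw / 1000, (Raw % 1000) / 10};
}
```

For example, a header defining CUDA_VERSION as 11010 decodes to major 11, minor 1, which is the release that lacks version.txt.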

This is a follow-up to D89752.

Differential Revision: https://reviews.llvm.org/D89832

(cherry picked from commit e7fe125b776bf08d95e60ff3354a5c836218a0e6)

Added: 
clang/test/Driver/Inputs/CUDA_102/usr/local/cuda/bin/.keep
clang/test/Driver/Inputs/CUDA_102/usr/local/cuda/include/.keep
clang/test/Driver/Inputs/CUDA_102/usr/local/cuda/lib/.keep
clang/test/Driver/Inputs/CUDA_102/usr/local/cuda/lib64/.keep
clang/test/Driver/Inputs/CUDA_102/usr/local/cuda/nvvm/libdevice/libdevice.10.bc
clang/test/Driver/Inputs/CUDA_102/usr/local/cuda/version.txt
clang/test/Driver/Inputs/CUDA_111/usr/local/cuda/include/cuda.h

Modified: 
clang/include/clang/Basic/DiagnosticDriverKinds.td
clang/lib/Driver/ToolChains/Cuda.cpp
clang/lib/Driver/ToolChains/Cuda.h
clang/test/Driver/cuda-version-check.cu

Removed: 




diff --git a/clang/include/clang/Basic/DiagnosticDriverKinds.td b/clang/include/clang/Basic/DiagnosticDriverKinds.td
index 558639ecad6a..acdad15cdf6c 100644
--- a/clang/include/clang/Basic/DiagnosticDriverKinds.td
+++ b/clang/include/clang/Basic/DiagnosticDriverKinds.td
@@ -69,7 +69,7 @@ def err_drv_cuda_version_unsupported : Error<
   "install, pass a different GPU arch with --cuda-gpu-arch, or pass "
   "--no-cuda-version-check.">;
 def warn_drv_unknown_cuda_version: Warning<
-  "Unknown CUDA version %0. Assuming the latest supported version %1">,
+  "Unknown CUDA version. %0 Assuming the latest supported version %1">,
   InGroup<CudaUnknownVersion>;
 def err_drv_cuda_host_arch : Error<"unsupported architecture '%0' for host compilation.">;
 def err_drv_mix_cuda_hip : Error<"Mixed Cuda and HIP compilation is not supported.">;

diff --git a/clang/lib/Driver/ToolChains/Cuda.cpp b/clang/lib/Driver/ToolChains/Cuda.cpp
index cfd9dae0fa91..ffc606dd554b 100644
--- a/clang/lib/Driver/ToolChains/Cuda.cpp
+++ b/clang/lib/Driver/ToolChains/Cuda.cpp
@@ -16,6 +16,7 @@
 #include "clang/Driver/Driver.h"
 #include "clang/Driver/DriverDiagnostic.h"
 #include "clang/Driver/Options.h"
+#include "llvm/ADT/Optional.h"
 #include "llvm/Option/ArgList.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/Host.h"
@@ -32,29 +33,80 @@ using namespace clang::driver::tools;
 using namespace clang;
 using namespace llvm::opt;
 
+namespace {
+struct CudaVersionInfo {
+  std::string DetectedVersion;
+  CudaVersion Version;
+};
 // Parses the contents of version.txt in an CUDA installation.  It should
 // contain one line of the from e.g. "CUDA Version 7.5.2".
-void CudaInstallationDetector::ParseCudaVersionFile(llvm::StringRef V) {
-  Version = CudaVersion::UNKNOWN;
+CudaVersionInfo parseCudaVersionFile(llvm::StringRef V) {
+  V = V.trim();
   if (!V.startswith("CUDA Version "))
-    return;
+    return {V.str(), CudaVersion::UNKNOWN};
   V = V.substr(strlen("CUDA Version "));
   SmallVector<StringRef,4> VersionParts;
   V.split(VersionParts, '.');
-  if (VersionParts.size() < 2)
-    return;
-  DetectedVersion = join_items(".", VersionParts[0], VersionParts[1]);
-  Version = CudaStringToVersion(DetectedVersion);
-  if (Version != CudaVersion::UNKNOWN) {
-    // TODO(tra): remove the warning once we have all features of 10.2 and 11.0
-    // implemented.
-    DetectedVersionIsNotSupported = Version > CudaVersion::LATEST_SUPPORTED;
-    return;
-  }
+  return {"version.txt: " + V.str() + ".",
+          VersionParts.size() < 2
+              ? CudaVersion::UNKNOWN
+              : CudaStringToVersion(
+                    join_items(".", VersionParts[0], VersionParts[1]))};
+}
+
+CudaVersion getCudaVersion(uint32_t raw_version) {
+  if (raw_version < 7050)
+    return CudaVersion::CUDA_70;
+  if (raw_version < 8000)
+    return CudaVersion::CUDA_75;
+  if (raw_version < 9000)
+    return CudaVersion::CUDA_80;
+  if (raw_version < 9010)
+    return CudaVersion::CUDA_90;
+  if (raw_version < 9020)
+    return CudaVersion::CUDA_91;
+  if (raw_version < 10000)
+    return CudaVersion::CUDA_92;
+  if (raw_version < 10010)
+    return CudaVersion::CUDA_100;
+  if (raw_version < 10020)
+    return CudaVersion::CUDA_101;
+  if (raw_version < 11000)
+    return CudaVersion::CUDA_102;
+  if (raw_version < 11010)
+    return CudaVersion::CUDA_110;
+  return CudaVersion::LATEST;
+}
 
-  Version = CudaVersion::LATEST_SUPPORTED;
-  DetectedVersionIsNotSupported = true;
+CudaVersionInfo parseCudaHFile(llvm::StringRef Input) {
+  // Helper lambda which skips the words if the

[llvm-branch-commits] [llvm] 973b95e - [MCA][LSUnit] Correctly update the internal group flags on store barrier execution. Fixes PR48024.

2020-11-20 Thread Tom Stellard via llvm-branch-commits


Author: Andrea Di Biagio
Date: 2020-11-20T22:09:45-08:00
New Revision: 973b95e0a8450e701a106896b5fb9aeda46f9071

URL: https://github.com/llvm/llvm-project/commit/973b95e0a8450e701a106896b5fb9aeda46f9071
DIFF: https://github.com/llvm/llvm-project/commit/973b95e0a8450e701a106896b5fb9aeda46f9071.diff

LOG: [MCA][LSUnit] Correctly update the internal group flags on store barrier execution. Fixes PR48024.

This is likely to be a regression introduced by my last refactoring of the
LSUnit (commit 5578ec32f9c4f). Before this patch, the
"CurrentStoreBarrierGroupID" index was not correctly reset on store barrier
executions.  This was leading to unexpected crashes like the one reported as
PR48024.
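
The bookkeeping at issue can be illustrated with a toy model. This is a deliberately simplified sketch, not the real llvm::mca::LSUnit: `BarrierTracker` and `onGroupExecuted` are hypothetical names, and the use of ID 0 to mean "no group" is inferred from the resets in the diff below.

```cpp
#include <cassert>

// Toy tracker for the memory groups currently acting as barriers.
// An ID of 0 is assumed to mean "no such group is active".
struct BarrierTracker {
  unsigned CurrentLoadBarrierGroupID = 0;
  unsigned CurrentStoreBarrierGroupID = 0;

  // Called when every instruction in GroupID has finished executing.
  void onGroupExecuted(unsigned GroupID) {
    assert(GroupID != 0 && "0 is reserved for 'no group'");
    if (GroupID == CurrentLoadBarrierGroupID)
      CurrentLoadBarrierGroupID = 0;
    // The reset that was missing before this patch: without it, the
    // tracker keeps referring to a store barrier group that has retired.
    if (GroupID == CurrentStoreBarrierGroupID)
      CurrentStoreBarrierGroupID = 0;
  }
};
```

In this model, leaving the store barrier ID set after its group retires means later instructions would be ordered against a group that no longer exists, which is the kind of stale state the commit message describes.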

(cherry picked from commit 0e20666db3ac280affe82d31b6c144923704e9c4)

Added: 
llvm/test/tools/llvm-mca/X86/BtVer2/stmxcsr-ldmxcsr.s
llvm/test/tools/llvm-mca/X86/Haswell/stmxcsr-ldmxcsr.s

Modified: 
llvm/lib/MCA/HardwareUnits/LSUnit.cpp

Removed: 




diff --git a/llvm/lib/MCA/HardwareUnits/LSUnit.cpp b/llvm/lib/MCA/HardwareUnits/LSUnit.cpp
index e945e8cecce9..4594368fc0e9 100644
--- a/llvm/lib/MCA/HardwareUnits/LSUnit.cpp
+++ b/llvm/lib/MCA/HardwareUnits/LSUnit.cpp
@@ -243,6 +243,8 @@ void LSUnit::onInstructionExecuted(const InstRef &IR) {
   CurrentStoreGroupID = 0;
 if (GroupID == CurrentLoadBarrierGroupID)
   CurrentLoadBarrierGroupID = 0;
+if (GroupID == CurrentStoreBarrierGroupID)
+  CurrentStoreBarrierGroupID = 0;
   }
 }
 

diff --git a/llvm/test/tools/llvm-mca/X86/BtVer2/stmxcsr-ldmxcsr.s b/llvm/test/tools/llvm-mca/X86/BtVer2/stmxcsr-ldmxcsr.s
new file mode 100644
index 000000000000..52bf97732d95
--- /dev/null
+++ b/llvm/test/tools/llvm-mca/X86/BtVer2/stmxcsr-ldmxcsr.s
@@ -0,0 +1,104 @@
+# NOTE: Assertions have been autogenerated by utils/update_mca_test_checks.py
+# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=btver2 -timeline -timeline-max-iterations=3 < %s | FileCheck %s
+
+# Code snippet taken from PR48024.
+
+stmxcsr -4(%rsp)
+movl    $-24577, %eax   # imm = 0x9FFF
+andl    -4(%rsp), %eax
+movl    %eax, -8(%rsp)
+ldmxcsr -8(%rsp)
+retq
+
+# CHECK:      Iterations:        100
+# CHECK-NEXT: Instructions:      600
+# CHECK-NEXT: Total Cycles:      704
+# CHECK-NEXT: Total uOps:        600
+
+# CHECK:      Dispatch Width:    2
+# CHECK-NEXT: uOps Per Cycle:    0.85
+# CHECK-NEXT: IPC:               0.85
+# CHECK-NEXT: Block RThroughput: 3.0
+
+# CHECK:  Instruction Info:
+# CHECK-NEXT: [1]: #uOps
+# CHECK-NEXT: [2]: Latency
+# CHECK-NEXT: [3]: RThroughput
+# CHECK-NEXT: [4]: MayLoad
+# CHECK-NEXT: [5]: MayStore
+# CHECK-NEXT: [6]: HasSideEffects (U)
+
+# CHECK:      [1]    [2]    [3]    [4]    [5]    [6]    Instructions:
+# CHECK-NEXT:  1      1     1.00           *      U     stmxcsr -4(%rsp)
+# CHECK-NEXT:  1      1     0.50                        movl $-24577, %eax
+# CHECK-NEXT:  1      4     1.00    *                   andl -4(%rsp), %eax
+# CHECK-NEXT:  1      1     1.00           *            movl %eax, -8(%rsp)
+# CHECK-NEXT:  1      3     1.00    *             U     ldmxcsr -8(%rsp)
+# CHECK-NEXT:  1      4     1.00                  U     retq
+
+# CHECK:  Resources:
+# CHECK-NEXT: [0]   - JALU0
+# CHECK-NEXT: [1]   - JALU1
+# CHECK-NEXT: [2]   - JDiv
+# CHECK-NEXT: [3]   - JFPA
+# CHECK-NEXT: [4]   - JFPM
+# CHECK-NEXT: [5]   - JFPU0
+# CHECK-NEXT: [6]   - JFPU1
+# CHECK-NEXT: [7]   - JLAGU
+# CHECK-NEXT: [8]   - JMul
+# CHECK-NEXT: [9]   - JSAGU
+# CHECK-NEXT: [10]  - JSTC
+# CHECK-NEXT: [11]  - JVALU0
+# CHECK-NEXT: [12]  - JVALU1
+# CHECK-NEXT: [13]  - JVIMUL
+
+# CHECK:      Resource pressure per iteration:
+# CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6]    [7]    [8]    [9]    [10]   [11]   [12]   [13]
+# CHECK-NEXT: 1.50   1.50    -      -      -      -      -     3.00    -     2.00    -      -      -      -
+
+# CHECK:      Resource pressure by instruction:
+# CHECK-NEXT: [0]    [1]    [2]    [3]    [4]    [5]    [6]    [7]    [8]    [9]    [10]   [11]   [12]   [13]   Instructions:
+# CHECK-NEXT:  -      -      -      -      -      -      -      -      -     1.00    -      -      -      -     stmxcsr -4(%rsp)
+# CHECK-NEXT: 0.50   0.50    -      -      -      -      -      -      -      -      -      -      -      -     movl $-24577, %eax
+# CHECK-NEXT: 0.50   0.50    -      -      -      -      -     1.00    -      -      -      -      -      -     andl -4(%rsp), %eax
+# CHECK-NEXT:  -      -      -      -      -      -      -      -      -     1.00    -      -      -      -     movl %eax, -8(%rsp)
+# CHECK-NEXT:  -      -      -      -      -      -      -     1.00    -      -      -      -      -      -     ldmxcsr -8(%rsp)
+# CHECK-NEXT: 0.50   0.50    -      -      -      -      -     1.00    -      -      -      -      -      -     retq
+
+# CHECK:  Timeline view:
+# CHECK-NEXT:

[llvm-branch-commits] [llvm] 02b2bcd - [VE] Correct types of return/argument values for getAdjustedFrameSize()

2020-11-20 Thread Kazushi Marukawa via llvm-branch-commits


Author: Kazushi (Jam) Marukawa
Date: 2020-11-21T16:08:20+09:00
New Revision: 02b2bcd940cc61c90a966679b48d3c1a34e13139

URL: https://github.com/llvm/llvm-project/commit/02b2bcd940cc61c90a966679b48d3c1a34e13139
DIFF: https://github.com/llvm/llvm-project/commit/02b2bcd940cc61c90a966679b48d3c1a34e13139.diff

LOG: [VE] Correct types of return/argument values for getAdjustedFrameSize()

getAdjustedFrameSize() may need to handle frame sizes that do not fit in a
32-bit integer, so change its argument and return types from int to uint64_t.
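
The arithmetic involved is small enough to sketch standalone. The constants 176 and 16 come from the diff below; `alignUp` and `adjustedFrameSize` are hypothetical free functions standing in for llvm's alignTo and the VESubtarget member, so this is an illustration rather than the target code.

```cpp
#include <cassert>
#include <cstdint>

// Round Value up to the next multiple of Align (Align must be a power of two).
constexpr std::uint64_t alignUp(std::uint64_t Value, std::uint64_t Align) {
  return (Value + Align - 1) & ~(Align - 1);
}

// Add the fixed 176-byte area and round the total up to 16 bytes, using
// 64-bit arithmetic throughout.
std::uint64_t adjustedFrameSize(std::uint64_t FrameSize) {
  // 176 for the fixed area plus up to 15 bytes of padding must not wrap.
  assert(FrameSize <= UINT64_MAX - 191 && "frame size would overflow");
  FrameSize += 176;
  return alignUp(FrameSize, 16);
}
```

With a 32-bit signed int, a frame size of 0x7FFFFFF0 would overflow when 176 is added; in 64-bit arithmetic the same input simply yields 0x800000A0.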

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D91862

Added: 
llvm/test/CodeGen/VE/Scalar/stackframe_call.ll
llvm/test/CodeGen/VE/Scalar/stackframe_nocall.ll

Modified: 
llvm/lib/Target/VE/VESubtarget.cpp
llvm/lib/Target/VE/VESubtarget.h

Removed: 




diff --git a/llvm/lib/Target/VE/VESubtarget.cpp b/llvm/lib/Target/VE/VESubtarget.cpp
index f9c179e18528..e15969cd6091 100644
--- a/llvm/lib/Target/VE/VESubtarget.cpp
+++ b/llvm/lib/Target/VE/VESubtarget.cpp
@@ -47,7 +47,7 @@ VESubtarget::VESubtarget(const Triple &TT, const std::string &CPU,
   InstrInfo(initializeSubtargetDependencies(CPU, FS)), TLInfo(TM, *this),
   FrameLowering(*this) {}
 
-int VESubtarget::getAdjustedFrameSize(int frameSize) const {
+uint64_t VESubtarget::getAdjustedFrameSize(uint64_t FrameSize) const {
 
   // VE stack frame:
   //
@@ -93,10 +93,10 @@ int VESubtarget::getAdjustedFrameSize(int frameSize) const {
   //  16(fp) | Thread pointer register (%tp=%s14)   |
   // +--+
 
-  frameSize += 176;   // for RSA, RA, and FP
-  frameSize = alignTo(frameSize, 16); // requires 16 bytes alignment
+  FrameSize += 176;   // For RSA, RA, and FP.
+  FrameSize = alignTo(FrameSize, 16); // Requires 16 bytes alignment.
 
-  return frameSize;
+  return FrameSize;
 }
 
 bool VESubtarget::enableMachineScheduler() const { return true; }

diff --git a/llvm/lib/Target/VE/VESubtarget.h b/llvm/lib/Target/VE/VESubtarget.h
index 04c133342f2a..9fe2a8f1f825 100644
--- a/llvm/lib/Target/VE/VESubtarget.h
+++ b/llvm/lib/Target/VE/VESubtarget.h
@@ -72,7 +72,7 @@ class VESubtarget : public VEGenSubtargetInfo {
   /// Given a actual stack size as determined by FrameInfo, this function
   /// returns adjusted framesize which includes space for register window
   /// spills and arguments.
-  int getAdjustedFrameSize(int stackSize) const;
+  uint64_t getAdjustedFrameSize(uint64_t FrameSize) const;
 
   bool isTargetLinux() const { return TargetTriple.isOSLinux(); }
 };

diff --git a/llvm/test/CodeGen/VE/Scalar/stackframe_call.ll b/llvm/test/CodeGen/VE/Scalar/stackframe_call.ll
new file mode 100644
index 000000000000..a6305092dc39
--- /dev/null
+++ b/llvm/test/CodeGen/VE/Scalar/stackframe_call.ll
@@ -0,0 +1,440 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -mtriple=ve | FileCheck %s
+; RUN: llc < %s -mtriple=ve -relocation-model=pic | FileCheck %s --check-prefix=PIC
+
+;; Check stack frame allocation of a function which calls other functions
+
+; Function Attrs: norecurse nounwind readnone
+define signext i32 @test_frame0(i32 signext %0) {
+; CHECK-LABEL: test_frame0:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    adds.w.sx %s0, 3, %s0
+; CHECK-NEXT:    adds.w.sx %s0, %s0, (0)1
+; CHECK-NEXT:    b.l.t (, %s10)
+;
+; PIC-LABEL: test_frame0:
+; PIC:       # %bb.0:
+; PIC-NEXT:    adds.w.sx %s0, 3, %s0
+; PIC-NEXT:    adds.w.sx %s0, %s0, (0)1
+; PIC-NEXT:    b.l.t (, %s10)
+  %2 = add nsw i32 %0, 3
+  ret i32 %2
+}
+
+; Function Attrs: nounwind
+define i8* @test_frame32(i8* %0) {
+; CHECK-LABEL: test_frame32:
+; CHECK:       # %bb.0:
+; CHECK-NEXT:    st %s9, (, %s11)
+; CHECK-NEXT:    st %s10, 8(, %s11)
+; CHECK-NEXT:    st %s15, 24(, %s11)
+; CHECK-NEXT:    st %s16, 32(, %s11)
+; CHECK-NEXT:    or %s9, 0, %s11
+; CHECK-NEXT:    lea %s13, -272
+; CHECK-NEXT:    and %s13, %s13, (32)0
+; CHECK-NEXT:    lea.sl %s11, -1(%s13, %s11)
+; CHECK-NEXT:    brge.l.t %s11, %s8, .LBB1_2
+; CHECK-NEXT:  # %bb.1:
+; CHECK-NEXT:    ld %s61, 24(, %s14)
+; CHECK-NEXT:    or %s62, 0, %s0
+; CHECK-NEXT:    lea %s63, 315
+; CHECK-NEXT:    shm.l %s63, (%s61)
+; CHECK-NEXT:    shm.l %s8, 8(%s61)
+; CHECK-NEXT:    shm.l %s11, 16(%s61)
+; CHECK-NEXT:    monc
+; CHECK-NEXT:    or %s0, 0, %s62
+; CHECK-NEXT:  .LBB1_2:
+; CHECK-NEXT:    or %s1, 0, %s0
+; CHECK-NEXT:    lea %s0, fun@lo
+; CHECK-NEXT:    and %s0, %s0, (32)0
+; CHECK-NEXT:    lea.sl %s12, fun@hi(, %s0)
+; CHECK-NEXT:    lea %s0, 240(, %s11)
+; CHECK-NEXT:    bsic %s10, (, %s12)
+; CHECK-NEXT:    or %s11, 0, %s9
+; CHECK-NEXT:    ld %s16, 32(, %s11)
+; CHECK-NEXT:    ld %s15, 24(, %s11)
+; CHECK-NEXT:    ld %s10, 8(, %s11)
+; CHECK-NEXT:    ld %s9, (, %s11)
+; CHECK-NEXT:    b.l.t (, %s10)
+;
+; PIC-LABEL: test_frame32:
+; PIC:       # %bb.0:
+; PIC-NEXT:    st %s9, (, %s11)
+;

[llvm-branch-commits] [llvm] 4a1d230 - [VE][NFC] Modify function order and simplify comments

2020-11-20 Thread Kazushi Marukawa via llvm-branch-commits


Author: Kazushi (Jam) Marukawa
Date: 2020-11-21T16:09:37+09:00
New Revision: 4a1d230fa6f4cf27e4ce0626afe6c1434eab29b2

URL: https://github.com/llvm/llvm-project/commit/4a1d230fa6f4cf27e4ce0626afe6c1434eab29b2
DIFF: https://github.com/llvm/llvm-project/commit/4a1d230fa6f4cf27e4ce0626afe6c1434eab29b2.diff

LOG: [VE][NFC] Modify function order and simplify comments

Added: 


Modified: 
llvm/lib/Target/VE/VEISelLowering.cpp

Removed: 




diff --git a/llvm/lib/Target/VE/VEISelLowering.cpp b/llvm/lib/Target/VE/VEISelLowering.cpp
index c41d0a416eaa..eb47d01afc77 100644
--- a/llvm/lib/Target/VE/VEISelLowering.cpp
+++ b/llvm/lib/Target/VE/VEISelLowering.cpp
@@ -801,30 +801,6 @@ bool VETargetLowering::allowsMisalignedMemoryAccesses(EVT VT,
   return true;
 }
 
-bool VETargetLowering::hasAndNot(SDValue Y) const {
-  EVT VT = Y.getValueType();
-
-  // VE doesn't have vector and not instruction.
-  if (VT.isVector())
-return false;
-
-  // VE allows different immediate values for X and Y where ~X & Y.
-  // Only simm7 works for X, and only mimm works for Y on VE.  However, this
-  // function is used to check whether an immediate value is OK for and-not
-  // instruction as both X and Y.  Generating additional instruction to
-  // retrieve an immediate value is no good since the purpose of this
-  // function is to convert a series of 3 instructions to another series of
-  // 3 instructions with better parallelism.  Therefore, we return false
-  // for all immediate values now.
-  // FIXME: Change hasAndNot function to have two operands to make it work
-  //correctly with Aurora VE.
-  if (isa<ConstantSDNode>(Y))
-    return false;
-
-  // It's ok for generic registers.
-  return true;
-}
-
 VETargetLowering::VETargetLowering(const TargetMachine &TM,
const VESubtarget &STI)
 : TargetLowering(TM), Subtarget(&STI) {
@@ -1617,7 +1593,7 @@ SDValue VETargetLowering::PerformDAGCombine(SDNode *N,
 }
 
 //===----------------------------------------------------------------------===//
-// VE Inline Assembly Support
+// VE Inline Assembly Support
 //===----------------------------------------------------------------------===//
 
 VETargetLowering::ConstraintType
@@ -1666,3 +1642,27 @@ unsigned VETargetLowering::getMinimumJumpTableEntries() const {
 
   return TargetLowering::getMinimumJumpTableEntries();
 }
+
+bool VETargetLowering::hasAndNot(SDValue Y) const {
+  EVT VT = Y.getValueType();
+
+  // VE doesn't have vector and not instruction.
+  if (VT.isVector())
+return false;
+
+  // VE allows different immediate values for X and Y where ~X & Y.
+  // Only simm7 works for X, and only mimm works for Y on VE.  However, this
+  // function is used to check whether an immediate value is OK for and-not
+  // instruction as both X and Y.  Generating additional instruction to
+  // retrieve an immediate value is no good since the purpose of this
+  // function is to convert a series of 3 instructions to another series of
+  // 3 instructions with better parallelism.  Therefore, we return false
+  // for all immediate values now.
+  // FIXME: Change hasAndNot function to have two operands to make it work
+  //correctly with Aurora VE.
+  if (isa<ConstantSDNode>(Y))
+    return false;
+
+  // It's ok for generic registers.
+  return true;
+}



___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
