[llvm-branch-commits] [clang] [libcxx] [flang] [llvm] [libc] [compiler-rt] [RISCV] Support select optimization (PR #80124)
wangpc-pp wrote:

> JFYI, I don't find the AArch64 data particularly convincing for RISCV. The
> magnitude of the change even on AArch64 is small, and could easily be swung
> one direction or the other by differences in implementation between the
> backends.

Yeah! The results will differ across targets and CPUs. Here is one RISCV data point for SPEC 2006 (which I don't think is universal), measured on an OoO RISCV CPU with options `-march=rv64gc_zba_zbb_zicond -O3`:

```
400.perlbench    0.538%
401.bzip2        0.018%
403.gcc          0.105%
429.mcf          1.028%
445.gobmk       -0.221%
456.hmmer        1.582%
458.sjeng       -0.026%
462.libquantum  -0.090%
464.h264ref      0.905%
471.omnetpp     -0.776%
473.astar        0.205%
```

The geomean is 0.295%. The result may be better with PGO, I think (I haven't tried it). Some related discussion: https://discourse.llvm.org/t/rfc-cmov-vs-branch-optimization. So I think we can do the same as AArch64: make it a tune feature that processors can add if needed.

https://github.com/llvm/llvm-project/pull/80124
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
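For reference, the geomean quoted in this message can be recomputed from the per-benchmark numbers. The following is a minimal sketch, assuming each percentage is a change relative to the baseline and that the geomean is taken over the ratios 1 + p/100; the names `geomeanPercent` and `Spec2006Deltas` are hypothetical and not part of the patch.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: geometric mean of percentage changes.
// Each entry p is treated as a ratio (1 + p/100); the result is the
// geometric mean of those ratios, converted back to a percentage.
double geomeanPercent(const std::vector<double> &Percents) {
  if (Percents.empty())
    return 0.0; // nothing to average
  double LogSum = 0.0;
  for (double P : Percents)
    LogSum += std::log1p(P / 100.0); // log(1 + p/100)
  return std::expm1(LogSum / Percents.size()) * 100.0;
}

// The SPEC 2006 deltas quoted in the message, in benchmark order.
const std::vector<double> Spec2006Deltas = {
    0.538, 0.018, 0.105, 1.028, -0.221, 1.582,
    -0.026, -0.090, 0.905, -0.776, 0.205};
```

Under these assumptions, `geomeanPercent(Spec2006Deltas)` should come out close to the 0.295% reported.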
[llvm-branch-commits] [Clang][RISCV] Refactor builtins to TableGen (PR #80280)
wangpc-pp wrote: Ping. https://github.com/llvm/llvm-project/pull/80280
[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/80124

>From e3fb1fe7bdd4b7c24f9361c4d14dd1206fc8c067 Mon Sep 17 00:00:00 2001
From: wangpc
Date: Sun, 18 Feb 2024 11:12:16 +0800
Subject: [PATCH] Move after addIRPasses

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVTargetMachine.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index fdf1c023fff878..7a26e1956424cb 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -450,15 +450,15 @@ void RISCVPassConfig::addIRPasses() {
     if (EnableLoopDataPrefetch)
       addPass(createLoopDataPrefetchPass());

-    if (EnableSelectOpt && getOptLevel() == CodeGenOptLevel::Aggressive)
-      addPass(createSelectOptimizePass());
-
     addPass(createRISCVGatherScatterLoweringPass());
     addPass(createInterleavedAccessPass());
     addPass(createRISCVCodeGenPreparePass());
   }

   TargetPassConfig::addIRPasses();
+
+  if (getOptLevel() == CodeGenOptLevel::Aggressive && EnableSelectOpt)
+    addPass(createSelectOptimizePass());
 }

 bool RISCVPassConfig::addPreISel() {
[llvm-branch-commits] [RISCV] Support llvm.readsteadycounter intrinsic (PR #82322)
https://github.com/wangpc-pp created https://github.com/llvm/llvm-project/pull/82322

This intrinsic was introduced by #81331 and is a lot like `llvm.readcyclecounter`. For the RISCV implementation, we rename the `ReadCycleWide` pseudo to `ReadCounterWide` and make it accept two operands (the low and high parts of the counter). For legalization and lowering, we reuse the code for `ISD::READCYCLECOUNTER`, extending it to handle both intrinsics. Tests using Clang builtins were run on real hardware and work as expected.
[llvm-branch-commits] [RISCV] Support llvm.readsteadycounter intrinsic (PR #82322)
https://github.com/wangpc-pp edited https://github.com/llvm/llvm-project/pull/82322
[llvm-branch-commits] [llvm] [RISCV] Support llvm.readsteadycounter intrinsic (PR #82322)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/82322

>From f8415de83823cd5b244fcb288b29d4afc7ea10db Mon Sep 17 00:00:00 2001
From: Wang Pengcheng
Date: Tue, 20 Feb 2024 18:20:03 +0800
Subject: [PATCH] Fix typo and address comments

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVISelLowering.cpp | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 32d47a669020f1..1814928c5ca159 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -625,7 +625,7 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
   if (Subtarget.is64Bit())
     setOperationAction(ISD::Constant, MVT::i64, Custom);

-  // TODO: On M-mode only targets, the cycle[h] CSR may not be present.
+  // TODO: On M-mode only targets, the cycle[h]/time[h] CSR may not be present.
   // Unfortunately this can't be determined just from the ISA naming string.
   setOperationAction(ISD::READCYCLECOUNTER, MVT::i64,
                      Subtarget.is64Bit() ? Legal : Custom);
@@ -11739,7 +11739,7 @@ void RISCVTargetLowering::ReplaceNodeResults(SDNode *N,
           RISCVSysReg::lookupSysRegByName("CYCLE")->Encoding, DL, XLenVT);
       HiCounter = DAG.getConstant(
           RISCVSysReg::lookupSysRegByName("CYCLEH")->Encoding, DL, XLenVT);
-    } else if (N->getOpcode() == ISD::READSTEADYCOUNTER) {
+    } else {
       LoCounter = DAG.getConstant(
           RISCVSysReg::lookupSysRegByName("TIME")->Encoding, DL, XLenVT);
       HiCounter = DAG.getConstant(
@@ -16929,9 +16929,9 @@ static MachineBasicBlock *emitReadCounterWidePseudo(MachineInstr &MI,
   // For example:
   // ```
   // read:
-  // csrrs x3, counter # load high word of counter
-  // csrrs x2, counterh # load low word of counter
-  // csrrs x4, counter # load high word of counter
+  // csrrs x3, counterh # load high word of counter
+  // csrrs x2, counter # load low word of counter
+  // csrrs x4, counterh # load high word of counter
   // bne x3, x4, read # check if high word reads match, otherwise try again
   // ```
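The retry loop described in the comment of this patch can be modelled in plain C++. This is a sketch for illustration, not the code the backend emits: `MockCounter` is a hypothetical stand-in for the 64-bit CSR, with `readHi()` and `readLo()` playing the role of `csrrs` on `counterh` and `counter`, and it advances on every low-word read so that a carry into the high word can be simulated.

```cpp
#include <cstdint>

// Hypothetical stand-in for a 64-bit counter CSR on a 32-bit target.
struct MockCounter {
  uint64_t Value;
  uint64_t Step; // how far the counter advances on each low-word read

  uint32_t readHi() const { return static_cast<uint32_t>(Value >> 32); }
  uint32_t readLo() {
    uint32_t Lo = static_cast<uint32_t>(Value);
    Value += Step; // time passes while the two halves are being read
    return Lo;
  }
};

// Read high, low, then high again. If the two high-word reads differ, the
// low word may have wrapped in between, so the whole sequence is retried.
uint64_t readCounterWide(MockCounter &C) {
  uint32_t Hi, Lo, HiAgain;
  do {
    Hi = C.readHi();
    Lo = C.readLo();
    HiAgain = C.readHi();
  } while (Hi != HiAgain);
  return (static_cast<uint64_t>(Hi) << 32) | Lo;
}
```

The point of reading the high word twice is that the result is only accepted when both high-word reads agree, so a mismatched pair (old high word with new low word, or the reverse) is never returned.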
[llvm-branch-commits] [llvm] [RISCV] Support llvm.readsteadycounter intrinsic (PR #82322)
https://github.com/wangpc-pp edited https://github.com/llvm/llvm-project/pull/82322
[llvm-branch-commits] [llvm] [RISCV] Support llvm.readsteadycounter intrinsic (PR #82322)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/82322

>From f8415de83823cd5b244fcb288b29d4afc7ea10db Mon Sep 17 00:00:00 2001
From: Wang Pengcheng
Date: Tue, 20 Feb 2024 18:20:03 +0800
Subject: [PATCH 1/2] Fix typo and address comments

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVISelLowering.cpp | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
index 32d47a669020f1..1814928c5ca159 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.cpp
@@ -625,7 +625,7 @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
   if (Subtarget.is64Bit())
     setOperationAction(ISD::Constant, MVT::i64, Custom);

-  // TODO: On M-mode only targets, the cycle[h] CSR may not be present.
+  // TODO: On M-mode only targets, the cycle[h]/time[h] CSR may not be present.
   // Unfortunately this can't be determined just from the ISA naming string.
   setOperationAction(ISD::READCYCLECOUNTER, MVT::i64,
                      Subtarget.is64Bit() ? Legal : Custom);
@@ -11739,7 +11739,7 @@ void RISCVTargetLowering::ReplaceNodeResults(SDNode *N,
           RISCVSysReg::lookupSysRegByName("CYCLE")->Encoding, DL, XLenVT);
       HiCounter = DAG.getConstant(
           RISCVSysReg::lookupSysRegByName("CYCLEH")->Encoding, DL, XLenVT);
-    } else if (N->getOpcode() == ISD::READSTEADYCOUNTER) {
+    } else {
       LoCounter = DAG.getConstant(
           RISCVSysReg::lookupSysRegByName("TIME")->Encoding, DL, XLenVT);
       HiCounter = DAG.getConstant(
@@ -16929,9 +16929,9 @@ static MachineBasicBlock *emitReadCounterWidePseudo(MachineInstr &MI,
   // For example:
   // ```
   // read:
-  // csrrs x3, counter # load high word of counter
-  // csrrs x2, counterh # load low word of counter
-  // csrrs x4, counter # load high word of counter
+  // csrrs x3, counterh # load high word of counter
+  // csrrs x2, counter # load low word of counter
+  // csrrs x4, counterh # load high word of counter
   // bne x3, x4, read # check if high word reads match, otherwise try again
   // ```

>From 95acdc7abbc5b85e6370b83a9efe961ccfb54e27 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng
Date: Wed, 21 Feb 2024 13:03:08 +0800
Subject: [PATCH 2/2] Remove duplicated comments

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVISelLowering.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVISelLowering.h b/llvm/lib/Target/RISCV/RISCVISelLowering.h
index 879af0ecdf8bc0..83b1c68eea61ac 100644
--- a/llvm/lib/Target/RISCV/RISCVISelLowering.h
+++ b/llvm/lib/Target/RISCV/RISCVISelLowering.h
@@ -126,9 +126,10 @@ enum NodeType : unsigned {
   // Floating point fmax and fmin matching the RISC-V instruction semantics.
   FMAX, FMIN,

-  // READ_COUNTER_WIDE - A read of the 64-bit counter CSR on a 32-bit target
-  // (returns (Lo, Hi)). It takes a chain operand.
+  // A read of the 64-bit counter CSR on a 32-bit target (returns (Lo, Hi)).
+  // It takes a chain operand.
   READ_COUNTER_WIDE,
+
   // brev8, orc.b, zip, and unzip from Zbb and Zbkb. All operands are i32 or
   // XLenVT.
   BREV8,
[llvm-branch-commits] [llvm] [RISCV] Support llvm.readsteadycounter intrinsic (PR #82322)
https://github.com/wangpc-pp closed https://github.com/llvm/llvm-project/pull/82322
[llvm-branch-commits] [llvm] [RISCV] Support llvm.readsteadycounter intrinsic (PR #82322)
wangpc-pp wrote:

Committed as b8ed69ecc01385c03844e8fa05ba418a5670d322. SPR sometimes fails to land after rebasing:

```shell
# spr land --cherry-pick
d830d43 [RISCV] Support llvm.readsteadycounter intrinsic
#️⃣ Pull Request #82322
🛫 Getting started...
🛑 GitHub: Validation Failed
Documentation URL: https://docs.github.com/rest/pulls/pulls#update-a-pull-request
Errors:
- {"code":"invalid","field":"base","message":"Proposed base branch 'refs/heads/main' is invalid","resource":"PullRequest"}
```

https://github.com/llvm/llvm-project/pull/82322
[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)
wangpc-pp wrote: Gentle ping. https://github.com/llvm/llvm-project/pull/80124
[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/80124

>From e3fb1fe7bdd4b7c24f9361c4d14dd1206fc8c067 Mon Sep 17 00:00:00 2001
From: wangpc
Date: Sun, 18 Feb 2024 11:12:16 +0800
Subject: [PATCH 1/2] Move after addIRPasses

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVTargetMachine.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
index fdf1c023fff878..7a26e1956424cb 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetMachine.cpp
@@ -450,15 +450,15 @@ void RISCVPassConfig::addIRPasses() {
     if (EnableLoopDataPrefetch)
       addPass(createLoopDataPrefetchPass());

-    if (EnableSelectOpt && getOptLevel() == CodeGenOptLevel::Aggressive)
-      addPass(createSelectOptimizePass());
-
     addPass(createRISCVGatherScatterLoweringPass());
     addPass(createInterleavedAccessPass());
     addPass(createRISCVCodeGenPreparePass());
   }

   TargetPassConfig::addIRPasses();
+
+  if (getOptLevel() == CodeGenOptLevel::Aggressive && EnableSelectOpt)
+    addPass(createSelectOptimizePass());
 }

 bool RISCVPassConfig::addPreISel() {

>From 5d5398596dc30c47c67572ec20137fb3f9434940 Mon Sep 17 00:00:00 2001
From: wangpc
Date: Wed, 21 Feb 2024 21:21:28 +0800
Subject: [PATCH 2/2] Fix test

Created using spr 1.3.4
---
 llvm/test/CodeGen/RISCV/O3-pipeline.ll | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/llvm/test/CodeGen/RISCV/O3-pipeline.ll b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
index 62c1af52e6c20e..8b52e3fe7b2f15 100644
--- a/llvm/test/CodeGen/RISCV/O3-pipeline.ll
+++ b/llvm/test/CodeGen/RISCV/O3-pipeline.ll
@@ -34,15 +34,6 @@
 ; CHECK-NEXT: Optimization Remark Emitter
 ; CHECK-NEXT: Scalar Evolution Analysis
 ; CHECK-NEXT: Loop Data Prefetch
-; CHECK-NEXT: Post-Dominator Tree Construction
-; CHECK-NEXT: Branch Probability Analysis
-; CHECK-NEXT: Block Frequency Analysis
-; CHECK-NEXT: Lazy Branch Probability Analysis
-; CHECK-NEXT: Lazy Block Frequency Analysis
-; CHECK-NEXT: Optimization Remark Emitter
-; CHECK-NEXT: Optimize selects
-; CHECK-NEXT: Dominator Tree Construction
-; CHECK-NEXT: Natural Loop Information
 ; CHECK-NEXT: RISC-V gather/scatter lowering
 ; CHECK-NEXT: Interleaved Access Pass
 ; CHECK-NEXT: RISC-V CodeGenPrepare
@@ -77,6 +68,15 @@
 ; CHECK-NEXT: Expand reduction intrinsics
 ; CHECK-NEXT: Natural Loop Information
 ; CHECK-NEXT: TLS Variable Hoist
+; CHECK-NEXT: Post-Dominator Tree Construction
+; CHECK-NEXT: Branch Probability Analysis
+; CHECK-NEXT: Block Frequency Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
+; CHECK-NEXT: Optimization Remark Emitter
+; CHECK-NEXT: Optimize selects
+; CHECK-NEXT: Dominator Tree Construction
+; CHECK-NEXT: Natural Loop Information
 ; CHECK-NEXT: CodeGen Prepare
 ; CHECK-NEXT: Dominator Tree Construction
 ; CHECK-NEXT: Exception handling preparation
[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)
wangpc-pp wrote: Ping. https://github.com/llvm/llvm-project/pull/80124
[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)
@@ -101,6 +101,11 @@ static cl::opt EnableMISchedLoadClustering(
     cl::desc("Enable load clustering in the machine scheduler"),
     cl::init(false));

+static cl::opt
+    EnableSelectOpt("riscv-select-opt", cl::Hidden,

wangpc-pp wrote:

We have already disabled it via `enableSelectOptimize()`?

https://github.com/llvm/llvm-project/pull/80124
[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)
@@ -101,6 +101,11 @@ static cl::opt EnableMISchedLoadClustering(
     cl::desc("Enable load clustering in the machine scheduler"),
     cl::init(false));

+static cl::opt
+    EnableSelectOpt("riscv-select-opt", cl::Hidden,

wangpc-pp wrote:

Yeah, this point makes sense to me. This pass adds several analysis passes (most of which can be cached), so it may impact compile time.

https://github.com/llvm/llvm-project/pull/80124
[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)
https://github.com/wangpc-pp edited https://github.com/llvm/llvm-project/pull/80124
[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)
@@ -101,6 +101,11 @@ static cl::opt EnableMISchedLoadClustering(
     cl::desc("Enable load clustering in the machine scheduler"),
     cl::init(false));

+static cl::opt
+    EnableSelectOpt("riscv-select-opt", cl::Hidden,

wangpc-pp wrote:

@topperc WDYT?

https://github.com/llvm/llvm-project/pull/80124
[llvm-branch-commits] [llvm] release/18.x: [TableGen] Fix wrong codegen of BothFusionPredicateWithMCInstPredicate (#83990) (PR #83999)
wangpc-pp wrote: Ping? https://github.com/llvm/llvm-project/pull/83999
[llvm-branch-commits] [RISCV] Pass LMUL to copyPhysRegVector (PR #84448)
https://github.com/wangpc-pp created https://github.com/llvm/llvm-project/pull/84448

The opcode will be determined by LMUL.
[llvm-branch-commits] [RISCV] Pass LMUL to copyPhysRegVector (PR #84448)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/84448
[llvm-branch-commits] [RISCV][NFC] Refactor copyPhysRegVector (PR #84455)
https://github.com/wangpc-pp created https://github.com/llvm/llvm-project/pull/84455

Reduce some duplication and make the code easier to follow. We can optimize segment copies later.
[llvm-branch-commits] [RISCV][NFC] Refactor copyPhysRegVector (PR #84455)
https://github.com/wangpc-pp converted_to_draft https://github.com/llvm/llvm-project/pull/84455
[llvm-branch-commits] [llvm][lld][RISCV] Support x3_reg_usage (PR #84598)
@@ -24,6 +24,9 @@
 .attribute priv_spec_revision, 0
 # CHECK: attribute 12, 0
+

wangpc-pp wrote:

This blank line should be added in the Atomic ABI PR, I think.

https://github.com/llvm/llvm-project/pull/84598
[llvm-branch-commits] [llvm][lld][RISCV] Support x3_reg_usage (PR #84598)
@@ -520,3 +520,8 @@ define i8 @atomic_load_i8_seq_cst(ptr %a) nounwind {
 ; A6S: .attribute 14, 2
 ; A6C: .attribute 14, 1
 }
+

wangpc-pp wrote:

Are there no CHECK lines for this test, or am I missing something here?

https://github.com/llvm/llvm-project/pull/84598
[llvm-branch-commits] [llvm][lld][RISCV] Support x3_reg_usage (PR #84598)
@@ -47,6 +48,15 @@ enum AtomicABI : unsigned {
 };
 } // namespace RISCVAtomicAbiTag

+namespace RISCVX3RegUse {
+enum X3RegUsage : unsigned {
+  UNKNOWN = 0,
+  GP = 0,

wangpc-pp wrote:

Copy-paste mistake? Why are these all 0?

https://github.com/llvm/llvm-project/pull/84598
[llvm-branch-commits] [RISCV][NFC] Pass LMUL to copyPhysRegVector (PR #84448)
https://github.com/wangpc-pp edited https://github.com/llvm/llvm-project/pull/84448
[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/84455

>From 35d0ea085b43a67c092e6263e6ec9d34e66e1453 Mon Sep 17 00:00:00 2001
From: Wang Pengcheng
Date: Tue, 12 Mar 2024 17:31:47 +0800
Subject: [PATCH] Reduce copies

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 89 +-
 llvm/test/CodeGen/RISCV/rvv/vmv-copy.mir | 30 +---
 llvm/test/CodeGen/RISCV/rvv/zvlsseg-copy.mir | 175 +++
 3 files changed, 106 insertions(+), 188 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 7895e87702c711..9fe5666d6a81f4 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -302,58 +302,38 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB,
                                        RISCVII::VLMUL LMul, unsigned NF) const {
   const TargetRegisterInfo *TRI = STI.getRegisterInfo();
-  int I = 0, End = NF, Incr = 1;
   unsigned SrcEncoding = TRI->getEncodingValue(SrcReg);
   unsigned DstEncoding = TRI->getEncodingValue(DstReg);
   unsigned LMulVal;
   bool Fractional;
   std::tie(LMulVal, Fractional) = RISCVVType::decodeVLMUL(LMul);
   assert(!Fractional && "It is impossible be fractional lmul here.");
-  if (forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NF * LMulVal)) {
-    I = NF - 1;
-    End = -1;
-    Incr = -1;
-  }
+  unsigned NumRegs = NF * LMulVal;
+  bool ReversedCopy =
+      forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NumRegs);
-  for (; I != End; I += Incr) {
+  unsigned I = 0;
+  while (I != NumRegs) {
     auto GetCopyInfo =
-        [](RISCVII::VLMUL LMul,unsigned NF) -> std::tuple {
-      unsigned Opc;
-      unsigned SubRegIdx;
-      unsigned VVOpc, VIOpc;
-      switch (LMul) {
-      default:
-        llvm_unreachable("Impossible LMUL for vector register copy.");
-      case RISCVII::LMUL_1:
-        Opc = RISCV::VMV1R_V;
-        SubRegIdx = RISCV::sub_vrm1_0;
-        VVOpc = RISCV::PseudoVMV_V_V_M1;
-        VIOpc = RISCV::PseudoVMV_V_I_M1;
-        break;
-      case RISCVII::LMUL_2:
-        Opc = RISCV::VMV2R_V;
-        SubRegIdx = RISCV::sub_vrm2_0;
-        VVOpc = RISCV::PseudoVMV_V_V_M2;
-        VIOpc = RISCV::PseudoVMV_V_I_M2;
-        break;
-      case RISCVII::LMUL_4:
-        Opc = RISCV::VMV4R_V;
-        SubRegIdx = RISCV::sub_vrm4_0;
-        VVOpc = RISCV::PseudoVMV_V_V_M4;
-        VIOpc = RISCV::PseudoVMV_V_I_M4;
-        break;
-      case RISCVII::LMUL_8:
-        assert(NF == 1);
-        Opc = RISCV::VMV8R_V;
-        SubRegIdx = RISCV::sub_vrm1_0; // There is no sub_vrm8_0.
-        VVOpc = RISCV::PseudoVMV_V_V_M8;
-        VIOpc = RISCV::PseudoVMV_V_I_M8;
-        break;
-      }
-      return {SubRegIdx, Opc, VVOpc, VIOpc};
+        [&](unsigned SrcReg,
+            unsigned DstReg) -> std::tuple {
+      unsigned SrcEncoding = TRI->getEncodingValue(SrcReg);
+      unsigned DstEncoding = TRI->getEncodingValue(DstReg);
+      if (!(SrcEncoding & 0b111) && !(DstEncoding & 0b111) && I + 8 <= NumRegs)
+        return {8, RISCV::VRM8RegClass, RISCV::VMV8R_V, RISCV::PseudoVMV_V_V_M8,
+                RISCV::PseudoVMV_V_I_M8};
+      if (!(SrcEncoding & 0b11) && !(DstEncoding & 0b11) && I + 4 <= NumRegs)
+        return {4, RISCV::VRM4RegClass, RISCV::VMV4R_V, RISCV::PseudoVMV_V_V_M4,
+                RISCV::PseudoVMV_V_I_M4};
+      if (!(SrcEncoding & 0b1) && !(DstEncoding & 0b1) && I + 2 <= NumRegs)
+        return {2, RISCV::VRM2RegClass, RISCV::VMV2R_V, RISCV::PseudoVMV_V_V_M2,
+                RISCV::PseudoVMV_V_I_M2};
+      return {1, RISCV::VRRegClass, RISCV::VMV1R_V, RISCV::PseudoVMV_V_V_M1,
+              RISCV::PseudoVMV_V_I_M1};
     };
-    auto [SubRegIdx, Opc, VVOpc, VIOpc] = GetCopyInfo(LMul, NF);
+    auto [NumCopied, RegClass, Opc, VVOpc, VIOpc] = GetCopyInfo(SrcReg, DstReg);
     MachineBasicBlock::const_iterator DefMBBI;
     if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) {
@@ -364,6 +344,20 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB,
       }
     }

+    for (MCPhysReg Reg : RegClass.getRegisters()) {
+      if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(SrcReg)) {
+        SrcReg = Reg;
+        break;
+      }
+    }
+
+    for (MCPhysReg Reg : RegClass.getRegisters()) {
+      if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(DstReg)) {
+        DstReg = Reg;
+        break;
+      }
+    }
+
     auto EmitCopy = [&](MCRegister SrcReg, MCRegister DstReg, unsigned Opcode) {
       auto MIB = BuildMI(MBB, MBBI, DL, get(Opcode), DstReg);
       bool UseVMV_V_I = RISCV::getRVVMCOpcode(Opcode) == RISCV::VMV_V_I;
@@ -385,13 +379,10 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB,
       }
     };

-    if (NF == 1) {
-      EmitCopy(SrcReg, DstReg, Opc);
-      return;
-    }
-
-    EmitCopy(TRI->getSubReg(SrcReg, SubRegIdx + I),
-             TRI->g
[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)
https://github.com/wangpc-pp edited https://github.com/llvm/llvm-project/pull/84455
[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)
https://github.com/wangpc-pp ready_for_review https://github.com/llvm/llvm-project/pull/84455
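The alignment check in `GetCopyInfo` from the PR #84455 patch can be illustrated in isolation. The sketch below is a simplified model under stated assumptions rather than the patch itself: registers are represented only by their encoding numbers, `CopyChunk` and `planCopies` are hypothetical names, and only the forward direction is handled (the reversed order needed when a forward copy would clobber the source is left out). At each position it picks the widest whole-register move (8, 4, 2 or 1 registers) for which both encodings are aligned and enough registers remain.

```cpp
#include <vector>

// One planned whole-register move: copy Width registers starting at the
// given source and destination encodings (Width is 1, 2, 4 or 8).
struct CopyChunk {
  unsigned SrcEnc;
  unsigned DstEnc;
  unsigned Width;
};

// Hypothetical model of the selection logic: walk NumRegs registers forward
// and greedily use the largest aligned move at each position.
std::vector<CopyChunk> planCopies(unsigned SrcEnc, unsigned DstEnc,
                                  unsigned NumRegs) {
  std::vector<CopyChunk> Plan;
  unsigned I = 0;
  while (I != NumRegs) {
    unsigned Src = SrcEnc + I, Dst = DstEnc + I;
    unsigned Width = 1;
    for (unsigned W : {8u, 4u, 2u}) {
      // W is a power of two, so (Enc & (W - 1)) == 0 tests alignment to W.
      if ((Src & (W - 1)) == 0 && (Dst & (W - 1)) == 0 && I + W <= NumRegs) {
        Width = W;
        break;
      }
    }
    Plan.push_back({Src, Dst, Width});
    I += Width;
  }
  return Plan;
}
```

For example, under these assumptions a six-register tuple copied from encoding 8 to encoding 16 is planned as one 4-register move followed by one 2-register move, rather than three 2-register moves.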
[llvm-branch-commits] [RISCV] Store VLMul/NF into RegisterClass's TSFlags (PR #84894)
https://github.com/wangpc-pp created https://github.com/llvm/llvm-project/pull/84894

The register class TSFlags field was introduced by https://reviews.llvm.org/D108815. We store VLMul/NF in TSFlags and add helpers to read them back. This saves some lines of code, and I think there will be more uses.
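As an illustration of the idea of packing VLMul and NF into a small flags field, here is a minimal sketch. The bit layout is an assumption chosen for this example and is not taken from the patch; `packTSFlags` is a hypothetical helper, and `getLMul`/`getNF` here return plain integers rather than whatever types the real helpers use.

```cpp
#include <cstdint>

// Hypothetical bit layout, for illustration only:
//   bit 0     : set when the class is a vector register class
//   bits 1..3 : 3-bit VLMul encoding
//   bits 4..6 : NF - 1 (NF ranges over 1..8)
constexpr unsigned IsVRegClassShift = 0;
constexpr unsigned VLMulShift = 1;
constexpr unsigned NFShift = 4;
constexpr uint8_t VLMulMask = 0b111 << VLMulShift;
constexpr uint8_t NFMask = 0b111 << NFShift;

// Hypothetical packer: combines the three fields into one byte.
constexpr uint8_t packTSFlags(bool IsVReg, unsigned VLMul, unsigned NF) {
  return static_cast<uint8_t>((IsVReg ? 1u : 0u) << IsVRegClassShift |
                              (VLMul & 0b111) << VLMulShift |
                              ((NF - 1) & 0b111) << NFShift);
}

// Extract the VLMul encoding stored in the flags.
constexpr unsigned getLMul(uint8_t TSFlags) {
  return (TSFlags & VLMulMask) >> VLMulShift;
}

// Extract NF; it is stored as NF - 1 so that 1..8 fits in three bits.
constexpr unsigned getNF(uint8_t TSFlags) {
  return ((TSFlags & NFMask) >> NFShift) + 1;
}
```

Storing NF as NF - 1 is a design choice of this sketch: it lets all eight legal segment counts fit in three bits, at the cost of an add on every read.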
[llvm-branch-commits] [llvm] [RISCV] Store VLMul/NF into RegisterClass's TSFlags (PR #84894)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/84894

>From 951478b16d8aa834bff4494dc6d05c5f1175d59f Mon Sep 17 00:00:00 2001
From: Wang Pengcheng
Date: Tue, 12 Mar 2024 18:41:50 +0800
Subject: [PATCH] Fix wrong arguments

Created using spr 1.3.4
---
 llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
index 3e52583ec8ad82..1b3e6cf10189c5 100644
--- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
@@ -495,10 +495,7 @@ void RISCVInstrInfo::copyPhysReg(MachineBasicBlock &MBB,
         RISCV::VRN4M1RegClass, RISCV::VRN4M2RegClass, RISCV::VRN5M1RegClass,
         RISCV::VRN6M1RegClass, RISCV::VRN7M1RegClass, RISCV::VRN8M1RegClass}) {
     if (RegClass.contains(DstReg, SrcReg)) {
-      copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc,
-                        getLMul(RegClass.TSFlags),
-                        /*NF=*/
-                        getNF(RegClass.TSFlags));
+      copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RegClass);
       return;
     }
   }
[llvm-branch-commits] [llvm] [RISCV] Store VLMul/NF into RegisterClass's TSFlags (PR #84894)
https://github.com/wangpc-pp edited https://github.com/llvm/llvm-project/pull/84894
[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/84455 >From 35d0ea085b43a67c092e6263e6ec9d34e66e1453 Mon Sep 17 00:00:00 2001 From: Wang Pengcheng Date: Tue, 12 Mar 2024 17:31:47 +0800 Subject: [PATCH 1/2] Reduce copies Created using spr 1.3.4 --- llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 89 +- llvm/test/CodeGen/RISCV/rvv/vmv-copy.mir | 30 +--- llvm/test/CodeGen/RISCV/rvv/zvlsseg-copy.mir | 175 +++ 3 files changed, 106 insertions(+), 188 deletions(-) diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp index 7895e87702c711..9fe5666d6a81f4 100644 --- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp +++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp @@ -302,58 +302,38 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, RISCVII::VLMUL LMul, unsigned NF) const { const TargetRegisterInfo *TRI = STI.getRegisterInfo(); - int I = 0, End = NF, Incr = 1; unsigned SrcEncoding = TRI->getEncodingValue(SrcReg); unsigned DstEncoding = TRI->getEncodingValue(DstReg); unsigned LMulVal; bool Fractional; std::tie(LMulVal, Fractional) = RISCVVType::decodeVLMUL(LMul); assert(!Fractional && "It is impossible be fractional lmul here."); - if (forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NF * LMulVal)) { -I = NF - 1; -End = -1; -Incr = -1; - } + unsigned NumRegs = NF * LMulVal; + bool ReversedCopy = + forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NumRegs); - for (; I != End; I += Incr) { + unsigned I = 0; + while (I != NumRegs) { auto GetCopyInfo = -[](RISCVII::VLMUL LMul,unsigned NF) -> std::tuple { - unsigned Opc; - unsigned SubRegIdx; - unsigned VVOpc, VIOpc; - switch (LMul) { - default: -llvm_unreachable("Impossible LMUL for vector register copy."); - case RISCVII::LMUL_1: -Opc = RISCV::VMV1R_V; -SubRegIdx = RISCV::sub_vrm1_0; -VVOpc = RISCV::PseudoVMV_V_V_M1; -VIOpc = RISCV::PseudoVMV_V_I_M1; -break; - case RISCVII::LMUL_2: -Opc = RISCV::VMV2R_V; -SubRegIdx = RISCV::sub_vrm2_0; -VVOpc = 
RISCV::PseudoVMV_V_V_M2; -VIOpc = RISCV::PseudoVMV_V_I_M2; -break; - case RISCVII::LMUL_4: -Opc = RISCV::VMV4R_V; -SubRegIdx = RISCV::sub_vrm4_0; -VVOpc = RISCV::PseudoVMV_V_V_M4; -VIOpc = RISCV::PseudoVMV_V_I_M4; -break; - case RISCVII::LMUL_8: -assert(NF == 1); -Opc = RISCV::VMV8R_V; -SubRegIdx = RISCV::sub_vrm1_0; // There is no sub_vrm8_0. -VVOpc = RISCV::PseudoVMV_V_V_M8; -VIOpc = RISCV::PseudoVMV_V_I_M8; -break; - } - return {SubRegIdx, Opc, VVOpc, VIOpc}; +[&](unsigned SrcReg, +unsigned DstReg) -> std::tuple { + unsigned SrcEncoding = TRI->getEncodingValue(SrcReg); + unsigned DstEncoding = TRI->getEncodingValue(DstReg); + if (!(SrcEncoding & 0b111) && !(DstEncoding & 0b111) && I + 8 <= NumRegs) +return {8, RISCV::VRM8RegClass, RISCV::VMV8R_V, RISCV::PseudoVMV_V_V_M8, +RISCV::PseudoVMV_V_I_M8}; + if (!(SrcEncoding & 0b11) && !(DstEncoding & 0b11) && I + 4 <= NumRegs) +return {4, RISCV::VRM4RegClass, RISCV::VMV4R_V, RISCV::PseudoVMV_V_V_M4, +RISCV::PseudoVMV_V_I_M4}; + if (!(SrcEncoding & 0b1) && !(DstEncoding & 0b1) && I + 2 <= NumRegs) +return {2, RISCV::VRM2RegClass, RISCV::VMV2R_V, RISCV::PseudoVMV_V_V_M2, +RISCV::PseudoVMV_V_I_M2}; + return {1, RISCV::VRRegClass, RISCV::VMV1R_V, RISCV::PseudoVMV_V_V_M1, + RISCV::PseudoVMV_V_I_M1}; }; -auto [SubRegIdx, Opc, VVOpc, VIOpc] = GetCopyInfo(LMul, NF); +auto [NumCopied, RegClass, Opc, VVOpc, VIOpc] = GetCopyInfo(SrcReg, DstReg); MachineBasicBlock::const_iterator DefMBBI; if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) { @@ -364,6 +344,20 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, } } +for (MCPhysReg Reg : RegClass.getRegisters()) { + if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(SrcReg)) { +SrcReg = Reg; +break; + } +} + +for (MCPhysReg Reg : RegClass.getRegisters()) { + if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(DstReg)) { +DstReg = Reg; +break; + } +} + auto EmitCopy = [&](MCRegister SrcReg, MCRegister DstReg, unsigned Opcode) { auto MIB = BuildMI(MBB, 
MBBI, DL, get(Opcode), DstReg); bool UseVMV_V_I = RISCV::getRVVMCOpcode(Opcode) == RISCV::VMV_V_I; @@ -385,13 +379,10 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, } }; -if (NF == 1) { - EmitCopy(SrcReg, DstReg, Opc); - return; -} - -EmitCopy(TRI->getSubReg(SrcReg, SubRegIdx + I), - TR
[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)
@@ -302,102 +302,87 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, RISCVII::VLMUL LMul, unsigned NF) const { const TargetRegisterInfo *TRI = STI.getRegisterInfo(); - unsigned Opc; - unsigned SubRegIdx; - unsigned VVOpc, VIOpc; - switch (LMul) { - default: -llvm_unreachable("Impossible LMUL for vector register copy."); - case RISCVII::LMUL_1: -Opc = RISCV::VMV1R_V; -SubRegIdx = RISCV::sub_vrm1_0; -VVOpc = RISCV::PseudoVMV_V_V_M1; -VIOpc = RISCV::PseudoVMV_V_I_M1; -break; - case RISCVII::LMUL_2: -Opc = RISCV::VMV2R_V; -SubRegIdx = RISCV::sub_vrm2_0; -VVOpc = RISCV::PseudoVMV_V_V_M2; -VIOpc = RISCV::PseudoVMV_V_I_M2; -break; - case RISCVII::LMUL_4: -Opc = RISCV::VMV4R_V; -SubRegIdx = RISCV::sub_vrm4_0; -VVOpc = RISCV::PseudoVMV_V_V_M4; -VIOpc = RISCV::PseudoVMV_V_I_M4; -break; - case RISCVII::LMUL_8: -assert(NF == 1); -Opc = RISCV::VMV8R_V; -SubRegIdx = RISCV::sub_vrm1_0; // There is no sub_vrm8_0. -VVOpc = RISCV::PseudoVMV_V_V_M8; -VIOpc = RISCV::PseudoVMV_V_I_M8; -break; - } - - bool UseVMV_V_V = false; - bool UseVMV_V_I = false; - MachineBasicBlock::const_iterator DefMBBI; - if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) { -UseVMV_V_V = true; -Opc = VVOpc; - -if (DefMBBI->getOpcode() == VIOpc) { - UseVMV_V_I = true; - Opc = VIOpc; -} - } - - if (NF == 1) { -auto MIB = BuildMI(MBB, MBBI, DL, get(Opc), DstReg); -if (UseVMV_V_V) - MIB.addReg(DstReg, RegState::Undef); -if (UseVMV_V_I) - MIB = MIB.add(DefMBBI->getOperand(2)); -else - MIB = MIB.addReg(SrcReg, getKillRegState(KillSrc)); -if (UseVMV_V_V) { - const MCInstrDesc &Desc = DefMBBI->getDesc(); - MIB.add(DefMBBI->getOperand(RISCVII::getVLOpNum(Desc))); // AVL - MIB.add(DefMBBI->getOperand(RISCVII::getSEWOpNum(Desc))); // SEW - MIB.addImm(0);// tu, mu - MIB.addReg(RISCV::VL, RegState::Implicit); - MIB.addReg(RISCV::VTYPE, RegState::Implicit); -} -return; - } - - int I = 0, End = NF, Incr = 1; unsigned SrcEncoding = TRI->getEncodingValue(SrcReg); unsigned DstEncoding = 
TRI->getEncodingValue(DstReg); unsigned LMulVal; bool Fractional; std::tie(LMulVal, Fractional) = RISCVVType::decodeVLMUL(LMul); assert(!Fractional && "It is impossible be fractional lmul here."); - if (forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NF * LMulVal)) { -I = NF - 1; -End = -1; -Incr = -1; - } - - for (; I != End; I += Incr) { -auto MIB = -BuildMI(MBB, MBBI, DL, get(Opc), TRI->getSubReg(DstReg, SubRegIdx + I)); -if (UseVMV_V_V) - MIB.addReg(TRI->getSubReg(DstReg, SubRegIdx + I), RegState::Undef); -if (UseVMV_V_I) - MIB = MIB.add(DefMBBI->getOperand(2)); -else - MIB = MIB.addReg(TRI->getSubReg(SrcReg, SubRegIdx + I), - getKillRegState(KillSrc)); -if (UseVMV_V_V) { - const MCInstrDesc &Desc = DefMBBI->getDesc(); - MIB.add(DefMBBI->getOperand(RISCVII::getVLOpNum(Desc))); // AVL - MIB.add(DefMBBI->getOperand(RISCVII::getSEWOpNum(Desc))); // SEW - MIB.addImm(0);// tu, mu - MIB.addReg(RISCV::VL, RegState::Implicit); - MIB.addReg(RISCV::VTYPE, RegState::Implicit); + unsigned NumRegs = NF * LMulVal; + bool ReversedCopy = + forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NumRegs); + + unsigned I = 0; + while (I != NumRegs) { +auto GetCopyInfo = +[&](unsigned SrcReg, +unsigned DstReg) -> std::tuple { + unsigned SrcEncoding = TRI->getEncodingValue(SrcReg); + unsigned DstEncoding = TRI->getEncodingValue(DstReg); + if (!(SrcEncoding & 0b111) && !(DstEncoding & 0b111) && I + 8 <= NumRegs) +return {8, RISCV::VRM8RegClass, RISCV::VMV8R_V, RISCV::PseudoVMV_V_V_M8, +RISCV::PseudoVMV_V_I_M8}; + if (!(SrcEncoding & 0b11) && !(DstEncoding & 0b11) && I + 4 <= NumRegs) +return {4, RISCV::VRM4RegClass, RISCV::VMV4R_V, RISCV::PseudoVMV_V_V_M4, +RISCV::PseudoVMV_V_I_M4}; + if (!(SrcEncoding & 0b1) && !(DstEncoding & 0b1) && I + 2 <= NumRegs) +return {2, RISCV::VRM2RegClass, RISCV::VMV2R_V, RISCV::PseudoVMV_V_V_M2, +RISCV::PseudoVMV_V_I_M2}; + return {1, RISCV::VRRegClass, RISCV::VMV1R_V, RISCV::PseudoVMV_V_V_M1, + RISCV::PseudoVMV_V_I_M1}; +}; + +auto [NumCopied, 
RegClass, Opc, VVOpc, VIOpc] = GetCopyInfo(SrcReg, DstReg); + +MachineBasicBlock::const_iterator DefMBBI; +if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) { + Opc = VVOpc; + + if (DefMBBI->getOpcode() == VIOpc) { +Opc = VIOpc; + } } + +for (MCPhysReg Reg : RegClass.getRegisters()) { + if (TRI->getEncodingValue(Re
[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)
@@ -302,102 +302,87 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB,
                                       RISCVII::VLMUL LMul, unsigned NF) const {
   const TargetRegisterInfo *TRI = STI.getRegisterInfo();
-  unsigned Opc;
-  unsigned SubRegIdx;
-  unsigned VVOpc, VIOpc;
-  switch (LMul) {
-  default:
-    llvm_unreachable("Impossible LMUL for vector register copy.");
-  case RISCVII::LMUL_1:
-    Opc = RISCV::VMV1R_V;
-    SubRegIdx = RISCV::sub_vrm1_0;
-    VVOpc = RISCV::PseudoVMV_V_V_M1;
-    VIOpc = RISCV::PseudoVMV_V_I_M1;
-    break;
-  case RISCVII::LMUL_2:
-    Opc = RISCV::VMV2R_V;
-    SubRegIdx = RISCV::sub_vrm2_0;
-    VVOpc = RISCV::PseudoVMV_V_V_M2;
-    VIOpc = RISCV::PseudoVMV_V_I_M2;
-    break;
-  case RISCVII::LMUL_4:
-    Opc = RISCV::VMV4R_V;
-    SubRegIdx = RISCV::sub_vrm4_0;
-    VVOpc = RISCV::PseudoVMV_V_V_M4;
-    VIOpc = RISCV::PseudoVMV_V_I_M4;
-    break;
-  case RISCVII::LMUL_8:
-    assert(NF == 1);
-    Opc = RISCV::VMV8R_V;
-    SubRegIdx = RISCV::sub_vrm1_0; // There is no sub_vrm8_0.
-    VVOpc = RISCV::PseudoVMV_V_V_M8;
-    VIOpc = RISCV::PseudoVMV_V_I_M8;
-    break;
-  }
-
-  bool UseVMV_V_V = false;
-  bool UseVMV_V_I = false;
-  MachineBasicBlock::const_iterator DefMBBI;
-  if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) {
-    UseVMV_V_V = true;
-    Opc = VVOpc;
-
-    if (DefMBBI->getOpcode() == VIOpc) {
-      UseVMV_V_I = true;
-      Opc = VIOpc;
-    }
-  }
-
-  if (NF == 1) {
-    auto MIB = BuildMI(MBB, MBBI, DL, get(Opc), DstReg);
-    if (UseVMV_V_V)
-      MIB.addReg(DstReg, RegState::Undef);
-    if (UseVMV_V_I)
-      MIB = MIB.add(DefMBBI->getOperand(2));
-    else
-      MIB = MIB.addReg(SrcReg, getKillRegState(KillSrc));
-    if (UseVMV_V_V) {
-      const MCInstrDesc &Desc = DefMBBI->getDesc();
-      MIB.add(DefMBBI->getOperand(RISCVII::getVLOpNum(Desc)));  // AVL
-      MIB.add(DefMBBI->getOperand(RISCVII::getSEWOpNum(Desc))); // SEW
-      MIB.addImm(0);                                            // tu, mu
-      MIB.addReg(RISCV::VL, RegState::Implicit);
-      MIB.addReg(RISCV::VTYPE, RegState::Implicit);
-    }
-    return;
-  }
-
-  int I = 0, End = NF, Incr = 1;
   unsigned SrcEncoding = TRI->getEncodingValue(SrcReg);
   unsigned DstEncoding = TRI->getEncodingValue(DstReg);
   unsigned LMulVal;
   bool Fractional;
   std::tie(LMulVal, Fractional) = RISCVVType::decodeVLMUL(LMul);
   assert(!Fractional && "It is impossible be fractional lmul here.");
-  if (forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NF * LMulVal)) {
-    I = NF - 1;
-    End = -1;
-    Incr = -1;
-  }
-
-  for (; I != End; I += Incr) {
-    auto MIB =
-        BuildMI(MBB, MBBI, DL, get(Opc), TRI->getSubReg(DstReg, SubRegIdx + I));
-    if (UseVMV_V_V)
-      MIB.addReg(TRI->getSubReg(DstReg, SubRegIdx + I), RegState::Undef);
-    if (UseVMV_V_I)
-      MIB = MIB.add(DefMBBI->getOperand(2));
-    else
-      MIB = MIB.addReg(TRI->getSubReg(SrcReg, SubRegIdx + I),
-                       getKillRegState(KillSrc));
-    if (UseVMV_V_V) {
-      const MCInstrDesc &Desc = DefMBBI->getDesc();
-      MIB.add(DefMBBI->getOperand(RISCVII::getVLOpNum(Desc)));  // AVL
-      MIB.add(DefMBBI->getOperand(RISCVII::getSEWOpNum(Desc))); // SEW
-      MIB.addImm(0);                                            // tu, mu
-      MIB.addReg(RISCV::VL, RegState::Implicit);
-      MIB.addReg(RISCV::VTYPE, RegState::Implicit);
+  unsigned NumRegs = NF * LMulVal;
+  bool ReversedCopy =
+      forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NumRegs);
+
+  unsigned I = 0;
+  while (I != NumRegs) {
+    auto GetCopyInfo =
+        [&](unsigned SrcReg,
+            unsigned DstReg) -> std::tuple {
+      unsigned SrcEncoding = TRI->getEncodingValue(SrcReg);
+      unsigned DstEncoding = TRI->getEncodingValue(DstReg);
+      if (!(SrcEncoding & 0b111) && !(DstEncoding & 0b111) && I + 8 <= NumRegs)
+        return {8, RISCV::VRM8RegClass, RISCV::VMV8R_V, RISCV::PseudoVMV_V_V_M8,
+                RISCV::PseudoVMV_V_I_M8};
+      if (!(SrcEncoding & 0b11) && !(DstEncoding & 0b11) && I + 4 <= NumRegs)
+        return {4, RISCV::VRM4RegClass, RISCV::VMV4R_V, RISCV::PseudoVMV_V_V_M4,
+                RISCV::PseudoVMV_V_I_M4};
+      if (!(SrcEncoding & 0b1) && !(DstEncoding & 0b1) && I + 2 <= NumRegs)
+        return {2, RISCV::VRM2RegClass, RISCV::VMV2R_V, RISCV::PseudoVMV_V_V_M2,
+                RISCV::PseudoVMV_V_I_M2};
+      return {1, RISCV::VRRegClass, RISCV::VMV1R_V, RISCV::PseudoVMV_V_V_M1,
+              RISCV::PseudoVMV_V_I_M1};
+    };
+
+    auto [NumCopied, RegClass, Opc, VVOpc, VIOpc] = GetCopyInfo(SrcReg, DstReg);
+
+    MachineBasicBlock::const_iterator DefMBBI;
+    if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) {

wangpc-pp wrote:

Fixed. Thanks for catching this!

https://github.com/llvm/llvm-project/pull/84455
[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/84455 >From 35d0ea085b43a67c092e6263e6ec9d34e66e1453 Mon Sep 17 00:00:00 2001 From: Wang Pengcheng Date: Tue, 12 Mar 2024 17:31:47 +0800 Subject: [PATCH 1/3] Reduce copies Created using spr 1.3.4 --- llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 89 +- llvm/test/CodeGen/RISCV/rvv/vmv-copy.mir | 30 +--- llvm/test/CodeGen/RISCV/rvv/zvlsseg-copy.mir | 175 +++ 3 files changed, 106 insertions(+), 188 deletions(-) diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp index 7895e87702c711..9fe5666d6a81f4 100644 --- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp +++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp @@ -302,58 +302,38 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, RISCVII::VLMUL LMul, unsigned NF) const { const TargetRegisterInfo *TRI = STI.getRegisterInfo(); - int I = 0, End = NF, Incr = 1; unsigned SrcEncoding = TRI->getEncodingValue(SrcReg); unsigned DstEncoding = TRI->getEncodingValue(DstReg); unsigned LMulVal; bool Fractional; std::tie(LMulVal, Fractional) = RISCVVType::decodeVLMUL(LMul); assert(!Fractional && "It is impossible be fractional lmul here."); - if (forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NF * LMulVal)) { -I = NF - 1; -End = -1; -Incr = -1; - } + unsigned NumRegs = NF * LMulVal; + bool ReversedCopy = + forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NumRegs); - for (; I != End; I += Incr) { + unsigned I = 0; + while (I != NumRegs) { auto GetCopyInfo = -[](RISCVII::VLMUL LMul,unsigned NF) -> std::tuple { - unsigned Opc; - unsigned SubRegIdx; - unsigned VVOpc, VIOpc; - switch (LMul) { - default: -llvm_unreachable("Impossible LMUL for vector register copy."); - case RISCVII::LMUL_1: -Opc = RISCV::VMV1R_V; -SubRegIdx = RISCV::sub_vrm1_0; -VVOpc = RISCV::PseudoVMV_V_V_M1; -VIOpc = RISCV::PseudoVMV_V_I_M1; -break; - case RISCVII::LMUL_2: -Opc = RISCV::VMV2R_V; -SubRegIdx = RISCV::sub_vrm2_0; -VVOpc = 
RISCV::PseudoVMV_V_V_M2; -VIOpc = RISCV::PseudoVMV_V_I_M2; -break; - case RISCVII::LMUL_4: -Opc = RISCV::VMV4R_V; -SubRegIdx = RISCV::sub_vrm4_0; -VVOpc = RISCV::PseudoVMV_V_V_M4; -VIOpc = RISCV::PseudoVMV_V_I_M4; -break; - case RISCVII::LMUL_8: -assert(NF == 1); -Opc = RISCV::VMV8R_V; -SubRegIdx = RISCV::sub_vrm1_0; // There is no sub_vrm8_0. -VVOpc = RISCV::PseudoVMV_V_V_M8; -VIOpc = RISCV::PseudoVMV_V_I_M8; -break; - } - return {SubRegIdx, Opc, VVOpc, VIOpc}; +[&](unsigned SrcReg, +unsigned DstReg) -> std::tuple { + unsigned SrcEncoding = TRI->getEncodingValue(SrcReg); + unsigned DstEncoding = TRI->getEncodingValue(DstReg); + if (!(SrcEncoding & 0b111) && !(DstEncoding & 0b111) && I + 8 <= NumRegs) +return {8, RISCV::VRM8RegClass, RISCV::VMV8R_V, RISCV::PseudoVMV_V_V_M8, +RISCV::PseudoVMV_V_I_M8}; + if (!(SrcEncoding & 0b11) && !(DstEncoding & 0b11) && I + 4 <= NumRegs) +return {4, RISCV::VRM4RegClass, RISCV::VMV4R_V, RISCV::PseudoVMV_V_V_M4, +RISCV::PseudoVMV_V_I_M4}; + if (!(SrcEncoding & 0b1) && !(DstEncoding & 0b1) && I + 2 <= NumRegs) +return {2, RISCV::VRM2RegClass, RISCV::VMV2R_V, RISCV::PseudoVMV_V_V_M2, +RISCV::PseudoVMV_V_I_M2}; + return {1, RISCV::VRRegClass, RISCV::VMV1R_V, RISCV::PseudoVMV_V_V_M1, + RISCV::PseudoVMV_V_I_M1}; }; -auto [SubRegIdx, Opc, VVOpc, VIOpc] = GetCopyInfo(LMul, NF); +auto [NumCopied, RegClass, Opc, VVOpc, VIOpc] = GetCopyInfo(SrcReg, DstReg); MachineBasicBlock::const_iterator DefMBBI; if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) { @@ -364,6 +344,20 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, } } +for (MCPhysReg Reg : RegClass.getRegisters()) { + if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(SrcReg)) { +SrcReg = Reg; +break; + } +} + +for (MCPhysReg Reg : RegClass.getRegisters()) { + if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(DstReg)) { +DstReg = Reg; +break; + } +} + auto EmitCopy = [&](MCRegister SrcReg, MCRegister DstReg, unsigned Opcode) { auto MIB = BuildMI(MBB, 
MBBI, DL, get(Opcode), DstReg); bool UseVMV_V_I = RISCV::getRVVMCOpcode(Opcode) == RISCV::VMV_V_I; @@ -385,13 +379,10 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, } }; -if (NF == 1) { - EmitCopy(SrcReg, DstReg, Opc); - return; -} - -EmitCopy(TRI->getSubReg(SrcReg, SubRegIdx + I), - TR
[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)
wangpc-pp wrote: Ping. Any more concerns? https://github.com/llvm/llvm-project/pull/80124
[llvm-branch-commits] [MacroFusion] Add SingleFusion that accepts a single instruction pair (PR #85750)
https://github.com/wangpc-pp created https://github.com/llvm/llvm-project/pull/85750 We add a common class `SingleFusion` that accepts a single instruction pair to simplify fusion definitions.
[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)
@@ -101,6 +101,11 @@ static cl::opt<bool> EnableMISchedLoadClustering(
     cl::desc("Enable load clustering in the machine scheduler"),
     cl::init(false));
 
+static cl::opt<bool>
+    EnableSelectOpt("riscv-select-opt", cl::Hidden,

wangpc-pp wrote:

Most of the added passes have been run before, so they may be cached?

https://github.com/llvm/llvm-project/pull/80124
[llvm-branch-commits] [llvm] [RISCV] Support select optimization (PR #80124)
wangpc-pp wrote:

> > > JFYI, I don't find the AArch64 data particularly convincing for RISCV.
> > > The magnitude of the change even on AArch64 is small, and could easily be
> > > swung one direction or the other by differences in implementation between
> > > the backends.
> >
> > Yeah! The result will differ for different targets/CPUs. One RISCV data for
> > SPEC 2006 (which is not universal I think) on an OoO RISCV CPU, options:
> > `-march=rv64gc_zba_zbb_zicond -O3`:
> > ```
> > 400.perlbench    0.538%
> > 401.bzip2        0.018%
> > 403.gcc          0.105%
> > 429.mcf          1.028%
> > 445.gobmk       -0.221%
> > 456.hmmer        1.582%
> > 458.sjeng       -0.026%
> > 462.libquantum  -0.090%
> > 464.h264ref      0.905%
> > 471.omnetpp     -0.776%
> > 473.astar        0.205%
> > ```
> > The geomean is: 0.295%. The result can be better with PGO I think (haven't
> > tried it). Some related discussions:
> > https://discourse.llvm.org/t/rfc-cmov-vs-branch-optimization. So I think we
> > can be just like AArch64, make it a tune feature and processors can add it
> > if needed.
>
> Do we have any data without Zicond? The worst case Zicond sequence is
> czero.eqz+czero.nez+or which is kind of expensive. Curious if this is
> pointing to Zicond being used too aggressively.

Sorry, I didn't run it with this configuration. I was going to run some small benchmarks (the hardware resources were busy) like coremark on CA model today, but it seems there is no codegen change with selectopt enabled. :-(

Will lacking this data block this PR?

https://github.com/llvm/llvm-project/pull/80124
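For anyone wanting to sanity-check the quoted geomean, here is a minimal sketch. It assumes the 0.295% figure is the geometric mean of the per-benchmark ratios (1 + delta/100), converted back to a percentage; the thread does not say how it was computed, and `geomeanPercent` and `Spec2006Deltas` are names made up for this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Geometric mean of per-benchmark changes given in percent.
// Each change c becomes a ratio (1 + c/100); the mean ratio is then
// converted back to a percentage. This is an assumed methodology.
double geomeanPercent(const std::vector<double> &ChangesPercent) {
  assert(!ChangesPercent.empty() && "need at least one benchmark");
  double LogSum = 0.0;
  for (double C : ChangesPercent)
    LogSum += std::log1p(C / 100.0); // log of the ratio, accurate near 1
  return std::expm1(LogSum / ChangesPercent.size()) * 100.0;
}

// The SPEC 2006 deltas quoted in the message above, in the order listed.
const std::vector<double> Spec2006Deltas = {
    0.538, 0.018,  0.105, 1.028,  -0.221, 1.582,
    -0.026, -0.090, 0.905, -0.776, 0.205};
```

Under that assumption, `geomeanPercent(Spec2006Deltas)` comes out at roughly 0.295, which agrees with the number quoted.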
[llvm-branch-commits] [MacroFusion] Add SingleFusion that accepts a single instruction pair (PR #85750)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/85750
[llvm-branch-commits] [MacroFusion] Add SingleFusion that accepts a single instruction pair (PR #85750)
https://github.com/wangpc-pp closed https://github.com/llvm/llvm-project/pull/85750
[llvm-branch-commits] [MacroFusion] Add SingleFusion that accepts a single instruction pair (PR #85750)
wangpc-pp wrote: Committed as 4a6bc9fd14bd79f1edf5b651b43bd9bda9b90991. https://github.com/llvm/llvm-project/pull/85750
[llvm-branch-commits] [RISCV][NFC] Pass LMUL to copyPhysRegVector (PR #84448)
wangpc-pp wrote: Ping. https://github.com/llvm/llvm-project/pull/84448
[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)
wangpc-pp wrote: Ping. https://github.com/llvm/llvm-project/pull/84455
[llvm-branch-commits] [llvm] [RISCV] Store VLMul/NF into RegisterClass's TSFlags (PR #84894)
wangpc-pp wrote: Ping. https://github.com/llvm/llvm-project/pull/84894
[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)
@@ -302,102 +302,81 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB,
                                       RISCVII::VLMUL LMul, unsigned NF) const {
   const TargetRegisterInfo *TRI = STI.getRegisterInfo();
-  unsigned Opc;
-  unsigned SubRegIdx;
-  unsigned VVOpc, VIOpc;
-  switch (LMul) {
-  default:
-    llvm_unreachable("Impossible LMUL for vector register copy.");
-  case RISCVII::LMUL_1:
-    Opc = RISCV::VMV1R_V;
-    SubRegIdx = RISCV::sub_vrm1_0;
-    VVOpc = RISCV::PseudoVMV_V_V_M1;
-    VIOpc = RISCV::PseudoVMV_V_I_M1;
-    break;
-  case RISCVII::LMUL_2:
-    Opc = RISCV::VMV2R_V;
-    SubRegIdx = RISCV::sub_vrm2_0;
-    VVOpc = RISCV::PseudoVMV_V_V_M2;
-    VIOpc = RISCV::PseudoVMV_V_I_M2;
-    break;
-  case RISCVII::LMUL_4:
-    Opc = RISCV::VMV4R_V;
-    SubRegIdx = RISCV::sub_vrm4_0;
-    VVOpc = RISCV::PseudoVMV_V_V_M4;
-    VIOpc = RISCV::PseudoVMV_V_I_M4;
-    break;
-  case RISCVII::LMUL_8:
-    assert(NF == 1);
-    Opc = RISCV::VMV8R_V;
-    SubRegIdx = RISCV::sub_vrm1_0; // There is no sub_vrm8_0.
-    VVOpc = RISCV::PseudoVMV_V_V_M8;
-    VIOpc = RISCV::PseudoVMV_V_I_M8;
-    break;
-  }
-
-  bool UseVMV_V_V = false;
-  bool UseVMV_V_I = false;
-  MachineBasicBlock::const_iterator DefMBBI;
-  if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) {
-    UseVMV_V_V = true;
-    Opc = VVOpc;
-
-    if (DefMBBI->getOpcode() == VIOpc) {
-      UseVMV_V_I = true;
-      Opc = VIOpc;
-    }
-  }
-
-  if (NF == 1) {
-    auto MIB = BuildMI(MBB, MBBI, DL, get(Opc), DstReg);
-    if (UseVMV_V_V)
-      MIB.addReg(DstReg, RegState::Undef);
-    if (UseVMV_V_I)
-      MIB = MIB.add(DefMBBI->getOperand(2));
-    else
-      MIB = MIB.addReg(SrcReg, getKillRegState(KillSrc));
-    if (UseVMV_V_V) {
-      const MCInstrDesc &Desc = DefMBBI->getDesc();
-      MIB.add(DefMBBI->getOperand(RISCVII::getVLOpNum(Desc)));  // AVL
-      MIB.add(DefMBBI->getOperand(RISCVII::getSEWOpNum(Desc))); // SEW
-      MIB.addImm(0);                                            // tu, mu
-      MIB.addReg(RISCV::VL, RegState::Implicit);
-      MIB.addReg(RISCV::VTYPE, RegState::Implicit);
-    }
-    return;
-  }
-
-  int I = 0, End = NF, Incr = 1;
   unsigned SrcEncoding = TRI->getEncodingValue(SrcReg);
   unsigned DstEncoding = TRI->getEncodingValue(DstReg);
   unsigned LMulVal;
   bool Fractional;
   std::tie(LMulVal, Fractional) = RISCVVType::decodeVLMUL(LMul);
   assert(!Fractional && "It is impossible be fractional lmul here.");
-  if (forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NF * LMulVal)) {
-    I = NF - 1;
-    End = -1;
-    Incr = -1;
-  }
+  unsigned NumRegs = NF * LMulVal;
+  bool ReversedCopy =
+      forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NumRegs);
+
+  unsigned I = 0;
+  auto GetCopyInfo = [&](MCRegister SrcReg, MCRegister DstReg)
+      -> std::tuple {
+    unsigned SrcEncoding = TRI->getEncodingValue(SrcReg);
+    unsigned DstEncoding = TRI->getEncodingValue(DstReg);
+    if (!(SrcEncoding & 0b111) && !(DstEncoding & 0b111) && I + 8 <= NumRegs)

wangpc-pp wrote:

Yes! Using masks is an optimization here. But I think the compiler can do this for us, so using division would be OK too.

https://github.com/llvm/llvm-project/pull/84455
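To make the alignment check discussed above concrete, here is a standalone sketch of how a forward tuple copy could be split into the widest aligned whole-register moves. It works on plain encoding numbers rather than LLVM register objects; `pickCopyWidth`, `Chunk` and `planForwardCopy` are hypothetical names, and the modulo form is used in place of the masks since the two are equivalent for power-of-two widths.

```cpp
#include <cassert>
#include <initializer_list>
#include <vector>

// Hypothetical helper: widest whole-register move (8, 4, 2 or 1 registers)
// usable when both encodings are aligned to that width and at least that
// many registers remain to be copied.
unsigned pickCopyWidth(unsigned SrcEncoding, unsigned DstEncoding,
                       unsigned Remaining) {
  assert(Remaining > 0 && "nothing left to copy");
  for (unsigned Width : {8u, 4u, 2u}) {
    // Width is a power of two, so (Enc % Width == 0) is the same test as
    // !(Enc & (Width - 1)), i.e. the 0b111 / 0b11 / 0b1 masks in the patch.
    if (SrcEncoding % Width == 0 && DstEncoding % Width == 0 &&
        Width <= Remaining)
      return Width;
  }
  return 1;
}

// One emitted move: Width registers starting at encodings Src and Dst.
struct Chunk {
  unsigned Width, Src, Dst;
};

// Plans a forward (lowest register first) copy of NumRegs registers.
// The reversed case for overlapping tuples is not handled in this sketch.
std::vector<Chunk> planForwardCopy(unsigned Src, unsigned Dst,
                                   unsigned NumRegs) {
  std::vector<Chunk> Plan;
  unsigned I = 0;
  while (I != NumRegs) {
    unsigned W = pickCopyWidth(Src + I, Dst + I, NumRegs - I);
    Plan.push_back({W, Src + I, Dst + I});
    I += W;
  }
  return Plan;
}
```

For example, `planForwardCopy(10, 6, 5)` (copying v10..v14 into v6..v10) yields two 2-register moves followed by one 1-register move.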
[llvm-branch-commits] [llvm] [RISCV] Store VLMul/NF into RegisterClass's TSFlags (PR #84894)
@@ -127,8 +127,21 @@ def XLenRI : RegInfoByHwMode<
       [RV32, RV64],
       [RegInfo<32,32,32>, RegInfo<64,64,64>]>;
 
+class RISCVRegisterClass<list<ValueType> regTypes, int align, dag regList>
+    : RegisterClass<"RISCV", regTypes, align, regList> {
+  bit IsVRegClass = 0;
+  int VLMul = 1;
+  int NF = 1;

wangpc-pp wrote:

VLMul can't be 0 because `!logtwo(VLMul=0)` is illegal. We use `bits<3>` to store `NF-1` (which is in range `[1, 7]`). NF is in range `[2, 8]`, but we would need 4 bits if we stored its raw value. The default NF being 1 (NF-1==0) is a compromise, which is OK I think. These fields are only meaningful if `IsVRegClass` is true.

https://github.com/llvm/llvm-project/pull/84894
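The packing described in that comment can be sketched in plain C++ as follows. The bit positions are an assumption made for this sketch (bit 0 for the vector-class flag, three bits for log2(VLMul), three bits for NF-1); the comment only fixes the field widths. The function names are hypothetical, and unlike the `getLMul` helper in the patch these return plain numbers rather than the `RISCVII::VLMUL` enum.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout for this sketch only:
//   bit 0     : IsVRegClass
//   bits 3..1 : log2(VLMul)  (VLMul is 1, 2, 4 or 8)
//   bits 6..4 : NF - 1       (NF is 1..8, so NF - 1 fits in 3 bits)
uint8_t encodeVRegFlags(unsigned VLMul, unsigned NF) {
  assert((VLMul == 1 || VLMul == 2 || VLMul == 4 || VLMul == 8) &&
         "VLMul must be an integral power of two up to 8");
  assert(NF >= 1 && NF <= 8 && "NF must be in [1, 8]");
  unsigned Log2LMul = 0;
  while ((1u << Log2LMul) != VLMul)
    ++Log2LMul;
  return static_cast<uint8_t>(1u | (Log2LMul << 1) | ((NF - 1) << 4));
}

bool isVRegClassFlags(uint8_t Flags) { return Flags & 1; }

// Recovers the numeric LMUL value from the stored log2.
unsigned decodeLMulValue(uint8_t Flags) { return 1u << ((Flags >> 1) & 0x7); }

// Recovers NF; a stored 0 means NF == 1, i.e. no segment.
unsigned decodeNFValue(uint8_t Flags) { return ((Flags >> 4) & 0x7) + 1; }
```

Storing NF-1 rather than NF is what keeps the field at three bits: the raw value 8 would need a fourth bit.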
[llvm-branch-commits] [llvm] [RISCV] Store VLMul/NF into RegisterClass's TSFlags (PR #84894)
@@ -127,8 +127,21 @@ def XLenRI : RegInfoByHwMode<
       [RV32, RV64],
       [RegInfo<32,32,32>, RegInfo<64,64,64>]>;
 
+class RISCVRegisterClass<list<ValueType> regTypes, int align, dag regList>
+    : RegisterClass<"RISCV", regTypes, align, regList> {
+  bit IsVRegClass = 0;
+  int VLMul = 1;
+  int NF = 1;

wangpc-pp wrote:

And `NF=1` means no segment, I think it's obvious. :-)

https://github.com/llvm/llvm-project/pull/84894
[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)
@@ -146,16 +127,12 @@ body: |
     ; CHECK-NEXT: $v7 = VMV1R_V $v12
     ; CHECK-NEXT: $v8 = VMV1R_V $v13
     ; CHECK-NEXT: $v9 = VMV1R_V $v14
-    ; CHECK-NEXT: $v6 = VMV1R_V $v10
-    ; CHECK-NEXT: $v7 = VMV1R_V $v11
-    ; CHECK-NEXT: $v8 = VMV1R_V $v12
-    ; CHECK-NEXT: $v9 = VMV1R_V $v13
-    ; CHECK-NEXT: $v10 = VMV1R_V $v14
-    ; CHECK-NEXT: $v18 = VMV1R_V $v14
-    ; CHECK-NEXT: $v17 = VMV1R_V $v13
-    ; CHECK-NEXT: $v16 = VMV1R_V $v12
-    ; CHECK-NEXT: $v15 = VMV1R_V $v11
-    ; CHECK-NEXT: $v14 = VMV1R_V $v10
+    ; CHECK-NEXT: $v6m2 = VMV2R_V $v10m2
+    ; CHECK-NEXT: $v8m2 = VMV2R_V $v12m2
+    ; CHECK-NEXT: $v8 = VMV1R_V $v14
+    ; CHECK-NEXT: $v14m2 = VMV2R_V $v10m2
+    ; CHECK-NEXT: $v12m2 = VMV2R_V $v8m2
+    ; CHECK-NEXT: $v8 = VMV1R_V $v4

wangpc-pp wrote:

Thanks for catching this! Obviously, I used the wrong register number here. I will check.

https://github.com/llvm/llvm-project/pull/84455
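The removed CHECK lines above show both copy directions: v10..v14 into v6..v10 goes lowest register first, while v10..v14 into v14..v18 goes highest first because the destination overlaps the tail of the source. A small simulation of that rule, using one register at a time, is sketched below. The body of `forwardCopyWillClobberTuple` is written from memory of the helper in RISCVInstrInfo.cpp and should be treated as an assumption; `RegFile` and `copyTuple` are made-up names.

```cpp
#include <array>
#include <cassert>

// Assumed to mirror the helper named in the patch: a forward copy overwrites
// source registers before they are read exactly when the destination starts
// above the source and inside the source tuple.
bool forwardCopyWillClobberTuple(unsigned DstReg, unsigned SrcReg,
                                 unsigned NumRegs) {
  return DstReg > SrcReg && (DstReg - SrcReg) < NumRegs;
}

// Toy register file: index is the register encoding, value is its contents.
using RegFile = std::array<int, 32>;

// Copies NumRegs registers one at a time, choosing the direction that never
// reads a register after it has been overwritten.
void copyTuple(RegFile &V, unsigned Dst, unsigned Src, unsigned NumRegs) {
  assert(Dst + NumRegs <= V.size() && Src + NumRegs <= V.size());
  if (forwardCopyWillClobberTuple(Dst, Src, NumRegs)) {
    for (unsigned I = NumRegs; I-- > 0;) // highest register first
      V[Dst + I] = V[Src + I];
  } else {
    for (unsigned I = 0; I != NumRegs; ++I) // lowest register first
      V[Dst + I] = V[Src + I];
  }
}
```

With each register initialised to its own number, `copyTuple(V, 14, 10, 5)` leaves v14..v18 holding 10..14, which a forward copy would not: it would overwrite v14 before reading it.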
[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/84455 >From 35d0ea085b43a67c092e6263e6ec9d34e66e1453 Mon Sep 17 00:00:00 2001 From: Wang Pengcheng Date: Tue, 12 Mar 2024 17:31:47 +0800 Subject: [PATCH 1/3] Reduce copies Created using spr 1.3.4 --- llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 89 +- llvm/test/CodeGen/RISCV/rvv/vmv-copy.mir | 30 +--- llvm/test/CodeGen/RISCV/rvv/zvlsseg-copy.mir | 175 +++ 3 files changed, 106 insertions(+), 188 deletions(-) diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp index 7895e87702c711..9fe5666d6a81f4 100644 --- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp +++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp @@ -302,58 +302,38 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, RISCVII::VLMUL LMul, unsigned NF) const { const TargetRegisterInfo *TRI = STI.getRegisterInfo(); - int I = 0, End = NF, Incr = 1; unsigned SrcEncoding = TRI->getEncodingValue(SrcReg); unsigned DstEncoding = TRI->getEncodingValue(DstReg); unsigned LMulVal; bool Fractional; std::tie(LMulVal, Fractional) = RISCVVType::decodeVLMUL(LMul); assert(!Fractional && "It is impossible be fractional lmul here."); - if (forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NF * LMulVal)) { -I = NF - 1; -End = -1; -Incr = -1; - } + unsigned NumRegs = NF * LMulVal; + bool ReversedCopy = + forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NumRegs); - for (; I != End; I += Incr) { + unsigned I = 0; + while (I != NumRegs) { auto GetCopyInfo = -[](RISCVII::VLMUL LMul,unsigned NF) -> std::tuple { - unsigned Opc; - unsigned SubRegIdx; - unsigned VVOpc, VIOpc; - switch (LMul) { - default: -llvm_unreachable("Impossible LMUL for vector register copy."); - case RISCVII::LMUL_1: -Opc = RISCV::VMV1R_V; -SubRegIdx = RISCV::sub_vrm1_0; -VVOpc = RISCV::PseudoVMV_V_V_M1; -VIOpc = RISCV::PseudoVMV_V_I_M1; -break; - case RISCVII::LMUL_2: -Opc = RISCV::VMV2R_V; -SubRegIdx = RISCV::sub_vrm2_0; -VVOpc = 
RISCV::PseudoVMV_V_V_M2; -VIOpc = RISCV::PseudoVMV_V_I_M2; -break; - case RISCVII::LMUL_4: -Opc = RISCV::VMV4R_V; -SubRegIdx = RISCV::sub_vrm4_0; -VVOpc = RISCV::PseudoVMV_V_V_M4; -VIOpc = RISCV::PseudoVMV_V_I_M4; -break; - case RISCVII::LMUL_8: -assert(NF == 1); -Opc = RISCV::VMV8R_V; -SubRegIdx = RISCV::sub_vrm1_0; // There is no sub_vrm8_0. -VVOpc = RISCV::PseudoVMV_V_V_M8; -VIOpc = RISCV::PseudoVMV_V_I_M8; -break; - } - return {SubRegIdx, Opc, VVOpc, VIOpc}; +[&](unsigned SrcReg, +unsigned DstReg) -> std::tuple { + unsigned SrcEncoding = TRI->getEncodingValue(SrcReg); + unsigned DstEncoding = TRI->getEncodingValue(DstReg); + if (!(SrcEncoding & 0b111) && !(DstEncoding & 0b111) && I + 8 <= NumRegs) +return {8, RISCV::VRM8RegClass, RISCV::VMV8R_V, RISCV::PseudoVMV_V_V_M8, +RISCV::PseudoVMV_V_I_M8}; + if (!(SrcEncoding & 0b11) && !(DstEncoding & 0b11) && I + 4 <= NumRegs) +return {4, RISCV::VRM4RegClass, RISCV::VMV4R_V, RISCV::PseudoVMV_V_V_M4, +RISCV::PseudoVMV_V_I_M4}; + if (!(SrcEncoding & 0b1) && !(DstEncoding & 0b1) && I + 2 <= NumRegs) +return {2, RISCV::VRM2RegClass, RISCV::VMV2R_V, RISCV::PseudoVMV_V_V_M2, +RISCV::PseudoVMV_V_I_M2}; + return {1, RISCV::VRRegClass, RISCV::VMV1R_V, RISCV::PseudoVMV_V_V_M1, + RISCV::PseudoVMV_V_I_M1}; }; -auto [SubRegIdx, Opc, VVOpc, VIOpc] = GetCopyInfo(LMul, NF); +auto [NumCopied, RegClass, Opc, VVOpc, VIOpc] = GetCopyInfo(SrcReg, DstReg); MachineBasicBlock::const_iterator DefMBBI; if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) { @@ -364,6 +344,20 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, } } +for (MCPhysReg Reg : RegClass.getRegisters()) { + if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(SrcReg)) { +SrcReg = Reg; +break; + } +} + +for (MCPhysReg Reg : RegClass.getRegisters()) { + if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(DstReg)) { +DstReg = Reg; +break; + } +} + auto EmitCopy = [&](MCRegister SrcReg, MCRegister DstReg, unsigned Opcode) { auto MIB = BuildMI(MBB, 
MBBI, DL, get(Opcode), DstReg); bool UseVMV_V_I = RISCV::getRVVMCOpcode(Opcode) == RISCV::VMV_V_I; @@ -385,13 +379,10 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, } }; -if (NF == 1) { - EmitCopy(SrcReg, DstReg, Opc); - return; -} - -EmitCopy(TRI->getSubReg(SrcReg, SubRegIdx + I), - TR
[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/84455 >From 35d0ea085b43a67c092e6263e6ec9d34e66e1453 Mon Sep 17 00:00:00 2001 From: Wang Pengcheng Date: Tue, 12 Mar 2024 17:31:47 +0800 Subject: [PATCH 1/4] Reduce copies Created using spr 1.3.4 --- llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 89 +- llvm/test/CodeGen/RISCV/rvv/vmv-copy.mir | 30 +--- llvm/test/CodeGen/RISCV/rvv/zvlsseg-copy.mir | 175 +++ 3 files changed, 106 insertions(+), 188 deletions(-) diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp index 7895e87702c711..9fe5666d6a81f4 100644 --- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp +++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp @@ -302,58 +302,38 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, RISCVII::VLMUL LMul, unsigned NF) const { const TargetRegisterInfo *TRI = STI.getRegisterInfo(); - int I = 0, End = NF, Incr = 1; unsigned SrcEncoding = TRI->getEncodingValue(SrcReg); unsigned DstEncoding = TRI->getEncodingValue(DstReg); unsigned LMulVal; bool Fractional; std::tie(LMulVal, Fractional) = RISCVVType::decodeVLMUL(LMul); assert(!Fractional && "It is impossible be fractional lmul here."); - if (forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NF * LMulVal)) { -I = NF - 1; -End = -1; -Incr = -1; - } + unsigned NumRegs = NF * LMulVal; + bool ReversedCopy = + forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NumRegs); - for (; I != End; I += Incr) { + unsigned I = 0; + while (I != NumRegs) { auto GetCopyInfo = -[](RISCVII::VLMUL LMul,unsigned NF) -> std::tuple { - unsigned Opc; - unsigned SubRegIdx; - unsigned VVOpc, VIOpc; - switch (LMul) { - default: -llvm_unreachable("Impossible LMUL for vector register copy."); - case RISCVII::LMUL_1: -Opc = RISCV::VMV1R_V; -SubRegIdx = RISCV::sub_vrm1_0; -VVOpc = RISCV::PseudoVMV_V_V_M1; -VIOpc = RISCV::PseudoVMV_V_I_M1; -break; - case RISCVII::LMUL_2: -Opc = RISCV::VMV2R_V; -SubRegIdx = RISCV::sub_vrm2_0; -VVOpc = 
RISCV::PseudoVMV_V_V_M2; -VIOpc = RISCV::PseudoVMV_V_I_M2; -break; - case RISCVII::LMUL_4: -Opc = RISCV::VMV4R_V; -SubRegIdx = RISCV::sub_vrm4_0; -VVOpc = RISCV::PseudoVMV_V_V_M4; -VIOpc = RISCV::PseudoVMV_V_I_M4; -break; - case RISCVII::LMUL_8: -assert(NF == 1); -Opc = RISCV::VMV8R_V; -SubRegIdx = RISCV::sub_vrm1_0; // There is no sub_vrm8_0. -VVOpc = RISCV::PseudoVMV_V_V_M8; -VIOpc = RISCV::PseudoVMV_V_I_M8; -break; - } - return {SubRegIdx, Opc, VVOpc, VIOpc}; +[&](unsigned SrcReg, +unsigned DstReg) -> std::tuple { + unsigned SrcEncoding = TRI->getEncodingValue(SrcReg); + unsigned DstEncoding = TRI->getEncodingValue(DstReg); + if (!(SrcEncoding & 0b111) && !(DstEncoding & 0b111) && I + 8 <= NumRegs) +return {8, RISCV::VRM8RegClass, RISCV::VMV8R_V, RISCV::PseudoVMV_V_V_M8, +RISCV::PseudoVMV_V_I_M8}; + if (!(SrcEncoding & 0b11) && !(DstEncoding & 0b11) && I + 4 <= NumRegs) +return {4, RISCV::VRM4RegClass, RISCV::VMV4R_V, RISCV::PseudoVMV_V_V_M4, +RISCV::PseudoVMV_V_I_M4}; + if (!(SrcEncoding & 0b1) && !(DstEncoding & 0b1) && I + 2 <= NumRegs) +return {2, RISCV::VRM2RegClass, RISCV::VMV2R_V, RISCV::PseudoVMV_V_V_M2, +RISCV::PseudoVMV_V_I_M2}; + return {1, RISCV::VRRegClass, RISCV::VMV1R_V, RISCV::PseudoVMV_V_V_M1, + RISCV::PseudoVMV_V_I_M1}; }; -auto [SubRegIdx, Opc, VVOpc, VIOpc] = GetCopyInfo(LMul, NF); +auto [NumCopied, RegClass, Opc, VVOpc, VIOpc] = GetCopyInfo(SrcReg, DstReg); MachineBasicBlock::const_iterator DefMBBI; if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) { @@ -364,6 +344,20 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, } } +for (MCPhysReg Reg : RegClass.getRegisters()) { + if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(SrcReg)) { +SrcReg = Reg; +break; + } +} + +for (MCPhysReg Reg : RegClass.getRegisters()) { + if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(DstReg)) { +DstReg = Reg; +break; + } +} + auto EmitCopy = [&](MCRegister SrcReg, MCRegister DstReg, unsigned Opcode) { auto MIB = BuildMI(MBB, 
MBBI, DL, get(Opcode), DstReg); bool UseVMV_V_I = RISCV::getRVVMCOpcode(Opcode) == RISCV::VMV_V_I; @@ -385,13 +379,10 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, } }; -if (NF == 1) { - EmitCopy(SrcReg, DstReg, Opc); - return; -} - -EmitCopy(TRI->getSubReg(SrcReg, SubRegIdx + I), - TR
[llvm-branch-commits] [llvm] [RISCV] Store VLMul/NF into RegisterClass's TSFlags (PR #84894)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/84894 >From 951478b16d8aa834bff4494dc6d05c5f1175d59f Mon Sep 17 00:00:00 2001 From: Wang Pengcheng Date: Tue, 12 Mar 2024 18:41:50 +0800 Subject: [PATCH] Fix wrong arguments Created using spr 1.3.4 --- llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp index 3e52583ec8ad82..1b3e6cf10189c5 100644 --- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp +++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp @@ -495,10 +495,7 @@ void RISCVInstrInfo::copyPhysReg(MachineBasicBlock &MBB, RISCV::VRN4M1RegClass, RISCV::VRN4M2RegClass, RISCV::VRN5M1RegClass, RISCV::VRN6M1RegClass, RISCV::VRN7M1RegClass, RISCV::VRN8M1RegClass}) { if (RegClass.contains(DstReg, SrcReg)) { - copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, -getLMul(RegClass.TSFlags), -/*NF=*/ -getNF(RegClass.TSFlags)); + copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RegClass); return; } }
[llvm-branch-commits] [llvm] [RISCV] Store VLMul/NF into RegisterClass's TSFlags (PR #84894)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/84894 >From 951478b16d8aa834bff4494dc6d05c5f1175d59f Mon Sep 17 00:00:00 2001 From: Wang Pengcheng Date: Tue, 12 Mar 2024 18:41:50 +0800 Subject: [PATCH 1/2] Fix wrong arguments Created using spr 1.3.4 --- llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp index 3e52583ec8ad82..1b3e6cf10189c5 100644 --- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp +++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp @@ -495,10 +495,7 @@ void RISCVInstrInfo::copyPhysReg(MachineBasicBlock &MBB, RISCV::VRN4M1RegClass, RISCV::VRN4M2RegClass, RISCV::VRN5M1RegClass, RISCV::VRN6M1RegClass, RISCV::VRN7M1RegClass, RISCV::VRN8M1RegClass}) { if (RegClass.contains(DstReg, SrcReg)) { - copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, -getLMul(RegClass.TSFlags), -/*NF=*/ -getNF(RegClass.TSFlags)); + copyPhysRegVector(MBB, MBBI, DL, DstReg, SrcReg, KillSrc, RegClass); return; } } >From 9f649a2ceabb7d6a8154c68b4b58b0278b606512 Mon Sep 17 00:00:00 2001 From: Wang Pengcheng Date: Mon, 25 Mar 2024 16:50:58 +0800 Subject: [PATCH 2/2] clear includes Created using spr 1.3.6-beta.1 --- llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 1 - 1 file changed, 1 deletion(-) diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp index 27a58460b1ba9c..d28e4e39eadcbc 100644 --- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp +++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp @@ -29,7 +29,6 @@ #include "llvm/CodeGen/MachineTraceMetrics.h" #include "llvm/CodeGen/RegisterScavenging.h" #include "llvm/CodeGen/StackMaps.h" -#include "llvm/CodeGen/TargetRegisterInfo.h" #include "llvm/IR/DebugInfoMetadata.h" #include "llvm/MC/MCInstBuilder.h" #include "llvm/MC/TargetRegistry.h"
[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/84455 >From 35d0ea085b43a67c092e6263e6ec9d34e66e1453 Mon Sep 17 00:00:00 2001 From: Wang Pengcheng Date: Tue, 12 Mar 2024 17:31:47 +0800 Subject: [PATCH 1/5] Reduce copies Created using spr 1.3.4 --- llvm/lib/Target/RISCV/RISCVInstrInfo.cpp | 89 +- llvm/test/CodeGen/RISCV/rvv/vmv-copy.mir | 30 +--- llvm/test/CodeGen/RISCV/rvv/zvlsseg-copy.mir | 175 +++ 3 files changed, 106 insertions(+), 188 deletions(-) diff --git a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp index 7895e87702c711..9fe5666d6a81f4 100644 --- a/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp +++ b/llvm/lib/Target/RISCV/RISCVInstrInfo.cpp @@ -302,58 +302,38 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, RISCVII::VLMUL LMul, unsigned NF) const { const TargetRegisterInfo *TRI = STI.getRegisterInfo(); - int I = 0, End = NF, Incr = 1; unsigned SrcEncoding = TRI->getEncodingValue(SrcReg); unsigned DstEncoding = TRI->getEncodingValue(DstReg); unsigned LMulVal; bool Fractional; std::tie(LMulVal, Fractional) = RISCVVType::decodeVLMUL(LMul); assert(!Fractional && "It is impossible be fractional lmul here."); - if (forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NF * LMulVal)) { -I = NF - 1; -End = -1; -Incr = -1; - } + unsigned NumRegs = NF * LMulVal; + bool ReversedCopy = + forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NumRegs); - for (; I != End; I += Incr) { + unsigned I = 0; + while (I != NumRegs) { auto GetCopyInfo = -[](RISCVII::VLMUL LMul,unsigned NF) -> std::tuple { - unsigned Opc; - unsigned SubRegIdx; - unsigned VVOpc, VIOpc; - switch (LMul) { - default: -llvm_unreachable("Impossible LMUL for vector register copy."); - case RISCVII::LMUL_1: -Opc = RISCV::VMV1R_V; -SubRegIdx = RISCV::sub_vrm1_0; -VVOpc = RISCV::PseudoVMV_V_V_M1; -VIOpc = RISCV::PseudoVMV_V_I_M1; -break; - case RISCVII::LMUL_2: -Opc = RISCV::VMV2R_V; -SubRegIdx = RISCV::sub_vrm2_0; -VVOpc = 
RISCV::PseudoVMV_V_V_M2; -VIOpc = RISCV::PseudoVMV_V_I_M2; -break; - case RISCVII::LMUL_4: -Opc = RISCV::VMV4R_V; -SubRegIdx = RISCV::sub_vrm4_0; -VVOpc = RISCV::PseudoVMV_V_V_M4; -VIOpc = RISCV::PseudoVMV_V_I_M4; -break; - case RISCVII::LMUL_8: -assert(NF == 1); -Opc = RISCV::VMV8R_V; -SubRegIdx = RISCV::sub_vrm1_0; // There is no sub_vrm8_0. -VVOpc = RISCV::PseudoVMV_V_V_M8; -VIOpc = RISCV::PseudoVMV_V_I_M8; -break; - } - return {SubRegIdx, Opc, VVOpc, VIOpc}; +[&](unsigned SrcReg, +unsigned DstReg) -> std::tuple { + unsigned SrcEncoding = TRI->getEncodingValue(SrcReg); + unsigned DstEncoding = TRI->getEncodingValue(DstReg); + if (!(SrcEncoding & 0b111) && !(DstEncoding & 0b111) && I + 8 <= NumRegs) +return {8, RISCV::VRM8RegClass, RISCV::VMV8R_V, RISCV::PseudoVMV_V_V_M8, +RISCV::PseudoVMV_V_I_M8}; + if (!(SrcEncoding & 0b11) && !(DstEncoding & 0b11) && I + 4 <= NumRegs) +return {4, RISCV::VRM4RegClass, RISCV::VMV4R_V, RISCV::PseudoVMV_V_V_M4, +RISCV::PseudoVMV_V_I_M4}; + if (!(SrcEncoding & 0b1) && !(DstEncoding & 0b1) && I + 2 <= NumRegs) +return {2, RISCV::VRM2RegClass, RISCV::VMV2R_V, RISCV::PseudoVMV_V_V_M2, +RISCV::PseudoVMV_V_I_M2}; + return {1, RISCV::VRRegClass, RISCV::VMV1R_V, RISCV::PseudoVMV_V_V_M1, + RISCV::PseudoVMV_V_I_M1}; }; -auto [SubRegIdx, Opc, VVOpc, VIOpc] = GetCopyInfo(LMul, NF); +auto [NumCopied, RegClass, Opc, VVOpc, VIOpc] = GetCopyInfo(SrcReg, DstReg); MachineBasicBlock::const_iterator DefMBBI; if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) { @@ -364,6 +344,20 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, } } +for (MCPhysReg Reg : RegClass.getRegisters()) { + if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(SrcReg)) { +SrcReg = Reg; +break; + } +} + +for (MCPhysReg Reg : RegClass.getRegisters()) { + if (TRI->getEncodingValue(Reg) == TRI->getEncodingValue(DstReg)) { +DstReg = Reg; +break; + } +} + auto EmitCopy = [&](MCRegister SrcReg, MCRegister DstReg, unsigned Opcode) { auto MIB = BuildMI(MBB, 
MBBI, DL, get(Opcode), DstReg); bool UseVMV_V_I = RISCV::getRVVMCOpcode(Opcode) == RISCV::VMV_V_I; @@ -385,13 +379,10 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, } }; -if (NF == 1) { - EmitCopy(SrcReg, DstReg, Opc); - return; -} - -EmitCopy(TRI->getSubReg(SrcReg, SubRegIdx + I), - TR
[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)
@@ -302,102 +302,98 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, RISCVII::VLMUL LMul, unsigned NF) const { const TargetRegisterInfo *TRI = STI.getRegisterInfo(); - unsigned Opc; - unsigned SubRegIdx; - unsigned VVOpc, VIOpc; - switch (LMul) { - default: -llvm_unreachable("Impossible LMUL for vector register copy."); - case RISCVII::LMUL_1: -Opc = RISCV::VMV1R_V; -SubRegIdx = RISCV::sub_vrm1_0; -VVOpc = RISCV::PseudoVMV_V_V_M1; -VIOpc = RISCV::PseudoVMV_V_I_M1; -break; - case RISCVII::LMUL_2: -Opc = RISCV::VMV2R_V; -SubRegIdx = RISCV::sub_vrm2_0; -VVOpc = RISCV::PseudoVMV_V_V_M2; -VIOpc = RISCV::PseudoVMV_V_I_M2; -break; - case RISCVII::LMUL_4: -Opc = RISCV::VMV4R_V; -SubRegIdx = RISCV::sub_vrm4_0; -VVOpc = RISCV::PseudoVMV_V_V_M4; -VIOpc = RISCV::PseudoVMV_V_I_M4; -break; - case RISCVII::LMUL_8: -assert(NF == 1); -Opc = RISCV::VMV8R_V; -SubRegIdx = RISCV::sub_vrm1_0; // There is no sub_vrm8_0. -VVOpc = RISCV::PseudoVMV_V_V_M8; -VIOpc = RISCV::PseudoVMV_V_I_M8; -break; - } - - bool UseVMV_V_V = false; - bool UseVMV_V_I = false; - MachineBasicBlock::const_iterator DefMBBI; - if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) { -UseVMV_V_V = true; -Opc = VVOpc; - -if (DefMBBI->getOpcode() == VIOpc) { - UseVMV_V_I = true; - Opc = VIOpc; -} - } - - if (NF == 1) { -auto MIB = BuildMI(MBB, MBBI, DL, get(Opc), DstReg); -if (UseVMV_V_V) - MIB.addReg(DstReg, RegState::Undef); -if (UseVMV_V_I) - MIB = MIB.add(DefMBBI->getOperand(2)); -else - MIB = MIB.addReg(SrcReg, getKillRegState(KillSrc)); -if (UseVMV_V_V) { - const MCInstrDesc &Desc = DefMBBI->getDesc(); - MIB.add(DefMBBI->getOperand(RISCVII::getVLOpNum(Desc))); // AVL - MIB.add(DefMBBI->getOperand(RISCVII::getSEWOpNum(Desc))); // SEW - MIB.addImm(0);// tu, mu - MIB.addReg(RISCV::VL, RegState::Implicit); - MIB.addReg(RISCV::VTYPE, RegState::Implicit); -} -return; - } - - int I = 0, End = NF, Incr = 1; unsigned SrcEncoding = TRI->getEncodingValue(SrcReg); unsigned DstEncoding = 
TRI->getEncodingValue(DstReg); unsigned LMulVal; bool Fractional; std::tie(LMulVal, Fractional) = RISCVVType::decodeVLMUL(LMul); assert(!Fractional && "It is impossible be fractional lmul here."); - if (forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NF * LMulVal)) { -I = NF - 1; -End = -1; -Incr = -1; - } + unsigned NumRegs = NF * LMulVal; + bool ReversedCopy = + forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NumRegs); + if (ReversedCopy) { +// If there exists overlapping, we should copy the registers reversely. +SrcEncoding += NumRegs - LMulVal; +DstEncoding += NumRegs - LMulVal; + } + + unsigned I = 0; + auto GetCopyInfo = [&](uint16_t SrcEncoding, uint16_t DstEncoding) + -> std::tuple { +// If source register encoding and destination register encoding are aligned +// to 8, we can do a LMUL8 copying. +if (SrcEncoding % 8 == 0 && DstEncoding % 8 == 0 && I + 8 <= NumRegs) + return {RISCVII::LMUL_8, RISCV::VRM8RegClass, RISCV::VMV8R_V, + RISCV::PseudoVMV_V_V_M8, RISCV::PseudoVMV_V_I_M8}; +// If source register encoding and destination register encoding are aligned +// to 4, we can do a LMUL4 copying. +if (SrcEncoding % 4 == 0 && DstEncoding % 4 == 0 && I + 4 <= NumRegs) + return {RISCVII::LMUL_4, RISCV::VRM4RegClass, RISCV::VMV4R_V, + RISCV::PseudoVMV_V_V_M4, RISCV::PseudoVMV_V_I_M4}; +// If source register encoding and destination register encoding are aligned +// to 2, we can do a LMUL2 copying. +if (SrcEncoding % 2 == 0 && DstEncoding % 2 == 0 && I + 2 <= NumRegs) + return {RISCVII::LMUL_2, RISCV::VRM2RegClass, RISCV::VMV2R_V, + RISCV::PseudoVMV_V_V_M2, RISCV::PseudoVMV_V_I_M2}; +// Or we should do LMUL1 copying. 
+return {RISCVII::LMUL_1, RISCV::VRRegClass, RISCV::VMV1R_V, +RISCV::PseudoVMV_V_V_M1, RISCV::PseudoVMV_V_I_M1}; + }; + auto FindRegWithEncoding = [&TRI](const TargetRegisterClass &RegClass, +uint16_t Encoding) { +ArrayRef<MCPhysReg> Regs = RegClass.getRegisters(); +const auto *FoundReg = llvm::find_if(Regs, [&](MCPhysReg Reg) { + return TRI->getEncodingValue(Reg) == Encoding; +}); +// We should be always able to find one valid register. +assert(FoundReg != Regs.end()); +return *FoundReg; + }; wangpc-pp wrote: I tried, but the result was wrong; maybe I missed something here. Is it possible to get a `VRM8` subreg from a `VRN8M1` register? https://github.com/llvm/llvm-project/pull/84455
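Why an overlapping tuple copy has to run in reverse can be shown with a toy register file. The sketch below is an assumption-laden model, not the patch itself: `RegFile`, `makeRegFile` and `copyTuple` are invented for illustration, and `forwardCopyWillClobber` assumes the overlap test is "destination starts above the source and within `NumRegs` of it", which is how the `forwardCopyWillClobberTuple` call in the diff is used here.

```cpp
#include <array>
#include <cassert>

using RegFile = std::array<int, 32>;

// Toy register file where register i initially holds the value i.
inline RegFile makeRegFile() {
  RegFile Regs{};
  for (unsigned I = 0; I < Regs.size(); ++I)
    Regs[I] = static_cast<int>(I);
  return Regs;
}

// Assumed overlap test: a forward copy overwrites source registers that have
// not been read yet when Dst lies above Src but inside the source tuple.
inline bool forwardCopyWillClobber(unsigned Dst, unsigned Src,
                                   unsigned NumRegs) {
  return Dst > Src && (Dst - Src) < NumRegs;
}

// Copy NumRegs registers one at a time, choosing the safe direction
// (the same idea as memmove versus memcpy).
inline void copyTuple(RegFile &Regs, unsigned Dst, unsigned Src,
                      unsigned NumRegs) {
  if (forwardCopyWillClobber(Dst, Src, NumRegs)) {
    // Start from the last register so every source is read before it is
    // overwritten.
    for (unsigned I = NumRegs; I-- > 0;)
      Regs[Dst + I] = Regs[Src + I];
  } else {
    for (unsigned I = 0; I < NumRegs; ++I)
      Regs[Dst + I] = Regs[Src + I];
  }
}
```

Copying `v10..v14` to `v14..v18` is the overlapping case: a forward copy would overwrite `v14` before reading it, while the reversed order moves `v14` to `v18` first. The real patch additionally has to pick chunk widths while walking backwards, which this model leaves out.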
[llvm-branch-commits] [llvm] [RISCV] Use larger copies when register tuples are aligned (PR #84455)
@@ -302,102 +302,98 @@ void RISCVInstrInfo::copyPhysRegVector(MachineBasicBlock &MBB, RISCVII::VLMUL LMul, unsigned NF) const { const TargetRegisterInfo *TRI = STI.getRegisterInfo(); - unsigned Opc; - unsigned SubRegIdx; - unsigned VVOpc, VIOpc; - switch (LMul) { - default: -llvm_unreachable("Impossible LMUL for vector register copy."); - case RISCVII::LMUL_1: -Opc = RISCV::VMV1R_V; -SubRegIdx = RISCV::sub_vrm1_0; -VVOpc = RISCV::PseudoVMV_V_V_M1; -VIOpc = RISCV::PseudoVMV_V_I_M1; -break; - case RISCVII::LMUL_2: -Opc = RISCV::VMV2R_V; -SubRegIdx = RISCV::sub_vrm2_0; -VVOpc = RISCV::PseudoVMV_V_V_M2; -VIOpc = RISCV::PseudoVMV_V_I_M2; -break; - case RISCVII::LMUL_4: -Opc = RISCV::VMV4R_V; -SubRegIdx = RISCV::sub_vrm4_0; -VVOpc = RISCV::PseudoVMV_V_V_M4; -VIOpc = RISCV::PseudoVMV_V_I_M4; -break; - case RISCVII::LMUL_8: -assert(NF == 1); -Opc = RISCV::VMV8R_V; -SubRegIdx = RISCV::sub_vrm1_0; // There is no sub_vrm8_0. -VVOpc = RISCV::PseudoVMV_V_V_M8; -VIOpc = RISCV::PseudoVMV_V_I_M8; -break; - } - - bool UseVMV_V_V = false; - bool UseVMV_V_I = false; - MachineBasicBlock::const_iterator DefMBBI; - if (isConvertibleToVMV_V_V(STI, MBB, MBBI, DefMBBI, LMul)) { -UseVMV_V_V = true; -Opc = VVOpc; - -if (DefMBBI->getOpcode() == VIOpc) { - UseVMV_V_I = true; - Opc = VIOpc; -} - } - - if (NF == 1) { -auto MIB = BuildMI(MBB, MBBI, DL, get(Opc), DstReg); -if (UseVMV_V_V) - MIB.addReg(DstReg, RegState::Undef); -if (UseVMV_V_I) - MIB = MIB.add(DefMBBI->getOperand(2)); -else - MIB = MIB.addReg(SrcReg, getKillRegState(KillSrc)); -if (UseVMV_V_V) { - const MCInstrDesc &Desc = DefMBBI->getDesc(); - MIB.add(DefMBBI->getOperand(RISCVII::getVLOpNum(Desc))); // AVL - MIB.add(DefMBBI->getOperand(RISCVII::getSEWOpNum(Desc))); // SEW - MIB.addImm(0);// tu, mu - MIB.addReg(RISCV::VL, RegState::Implicit); - MIB.addReg(RISCV::VTYPE, RegState::Implicit); -} -return; - } - - int I = 0, End = NF, Incr = 1; unsigned SrcEncoding = TRI->getEncodingValue(SrcReg); unsigned DstEncoding = 
TRI->getEncodingValue(DstReg); unsigned LMulVal; bool Fractional; std::tie(LMulVal, Fractional) = RISCVVType::decodeVLMUL(LMul); assert(!Fractional && "It is impossible be fractional lmul here."); - if (forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NF * LMulVal)) { -I = NF - 1; -End = -1; -Incr = -1; - } + unsigned NumRegs = NF * LMulVal; + bool ReversedCopy = + forwardCopyWillClobberTuple(DstEncoding, SrcEncoding, NumRegs); + if (ReversedCopy) { +// If there exists overlapping, we should copy the registers reversely. +SrcEncoding += NumRegs - LMulVal; +DstEncoding += NumRegs - LMulVal; + } + + unsigned I = 0; + auto GetCopyInfo = [&](uint16_t SrcEncoding, uint16_t DstEncoding) + -> std::tuple { +// If source register encoding and destination register encoding are aligned +// to 8, we can do a LMUL8 copying. +if (SrcEncoding % 8 == 0 && DstEncoding % 8 == 0 && I + 8 <= NumRegs) + return {RISCVII::LMUL_8, RISCV::VRM8RegClass, RISCV::VMV8R_V, + RISCV::PseudoVMV_V_V_M8, RISCV::PseudoVMV_V_I_M8}; +// If source register encoding and destination register encoding are aligned +// to 4, we can do a LMUL4 copying. +if (SrcEncoding % 4 == 0 && DstEncoding % 4 == 0 && I + 4 <= NumRegs) + return {RISCVII::LMUL_4, RISCV::VRM4RegClass, RISCV::VMV4R_V, + RISCV::PseudoVMV_V_V_M4, RISCV::PseudoVMV_V_I_M4}; +// If source register encoding and destination register encoding are aligned +// to 2, we can do a LMUL2 copying. +if (SrcEncoding % 2 == 0 && DstEncoding % 2 == 0 && I + 2 <= NumRegs) + return {RISCVII::LMUL_2, RISCV::VRM2RegClass, RISCV::VMV2R_V, + RISCV::PseudoVMV_V_V_M2, RISCV::PseudoVMV_V_I_M2}; +// Or we should do LMUL1 copying. 
+return {RISCVII::LMUL_1, RISCV::VRRegClass, RISCV::VMV1R_V, +RISCV::PseudoVMV_V_V_M1, RISCV::PseudoVMV_V_I_M1}; + }; + auto FindRegWithEncoding = [&TRI](const TargetRegisterClass &RegClass, +uint16_t Encoding) { +ArrayRef<MCPhysReg> Regs = RegClass.getRegisters(); +const auto *FoundReg = llvm::find_if(Regs, [&](MCPhysReg Reg) { + return TRI->getEncodingValue(Reg) == Encoding; +}); +// We should be always able to find one valid register. +assert(FoundReg != Regs.end()); +return *FoundReg; + }; wangpc-pp wrote: A `VRN8M1` register is not necessarily 8-aligned, so it can only sometimes be converted to `VRM8`. If I understand correctly, the subreg mechanism doesn't work here. https://github.com/llvm/llvm-project/pull/84455
[llvm-branch-commits] Revert "[RISCV] Make X5 allocatable for JALR on CPUs without RAS" (PR #78946)
https://github.com/wangpc-pp created https://github.com/llvm/llvm-project/pull/78946 This reverts commit 333963a9c75b2f79bff73227eae0c72747ac945e.
[llvm-branch-commits] Revert "[RISCV] Make X5 allocatable for JALR on CPUs without RAS" (PR #78946)
https://github.com/wangpc-pp closed https://github.com/llvm/llvm-project/pull/78946
[llvm-branch-commits] Revert "[RISCV] Make X5 allocatable for JALR on CPUs without RAS" (PR #78946)
wangpc-pp wrote: Sorry for the noise; this is for testing the spr tools. https://github.com/llvm/llvm-project/pull/78946
[llvm-branch-commits] [llvm] PR for llvm/llvm-project#79425 (PR #79560)
https://github.com/wangpc-pp approved this pull request. LGTM. (Is this the right approach in the current workflow?) https://github.com/llvm/llvm-project/pull/79560
[llvm-branch-commits] [libcxx] [libc] [flang] [llvm] [clang] [compiler-rt] [RISCV] Support select optimization (PR #80124)
https://github.com/wangpc-pp created https://github.com/llvm/llvm-project/pull/80124 AArch64 enabled this in https://reviews.llvm.org/D138990, and the measurement data still stands for RISCV. A similar optimization to #77284 is added too.
[llvm-branch-commits] [flang] [libc] [libcxx] [clang] [llvm] [compiler-rt] [SelectOpt] Print instruction instead of pointer (PR #80125)
https://github.com/wangpc-pp created https://github.com/llvm/llvm-project/pull/80125 None
[llvm-branch-commits] [llvm] [flang] [libc] [libcxx] [compiler-rt] [clang] [RISCV] Support select optimization (PR #80124)
@@ -0,0 +1,873 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -select-optimize -mtriple=riscv64 -S < %s \
+; RUN:   | FileCheck %s --check-prefix=CHECK-SELECT
+; RUN: opt -select-optimize -mtriple=riscv64 -mattr=+enable-select-opt -S < %s \
+; RUN:   | FileCheck %s --check-prefix=CHECK-BRANCH
+; RUN: opt -select-optimize -mtriple=riscv64 -mattr=+enable-select-opt,+predictable-select-expensive -S < %s \
+; RUN:   | FileCheck %s --check-prefix=CHECK-BRANCH
+
+%struct.st = type { i32, i64, ptr, ptr, i16, ptr, ptr, i64, i64 }
+
+; This test has a select at the end of if.then, which is better transformed to a branch on OoO cores.
+
+define void @replace(ptr nocapture noundef %newst, ptr noundef %t, ptr noundef %h, i64 noundef %c, i64 noundef %rc, i64 noundef %ma, i64 noundef %n) {

wangpc-pp wrote:

This file is copied from AArch64; I don't know if I can reduce it.

https://github.com/llvm/llvm-project/pull/80124
[llvm-branch-commits] [libcxx] [llvm] [clang] [clang-tools-extra] [lld] [flang] [libc] [compiler-rt] [libcxxabi] [lldb] [SelectOpt] Print instruction instead of pointer (PR #80125)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/80125
[llvm-branch-commits] [llvm] [clang-tools-extra] [flang] [libcxx] [lldb] [libcxxabi] [libc] [compiler-rt] [lld] [clang] [SelectOpt] Print instruction instead of pointer (PR #80125)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/80125
[llvm-branch-commits] [flang] [clang] [libcxx] [libc] [compiler-rt] [llvm] [RISCV] Support select optimization (PR #80124)
wangpc-pp wrote:

> > and the measurement data still stands for RISCV.
>
> Please give the measurement data in this review or a direct link to it. I
> tried searching for it, and did not immediately find it.

It's in the Phabricator link (https://reviews.llvm.org/D138990):

> The headline numbers are these for SPEC2017 on a Neoverse N1:
>
> 500.perlbench_r   -0.12%
> 502.gcc_r          0.02%
> 505.mcf_r          6.02%
> 520.omnetpp_r      0.32%
> 523.xalancbmk_r    0.20%
> 525.x264_r         0.02%
> 531.deepsjeng_r    0.00%
> 541.leela_r       -0.09%
> 548.exchange2_r    0.00%
> 557.xz_r          -0.20%
>
> Running benchmarks with a combination of the llvm-test-suite plus several
> versions of SPEC gave between a 0.2% and 0.4% geomean improvement depending
> on the core/run. The instruction count went down by 0.1% too.

The performance gain depends on the core implementation. For RISCV, the subtarget feature `FeatureEnableSelectOptimize` can be added to a processor's tune features if it's beneficial.

https://github.com/llvm/llvm-project/pull/80124
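For readers who want to summarize per-benchmark percentages like the ones quoted above, here is a minimal sketch of one common way to compute a geomean change. It assumes each percentage is a relative change against a baseline of 1.0; the quoted review does not say how its geomean figures were derived, and `geomeanPercent` is a hypothetical helper.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper: geometric mean of per-benchmark changes given in
// percent. Each change p is treated as the ratio (1 + p / 100) against the
// baseline, and the result is converted back to a percentage.
double geomeanPercent(const std::vector<double> &PercentChanges) {
  if (PercentChanges.empty())
    return 0.0;
  double LogSum = 0.0;
  for (double P : PercentChanges)
    LogSum += std::log(1.0 + P / 100.0);
  double MeanRatio = std::exp(LogSum / PercentChanges.size());
  return (MeanRatio - 1.0) * 100.0;
}
```

Averaging in log space keeps a single large swing, such as the `505.mcf_r` entry, from dominating the summary the way an arithmetic mean of ratios would.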
[llvm-branch-commits] [llvm] [compiler-rt] [clang] [lldb] [flang] [clang-tools-extra] [lld] [libcxxabi] [libcxx] [libc] [SelectOpt] Print instruction instead of pointer (PR #80125)
https://github.com/wangpc-pp closed https://github.com/llvm/llvm-project/pull/80125
[llvm-branch-commits] [clang-tools-extra] [libcxx] [libcxxabi] [libc] [flang] [llvm] [lldb] [clang] [compiler-rt] [lld] [SelectOpt] Print instruction instead of pointer (PR #80125)
wangpc-pp wrote:

Committed as 995d21bc6ff2220b2887cf9640d936eb99b3c617. Somehow `spr` failed with an error, so I had to land it manually:

```
#️⃣ Pull Request #80125
🛫 Getting started...
🛑 GitHub: Validation Failed
Documentation URL: https://docs.github.com/rest/pulls/pulls#update-a-pull-request
Errors:
- {"code":"invalid","field":"base","message":"Proposed base branch 'refs/heads/main' is invalid","resource":"PullRequest"}
```

https://github.com/llvm/llvm-project/pull/80125
[llvm-branch-commits] [clang] [compiler-rt] [llvm] [flang] [libcxx] [libc] [RISCV] Support select optimization (PR #80124)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/80124
[llvm-branch-commits] [compiler-rt] [llvm] [clang] [libc] [libcxx] [flang] [RISCV] Support select optimization (PR #80124)
https://github.com/wangpc-pp updated https://github.com/llvm/llvm-project/pull/80124
[llvm-branch-commits] [compiler-rt] [llvm] [clang] [libc] [libcxx] [flang] [RISCV] Support select optimization (PR #80124)
@@ -0,0 +1,873 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt -select-optimize -mtriple=riscv64 -S < %s \
+; RUN:   | FileCheck %s --check-prefix=CHECK-SELECT
+; RUN: opt -select-optimize -mtriple=riscv64 -mattr=+enable-select-opt -S < %s \
+; RUN:   | FileCheck %s --check-prefix=CHECK-BRANCH
+; RUN: opt -select-optimize -mtriple=riscv64 -mattr=+enable-select-opt,+predictable-select-expensive -S < %s \
+; RUN:   | FileCheck %s --check-prefix=CHECK-BRANCH
+
+%struct.st = type { i32, i64, ptr, ptr, i16, ptr, ptr, i64, i64 }
+
+; This test has a select at the end of if.then, which is better transformed to a branch on OoO cores.
+
+define void @replace(ptr nocapture noundef %newst, ptr noundef %t, ptr noundef %h, i64 noundef %c, i64 noundef %rc, i64 noundef %ma, i64 noundef %n) {

wangpc-pp wrote:

Thanks a lot! This methodology of adding `llvm_unreachable()` is really useful! I have reduced the tests. cc @davemgreen

https://github.com/llvm/llvm-project/pull/80124
[llvm-branch-commits] [clang] [Clang][RISCV] Refactor builtins to TableGen (PR #80280)
https://github.com/wangpc-pp created https://github.com/llvm/llvm-project/pull/80280 This mechanism was introduced by #68324. This refactor makes the prototypes and attributes clear.