[llvm-branch-commits] [llvm] [BOLT] Added more details on heatmap docs. (PR #98162)
https://github.com/paschalis-mpeis created https://github.com/llvm/llvm-project/pull/98162 Suggesting a few more details for Heatmaps.md >From f209cca87cf7c53242a353a505e3bfe34688a1b2 Mon Sep 17 00:00:00 2001 From: Paschalis Mpeis Date: Tue, 9 Jul 2024 08:52:51 +0100 Subject: [PATCH] [BOLT] Added more details on heatmap docs. --- bolt/docs/Heatmaps.md | 53 +++ 1 file changed, 39 insertions(+), 14 deletions(-) diff --git a/bolt/docs/Heatmaps.md b/bolt/docs/Heatmaps.md index e1b59d49ad102..c1a01839d6682 100644 --- a/bolt/docs/Heatmaps.md +++ b/bolt/docs/Heatmaps.md @@ -1,9 +1,9 @@ # Code Heatmaps BOLT has gained the ability to print code heatmaps based on -sampling-based LBR profiles generated by `perf`. The output is produced -in colored ASCII to be displayed in a color-capable terminal. It looks -something like this: +sampling-based profiles generated by `perf`, either with `LBR` data or not. +The output is produced in colored ASCII to be displayed in a color-capable +terminal. It looks something like this:  @@ -32,17 +32,7 @@ $ llvm-bolt-heatmap -p perf.data ``` By default the heatmap will be dumped to *stdout*. You can change it -with `-o ` option. Each character/block in the heatmap -shows the execution data accumulated for corresponding 64 bytes of -code. You can change this granularity with a `-block-size` option. -E.g. set it to 4096 to see code usage grouped by 4K pages. -Other useful options are: - -```bash --line-size= - number of entries per line (default 256) --max-address= - maximum address considered valid for heatmap (default 4GB) --print-mappings= - print mappings in legend, between characters/blocks and text sections (default false) -``` +with `-o ` option. If you prefer to look at the data in a browser (or would like to share it that way), then you can use an HTML conversion tool. E.g.: @@ -50,3 +40,38 @@ it that way), then you can use an HTML conversion tool. 
E.g.: ```bash $ aha -b -f > .html ``` + +--- + +## Background on heatmaps: +A heatmap is effectively a histogram that is rendered into a grid for better +visualization. +In theory we can generate a heatmap using any binary and a perf profile. + +Each block/character in the heatmap shows the execution data accumulated for +corresponding 64 bytes of code. You can change this granularity with a +`-block-size` option. +E.g. set it to 4096 to see code usage grouped by 4K pages. + + +When a block is shown as a dot, it means that no samples were found for that address. +When it is shown as a letter, it indicates a captured sample on a particular text section of the binary. To show a mapping between letters and text sections in the legend, use `-print-mappings`. When a sampled address does not belong to any of the TextSegments, the characters 'o' or 'O' will be shown. + +The legend shows by default the ranges in the heatmap according to the number +of samples per block. +A color is assigned per range, except the first two ranges that distinguished by +lower and upper case letters. + +Each consecutive line in the heatmap advances by the same amount, +with the binary size covered by a line being dependent on the block size and the +line size. +An empty new line is inserted for bigger gaps between samples. + + +Some useful options are: + +``` +-line-size= - number of entries per line (default 256) +-max-address= - maximum address considered valid for heatmap (default 4GB) +-print-mappings - print mappings in the legend, between characters/blocks and text sections (default false) +``` ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
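A worked example of the line-coverage rule described in the Heatmaps.md patch above: each character covers `-block-size` bytes and each line holds `-line-size` characters, so one heatmap line spans block size times line size bytes. The sketch below is illustrative only. `bytesPerLine` and `cellFor` are hypothetical helpers, not part of `llvm-bolt-heatmap`, and counting rows from address 0 is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes of the binary covered by one heatmap line.
inline std::uint64_t bytesPerLine(std::uint64_t BlockSize,
                                  std::uint64_t LineSize) {
  return BlockSize * LineSize;
}

// Position of a block inside the grid (assumption: rows start at address 0).
struct Cell {
  std::uint64_t Row;
  std::uint64_t Col;
};

// Hypothetical helper: map an address to its row and column in the heatmap.
inline Cell cellFor(std::uint64_t Address, std::uint64_t BlockSize,
                    std::uint64_t LineSize) {
  assert(BlockSize != 0 && LineSize != 0 && "sizes must be non-zero");
  const std::uint64_t Block = Address / BlockSize; // index of the 64-byte (or other) block
  return {Block / LineSize, Block % LineSize};
}
```

With the defaults quoted in the patch (64-byte blocks, 256 entries per line), `bytesPerLine(64, 256)` is 16384, so each line covers 16 KiB. With `-block-size=4096` the same line covers 1 MiB.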
[llvm-branch-commits] [llvm] [BOLT] Added more details on heatmap docs. (PR #98162)
https://github.com/paschalis-mpeis ready_for_review https://github.com/llvm/llvm-project/pull/98162
[llvm-branch-commits] [llvm] [BOLT][AArch64] Fix static binary patching for ELF. (PR #97710)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/97710
[llvm-branch-commits] [llvm] [BOLT] Added more details on heatmap docs. (PR #98162)
https://github.com/paschalis-mpeis updated https://github.com/llvm/llvm-project/pull/98162 >From f209cca87cf7c53242a353a505e3bfe34688a1b2 Mon Sep 17 00:00:00 2001 From: Paschalis Mpeis Date: Tue, 9 Jul 2024 08:52:51 +0100 Subject: [PATCH 1/2] [BOLT] Added more details on heatmap docs. --- bolt/docs/Heatmaps.md | 53 +++ 1 file changed, 39 insertions(+), 14 deletions(-) diff --git a/bolt/docs/Heatmaps.md b/bolt/docs/Heatmaps.md index e1b59d49ad102..c1a01839d6682 100644 --- a/bolt/docs/Heatmaps.md +++ b/bolt/docs/Heatmaps.md @@ -1,9 +1,9 @@ # Code Heatmaps BOLT has gained the ability to print code heatmaps based on -sampling-based LBR profiles generated by `perf`. The output is produced -in colored ASCII to be displayed in a color-capable terminal. It looks -something like this: +sampling-based profiles generated by `perf`, either with `LBR` data or not. +The output is produced in colored ASCII to be displayed in a color-capable +terminal. It looks something like this:  @@ -32,17 +32,7 @@ $ llvm-bolt-heatmap -p perf.data ``` By default the heatmap will be dumped to *stdout*. You can change it -with `-o ` option. Each character/block in the heatmap -shows the execution data accumulated for corresponding 64 bytes of -code. You can change this granularity with a `-block-size` option. -E.g. set it to 4096 to see code usage grouped by 4K pages. -Other useful options are: - -```bash --line-size= - number of entries per line (default 256) --max-address= - maximum address considered valid for heatmap (default 4GB) --print-mappings= - print mappings in legend, between characters/blocks and text sections (default false) -``` +with `-o ` option. If you prefer to look at the data in a browser (or would like to share it that way), then you can use an HTML conversion tool. E.g.: @@ -50,3 +40,38 @@ it that way), then you can use an HTML conversion tool. 
E.g.: ```bash $ aha -b -f > .html ``` + +--- + +## Background on heatmaps: +A heatmap is effectively a histogram that is rendered into a grid for better +visualization. +In theory we can generate a heatmap using any binary and a perf profile. + +Each block/character in the heatmap shows the execution data accumulated for +corresponding 64 bytes of code. You can change this granularity with a +`-block-size` option. +E.g. set it to 4096 to see code usage grouped by 4K pages. + + +When a block is shown as a dot, it means that no samples were found for that address. +When it is shown as a letter, it indicates a captured sample on a particular text section of the binary. To show a mapping between letters and text sections in the legend, use `-print-mappings`. When a sampled address does not belong to any of the TextSegments, the characters 'o' or 'O' will be shown. + +The legend shows by default the ranges in the heatmap according to the number +of samples per block. +A color is assigned per range, except the first two ranges that distinguished by +lower and upper case letters. + +Each consecutive line in the heatmap advances by the same amount, +with the binary size covered by a line being dependent on the block size and the +line size. +An empty new line is inserted for bigger gaps between samples. + + +Some useful options are: + +``` +-line-size= - number of entries per line (default 256) +-max-address= - maximum address considered valid for heatmap (default 4GB) +-print-mappings - print mappings in the legend, between characters/blocks and text sections (default false) +``` >From 3c7b4df7bc9c5e0af1aba3d6fa95a91b98fca40b Mon Sep 17 00:00:00 2001 From: Paschalis Mpeis Date: Thu, 11 Jul 2024 09:05:35 +0100 Subject: [PATCH 2/2] Added explanation for x axis. 
--- bolt/docs/HeatmapHeader.png | Bin 0 -> 76799 bytes bolt/docs/Heatmaps.md | 29 +++-- 2 files changed, 23 insertions(+), 6 deletions(-) create mode 100644 bolt/docs/HeatmapHeader.png diff --git a/bolt/docs/HeatmapHeader.png b/bolt/docs/HeatmapHeader.png new file mode 100644 index ..a519dc6215d8cf844f78a7592760cc30738b1a14 GIT binary patch literal 76799 [binary PNG payload omitted]
[llvm-branch-commits] [llvm] [NFC][BOLT] Rename createDummyReturnFunction to createReturnBody (PR #98448)
https://github.com/paschalis-mpeis created https://github.com/llvm/llvm-project/pull/98448 `createDummyReturnFunction` is not creating a function but instead only a function body that is simply a return statement. This patch renames it to `createReturnBody` >From b564185beebcd5d4d18036edfd2f1a76370f3f8f Mon Sep 17 00:00:00 2001 From: Paschalis Mpeis Date: Thu, 11 Jul 2024 09:32:12 +0100 Subject: [PATCH] [NFC][BOLT] Rename createDummyReturnFunction to createReturnBody createDummyReturnFunction is not creating a function but instead only a function body that is simply a return statement. So it is renamed to: createReturnBody --- bolt/include/bolt/Core/MCPlusBuilder.h | 2 +- bolt/lib/Passes/Instrumentation.cpp| 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h index 885d627f7b64f..c20b0edc36499 100644 --- a/bolt/include/bolt/Core/MCPlusBuilder.h +++ b/bolt/include/bolt/Core/MCPlusBuilder.h @@ -2044,7 +2044,7 @@ class MCPlusBuilder { /// Returns a function body that contains only a return instruction. An /// example usage is a workaround for the '__bolt_fini_trampoline' of // Instrumentation. - virtual InstructionListType createDummyReturnFunction(MCContext *Ctx) const { + virtual InstructionListType createReturnBody(MCContext *Ctx) const { InstructionListType Insts(1); createReturn(Insts[0]); return Insts; diff --git a/bolt/lib/Passes/Instrumentation.cpp b/bolt/lib/Passes/Instrumentation.cpp index e824a42d82696..805e7a7434f8f 100644 --- a/bolt/lib/Passes/Instrumentation.cpp +++ b/bolt/lib/Passes/Instrumentation.cpp @@ -754,7 +754,7 @@ void Instrumentation::createAuxiliaryFunctions(BinaryContext &BC) { // with unknown symbol in runtime library. E.g. 
for static PIE // executable createSimpleFunction("__bolt_fini_trampoline", - BC.MIB->createDummyReturnFunction(BC.Ctx.get())); + BC.MIB->createReturnBody(BC.Ctx.get())); } } } ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
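To illustrate the motivation for the rename in the patch above: the helper returns a list holding a single return instruction, and it is the caller that turns that list into a function. The following is a minimal standalone sketch. `Inst`, `InstList`, and `SimpleFunction` are stand-in types invented for illustration, not the real `MCInst`, `InstructionListType`, or BOLT function classes, and the `"ret"` opcode string is an assumption.

```cpp
#include <string>
#include <utility>
#include <vector>

// Stand-ins for MCInst and InstructionListType (illustration only).
struct Inst {
  std::string Opcode;
};
using InstList = std::vector<Inst>;

// Fills in a single return instruction, like createReturn in the patch.
inline void createReturn(Inst &I) { I.Opcode = "ret"; }

// Same shape as the renamed helper: it builds a body, not a function.
inline InstList createReturnBody() {
  InstList Insts(1);
  createReturn(Insts[0]);
  return Insts;
}

// Hypothetical caller side: a function is a name plus a body.
struct SimpleFunction {
  std::string Name;
  InstList Body;
};

inline SimpleFunction createSimpleFunction(const std::string &Name,
                                           InstList Body) {
  return {Name, std::move(Body)};
}
```

Calling `createSimpleFunction("__bolt_fini_trampoline", createReturnBody())` in this sketch yields a function whose body is exactly one return instruction, which is the distinction the new name makes explicit.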
[llvm-branch-commits] [llvm] [NFC][BOLT] Rename createDummyReturnFunction to createReturnBody (PR #98448)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/98448
[llvm-branch-commits] [llvm] [NFC][BOLT] Rename createDummyReturnFunction to createReturnBody (PR #98448)
https://github.com/paschalis-mpeis ready_for_review https://github.com/llvm/llvm-project/pull/98448
[llvm-branch-commits] [llvm] [BOLT][docs] Expand Heatmaps.md (PR #98162)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/98162
[llvm-branch-commits] [llvm] [BOLT][docs] Expand Heatmaps.md (PR #98162)
paschalis-mpeis wrote: Thanks for your review, @aaupov. This is a stacked pull request, as the updated docs refer to the `-print-mappings` option. So it will be merged after: - #97567 https://github.com/llvm/llvm-project/pull/98162
[llvm-branch-commits] [llvm] [NFC][BOLT] Rename createDummyReturnFunction to createReturnBody (PR #98448)
https://github.com/paschalis-mpeis updated https://github.com/llvm/llvm-project/pull/98448 >From 2e9d663a9a164735fbe1a2408994acc1abaa8c21 Mon Sep 17 00:00:00 2001 From: Paschalis Mpeis Date: Thu, 11 Jul 2024 09:32:12 +0100 Subject: [PATCH] [NFC][BOLT] Rename createDummyReturnFunction to createReturnInstructionList createDummyReturnFunction is not creating a function but instead only a function body that is simply a return statement. This patch renames it to: createReturnInstructionList --- bolt/include/bolt/Core/MCPlusBuilder.h | 3 ++- bolt/lib/Passes/Instrumentation.cpp| 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/bolt/include/bolt/Core/MCPlusBuilder.h b/bolt/include/bolt/Core/MCPlusBuilder.h index 885d627f7b64f..c916c6f95751f 100644 --- a/bolt/include/bolt/Core/MCPlusBuilder.h +++ b/bolt/include/bolt/Core/MCPlusBuilder.h @@ -2044,7 +2044,8 @@ class MCPlusBuilder { /// Returns a function body that contains only a return instruction. An /// example usage is a workaround for the '__bolt_fini_trampoline' of // Instrumentation. - virtual InstructionListType createDummyReturnFunction(MCContext *Ctx) const { + virtual InstructionListType + createReturnInstructionList(MCContext *Ctx) const { InstructionListType Insts(1); createReturn(Insts[0]); return Insts; diff --git a/bolt/lib/Passes/Instrumentation.cpp b/bolt/lib/Passes/Instrumentation.cpp index e824a42d82696..ebb3925749b4d 100644 --- a/bolt/lib/Passes/Instrumentation.cpp +++ b/bolt/lib/Passes/Instrumentation.cpp @@ -754,7 +754,7 @@ void Instrumentation::createAuxiliaryFunctions(BinaryContext &BC) { // with unknown symbol in runtime library. E.g. for static PIE // executable createSimpleFunction("__bolt_fini_trampoline", - BC.MIB->createDummyReturnFunction(BC.Ctx.get())); + BC.MIB->createReturnInstructionList(BC.Ctx.get())); } } } ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [NFC][BOLT] Rename createDummyReturnFunction to createReturnInstructionList (PR #98448)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/98448
[llvm-branch-commits] [llvm] [NFC][BOLT] Rename createDummyReturnFunction to createReturnInstructi.. (PR #98448)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/98448
[llvm-branch-commits] [llvm] [NFC][BOLT] Rename createDummyReturnFunction to createReturnInstructi.. (PR #98448)
paschalis-mpeis wrote: Rebased on top of the updated #96626. Thanks for suggesting `createReturnInstructionList`; I think it's better, so I used it. https://github.com/llvm/llvm-project/pull/98448
[llvm-branch-commits] [llvm] [AArch64] SLP can vectorize frem (PR #82488)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/82488
[llvm-branch-commits] [llvm] [AArch64] SLP can vectorize frem (PR #82488)
https://github.com/paschalis-mpeis updated https://github.com/llvm/llvm-project/pull/82488 >From 641aaf7c13d520bef52b092726f8346bfecb1c8d Mon Sep 17 00:00:00 2001 From: Paschalis Mpeis Date: Wed, 21 Feb 2024 11:53:00 + Subject: [PATCH 1/3] SLP cannot vectorize frem calls in AArch64. It needs updated costs when there are available vector library functions given the VF and type. --- .../SLPVectorizer/AArch64/slp-frem.ll | 71 +++ 1 file changed, 71 insertions(+) create mode 100644 llvm/test/Transforms/SLPVectorizer/AArch64/slp-frem.ll diff --git a/llvm/test/Transforms/SLPVectorizer/AArch64/slp-frem.ll b/llvm/test/Transforms/SLPVectorizer/AArch64/slp-frem.ll new file mode 100644 index 00..45f667f5657889 --- /dev/null +++ b/llvm/test/Transforms/SLPVectorizer/AArch64/slp-frem.ll @@ -0,0 +1,71 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4 +; RUN: opt < %s -S -mtriple=aarch64 -vector-library=ArmPL -passes=slp-vectorizer | FileCheck %s + +@a = common global ptr null, align 8 + +define void @frem_v2double() { +; CHECK-LABEL: define void @frem_v2double() { +; CHECK-NEXT: entry: +; CHECK-NEXT:[[A0:%.*]] = load double, ptr @a, align 8 +; CHECK-NEXT:[[A1:%.*]] = load double, ptr getelementptr inbounds (double, ptr @a, i64 1), align 8 +; CHECK-NEXT:[[B0:%.*]] = load double, ptr @a, align 8 +; CHECK-NEXT:[[B1:%.*]] = load double, ptr getelementptr inbounds (double, ptr @a, i64 1), align 8 +; CHECK-NEXT:[[R0:%.*]] = frem double [[A0]], [[B0]] +; CHECK-NEXT:[[R1:%.*]] = frem double [[A1]], [[B1]] +; CHECK-NEXT:store double [[R0]], ptr @a, align 8 +; CHECK-NEXT:store double [[R1]], ptr getelementptr inbounds (double, ptr @a, i64 1), align 8 +; CHECK-NEXT:ret void +; +entry: + %a0 = load double, ptr getelementptr inbounds (double, ptr @a, i64 0), align 8 + %a1 = load double, ptr getelementptr inbounds (double, ptr @a, i64 1), align 8 + %b0 = load double, ptr getelementptr inbounds (double, ptr @a, i64 0), align 8 + %b1 = load 
double, ptr getelementptr inbounds (double, ptr @a, i64 1), align 8 + %r0 = frem double %a0, %b0 + %r1 = frem double %a1, %b1 + store double %r0, ptr getelementptr inbounds (double, ptr @a, i64 0), align 8 + store double %r1, ptr getelementptr inbounds (double, ptr @a, i64 1), align 8 + ret void +} + +define void @frem_v4float() { +; CHECK-LABEL: define void @frem_v4float() { +; CHECK-NEXT: entry: +; CHECK-NEXT:[[A0:%.*]] = load float, ptr @a, align 8 +; CHECK-NEXT:[[A1:%.*]] = load float, ptr getelementptr inbounds (float, ptr @a, i64 1), align 8 +; CHECK-NEXT:[[A2:%.*]] = load float, ptr getelementptr inbounds (float, ptr @a, i64 2), align 8 +; CHECK-NEXT:[[A3:%.*]] = load float, ptr getelementptr inbounds (float, ptr @a, i64 3), align 8 +; CHECK-NEXT:[[B0:%.*]] = load float, ptr @a, align 8 +; CHECK-NEXT:[[B1:%.*]] = load float, ptr getelementptr inbounds (float, ptr @a, i64 1), align 8 +; CHECK-NEXT:[[B2:%.*]] = load float, ptr getelementptr inbounds (float, ptr @a, i64 2), align 8 +; CHECK-NEXT:[[B3:%.*]] = load float, ptr getelementptr inbounds (float, ptr @a, i64 3), align 8 +; CHECK-NEXT:[[R0:%.*]] = frem float [[A0]], [[B0]] +; CHECK-NEXT:[[R1:%.*]] = frem float [[A1]], [[B1]] +; CHECK-NEXT:[[R2:%.*]] = frem float [[A2]], [[B2]] +; CHECK-NEXT:[[R3:%.*]] = frem float [[A3]], [[B3]] +; CHECK-NEXT:store float [[R0]], ptr @a, align 8 +; CHECK-NEXT:store float [[R1]], ptr getelementptr inbounds (float, ptr @a, i64 1), align 8 +; CHECK-NEXT:store float [[R2]], ptr getelementptr inbounds (float, ptr @a, i64 2), align 8 +; CHECK-NEXT:store float [[R3]], ptr getelementptr inbounds (float, ptr @a, i64 3), align 8 +; CHECK-NEXT:ret void +; +entry: + %a0 = load float, ptr getelementptr inbounds (float, ptr @a, i64 0), align 8 + %a1 = load float, ptr getelementptr inbounds (float, ptr @a, i64 1), align 8 + %a2 = load float, ptr getelementptr inbounds (float, ptr @a, i64 2), align 8 + %a3 = load float, ptr getelementptr inbounds (float, ptr @a, i64 3), align 8 + %b0 = 
load float, ptr getelementptr inbounds (float, ptr @a, i64 0), align 8 + %b1 = load float, ptr getelementptr inbounds (float, ptr @a, i64 1), align 8 + %b2 = load float, ptr getelementptr inbounds (float, ptr @a, i64 2), align 8 + %b3 = load float, ptr getelementptr inbounds (float, ptr @a, i64 3), align 8 + %r0 = frem float %a0, %b0 + %r1 = frem float %a1, %b1 + %r2 = frem float %a2, %b2 + %r3 = frem float %a3, %b3 + store float %r0, ptr getelementptr inbounds (float, ptr @a, i64 0), align 8 + store float %r1, ptr getelementptr inbounds (float, ptr @a, i64 1), align 8 + store float %r2, ptr getelementptr inbounds (float, ptr @a, i64 2), align 8 + store float %r3, ptr getelementptr inbounds (float, ptr @a, i64 3), align 8 + ret void +} + >From 29ae086478e3d4bae6b6250670f87273359626d7 Mon Sep 17
[llvm-branch-commits] [llvm] [AArch64] SLP can vectorize frem (PR #82488)
@@ -8362,9 +8362,20 @@ BoUpSLP::getEntryCost(const TreeEntry *E, ArrayRef VectorizedVals, unsigned OpIdx = isa(VL0) ? 0 : 1; TTI::OperandValueInfo Op1Info = getOperandInfo(E->getOperand(0)); TTI::OperandValueInfo Op2Info = getOperandInfo(E->getOperand(OpIdx)); - return TTI->getArithmeticInstrCost(ShuffleOrOp, VecTy, CostKind, Op1Info, - Op2Info) + - CommonCost; + auto VecCost = TTI->getArithmeticInstrCost(ShuffleOrOp, VecTy, CostKind, + Op1Info, Op2Info); + // Some targets can replace frem with vector library calls. + if (ShuffleOrOp == Instruction::FRem) { +LibFunc Func; +if (TLI->getLibFunc(ShuffleOrOp, ScalarTy, Func) && +TLI->isFunctionVectorizable(TLI->getName(Func), +VecTy->getElementCount())) { paschalis-mpeis wrote: Unfortunately, TLI (TargetLibraryInfo) is not available in TTI, and changing the signature of TTI->getArithmeticInstrCost would require far too many changes. But I agree that part of this logic is better moved elsewhere, so I've introduced 'getVecLibCallCost' to encapsulate that functionality. https://github.com/llvm/llvm-project/pull/82488
[llvm-branch-commits] [llvm] [AArch64] SLP can vectorize frem (PR #82488)
paschalis-mpeis wrote: Addressed the reviewers' comments and rebased onto the parent PR: - #80423 GitHub is now rendering **only** the changes of this patch. https://github.com/llvm/llvm-project/pull/82488
[llvm-branch-commits] [llvm] [AArch64] SLP can vectorize frem (PR #82488)
https://github.com/paschalis-mpeis updated https://github.com/llvm/llvm-project/pull/82488 >From 641aaf7c13d520bef52b092726f8346bfecb1c8d Mon Sep 17 00:00:00 2001 From: Paschalis Mpeis Date: Wed, 21 Feb 2024 11:53:00 + Subject: [PATCH 1/4] SLP cannot vectorize frem calls in AArch64. It needs updated costs when there are available vector library functions given the VF and type. --- .../SLPVectorizer/AArch64/slp-frem.ll | 71 +++ 1 file changed, 71 insertions(+) create mode 100644 llvm/test/Transforms/SLPVectorizer/AArch64/slp-frem.ll diff --git a/llvm/test/Transforms/SLPVectorizer/AArch64/slp-frem.ll b/llvm/test/Transforms/SLPVectorizer/AArch64/slp-frem.ll new file mode 100644 index 00..45f667f5657889 --- /dev/null +++ b/llvm/test/Transforms/SLPVectorizer/AArch64/slp-frem.ll @@ -0,0 +1,71 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4 +; RUN: opt < %s -S -mtriple=aarch64 -vector-library=ArmPL -passes=slp-vectorizer | FileCheck %s + +@a = common global ptr null, align 8 + +define void @frem_v2double() { +; CHECK-LABEL: define void @frem_v2double() { +; CHECK-NEXT: entry: +; CHECK-NEXT:[[A0:%.*]] = load double, ptr @a, align 8 +; CHECK-NEXT:[[A1:%.*]] = load double, ptr getelementptr inbounds (double, ptr @a, i64 1), align 8 +; CHECK-NEXT:[[B0:%.*]] = load double, ptr @a, align 8 +; CHECK-NEXT:[[B1:%.*]] = load double, ptr getelementptr inbounds (double, ptr @a, i64 1), align 8 +; CHECK-NEXT:[[R0:%.*]] = frem double [[A0]], [[B0]] +; CHECK-NEXT:[[R1:%.*]] = frem double [[A1]], [[B1]] +; CHECK-NEXT:store double [[R0]], ptr @a, align 8 +; CHECK-NEXT:store double [[R1]], ptr getelementptr inbounds (double, ptr @a, i64 1), align 8 +; CHECK-NEXT:ret void +; +entry: + %a0 = load double, ptr getelementptr inbounds (double, ptr @a, i64 0), align 8 + %a1 = load double, ptr getelementptr inbounds (double, ptr @a, i64 1), align 8 + %b0 = load double, ptr getelementptr inbounds (double, ptr @a, i64 0), align 8 + %b1 = load 
double, ptr getelementptr inbounds (double, ptr @a, i64 1), align 8 + %r0 = frem double %a0, %b0 + %r1 = frem double %a1, %b1 + store double %r0, ptr getelementptr inbounds (double, ptr @a, i64 0), align 8 + store double %r1, ptr getelementptr inbounds (double, ptr @a, i64 1), align 8 + ret void +} + +define void @frem_v4float() { +; CHECK-LABEL: define void @frem_v4float() { +; CHECK-NEXT: entry: +; CHECK-NEXT:[[A0:%.*]] = load float, ptr @a, align 8 +; CHECK-NEXT:[[A1:%.*]] = load float, ptr getelementptr inbounds (float, ptr @a, i64 1), align 8 +; CHECK-NEXT:[[A2:%.*]] = load float, ptr getelementptr inbounds (float, ptr @a, i64 2), align 8 +; CHECK-NEXT:[[A3:%.*]] = load float, ptr getelementptr inbounds (float, ptr @a, i64 3), align 8 +; CHECK-NEXT:[[B0:%.*]] = load float, ptr @a, align 8 +; CHECK-NEXT:[[B1:%.*]] = load float, ptr getelementptr inbounds (float, ptr @a, i64 1), align 8 +; CHECK-NEXT:[[B2:%.*]] = load float, ptr getelementptr inbounds (float, ptr @a, i64 2), align 8 +; CHECK-NEXT:[[B3:%.*]] = load float, ptr getelementptr inbounds (float, ptr @a, i64 3), align 8 +; CHECK-NEXT:[[R0:%.*]] = frem float [[A0]], [[B0]] +; CHECK-NEXT:[[R1:%.*]] = frem float [[A1]], [[B1]] +; CHECK-NEXT:[[R2:%.*]] = frem float [[A2]], [[B2]] +; CHECK-NEXT:[[R3:%.*]] = frem float [[A3]], [[B3]] +; CHECK-NEXT:store float [[R0]], ptr @a, align 8 +; CHECK-NEXT:store float [[R1]], ptr getelementptr inbounds (float, ptr @a, i64 1), align 8 +; CHECK-NEXT:store float [[R2]], ptr getelementptr inbounds (float, ptr @a, i64 2), align 8 +; CHECK-NEXT:store float [[R3]], ptr getelementptr inbounds (float, ptr @a, i64 3), align 8 +; CHECK-NEXT:ret void +; +entry: + %a0 = load float, ptr getelementptr inbounds (float, ptr @a, i64 0), align 8 + %a1 = load float, ptr getelementptr inbounds (float, ptr @a, i64 1), align 8 + %a2 = load float, ptr getelementptr inbounds (float, ptr @a, i64 2), align 8 + %a3 = load float, ptr getelementptr inbounds (float, ptr @a, i64 3), align 8 + %b0 = 
load float, ptr getelementptr inbounds (float, ptr @a, i64 0), align 8 + %b1 = load float, ptr getelementptr inbounds (float, ptr @a, i64 1), align 8 + %b2 = load float, ptr getelementptr inbounds (float, ptr @a, i64 2), align 8 + %b3 = load float, ptr getelementptr inbounds (float, ptr @a, i64 3), align 8 + %r0 = frem float %a0, %b0 + %r1 = frem float %a1, %b1 + %r2 = frem float %a2, %b2 + %r3 = frem float %a3, %b3 + store float %r0, ptr getelementptr inbounds (float, ptr @a, i64 0), align 8 + store float %r1, ptr getelementptr inbounds (float, ptr @a, i64 1), align 8 + store float %r2, ptr getelementptr inbounds (float, ptr @a, i64 2), align 8 + store float %r3, ptr getelementptr inbounds (float, ptr @a, i64 3), align 8 + ret void +} + >From 29ae086478e3d4bae6b6250670f87273359626d7 Mon Sep 17
[llvm-branch-commits] [llvm] [AArch64] SLP can vectorize frem (PR #82488)
@@ -869,6 +870,18 @@ TargetTransformInfo::getOperandInfo(const Value *V) { return {OpInfo, OpProps}; } +InstructionCost TargetTransformInfo::getVecLibCallCost( +const int OpCode, const TargetLibraryInfo *TLI, VectorType *VecTy, +TTI::TargetCostKind CostKind) { + Type *ScalarTy = VecTy->getScalarType(); + LibFunc Func; + if (TLI->getLibFunc(OpCode, ScalarTy, Func) && + TLI->isFunctionVectorizable(TLI->getName(Func), VecTy->getElementCount())) paschalis-mpeis wrote: Actually, in LoopVectorizer we have done it this way. For the sake of reusing the code between the two vectorizers, I had put it in TTI, but after our discussion I've now moved it to `VectorUtils`. LoopVectorizer should now reuse this method. https://github.com/llvm/llvm-project/pull/82488
[llvm-branch-commits] [llvm] [AArch64] SLP can vectorize frem (PR #82488)
@@ -8362,9 +8362,20 @@ BoUpSLP::getEntryCost(const TreeEntry *E, ArrayRef VectorizedVals, unsigned OpIdx = isa(VL0) ? 0 : 1; TTI::OperandValueInfo Op1Info = getOperandInfo(E->getOperand(0)); TTI::OperandValueInfo Op2Info = getOperandInfo(E->getOperand(OpIdx)); - return TTI->getArithmeticInstrCost(ShuffleOrOp, VecTy, CostKind, Op1Info, - Op2Info) + - CommonCost; + auto VecCost = TTI->getArithmeticInstrCost(ShuffleOrOp, VecTy, CostKind, + Op1Info, Op2Info); + // Some targets can replace frem with vector library calls. + if (ShuffleOrOp == Instruction::FRem) { +LibFunc Func; +if (TLI->getLibFunc(ShuffleOrOp, ScalarTy, Func) && +TLI->isFunctionVectorizable(TLI->getName(Func), +VecTy->getElementCount())) { paschalis-mpeis wrote: I'd rather not change the `TTI.getArithmeticInstrCost` API, as that would cause changes in 150+ places, including every target that overrides it. But I understand your concern, as this is specific to one of our instructions. TLI is simply passed to `getVecLibCallCost`, just as it is widely passed around elsewhere in the SLP code. This doesn't make the code target-dependent; TLI and TTI are abstractions over that. Internally, `getVecLibCallCost` does those checks, which have now moved to `VectorUtils`. SLP gets a valid cost if there is one, and picks the minimum of the two. https://github.com/llvm/llvm-project/pull/82488
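The cost selection described in the comment above can be sketched as follows. This is a simplified model under stated assumptions, not the code from the PR: `Cost` stands in for `llvm::InstructionCost`, the two booleans stand in for the `getLibFunc` and `isFunctionVectorizable` queries, and `vecLibCallCost` and `fremVectorCost` are hypothetical names. It requires C++17 for `std::optional`.

```cpp
#include <algorithm>
#include <optional>

// Stand-in for llvm::InstructionCost (illustration only).
using Cost = int;

// Hypothetical model of the vector-library query: a cost is available only
// when a library function exists for the opcode and scalar type, and that
// function is vectorizable at the requested element count.
inline std::optional<Cost> vecLibCallCost(bool HasLibFunc, bool IsVectorizable,
                                          Cost CallCost) {
  if (HasLibFunc && IsVectorizable)
    return CallCost;
  return std::nullopt; // no valid library cost
}

// Pick the cheaper of the plain arithmetic cost and the library-call cost,
// falling back to the arithmetic cost when no library cost is available.
inline Cost fremVectorCost(Cost ArithmeticCost, std::optional<Cost> LibCost) {
  return LibCost ? std::min(ArithmeticCost, *LibCost) : ArithmeticCost;
}
```

Modelling the library cost as an optional keeps the "is there a vector library function?" check inside the helper, so the vectorizer only compares costs.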
[llvm-branch-commits] [llvm] [AArch64] SLP can vectorize frem (PR #82488)
paschalis-mpeis wrote:

In my view, the benefits of having `getFRemInstrCost` are the following:

1. frem is a special case anyway: it is an IR instruction that is not supported by all hardware, so targets have to specialize. Handling it in a dedicated switch case with a dedicated TTI function call clearly exposes that information to anyone who reads the code in both vectorizers, rather than obscuring it. It also avoids adding any `if (TLI hasVecLib) doThis else doThat` logic to the vectorizers.
2. This won't be a significant API change. It won't force any other user of `getArithmeticInstrCost` to go through that change.

https://github.com/llvm/llvm-project/pull/82488
[llvm-branch-commits] [clang] [llvm] [LV][LAA] Vectorize math lib calls with mem write-only attribute (PR #78432)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/78432
[llvm-branch-commits] [clang] [llvm] [LV][LAA] Vectorize math lib calls with mem write-only attribute (PR #78432)
@@ -0,0 +1,52 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --filter "call.*(frexp|modf)" --version 4
+// RUN: %clang --target=aarch64-linux-gnu -march=armv8-a+sve -O3 -isystem %S/../Headers/Inputs/include -mllvm -vector-library=ArmPL -mllvm -force-vector-interleave=1 -mllvm -prefer-predicate-over-epilogue=predicate-dont-vectorize -emit-llvm -S -o - %s | FileCheck %s
+
+// REQUIRES: aarch64-registered-target
+
+/*
+Testing vectorization of math functions that have the attribute write-only to
+memory set. Given they have vectorized counterparts, they should be able to
+vectorize.
+*/
+
+// The following define is required to access some math functions.
+#define _GNU_SOURCE
+#include
+
+// frexp/frexpf have no TLI mappings yet.
+
+// CHECK-LABEL: define dso_local void @frexp_f64(
+// CHECK-SAME: ptr nocapture noundef readonly [[IN:%.*]], ptr nocapture noundef writeonly [[OUT1:%.*]], ptr nocapture noundef writeonly [[OUT2:%.*]], i32 noundef [[N:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
+// CHECK: [[CALL:%.*]] = tail call double @frexp(double noundef [[TMP0:%.*]], ptr noundef [[ADD_PTR:%.*]]) #[[ATTR2:[0-9]+]]
+//
+void frexp_f64(double *in, double *out1, int *out2, int N) {
+  for (int i = 0; i < N; ++i)
+    *out1 = frexp(in[i], out2+i);
+}
+
+// CHECK-LABEL: define dso_local void @frexp_f32(
+// CHECK-SAME: ptr nocapture noundef readonly [[IN:%.*]], ptr nocapture noundef writeonly [[OUT1:%.*]], ptr nocapture noundef writeonly [[OUT2:%.*]], i32 noundef [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
+// CHECK: [[CALL:%.*]] = tail call float @frexpf(float noundef [[TMP0:%.*]], ptr noundef [[ADD_PTR:%.*]]) #[[ATTR2]]
+//
+void frexp_f32(float *in, float *out1, int *out2, int N) {
+  for (int i = 0; i < N; ++i)
+    *out1 = frexpf(in[i], out2+i);
+}
+
+// CHECK-LABEL: define dso_local void @modf_f64(
+// CHECK-SAME: ptr nocapture noundef readonly [[IN:%.*]], ptr nocapture noundef writeonly [[OUT1:%.*]], ptr nocapture noundef writeonly [[OUT2:%.*]], i32 noundef [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
+// CHECK: [[CALL:%.*]] = tail call double @modf(double noundef [[TMP0:%.*]], ptr noundef [[ADD_PTR:%.*]]) #[[ATTR3:[0-9]+]]
+//
+void modf_f64(double *in, double *out1, double *out2, int N) {
+  for (int i = 0; i < N; ++i)
+    out1[i] = modf(in[i], out2+i);
+}
+
+// CHECK-LABEL: define dso_local void @modf_f32(
+// CHECK-SAME: ptr nocapture noundef readonly [[IN:%.*]], ptr nocapture noundef writeonly [[OUT1:%.*]], ptr nocapture noundef writeonly [[OUT2:%.*]], i32 noundef [[N:%.*]]) local_unnamed_addr #[[ATTR0]] {
+// CHECK: [[CALL:%.*]] = tail call float @modff(float noundef [[TMP0:%.*]], ptr noundef [[ADD_PTR:%.*]]) #[[ATTR4:[0-9]+]]
+//
+void modf_f32(float *in, float *out1, float *out2, int N) {
+  for (int i = 0; i < N; ++i)
+    out1[i] = modff(in[i], out2+i);
+}

paschalis-mpeis wrote:

This was converted to LLVM tests for LoopVectorizer, and LoopAccessAnalysis tests were added as well. Finally, the PR is stacked on top of #83143, which will reintroduce at least the modf/modff mappings.

https://github.com/llvm/llvm-project/pull/78432
[llvm-branch-commits] [clang] [llvm] [LV][LAA] Vectorize math lib calls with mem write-only attribute (PR #78432)
https://github.com/paschalis-mpeis ready_for_review https://github.com/llvm/llvm-project/pull/78432
[llvm-branch-commits] [clang] [llvm] [LV][LAA] Vectorize math lib calls with mem write-only attribute (PR #78432)
https://github.com/paschalis-mpeis converted_to_draft https://github.com/llvm/llvm-project/pull/78432
[llvm-branch-commits] [clang] [llvm] [LV][LAA] Vectorize math lib calls with mem write-only attribute (PR #78432)
https://github.com/paschalis-mpeis updated https://github.com/llvm/llvm-project/pull/78432 >From a74ba110994e4535cd6c9206aa02d50503fb5577 Mon Sep 17 00:00:00 2001 From: Paschalis Mpeis Date: Tue, 27 Feb 2024 15:00:28 + Subject: [PATCH 1/7] [AArch64][TLI] Add TLI mappings for ArmPL modf, sincos, sincospi --- llvm/include/llvm/Analysis/VecFuncs.def | 6 ++ .../AArch64/veclib-function-calls.ll | 66 ++- llvm/test/Transforms/Util/add-TLI-mappings.ll | 32 +++-- 3 files changed, 67 insertions(+), 37 deletions(-) diff --git a/llvm/include/llvm/Analysis/VecFuncs.def b/llvm/include/llvm/Analysis/VecFuncs.def index 394e4a05fbc0cf..10f1333cf8885c 100644 --- a/llvm/include/llvm/Analysis/VecFuncs.def +++ b/llvm/include/llvm/Analysis/VecFuncs.def @@ -1005,6 +1005,8 @@ TLI_DEFINE_VECFUNC("llvm.log2.f32", "armpl_svlog2_f32_x", SCALABLE(4), MASKED, " TLI_DEFINE_VECFUNC("modf", "armpl_vmodfq_f64", FIXED(2), NOMASK, "_ZGV_LLVM_N2vl8") TLI_DEFINE_VECFUNC("modff", "armpl_vmodfq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4vl4") +TLI_DEFINE_VECFUNC("modf", "armpl_svmodf_f64_x", SCALABLE(2), MASKED, "_ZGVsMxvl8") +TLI_DEFINE_VECFUNC("modff", "armpl_svmodf_f32_x", SCALABLE(4), MASKED, "_ZGVsMxvl4") TLI_DEFINE_VECFUNC("nextafter", "armpl_vnextafterq_f64", FIXED(2), NOMASK, "_ZGV_LLVM_N2vv") TLI_DEFINE_VECFUNC("nextafterf", "armpl_vnextafterq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4vv") @@ -1033,9 +1035,13 @@ TLI_DEFINE_VECFUNC("llvm.sin.f32", "armpl_svsin_f32_x", SCALABLE(4), MASKED, "_Z TLI_DEFINE_VECFUNC("sincos", "armpl_vsincosq_f64", FIXED(2), NOMASK, "_ZGV_LLVM_N2vl8l8") TLI_DEFINE_VECFUNC("sincosf", "armpl_vsincosq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4vl4l4") +TLI_DEFINE_VECFUNC("sincos", "armpl_svsincos_f64_x", SCALABLE(2), MASKED, "_ZGVsMxvl8l8") +TLI_DEFINE_VECFUNC("sincosf", "armpl_svsincos_f32_x", SCALABLE(4), MASKED, "_ZGVsMxvl4l4") TLI_DEFINE_VECFUNC("sincospi", "armpl_vsincospiq_f64", FIXED(2), NOMASK, "_ZGV_LLVM_N2vl8l8") TLI_DEFINE_VECFUNC("sincospif", "armpl_vsincospiq_f32", FIXED(4), 
NOMASK, "_ZGV_LLVM_N4vl4l4") +TLI_DEFINE_VECFUNC("sincospi", "armpl_svsincospi_f64_x", SCALABLE(2), MASKED, "_ZGVsMxvl8l8") +TLI_DEFINE_VECFUNC("sincospif", "armpl_svsincospi_f32_x", SCALABLE(4), MASKED, "_ZGVsMxvl4l4") TLI_DEFINE_VECFUNC("sinh", "armpl_vsinhq_f64", FIXED(2), NOMASK, "_ZGV_LLVM_N2v") TLI_DEFINE_VECFUNC("sinhf", "armpl_vsinhq_f32", FIXED(4), NOMASK, "_ZGV_LLVM_N4v") diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/veclib-function-calls.ll b/llvm/test/Transforms/LoopVectorize/AArch64/veclib-function-calls.ll index dd1495626eb984..d9cc630482fc80 100644 --- a/llvm/test/Transforms/LoopVectorize/AArch64/veclib-function-calls.ll +++ b/llvm/test/Transforms/LoopVectorize/AArch64/veclib-function-calls.ll @@ -2925,11 +2925,12 @@ define void @modf_f64(ptr noalias %a, ptr noalias %b, ptr noalias %c) { ; ; ARMPL-SVE-LABEL: define void @modf_f64 ; ARMPL-SVE-SAME: (ptr noalias [[A:%.*]], ptr noalias [[B:%.*]], ptr noalias [[C:%.*]]) #[[ATTR0]] { -; ARMPL-SVE:[[DATA:%.*]] = call double @modf(double [[NUM:%.*]], ptr [[GEPB:%.*]]) #[[ATTR4:[0-9]+]] +; ARMPL-SVE:[[TMP23:%.*]] = call @armpl_svmodf_f64_x( [[WIDE_MASKED_LOAD:%.*]], ptr [[TMP22:%.*]], [[ACTIVE_LANE_MASK:%.*]]) ; ; ARMPL-SVE-NOPRED-LABEL: define void @modf_f64 ; ARMPL-SVE-NOPRED-SAME: (ptr noalias [[A:%.*]], ptr noalias [[B:%.*]], ptr noalias [[C:%.*]]) #[[ATTR0]] { -; ARMPL-SVE-NOPRED:[[TMP5:%.*]] = call <2 x double> @armpl_vmodfq_f64(<2 x double> [[WIDE_LOAD:%.*]], ptr [[TMP4:%.*]]) +; ARMPL-SVE-NOPRED:[[TMP17:%.*]] = call @armpl_svmodf_f64_x( [[WIDE_LOAD:%.*]], ptr [[TMP16:%.*]], shufflevector ( insertelement ( poison, i1 true, i64 0), poison, zeroinitializer)) +; ARMPL-SVE-NOPRED:[[DATA:%.*]] = call double @modf(double [[NUM:%.*]], ptr [[GEPB:%.*]]) #[[ATTR64:[0-9]+]] ; entry: br label %for.body @@ -2970,11 +2971,12 @@ define void @modf_f32(ptr noalias %a, ptr noalias %b, ptr noalias %c) { ; ; ARMPL-SVE-LABEL: define void @modf_f32 ; ARMPL-SVE-SAME: (ptr noalias [[A:%.*]], ptr noalias [[B:%.*]], 
ptr noalias [[C:%.*]]) #[[ATTR0]] { -; ARMPL-SVE:[[DATA:%.*]] = call float @modff(float [[NUM:%.*]], ptr [[GEPB:%.*]]) #[[ATTR5:[0-9]+]] +; ARMPL-SVE:[[TMP23:%.*]] = call @armpl_svmodf_f32_x( [[WIDE_MASKED_LOAD:%.*]], ptr [[TMP22:%.*]], [[ACTIVE_LANE_MASK:%.*]]) ; ; ARMPL-SVE-NOPRED-LABEL: define void @modf_f32 ; ARMPL-SVE-NOPRED-SAME: (ptr noalias [[A:%.*]], ptr noalias [[B:%.*]], ptr noalias [[C:%.*]]) #[[ATTR0]] { -; ARMPL-SVE-NOPRED:[[TMP5:%.*]] = call <4 x float> @armpl_vmodfq_f32(<4 x float> [[WIDE_LOAD:%.*]], ptr [[TMP4:%.*]]) +; ARMPL-SVE-NOPRED:[[TMP17:%.*]] = call @armpl_svmodf_f32_x( [[WIDE_LOAD:%.*]], ptr [[TMP16:%.*]], shufflevector ( insertelement ( poison, i1 true, i64 0), poison, zeroinitializer)) +; ARMPL-SVE-NOPRED:[[DATA:%.*]] = call float @modff(float [[NUM:%.*]], ptr [[GEPB:%.*]]) #[[A
[llvm-branch-commits] [llvm] 21b30e1 - [NFC][TLI] Improve tests for ArmPL and SLEEF Intrinsics.
Author: Paschalis Mpeis Date: 2023-11-24T16:49:05Z New Revision: 21b30e18814016dc61b1a1ed87609e53454e3553 URL: https://github.com/llvm/llvm-project/commit/21b30e18814016dc61b1a1ed87609e53454e3553 DIFF: https://github.com/llvm/llvm-project/commit/21b30e18814016dc61b1a1ed87609e53454e3553.diff LOG: [NFC][TLI] Improve tests for ArmPL and SLEEF Intrinsics. Auto-generate test `armpl-intrinsics.ll`, and use active lane mask to have shorter `shufflevector` check lines. Update scripts now add `@llvm.compiler.used` instead of using the regex: `@[[LLVM_COMPILER_USED:[a-zA-Z0-9_$"\\.-]+]]` Added: Modified: llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-sleef.ll llvm/test/Transforms/LoopVectorize/AArch64/armpl-intrinsics.ll llvm/test/Transforms/LoopVectorize/AArch64/sleef-intrinsic-calls-aarch64.ll Removed: diff --git a/llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll b/llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll index a38d4a53407c5d2..18431ae021f9766 100644 --- a/llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll +++ b/llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll @@ -15,7 +15,7 @@ declare @llvm.cos.nxv2f64() declare @llvm.cos.nxv4f32() ;. 
-; CHECK: @[[LLVM_COMPILER_USED:[a-zA-Z0-9_$"\\.-]+]] = appending global [16 x ptr] [ptr @armpl_vcosq_f64, ptr @armpl_vcosq_f32, ptr @armpl_vsinq_f64, ptr @armpl_vsinq_f32, ptr @armpl_vexpq_f64, ptr @armpl_vexpq_f32, ptr @armpl_vexp2q_f64, ptr @armpl_vexp2q_f32, ptr @armpl_vexp10q_f64, ptr @armpl_vexp10q_f32, ptr @armpl_vlogq_f64, ptr @armpl_vlogq_f32, ptr @armpl_vlog2q_f64, ptr @armpl_vlog2q_f32, ptr @armpl_vlog10q_f64, ptr @armpl_vlog10q_f32], section "llvm.metadata" +; CHECK: @llvm.compiler.used = appending global [16 x ptr] [ptr @armpl_vcosq_f64, ptr @armpl_vcosq_f32, ptr @armpl_vsinq_f64, ptr @armpl_vsinq_f32, ptr @armpl_vexpq_f64, ptr @armpl_vexpq_f32, ptr @armpl_vexp2q_f64, ptr @armpl_vexp2q_f32, ptr @armpl_vexp10q_f64, ptr @armpl_vexp10q_f32, ptr @armpl_vlogq_f64, ptr @armpl_vlogq_f32, ptr @armpl_vlog2q_f64, ptr @armpl_vlog2q_f32, ptr @armpl_vlog10q_f64, ptr @armpl_vlog10q_f32], section "llvm.metadata" ;. define <2 x double> @llvm_cos_f64(<2 x double> %in) { ; CHECK-LABEL: define <2 x double> @llvm_cos_f64 diff --git a/llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-sleef.ll b/llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-sleef.ll index cedb7dd85149d00..be247de368056e7 100644 --- a/llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-sleef.ll +++ b/llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-sleef.ll @@ -4,7 +4,7 @@ target triple = "aarch64-unknown-linux-gnu" ;. 
-; CHECK: @[[LLVM_COMPILER_USED:[a-zA-Z0-9_$"\\.-]+]] = appending global [16 x ptr] [ptr @_ZGVnN2v_cos, ptr @_ZGVnN4v_cosf, ptr @_ZGVnN2v_exp, ptr @_ZGVnN4v_expf, ptr @_ZGVnN2v_exp2, ptr @_ZGVnN4v_exp2f, ptr @_ZGVnN2v_exp10, ptr @_ZGVnN4v_exp10f, ptr @_ZGVnN2v_log, ptr @_ZGVnN4v_logf, ptr @_ZGVnN2v_log10, ptr @_ZGVnN4v_log10f, ptr @_ZGVnN2v_log2, ptr @_ZGVnN4v_log2f, ptr @_ZGVnN2v_sin, ptr @_ZGVnN4v_sinf], section "llvm.metadata" +; CHECK: @llvm.compiler.used = appending global [16 x ptr] [ptr @_ZGVnN2v_cos, ptr @_ZGVnN4v_cosf, ptr @_ZGVnN2v_exp, ptr @_ZGVnN4v_expf, ptr @_ZGVnN2v_exp2, ptr @_ZGVnN4v_exp2f, ptr @_ZGVnN2v_exp10, ptr @_ZGVnN4v_exp10f, ptr @_ZGVnN2v_log, ptr @_ZGVnN4v_logf, ptr @_ZGVnN2v_log10, ptr @_ZGVnN4v_log10f, ptr @_ZGVnN2v_log2, ptr @_ZGVnN4v_log2f, ptr @_ZGVnN2v_sin, ptr @_ZGVnN4v_sinf], section "llvm.metadata" ;. define <2 x double> @llvm_ceil_f64(<2 x double> %in) { ; CHECK-LABEL: @llvm_ceil_f64( diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/armpl-intrinsics.ll b/llvm/test/Transforms/LoopVectorize/AArch64/armpl-intrinsics.ll index 03d959c928577d5..07b1402b4697fa2 100644 --- a/llvm/test/Transforms/LoopVectorize/AArch64/armpl-intrinsics.ll +++ b/llvm/test/Transforms/LoopVectorize/AArch64/armpl-intrinsics.ll @@ -1,10 +1,9 @@ -; RUN: opt -vector-library=ArmPL -passes=inject-tli-mappings,loop-vectorize -S < %s | FileCheck %s --check-prefixes=CHECK,NEON -; RUN: opt -mattr=+sve -vector-library=ArmPL -passes=inject-tli-mappings,loop-vectorize -S < %s | FileCheck %s --check-prefixes=CHECK,SVE +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --filter "(\.|_v|_sv)(ceil|copysign|cos|exp\.|expf?\(|exp2|exp10|fabs|floor|fma|log|m..num|pow|nearbyint|rint|round|sin|sqrt|trunc)|(ret)" --version 2 +; RUN: opt -vector-library=ArmPL -passes=inject-tli-mappings,loop-vectorize -prefer-predicate-over-epilogue=predicate-dont-vectorize -S < %s | FileCheck %s --check-prefixes=NEON +; RUN: opt -mattr=+sve 
-vector-library=ArmPL -passes=inject-tli-mappings,loop-vectorize -prefer-predicate-over-epilogue=predicate-dont-v
[llvm-branch-commits] [llvm] cace1ec - Add `simplifycfg` pass and `noalias` to ensure tail folding.
Author: Paschalis Mpeis Date: 2023-11-27T17:40:30Z New Revision: cace1ec7346d3dfee9fcc5d67d79bce989b207d1 URL: https://github.com/llvm/llvm-project/commit/cace1ec7346d3dfee9fcc5d67d79bce989b207d1 DIFF: https://github.com/llvm/llvm-project/commit/cace1ec7346d3dfee9fcc5d67d79bce989b207d1.diff LOG: Add `simplifycfg` pass and `noalias` to ensure tail folding. `noalias` attribute was added only to the `%in.ptr` parameter of the ArmPL Intrinsics. Added: Modified: llvm/test/Transforms/LoopVectorize/AArch64/armpl-intrinsics.ll llvm/test/Transforms/LoopVectorize/AArch64/sleef-intrinsic-calls-aarch64.ll Removed: diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/armpl-intrinsics.ll b/llvm/test/Transforms/LoopVectorize/AArch64/armpl-intrinsics.ll index 07b1402b4697fa2..96d94f72fabf06d 100644 --- a/llvm/test/Transforms/LoopVectorize/AArch64/armpl-intrinsics.ll +++ b/llvm/test/Transforms/LoopVectorize/AArch64/armpl-intrinsics.ll @@ -1,6 +1,6 @@ ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --filter "(\.|_v|_sv)(ceil|copysign|cos|exp\.|expf?\(|exp2|exp10|fabs|floor|fma|log|m..num|pow|nearbyint|rint|round|sin|sqrt|trunc)|(ret)" --version 2 -; RUN: opt -vector-library=ArmPL -passes=inject-tli-mappings,loop-vectorize -prefer-predicate-over-epilogue=predicate-dont-vectorize -S < %s | FileCheck %s --check-prefixes=NEON -; RUN: opt -mattr=+sve -vector-library=ArmPL -passes=inject-tli-mappings,loop-vectorize -prefer-predicate-over-epilogue=predicate-dont-vectorize -S < %s | FileCheck %s --check-prefixes=SVE +; RUN: opt -vector-library=ArmPL -passes=inject-tli-mappings,loop-vectorize,simplifycfg -prefer-predicate-over-epilogue=predicate-dont-vectorize -S < %s | FileCheck %s --check-prefixes=NEON +; RUN: opt -mattr=+sve -vector-library=ArmPL -passes=inject-tli-mappings,loop-vectorize,simplifycfg -prefer-predicate-over-epilogue=predicate-dont-vectorize -S < %s | FileCheck %s --check-prefixes=SVE target triple = "aarch64-unknown-linux-gnu" @@ 
-10,18 +10,16 @@ target triple = "aarch64-unknown-linux-gnu" declare double @llvm.cos.f64(double) declare float @llvm.cos.f32(float) -define void @cos_f64(ptr nocapture %in.ptr, ptr %out.ptr) { +define void @cos_f64(ptr noalias %in.ptr, ptr %out.ptr) { ; ; NEON-LABEL: define void @cos_f64 -; NEON-SAME: (ptr nocapture [[IN_PTR:%.*]], ptr [[OUT_PTR:%.*]]) { -; NEON:[[TMP4:%.*]] = call <2 x double> @armpl_vcosq_f64(<2 x double> [[WIDE_LOAD:%.*]]) -; NEON:[[CALL:%.*]] = tail call double @llvm.cos.f64(double [[IN:%.*]]) #[[ATTR1:[0-9]+]] +; NEON-SAME: (ptr noalias [[IN_PTR:%.*]], ptr [[OUT_PTR:%.*]]) { +; NEON:[[TMP3:%.*]] = call <2 x double> @armpl_vcosq_f64(<2 x double> [[WIDE_LOAD:%.*]]) ; NEON:ret void ; ; SVE-LABEL: define void @cos_f64 -; SVE-SAME: (ptr nocapture [[IN_PTR:%.*]], ptr [[OUT_PTR:%.*]]) #[[ATTR1:[0-9]+]] { -; SVE:[[TMP17:%.*]] = call @armpl_svcos_f64_x( [[WIDE_MASKED_LOAD:%.*]], [[ACTIVE_LANE_MASK:%.*]]) -; SVE:[[CALL:%.*]] = tail call double @llvm.cos.f64(double [[IN:%.*]]) #[[ATTR5:[0-9]+]] +; SVE-SAME: (ptr noalias [[IN_PTR:%.*]], ptr [[OUT_PTR:%.*]]) #[[ATTR1:[0-9]+]] { +; SVE:[[TMP13:%.*]] = call @armpl_svcos_f64_x( [[WIDE_MASKED_LOAD:%.*]], [[ACTIVE_LANE_MASK:%.*]]) ; SVE:ret void ; entry: @@ -42,17 +40,15 @@ define void @cos_f64(ptr nocapture %in.ptr, ptr %out.ptr) { ret void } -define void @cos_f32(ptr nocapture %in.ptr, ptr %out.ptr) { +define void @cos_f32(ptr noalias %in.ptr, ptr %out.ptr) { ; NEON-LABEL: define void @cos_f32 -; NEON-SAME: (ptr nocapture [[IN_PTR:%.*]], ptr [[OUT_PTR:%.*]]) { -; NEON:[[TMP4:%.*]] = call <4 x float> @armpl_vcosq_f32(<4 x float> [[WIDE_LOAD:%.*]]) -; NEON:[[CALL:%.*]] = tail call float @llvm.cos.f32(float [[IN:%.*]]) #[[ATTR2:[0-9]+]] +; NEON-SAME: (ptr noalias [[IN_PTR:%.*]], ptr [[OUT_PTR:%.*]]) { +; NEON:[[TMP3:%.*]] = call <4 x float> @armpl_vcosq_f32(<4 x float> [[WIDE_LOAD:%.*]]) ; NEON:ret void ; ; SVE-LABEL: define void @cos_f32 -; SVE-SAME: (ptr nocapture [[IN_PTR:%.*]], ptr [[OUT_PTR:%.*]]) 
#[[ATTR1]] { -; SVE:[[TMP17:%.*]] = call @armpl_svcos_f32_x( [[WIDE_MASKED_LOAD:%.*]], [[ACTIVE_LANE_MASK:%.*]]) -; SVE:[[CALL:%.*]] = tail call float @llvm.cos.f32(float [[IN:%.*]]) #[[ATTR6:[0-9]+]] +; SVE-SAME: (ptr noalias [[IN_PTR:%.*]], ptr [[OUT_PTR:%.*]]) #[[ATTR1]] { +; SVE:[[TMP13:%.*]] = call @armpl_svcos_f32_x( [[WIDE_MASKED_LOAD:%.*]], [[ACTIVE_LANE_MASK:%.*]]) ; SVE:ret void ; entry: @@ -76,15 +72,13 @@ define void @cos_f32(ptr nocapture %in.ptr, ptr %out.ptr) { declare double @llvm.exp.f64(double) declare float @llvm.exp.f32(float) -define void @exp_f64(ptr nocapture %in.ptr, ptr %out.ptr) { +define void @exp_f64(ptr noalias %in.ptr, ptr %out.ptr) { ; NEON-LABEL: define void @exp_f64 -; NEON-SAME: (ptr nocapture [[IN_PTR:%.*]], ptr
[llvm-branch-commits] [llvm] 6f79792 - [TLI] Pass replace-with-veclib works with Scalable Vectors.
Author: Paschalis Mpeis Date: 2023-11-28T12:02:12Z New Revision: 6f797921e23fe9a4500222e69ebd75aa7ba53ec1 URL: https://github.com/llvm/llvm-project/commit/6f797921e23fe9a4500222e69ebd75aa7ba53ec1 DIFF: https://github.com/llvm/llvm-project/commit/6f797921e23fe9a4500222e69ebd75aa7ba53ec1.diff LOG: [TLI] Pass replace-with-veclib works with Scalable Vectors. The pass uses the Masked variant of TLI method when the Intrinsic operates on Scalable Vectors and it fails to find a non-Masked variant. Added: Modified: llvm/lib/Analysis/VFABIDemangling.cpp llvm/lib/CodeGen/ReplaceWithVeclib.cpp llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-sleef-scalable.ll Removed: diff --git a/llvm/lib/Analysis/VFABIDemangling.cpp b/llvm/lib/Analysis/VFABIDemangling.cpp index 88f61cfeb9ba4e5..85880257a320860 100644 --- a/llvm/lib/Analysis/VFABIDemangling.cpp +++ b/llvm/lib/Analysis/VFABIDemangling.cpp @@ -126,7 +126,7 @@ static ParseRet tryParseLinearTokenWithRuntimeStep(StringRef &ParseString, return ParseRet::None; } -/// The function looks for the following stringt at the beginning of +/// The function looks for the following string at the beginning of /// the input string `ParseString`: /// /// diff --git a/llvm/lib/CodeGen/ReplaceWithVeclib.cpp b/llvm/lib/CodeGen/ReplaceWithVeclib.cpp index 36c91b7fa97e462..d31a793556dfded 100644 --- a/llvm/lib/CodeGen/ReplaceWithVeclib.cpp +++ b/llvm/lib/CodeGen/ReplaceWithVeclib.cpp @@ -105,6 +105,7 @@ static bool replaceWithCallToVeclib(const TargetLibraryInfo &TLI, // all vector operands have identical vector width. 
   ElementCount VF = ElementCount::getFixed(0);
   SmallVector ScalarTypes;
+  bool MayBeMasked = false;
   for (auto Arg : enumerate(CI.args())) {
     auto *ArgType = Arg.value()->getType();
     // Vector calls to intrinsics can still have
@@ -121,17 +122,13 @@ static bool replaceWithCallToVeclib(const TargetLibraryInfo &TLI,
         return false;
       }
       ElementCount NumElements = VectorArgTy->getElementCount();
-      if (NumElements.isScalable()) {
-        // The current implementation does not support
-        // scalable vectors.
-        return false;
-      }
-      if (VF.isNonZero() && VF != NumElements) {
-        // The different arguments differ in vector size.
+      if (NumElements.isScalable())
+        MayBeMasked = true;
+
+      // The different arguments differ in vector size.
+      if (VF.isNonZero() && VF != NumElements)
         return false;
-      } else {
-        VF = NumElements;
-      }
+      VF = NumElements;
       ScalarTypes.push_back(VectorArgTy->getElementType());
     }
   }
@@ -152,11 +149,14 @@ static bool replaceWithCallToVeclib(const TargetLibraryInfo &TLI,
     return false;
   }
+  // Assume it has a mask when that is a possibility and has no mapping for
+  // a Non-Masked variant.
+  const bool IsMasked =
+      MayBeMasked && !TLI.getVectorMappingInfo(ScalarName, VF, false);
   // Try to find the mapping for the scalar version of this intrinsic
   // and the exact vector width of the call operands in the
   // TargetLibraryInfo.
- StringRef TLIName = TLI.getVectorizedFunction(ScalarName, VF); - + StringRef TLIName = TLI.getVectorizedFunction(ScalarName, VF, IsMasked); LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": Looking up TLI mapping for `" << ScalarName << "` and vector width " << VF << ".\n"); diff --git a/llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll b/llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll index 18431ae021f9766..633cb220f52464c 100644 --- a/llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll +++ b/llvm/test/CodeGen/AArch64/replace-intrinsics-with-veclib-armpl.ll @@ -15,7 +15,7 @@ declare @llvm.cos.nxv2f64() declare @llvm.cos.nxv4f32() ;. -; CHECK: @llvm.compiler.used = appending global [16 x ptr] [ptr @armpl_vcosq_f64, ptr @armpl_vcosq_f32, ptr @armpl_vsinq_f64, ptr @armpl_vsinq_f32, ptr @armpl_vexpq_f64, ptr @armpl_vexpq_f32, ptr @armpl_vexp2q_f64, ptr @armpl_vexp2q_f32, ptr @armpl_vexp10q_f64, ptr @armpl_vexp10q_f32, ptr @armpl_vlogq_f64, ptr @armpl_vlogq_f32, ptr @armpl_vlog2q_f64, ptr @armpl_vlog2q_f32, ptr @armpl_vlog10q_f64, ptr @armpl_vlog10q_f32], section "llvm.metadata" +; CHECK: @llvm.compiler.used = appending global [32 x ptr] [ptr @armpl_vcosq_f64, ptr @armpl_vcosq_f32, ptr @armpl_svcos_f64_x, ptr @armpl_svcos_f32_x, ptr @armpl_vsinq_f64, ptr @armpl_vsinq_f32, ptr @armpl_svsin_f64_x, ptr @armpl_svsin_f32_x, ptr @armpl_vexpq_f64, ptr @armpl_vexpq_f32, ptr @armpl_svexp_f64_x, ptr @armpl_svexp_f32_x, ptr @armpl_vexp2q_f64, ptr @armpl_vexp2q_f32, ptr @armpl_svexp2_f64_x, ptr @armpl_svexp2_f32_x, ptr @armpl_vexp10q_f64, ptr @armpl_vexp10q_f32, ptr @armpl_svexp10_f64_x, ptr @armpl_
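
The masked-variant fallback described in the commit log can be modeled roughly as below. This is a toy Python sketch, not the pass itself: the `MAPPINGS` table, its key layout, and `find_vector_function` are hypothetical stand-ins for the TargetLibraryInfo lookups, and only two illustrative entries are included.

```python
from typing import Dict, Optional, Tuple

# Hypothetical table: (scalar name, min element count, scalable?, masked?) -> vector function.
Key = Tuple[str, int, bool, bool]

MAPPINGS: Dict[Key, str] = {
    ("llvm.cos.f64", 2, False, False): "armpl_vcosq_f64",
    ("llvm.cos.f64", 2, True, True): "armpl_svcos_f64_x",
}


def find_vector_function(name: str, vf: int, scalable: bool) -> Optional[str]:
    """Look up a vector replacement, preferring the non-masked variant."""
    # Only scalable vectors may need to fall back to a masked variant.
    may_be_masked = scalable
    has_unmasked = (name, vf, scalable, False) in MAPPINGS
    # Assume a mask only when it is possible and no non-masked mapping exists.
    is_masked = may_be_masked and not has_unmasked
    return MAPPINGS.get((name, vf, scalable, is_masked))
```

For a fixed-width request the non-masked entry is used; for a scalable request with no non-masked entry, the masked one is picked instead.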
[llvm-branch-commits] [llvm] [BOLT] Skip out-of-range pending relocations (PR #116964)
paschalis-mpeis wrote:

Force-pushed to add the missing code. Also, this PR is now stacked on top of #123635.

Thanks for the comments @maks. I am not sure whether your [concern](https://github.com/llvm/llvm-project/issues/116817#issuecomment-2602866672) on the issue still stands.

https://github.com/llvm/llvm-project/pull/116964
[llvm-branch-commits] [llvm] [BOLT] Skip out-of-range pending relocations (PR #116964)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/116964
[llvm-branch-commits] [llvm] [BOLT] Skip out-of-range pending relocations (PR #116964)
https://github.com/paschalis-mpeis ready_for_review https://github.com/llvm/llvm-project/pull/116964
[llvm-branch-commits] [llvm] [BOLT] Skip out-of-range pending relocations (PR #116964)
@@ -165,11 +165,17 @@ void BinarySection::flushPendingRelocations(raw_pwrite_stream &OS,
     OS.pwrite(Patch.Bytes.data(), Patch.Bytes.size(),
               SectionFileOffset + Patch.Offset);
+  uint64_t SkippedPendingRelocations = 0;
   for (Relocation &Reloc : PendingRelocations) {
     uint64_t Value = Reloc.Addend;
     if (Reloc.Symbol)
       Value += Resolver(Reloc.Symbol);
+    if (!Relocation::canEncodeValue(Reloc.Type, Value,

paschalis-mpeis wrote:

`addPendingRelocation` is now the only way to add a pending relocation (parent PR #123635). If a relocation comes from the `scanExternalRefs` optimization, it is marked as optional. Those optional relocations can then be safely skipped when it is time to flush them.

https://github.com/llvm/llvm-project/pull/116964
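The skip-on-flush behaviour described above might look roughly like this in a simplified Python model. Everything here is illustrative: `PendingReloc`, `fits_signed`, and `flush` are hypothetical names, the signed-width range check is only a stand-in for `Relocation::canEncodeValue`, and raising an error for a non-optional out-of-range relocation is an assumption of this sketch, not something stated in the PR.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PendingReloc:
    offset: int
    value: int
    optional: bool  # e.g. set for relocations coming from an optimization


def fits_signed(value: int, bits: int) -> bool:
    """Check whether value is representable as a signed integer of the given width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def flush(relocs: List[PendingReloc], bits: int) -> Tuple[List[PendingReloc], int]:
    """Return the relocations that would be written and the number skipped."""
    written: List[PendingReloc] = []
    skipped = 0
    for reloc in relocs:
        if not fits_signed(reloc.value, bits):
            if reloc.optional:
                # Optional relocations are safe to drop when out of range.
                skipped += 1
                continue
            # Assumption: a mandatory relocation that cannot be encoded is an error.
            raise ValueError(f"cannot encode relocation at offset {reloc.offset:#x}")
        written.append(reloc)
    return written, skipped
```

Counting the skipped entries, as the `SkippedPendingRelocations` variable in the hunk suggests, makes it possible to report how many were dropped.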
[llvm-branch-commits] [llvm] [BOLT] Skip out-of-range pending relocations (PR #116964)
https://github.com/paschalis-mpeis converted_to_draft https://github.com/llvm/llvm-project/pull/116964
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
@@ -11,4 +11,4 @@ CHECK-SPE-NO-LBR: PERF2BOLT: Starting data aggregation job
 RUN: perf record -e cycles -q -o %t.perf.data -- %t.exe
 RUN: not perf2bolt -p %t.perf.data -o %t.perf.boltdata --spe %t.exe 2>&1 | FileCheck %s --check-prefix=CHECK-SPE-LBR
-CHECK-SPE-LBR: PERF2BOLT-ERROR: Arm SPE mode is combined only with BasicAggregation.
+CHECK-SPE-LBR: PERF2BOLT: spawning perf job to read SPE branch events

paschalis-mpeis wrote:

I realized I didn't include proper context in my previous comment about the **'fragility'**:

The reason for this fragility is the version of `perf` being used. Since `perf2bolt` is a wrapper over `perf`, older kernel versions may lack `brstack` support. In those cases `perf2bolt` would eventually return an error.

So here we intentionally ignore whether `perf2bolt` fails, and instead only check that its original intent was to parse the SPE data, e.g.:

> PERF2BOLT: spawning perf job to read SPE brstack events

This should avoid flakiness in tests.

https://github.com/llvm/llvm-project/pull/129231
[llvm-branch-commits] [llvm] [BOLT][test] Fix callcont-fallthru.s after #129481 (PR #135867)
https://github.com/paschalis-mpeis commented:

Hey Amir,

Thanks for the PR. Unfortunately, it is still failing. The trick below doesn't seem to work on my buildbot machine:

> Link against a DSO to ensure PLT entries.

So doing:

```bash
nm --synthetic callcont-fallthru.s.tmp
```

won't list a `puts@plt` symbol, which is what causes a `link_fdata.py` assertion:

> AssertionError: ERROR: symbol puts@plt is not defined in binary

On my dev AArch64 instance `--synthetic` does the trick.

BTW, run lines 4 and 6 appear identical when inspected (`-###`).

https://github.com/llvm/llvm-project/pull/135867
[llvm-branch-commits] [llvm] [BOLT][test] Fix callcont-fallthru.s after #129481 (PR #135867)
paschalis-mpeis wrote:

Thanks a lot both! In case there's some delay in resolving this edge case, may I suggest temporarily disabling this test on AArch64 until a more consistent workaround is in place?

https://github.com/llvm/llvm-project/pull/135867
[llvm-branch-commits] [llvm] [BOLT][test] Fix callcont-fallthru.s after #129481 (PR #135867)
paschalis-mpeis wrote:

Hey @yota9, thanks for the suggestions!

Indeed, the PLT entries exist in both binaries. For example, running:

```
build/bin/llvm-objdump -d -j .plt build/tools/bolt/test/X86/Output/callcont-fallthru.s.tmp
```

shows:

```
build/tools/bolt/test/X86/Output/callcont-fallthru.s.tmp: file format elf64-x86-64

Disassembly of section .plt:

1430 <.plt>:
    1430: ff 35 f2 20 00 00    pushq 0x20f2(%rip)    # 0x3528
    1436: ff 25 f4 20 00 00    jmpq *0x20f4(%rip)    # 0x3530
    143c: 0f 1f 40 00          nopl (%rax)

1440 :
    1440: ff 25 f2 20 00 00    jmpq *0x20f2(%rip)    # 0x3538
    1446: 68 00 00 00 00       pushq $0x0
    144b: e9 e0 ff ff ff       jmp 0x1430 <.plt>
```

I noticed some code differences in the binaries but I haven't looked deeper into it.

**It looks like the difference is in GNU nm, though:**

On my AArch64 dev-machine, `nm --synthetic` lists `puts@plt`, but when I copy that same binary over to our upcoming AArch64 buildbot, it's missing. Conversely, `nm --synthetic` on the buildbot does not list `puts@plt`, but when I copy that binary to the dev-machine it does appear.

---

I too agree that relying on GNU tools is not ideal; more generally, neither is relying on any binary tool that does not come from the built LLVM revision. However, `llvm-nm` does not seem to support `--synthetic`.

BTW, thanks for all the help! I'm focused on AArch64, so while I may be involved to some extent with this, I'll let Amir drive the fix. That's why I'm looking for a code owner to get #137831 stamped. :)

(also cc'ing: @aaupov, @maksfb)

https://github.com/llvm/llvm-project/pull/135867
[llvm-branch-commits] [llvm] [BOLT][test] Fix callcont-fallthru.s after #129481 (PR #135867)
paschalis-mpeis wrote: Great. A quick way to use an llvm tool could be: ```bash llvm-objdump -d -j .plt %t | grep @plt ``` This produces output similar to what `nm --synthetic` produces (when it works): ```bash 1430 : ``` You'll need ofc to tweak `link_fdata` to properly parse symbol+address: https://github.com/llvm/llvm-project/blob/fb8d61d8163f1439749a62be614c09215fe65e9f/bolt/test/link_fdata.py#L96-L99 Not sure of any cleaner approach? (@yota9, @MaskRay) https://github.com/llvm/llvm-project/pull/135867 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
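A minimal sketch of the symbol+address parsing mentioned above, in case it helps the discussion. It is not the actual `link_fdata` code: the function name `parse_plt_symbols`, the regular expression, and the sample text are all assumptions, and the sample assumes `llvm-objdump` prints PLT headers in the form `<address> <name@plt>:`.

```python
import re

# Assumed header shape, e.g. "0000000000001440 <puts@plt>:".
PLT_HEADER_RE = re.compile(r"^\s*([0-9a-fA-F]+)\s+<([^>]+@plt)>:\s*$")


def parse_plt_symbols(objdump_output):
    """Return {symbol: address} for every '<name@plt>:' header line."""
    symbols = {}
    for line in objdump_output.splitlines():
        m = PLT_HEADER_RE.match(line)
        if m:
            # Addresses are printed in hex without a 0x prefix.
            symbols[m.group(2)] = int(m.group(1), 16)
    return symbols


# Hypothetical disassembly text, used only to exercise the parser.
sample = """\
Disassembly of section .plt:

0000000000001430 <.plt>:
    1430: ff 35 f2 20 00 00   pushq 0x20f2(%rip)

0000000000001440 <puts@plt>:
    1440: ff 25 f2 20 00 00   jmpq *0x20f2(%rip)
"""

plt_symbols = parse_plt_symbols(sample)
print(plt_symbols)
```

The `.plt` section header itself is skipped on purpose, since only `name@plt` entries are of interest here.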
[llvm-branch-commits] [llvm] [BOLT][test] Fix callcont-fallthru.s after #129481 (PR #135867)
paschalis-mpeis wrote: Hey folks, any updates on this? I spent some time experimenting with @MaskRay's suggestion. I used a mock libc shared object that had a `puts` symbol. There are indeed no unresolved symbols now; however, GNU `nm` still doesn't show a PLT entry when using `--synthetic`. https://github.com/llvm/llvm-project/pull/135867 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT][test] Fix callcont-fallthru.s after #129481 (PR #135867)
paschalis-mpeis wrote: Hey @yota9, thanks for the input. I tried something similar. Even when I use `stub.c` and link it with: ```diff -# RUN: %clang %cflags -fpic -shared -xc /dev/null -o %t.so -## Link against a DSO to ensure PLT entries. +# RUN: %clang %cflags %p/../Inputs/stub.c -fPIC -shared -o %t.so ``` then running GNU nm: ```bash nm %t --synthetic ``` would emit only ``` U puts ``` which `link_fdata` rejects. On some other machines though, GNU nm emits: ```txt U puts 1234 T puts@plt ``` which works well. In both cases it was the same nm driver version. TMU this inconsistency was reported on x86 machines too. --- I might've missed something on my end. I briefly discussed this with Amir ([see discord](https://discord.com/channels/636084430946959380/930647188944613406/1365431973505531905)) as I'm trying to unblock our AArch64 buildbot. We figured it's fine to disable this test on AArch64 until the issue gets resolved. Would you mind taking a look at #137831 and considering accepting it? https://github.com/llvm/llvm-project/pull/135867 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT][test] Fix callcont-fallthru.s after #129481 (PR #135867)
paschalis-mpeis wrote: That is perfect and the way we should go forward with this – thanks @yota9. The problem is that the test is flaky: it passes on some systems but fails on others. Using `XFAIL` would make my AArch64 buildbot happy, but it would cause failures (`Unexpectedly Passed`) on other AArch64 machines I've tested. 🤷♂️ That's why I propose restricting this to X86 for now, as a way to unblock us in the meantime: - #137831 --- https://github.com/llvm/llvm-project/blob/42d76a34b20487a0e24f0b9f57612d4e67305568/bolt/test/X86/callcont-fallthru.s#L3-L5 https://github.com/llvm/llvm-project/pull/135867 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT][test] Fix callcont-fallthru.s after #129481 (PR #135867)
paschalis-mpeis wrote: Yeap, good idea. I could add `XFAIL` and modify the run line like: ``` # RUN: link_fdata %s %t %t.patplt PREAGGPLT --synthetic --nmtool=llvm-nm ``` The differences would be: - with `REQUIRES` we won't cross-run this x86 lit test on AArch64 (as I do currently in #137831) - with `XFAIL` + `llvm-nm` the test would be expected to fail on both architectures. But once your work is merged, it would unexpectedly pass, which would break the test and prompt us to update it --- I'm happy to proceed with this as well. https://github.com/llvm/llvm-project/pull/135867 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT][test] Fix callcont-fallthru.s after #129481 (PR #135867)
paschalis-mpeis wrote: Yes, and that'd actually be better so we don't depend on whatever host GNU nm the machine has. Based on [this](https://github.com/llvm/llvm-project/pull/135867#issuecomment-2841721288), I'd say @aaupov intends to make this change too. I think he's away – let's see what he says once back. https://github.com/llvm/llvm-project/pull/135867 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Skip out-of-range pending relocations (PR #116964)
paschalis-mpeis wrote: Hey Maksim, Extending Relocations is even better. Thanks for the suggestion and the review. Before proceeding, and regarding the size overheads, I want to highlight an inconsistency with LLVM’s ObjectFile, where the type is 64 bits ([see here](https://github.com/llvm/llvm-project/blob/16cd5cdf4d6387e34d2bb723bc26c331c8d89d75/llvm/include/llvm/Object/ObjectFile.h#L628)). We only have 3 inlined sites of this in `RewriteInstance` (eg one is [here](https://github.com/llvm/llvm-project/blob/3c357a49d61e4c81a1ac016502ee504521bc8dda/bolt/lib/Rewrite/RewriteInstance.cpp#L2408)). If you agree, I'll proceed with an NFCI change, adding assertion overflow checks at these sites. https://github.com/llvm/llvm-project/pull/116964 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
@@ -11,4 +11,4 @@ CHECK-SPE-NO-LBR: PERF2BOLT: Starting data aggregation job RUN: perf record -e cycles -q -o %t.perf.data -- %t.exe RUN: not perf2bolt -p %t.perf.data -o %t.perf.boltdata --spe %t.exe 2>&1 | FileCheck %s --check-prefix=CHECK-SPE-LBR -CHECK-SPE-LBR: PERF2BOLT-ERROR: Arm SPE mode is combined only with BasicAggregation. +CHECK-SPE-LBR: PERF2BOLT: spawning perf job to read SPE branch events paschalis-mpeis wrote: This test was not failing on my machine (ie not needing `not perf2bolt`), which made me realize that these tests might be fragile on buildbots and local environments. So I suggest using `perf record` for both non-lbr/lbr testing, and overriding the exit value with a sub shell like so: ```diff -## Check that Arm SPE mode is available on AArch64 with BasicAggregation. +## Check that Arm SPE mode is available on AArch64. REQUIRES: system-linux,perf,target=aarch64{{.*}} RUN: %clang %cflags %p/../../Inputs/asm_foo.s %p/../../Inputs/asm_main.c -o %t.exe -RUN: touch %t.empty.perf.data -RUN: perf2bolt -p %t.empty.perf.data -o %t.perf.boltdata --nl --spe --pa %t.exe 2>&1 | FileCheck %s --check-prefix=CHECK-SPE-NO-LBR +RUN: perf record -e cycles -q -o %t.perf.data -- %t.exe 2> /dev/null -CHECK-SPE-NO-LBR: PERF2BOLT: Starting data aggregation job +RUN: (perf2bolt -p %t.perf.data -o %t.perf.boltdata --spe --nl %t.exe 2> /dev/null; exit 0) | FileCheck %s --check-prefix=CHECK-SPE-NO-LBR -RUN: perf record -e cycles -q -o %t.perf.data -- %t.exe -RUN: not perf2bolt -p %t.perf.data -o %t.perf.boltdata --spe %t.exe 2>&1 | FileCheck %s --check-prefix=CHECK-SPE-LBR +RUN: (perf2bolt -p %t.perf.data -o %t.perf.boltdata --spe %t.exe 2> /dev/null; exit 0) | FileCheck %s --check-prefix=CHECK-SPE-LBR -CHECK-SPE-LBR: PERF2BOLT: spawning perf job to read SPE branch events +CHECK-SPE-NO-LBR: PERF2BOLT: spawning perf job to read SPE branch events (non-lbr) +CHECK-SPE-LBR: PERF2BOLT: spawning perf job to read SPE brstack events ``` (the diff covers the entire test; 
lbr/non-lbr) https://github.com/llvm/llvm-project/pull/129231 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
@@ -1034,7 +1034,11 @@ ErrorOr DataAggregator::parseLBREntry() { if (std::error_code EC = MispredStrRes.getError()) return EC; StringRef MispredStr = MispredStrRes.get(); - if (MispredStr.size() != 1 || + // SPE brstack mispredicted flags might be two characters long: 'PN' or 'MN'. + bool ProperStrSize = (MispredStr.size() == 2 && opts::ArmSPE) + ? (MispredStr[1] == 'N') + : (MispredStr.size() == 1); + if (!ProperStrSize || (MispredStr[0] != 'P' && MispredStr[0] != 'M' && MispredStr[0] != '-')) { reportError("expected single char for mispred bit"); paschalis-mpeis wrote: Here you could show a relevant message for each case, e.g. the error might be specific to SPE's taken bit, or both mispred and taken parsing errors may occur. You could extract the earlier `MispredStr[0]` checks into another boolean (say `PredictionBitErr`) and reuse it. Also, maybe `ProperStrSize` could be specialized into something like `SpeTakenBitErr`. https://github.com/llvm/llvm-project/pull/129231 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
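To illustrate the per-case messages suggested above, here is a small Python sketch of the same validation logic. It is only a model of the C++ checks in the quoted diff, not BOLT code: the function name `check_mispred_field` and every message except "expected single char for mispred bit" are made up for the example.

```python
def check_mispred_field(field, arm_spe):
    """Validate a brstack mispred field and return a list of error strings.

    An empty list means the field is accepted. The rules follow the quoted
    diff: the first character must be 'P', 'M' or '-', and with SPE enabled
    a two-character field is accepted only if the second character is 'N'.
    """
    if not field:
        return ["empty mispred field"]  # hypothetical message

    errors = []
    # Prediction bit: predicted, mispredicted, or unknown.
    if field[0] not in ("P", "M", "-"):
        errors.append(f"unexpected prediction bit '{field[0]}'")  # hypothetical

    if arm_spe and len(field) == 2:
        # SPE may append a not-taken flag, giving 'PN' or 'MN'.
        if field[1] != "N":
            errors.append(f"unexpected SPE taken bit '{field[1]}'")  # hypothetical
    elif len(field) != 1:
        errors.append("expected single char for mispred bit")
    return errors


print(check_mispred_field("PN", arm_spe=True))
print(check_mispred_field("PN", arm_spe=False))
print(check_mispred_field("XZ", arm_spe=True))
```

Collecting the errors in a list is a design choice for the sketch; it lets a prediction-bit error and a taken-bit error be reported together, which is the case the comment raises.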
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/129231 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
@@ -88,6 +89,45 @@ struct PerfSpeEventsTestHelper : public testing::Test { return SampleSize == DA.BasicSamples.size(); } + + /// Compare LBREntries + bool checkLBREntry(const LBREntry &Lhs, const LBREntry &Rhs) { +return Lhs.From == Rhs.From && Lhs.To == Rhs.To && + Lhs.Mispred == Rhs.Mispred; + } + + /// Parse and check SPE brstack as LBR + void parseAndCheckBrstackEvents( + uint64_t PID, + const std::vector> &ExpectedSamples) { +int NumSamples = 0; + +DataAggregator DA(""); +DA.ParsingBuf = opts::ReadPerfEvents; +DA.BC = BC.get(); +DataAggregator::MMapInfo MMap; +DA.BinaryMMapInfo.insert(std::make_pair(PID, MMap)); + +// Process buffer. +while (DA.hasData()) { paschalis-mpeis wrote: Would it be possible to call `parseBranchEvents` here instead of having a loop and replicating logic? And then check any relevant buffers (ie `DA.BranchLBRs` in a loop) ? (probably no need to FallthroughLBRs at this point). https://github.com/llvm/llvm-project/pull/129231 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
@@ -180,13 +178,16 @@ void DataAggregator::start() { if (opts::ArmSPE) { if (!opts::BasicAggregation) { - errs() << "PERF2BOLT-ERROR: Arm SPE mode is combined only with " -"BasicAggregation.\n"; - exit(1); + // pidfrom_ip to_ippredicted? + // 12345 0x123/0x456/P/-/-/8/RET/- + launchPerfProcess("SPE branch events", MainEventsPPI, paschalis-mpeis wrote: ```suggestion launchPerfProcess("SPE brstack events", MainEventsPPI, ``` https://github.com/llvm/llvm-project/pull/129231 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/129231 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
@@ -180,13 +178,16 @@ void DataAggregator::start() { if (opts::ArmSPE) { if (!opts::BasicAggregation) { - errs() << "PERF2BOLT-ERROR: Arm SPE mode is combined only with " -"BasicAggregation.\n"; - exit(1); + // pidfrom_ip to_ippredicted? + // 12345 0x123/0x456/P/-/-/8/RET/- paschalis-mpeis wrote: Not sure if something like the below is more appropriate? BTW the other methods do not list any examples. ```suggestion // pidfrom_ip to_ippredicted/missed not-taken? // 12345 0x123/0x456/PN/-/-/8/RET/- ``` https://github.com/llvm/llvm-project/pull/129231 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
@@ -88,6 +89,45 @@ struct PerfSpeEventsTestHelper : public testing::Test { return SampleSize == DA.BasicSamples.size(); } + + /// Compare LBREntries + bool checkLBREntry(const LBREntry &Lhs, const LBREntry &Rhs) { +return Lhs.From == Rhs.From && Lhs.To == Rhs.To && + Lhs.Mispred == Rhs.Mispred; + } + + /// Parse and check SPE brstack as LBR + void parseAndCheckBrstackEvents( + uint64_t PID, + const std::vector> &ExpectedSamples) { +int NumSamples = 0; + +DataAggregator DA(""); +DA.ParsingBuf = opts::ReadPerfEvents; +DA.BC = BC.get(); +DataAggregator::MMapInfo MMap; +DA.BinaryMMapInfo.insert(std::make_pair(PID, MMap)); + +// Process buffer. +while (DA.hasData()) { paschalis-mpeis wrote: Right. To do this we need to expand our context further, creating mock functions with valid addresses, allowing those branches to be parsed+stored in `BranchLBRs`. That may be too much work; if that's the case we can skip. https://github.com/llvm/llvm-project/pull/129231 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
https://github.com/paschalis-mpeis commented: Thanks a lot for your work Adam! I commented on some changes and nits. Also noting that for now this PR is stacked on top of #120741. https://github.com/llvm/llvm-project/pull/129231 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Skip out-of-range pending relocations (PR #116964)
paschalis-mpeis wrote: I've put up a separate patch that makes Relocation type 32b and adds an `Optional` field: - #130792 https://github.com/llvm/llvm-project/pull/116964 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/129231 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
@@ -113,6 +153,37 @@ TEST_F(PerfSpeEventsTestHelper, SpeBranches) { EXPECT_TRUE(checkEvents(1234, 10, {"branches-spe:"})); } +TEST_F(PerfSpeEventsTestHelper, SpeBranchesWithBrstack) { + // Check perf input with SPE branch events as brstack format. + // Example collection command: + // ``` + // perf record -e 'arm_spe_0/branch_filter=1/u' -- BINARY + // ``` + // How Bolt extracts the branch events: + // ``` + // perf script -F pid,brstack --itrace=bl + // ``` + + opts::ArmSPE = true; + opts::ReadPerfEvents = " 1234 0xa001/0xa002/PN/-/-/10/COND/-\n" + " 1234 0xb001/0xb002/P/-/-/4/RET/-\n" + " 1234 0xc001/0xc002/P/-/-/13/-/-\n" + " 1234 0xd001/0xd002/M/-/-/7/RET/-\n" + " 1234 0xe001/0xe002/P/-/-/14/RET/-\n" + " 1234 0xf001/0xf002/MN/-/-/8/COND/-\n"; + + LBREntry Entry1 = {0xa001, 0xa002, false}; + LBREntry Entry2 = {0xb001, 0xb002, false}; + LBREntry Entry3 = {0xc001, 0xc002, false}; + LBREntry Entry4 = {0xd001, 0xd002, true}; + LBREntry Entry5 = {0xe001, 0xe002, false}; + LBREntry Entry6 = {0xf001, 0xf002, true}; + std::vector> ExpectedSamples = { + {{Entry1}}, {{Entry2}}, {{Entry3}}, {{Entry4}}, {{Entry5}}, {{Entry6}}, + }; paschalis-mpeis wrote: nit: could expand directly the entries here without using the LBREntry variables https://github.com/llvm/llvm-project/pull/129231 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Add initial support for SPE brstack format (PR #129231)
@@ -180,13 +178,16 @@ void DataAggregator::start() { if (opts::ArmSPE) { if (!opts::BasicAggregation) { - errs() << "PERF2BOLT-ERROR: Arm SPE mode is combined only with " -"BasicAggregation.\n"; - exit(1); + // pidfrom_ip to_ippredicted? + // 12345 0x123/0x456/P/-/-/8/RET/- + launchPerfProcess("SPE branch events", MainEventsPPI, +"script -F pid,brstack --itrace=bl", +/*Wait = */ false); +} else { + launchPerfProcess("SPE brstack events", MainEventsPPI, paschalis-mpeis wrote: ```suggestion launchPerfProcess("SPE branch events (non-lbr)", MainEventsPPI, ``` https://github.com/llvm/llvm-project/pull/129231 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Skip out-of-range pending relocations (PR #116964)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/116964 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Skip out-of-range pending relocations (PR #116964)
paschalis-mpeis wrote: - force-push to stack this PR on top of #127812. - add code & test to ensure that we skip pending relocations only when `-force-patch` was set https://github.com/llvm/llvm-project/pull/116964 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Skip out-of-range pending relocations (PR #116964)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/116964 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Skip out-of-range pending relocations (PR #116964)
https://github.com/paschalis-mpeis ready_for_review https://github.com/llvm/llvm-project/pull/116964 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Skip out-of-range pending relocations (PR #116964)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/116964 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Skip out-of-range pending relocations (PR #116964)
@@ -197,6 +218,10 @@ void BinarySection::flushPendingRelocations(raw_pwrite_stream &OS, } clearList(PendingRelocations); + + if (SkippedPendingRelocations > 0) +LLVM_DEBUG(dbgs() << "BOLT-INFO: Skipped " << SkippedPendingRelocations + << " pending relocations as they were out of range\n"); paschalis-mpeis wrote: Done. https://github.com/llvm/llvm-project/pull/116964 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Skip out-of-range pending relocations (PR #116964)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/116964 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Skip out-of-range pending relocations (PR #116964)
paschalis-mpeis wrote: Thanks for your review, Maksim. Force-pushed to rebase on top of #133085 and added commit b4e906c to address the review comments. https://github.com/llvm/llvm-project/pull/116964 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Improve file handling in NFC-Mode (PR #146513)
https://github.com/paschalis-mpeis updated https://github.com/llvm/llvm-project/pull/146513 >From 625f9ee79af68a121afd92e06d9b4f91007a9c38 Mon Sep 17 00:00:00 2001 From: Paschalis Mpeis Date: Tue, 1 Jul 2025 12:37:31 +0100 Subject: [PATCH 1/4] [BOLT] Improve file handling in NFC-Mode This patch introduce the following improvements: - Catch an exception when the CMakeCache.txt is not present - Bail out gracefully when llvm-bolt did not build successfully the current or previous revision. --- bolt/utils/nfc-check-setup.py | 26 +++--- 1 file changed, 19 insertions(+), 7 deletions(-) diff --git a/bolt/utils/nfc-check-setup.py b/bolt/utils/nfc-check-setup.py index 7d634d7a88b83..2ff27e5c40b63 100755 --- a/bolt/utils/nfc-check-setup.py +++ b/bolt/utils/nfc-check-setup.py @@ -91,18 +91,26 @@ def main(): source_dir = None # find the repo directory -with open(f"{args.build_dir}/CMakeCache.txt") as f: -for line in f: -m = re.match(r"LLVM_SOURCE_DIR:STATIC=(.*)", line) -if m: -source_dir = m.groups()[0] -if not source_dir: -sys.exit("Source directory is not found") +try: +CMCacheFilename=f"{args.build_dir}/CMakeCache.txt" +with open(CMCacheFilename) as f: +for line in f: +m = re.match(r"LLVM_SOURCE_DIR:STATIC=(.*)", line) +if m: +source_dir = m.groups()[0] +if not source_dir: +raise Exception(f"Source directory not found: '{CMCacheFilename}'") +except Exception as e: +sys.exit(e) # build the current commit subprocess.run( shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir ) + +if not os.path.exists(bolt_path): +sys.exit(f"Failed to build the current revision: '{bolt_path}'") + # rename llvm-bolt os.replace(bolt_path, f"{bolt_path}.new") # memorize the old hash for logging @@ -133,11 +141,15 @@ def main(): subprocess.run(shlex.split(f"git checkout -f {args.cmp_rev}"), cwd=source_dir) # get the parent commit hash for logging new_ref = get_git_ref_or_rev(source_dir) + # build the previous commit subprocess.run( shlex.split("cmake --build . 
--target llvm-bolt"), cwd=args.build_dir ) + # rename llvm-bolt +if not os.path.exists(bolt_path): +sys.exit(f"Failed to build the previous revision: '{bolt_path}'") os.replace(bolt_path, f"{bolt_path}.old") # symlink llvm-bolt-wrapper >From 26e7b9f05f8a365f117f14a0975a232e1ec74202 Mon Sep 17 00:00:00 2001 From: Paschalis Mpeis Date: Tue, 1 Jul 2025 12:50:08 +0100 Subject: [PATCH 2/4] python formatter and nits --- bolt/utils/nfc-check-setup.py | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/bolt/utils/nfc-check-setup.py b/bolt/utils/nfc-check-setup.py index 2ff27e5c40b63..22e8cc646a1c5 100755 --- a/bolt/utils/nfc-check-setup.py +++ b/bolt/utils/nfc-check-setup.py @@ -92,7 +92,7 @@ def main(): source_dir = None # find the repo directory try: -CMCacheFilename=f"{args.build_dir}/CMakeCache.txt" +CMCacheFilename = f"{args.build_dir}/CMakeCache.txt" with open(CMCacheFilename) as f: for line in f: m = re.match(r"LLVM_SOURCE_DIR:STATIC=(.*)", line) @@ -104,6 +104,7 @@ def main(): sys.exit(e) # build the current commit +print ("NFC-Setup: Building current revision..") subprocess.run( shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir ) @@ -143,6 +144,7 @@ def main(): new_ref = get_git_ref_or_rev(source_dir) # build the previous commit +print ("NFC-Setup: Building previous revision..") subprocess.run( shlex.split("cmake --build . 
--target llvm-bolt"), cwd=args.build_dir ) >From ca36aa02effc6c5e5da140940a5c55d4183e0422 Mon Sep 17 00:00:00 2001 From: Paschalis Mpeis Date: Tue, 1 Jul 2025 12:55:46 +0100 Subject: [PATCH 3/4] code formatter (2) --- bolt/utils/nfc-check-setup.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/bolt/utils/nfc-check-setup.py b/bolt/utils/nfc-check-setup.py index 22e8cc646a1c5..d3248050f16e3 100755 --- a/bolt/utils/nfc-check-setup.py +++ b/bolt/utils/nfc-check-setup.py @@ -104,7 +104,7 @@ def main(): sys.exit(e) # build the current commit -print ("NFC-Setup: Building current revision..") +print("NFC-Setup: Building current revision..") subprocess.run( shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir ) @@ -144,7 +144,7 @@ def main(): new_ref = get_git_ref_or_rev(source_dir) # build the previous commit -print ("NFC-Setup: Building previous revision..") +print("NFC-Setup: Building previous revision..") subprocess.run( shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir ) >From 09363a
[llvm-branch-commits] [llvm] [BOLT] Improve exception handling in NFC-Mode (PR #146513)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/146513 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Improve exception handling in NFC-Mode (PR #146513)
paschalis-mpeis wrote: Force-pushed to rebase, since the parent PR now has a `--create-wrapper` flag. In the latest patch, `switch_back` is a function called whenever something goes wrong after checking out the previous revision, i.e.: - building the old binary fails, or - setting up the wrapper fails. I also delete llvm-bolt at the start, since we rebuild it for the current revision anyway. https://github.com/llvm/llvm-project/pull/146513 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
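A rough sketch of what such a `switch_back` helper might look like, for readers who have not opened the patch. This is an assumption-laden illustration rather than the code in the PR: the signature, the `run` parameter (added here only so the helper can be exercised without a git checkout), and the order of the two git commands are all guesses.

```python
import shlex
import subprocess


def switch_back(source_dir, old_ref, stash, run=subprocess.run):
    """Return the checkout in source_dir to old_ref and restore stashed work.

    Assumed behavior: check out the original ref first, then pop the stash
    if one was created before switching revisions.
    """
    run(shlex.split(f"git checkout {old_ref}"), cwd=source_dir)
    if stash:
        run(shlex.split("git stash pop"), cwd=source_dir)


# Dry run with a recording stand-in for subprocess.run (no git needed).
recorded = []


def fake_run(cmd, cwd):
    recorded.append((cmd, cwd))


switch_back("/path/to/llvm-project", "abc123", stash=True, run=fake_run)
print(recorded)
```

Calling one helper from every failure path (old build fails, wrapper setup fails) keeps the repository from being left on the previous revision when the script exits early.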
[llvm-branch-commits] [llvm] [BOLT][NFC] Update nfc-check-setup.py guidance (PR #146659)
@@ -156,9 +158,8 @@ def main(): os.replace(bolt_path, f"{bolt_path}.old") print( -f"Build directory {args.build_dir} is ready to run BOLT tests, e.g.\n" -"\tbin/llvm-lit -sv tools/bolt/test\nor\n" -"\tbin/llvm-lit -sv tools/bolttests" +f"Build directory {args.build_dir} is ready for NFC-Mode comparison " +"between the two revisions." paschalis-mpeis wrote: Will do, thanks! Setting up the wrapper now stays under a flag, so I'll reintroduce this example when I rebase the patch. https://github.com/llvm/llvm-project/pull/146659 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Improve file handling in NFC-Mode (PR #146513)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/146513 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT] Improve file handling in NFC-Mode (PR #146513)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/146513 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT][NFC] Update nfc-check-setup.py guidance (PR #146659)
https://github.com/paschalis-mpeis created https://github.com/llvm/llvm-project/pull/146659 None >From 8cc8661aa2df5e5a6e04752083335a865b0178fe Mon Sep 17 00:00:00 2001 From: Paschalis Mpeis Date: Wed, 2 Jul 2025 10:35:22 +0100 Subject: [PATCH] [BOLT][NFC] Update nfc-check-setup.py guidance --- bolt/utils/nfc-check-setup.py | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/bolt/utils/nfc-check-setup.py b/bolt/utils/nfc-check-setup.py index 6cf1df5c177ae..87a134aea37ca 100755 --- a/bolt/utils/nfc-check-setup.py +++ b/bolt/utils/nfc-check-setup.py @@ -47,8 +47,10 @@ def main(): parser = argparse.ArgumentParser( description=textwrap.dedent( """ -This script builds two versions of BOLT (with the current and -previous revision). +This script builds two versions of BOLT: +llvm-bolt.new, using the current revision, and llvm-bolt.old using +the previous revision. These can be used to check whether the +current revision changes BOLT's functional behavior. """ ) ) @@ -156,9 +158,8 @@ def main(): os.replace(bolt_path, f"{bolt_path}.old") print( -f"Build directory {args.build_dir} is ready to run BOLT tests, e.g.\n" -"\tbin/llvm-lit -sv tools/bolt/test\nor\n" -"\tbin/llvm-lit -sv tools/bolttests" +f"Build directory {args.build_dir} is ready for NFC-Mode comparison " +"between the two revisions." ) ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] [BOLT][NFC] Update nfc-check-setup.py guidance (PR #146659)
https://github.com/paschalis-mpeis ready_for_review https://github.com/llvm/llvm-project/pull/146659 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Users/paschalis mpeis/nfc check improve file handling (PR #146513)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/146513 ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
[llvm-branch-commits] [llvm] Users/paschalis mpeis/nfc check improve file handling (PR #146513)
https://github.com/paschalis-mpeis updated https://github.com/llvm/llvm-project/pull/146513 >From d55f8a334f1cc0ee72a7a8bcfdbb1695222ef91a Mon Sep 17 00:00:00 2001 From: Paschalis Mpeis Date: Tue, 1 Jul 2025 12:37:31 +0100 Subject: [PATCH 1/2] [BOLT] Improve file handling in NFC-Mode This patch introduce the following improvements: - Catch an exception when the CMakeCache.txt is not present - Bail out gracefully when llvm-bolt did not build successfully the current or previous revision. --- bolt/utils/nfc-check-setup.py | 27 --- 1 file changed, 20 insertions(+), 7 deletions(-) diff --git a/bolt/utils/nfc-check-setup.py b/bolt/utils/nfc-check-setup.py index 4884e616a0fa3..812689bdce75a 100755 --- a/bolt/utils/nfc-check-setup.py +++ b/bolt/utils/nfc-check-setup.py @@ -80,18 +80,26 @@ def main(): source_dir = None # find the repo directory -with open(f"{args.build_dir}/CMakeCache.txt") as f: -for line in f: -m = re.match(r"LLVM_SOURCE_DIR:STATIC=(.*)", line) -if m: -source_dir = m.groups()[0] -if not source_dir: -sys.exit("Source directory is not found") +try: +CMCacheFilename=f"{args.build_dir}/CMakeCache.txt" +with open(CMCacheFilename) as f: +for line in f: +m = re.match(r"LLVM_SOURCE_DIR:STATIC=(.*)", line) +if m: +source_dir = m.groups()[0] +if not source_dir: +raise Exception(f"Source directory not found: '{CMCacheFilename}'") +except Exception as e: +sys.exit(e) # build the current commit subprocess.run( shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir ) + +if not os.path.exists(bolt_path): +sys.exit(f"Failed to build the current revision: '{bolt_path}'") + # rename llvm-bolt os.replace(bolt_path, f"{bolt_path}.new") # memorize the old hash for logging @@ -122,12 +130,17 @@ def main(): subprocess.run(shlex.split(f"git checkout -f {args.cmp_rev}"), cwd=source_dir) # get the parent commit hash for logging new_ref = get_git_ref_or_rev(source_dir) + # build the previous commit subprocess.run( shlex.split("cmake --build . 
--target llvm-bolt"), cwd=args.build_dir ) + # rename llvm-bolt +if not os.path.exists(bolt_path): +sys.exit(f"Failed to build the previous revision: '{bolt_path}'") os.replace(bolt_path, f"{bolt_path}.old") + if args.switch_back: if stash: subprocess.run(shlex.split("git stash pop"), cwd=source_dir) >From 5bcb1c9b65e000ba5b2299cff49743820b0af430 Mon Sep 17 00:00:00 2001 From: Paschalis Mpeis Date: Tue, 1 Jul 2025 12:50:08 +0100 Subject: [PATCH 2/2] python formatter and nits --- bolt/utils/nfc-check-setup.py | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/bolt/utils/nfc-check-setup.py b/bolt/utils/nfc-check-setup.py index 812689bdce75a..692b3e48e0d44 100755 --- a/bolt/utils/nfc-check-setup.py +++ b/bolt/utils/nfc-check-setup.py @@ -81,7 +81,7 @@ def main(): source_dir = None # find the repo directory try: -CMCacheFilename=f"{args.build_dir}/CMakeCache.txt" +CMCacheFilename = f"{args.build_dir}/CMakeCache.txt" with open(CMCacheFilename) as f: for line in f: m = re.match(r"LLVM_SOURCE_DIR:STATIC=(.*)", line) @@ -93,6 +93,7 @@ def main(): sys.exit(e) # build the current commit +print ("NFC-Setup: Building current revision..") subprocess.run( shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir ) @@ -132,6 +133,7 @@ def main(): new_ref = get_git_ref_or_rev(source_dir) # build the previous commit +print ("NFC-Setup: Building previous revision..") subprocess.run( shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir ) ___ llvm-branch-commits mailing list llvm-branch-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
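Since the flattened diff above loses its indentation, here is a readable sketch of the `CMakeCache.txt` lookup the patch adds. It is not a copy of `nfc-check-setup.py`: the function name `find_llvm_source_dir` is hypothetical, it raises exceptions instead of calling `sys.exit`, and it returns on the first match whereas the patch keeps the last one.

```python
import re


def find_llvm_source_dir(cmake_cache_path):
    """Return the LLVM_SOURCE_DIR value recorded in a CMakeCache.txt file.

    Raises FileNotFoundError if the cache file does not exist and
    ValueError if it has no LLVM_SOURCE_DIR entry.
    """
    with open(cmake_cache_path) as f:
        for line in f:
            # '.' does not match the newline, so the value comes back clean.
            m = re.match(r"LLVM_SOURCE_DIR:STATIC=(.*)", line)
            if m:
                return m.group(1)
    raise ValueError(f"Source directory not found: '{cmake_cache_path}'")
```

A caller could wrap this in `try`/`except (OSError, ValueError) as e: sys.exit(e)` to get the graceful bail-out that the commit message describes.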
[llvm-branch-commits] [llvm] [BOLT] Improve file handling in NFC-Mode (PR #146513)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/146513
[llvm-branch-commits] [llvm] [BOLT] Improve file handling in NFC-Mode (PR #146513)
https://github.com/paschalis-mpeis updated https://github.com/llvm/llvm-project/pull/146513

>From d55f8a334f1cc0ee72a7a8bcfdbb1695222ef91a Mon Sep 17 00:00:00 2001
From: Paschalis Mpeis
Date: Tue, 1 Jul 2025 12:37:31 +0100
Subject: [PATCH 1/3] [BOLT] Improve file handling in NFC-Mode

This patch introduce the following improvements:
- Catch an exception when the CMakeCache.txt is not present
- Bail out gracefully when llvm-bolt did not build successfully the current or previous revision.
---
 bolt/utils/nfc-check-setup.py | 27 ---
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/bolt/utils/nfc-check-setup.py b/bolt/utils/nfc-check-setup.py
index 4884e616a0fa3..812689bdce75a 100755
--- a/bolt/utils/nfc-check-setup.py
+++ b/bolt/utils/nfc-check-setup.py
@@ -80,18 +80,26 @@ def main():
     source_dir = None
     # find the repo directory
-    with open(f"{args.build_dir}/CMakeCache.txt") as f:
-        for line in f:
-            m = re.match(r"LLVM_SOURCE_DIR:STATIC=(.*)", line)
-            if m:
-                source_dir = m.groups()[0]
-    if not source_dir:
-        sys.exit("Source directory is not found")
+    try:
+        CMCacheFilename=f"{args.build_dir}/CMakeCache.txt"
+        with open(CMCacheFilename) as f:
+            for line in f:
+                m = re.match(r"LLVM_SOURCE_DIR:STATIC=(.*)", line)
+                if m:
+                    source_dir = m.groups()[0]
+        if not source_dir:
+            raise Exception(f"Source directory not found: '{CMCacheFilename}'")
+    except Exception as e:
+        sys.exit(e)

     # build the current commit
     subprocess.run(
         shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir
     )
+
+    if not os.path.exists(bolt_path):
+        sys.exit(f"Failed to build the current revision: '{bolt_path}'")
+
     # rename llvm-bolt
     os.replace(bolt_path, f"{bolt_path}.new")
     # memorize the old hash for logging
@@ -122,12 +130,17 @@ def main():
     subprocess.run(shlex.split(f"git checkout -f {args.cmp_rev}"), cwd=source_dir)
     # get the parent commit hash for logging
     new_ref = get_git_ref_or_rev(source_dir)
+
     # build the previous commit
     subprocess.run(
         shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir
     )
+
     # rename llvm-bolt
+    if not os.path.exists(bolt_path):
+        sys.exit(f"Failed to build the previous revision: '{bolt_path}'")
     os.replace(bolt_path, f"{bolt_path}.old")
+
     if args.switch_back:
         if stash:
             subprocess.run(shlex.split("git stash pop"), cwd=source_dir)

>From 5bcb1c9b65e000ba5b2299cff49743820b0af430 Mon Sep 17 00:00:00 2001
From: Paschalis Mpeis
Date: Tue, 1 Jul 2025 12:50:08 +0100
Subject: [PATCH 2/3] python formatter and nits

---
 bolt/utils/nfc-check-setup.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/bolt/utils/nfc-check-setup.py b/bolt/utils/nfc-check-setup.py
index 812689bdce75a..692b3e48e0d44 100755
--- a/bolt/utils/nfc-check-setup.py
+++ b/bolt/utils/nfc-check-setup.py
@@ -81,7 +81,7 @@ def main():
     source_dir = None
     # find the repo directory
     try:
-        CMCacheFilename=f"{args.build_dir}/CMakeCache.txt"
+        CMCacheFilename = f"{args.build_dir}/CMakeCache.txt"
         with open(CMCacheFilename) as f:
             for line in f:
                 m = re.match(r"LLVM_SOURCE_DIR:STATIC=(.*)", line)
@@ -93,6 +93,7 @@ def main():
         sys.exit(e)

     # build the current commit
+    print ("NFC-Setup: Building current revision..")
     subprocess.run(
         shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir
     )
@@ -132,6 +133,7 @@ def main():
     new_ref = get_git_ref_or_rev(source_dir)

     # build the previous commit
+    print ("NFC-Setup: Building previous revision..")
     subprocess.run(
         shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir
     )

>From 2939a3ec27974aee1e7ca1f6e527e89b42512d8b Mon Sep 17 00:00:00 2001
From: Paschalis Mpeis
Date: Tue, 1 Jul 2025 12:55:46 +0100
Subject: [PATCH 3/3] code formatter (2)

---
 bolt/utils/nfc-check-setup.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/bolt/utils/nfc-check-setup.py b/bolt/utils/nfc-check-setup.py
index 692b3e48e0d44..497f3997545ff 100755
--- a/bolt/utils/nfc-check-setup.py
+++ b/bolt/utils/nfc-check-setup.py
@@ -93,7 +93,7 @@ def main():
         sys.exit(e)

     # build the current commit
-    print ("NFC-Setup: Building current revision..")
+    print("NFC-Setup: Building current revision..")
     subprocess.run(
         shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir
     )
@@ -133,7 +133,7 @@ def main():
     new_ref = get_git_ref_or_rev(source_dir)

     # build the previous commit
-    print ("NFC-Setup: Building previous revision..")
+    print("NFC-Setup: Building previous revision..")
     subprocess.run(
         shlex.s
[llvm-branch-commits] [llvm] [BOLT] Improve file handling in NFC-Mode (PR #146513)
https://github.com/paschalis-mpeis updated https://github.com/llvm/llvm-project/pull/146513

>From d55f8a334f1cc0ee72a7a8bcfdbb1695222ef91a Mon Sep 17 00:00:00 2001
From: Paschalis Mpeis
Date: Tue, 1 Jul 2025 12:37:31 +0100
Subject: [PATCH 1/4] [BOLT] Improve file handling in NFC-Mode

This patch introduce the following improvements:
- Catch an exception when the CMakeCache.txt is not present
- Bail out gracefully when llvm-bolt did not build successfully the current or previous revision.
---
 bolt/utils/nfc-check-setup.py | 27 ---
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/bolt/utils/nfc-check-setup.py b/bolt/utils/nfc-check-setup.py
index 4884e616a0fa3..812689bdce75a 100755
--- a/bolt/utils/nfc-check-setup.py
+++ b/bolt/utils/nfc-check-setup.py
@@ -80,18 +80,26 @@ def main():
     source_dir = None
     # find the repo directory
-    with open(f"{args.build_dir}/CMakeCache.txt") as f:
-        for line in f:
-            m = re.match(r"LLVM_SOURCE_DIR:STATIC=(.*)", line)
-            if m:
-                source_dir = m.groups()[0]
-    if not source_dir:
-        sys.exit("Source directory is not found")
+    try:
+        CMCacheFilename=f"{args.build_dir}/CMakeCache.txt"
+        with open(CMCacheFilename) as f:
+            for line in f:
+                m = re.match(r"LLVM_SOURCE_DIR:STATIC=(.*)", line)
+                if m:
+                    source_dir = m.groups()[0]
+        if not source_dir:
+            raise Exception(f"Source directory not found: '{CMCacheFilename}'")
+    except Exception as e:
+        sys.exit(e)

     # build the current commit
     subprocess.run(
         shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir
     )
+
+    if not os.path.exists(bolt_path):
+        sys.exit(f"Failed to build the current revision: '{bolt_path}'")
+
     # rename llvm-bolt
     os.replace(bolt_path, f"{bolt_path}.new")
     # memorize the old hash for logging
@@ -122,12 +130,17 @@ def main():
     subprocess.run(shlex.split(f"git checkout -f {args.cmp_rev}"), cwd=source_dir)
     # get the parent commit hash for logging
     new_ref = get_git_ref_or_rev(source_dir)
+
     # build the previous commit
     subprocess.run(
         shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir
     )
+
     # rename llvm-bolt
+    if not os.path.exists(bolt_path):
+        sys.exit(f"Failed to build the previous revision: '{bolt_path}'")
     os.replace(bolt_path, f"{bolt_path}.old")
+
     if args.switch_back:
         if stash:
             subprocess.run(shlex.split("git stash pop"), cwd=source_dir)

>From 5bcb1c9b65e000ba5b2299cff49743820b0af430 Mon Sep 17 00:00:00 2001
From: Paschalis Mpeis
Date: Tue, 1 Jul 2025 12:50:08 +0100
Subject: [PATCH 2/4] python formatter and nits

---
 bolt/utils/nfc-check-setup.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/bolt/utils/nfc-check-setup.py b/bolt/utils/nfc-check-setup.py
index 812689bdce75a..692b3e48e0d44 100755
--- a/bolt/utils/nfc-check-setup.py
+++ b/bolt/utils/nfc-check-setup.py
@@ -81,7 +81,7 @@ def main():
     source_dir = None
     # find the repo directory
     try:
-        CMCacheFilename=f"{args.build_dir}/CMakeCache.txt"
+        CMCacheFilename = f"{args.build_dir}/CMakeCache.txt"
         with open(CMCacheFilename) as f:
             for line in f:
                 m = re.match(r"LLVM_SOURCE_DIR:STATIC=(.*)", line)
@@ -93,6 +93,7 @@ def main():
         sys.exit(e)

     # build the current commit
+    print ("NFC-Setup: Building current revision..")
     subprocess.run(
         shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir
     )
@@ -132,6 +133,7 @@ def main():
     new_ref = get_git_ref_or_rev(source_dir)

     # build the previous commit
+    print ("NFC-Setup: Building previous revision..")
     subprocess.run(
         shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir
     )

>From 2939a3ec27974aee1e7ca1f6e527e89b42512d8b Mon Sep 17 00:00:00 2001
From: Paschalis Mpeis
Date: Tue, 1 Jul 2025 12:55:46 +0100
Subject: [PATCH 3/4] code formatter (2)

---
 bolt/utils/nfc-check-setup.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/bolt/utils/nfc-check-setup.py b/bolt/utils/nfc-check-setup.py
index 692b3e48e0d44..497f3997545ff 100755
--- a/bolt/utils/nfc-check-setup.py
+++ b/bolt/utils/nfc-check-setup.py
@@ -93,7 +93,7 @@ def main():
         sys.exit(e)

     # build the current commit
-    print ("NFC-Setup: Building current revision..")
+    print("NFC-Setup: Building current revision..")
     subprocess.run(
         shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir
     )
@@ -133,7 +133,7 @@ def main():
     new_ref = get_git_ref_or_rev(source_dir)

     # build the previous commit
-    print ("NFC-Setup: Building previous revision..")
+    print("NFC-Setup: Building previous revision..")
     subprocess.run(
         shlex.s
[llvm-branch-commits] [llvm] [BOLT] Improve file handling in NFC-Mode (PR #146513)
paschalis-mpeis wrote:

This deals with cases like the following:
- https://lab.llvm.org/buildbot/#/builders/92/builds/21644

The previous revision had build failures, which caused an exception during renaming. As a result, NFC-Mode stayed on the old revision.

https://github.com/llvm/llvm-project/pull/146513
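The failure described in this comment comes from calling `os.replace` on a binary that the build never produced. A minimal sketch of the guard is shown below; `keep_build` is a hypothetical helper name and its boolean return convention is an assumption, not something taken from the PR, which calls `sys.exit` directly.

```python
import os


def keep_build(bolt_path: str, suffix: str) -> bool:
    """Rename a freshly built binary to '<bolt_path>.<suffix>'.

    Returns False instead of raising when the build left no binary behind,
    so the caller can clean up before exiting.
    """
    if not os.path.exists(bolt_path):
        return False
    os.replace(bolt_path, f"{bolt_path}.{suffix}")
    return True
```

With a check of this kind, a caller that has already checked out the comparison revision gets the chance to switch back to the original revision before exiting, instead of being interrupted by a `FileNotFoundError` partway through.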
[llvm-branch-commits] [llvm] [BOLT] Improve file handling in NFC-Mode (PR #146513)
https://github.com/paschalis-mpeis ready_for_review https://github.com/llvm/llvm-project/pull/146513
[llvm-branch-commits] [llvm] [BOLT] Improve file handling in NFC-Mode (PR #146513)
https://github.com/paschalis-mpeis converted_to_draft https://github.com/llvm/llvm-project/pull/146513
[llvm-branch-commits] [llvm] Users/paschalis mpeis/nfc check improve file handling (PR #146513)
https://github.com/paschalis-mpeis edited https://github.com/llvm/llvm-project/pull/146513
[llvm-branch-commits] [llvm] [BOLT] Improve file handling in NFC-Mode (PR #146513)
https://github.com/paschalis-mpeis ready_for_review https://github.com/llvm/llvm-project/pull/146513
[llvm-branch-commits] [llvm] [BOLT][NFC] Update nfc-check-setup.py guidance (PR #146659)
https://github.com/paschalis-mpeis updated https://github.com/llvm/llvm-project/pull/146659

>From 4284499a9286ceb7708531ae9b1108e25f58267c Mon Sep 17 00:00:00 2001
From: Paschalis Mpeis
Date: Fri, 4 Jul 2025 14:54:58 +0100
Subject: [PATCH] [BOLT][NFC] Update nfc-check-setup.py guidance

---
 bolt/utils/nfc-check-setup.py | 53 ---
 1 file changed, 31 insertions(+), 22 deletions(-)

diff --git a/bolt/utils/nfc-check-setup.py b/bolt/utils/nfc-check-setup.py
index 18bf7522de17b..7191bbe122a9f 100755
--- a/bolt/utils/nfc-check-setup.py
+++ b/bolt/utils/nfc-check-setup.py
@@ -7,6 +7,8 @@ import sys
 import textwrap


+msg_prefix="\n> NFC-Mode:"
+
 def get_relevant_bolt_changes(dir: str) -> str:
     # Return a list of bolt source changes that are relevant to testing.
     all_changes = subprocess.run(
@@ -49,7 +51,7 @@ def switch_back(
     # the HEAD is. Must be called after checking out the previous commit on all
     # exit paths.
     if switch_back:
-        print("Switching back to current revision..")
+        print(f"{msg_prefix} Switching back to current revision..")
         if stash:
             subprocess.run(shlex.split("git stash pop"), cwd=source_dir)
         subprocess.run(shlex.split(f"git checkout {old_ref}"), cwd=source_dir)
@@ -64,8 +66,10 @@ def main():
     parser = argparse.ArgumentParser(
         description=textwrap.dedent(
             """
-            This script builds two versions of BOLT (with the current and
-            previous revision).
+            This script builds two versions of BOLT:
+            llvm-bolt.new, using the current revision, and llvm-bolt.old using
+            the previous revision. These can be used to check whether the
+            current revision changes BOLT's functional behavior.
             """
         )
     )
@@ -104,7 +108,7 @@ def main():
     if not args.create_wrapper and len(wrapper_args) > 0:
         parser.parse_args()

-    # find the repo directory
+    # Find the repo directory.
     source_dir = None
     try:
         CMCacheFilename = f"{args.build_dir}/CMakeCache.txt"
@@ -118,13 +122,13 @@ def main():
     except Exception as e:
         sys.exit(e)

-    # clean the previous llvm-bolt if it exists
+    # Clean the previous llvm-bolt if it exists.
     bolt_path = f"{args.build_dir}/bin/llvm-bolt"
     if os.path.exists(bolt_path):
         os.remove(bolt_path)

-    # build the current commit
-    print("NFC-Setup: Building current revision..")
+    # Build the current commit.
+    print(f"{msg_prefix} Building current revision..")
     subprocess.run(
         shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir
     )
@@ -132,9 +136,8 @@ def main():
     if not os.path.exists(bolt_path):
         sys.exit(f"Failed to build the current revision: '{bolt_path}'")

-    # rename llvm-bolt
+    # Rename llvm-bolt and memorize the old hash for logging.
     os.replace(bolt_path, f"{bolt_path}.new")
-    # memorize the old hash for logging
     old_ref = get_git_ref_or_rev(source_dir)

     if args.check_bolt_sources:
@@ -147,7 +150,7 @@ def main():
             print(f"BOLT source changes were found:\n{file_changes}")
             open(marker, "a").close()

-    # determine whether a stash is needed
+    # Determine whether a stash is needed.
     stash = subprocess.run(
         shlex.split("git status --porcelain"),
         cwd=source_dir,
@@ -156,32 +159,33 @@ def main():
         text=True,
     ).stdout
     if stash:
-        # save local changes before checkout
+        # Save local changes before checkout.
         subprocess.run(shlex.split("git stash push -u"), cwd=source_dir)
-    # check out the previous/cmp commit
+
+    # Check out the previous/cmp commit and get its commit hash for logging.
     subprocess.run(shlex.split(f"git checkout -f {args.cmp_rev}"), cwd=source_dir)
-    # get the parent commit hash for logging
     new_ref = get_git_ref_or_rev(source_dir)

-    # build the previous commit
-    print("NFC-Setup: Building previous revision..")
+    # Build the previous commit.
+    print(f"{msg_prefix} Building previous revision..")
     subprocess.run(
         shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir
     )

-    # rename llvm-bolt
+    # Rename llvm-bolt.
     if not os.path.exists(bolt_path):
         print(f"Failed to build the previous revision: '{bolt_path}'")
         switch_back(args.switch_back, stash, source_dir, old_ref, new_ref)
         sys.exit(1)
     os.replace(bolt_path, f"{bolt_path}.old")

-    # symlink llvm-bolt-wrapper
+    # Symlink llvm-bolt-wrapper
     if args.create_wrapper:
+        print(f"{msg_prefix} Creating llvm-bolt wrapper..")
         script_dir = os.path.dirname(os.path.abspath(__file__))
         wrapper_path = f"{script_dir}/llvm-bolt-wrapper.py"
         try:
-            # set up llvm-bolt-wrapper.ini
+            # Set up llvm-bolt-wrapper.ini
             ini = subprocess.che
[llvm-branch-commits] [llvm] [BOLT][NFC] Update nfc-check-setup.py guidance (PR #146659)
https://github.com/paschalis-mpeis updated https://github.com/llvm/llvm-project/pull/146659

>From 2b5e54e8f3ed5f29a495a92e4e93725c74df Mon Sep 17 00:00:00 2001
From: Paschalis Mpeis
Date: Fri, 4 Jul 2025 14:54:58 +0100
Subject: [PATCH] [BOLT][NFC] Update nfc-check-setup.py guidance

---
 bolt/utils/nfc-check-setup.py | 53 ---
 1 file changed, 31 insertions(+), 22 deletions(-)

diff --git a/bolt/utils/nfc-check-setup.py b/bolt/utils/nfc-check-setup.py
index 18bf7522de17b..d8666e2158499 100755
--- a/bolt/utils/nfc-check-setup.py
+++ b/bolt/utils/nfc-check-setup.py
@@ -7,6 +7,8 @@ import sys
 import textwrap


+msg_prefix = "\n> NFC-Mode:"
+
 def get_relevant_bolt_changes(dir: str) -> str:
     # Return a list of bolt source changes that are relevant to testing.
     all_changes = subprocess.run(
@@ -49,7 +51,7 @@ def switch_back(
     # the HEAD is. Must be called after checking out the previous commit on all
     # exit paths.
     if switch_back:
-        print("Switching back to current revision..")
+        print(f"{msg_prefix} Switching back to current revision..")
         if stash:
             subprocess.run(shlex.split("git stash pop"), cwd=source_dir)
         subprocess.run(shlex.split(f"git checkout {old_ref}"), cwd=source_dir)
@@ -64,8 +66,10 @@ def main():
     parser = argparse.ArgumentParser(
         description=textwrap.dedent(
             """
-            This script builds two versions of BOLT (with the current and
-            previous revision).
+            This script builds two versions of BOLT:
+            llvm-bolt.new, using the current revision, and llvm-bolt.old using
+            the previous revision. These can be used to check whether the
+            current revision changes BOLT's functional behavior.
             """
         )
     )
@@ -104,7 +108,7 @@ def main():
     if not args.create_wrapper and len(wrapper_args) > 0:
         parser.parse_args()

-    # find the repo directory
+    # Find the repo directory.
     source_dir = None
     try:
         CMCacheFilename = f"{args.build_dir}/CMakeCache.txt"
@@ -118,13 +122,13 @@ def main():
     except Exception as e:
         sys.exit(e)

-    # clean the previous llvm-bolt if it exists
+    # Clean the previous llvm-bolt if it exists.
     bolt_path = f"{args.build_dir}/bin/llvm-bolt"
     if os.path.exists(bolt_path):
         os.remove(bolt_path)

-    # build the current commit
-    print("NFC-Setup: Building current revision..")
+    # Build the current commit.
+    print(f"{msg_prefix} Building current revision..")
     subprocess.run(
         shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir
     )
@@ -132,9 +136,8 @@ def main():
     if not os.path.exists(bolt_path):
         sys.exit(f"Failed to build the current revision: '{bolt_path}'")

-    # rename llvm-bolt
+    # Rename llvm-bolt and memorize the old hash for logging.
     os.replace(bolt_path, f"{bolt_path}.new")
-    # memorize the old hash for logging
     old_ref = get_git_ref_or_rev(source_dir)

     if args.check_bolt_sources:
@@ -147,7 +150,7 @@ def main():
             print(f"BOLT source changes were found:\n{file_changes}")
             open(marker, "a").close()

-    # determine whether a stash is needed
+    # Determine whether a stash is needed.
     stash = subprocess.run(
         shlex.split("git status --porcelain"),
         cwd=source_dir,
@@ -156,32 +159,33 @@ def main():
         text=True,
     ).stdout
     if stash:
-        # save local changes before checkout
+        # Save local changes before checkout.
         subprocess.run(shlex.split("git stash push -u"), cwd=source_dir)
-    # check out the previous/cmp commit
+
+    # Check out the previous/cmp commit and get its commit hash for logging.
     subprocess.run(shlex.split(f"git checkout -f {args.cmp_rev}"), cwd=source_dir)
-    # get the parent commit hash for logging
     new_ref = get_git_ref_or_rev(source_dir)

-    # build the previous commit
-    print("NFC-Setup: Building previous revision..")
+    # Build the previous commit.
+    print(f"{msg_prefix} Building previous revision..")
     subprocess.run(
         shlex.split("cmake --build . --target llvm-bolt"), cwd=args.build_dir
     )

-    # rename llvm-bolt
+    # Rename llvm-bolt.
     if not os.path.exists(bolt_path):
         print(f"Failed to build the previous revision: '{bolt_path}'")
         switch_back(args.switch_back, stash, source_dir, old_ref, new_ref)
         sys.exit(1)
     os.replace(bolt_path, f"{bolt_path}.old")

-    # symlink llvm-bolt-wrapper
+    # Symlink llvm-bolt-wrapper
     if args.create_wrapper:
+        print(f"{msg_prefix} Creating llvm-bolt wrapper..")
         script_dir = os.path.dirname(os.path.abspath(__file__))
         wrapper_path = f"{script_dir}/llvm-bolt-wrapper.py"
         try:
-            # set up llvm-bolt-wrapper.ini
+            # Set up llvm-bolt-wrapper.ini
             ini = subprocess.c
[llvm-branch-commits] [llvm] [BOLT][NFC] Update nfc-check-setup.py guidance (PR #146659)
@@ -156,9 +158,8 @@ def main():
     os.replace(bolt_path, f"{bolt_path}.old")
     print(
-        f"Build directory {args.build_dir} is ready to run BOLT tests, e.g.\n"
-        "\tbin/llvm-lit -sv tools/bolt/test\nor\n"
-        "\tbin/llvm-lit -sv tools/bolttests"
+        f"Build directory {args.build_dir} is ready for NFC-Mode comparison "
+        "between the two revisions."

paschalis-mpeis wrote:

Force-pushed to rebase and handle this. `llvm-bolt` is available only when the `--create-wrapper` flag is used (as a wrapper). In that case, the commands are valid and are now printed. Also applied a few more nits.

https://github.com/llvm/llvm-project/pull/146659
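The behaviour described in this reply, printing the `llvm-lit` commands only when the wrapper exists, could look roughly like the sketch below. The function name `ready_message` and the exact wording of the wrapper sentence are assumptions for illustration; only the two message fragments and the two `llvm-lit` command lines are taken from the quoted hunk.

```python
def ready_message(build_dir: str, create_wrapper: bool) -> str:
    """Build the final status message (hypothetical helper, not the script's code)."""
    msg = (
        f"Build directory {build_dir} is ready for NFC-Mode comparison "
        "between the two revisions."
    )
    if create_wrapper:
        # The test commands need a bin/llvm-bolt, which only exists as a wrapper.
        msg += (
            "\nThe llvm-bolt wrapper is in place, so the BOLT tests can be run, e.g.\n"
            "\tbin/llvm-lit -sv tools/bolt/test\nor\n"
            "\tbin/llvm-lit -sv tools/bolttests"
        )
    return msg
```

Keeping the message in one function makes the dependency on `--create-wrapper` explicit: without the wrapper only `llvm-bolt.new` and `llvm-bolt.old` exist, so suggesting the `llvm-lit` commands would be misleading.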