[llvm] [clang] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
leo-ard wrote: @nikic ping:) https://github.com/llvm/llvm-project/pull/70845 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[llvm] [clang] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
https://github.com/leo-ard updated https://github.com/llvm/llvm-project/pull/70845

From 00d0c18b5414ffe7222e1ee0ad5ecfdb8783704e Mon Sep 17 00:00:00 2001
From: leo-ard
Date: Mon, 30 Oct 2023 18:01:27 -0400
Subject: [PATCH 01/12] Add NonNeg check for InstCombine

---
 llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index 7c2ad92f919a3cc..cd287d757fdfd23 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -5554,11 +5554,15 @@ Instruction *InstCombinerImpl::foldICmpWithZextOrSext(ICmpInst &ICmp) {
       return new ICmpInst(ICmp.getPredicate(), Builder.CreateOr(X, Y),
                           Constant::getNullValue(X->getType()));
 
+    // Treat "zext nneg" as "sext"
+    bool IsNonNeg0 = isa(ICmp.getOperand(0));
+    bool IsNonNeg1 = isa(ICmp.getOperand(1));
+
     // If we have mismatched casts, treat the zext of a non-negative source as
     // a sext to simulate matching casts. Otherwise, we are done.
     // TODO: Can we handle some predicates (equality) without non-negative?
-    if ((IsZext0 && isKnownNonNegative(X, DL, 0, &AC, &ICmp, &DT)) ||
-        (IsZext1 && isKnownNonNegative(Y, DL, 0, &AC, &ICmp, &DT)))
+    if ((IsZext0 && (IsNonNeg0 || isKnownNonNegative(X, DL, 0, &AC, &ICmp, &DT))) ||
+        (IsZext1 && (IsNonNeg1 || isKnownNonNegative(Y, DL, 0, &AC, &ICmp, &DT))))
       IsSignedExt = true;
     else
       return nullptr;

From ee1978946530e28ff79f924bcc5ffd73dc590549 Mon Sep 17 00:00:00 2001
From: leo-ard
Date: Mon, 30 Oct 2023 18:03:44 -0400
Subject: [PATCH 02/12] Add tests for min/max

---
 clang/test/CodeGen/X86/min_max.c | 19 ++
 .../Transforms/SCCP/icmp-fold-with-cast.ll| 185 ++
 2 files changed, 204 insertions(+)
 create mode 100644 clang/test/CodeGen/X86/min_max.c
 create mode 100644 llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll

diff --git a/clang/test/CodeGen/X86/min_max.c b/clang/test/CodeGen/X86/min_max.c
new file mode 100644
index 000..7af8181cc9ff367
--- /dev/null
+++ b/clang/test/CodeGen/X86/min_max.c
@@ -0,0 +1,19 @@
+// RUN: %clang_cc1 %s -O2 -triple=x86_64-apple-darwin -emit-llvm -o - | FileCheck %s
+
+short vecreduce_smax_v2i16(int n, short* v)
+{
+  // CHECK: @llvm.smax
+  short p = 0;
+  for (int i = 0; i < n; ++i)
+    p = p < v[i] ? v[i] : p;
+  return p;
+}
+
+short vecreduce_smin_v2i16(int n, short* v)
+{
+  // CHECK: @llvm.smin
+  short p = 0;
+  for (int i = 0; i < n; ++i)
+    p = p > v[i] ? v[i] : p;
+  return p;
+}
\ No newline at end of file
diff --git a/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll b/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll
new file mode 100644
index 000..90b2c123081fb49
--- /dev/null
+++ b/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll
@@ -0,0 +1,185 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --tool ./bin/opt --version 3
+; See PRXXX for more details
+; RUN: opt < %s -S -passes=ipsccp | FileCheck %s
+
+
+define signext i32 @sext_sext(i16 %x, i16 %y) {
+; CHECK-LABEL: define signext i32 @sext_sext(
+; CHECK-SAME: i16 [[X:%.*]], i16 [[Y:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[CONV:%.*]] = sext i16 [[X]] to i32
+; CHECK-NEXT:    [[CONV1:%.*]] = sext i16 [[Y]] to i32
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp sgt i16 [[X]], [[Y]]
+; CHECK-NEXT:    br i1 [[CMP2]], label [[COND_TRUE:%.*]], label [[COND_FALSE:%.*]]
+; CHECK:       cond.true:
+; CHECK-NEXT:    br label [[COND_END:%.*]]
+; CHECK:       cond.false:
+; CHECK-NEXT:    br label [[COND_END]]
+; CHECK:       cond.end:
+; CHECK-NEXT:    [[COND:%.*]] = phi i32 [ 0, [[COND_TRUE]] ], [ 1, [[COND_FALSE]] ]
+; CHECK-NEXT:    ret i32 [[COND]]
+;
+entry:
+  %conv = sext i16 %x to i32
+  %conv1 = sext i16 %y to i32
+  %cmp2 = icmp sgt i32 %conv, %conv1
+  br i1 %cmp2, label %cond.true, label %cond.false
+
+cond.true:                                        ; preds = %for.body
+  br label %cond.end
+
+cond.false:                                       ; preds = %for.body
+  br label %cond.end
+
+cond.end:                                         ; preds = %cond.false, %cond.true
+  %cond = phi i32 [ 0, %cond.true ], [ 1, %cond.false ]
+  ret i32 %cond
+}
+
+
+define signext i32 @zext_zext(i16 %x, i16 %y) {
+; CHECK-LABEL: define signext i32 @zext_zext(
+; CHECK-SAME: i16 [[X:%.*]], i16 [[Y:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[CONV:%.*]] = zext i16 [[X]] to i32
+; CHECK-NEXT:    [[CONV1:%.*]] = zext i16 [[Y]] to i32
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp sgt i16 [[X]], [[Y]]
+; CHECK-NEXT:    br i1 [[CMP2]], label [[COND_TRUE:%.*]], label [[COND_FALSE:%.*]]
+; CHECK:       cond.true:
+; CHECK-NEXT:    br label [[COND_END:%.*]]
+; CHECK:
[llvm] [clang] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
@@ -0,0 +1,175 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 3
+; See PR-70845 for more details
+; RUN: opt < %s -S -passes=instcombine | FileCheck %s
+
+
+define signext i32 @sext_sext(i16 %x, i16 %y) {
+; CHECK-LABEL: define signext i32 @sext_sext(
+; CHECK-SAME: i16 [[X:%.*]], i16 [[Y:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp sgt i16 [[X]], [[Y]]
+; CHECK-NEXT:    br i1 [[CMP2]], label [[COND_TRUE:%.*]], label [[COND_FALSE:%.*]]
+; CHECK:       cond.true:
+; CHECK-NEXT:    br label [[COND_END:%.*]]
+; CHECK:       cond.false:
+; CHECK-NEXT:    br label [[COND_END]]
+; CHECK:       cond.end:
+; CHECK-NEXT:    [[COND:%.*]] = phi i32 [ 0, [[COND_TRUE]] ], [ 1, [[COND_FALSE]] ]
+; CHECK-NEXT:    ret i32 [[COND]]
+;
+entry:
+  %conv = sext i16 %x to i32
+  %conv1 = sext i16 %y to i32
+  %cmp2 = icmp sgt i32 %conv, %conv1
+  br i1 %cmp2, label %cond.true, label %cond.false
+
+cond.true:                                        ; preds = %for.body
+  br label %cond.end
+
+cond.false:                                       ; preds = %for.body
+  br label %cond.end
+
+cond.end:                                         ; preds = %cond.false, %cond.true
+  %cond = phi i32 [ 0, %cond.true ], [ 1, %cond.false ]
+  ret i32 %cond
+}
+
+
+define signext i32 @zext_zext(i16 %x, i16 %y) {
+; CHECK-LABEL: define signext i32 @zext_zext(
+; CHECK-SAME: i16 [[X:%.*]], i16 [[Y:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp ugt i16 [[X]], [[Y]]
+; CHECK-NEXT:    br i1 [[CMP2]], label [[COND_TRUE:%.*]], label [[COND_FALSE:%.*]]
+; CHECK:       cond.true:
+; CHECK-NEXT:    br label [[COND_END:%.*]]
+; CHECK:       cond.false:
+; CHECK-NEXT:    br label [[COND_END]]
+; CHECK:       cond.end:
+; CHECK-NEXT:    [[COND:%.*]] = phi i32 [ 0, [[COND_TRUE]] ], [ 1, [[COND_FALSE]] ]
+; CHECK-NEXT:    ret i32 [[COND]]
+;
+entry:
+  %conv = zext i16 %x to i32
+  %conv1 = zext i16 %y to i32
+  %cmp2 = icmp sgt i32 %conv, %conv1
+  br i1 %cmp2, label %cond.true, label %cond.false
+
+cond.true:                                        ; preds = %for.body
+  br label %cond.end
+
+cond.false:                                       ; preds = %for.body
+  br label %cond.end
+
+cond.end:                                         ; preds = %cond.false, %cond.true
+  %cond = phi i32 [ 0, %cond.true ], [ 1, %cond.false ]
+  ret i32 %cond
+}
+
+
+define signext i16 @zext_positive_and_sext(i32 noundef %n, ptr noundef %v) {
+; CHECK-LABEL: define signext i16 @zext_positive_and_sext(
+; CHECK-SAME: i32 noundef [[N:%.*]], ptr noundef [[V:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br label [[FOR_COND:%.*]]
+; CHECK:       for.cond:
+; CHECK-NEXT:    [[P_0:%.*]] = phi i16 [ 0, [[ENTRY:%.*]] ], [ [[COND_OFF0:%.*]], [[COND_END:%.*]] ]
+; CHECK-NEXT:    [[I_0:%.*]] = phi i32 [ 0, [[ENTRY]] ], [ [[INC:%.*]], [[COND_END]] ]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i32 [[I_0]], [[N]]
+; CHECK-NEXT:    br i1 [[CMP]], label [[FOR_BODY:%.*]], label [[FOR_COND_CLEANUP:%.*]]
+; CHECK:       for.body:
+; CHECK-NEXT:    [[IDXPROM:%.*]] = zext nneg i32 [[I_0]] to i64
+; CHECK-NEXT:    [[ARRAYIDX:%.*]] = getelementptr i16, ptr [[V]], i64 [[IDXPROM]]
+; CHECK-NEXT:    [[TMP0:%.*]] = load i16, ptr [[ARRAYIDX]], align 2
+; CHECK-NEXT:    [[CMP2:%.*]] = icmp slt i16 [[P_0]], [[TMP0]]
+; CHECK-NEXT:    br i1 [[CMP2]], label [[COND_TRUE:%.*]], label [[COND_FALSE:%.*]]
+; CHECK:       cond.true:
+; CHECK-NEXT:    br label [[COND_END]]
+; CHECK:       cond.false:
+; CHECK-NEXT:    br label [[COND_END]]
+; CHECK:       for.cond.cleanup:
+; CHECK-NEXT:    ret i16 [[P_0]]
+; CHECK:       cond.end:
+; CHECK-NEXT:    [[COND_OFF0]] = phi i16 [ [[TMP0]], [[COND_TRUE]] ], [ [[P_0]], [[COND_FALSE]] ]
+; CHECK-NEXT:    [[INC]] = add nuw nsw i32 [[I_0]], 1
+; CHECK-NEXT:    br label [[FOR_COND]]
+;
+entry:
+  br label %for.cond
+
+for.cond:                                         ; preds = %cond.end, %entry
+  %p.0 = phi i16 [ 0, %entry ], [ %conv8, %cond.end ]
+  %i.0 = phi i32 [ 0, %entry ], [ %inc, %cond.end ]
+  %cmp = icmp slt i32 %i.0, %n
+  br i1 %cmp, label %for.body, label %for.cond.cleanup
+
+for.body:                                         ; preds = %for.cond
+  %conv = zext nneg i16 %p.0 to i32 ;; %p.0 is always positive here
+  %idxprom = sext i32 %i.0 to i64
+  %arrayidx = getelementptr i16, ptr %v, i64 %idxprom
+  %0 = load i16, ptr %arrayidx, align 2
+  %conv1 = sext i16 %0 to i32
+  %cmp2 = icmp slt i32 %conv, %conv1
+  br i1 %cmp2, label %cond.true, label %cond.false
+
+cond.true:                                        ; preds = %for.body
+  br label %cond.end
+
+cond.false:                                       ; preds = %for.body
+  br label %cond.end
+
+for.cond.cleanup:                                 ; preds = %for.cond
+  ret i16 %p.0
+
+cond.end:                                         ; preds = %cond.false, %cond.t
[llvm] [clang] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
@@ -247,6 +355,19 @@ define i1 @sext_zext_uge_op0_wide(i16 %x, i8 %y) {
   ret i1 %c
 }
 
+
+define i1 @sext_zext_nneg_uge_op0_wide(i16 %x, i8 %y) {
+; CHECK-LABEL: @sext_zext_nneg_uge_op0_wide(
+; CHECK-NEXT:    [[TMP1:%.*]] = sext i8 [[Y:%.*]] to i16
+; CHECK-NEXT:    [[C:%.*]] = icmp ule i16 [[TMP1]], [[X:%.*]]
+; CHECK-NEXT:    ret i1 [[C]]
+;
+  %a = sext i16 %x to i32
+  %b = zext nneg i8 %y to i32
+  %c = icmp uge i32 %a, %b
+  ret i1 %c
+}
+
 define i1 @zext_sext_sgt_known_nonneg(i8 %x, i8 %y) {
 ; CHECK-LABEL: @zext_sext_sgt_known_nonneg(

leo-ard wrote:

Every test in this test file has a `known_nonneg` variant for mismatched zext/sext. I was wondering about the relevance of this variant now that we have the nneg flag on zext. It would make more sense to have specialized tests for zext -> zext nneg folding with value tracking, and another test for icmp with zext nneg/sext folding into icmp (what I just added). FYI, this test dates from this review, when there was no nneg flag: https://reviews.llvm.org/D124419.

https://github.com/llvm/llvm-project/pull/70845
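To illustrate why the fold discussed here depends on the zext source being non-negative, here is a minimal standalone sketch. It is not LLVM code: the helpers `zext8`, `sext8`, `foldIsSound`, and `selfCheck` are hypothetical names introduced for this example, and it assumes the fold rewrites `icmp slt (zext x), (sext y)` into the narrow `icmp slt x, y`. It models i8 to i16 extension with plain integer casts and checks every input pair exhaustively.

```cpp
#include <cassert>
#include <cstdint>

// Models `zext i8 %v to i16`: the upper bits are filled with zeros.
static int16_t zext8(int8_t v) {
  return static_cast<int16_t>(static_cast<uint8_t>(v));
}

// Models `sext i8 %v to i16`: the upper bits copy the sign bit.
static int16_t sext8(int8_t v) { return static_cast<int16_t>(v); }

// Returns true if, for every x in [xMin, xMax] and every i8 value y, the
// wide comparison `icmp slt (zext x), (sext y)` gives the same answer as
// the narrow comparison `icmp slt x, y`.
static bool foldIsSound(int xMin, int xMax) {
  for (int xi = xMin; xi <= xMax; ++xi) {
    for (int yi = INT8_MIN; yi <= INT8_MAX; ++yi) {
      const int8_t x = static_cast<int8_t>(xi);
      const int8_t y = static_cast<int8_t>(yi);
      const bool wide = zext8(x) < sext8(y);
      const bool narrow = x < y;
      if (wide != narrow)
        return false;
    }
  }
  return true;
}

// Runs both checks; aborts via assert if either expectation fails.
static void selfCheck() {
  // x restricted to non-negative values: zext and sext agree, fold is sound.
  assert(foldIsSound(0, INT8_MAX));
  // x allowed to be negative: zext(x) becomes a large positive value and
  // the narrow comparison no longer matches.
  assert(!foldIsSound(INT8_MIN, INT8_MAX));
}
```

Calling `selfCheck()` from a `main` exercises both cases. The second check fails the equivalence as soon as `x` is negative (for example `x = -128`, `y = -127`), which is the situation the `nneg` flag or `isKnownNonNegative` rules out.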
[llvm] [clang] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
@@ -0,0 +1,175 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 3
+; See PR-70845 for more details
+; RUN: opt < %s -S -passes=instcombine | FileCheck %s
+
+
+define signext i32 @sext_sext(i16 %x, i16 %y) {

leo-ard wrote:

The technique of commenting and seeing which tests crashed worked really well. Thanks for the trick.

https://github.com/llvm/llvm-project/pull/70845
[clang] [llvm] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
https://github.com/leo-ard updated https://github.com/llvm/llvm-project/pull/70845

[PATCH 01/13] Add NonNeg check for InstCombine and [PATCH 02/13] Add tests for min/max: contents identical to patches 01 and 02 in the first update in this thread.
[clang] [llvm] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
https://github.com/leo-ard edited https://github.com/llvm/llvm-project/pull/70845
[llvm] [clang] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
@@ -247,6 +355,19 @@ define i1 @sext_zext_uge_op0_wide(i16 %x, i8 %y) {
   ret i1 %c
 }
 
+
+define i1 @sext_zext_nneg_uge_op0_wide(i16 %x, i8 %y) {
+; CHECK-LABEL: @sext_zext_nneg_uge_op0_wide(
+; CHECK-NEXT:    [[TMP1:%.*]] = sext i8 [[Y:%.*]] to i16
+; CHECK-NEXT:    [[C:%.*]] = icmp ule i16 [[TMP1]], [[X:%.*]]
+; CHECK-NEXT:    ret i1 [[C]]
+;
+  %a = sext i16 %x to i32
+  %b = zext nneg i8 %y to i32
+  %c = icmp uge i32 %a, %b
+  ret i1 %c
+}
+
 define i1 @zext_sext_sgt_known_nonneg(i8 %x, i8 %y) {
 ; CHECK-LABEL: @zext_sext_sgt_known_nonneg(

leo-ard wrote:

Ok I'll leave them then

https://github.com/llvm/llvm-project/pull/70845
[llvm] [clang] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
@@ -0,0 +1,145 @@
+; RUN: opt < %s --O3 -S | FileCheck %s

leo-ard wrote:

How did you get the SSA form? I wasn't able to do it on my side. Here are the commands that I ran:

```bash
> build_release/bin/clang -S -emit-llvm min_max.c -fno-discard-value-names -o min_max.ll
> build_release/bin/opt -S -passes=sroa min_max.ll > min_max2.ll
> head -n 20 min_max2.ll
; ModuleID = 'min_max.ll'
source_filename = "min_max.c"
target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
target triple = "arm64-apple-macosx14.0.0"

; Function Attrs: noinline nounwind optnone ssp uwtable(sync)
define signext i16 @vecreduce_smin_v2i16(i32 noundef %n, ptr noundef %v) #0 {
entry:
  %n.addr = alloca i32, align 4
  %v.addr = alloca ptr, align 8
  %p = alloca i16, align 2
  %i = alloca i32, align 4
  store i32 %n, ptr %n.addr, align 4
  store ptr %v, ptr %v.addr, align 8
  store i16 0, ptr %p, align 2
  store i32 0, ptr %i, align 4
  br label %for.cond

for.cond:                                         ; preds = %for.inc, %entry
  %0 = load i32, ptr %i, align 4
```

https://github.com/llvm/llvm-project/pull/70845
[clang] [llvm] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
https://github.com/leo-ard updated https://github.com/llvm/llvm-project/pull/70845

[PATCH 01/14] Add NonNeg check for InstCombine and [PATCH 02/14] Add tests for min/max: contents identical to patches 01 and 02 in the first update in this thread.
[clang] [llvm] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
@@ -0,0 +1,112 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 3
+; RUN: opt < %s -O3 -S | FileCheck %s
+; See issue #55013 and PR #70845 for more details.
+; This test comes from the following C program, compiled with clang
+;
+;; short vecreduce_smin_v2i16(int n, short* v)
+;; {
+;;   short p = 0;
+;;   for (int i = 0; i < n; ++i)
+;;     p = p > v[i] ? v[i] : p;
+;;   return p;
+;; }
+;
+;; short vecreduce_smax_v2i16(int n, short* v)
+;; {
+;;   short p = 0;
+;;   for (int i = 0; i < n; ++i)
+;;     p = p < v[i] ? v[i] : p;
+;;   return p;
+;; }
+
+define i16 @vecreduce_smin_v2i16(i32 %n, ptr %v) {
+; CHECK-LABEL: define i16 @vecreduce_smin_v2i16(
+; CHECK:    @llvm.smin.v2i16

leo-ard wrote:

I didn't keep all the checks generated by `llvm/utils/update_test_checks.py` because, in this test, we only want to know whether the intrinsic `@llvm.smin.v2i16` has been generated. Let me know if you think I should leave all the checks.

https://github.com/llvm/llvm-project/pull/70845
[clang] [llvm] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
https://github.com/leo-ard edited https://github.com/llvm/llvm-project/pull/70845
[clang] [llvm] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
https://github.com/leo-ard updated https://github.com/llvm/llvm-project/pull/70845

[PATCH 01/15] Add NonNeg check for InstCombine and [PATCH 02/15] Add tests for min/max: contents identical to patches 01 and 02 in the first update in this thread.
[llvm] [clang] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
https://github.com/leo-ard updated https://github.com/llvm/llvm-project/pull/70845

[PATCH 01/16] Add NonNeg check for InstCombine and [PATCH 02/16] Add tests for min/max: contents identical to patches 01 and 02 in the first update in this thread.
[llvm] [clang] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
@@ -0,0 +1,111 @@ +; RUN: opt < %s -O3 -S | FileCheck %s +; See issue #55013 and PR #70845 for more details. +; This test comes from the following C program, compiled with clang +; +;; short vecreduce_smin_v2i16(int n, short* v) +;; { +;; short p = 0; +;; for (int i = 0; i < n; ++i) +;; p = p > v[i] ? v[i] : p; +;; return p; +;; } +; +;; short vecreduce_smax_v2i16(int n, short* v) +;; { +;; short p = 0; +;; for (int i = 0; i < n; ++i) +;; p = p < v[i] ? v[i] : p; +;; return p; +;; } + +define i16 @vecreduce_smin_v2i16(i32 %n, ptr %v) { +; CHECK-LABEL: define i16 @vecreduce_smin_v2i16( +; CHECK:@llvm.smin.v2i16 + +entry: + br label %for.cond + +for.cond: ; preds = %for.inc, %entry + %p.0 = phi i16 [ 0, %entry ], [ %conv8, %for.inc ] + %i.0 = phi i32 [ 0, %entry ], [ %inc, %for.inc ] + %cmp = icmp slt i32 %i.0, %n + br i1 %cmp, label %for.body, label %for.end + +for.body: ; preds = %for.cond + %conv = sext i16 %p.0 to i32 + %idxprom = sext i32 %i.0 to i64 + %arrayidx = getelementptr inbounds i16, ptr %v, i64 %idxprom + %0 = load i16, ptr %arrayidx, align 2 + %conv1 = sext i16 %0 to i32 + %cmp2 = icmp sgt i32 %conv, %conv1 + br i1 %cmp2, label %cond.true, label %cond.false + +cond.true:; preds = %for.body + %idxprom4 = sext i32 %i.0 to i64 + %arrayidx5 = getelementptr inbounds i16, ptr %v, i64 %idxprom4 + %1 = load i16, ptr %arrayidx5, align 2 + %conv6 = sext i16 %1 to i32 + br label %cond.end + +cond.false: ; preds = %for.body + %conv7 = sext i16 %p.0 to i32 + br label %cond.end + +cond.end: ; preds = %cond.false, %cond.true + %cond = phi i32 [ %conv6, %cond.true ], [ %conv7, %cond.false ] + %conv8 = trunc i32 %cond to i16 + br label %for.inc + +for.inc: ; preds = %cond.end + %inc = add nsw i32 %i.0, 1 + br label %for.cond + +for.end: ; preds = %for.cond + ret i16 %p.0 +} + +define signext i16 @vecreduce_smax_v2i16(i32 noundef %n, ptr noundef %v) { leo-ard wrote: oups, I missed those. 
Just fixed it:) https://github.com/llvm/llvm-project/pull/70845 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[llvm] [clang] [InstCombine] Use zext's nneg flag for icmp folding (PR #70845)
leo-ard wrote: > Do you have the access to merge PR? No, I don't. Thanks for the time and insightful comments! https://github.com/llvm/llvm-project/pull/70845
[clang] [llvm] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
https://github.com/leo-ard created https://github.com/llvm/llvm-project/pull/70845 This PR fixes https://github.com/llvm/llvm-project/issues/55013: the max intrinsic is not generated for this simple loop case: https://godbolt.org/z/hxz1xhMPh. This is caused by an icmp not being folded into a select, so the max intrinsic is never generated. Some background: since LLVM 14, the SCCP pass converts sext into zext when the operand's range is known to be non-negative: https://reviews.llvm.org/D81756. After this change, InstCombine was sometimes unable to fold the icmp because its two operands were mismatched casts (one zext, one sext). To address this, @rotateright implemented https://reviews.llvm.org/D124419, which tries to resolve the mismatch by using ValueTracking to check whether the operand of the zext is non-negative (in which case the zext behaves like a sext). However, ValueTracking is not smart enough to infer that the value is non-negative in some cases. Recently, @nikic implemented #67982, which keeps the information that a zext is non-negative. This PR simply uses this information to do the folding accordingly. TL;DR: this PR uses the recent nneg flag on zext to fold the icmp accordingly in InstCombine. It also contains test cases for sext/zext folding with InstCombine, as well as an x86 regression test for the max/min case.
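The reasoning in the description above can be checked outside of LLVM. The following is only a sketch in plain C++, not code from the PR: `zext16` and `sext16` are hypothetical helpers that model `zext i16 ... to i32` and `sext i16 ... to i32`, and the second operand is sampled at a few boundary values rather than enumerated exhaustively.

```cpp
#include <cassert>
#include <cstdint>

// Models `zext i16 %x to i32`: reinterpret the bits as unsigned, then widen.
static int32_t zext16(int16_t x) {
  return static_cast<int32_t>(static_cast<uint16_t>(x));
}

// Models `sext i16 %x to i32`: widen while preserving the sign.
static int32_t sext16(int16_t x) { return static_cast<int32_t>(x); }

// Checks that for every non-negative x, the wide signed comparison
// `zext(x) > sext(y)` agrees with the narrow signed comparison `x > y`.
// y is sampled at boundary values to keep the loop small.
bool mismatchedFoldHoldsForNonNegative() {
  const int16_t samples[] = {INT16_MIN, -256, -1, 0, 1, 255, 256, INT16_MAX};
  for (int xi = 0; xi <= INT16_MAX; ++xi) {
    const int16_t x = static_cast<int16_t>(xi);
    if (zext16(x) != sext16(x)) return false;  // zext of non-negative == sext
    for (int16_t y : samples)
      if ((zext16(x) > sext16(y)) != (x > y)) return false;
  }
  return true;
}

// Without the non-negative guarantee the fold is wrong: x = -1 zero-extends
// to 65535, which compares greater than 0 even though -1 > 0 is false.
bool foldBreaksForNegativeSource() {
  const int16_t x = -1, y = 0;
  return (zext16(x) > sext16(y)) != (x > y);
}
```

Both functions return true, which is why the fold needs either the nneg flag or a successful ValueTracking query before a zext can be treated as a sext.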
From 7db32deed74766a9318eec5bceb3bdd069996e98 Mon Sep 17 00:00:00 2001 From: leo-ard Date: Mon, 30 Oct 2023 18:01:27 -0400 Subject: [PATCH 1/5] Add NonNeg check for InstCombine --- llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp index 2ff27abc79318c4..e9b448b2a7a2b97 100644 --- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp +++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp @@ -5587,11 +5587,15 @@ Instruction *InstCombinerImpl::foldICmpWithZextOrSext(ICmpInst &ICmp) { return new ICmpInst(ICmp.getPredicate(), Builder.CreateOr(X, Y), Constant::getNullValue(X->getType())); + // Treat "zext nneg" as "sext" + bool IsNonNeg0 = isa(ICmp.getOperand(0)); + bool IsNonNeg1 = isa(ICmp.getOperand(1)); + // If we have mismatched casts, treat the zext of a non-negative source as // a sext to simulate matching casts. Otherwise, we are done. // TODO: Can we handle some predicates (equality) without non-negative? 
- if ((IsZext0 && isKnownNonNegative(X, DL, 0, &AC, &ICmp, &DT)) || - (IsZext1 && isKnownNonNegative(Y, DL, 0, &AC, &ICmp, &DT))) + if ((IsZext0 && (IsNonNeg0 || isKnownNonNegative(X, DL, 0, &AC, &ICmp, &DT))) || + (IsZext1 && (IsNonNeg1 || isKnownNonNegative(Y, DL, 0, &AC, &ICmp, &DT)))) IsSignedExt = true; else return nullptr; From aacf6e9bff0b70ca063d9c9435a2877a133e0ea1 Mon Sep 17 00:00:00 2001 From: leo-ard Date: Mon, 30 Oct 2023 18:03:44 -0400 Subject: [PATCH 2/5] Add tests for min/max --- clang/test/CodeGen/X86/min_max.c | 19 ++ .../Transforms/SCCP/icmp-fold-with-cast.ll| 185 ++ 2 files changed, 204 insertions(+) create mode 100644 clang/test/CodeGen/X86/min_max.c create mode 100644 llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll diff --git a/clang/test/CodeGen/X86/min_max.c b/clang/test/CodeGen/X86/min_max.c new file mode 100644 index 000..7af8181cc9ff367 --- /dev/null +++ b/clang/test/CodeGen/X86/min_max.c @@ -0,0 +1,19 @@ +// RUN: %clang_cc1 %s -O2 -triple=x86_64-apple-darwin -emit-llvm -o - | FileCheck %s + +short vecreduce_smax_v2i16(int n, short* v) +{ + // CHECK: @llvm.smax + short p = 0; + for (int i = 0; i < n; ++i) +p = p < v[i] ? v[i] : p; + return p; +} + +short vecreduce_smin_v2i16(int n, short* v) +{ + // CHECK: @llvm.smin + short p = 0; + for (int i = 0; i < n; ++i) +p = p > v[i] ?
v[i] : p; + return p; +} \ No newline at end of file diff --git a/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll b/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll new file mode 100644 index 000..90b2c123081fb49 --- /dev/null +++ b/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll @@ -0,0 +1,185 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --tool ./bin/opt --version 3 +; See PRXXX for more details +; RUN: opt < %s -S -passes=ipsccp | FileCheck %s + + +define signext i32 @sext_sext(i16 %x, i16 %y) { +; CHECK-LABEL: define signext i32 @sext_sext( +; CHECK-SAME: i16 [[X:%.*]], i16 [[Y:%.*]]) { +; CHECK-NEXT: entry: +; CHECK-NEXT:[[CONV:%.*]] = sext i16 [[X]] to i32 +; CHECK-NEXT:[[CONV1:%.*]] = sext i16 [[Y]] to i32 +; CHECK-NEXT:[[CMP2:%.*]] = icmp sgt i16 [[X]], [[Y]] +; CHECK-NEXT:br i1 [[CMP2]], label [[COND_TRUE:%.*]], label [[COND_FALSE:%.*]] +; CHECK: cond.true: +; CHECK-NEXT:br label [[COND_END:%.*
[clang] [llvm] [SCCP] [Transform] Adding ICMP folding for zext and sext in SCCPSolver (PR #67594)
https://github.com/leo-ard closed https://github.com/llvm/llvm-project/pull/67594
[llvm] [clang] [SCCP] [Transform] Adding ICMP folding for zext and sext in SCCPSolver (PR #67594)
leo-ard wrote: I'm closing this PR in favour of #70845 https://github.com/llvm/llvm-project/pull/67594
[clang] [llvm] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
https://github.com/leo-ard updated https://github.com/llvm/llvm-project/pull/70845 From 6bb97fd48d59b7f79fdf90a2b27e9220f417fac7 Mon Sep 17 00:00:00 2001 From: leo-ard Date: Mon, 30 Oct 2023 18:01:27 -0400 Subject: [PATCH 1/8] Add NonNeg check for InstCombine --- llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp index 7574987d0e23141..4751d870da7a777 100644 --- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp +++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp @@ -5587,11 +5587,15 @@ Instruction *InstCombinerImpl::foldICmpWithZextOrSext(ICmpInst &ICmp) { return new ICmpInst(ICmp.getPredicate(), Builder.CreateOr(X, Y), Constant::getNullValue(X->getType())); + // Treat "zext nneg" as "sext" + bool IsNonNeg0 = isa(ICmp.getOperand(0)); + bool IsNonNeg1 = isa(ICmp.getOperand(1)); + // If we have mismatched casts, treat the zext of a non-negative source as // a sext to simulate matching casts. Otherwise, we are done. // TODO: Can we handle some predicates (equality) without non-negative? 
- if ((IsZext0 && isKnownNonNegative(X, DL, 0, &AC, &ICmp, &DT)) || - (IsZext1 && isKnownNonNegative(Y, DL, 0, &AC, &ICmp, &DT))) + if ((IsZext0 && (IsNonNeg0 || isKnownNonNegative(X, DL, 0, &AC, &ICmp, &DT))) || + (IsZext1 && (IsNonNeg1 || isKnownNonNegative(Y, DL, 0, &AC, &ICmp, &DT)))) IsSignedExt = true; else return nullptr; From 6570c0a864805408928a0d7a54dd0098f2d58419 Mon Sep 17 00:00:00 2001 From: leo-ard Date: Mon, 30 Oct 2023 18:03:44 -0400 Subject: [PATCH 2/8] Add tests for min/max --- clang/test/CodeGen/X86/min_max.c | 19 ++ .../Transforms/SCCP/icmp-fold-with-cast.ll| 185 ++ 2 files changed, 204 insertions(+) create mode 100644 clang/test/CodeGen/X86/min_max.c create mode 100644 llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll diff --git a/clang/test/CodeGen/X86/min_max.c b/clang/test/CodeGen/X86/min_max.c new file mode 100644 index 000..7af8181cc9ff367 --- /dev/null +++ b/clang/test/CodeGen/X86/min_max.c @@ -0,0 +1,19 @@ +// RUN: %clang_cc1 %s -O2 -triple=x86_64-apple-darwin -emit-llvm -o - | FileCheck %s + +short vecreduce_smax_v2i16(int n, short* v) +{ + // CHECK: @llvm.smax + short p = 0; + for (int i = 0; i < n; ++i) +p = p < v[i] ? v[i] : p; + return p; +} + +short vecreduce_smin_v2i16(int n, short* v) +{ + // CHECK: @llvm.smin + short p = 0; + for (int i = 0; i < n; ++i) +p = p > v[i] ?
v[i] : p; + return p; +} \ No newline at end of file diff --git a/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll b/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll new file mode 100644 index 000..90b2c123081fb49 --- /dev/null +++ b/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll @@ -0,0 +1,185 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --tool ./bin/opt --version 3 +; See PRXXX for more details +; RUN: opt < %s -S -passes=ipsccp | FileCheck %s + + +define signext i32 @sext_sext(i16 %x, i16 %y) { +; CHECK-LABEL: define signext i32 @sext_sext( +; CHECK-SAME: i16 [[X:%.*]], i16 [[Y:%.*]]) { +; CHECK-NEXT: entry: +; CHECK-NEXT:[[CONV:%.*]] = sext i16 [[X]] to i32 +; CHECK-NEXT:[[CONV1:%.*]] = sext i16 [[Y]] to i32 +; CHECK-NEXT:[[CMP2:%.*]] = icmp sgt i16 [[X]], [[Y]] +; CHECK-NEXT:br i1 [[CMP2]], label [[COND_TRUE:%.*]], label [[COND_FALSE:%.*]] +; CHECK: cond.true: +; CHECK-NEXT:br label [[COND_END:%.*]] +; CHECK: cond.false: +; CHECK-NEXT:br label [[COND_END]] +; CHECK: cond.end: +; CHECK-NEXT:[[COND:%.*]] = phi i32 [ 0, [[COND_TRUE]] ], [ 1, [[COND_FALSE]] ] +; CHECK-NEXT:ret i32 [[COND]] +; +entry: + %conv = sext i16 %x to i32 + %conv1 = sext i16 %y to i32 + %cmp2 = icmp sgt i32 %conv, %conv1 + br i1 %cmp2, label %cond.true, label %cond.false + +cond.true:; preds = %for.body + br label %cond.end + +cond.false: ; preds = %for.body + br label %cond.end + +cond.end: ; preds = %cond.false, %cond.true + %cond = phi i32 [ 0, %cond.true ], [ 1, %cond.false ] + ret i32 %cond +} + + +define signext i32 @zext_zext(i16 %x, i16 %y) { +; CHECK-LABEL: define signext i32 @zext_zext( +; CHECK-SAME: i16 [[X:%.*]], i16 [[Y:%.*]]) { +; CHECK-NEXT: entry: +; CHECK-NEXT:[[CONV:%.*]] = zext i16 [[X]] to i32 +; CHECK-NEXT:[[CONV1:%.*]] = zext i16 [[Y]] to i32 +; CHECK-NEXT:[[CMP2:%.*]] = icmp sgt i16 [[X]], [[Y]] +; CHECK-NEXT:br i1 [[CMP2]], label [[COND_TRUE:%.*]], label [[COND_FALSE:%.*]] +; CHECK: cond.true: +; CHECK-NEXT:br label 
[[COND_END:%.*]] +; CHECK:
[llvm] [clang] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
@@ -0,0 +1,19 @@ +// RUN: %clang_cc1 %s -O2 -triple=x86_64-apple-darwin -emit-llvm -o - | FileCheck %s leo-ard wrote: Out of curiosity, why is end-to-end codegen with clang not used? Is it too unstable? https://github.com/llvm/llvm-project/pull/70845
[clang] [llvm] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
@@ -0,0 +1,185 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --tool ./bin/opt --version 3 +; See PRXXX for more details +; RUN: opt < %s -S -passes=ipsccp | FileCheck %s leo-ard wrote: yep, a typo on my end https://github.com/llvm/llvm-project/pull/70845
[llvm] [clang] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
leo-ard wrote: Thanks for taking the time to review the PR. I just added another test in PhaseOrdering to make sure that the min/max intrinsics are generated. https://github.com/llvm/llvm-project/pull/70845
[llvm] [clang] [Instcombine] use zext's nneg flag for icmp folding (PR #70845)
https://github.com/leo-ard updated https://github.com/llvm/llvm-project/pull/70845 From 6bb97fd48d59b7f79fdf90a2b27e9220f417fac7 Mon Sep 17 00:00:00 2001 From: leo-ard Date: Mon, 30 Oct 2023 18:01:27 -0400 Subject: [PATCH 1/9] Add NonNeg check for InstCombine --- llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp index 7574987d0e23141..4751d870da7a777 100644 --- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp +++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp @@ -5587,11 +5587,15 @@ Instruction *InstCombinerImpl::foldICmpWithZextOrSext(ICmpInst &ICmp) { return new ICmpInst(ICmp.getPredicate(), Builder.CreateOr(X, Y), Constant::getNullValue(X->getType())); + // Treat "zext nneg" as "sext" + bool IsNonNeg0 = isa(ICmp.getOperand(0)); + bool IsNonNeg1 = isa(ICmp.getOperand(1)); + // If we have mismatched casts, treat the zext of a non-negative source as // a sext to simulate matching casts. Otherwise, we are done. // TODO: Can we handle some predicates (equality) without non-negative? 
- if ((IsZext0 && isKnownNonNegative(X, DL, 0, &AC, &ICmp, &DT)) || - (IsZext1 && isKnownNonNegative(Y, DL, 0, &AC, &ICmp, &DT))) + if ((IsZext0 && (IsNonNeg0 || isKnownNonNegative(X, DL, 0, &AC, &ICmp, &DT))) || + (IsZext1 && (IsNonNeg1 || isKnownNonNegative(Y, DL, 0, &AC, &ICmp, &DT)))) IsSignedExt = true; else return nullptr; From 6570c0a864805408928a0d7a54dd0098f2d58419 Mon Sep 17 00:00:00 2001 From: leo-ard Date: Mon, 30 Oct 2023 18:03:44 -0400 Subject: [PATCH 2/9] Add tests for min/max --- clang/test/CodeGen/X86/min_max.c | 19 ++ .../Transforms/SCCP/icmp-fold-with-cast.ll| 185 ++ 2 files changed, 204 insertions(+) create mode 100644 clang/test/CodeGen/X86/min_max.c create mode 100644 llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll diff --git a/clang/test/CodeGen/X86/min_max.c b/clang/test/CodeGen/X86/min_max.c new file mode 100644 index 000..7af8181cc9ff367 --- /dev/null +++ b/clang/test/CodeGen/X86/min_max.c @@ -0,0 +1,19 @@ +// RUN: %clang_cc1 %s -O2 -triple=x86_64-apple-darwin -emit-llvm -o - | FileCheck %s + +short vecreduce_smax_v2i16(int n, short* v) +{ + // CHECK: @llvm.smax + short p = 0; + for (int i = 0; i < n; ++i) +p = p < v[i] ? v[i] : p; + return p; +} + +short vecreduce_smin_v2i16(int n, short* v) +{ + // CHECK: @llvm.smin + short p = 0; + for (int i = 0; i < n; ++i) +p = p > v[i] ?
v[i] : p; + return p; +} \ No newline at end of file diff --git a/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll b/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll new file mode 100644 index 000..90b2c123081fb49 --- /dev/null +++ b/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll @@ -0,0 +1,185 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --tool ./bin/opt --version 3 +; See PRXXX for more details +; RUN: opt < %s -S -passes=ipsccp | FileCheck %s + + +define signext i32 @sext_sext(i16 %x, i16 %y) { +; CHECK-LABEL: define signext i32 @sext_sext( +; CHECK-SAME: i16 [[X:%.*]], i16 [[Y:%.*]]) { +; CHECK-NEXT: entry: +; CHECK-NEXT:[[CONV:%.*]] = sext i16 [[X]] to i32 +; CHECK-NEXT:[[CONV1:%.*]] = sext i16 [[Y]] to i32 +; CHECK-NEXT:[[CMP2:%.*]] = icmp sgt i16 [[X]], [[Y]] +; CHECK-NEXT:br i1 [[CMP2]], label [[COND_TRUE:%.*]], label [[COND_FALSE:%.*]] +; CHECK: cond.true: +; CHECK-NEXT:br label [[COND_END:%.*]] +; CHECK: cond.false: +; CHECK-NEXT:br label [[COND_END]] +; CHECK: cond.end: +; CHECK-NEXT:[[COND:%.*]] = phi i32 [ 0, [[COND_TRUE]] ], [ 1, [[COND_FALSE]] ] +; CHECK-NEXT:ret i32 [[COND]] +; +entry: + %conv = sext i16 %x to i32 + %conv1 = sext i16 %y to i32 + %cmp2 = icmp sgt i32 %conv, %conv1 + br i1 %cmp2, label %cond.true, label %cond.false + +cond.true:; preds = %for.body + br label %cond.end + +cond.false: ; preds = %for.body + br label %cond.end + +cond.end: ; preds = %cond.false, %cond.true + %cond = phi i32 [ 0, %cond.true ], [ 1, %cond.false ] + ret i32 %cond +} + + +define signext i32 @zext_zext(i16 %x, i16 %y) { +; CHECK-LABEL: define signext i32 @zext_zext( +; CHECK-SAME: i16 [[X:%.*]], i16 [[Y:%.*]]) { +; CHECK-NEXT: entry: +; CHECK-NEXT:[[CONV:%.*]] = zext i16 [[X]] to i32 +; CHECK-NEXT:[[CONV1:%.*]] = zext i16 [[Y]] to i32 +; CHECK-NEXT:[[CMP2:%.*]] = icmp sgt i16 [[X]], [[Y]] +; CHECK-NEXT:br i1 [[CMP2]], label [[COND_TRUE:%.*]], label [[COND_FALSE:%.*]] +; CHECK: cond.true: +; CHECK-NEXT:br label 
[[COND_END:%.*]] +; CHECK:
[clang] [SCCP] [Transform] Adding ICMP folding for zext and sext in SCCPSolver (PR #67594)
https://github.com/leo-ard created https://github.com/llvm/llvm-project/pull/67594 This PR fixes #55013: the max intrinsic is not generated for this simple loop case: https://godbolt.org/z/hxz1xhMPh. This is caused by an icmp not being folded into a select, so the max intrinsic is never generated. Since LLVM 14, the SCCP pass converts sext into zext when the operand's range is known to be non-negative: https://reviews.llvm.org/D81756. After this change, InstCombine was sometimes unable to fold the icmp because its two operands were mismatched casts (one zext, one sext). To address this, @rotateright implemented https://reviews.llvm.org/D124419, which tries to resolve the mismatch by using ValueTracking to check whether the operand of the zext is non-negative (in which case the zext behaves like a sext). However, ValueTracking does not seem to be smart enough for this case and cannot determine whether the value is non-negative. This PR implements the folding in SCCP directly, where we know whether the values are non-negative. It also contains test cases for sext/zext folding with SCCP, as well as an x86 regression test for the max/min case.
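To illustrate why the fold matters for the loop in the report, here is a small sketch in plain C++ (not taken from the PR). `reduceMaxViaPromotion` is a hypothetical name for the reported loop with the integer promotions written out explicitly, and `reduceMaxNarrow` is the 16-bit max reduction the optimizer is expected to recognise once the compare is narrowed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// The loop from the report with the promotions made explicit: both operands
// are widened to int (sext i16 -> i32), compared and selected as int, and
// the result is truncated back to 16 bits.
int16_t reduceMaxViaPromotion(const std::vector<int16_t>& v) {
  int16_t p = 0;
  for (int16_t e : v) {
    const int wideP = p;
    const int wideE = e;
    p = static_cast<int16_t>(wideP < wideE ? wideE : wideP);
  }
  return p;
}

// The same reduction expressed directly on 16-bit values. This is the shape
// that maps onto a signed max once the compare operates on the narrow type.
int16_t reduceMaxNarrow(const std::vector<int16_t>& v) {
  int16_t p = 0;
  for (int16_t e : v) p = std::max(p, e);
  return p;
}
```

The two functions agree on every input because sign extension preserves signed ordering. The difficulty described above only appears after one of the two extensions has been rewritten into a zext.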
From 399f9d64cfde0761ac8278dd05ba704d879b1f5a Mon Sep 17 00:00:00 2001 From: leo-ard Date: Wed, 27 Sep 2023 13:35:53 -0400 Subject: [PATCH] Adding ICMP folding for SCCPSolver --- clang/test/CodeGen/X86/min_max.c | 19 ++ llvm/lib/Transforms/Utils/SCCPSolver.cpp | 54 + .../Transforms/SCCP/icmp-fold-with-cast.ll| 185 ++ llvm/test/Transforms/SCCP/widening.ll | 4 +- 4 files changed, 260 insertions(+), 2 deletions(-) create mode 100644 clang/test/CodeGen/X86/min_max.c create mode 100644 llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll diff --git a/clang/test/CodeGen/X86/min_max.c b/clang/test/CodeGen/X86/min_max.c new file mode 100644 index 000..7af8181cc9ff367 --- /dev/null +++ b/clang/test/CodeGen/X86/min_max.c @@ -0,0 +1,19 @@ +// RUN: %clang_cc1 %s -O2 -triple=x86_64-apple-darwin -emit-llvm -o - | FileCheck %s + +short vecreduce_smax_v2i16(int n, short* v) +{ + // CHECK: @llvm.smax + short p = 0; + for (int i = 0; i < n; ++i) +p = p < v[i] ? v[i] : p; + return p; +} + +short vecreduce_smin_v2i16(int n, short* v) +{ + // CHECK: @llvm.smin + short p = 0; + for (int i = 0; i < n; ++i) +p = p > v[i] ? 
v[i] : p; + return p; +} \ No newline at end of file diff --git a/llvm/lib/Transforms/Utils/SCCPSolver.cpp b/llvm/lib/Transforms/Utils/SCCPSolver.cpp index 8a67fda7c98e787..09f64a925ab1b7c 100644 --- a/llvm/lib/Transforms/Utils/SCCPSolver.cpp +++ b/llvm/lib/Transforms/Utils/SCCPSolver.cpp @@ -193,6 +193,60 @@ static bool replaceSignedInst(SCCPSolver &Solver, NewInst = BinaryOperator::Create(NewOpcode, Op0, Op1, "", &Inst); break; } + case Instruction::ICmp: { +ICmpInst &ICmp = cast<ICmpInst>(Inst); + +ZExtInst *Op0_zext = dyn_cast<ZExtInst>(ICmp.getOperand(0)); +SExtInst *Op0_sext = dyn_cast<SExtInst>(ICmp.getOperand(0)); + +ZExtInst *Op1_zext = dyn_cast<ZExtInst>(ICmp.getOperand(1)); +SExtInst *Op1_sext = dyn_cast<SExtInst>(ICmp.getOperand(1)); + +CastInst *Op0; +CastInst *Op1; + +if (Op0_zext) Op0 = Op0_zext; else Op0 = Op0_sext; +if (Op1_zext) Op1 = Op1_zext; else Op1 = Op1_sext; + +bool reversed = false; + +if (!Op0 || !Op1){ + // Op0 and Op1 must be defined + return false; +} + +if (Op1_zext && (! Op0_zext)){ + // We force Op0 to be a zext and reverse the arguments + // at the end if we swap + reversed = true; + + std::swap(Op0_zext, Op1_zext); + std::swap(Op0_sext, Op1_sext); + std::swap(Op0, Op1); +} + + +if(Op0->getType() != Op1->getType()){ + // extensions are not of the same type + // This optimization is done in InstCombine + return false; +} + +// ICMP (sext X) (sext y) => ICMP X, Y +// ICMP (zext X) (zext y) => ICMP X, Y +// ICMP (zext X) (sext Y) => ICMP X, Y if X >= 0 and ICMP signed +if((Op0_zext && Op1_zext) + || (Op0_sext && Op1_sext) + || (ICmp.isSigned() && Op0_zext && Op1_sext && isNonNegative(Op0_zext->getOperand(0)))){ + + Value *X = Op0->getOperand(0); + Value *Y = Op1->getOperand(0); + NewInst = CmpInst::Create(ICmp.getOpcode(), ICmp.getPredicate(), reversed ? Y : X, reversed ?
X : Y, "", &Inst); + break; +} + +return false; + } default: return false; } diff --git a/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll b/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll new file mode 100644 index 000..90b2c123081fb49 --- /dev/null +++ b/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll @@ -0,0 +1,185 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --tool ./bin/opt --version 3 +; See PRXXX for more details +; RUN: opt < %s -S -passes=ipsccp | FileCheck %s + + +define signext i32 @sext_sext(i16 %x, i16 %y) { +; CHECK-LABEL: define signex
[clang] [SCCP] [Transform] Adding ICMP folding for zext and sext in SCCPSolver (PR #67594)
https://github.com/leo-ard updated https://github.com/llvm/llvm-project/pull/67594 From 399f9d64cfde0761ac8278dd05ba704d879b1f5a Mon Sep 17 00:00:00 2001 From: leo-ard Date: Wed, 27 Sep 2023 13:35:53 -0400 Subject: [PATCH 1/2] Adding ICMP folding for SCCPSolver --- clang/test/CodeGen/X86/min_max.c | 19 ++ llvm/lib/Transforms/Utils/SCCPSolver.cpp | 54 + .../Transforms/SCCP/icmp-fold-with-cast.ll| 185 ++ llvm/test/Transforms/SCCP/widening.ll | 4 +- 4 files changed, 260 insertions(+), 2 deletions(-) create mode 100644 clang/test/CodeGen/X86/min_max.c create mode 100644 llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll diff --git a/clang/test/CodeGen/X86/min_max.c b/clang/test/CodeGen/X86/min_max.c new file mode 100644 index 000..7af8181cc9ff367 --- /dev/null +++ b/clang/test/CodeGen/X86/min_max.c @@ -0,0 +1,19 @@ +// RUN: %clang_cc1 %s -O2 -triple=x86_64-apple-darwin -emit-llvm -o - | FileCheck %s + +short vecreduce_smax_v2i16(int n, short* v) +{ + // CHECK: @llvm.smax + short p = 0; + for (int i = 0; i < n; ++i) +p = p < v[i] ? v[i] : p; + return p; +} + +short vecreduce_smin_v2i16(int n, short* v) +{ + // CHECK: @llvm.smin + short p = 0; + for (int i = 0; i < n; ++i) +p = p > v[i] ? 
v[i] : p; + return p; +} \ No newline at end of file diff --git a/llvm/lib/Transforms/Utils/SCCPSolver.cpp b/llvm/lib/Transforms/Utils/SCCPSolver.cpp index 8a67fda7c98e787..09f64a925ab1b7c 100644 --- a/llvm/lib/Transforms/Utils/SCCPSolver.cpp +++ b/llvm/lib/Transforms/Utils/SCCPSolver.cpp @@ -193,6 +193,60 @@ static bool replaceSignedInst(SCCPSolver &Solver, NewInst = BinaryOperator::Create(NewOpcode, Op0, Op1, "", &Inst); break; } + case Instruction::ICmp: { +ICmpInst &ICmp = cast<ICmpInst>(Inst); + +ZExtInst *Op0_zext = dyn_cast<ZExtInst>(ICmp.getOperand(0)); +SExtInst *Op0_sext = dyn_cast<SExtInst>(ICmp.getOperand(0)); + +ZExtInst *Op1_zext = dyn_cast<ZExtInst>(ICmp.getOperand(1)); +SExtInst *Op1_sext = dyn_cast<SExtInst>(ICmp.getOperand(1)); + +CastInst *Op0; +CastInst *Op1; + +if (Op0_zext) Op0 = Op0_zext; else Op0 = Op0_sext; +if (Op1_zext) Op1 = Op1_zext; else Op1 = Op1_sext; + +bool reversed = false; + +if (!Op0 || !Op1){ + // Op0 and Op1 must be defined + return false; +} + +if (Op1_zext && (! Op0_zext)){ + // We force Op0 to be a zext and reverse the arguments + // at the end if we swap + reversed = true; + + std::swap(Op0_zext, Op1_zext); + std::swap(Op0_sext, Op1_sext); + std::swap(Op0, Op1); +} + + +if(Op0->getType() != Op1->getType()){ + // extensions are not of the same type + // This optimization is done in InstCombine + return false; +} + +// ICMP (sext X) (sext y) => ICMP X, Y +// ICMP (zext X) (zext y) => ICMP X, Y +// ICMP (zext X) (sext Y) => ICMP X, Y if X >= 0 and ICMP signed +if((Op0_zext && Op1_zext) + || (Op0_sext && Op1_sext) + || (ICmp.isSigned() && Op0_zext && Op1_sext && isNonNegative(Op0_zext->getOperand(0)))){ + + Value *X = Op0->getOperand(0); + Value *Y = Op1->getOperand(0); + NewInst = CmpInst::Create(ICmp.getOpcode(), ICmp.getPredicate(), reversed ? Y : X, reversed ?
X : Y, "", &Inst); + break; +} + +return false; + } default: return false; } diff --git a/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll b/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll new file mode 100644 index 000..90b2c123081fb49 --- /dev/null +++ b/llvm/test/Transforms/SCCP/icmp-fold-with-cast.ll @@ -0,0 +1,185 @@ +; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --tool ./bin/opt --version 3 +; See PRXXX for more details +; RUN: opt < %s -S -passes=ipsccp | FileCheck %s + + +define signext i32 @sext_sext(i16 %x, i16 %y) { +; CHECK-LABEL: define signext i32 @sext_sext( +; CHECK-SAME: i16 [[X:%.*]], i16 [[Y:%.*]]) { +; CHECK-NEXT: entry: +; CHECK-NEXT:[[CONV:%.*]] = sext i16 [[X]] to i32 +; CHECK-NEXT:[[CONV1:%.*]] = sext i16 [[Y]] to i32 +; CHECK-NEXT:[[CMP2:%.*]] = icmp sgt i16 [[X]], [[Y]] +; CHECK-NEXT:br i1 [[CMP2]], label [[COND_TRUE:%.*]], label [[COND_FALSE:%.*]] +; CHECK: cond.true: +; CHECK-NEXT:br label [[COND_END:%.*]] +; CHECK: cond.false: +; CHECK-NEXT:br label [[COND_END]] +; CHECK: cond.end: +; CHECK-NEXT:[[COND:%.*]] = phi i32 [ 0, [[COND_TRUE]] ], [ 1, [[COND_FALSE]] ] +; CHECK-NEXT:ret i32 [[COND]] +; +entry: + %conv = sext i16 %x to i32 + %conv1 = sext i16 %y to i32 + %cmp2 = icmp sgt i32 %conv, %conv1 + br i1 %cmp2, label %cond.true, label %cond.false + +cond.true:; preds = %for.body + br label %cond.end + +cond.false: ; preds = %for.body + br label %cond.end + +cond.end: ; preds = %cond.false, %cond.true
[clang] [SCCP] [Transform] Adding ICMP folding for zext and sext in SCCPSolver (PR #67594)
leo-ard wrote: > If I understood correctly, what you're trying to do here is to apply an icmp > fold early before the sext -> zext information gets lost. I don't think this > is the correct way to approach the problem. The correct way is to preserve > the fact that the operand is non-negative when converting to zext, which > would allow the InstCombine fold to reliably undo this. In fact, there is a > pending patch for this at https://reviews.llvm.org/D156444. When tackling this problem, my first idea was to track the non-negativeness of the zext and use this information in InstCombine. I didn't know that you could do it through flags in the IR, which I think is a cleaner solution. Do you think this patch will be available soon? https://github.com/llvm/llvm-project/pull/67594
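The flag-based approach discussed in this reply can be sketched as a small decision procedure. This is only an illustration in plain C++ and not the LLVM API: `ExtCast`, `Fold` and `classifyCastPair` are hypothetical names, the `nneg` field stands in for the flag recorded on the zext, and `provablyNonNeg` stands in for a successful ValueTracking query.

```cpp
#include <cassert>

enum class CastKind { ZExt, SExt };

// Hypothetical stand-in for an extension feeding an icmp operand.
struct ExtCast {
  CastKind kind;
  bool nneg;            // models the non-negative flag carried by a zext
  bool provablyNonNeg;  // models an analysis proving the source non-negative
};

enum class Fold { None, AsUnsignedExt, AsSignedExt };

// Decides how a pair of extension casts can be handled when folding an icmp.
// Matching casts are always usable. Mismatched casts are usable only if the
// zext side is known non-negative, in which case it is treated as a sext.
Fold classifyCastPair(const ExtCast& a, const ExtCast& b) {
  if (a.kind == b.kind)
    return a.kind == CastKind::SExt ? Fold::AsSignedExt : Fold::AsUnsignedExt;
  const ExtCast& z = (a.kind == CastKind::ZExt) ? a : b;
  if (z.nneg || z.provablyNonNeg)
    return Fold::AsSignedExt;
  return Fold::None;
}
```

With the flag available, the mismatched case no longer depends on the analysis re-deriving a fact that an earlier pass had already established.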