[PATCH] D139006: [UpdateTestChecks] Match define for labels

2022-12-16 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D139006#4000992 , @sebastian-ne wrote: > Adding `--function-signature` by default sounds like a good idea to me. > Is there any reason why we wouldn’t want to enable this by default (for new > tests)? No objection from me for

[PATCH] D139006: [UpdateTestChecks] Match define for labels

2022-12-13 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added subscribers: arsenm, spatel. spatel added a comment. I agree with the revert - we'd need lots of commits like this: b7232dafe69eb04c14217 (cc @arsenm) Another possibility to fix this with less churn: Add a warnin

[PATCH] D136806: [Pipelines] Introduce SROA after (final, full) loop unrolling

2022-11-16 Thread Sanjay Patel via Phabricator via cfe-commits
spatel accepted this revision. spatel added a comment. This revision is now accepted and ready to land. LGTM based on the timing, results, and alternatives discussed There's some testing in progress according to previous comments, so best to wait for that to finish in case it turns up anything ne

[PATCH] D131980: [Passes] Don't run tail-call-elim in -O1

2022-08-16 Thread Sanjay Patel via Phabricator via cfe-commits
spatel accepted this revision. spatel added a comment. This revision is now accepted and ready to land. LGTM These clang tests are just awful, but I don't have the patience to fix them... Comment at: llvm/lib/Passes/PassBuilderPipelines.cpp:245 // TODO: Investigate the cost/

[PATCH] D130374: [Passes] add a tail-call-elim pass near the end of the opt pipeline

2022-07-26 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D130374#3679550 , @hiraditya wrote: > In D130374#3675677 , @vdsered wrote: > >> Just JFYI :) >> Yes, probably not worth it > > that is interesting. do we know why? Based on this data:

[PATCH] D130374: [Passes] add a tail-call-elim pass near the end of the opt pipeline

2022-07-25 Thread Sanjay Patel via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGbfb9b8e075ee: [Passes] add a tail-call-elim pass near the end of the opt pipeline (authored by spatel). Herald added subscribers: cfe-commits, pmatos

[PATCH] D127460: Rename GCCBuiltin into ClangBuiltin

2022-06-14 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: llvm/include/llvm/IR/Intrinsics.td:387 -/// GCCBuiltin - If this intrinsic exactly corresponds to a GCC builtin, this +/// ClangBuiltin - If this intrinsic exactly corresponds to a GCC builtin, this /// specifies the name of the builti

[PATCH] D124551: [Driver] Add f16 support to -mrecip parsing.

2022-04-28 Thread Sanjay Patel via Phabricator via cfe-commits
spatel accepted this revision. spatel added a comment. This revision is now accepted and ready to land. LGTM. AFAIK, these FP reciprocal options have never been officially documented. Would that go under here: https://clang.llvm.org/docs/UsersManual.html#controlling-code-generation ? Repositor

[PATCH] D122077: [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2

2022-03-29 Thread Sanjay Patel via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGa3cffc11509b: [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2 (authored by hkmatsumoto, committed by spatel). Repository: rG LLVM

[PATCH] D122077: [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2

2022-03-29 Thread Sanjay Patel via Phabricator via cfe-commits
spatel accepted this revision. spatel added a comment. This revision is now accepted and ready to land. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D122077/new/ https://reviews.llvm.org/D122077 ___

[PATCH] D122077: [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2

2022-03-29 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:915 +/// Fold (icmp eq ctpop(X) 1) | (icmp eq X 0) into (icmp ult ctpop(X) 2) and +/// fold (icmp ne ctpop(X) 1) & (icmp ne X 0) into (icmp uge ctpop(X) 2). +static Value *foldIsPowe

[PATCH] D122077: [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2

2022-03-29 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D122077#3413565 , @xbolva00 wrote: > In D122077#3413550 , @joerg wrote: > >> Why is this fold preferable to `(X & (X-1)) == 0`? At least on all >> architectures without native populatio

[PATCH] D122077: [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2

2022-03-27 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. Thank you for the making the test changes. I pushed the baseline tests on your behalf here: ebaa28e0750b Please rebase and update the patch here in Phabricator - it should only show changes in the CHE

[PATCH] D122077: [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2

2022-03-24 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. A few changes for tests suggested inline. There might be some generalization of ctpop analysis that we can make as a follow-up patch. For example, I was looking at a "wrong predicate" combination and noticed that we miss possible optimizations like this: https://alive2.l

[PATCH] D122077: [InstCombine] Fold (ctpop(X) == 1) | (X == 0) into ctpop(X) < 2

2022-03-21 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:916 +/// fold (icmp ne ctpop(X) 1) & (icmp ne X 0) into (icmp uge ctpop(X) 2). +static Value *foldOrOfCtpop(ICmpInst *Cmp0, ICmpInst *Cmp1, bool IsAnd, +I

[PATCH] D121306: [Sema] add warning for tautological FP compare with literal

2022-03-17 Thread Sanjay Patel via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rGab982eace6e4: [Sema] add warning for tautological FP compare with literal (authored by spatel). Herald added a project: clang. Herald added a subscriber: cfe-commits. Repository: rG LLVM Github Monorepo

[PATCH] D115886: [CodeGen] remove creation of FP cast function attribute

2021-12-19 Thread Sanjay Patel via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG1965cc469539: [CodeGen] remove creation of FP cast function attribute (authored by spatel). Herald added a project: clang. Herald added a subscriber:

[PATCH] D115804: [CodeGen] use saturating FP casts when compiling with "no-strict-float-cast-overflow"

2021-12-19 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added subscribers: kpn, sepavloff, andrew.w.kaylor. spatel added a comment. In D115804#3201044 , @craig.topper wrote: > What's the plan for constrained intrinsics versions of these intrinsics? The > IRBuilder calls for CreateFPToSI and CreateFPTo

[PATCH] D115804: [CodeGen] use saturating FP casts when compiling with "no-strict-float-cast-overflow"

2021-12-16 Thread Sanjay Patel via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG8c7f2a4f8719: [CodeGen] use saturating FP casts when compiling with "no-strict-float-cast… (authored by spatel). Herald added a project: clang. Chan

[PATCH] D115804: [CodeGen] use saturating FP casts when compiling with "no-strict-float-cast-overflow"

2021-12-15 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D115804#3195002 , @nikic wrote: > Looks reasonable. Making float to int cast well defined is exactly why these > intrinsics exist. Should we also drop the "string-float-cast-overflow" > attribute, as UB is now prevented in a d

[PATCH] D92270: [ConstantFold] Fold more operations to poison

2021-12-15 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: llvm/lib/IR/ConstantFold.cpp:633 // the input constant. -return UndefValue::get(DestTy); +return PoisonValue::get(DestTy); } aqjune wrote: > neildhar wrote: > > spatel wrote: > > > MatzeB w

[PATCH] D115804: [CodeGen] use saturating FP casts when compiling with "no-strict-float-cast-overflow"

2021-12-15 Thread Sanjay Patel via Phabricator via cfe-commits
spatel created this revision. spatel added reviewers: MatzeB, neildhar, nikic, aqjune, dmgreen. Herald added a subscriber: mcrosier. spatel requested review of this revision. We got an unintended consequence of the optimizer getting smarter when compiling in a non-standard mode, and there's no go

[PATCH] D92270: [ConstantFold] Fold more operations to poison

2021-10-28 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: llvm/lib/IR/ConstantFold.cpp:633 // the input constant. -return UndefValue::get(DestTy); +return PoisonValue::get(DestTy); } MatzeB wrote: > I believe this is causing some of our clients tr

[PATCH] D109925: [AST] add warnings for out-of-bounds FP values when using relaxed FP math compile flags

2021-09-16 Thread Sanjay Patel via Phabricator via cfe-commits
spatel created this revision. spatel added reviewers: sepavloff, jyknight, efriedma, dim, kparzysz, rsmith. Herald added a subscriber: mcrosier. spatel requested review of this revision. There's currently ongoing discussion on the dev lists about how to handle related cases with isnan() / isinf()

[PATCH] D108826: [SLP][LTO][WIP]Allow full SLP in LTO only at link time.

2021-08-27 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D108826#2969604 , @ABataev wrote: > In D108826#2969594 , @lebedev.ri > wrote: > >> Aha, so full lto. That is consistent with the phase ordering dilemma @spatel >> discovered: D102002 <

[PATCH] D101759: [PowerPC] Scalar IBM MASS library conversion pass

2021-08-27 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a reviewer: efriedma. spatel added a comment. I'm not familiar with this library, and I haven't looked at current state of how we enable/map optional libs in a while... We definitely want to avoid adding another target option/debug flag, and if we can avoid relying on a function par

[PATCH] D104854: Introduce intrinsic llvm.isnan

2021-08-20 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D104854#2957582 , @lebedev.ri wrote: > In D104854#2957529 , @spatel wrote: > >> In D104854#2957471 , @sepavloff >> wrote: >> >>> In D104854#29

[PATCH] D104854: Introduce intrinsic llvm.isnan

2021-08-20 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D104854#2957471 , @sepavloff wrote: > In D104854#2957423 , @spatel wrote: > >> Is it intentional that we are not canonicalizing the intrinsic call back to >> `fcmp uno` in the default F

[PATCH] D104854: Introduce intrinsic llvm.isnan

2021-08-20 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. Is it intentional that we are not canonicalizing the intrinsic call back to `fcmp uno` in the default FP environment? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D104854/new/ https://reviews.llvm.org/D104854 _

[PATCH] D107843: [X86] Add parentheses around casts in some of the X86 intrinsic headers.

2021-08-13 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. Probably want to address the other cleanups in another patch; the parens fixes and test LGTM. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D107843/new/ https://reviews.llvm.org/D107843 _

[PATCH] D107843: [X86] Add parentheses around casts in some of the X86 intrinsic headers.

2021-08-10 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. Hard to see through all of the lint noise, but seems like a mechanical fix. Can we add a test like the one in the bug report? https://godbolt.org/z/sPT8e9vx9 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D107843/new/ https://

[PATCH] D105268: [X86] AVX512FP16 instructions enabling 5/6

2021-07-05 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: llvm/test/CodeGen/X86/stack-folding-fp-avx512fp16vl-fma.ll:193-194 +define <8 x half> @stack_fold_fmsub123ph(<8 x half> %a0, <8 x half> %a1, <8 x half> %a2) { + ;check-label: stack_fold_fmsub123ph: + ;check: vfmsub213ph {{-?[0-9]

[PATCH] D101191: [InstCombine] Fully disable select to and/or i1 folding

2021-05-30 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D101191#2783570 , @rupprecht wrote: > In D101191#2782963 , @rupprecht > wrote: > >> The issue I'm seeing seems more directly caused by SLP vectorization, as it >> goes away with `-fno-

[PATCH] D101191: [InstCombine] Fully disable select to and/or i1 folding

2021-05-25 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D101191#2779631 , @spatel wrote: > In D101191#2779007 , @fhahn wrote: > >> Looks like this is causing an infinite loop in instcombine: >> https://bugs.chromium.org/p/oss-fuzz/issues/det

[PATCH] D101191: [InstCombine] Fully disable select to and/or i1 folding

2021-05-25 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D101191#2779007 , @fhahn wrote: > Looks like this is causing an infinite loop in instcombine: > https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34661 This seems likely to be another partial undef/poison vector problem (

[PATCH] D101191: [InstCombine] Fully disable select to and/or i1 folding

2021-04-30 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: llvm/test/Transforms/PhaseOrdering/unsigned-multiply-overflow-check.ll:20 +; FIXME: noundef should be attached to args define i1 @will_not_overflow(i64 %arg, i64 %arg1) { aqjune wrote: > nikic wrote: > > spatel wrote:

[PATCH] D101191: [InstCombine] Fully disable select to and/or i1 folding

2021-04-27 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: llvm/test/Transforms/PhaseOrdering/unsigned-multiply-overflow-check.ll:20 +; FIXME: noundef should be attached to args define i1 @will_not_overflow(i64 %arg, i64 %arg1) { aqjune wrote: > aqjune wrote: > > spatel wrote

[PATCH] D101191: [InstCombine] Fully disable select to and/or i1 folding

2021-04-27 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: llvm/test/Transforms/PhaseOrdering/unsigned-multiply-overflow-check.ll:20 +; FIXME: noundef should be attached to args define i1 @will_not_overflow(i64 %arg, i64 %arg1) { Any ideas about what it will take to get those

[PATCH] D101191: [InstCombine] Fully disable select to and/or i1 folding

2021-04-27 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: llvm/test/Transforms/PhaseOrdering/X86/vector-reductions.ll:282 +; FIXME: this should be vectorized define i1 @cmp_lt_gt(double %a, double %b, double %c) { I don't think we need to worry about regressing this. It's not

[PATCH] D99898: [clang, test] Fix use of undef FileCheck var

2021-04-06 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: clang/test/CodeGen/libcalls.c:127 // CHECK-NO-DAG: attributes [[NUW_RN]] = { nounwind readnone{{.*}} } +// CHECK-YES-NOT: attributes [[NUW_RN]] = { nounwind readnone{{.*}} } // CHECK-NO-DAG: attributes [[NUW_RNI]] = { nofree nosync noun

[PATCH] D98945: [BranchProbability] move options for 'likely' and 'unlikely'

2021-03-20 Thread Sanjay Patel via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rGee8b53815ddf: [BranchProbability] move options for 'likely' and 'unlikely' (authored by spatel). Herald added a project: clang. Herald added a subscriber: cfe-commits. Repository: rG LLVM Github Monorep

[PATCH] D93179: [X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506)

2021-02-14 Thread Sanjay Patel via Phabricator via cfe-commits
spatel accepted this revision. spatel added a comment. This revision is now accepted and ready to land. Thanks! LGTM. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D93179/new/ https://reviews.llvm.org/D93179

[PATCH] D93179: [X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506)

2021-02-14 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: clang/lib/Headers/avx512fintrin.h:9305 + * 1. The elements are reassociable when using fadd/fmul intrinsics; + * 2. There's no nan and signed zero in the elements when using fmin/max + intrinsics; pengfei wrote: > spatel

[PATCH] D93179: [X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506)

2021-02-10 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: clang/lib/Headers/avx512fintrin.h:9305 + * 1. The elements are reassociable when using fadd/fmul intrinsics; + * 2. There's no nan and signed zero in the elements when using fmin/max + intrinsics; If I understand correctl

[PATCH] D96231: [X86] Always assign reassoc flag for intrinsics *reduce_add/mul_ps/pd.

2021-02-09 Thread Sanjay Patel via Phabricator via cfe-commits
spatel accepted this revision. spatel added a comment. This revision is now accepted and ready to land. LGTM Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D96231/new/ https://reviews.llvm.org/D96231 ___ c

[PATCH] D96231: [X86] Always assign reassoc flag for intrinsics *reduce_add/mul_ps/pd.

2021-02-08 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: clang/lib/Headers/avx512fintrin.h:9352 static __inline__ double __DEFAULT_FN_ATTRS512 _mm512_reduce_add_pd(__m512d __W) { return __builtin_ia32_reduce_fadd_pd512(0.0, __W); } Ah - this is where the +0.0 is specified

[PATCH] D96231: [X86] Always assign reassoc flag for intrinsics *reduce_add/mul_ps/pd.

2021-02-08 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:13829 CGM.getIntrinsic(Intrinsic::vector_reduce_fadd, Ops[1]->getType()); +Builder.getFastMathFlags().setAllowReassoc(true); return Builder.CreateCall(F, {Ops[0], Ops[1]}); ---

[PATCH] D93923: Use unary CreateShuffleVector if possible

2020-12-30 Thread Sanjay Patel via Phabricator via cfe-commits
spatel accepted this revision. spatel added a comment. This revision is now accepted and ready to land. See inline comments to avoid bot failures - otherwise, LGTM. Thanks for the cleanup! Comment at: llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp:3193 IRBuilder<

[PATCH] D93793: [IR] Let IRBuilder's CreateVectorSplat/CreateShuffleVector use poison as placeholder

2020-12-29 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: llvm/include/llvm/IR/IRBuilder.h:2527 /// Create a unary shuffle. The second vector operand of the IR instruction /// is undefined. Value *CreateShuffleVector(Value *V, ArrayRef Mask, update code comment: undefin

[PATCH] D93586: [InstCombine] use poison as placeholder for undemanded elems

2020-12-22 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D93586#2467248 , @aqjune wrote: >> There are 792 files in llvm/test, most of which are in test/Transform (119) >> and test/CodeGen(655). >> The transition will be swiftly done (if there's no other issue hopefully) by >> the nex

[PATCH] D93586: [InstCombine] use poison as placeholder for undemanded elems

2020-12-20 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added reviewers: nlopes, regehr, RKSimon, zhengyangl, nikic, hfinkel. spatel added a comment. Thank you for working on this! Looking back at the bug comments (and adding reviewers based on those comments), this is a step towards killing undef that has been discussed for a long time now. :

[PATCH] D93179: [X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506)

2020-12-14 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. If we're going by existing behavior/compatibility, gcc/icc use packed ops too: https://godbolt.org/z/9jEhaW ...so there's an implicit 'nnan nsz' in these intrinsics (and that should be documented in the header file (and file a bug for Intel's page at https://software.inte

[PATCH] D87188: [InstCombine] Canonicalize SPF to abs intrinc

2020-12-10 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D87188#2445506 , @lebedev.ri wrote: > Partial rebase (without updating test coverage) Thanks for reducing! If I'm seeing it correctly, the codegen looks fine - the difference is in the IR icmp predicate changing from `slt` to `

[PATCH] D88712: [CGBuiltin] Respect asm labels and redefine_extname for builtins with specialized emitting

2020-11-24 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D88712#2413877 , @venkataramanan.kumar.llvm wrote: > In D88712#2413688 , @spatel wrote: > >> In D88712#2412874 , >> @venkataramanan.kumar.llvm wr

[PATCH] D88712: [CGBuiltin] Respect asm labels and redefine_extname for builtins with specialized emitting

2020-11-24 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D88712#2412874 , @venkataramanan.kumar.llvm wrote: > In D88712#2412366 , @MaskRay wrote: > >> In D88712#2411841 , >> @venkataramanan.kumar.llvm w

[PATCH] D87188: [InstCombine] Canonicalize SPF to abs intrinc

2020-11-10 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D87188#2348343 , @spatel wrote: > In D87188#2348326 , @nikic wrote: > >> Reopening this so we don't forget... >> >> I believe @spatel is working on the cost modelling. I did not have much

[PATCH] D87188: [InstCombine] Canonicalize SPF to abs intrinc

2020-10-22 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D87188#2348326 , @nikic wrote: > Reopening this so we don't forget... > > I believe @spatel is working on the cost modelling. I did not have much luck > tracking down the miscompile, at least did not spot anything incriminating

[PATCH] D88154: Initial support for vectorization using Libmvec (GLIBC vector math library).

2020-10-22 Thread Sanjay Patel via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG57cdc52c4df0: Initial support for vectorization using Libmvec (GLIBC vector math library) (authored by venkataramanan.kumar.llvm, committed by spatel). Repository: rG LLVM Github Monorepo CHANGES SINC

[PATCH] D88154: Initial support for vectorization using Libmvec (GLIBC vector math library).

2020-10-20 Thread Sanjay Patel via Phabricator via cfe-commits
spatel accepted this revision. spatel added a comment. This revision is now accepted and ready to land. LGTM. I'm not sure why we need 3 different test files to test the very similar patterns - just programmer preference? Wait a day or so to commit in case anyone else has comments. CHANGES SIN

[PATCH] D88154: Initial support for vectorization using Libmvec (GLIBC vector math library).

2020-10-16 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a subscriber: aeubanks. spatel added a comment. In D88154#2328312 , @venkataramanan.kumar.llvm wrote: > I can add few more float type tests with meta data for VF=8. please let me > know your suggestions. I may be missing some subtlety of

[PATCH] D88154: Initial support for vectorization using Libmvec (GLIBC vector math library).

2020-10-13 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: llvm/test/Transforms/LoopVectorize/X86/libm-vector-calls-finite.ll:9 +define void @exp_f32(float* nocapture %varray) { +; CHECK-LABEL: @exp_f32( +; CHECK-NEXT: entry: fpetrogalli wrote: > I think you are over-testing her

[PATCH] D88154: Initial support for vectorization using Libmvec (GLIBC vector math library).

2020-10-12 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D88154#2325713 , @venkataramanan.kumar.llvm wrote: > As per review comments from Sanjay, updated the test case to use metadata. > Also autogenerated the checks in the test cases using > llvm/utils/update_test_checks.py. Tha

[PATCH] D88154: Initial support for vectorization using Libmvec (GLIBC vector math library).

2020-10-06 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D88154#2310653 , @venkataramanan.kumar.llvm wrote: > In D88154#2290205 , @abique wrote: > >> Looks good to me. >> Regarding the tests, it seems that you check if auto-vectorization takes

[PATCH] D88238: [APFloat] convert SNaN to QNaN in convert() and raise Invalid signal

2020-10-01 Thread Sanjay Patel via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG149f5b573c79: [APFloat] convert SNaN to QNaN in convert() and raise Invalid signal (authored by spatel). Herald added subscribers: cfe-commits, jrtc27. Herald added a project: clang. Repository: rG LLVM

[PATCH] D88664: [AST] do not error on APFloat invalidOp in default mode

2020-10-01 Thread Sanjay Patel via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG686eb0d8ded9: [AST] do not error on APFloat invalidOp in default mode (authored by spatel). Herald added a project: clang. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://review

[PATCH] D88664: [AST] do not error on APFloat invalidOp in default mode

2020-10-01 Thread Sanjay Patel via Phabricator via cfe-commits
spatel created this revision. spatel added reviewers: rjmccall, efriedma, sepavloff. Herald added a subscriber: mcrosier. spatel requested review of this revision. If FP exceptions are ignored, we should not error out of compilation just because APFloat indicated an exception. This is required a

[PATCH] D87188: [InstCombine] Canonicalize SPF to abs intrinc

2020-09-18 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D87188#2281957 , @sanwou01 wrote: > I know this has already been reverted but just FYI that I've bisected a ~2% > regression in SPEC2017 x264_r on AArch64 to this commit. Presumably this is > due to the extra unrolling / cost m

[PATCH] D87101: [X86] Update SSE/AVX ABS intrinsics to emit llvm.abs.* (PR46851)

2020-09-04 Thread Sanjay Patel via Phabricator via cfe-commits
spatel accepted this revision. spatel added a comment. This revision is now accepted and ready to land. LGTM - I think the odds of code that is using intrinsic vector abs inter-mixing with auto-vectorized code using the abs idiom are small. I noticed one other LLVM codegen test that might want t

[PATCH] D85787: [InstCombine] Aggregate reconstruction simplification (PR47060)

2020-08-16 Thread Sanjay Patel via Phabricator via cfe-commits
spatel accepted this revision. spatel added a comment. This revision is now accepted and ready to land. This seems similar in spirit to the recursive element tracking that we do in collectShuffleElements() in this file or foldIdentityShuffles() in InstSimplify, but a bit more complicated (becaus

[PATCH] D85787: [InstCombine] Aggregate reconstruction simplification (PR47060)

2020-08-13 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D85787#2214038 , @lebedev.ri wrote: > @ reviewers - i'm not so much interested in deep code/algo review, > but more like in the general direction disscussion, like, is this okay for > instcombine? :) The test diffs look great,

[PATCH] D78478: [UpdateTestChecks] Add UTC_ARGS support for update_{llc,cc}_test_checks.py

2020-07-02 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D78478#2128631 , @xbolva00 wrote: > >> UTC_ARGS (I can't help associating it with Coordinated Universal Time). > > Me too. Any suggestions for new name? TCU_ARGS? I also reflexively think of universal time when reading "UTC". I

[PATCH] D80584: [utils] change update_test_checks.py use of 'TMP' value names

2020-06-01 Thread Sanjay Patel via Phabricator via cfe-commits
spatel closed this revision. spatel added a comment. In D80584#2066319 , @jdoerfert wrote: > First, I think the warning is strictly a good thing. Thanks for keeping that > in! Sure - we've been missing that for a long time. > I don't think the options

[PATCH] D80584: [utils] change update_test_checks.py use of 'TMP' value names

2020-06-01 Thread Sanjay Patel via Phabricator via cfe-commits
spatel reopened this revision. spatel added a comment. This revision is now accepted and ready to land. Effectively reverted (but still have the name conflict warning at least) with: rGe5b877275673 If we can change instnamer,

[PATCH] D80584: [utils] change update_test_checks.py use of 'TMP' value names

2020-06-01 Thread Sanjay Patel via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rGe5b877275673: [utils] change default nameless value to "TMP" (authored by spatel). Herald added a project: clang. Herald added a subscriber: cfe-commits. Changed prior to commit: https://reviews.llvm.or

[PATCH] D80251: [X86] Update some av512 shift intrinsics to use "unsigned int" parameter instead of int to match Intel documentaiton

2020-05-20 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. Is it possible to fix those other 5 in the Intel docs for consistency, or is there some functional reason that those are different? CHANGES SINCE LAST ACTION https://reviews.llvm.org/D80251/new/ https://reviews.llvm.org/D80251 __

[PATCH] D79472: [X86] Remove support some inline assembly constraints that are no longer supported by gcc.

2020-05-06 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp:154-156 +BasicBlock::iterator It(New); +while (isa(*It) || isa(*It)) + ++It; Unrelated diff? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST AC

[PATCH] D72467: Remove "mask" operand from shufflevector.

2020-03-27 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. Looks like D76649 landed and added another IRBuilder call that needs updating: [2974/5437] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86PartialReduction.cpp.o FAILED: lib/Target/X86/CMakeFiles/LLVMX86CodeGen.d

[PATCH] D72675: [Clang][Driver] Fix -ffast-math/-ffp-contract interaction

2020-03-12 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D72675#1919793 , @mibintc wrote: > In D72675#1919685 , @spatel wrote: > > > In D72675#1917844 , @wristow wrote: > > > > > Revisiting this patch. I

[PATCH] D72675: [Clang][Driver] Fix -ffast-math/-ffp-contract interaction

2020-03-12 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a subscriber: mibintc. spatel added a comment. In D72675#1917844 , @wristow wrote: > Revisiting this patch. I think that before fixing the `-ffp-contract=off` > problem I originally raised here, there are two questions that have come up > i

[PATCH] D75467: [instcombine] remove fsub to fneg hacks; only emit fneg

2020-03-10 Thread Sanjay Patel via Phabricator via cfe-commits
spatel accepted this revision. spatel added a comment. LGTM - see inline for 1 more comment nit. Comment at: llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp:2132 // Subtraction from -0.0 is the canonical form of fneg. + // fsub -0.0, X ==> fneg X Th

[PATCH] D75467: [instcombine] remove fsub to fneg hacks; only emit fneg

2020-03-10 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: llvm/include/llvm/IR/PatternMatch.h:622 inline bind_ty m_Instruction(Instruction *&I) { return I; } /// Match a binary operator, capturing it if we match. +inline bind_ty m_UnOp(UnaryOperator *&I) { return I; } Match a

[PATCH] D74500: clang: Treat ieee mode as the default for denormal-fp-math

2020-03-02 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: llvm/include/llvm/ADT/FloatingPointMode.h:48 - DenormalMode() = default; - DenormalMode(DenormalModeKind Out, DenormalModeKind In) : + constexpr DenormalMode() = default; + constexpr DenormalMode(DenormalModeKind Out, DenormalModeKi

[PATCH] D69979: clang: Guess at some platform FTZ/DAZ default settings

2020-02-10 Thread Sanjay Patel via Phabricator via cfe-commits
spatel accepted this revision. spatel added a comment. This revision is now accepted and ready to land. LGTM - the PS4 behavior was confirmed off-list. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D69979/new/ https://reviews.llvm.org/D69979 ___

[PATCH] D72467: Remove "mask" operand from shufflevector.

2020-01-24 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D72467#1839172 , @efriedma wrote: > In D72467#1838591 , @spatel wrote: > > > LGTM, but should get a 2nd opinion since I'm not familiar with some of the > > parts. > > > Any specific part

[PATCH] D72467: Remove "mask" operand from shufflevector.

2020-01-24 Thread Sanjay Patel via Phabricator via cfe-commits
spatel accepted this revision. spatel added a comment. This revision is now accepted and ready to land. LGTM, but should get a 2nd opinion since I'm not familiar with some of the parts. Also, please update the related LangRef text for shufflevector in or alongside this patch. Could add some asc

[PATCH] D72467: Remove "mask" operand from shufflevector.

2020-01-23 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: llvm/include/llvm/IR/Constants.h:1220 + /// Assert that this is an shufflevector and return the mask. See class + /// ShuffleVectorInst for a description of the mask representation. an shufflevector -> a shufflevector

[PATCH] D72675: [Clang][Driver] Fix -ffast-math/-ffp-contract interaction

2020-01-20 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D72675#1827893 , @wristow wrote: > How to handle this seems like an implementation question. The code here is > just deciding whether or not we intend to pass "-ffast-math" to cc1 (it isn't > directly defining `__FAST_MATH__`)

[PATCH] D72675: [Clang][Driver] Fix -ffast-math/-ffp-contract interaction

2020-01-17 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added reviewers: efriedma, scanon, arsenm, echristo, RKSimon. spatel added a comment. Herald added a subscriber: wdng. Adding potential FP reviewers for question of gcc- (potentially-buggy-) compatibility. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D72675/new/ https://reviews.

[PATCH] D72675: [Clang][Driver] Fix -ffast-math/-ffp-contract interaction

2020-01-17 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. This follows the reasoning that we discussed in the earlier discussion, but after re-reading the gcc comment in particular, I'm wondering whether this is what we really want to do... If __FAST_MATH__ is purely here for compatibility with gcc, then should we mimic gcc beh

[PATCH] D72675: Fix -ffast-math/-ffp-contract interaction

2020-01-16 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D72675#1824659 , @wristow wrote: > This all sounds good to me. > > So to make sure we're all on the same page, my understanding is that the plan > forward is: > > 1. Make the Clang change first (including adding another pair of

[PATCH] D72675: Fix -ffast-math/-ffp-contract interaction

2020-01-16 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D72675#1823227 , @mcberg2017 wrote: > We crossed that bridge internally at Apple a while ago, meaning I have some > code debt for cleaning up open source for fma formation that uses contract > and reassoc differently than we do

[PATCH] D72675: Fix -ffast-math/-ffp-contract interaction

2020-01-15 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: llvm/test/CodeGen/PowerPC/fmf-propagation.ll:201-203 ; fma(X, 7.0, X * 42.0) --> X * 49.0 -; This is the minimum FMF needed for this transform - the FMA allows reassociation. +; This is the minimum FMF needed for this transform - the 'c

[PATCH] D72028: [CodeGen] Emit conj/conjf/confjl libcalls as fneg instructions if possible.

2019-12-31 Thread Sanjay Patel via Phabricator via cfe-commits
spatel accepted this revision. spatel added a comment. This revision is now accepted and ready to land. Code change LGTM - see inline for a question about the testing. Comment at: clang/test/CodeGen/complex-libcalls.c:115-118 +// NO__ERRNO-NOT: .conj +// NO__ERRNO-NOT: @conj +/

[PATCH] D69979: clang: Guess at some platform FTZ/DAZ default settings

2019-12-11 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added inline comments. Comment at: clang/test/Driver/default-denormal-fp-math.c:7 +// crtfastmath enables ftz and daz +// RUN: %clang -### -target x86_64-unknown-linux-gnu -ffast-math --sysroot=%S/Inputs/basic_linux_tree -c %s -v 2>&1 | FileCheck -check-prefix=CHECK-ZERO

[PATCH] D69979: clang: Guess at some platform FTZ/DAZ default settings

2019-12-02 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a subscriber: andreadb. spatel added inline comments. Comment at: clang/include/clang/Driver/ToolChain.h:580 + /// This checks for presence of the -Ofast, -ffast-math or -funsafe-math flags. + virtual bool FastMathRuntimeIsAvailable( +const llvm::opt::ArgList

[PATCH] D70488: [InstCombine] Infer fast math flags on fadd/fsub/fmul/fcmp

2019-11-20 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added reviewers: cameron.mcinally, mcberg2017, arsenm. spatel added a comment. Herald added a subscriber: wdng. I like the idea, but I'd be more comfortable reviewing the diffs in stages, so we know that the test coverage for the value tracking calls is good. So I'd prefer if we split thi

[PATCH] D69979: clang: Guess at some platform FTZ/DAZ default settings

2019-11-18 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D69979#1749198 , @arsenm wrote: > I just posted the test I wrote here: https://github.com/arsenm/subnormal_test Thanks. I tried compiling with gcc (can't trust clang since it doesn't honor #pragma STDC FENV_ACCESS ON?). And r

[PATCH] D69979: clang: Guess at some platform FTZ/DAZ default settings

2019-11-14 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D69979#1740294 , @arsenm wrote: > In D69979#1738099 , @craig.topper > wrote: > > > I checked Redhat 7.4 that's on the server I'm using for work. And I had a > > coworker check his Ubuntu

[PATCH] D69979: clang: Guess at some platform FTZ/DAZ default settings

2019-11-08 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a reviewer: cameron.mcinally. spatel added a comment. Also, I may have missed some discussions. Does this patch series replace the proposal to add instruction-level FMF for denorms? http://lists.llvm.org/pipermail/llvm-dev/2019-September/135183.html Ie, did we decide that a function

[PATCH] D69979: clang: Guess at some platform FTZ/DAZ default settings

2019-11-08 Thread Sanjay Patel via Phabricator via cfe-commits
spatel added a comment. In D69979#1738099 , @craig.topper wrote: > I checked Redhat 7.4 that's on the server I'm using for work. And I had a > coworker check his Ubuntu 18.04 system with this program. And both systems > printed 1f80 as the value of MXCS

  1   2   3   >