Re: [RFC] D support for S/390

2019-03-19 Thread Robin Dapp
> This would mean that StructFlags and ClassFlags will also both have a > wrong value as well. Yes, can confirm that m_flags = 0 (instead of 1) for a struct containing a pointer. > If there's a compiler/library discrepancy, the compiler should be > adjusted to write out the value at the correct s

Re: [RFC] D support for S/390

2019-03-20 Thread Robin Dapp
Hi, the unicode tables in std.internal.unicode_tables are apparently auto generated and loaded at (libphobos) compile time. They are also in little endian format. Is the tool to generate them available somewhere? I wanted to start converting them to little endian before loading but this will pr
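
For readers unfamiliar with the endianness issue raised here, the following is a minimal sketch in C (not the D code from libphobos, and not the fix that was eventually applied). It assumes 32-bit table entries; `load_le32` and `bswap32` are hypothetical helper names chosen for illustration.

```c
#include <stdint.h>

/* Read a 32-bit value stored in little-endian byte order.
   Works on any host, big or little endian, because it only
   depends on byte positions, not on the host's memory layout. */
uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Reverse the byte order of an already-loaded 32-bit value.
   On a big-endian host this is what a little-endian word needs
   after it has been loaded with a plain word-sized load. */
uint32_t bswap32(uint32_t v)
{
    return (v >> 24)
         | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u)
         | (v << 24);
}
```

Reading byte by byte avoids having to know the host's endianness at all, which may be preferable to swapping at load time if the tables are only decoded once.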

Re: [RFC] D support for S/390

2019-03-22 Thread Robin Dapp
Hi, > Are the values inside the tables the problem? Or just some of the > helper functions/templates that interact with them to generate the > static data? > > If the latter, then a rebuild of the files may not be necessary. I managed to get this to work without rebuilding the files. After chec

[PATCH] S/390: Add arch13 pipeline description

2019-04-10 Thread Robin Dapp
Hi, this patch adds the pipeline description and the cpu model number for arch13. Bootstrapped and regtested on s390x. Regards Robin -- gcc/ChangeLog: 2019-04-10 Robin Dapp * config/s390/8561.md: New file. * config/s390/driver-native.c (s390_host_detect_local_cpu): Add

Re: [RFC] D support for S/390

2019-04-11 Thread Robin Dapp
Hi Rainer, > This will occur on any 32-bit target. The following patch (using > ssize_t instead) allowed the code to compile: thanks, included your fix and attempted a more generic version of the 186 test. I also continued debugging some fails further: - Most of the MurmurHash fails are simply

Re: [PATCH] S/390: Fix PR89952 incorrect CFI

2019-04-18 Thread Robin Dapp
Hi, > + Establish an ANTI dependency between r11 and r15 restores from FPRs > + to prevent the instructions scheduler from reordering them since > + this would break CFI. No further handling in the sched_reorder > + hook is required since the r11 and r15 restore will never appear in > +

Re: [RFC] D support for S/390

2019-04-18 Thread Robin Dapp
Hi Rainer, > I noticed you missed one piece of Iain's typeinfo.cc patch, btw.: > > diff --git a/gcc/d/typeinfo.cc b/gcc/d/typeinfo.cc > --- a/gcc/d/typeinfo.cc > +++ b/gcc/d/typeinfo.cc > @@ -886,7 +886,7 @@ public: > if (cd->isCOMinterface ()) > flags |= ClassFlags::isCOMclass; >

Re: [RFC] D support for S/390

2019-04-24 Thread Robin Dapp
ll: all-am +PWD_COMMAND = $${PWDCMD-pwd} .SUFFIXES: $(srcdir)/Makefile.in: @MAINTAINER_MODE_TRUE@ $(srcdir)/Makefile.am $(am__configure_deps) Regards Robin -- gcc/d/ChangeLog: 2019-04-24 Robin Dapp * typeinfo.cc (create_typeinfo): Set fields with proper length. gcc/testsuite/Change

Re: [RFC] D support for S/390

2019-04-29 Thread Robin Dapp
> Robin, have you been testing with --disable-multilib or something > similar? yes, I believe so... stupid mistake :( Thanks for fixing it so quickly.

[RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-05-09 Thread Robin Dapp
Hi, while trying to improve s390 code generation for rotate and shift I noticed superfluous subregs for shift count operands. In our backend we already have quite cumbersome patterns that would need to be duplicated (or complicated further by more subst patterns) in order to get rid of the subregs
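
As background for the shift-count discussion, here is a small C sketch of why an explicit mask on the count can be redundant. It assumes a target whose 64-bit shift instruction only looks at the low six bits of the count; the function names are made up for illustration and nothing here is taken from the s390 backend.

```c
#include <stdint.h>

/* Source-level shift with an explicit mask, as code might be
   written to keep the shift count within the defined range. */
uint64_t shift_left_masked(uint64_t x, uint64_t count)
{
    return x << (count & 63);
}

/* Model of a hardware shift that ignores all but the low six
   bits of the count register (assumption for this sketch). */
uint64_t shift_left_hw_model(uint64_t x, uint64_t count_reg)
{
    unsigned low6 = (unsigned)(count_reg % 64);
    return x << low6;
}
```

Under that assumption both functions agree for every count, so the masking operation (or a subreg that merely narrows the count) carries no information the instruction would use.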

Re: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-05-10 Thread Robin Dapp
>> Bit tests on x86 also truncate [1], if the bit base operand specifies >> a register, and we don't use BT with a memory location as a bit base. >> I don't know what is referred with "(real or pretended) bit field >> operations" in the documentation for SHIFT_COUNT_TRUNCATED: >> >> However, o

Re: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-05-15 Thread Robin Dapp
> It would really help if you could provide testcases which show the > suboptimal code and any analysis you've done. I tried introducing a define_subst pattern that substitutes something one of two other subst patterns already changed. The first subst pattern helps remove a superfluous and on the

[PATCH] S/390: Add -march to test case

2019-05-15 Thread Robin Dapp
Hi, this patch adds -march=z900 to a test case that expects larl for loading a value via the GOT. On z10 and later, lgrl is used which is tested in a new test case. Regards Robin -- gcc/testsuite/ChangeLog: 2019-05-15 Robin Dapp * gcc.target/s390/global-array-element-pic.c: Add

[PATCH] Testsuite: Add s390 exceptions for gen-vect

2019-05-15 Thread Robin Dapp
Hi, this patch changes three gen-vect testcases so they do not expect vectorization of an unaligned access. Vectorization happens regardless, we just ignore misalignment. Regards Robin -- gcc/testsuite/ChangeLog: 2019-05-15 Robin Dapp * gcc.dg/tree-ssa/gen-vect-26.c: Do not

[PATCH] S/390: Implement vector copysign

2019-02-07 Thread Robin Dapp
Hi, this patch implements vector copysign using vector select on S/390. Regtested and bootstrapped on s390x. Regards Robin -- gcc/ChangeLog: 2019-02-07 Robin Dapp * config/s390/vector.md: Implement vector copysign. gcc/testsuite/ChangeLog: 2019-02-07 Robin Dapp

Re: [PATCH] Tree-level fix for PR 69526

2016-08-22 Thread Robin Dapp
extra function for now because I find extract_range_from_binary_expr_1 somewhat lengthy and hard to follow already :) Wouldn't it be better to "separate concerns"/split it up in the long run and merge the functionality needed here at some time? Bootstrapped and reg-tested on s390

Re: [PATCH] Tree-level fix for PR 69526

2016-08-23 Thread Robin Dapp
gah, this + return true; + if (TREE_CODE (t1) != SSA_NAME) should of course be like this + if (TREE_CODE (t1) != SSA_NAME) + return true; in the last patch.

Re: [PATCH] Use RPO order for fwprop iteration

2016-09-02 Thread Robin Dapp
This causes a performance regression in the xalancbmk SPECint2006 benchmark on s390x. At first sight, the produced asm output doesn't look too different but I'll have a closer look. Is the fwprop order supposed to have major performance implications? Regards Robin > This changes it from PRE on t

Re: [PATCH] Tree-level fix for PR 69526

2016-09-05 Thread Robin Dapp
Ping. diff --git a/gcc/gimple-match-head.c b/gcc/gimple-match-head.c index 2beadbc..d66fcb1 100644 --- a/gcc/gimple-match-head.c +++ b/gcc/gimple-match-head.c @@ -39,6 +39,7 @@ along with GCC; see the file COPYING3. If not see #include "internal-fn.h" #include "case-cfn-macros.h" #include "gimp

Re: [PATCH 2/2] S/390: Do not end groups after fallthru edge

2017-10-17 Thread Robin Dapp
n -- gcc/ChangeLog: 2017-10-17 Robin Dapp * config/s390/s390.c (s390_bb_fallthru_entry_likely): New function. (s390_sched_init): Do not reset s390_sched_state if we entered the current basic block via a fallthru edge and all others are very unlikely. di

Re: [PATCH 2/2] S/390: Do not end groups after fallthru edge

2017-10-18 Thread Robin Dapp
> Preserving the sched state across basic blocks for your case works only if > the BBs are traversed > with the fall through edges coming first. Is that the case? We probably > should have a description > for s390_last_sched_state stating this. Committed as attached with an additional comment an

Re: [PATCH 2/3] Simplify wrapped binops

2017-07-05 Thread Robin Dapp
> While the initialization value doesn't matter (wi::add will overwrite it) > better initialize both to false ;) Ah, you mean because we want to > transform only if get_range_info returned VR_RANGE. Indeed somewhat > unintuitive (but still the best variant for now). > so I'm still missing a comm

Re: [PATCH 2/3] Simplify wrapped binops

2017-07-05 Thread Robin Dapp
[3/3] Tests -- gcc/testsuite/ChangeLog: 2017-07-05 Robin Dapp * gcc.dg/wrapped-binop-simplify-signed-1.c: New test. * gcc.dg/wrapped-binop-simplify-signed-2.c: New test. * gcc.dg/wrapped-binop-simplify-unsigned-1.c: New test. * gcc.dg/wrapped-binop-simplify

[PATCH] Fix PR81362: Vector peeling

2017-07-12 Thread Robin Dapp
d the body_cost_vec parameter which is not used elsewhere. Regards Robin -- gcc/ChangeLog: 2017-07-12 Robin Dapp * (vect_enhance_data_refs_alignment): Remove body_cost_vec from _vect_peel_extended_info. tree-vect-data-refs.c (vect_peeling_hash_get_lowest_cost): D

[RFC] If conversion min/max search, costs and problems

2017-07-25 Thread Robin Dapp
Hi, recently I wondered why a snippet like the following is not being if-converted at all on s390: int foo (int *a, unsigned int n) { int min = 99; int bla = 0; for (int i = 0; i < n; i++) { if (a[i] < min) { min = a[i]; bla = 1; } }

Re: [RFC] If conversion min/max search, costs and problems

2017-07-26 Thread Robin Dapp
> Do you have an example where wrong code is generated through the > noce_convert_multiple_sets_p path (with or without bodged costs)? > > Both AArch64 and x86-64 reject your testcase along this codepath because > of the constant set of 1. If we work around that by setting bla = n rather > than bl

[PATCH, committed] Add myself to MAINTAINERS

2017-07-31 Thread Robin Dapp
ChangeLog: 2017-07-31 Robin Dapp * MAINTAINERS (write after approval): Add myself. Index: MAINTAINERS === --- MAINTAINERS (revision 250740) +++ MAINTAINERS (working copy) @@ -356,6 +356,7 @@ Lawrence Crowl Ian Dall

Re: [RFC] S/390: Alignment peeling prolog generation

2017-05-08 Thread Robin Dapp
> So the new part is the last point? There's a lot of refactoring in 3/3 that > makes it hard to see what is actually changed ... you need to resist > in doing this, it makes review very hard. The new part is actually spread across the three last "-"s. Attached is a new version of [3/3] split u

[PATCH 3/4] Vect peeling cost model

2017-05-08 Thread Robin Dapp
gcc/ChangeLog: 2017-05-08 Robin Dapp * tree-vect-data-refs.c (vect_peeling_hash_choose_best_peeling): Return peel info. (vect_enhance_data_refs_alignment): Compute full costs when peeling for unknown alignment, compare to costs for peeling for known

[PATCH 4/4] Vect peeling cost model

2017-05-08 Thread Robin Dapp
gcc/ChangeLog: 2017-05-08 Robin Dapp * tree-vect-data-refs.c (vect_peeling_hash_get_lowest_cost): Remove unused variable. (vect_enhance_data_refs_alignment): Compare best peelings costs to doing no peeling and choose no peeling if equal. diff --git a

Re: [PATCH] Tree-level fix for PR 69526

2017-05-09 Thread Robin Dapp
ping.

Re: [RFC] S/390: Alignment peeling prolog generation

2017-05-11 Thread Robin Dapp
Included the requested changes in the patches (to follow). I removed the alignment count check now altogether. > I'm not sure why you test for unlimited_cost_model here as I said > elsewhere I'm not sure > what not cost modeling means for static decisions. The purpose of > unlimited_cost_model >

[PATCH 1/5] Vect peeling cost model

2017-05-11 Thread Robin Dapp
gcc/ChangeLog: 2017-05-11 Robin Dapp * tree-vectorizer.h (dr_misalignment): Introduce DR_MISALIGNMENT_UNKNOWN. * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Refactoring. (vect_update_misalignment_for_peel): Use DR_MISALIGNMENT_UNKNOWN

[PATCH 2/5] Vect peeling cost model

2017-05-11 Thread Robin Dapp
gcc/ChangeLog: 2017-05-11 Robin Dapp * tree-vect-data-refs.c (vect_update_misalignment_for_peel): Change comment and rename variable. (vect_get_peeling_costs_all_drs): New function. (vect_peeling_hash_get_lowest_cost): Use. (vect_peeling_supportable

[PATCH 3/5] Vect peeling cost model

2017-05-11 Thread Robin Dapp
gcc/ChangeLog: 2017-05-11 Robin Dapp * tree-vect-data-refs.c (vect_peeling_hash_choose_best_peeling): Return peeling info and set costs to zero for unlimited cost model. (vect_enhance_data_refs_alignment): Also inspect all datarefs with unknown

[PATCH 4/5] Vect peeling cost model

2017-05-11 Thread Robin Dapp
gcc/ChangeLog: 2017-05-11 Robin Dapp * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Remove check for supportable_dr_alignment, compute costs for doing no peeling at all, compare to the best peeling costs so far and do no peeling if cheaper. diff

[PATCH 5/5] Vect peeling cost model

2017-05-11 Thread Robin Dapp
gcc/testsuite/ChangeLog: 2017-05-11 Robin Dapp * gcc.target/s390/vector/vec-nopeel-2.c: New test. diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-2.c b/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-2.c new file mode 100644 index 000..9b67793 --- /dev/null +++ b/gcc

[PATCH 4/5 v2] Vect peeling cost model

2017-05-11 Thread Robin Dapp
Included the workaround for SLP now. With it, testsuite is clean on x86 as well. gcc/ChangeLog: 2017-05-11 Robin Dapp * tree-vect-data-refs.c (vect_get_data_access_cost): Workaround for SLP handling. (vect_enhance_data_refs_alignment): Remove check for

Re: [PATCH] Tree-level fix for PR 69526

2017-05-18 Thread Robin Dapp
> Hmm, won't (uint32_t + uint32_t-CST) doesn't overflow be sufficient > condition for such transformation? Yes, in principle this should suffice. What we're actually looking for is something like a "proper" (or no) overflow, i.e. an overflow in both min and max of the value range. In (a + cst1

[PATCH 1/3] Simplify wrapped binops

2017-05-18 Thread Robin Dapp
This tries to fold unconditionally and fixes some test cases. gcc/ChangeLog: 2017-05-18 Robin Dapp * tree-ssa-propagate.c (substitute_and_fold_dom_walker::before_dom_children): Always try to fold. gcc/testsuite/ChangeLog: 2017-05-18 Robin Dapp * g++.dg/tree

[PATCH 2/3] Simplify wrapped binops

2017-05-18 Thread Robin Dapp
match.pd part of the patch. gcc/ChangeLog: 2017-05-18 Robin Dapp * match.pd: Simplify wrapped binary operations. * tree-vrp.c (extract_range_from_binary_expr_1): Add overflow parameter. (extract_range_from_binary_expr): Likewise. * tree-vrp.h: Export

[PATCH 3/3] Simplify wrapped binops

2017-05-18 Thread Robin Dapp
New testcases. gcc/testsuite/ChangeLog: 2017-05-18 Robin Dapp * gcc.dg/wrapped-binop-simplify-signed-1.c: New test. * gcc.dg/wrapped-binop-simplify-unsigned-1.c: New test. * gcc.dg/wrapped-binop-simplify-unsigned-2.c: New test. diff --git a/gcc/testsuite/gcc.dg

Re: [PATCH 2/3] Simplify wrapped binops

2017-05-18 Thread Robin Dapp
> Any reason to expose tree-vrp.c internal interface here? The function > looks quite expensive. Overflow check can be done by get_range_info > and simple wi::cmp calls. Existing code like in > tree-ssa-loop-niters.c already does that. Also could you avoid using > comma expressions in condition

Re: [PATCH 2/3] Simplify wrapped binops

2017-05-19 Thread Robin Dapp
> I can guess what is happening here. It's a 40 bits unsigned long long > field, (s.b-8) will be like: > _1 = s.b > _2 = _1 + 0xf8 > Also get_range_info returns value range [0, 0xFF] for _1. > You'd need to check if _1(with range [0, 0xFF]) + 0xf8 > overflows agains

[PATCH 0/5 v3] Vect peeling cost model

2017-05-23 Thread Robin Dapp
The last version of the patch series caused some regressions for ppc64. This was largely due to incorrect handling of unsupportable alignment and should be fixed with the new version. p2 and p5 have not changed but I'm posting the whole series again for reference. p1 only changed comment wording,

[PATCH 1/5 v3] Vect peeling cost model

2017-05-23 Thread Robin Dapp
gcc/ChangeLog: 2017-05-23 Robin Dapp * tree-vect-data-refs.c (vect_compute_data_ref_alignment): Create DR_HAS_NEGATIVE_STEP. (vect_update_misalignment_for_peel): Define DR_MISALIGNMENT. (vect_enhance_data_refs_alignment): Use

[PATCH 2/5 v3] Vect peeling cost model

2017-05-23 Thread Robin Dapp
gcc/ChangeLog: 2017-05-23 Robin Dapp * tree-vect-data-refs.c (vect_update_misalignment_for_peel): Rename. (vect_get_peeling_costs_all_drs): Create function. (vect_peeling_hash_get_lowest_cost): Use vect_get_peeling_costs_all_drs

[PATCH 3/5 v3] Vect peeling cost model

2017-05-23 Thread Robin Dapp
gcc/ChangeLog: 2017-05-23 Robin Dapp * tree-vect-data-refs.c (vect_peeling_hash_choose_best_peeling): Return peeling info and set costs to zero for unlimited cost model. (vect_enhance_data_refs_alignment): Also inspect all datarefs with unknown

[PATCH 4/5 v3] Vect peeling cost model

2017-05-23 Thread Robin Dapp
gcc/ChangeLog: 2017-05-23 Robin Dapp * tree-vect-data-refs.c (vect_get_data_access_cost): Workaround for SLP handling. (vect_enhance_data_refs_alignment): Compute costs for doing no peeling at all, compare to the best peeling costs so far and avoid

[PATCH 5/5 v3] Vect peeling cost model

2017-05-23 Thread Robin Dapp
gcc/testsuite/ChangeLog: 2017-05-23 Robin Dapp * gcc.target/s390/vector/vec-nopeel-2.c: New test. diff --git a/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-2.c b/gcc/testsuite/gcc.target/s390/vector/vec-nopeel-2.c new file mode 100644 index 000..9b67793 --- /dev/null +++ b

Re: [PATCH 2/5 v3] Vect peeling cost model

2017-05-24 Thread Robin Dapp
> Not sure I've understood the series TBH, but is the npeel == vf / 2 > there specifically for the "unknown number of peels" case? How do > we distinguish that from the case in which the number of peels is > known to be vf / 2 at compile time? Or have I missed the point > completely? (probably ye

Re: [PATCH 0/5 v3] Vect peeling cost model

2017-05-24 Thread Robin Dapp
but the old series itself (-p3) doesn't apply to trunk anymore (because of the change in vect_enhance_data_refs_alignment). Regards Robin -- gcc/ChangeLog: 2017-05-24 Robin Dapp * tree-vect-data-refs.c (vect_get_peeling_costs_all_drs): Introduce unknown_misalignment

Re: [PATCH 4/5 v3] Vect peeling cost model

2017-05-31 Thread Robin Dapp
> Since this commit (r248678), I've noticed regressions on some arm targets. > Executed from: gcc.dg/tree-ssa/tree-ssa.exp > gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect "Alignment > of access forced using peeling" 1 > gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect > "

Re: [PATCH 0/5 v3] Vect peeling cost model

2017-06-06 Thread Robin Dapp
> Patch 6 breaks no-vfa-vect-57.c on powerpc. Which CPU model (power6/7/8?) and which compile options (-maltivec/ -mpower8-vector?) have been used for running and compiling the test? As discussed in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80925 this has an influence on the cost function and

Re: [PATCH] Tree-level fix for PR 69526

2017-01-10 Thread Robin Dapp
Perhaps I'm still missing how some cases are handled or not handled, sorry for the noise. > I'm not sure there is anything to "interpret" -- the operation is unsigned > and overflow is when the operation may wrap around zero. There might > be clever ways of re-writing the expression to > (uint64_

Re: [PATCH] Tree-level fix for PR 69526

2017-01-16 Thread Robin Dapp
Ping. To put it shortly, I'm not sure how to differentiate between: example range of a: [3,3] (ulong)(a + UINT_MAX) + 1 --> (ulong)(a) + (ulong)(-1 + 1), sign extend example range of a: [0,0] (ulong)(a + UINT_MAX) + 1 --> (ulong)(a) + (ulong)(UINT_MAX + 1), no sign extend In this case, there is
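
The two example ranges can be checked with a short C sketch. It uses fixed-width types (`uint32_t` for the inner type, `uint64_t` for `ulong`), which is an assumption about the intended widths; the function names are hypothetical and the code is not part of the patch.

```c
#include <stdint.h>

/* The expression under discussion: a 32-bit add of UINT32_MAX
   (which may wrap), widened to 64 bits, plus 1. */
uint64_t original_expr(uint32_t a)
{
    return (uint64_t)(uint32_t)(a + UINT32_MAX) + 1;
}

/* Constants combined after sign-extending the inner constant,
   i.e. treating UINT32_MAX as -1: (-1 + 1) == 0. */
uint64_t combined_sign_extend(uint32_t a)
{
    return (uint64_t)a + (uint64_t)((int64_t)-1 + 1);
}

/* Constants combined after zero-extending the inner constant:
   (UINT32_MAX + 1) == 2^32 in 64-bit arithmetic. */
uint64_t combined_zero_extend(uint32_t a)
{
    return (uint64_t)a + ((uint64_t)UINT32_MAX + 1);
}
```

For `a == 3` the inner addition wraps and only the sign-extended variant matches the original; for `a == 0` it does not wrap and only the zero-extended variant matches. Which one is correct therefore depends on whether the inner operation overflows over the whole value range.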

[PATCH] Fix s390 testcase vcond-shift

2017-03-27 Thread Robin Dapp
Hi, this patch fixes the vcond shift testcase that failed since setting PARAM_MIN_VECT_LOOP_BOUND in the s390 backend. Regards Robin -- gcc/testsuite/ChangeLog: 2017-03-27 Robin Dapp * gcc.target/s390/vector/vcond-shift.c (void foo): Increase iteration count and assume

[RFC] S/390: Alignment peeling prolog generation

2017-04-11 Thread Robin Dapp
Hi, when looking at various vectorization examples on s390x I noticed that we still peel vf/2 iterations for alignment even though vectorization costs of unaligned loads and stores are the same as normal loads/stores. A simple example is void foo(int *restrict a, int *restrict b, unsigned int n)
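
For context, this is roughly what "peeling for alignment" computes when the misalignment is known: the number of scalar iterations to run before the access becomes vector-aligned. The sketch below is a simplified model, not GCC's implementation; `peel_iterations` is a hypothetical name, and it assumes the address is already aligned to the element size.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of scalar prologue iterations needed so that the access
   at `addr` reaches a `vec_bytes` boundary, with elements of
   `elem_bytes` each. Assumes addr is a multiple of elem_bytes. */
size_t peel_iterations(uintptr_t addr, size_t vec_bytes, size_t elem_bytes)
{
    size_t misalign = (size_t)(addr % vec_bytes);
    if (misalign == 0)
        return 0;                      /* already aligned, nothing to peel */
    return (vec_bytes - misalign) / elem_bytes;
}
```

When the misalignment is not known at compile time, no such number can be computed, which is why a cost model has to fall back on an estimate such as vf/2. If unaligned accesses cost the same as aligned ones, those prologue iterations buy nothing.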

Re: [RFC] S/390: Alignment peeling prolog generation

2017-04-11 Thread Robin Dapp
Hi Bin, > Seems Richi added code like below comparing costs between aligned and > unsigned loads, and only peeling if it's beneficial: > > /* In case there are only loads with different unknown misalignments, > use > peeling only if it may help to align other accesses in the loop

Re: [RFC] S/390: Alignment peeling prolog generation

2017-04-12 Thread Robin Dapp
> Note I was very conservative here to allow store bandwidth starved > CPUs to benefit from aligning a store. > > I think it would be reasonable to apply the same heuristic to the > store case that we only peel for same cost if peeling would at least > align two refs. Do you mean checking if peel

Re: [RFC] S/390: Alignment peeling prolog generation

2017-05-04 Thread Robin Dapp
Hi, > This one only works for known misalignment, otherwise it's overkill. > > OTOH if with some refactoring we can end up using a single cost model > that would be great. That is for the SAME_ALIGN_REFS we want to > choose the unknown misalignment with the maximum number of > SAME_ALIGN_REFS. A

[PATCH 1/3] Vect peeling cost model

2017-05-04 Thread Robin Dapp
Some refactoring and definitions to use for (unknown) DR_MISALIGNMENT, gcc/ChangeLog: 2017-04-26 Robin Dapp * tree-data-ref.h (struct data_reference): Create DR_HAS_NEGATIVE_STEP. * tree-vectorizer.h (dr_misalignment): Define DR_MISALIGNMENT. * tree-vect-data-refs.c

[PATCH 2/3] Vect peeling cost model

2017-05-04 Thread Robin Dapp
Wrap some frequently used snippets in separate functions. gcc/ChangeLog: 2017-04-26 Robin Dapp * tree-vect-data-refs.c (vect_update_misalignment_for_peel): Rename. (vect_get_peeling_costs_all_drs): Create function. (vect_peeling_hash_get_lowest_cost): Use

[PATCH 3/3] Vect peeling cost model

2017-05-04 Thread Robin Dapp
gcc/ChangeLog: 2017-04-26 Robin Dapp * tree-vect-data-refs.c (vect_peeling_hash_get_lowest_cost): Change cost model. (vect_peeling_hash_choose_best_peeling): Return extended peel info. (vect_peeling_supportable): Return peeling status. diff --git a/gcc/tree

[PATCH 1/2] S/390: Handle long-running instructions

2017-10-11 Thread Robin Dapp
This patch introduces balancing of long-running instructions that may clog the pipeline. gcc/ChangeLog: 2017-10-11 Robin Dapp * config/s390/s390.c (NUM_SIDES): New constant. (LONGRUNNING_THRESHOLD): New constant. (LATENCY_FACTOR): New constant

[PATCH 2/2] S/390: Do not end groups after fallthru edge

2017-10-11 Thread Robin Dapp
This patch fixes cases where we start a new group although the previous one has not ended. Regression tested on s390x. gcc/ChangeLog: 2017-10-11 Robin Dapp * config/s390/s390.c (s390_has_ok_fallthru): New function. (s390_sched_score): Temporarily change s390_sched_state

Re: [PATCH] Tree-level fix for PR 69526

2017-02-02 Thread Robin Dapp
I skimmed through the code to see where transformation like (a - 1) -> (a + UINT_MAX) are performed. It seems there are only two places, match.pd (/* A - B -> A + (-B) if B is easily negatable. */) and fold-const.c. In order to be able to reliably know whether to zero-extend or to sign-extend the

[PATCH] S/390: Change 2-byte NOPs

2017-03-01 Thread Robin Dapp
Hi, the following patch changes "nopr %r7" to "nopr %r0" which is advantageous from a hardware perspective. It will only be emitted for hotpatching and should not impact normal code. Bootstrapped and regression tested on s390 and s390x. Regards Robin gcc/ChangeLog: 20

[PATCH] S/390: Disable vectorization for loops with few iterations

2017-03-02 Thread Robin Dapp
ening. Regards Robin [1] https://gcc.gnu.org/ml/gcc/2017-01/msg00234.html [2] https://gcc.gnu.org/ml/gcc-patches/2016-05/msg01562.html -- gcc/ChangeLog: 2017-03-02 Robin Dapp * config/s390/s390.c (s390_option_override_internal): Set PARAM_MIN_VECT_LOOP_BOUND diff --git a/gc

[RFC] 69526 - ivopts candidate strangeness

2016-03-20 Thread Robin Dapp
s390x but did not yet perform bootstrapping and more testing due to the premature nature of the patch. Thanks Robin gcc/ChangeLog: 2016-03-17 Robin Dapp * cfgloop.h (struct GTY): Add second number of iterations * loop-doloop.c (doloop_condition_get): Fix whitespace

[PATCH] Some tree-vect-data-refs.c cleanup

2016-04-13 Thread Robin Dapp
regressions on s390x and amd64. Regards Robin -- gcc/ChangeLog: 2016-04-13 Robin Dapp * tree-vectorizer.h (dr_misalignment): Introduce named DR_MISALIGNMENT constants. (aligned_access_p): Use constants. (known_alignment_for_access_p): Likewise

Re: [PATCH] Tree-level fix for PR 69526

2016-09-20 Thread Robin Dapp
t is usable despite the overflow. Do you think it should be handled differently? Revised version attached. Regards Robin -- gcc/ChangeLog: 2016-09-20 Robin Dapp PR middle-end/69526 This enables combining of wrapped binary operations and fixes the tree level par

Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer

2016-09-26 Thread Robin Dapp
i_p(). ok to commit? Regards Robin -- gcc/ChangeLog: 2016-09-26 Robin Dapp * tree-vect-loop-manip.c (create_intersect_range_checks_index): Add tree_fits_uhwi_p check. diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c index 8203040..8be0c17 100644 --- a/gcc/t

Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer

2016-09-26 Thread Robin Dapp
(I didn't manage to run it independently in this directory via RUNTESTFLAGS=vect.exp=... or otherwise) Bootstrapped on x86 and s390. -- gcc/ChangeLog: 2016-09-26 Robin Dapp * tree-vect-loop-manip.c (create_intersect_range_checks_index): Add tree_fits_shwi_p check. g

Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer

2016-09-27 Thread Robin Dapp
> Also the '=' in the split line goes to the next line according to > coding conventions. fixed, I had only looked at an instance one function above which had it wrong as well. Also changed comment grammar slightly. Regards Robin -- gcc/ChangeLog: 2016-09-27 Robin Dapp

Re: [PATCH] Fix PR77407

2016-10-01 Thread Robin Dapp
This introduces an ICE ("bogus comparison result type") on s390 for the following test case: #include void foo(int dim) { int ba, sign; ba = abs (dim); sign = dim / ba; } Doing diff --git a/gcc/match.pd b/gcc/match.pd index ba7e013..2455592 100644 --- a/gcc/match.pd +++ b/gcc/match.

Re: [PATCH] Tree-level fix for PR 69526

2016-10-05 Thread Robin Dapp
Ping.

Re: [PATCH] Tree-level fix for PR 69526

2016-10-14 Thread Robin Dapp
Ping :)

[PATCH] Tree-level fix for PR 69526

2016-07-21 Thread Robin Dapp
As described in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69526, we currently fail to simplify cases like (unsigned long)(a - 1) + 1 to (unsigned long)a when VRP knows that (a - 1) does not overflow. This patch introduces a match.pd pattern as well as a helper function that checks for overf
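
To make the transformation concrete, here is a minimal C sketch of the two forms. It is not code from the patch; it uses `uint32_t` and `uint64_t` as stand-ins for the inner type and `unsigned long`, and the function names are invented for illustration.

```c
#include <stdint.h>

/* Original form: subtract in 32 bits, widen, then add in 64 bits. */
uint64_t widen_after_sub(uint32_t a)
{
    return (uint64_t)(uint32_t)(a - 1) + 1;
}

/* Simplified form: valid only when (a - 1) cannot wrap. */
uint64_t widen_directly(uint32_t a)
{
    return (uint64_t)a;
}
```

The two agree for every `a >= 1`. For `a == 0` the inner subtraction wraps to `UINT32_MAX`, so the original form yields 2^32 while the simplified form yields 0, which is why the simplification needs the range information from VRP.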

Re: [PATCH] Tree-level fix for PR 69526

2016-11-24 Thread Robin Dapp
Ping.

Re: [PATCH] Tree-level fix for PR 69526

2016-11-28 Thread Robin Dapp
>> + /* Sign-extend @1 to TYPE. */ >> + w1 = w1.from (w1, TYPE_PRECISION (type), SIGNED); >> >> not sure why you do always sign-extend. If the inner op is unsigned >> and we widen then that's certainly bogus considering your UINT_MAX >> example above. Does >> >>

Re: [PATCH] Tree-level fix for PR 69526

2016-12-04 Thread Robin Dapp
Ping. Any idea how to tackle this?

Re: [PATCH] Tree-level fix for PR 69526

2016-12-07 Thread Robin Dapp
> So we have (uint64_t)(uint32 + -1U) + 1 and using TYPE_SIGN (inner_type) > produces (uint64_t)uint32 + -1U + 1. This simply means that we cannot ignore > overflow of the inner operation and for some reason your change > to extract_range_from_binary_expr didn't catch this. That is _8 + 429496729

CSE pass prevents loop-invariant motion

2015-09-15 Thread Robin Dapp
Hi, recently, I came across a problem that keeps a load instruction in a loop although it is loop-invariant. A simple example is: #include #define SZ 256 int a[SZ], b[SZ], c[SZ]; int main() { int i; for (i = 0; i < SZ; i++) { a[i] = b[i] + c[i]; } printf("%d\n", a[0]); } The re

Re: [PATCH] Tree-level fix for PR 69526

2016-11-16 Thread Robin Dapp
Found some time to look into this again. > Index: tree-ssa-propagate.c > === > --- tree-ssa-propagate.c(revision 240133) > +++ tree-ssa-propagate.c(working copy) > @@ -1105,10 +1105,10 @@ substitute_and_fold_dom_walker

[Patch] S/390: Simplify vector conditionals

2015-12-15 Thread Robin Dapp
ree-level. Bootstrapped and regression-tested on s390. Regards Robin gcc/ChangeLog: 2015-12-15 Robin Dapp * config/s390/s390.c (s390_expand_vcond): Convert vector conditional into shift. * config/s390/vector.md: Change operand predicate. gcc/testsuite/ChangeLog: 2015-12-

Re: [Patch] S/390: Simplify vector conditionals

2015-12-17 Thread Robin Dapp
Hi, the attached patch renames the constm1_operand predicate to all_ones_operand and introduces a check for int mode. It should be applied on top of the last patch ([Patch] S/390: Simplify vector conditionals). Regtested on s390. Regards Robin gcc/ChangeLog: 2015-12-15 Robin Dapp

[PATCH] RISC-V: Use biggest_mode as mode for constants.

2024-10-15 Thread Robin Dapp
Hi, in compute_nregs_for_mode we expect that the current variable's mode is at most as large as the biggest mode to be used for vectorization. This might not be true for constants as they don't actually have a mode. In that case, just use the biggest mode so max_number_of_live_regs returns 1. Th

Re: [PATCH v4] RISC-V: add option -m(no-)autovec-segment

2024-10-15 Thread Robin Dapp
> Quick question. We did something like this to aid internal > testing/bringup. Our variant adjusted a ton of the mode iterators in > vector-iterators.md and the TUPLE_ENTRY stuff in riscv-vector-switch.def. > > Robin, do you remember why you had to adjust all the iterators? Was it > that LTO

[PATCH] reassoc: Do not sort likely vectorizable ops by rank.

2024-10-16 Thread Robin Dapp
Hi, this is probably rather an RFC than a patch as I'm not sure whether reassoc is the right place to fix it. On top, the heuristic might be a bit "ad-hoc". Maybe we can also work around it in the vectorizer? The following function is vectorized in a very inefficient way because we construct ve

[PATCH v2 4/8] vect: Add maskload else value support.

2024-10-18 Thread Robin Dapp
This patch adds an else operand to vectorized masked load calls. The current implementation adds else-value arguments to the respective target-querying functions that are used to supply the vectorizer with the proper else value. Right now, the only spot where a zero else value is actually enforced

[PATCH v2 7/8] i386: Add else operand to masked loads.

2024-10-18 Thread Robin Dapp
This patch adds a zero else operand to masked loads, in particular the masked gather load builtins that are used for gather vectorization. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_special_args_builtin): Add else-operand handling. (ix86_expand_builtin): Ditt

[PATCH v2 1/8] docs: Document maskload else operand and behavior.

2024-10-18 Thread Robin Dapp
This patch amends the documentation for masked loads (maskload, vec_mask_load_lanes, and mask_gather_load as well as their len counterparts) with an else operand. gcc/ChangeLog: * doc/md.texi: Document masked load else operand. --- gcc/doc/md.texi | 63 ---

[PATCH v2 3/8] tree-ifcvt: Enforce zero else value after maskload.

2024-10-18 Thread Robin Dapp
When predicating a load we implicitly assume that the else value is zero. This matters in case the loaded value is padded (e.g. a Bool) and we must ensure that the padding bytes are zero on targets that don't implicitly zero inactive elements. In order to formalize this, this patch queries th
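
The semantics of a masked load with an explicit else value can be modelled in scalar C as below. This is only an illustrative sketch under the assumption of one mask bit per element; `masked_load` is a hypothetical helper, not an internal function or builtin from the series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Scalar model of a masked load: active lanes read from memory,
   inactive lanes receive else_value and never touch src[i]. */
void masked_load(int *dst, const int *src, const bool *mask,
                 size_t n, int else_value)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = mask[i] ? src[i] : else_value;
}
```

With `else_value` fixed to zero, inactive lanes are well defined. A target that leaves inactive lanes undefined would correspond to `dst[i]` keeping arbitrary contents, which is the case the series has to account for.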

[PATCH v2 5/8] aarch64: Add masked-load else operands.

2024-10-18 Thread Robin Dapp
This adds zero else operands to masked loads and their intrinsics. I needed to adjust more than initially thought because we rely on combine for several instructions and a change in a "base" pattern needs to propagate to all those. For the lack of a better idea I used a function call property to s

[PATCH v2 8/8] RISC-V: Add else operand to masked loads [PR115336].

2024-10-18 Thread Robin Dapp
This patch adds else operands to masked loads. Currently the default else operand predicate accepts "undefined" (i.e. SCRATCH) as well as all-ones values. Note that this series introduces a large number of new RVV FAILs for riscv. All of them are due to us not being able to elide redundant vec_c

[PATCH v2 0/8] Add maskload else operand.

2024-10-18 Thread Robin Dapp
wer10, x86 and aarch64. Regtested on rv64gcv. Testing on GCN would be much appreciated. Robin Dapp (8): docs: Document maskload else operand and behavior. ifn: Add else-operand handling. tree-ifcvt: Enforce zero else value after maskload. vect: Add maskload else value support. aarch64

[PATCH v2 2/8] ifn: Add else-operand handling.

2024-10-18 Thread Robin Dapp
This patch adds else-operand handling to the internal functions. gcc/ChangeLog: * internal-fn.cc (add_mask_and_len_args): Rename... (add_mask_else_and_len_args): ...to this and add else handling. (expand_partial_load_optab_fn): Use adjusted function. (expand_partia

[PATCH v2 6/8] gcn: Add else operand to masked loads.

2024-10-18 Thread Robin Dapp
This patch adds an undefined else operand to the masked loads. gcc/ChangeLog: * config/gcn/predicates.md (maskload_else_operand): New predicate. * config/gcn/gcn-valu.md: Use new predicate. --- gcc/config/gcn/gcn-valu.md | 12 gcc/config/gcn/predicates.md |

Re: [PATCH] reassoc: Do not sort likely vectorizable ops by rank.

2024-10-16 Thread Robin Dapp
> Interesting - this is bleh | bswap (..), right, so having > bla1 | (bleh | bla2) fails to recognize bla1 | bla2 as bswap. Yes, exactly. > I'd expect this kind of pattern to fail bswap detection easily > if you mangle it a bit. So possibly bswap detection should learn > to better pick the "piec
