[PATCH] RISC-V: Add testcase for PR114749.

2024-04-25 Thread Robin Dapp
Hi, this adds a test case for PR114749. Going to commit as obvious unless somebody complains. Regards Robin gcc/testsuite/ChangeLog: PR tree-optimization/114749 * gcc.target/riscv/rvv/autovec/pr114749.c: New test. --- .../gcc.target/riscv/rvv/autovec/pr114749.c | 15 +++

[PATCH] internal-fn: Do not force vcond operand to reg.

2024-05-10 Thread Robin Dapp
Hi, this only forces the first comparison operator into a register if it is not already suitable. Bootstrap and regtest are running on x86 and aarch64, successful on p10. Regtested on riscv. gcc/ChangeLog: PR middle-end/113474 * internal-fn.cc (expand_vec_cond_mask_optab_fn): O

Re: [PATCH] internal-fn: Do not force vcond operand to reg.

2024-05-12 Thread Robin Dapp
> How does this make a difference in the end? I'd expect say forwprop to > fix things? In general we try to only add the masking "boilerplate" of our instructions at split time so fwprop, combine et al. can do their work uninhibited of it (and we don't need numerous (if_then_else ... (if_then_els

Re: [PATCH] internal-fn: Do not force vcond operand to reg.

2024-05-13 Thread Robin Dapp
> What happens if we simply remove all of the force_reg here? On x86 I bootstrapped and tested the attached without fallout (gcc188, so it's no avx512-native machine and therefore limited coverage). riscv regtest is unchanged. For aarch64 I would rely on the pre-commit CI to pick it up (does t

Re: [PATCH v1 3/3] RISC-V: Enable vectorizable early exit test

2024-05-13 Thread Robin Dapp
Hi Pan, > > @@ -4114,6 +4115,7 @@ proc check_effective_target_vect_early_break_hw { } { > || [check_effective_target_arm_v8_neon_hw] > || [check_sse4_hw_available] > || [istarget amdgcn-*-*] > + || [check_effective_target_riscv_v] > }}] > } I believe this should be

[PATCH] RISC-V: Do not allow v0 as dest when merging [PR115068].

2024-05-13 Thread Robin Dapp
Hi, this patch splits the vfw...wf pattern so we do not emit e.g. vfwadd.wf v0,v8,fa5,v0.t anymore. Regtested on rv64gcv_zvfh. Regards Robin gcc/ChangeLog: PR target/115068 * config/riscv/vector.md: Split vfw.wf pattern. gcc/testsuite/ChangeLog: * gcc.target/riscv/

Re: [PATCH v1 2/3] RISC-V: Implement vectorizable early exit with vcond_mask_len

2024-05-13 Thread Robin Dapp
Hi Pan, thanks for working on this. In general the patch looks reasonable to me but I'd rather have some more comments about the high-level idea. E.g. cbranch is implemented like aarch64 by xor'ing the bitmasks and comparing the result against zero (so we branch based on mask equality). > +;; vc
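The mask-equality idea mentioned in this snippet can be sketched in plain C. This is only a scalar illustration, not the RISC-V or aarch64 expander: the 64-bit mask type and the function name masks_differ are assumptions made for the example.

```c
#include <stdint.h>

/* Hypothetical scalar model: each bit of a mask stands for one vector lane. */
static int masks_differ(uint64_t mask_a, uint64_t mask_b)
{
    /* XOR leaves a set bit wherever the two masks disagree, so comparing
       the result against zero tests whether the masks are equal. */
    return (mask_a ^ mask_b) != 0;
}
```

A branch would then be taken (or not) depending on whether masks_differ returns nonzero.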

[PATCH] RISC-V: Fix effective target check.

2024-08-30 Thread Robin Dapp
Hi, I messed up the return value in check_effective_target_rvv_zvl256b_ok and check_effective_target_rvv_zvl512b_ok. This fixes it and also just uses the current march for the check. Going to commit as obvious. Regards Robin gcc/testsuite/ChangeLog: * lib/target-supports.exp: Fix eff

Re: [PATCH 6/8] gcn: Add else operand to masked loads.

2024-09-05 Thread Robin Dapp
> > +(define_predicate "maskload_else_operand" > > + (and (match_code "const_int,const_vector") > > + (match_test "op == CONST0_RTX (GET_MODE (op))"))) > > This forces maskload and mask_gather_load to only accept zero here, but > in fact the hardware would allow us to accept any value (incl

Re: [PATCH 6/8] gcn: Add else operand to masked loads.

2024-09-06 Thread Robin Dapp
> There were absolutely problems without this. It's a while ago now, so I'm > struggling with the details, but as GCC only applies the mask to selected > operations there were all sorts of issues that crept in. Zeroing the > undefined lanes seemed to match the middle end assumptions (or, at least i

Re: [PATCH 6/8] gcn: Add else operand to masked loads.

2024-09-06 Thread Robin Dapp
> > So we only found two instances of this problem and both were related to > > _Bools. In case you have more cases, it would be greatly appreciated > > to verify the series with them. If you don't mind, would it be possible > > to comment out the zeroing, re-run the testsuite and check for FAILs

[PATCH] RISC-V: Add more vector-vector extract cases.

2024-09-06 Thread Robin Dapp
Hi, this adds a V16SI -> V4SI and related i.e. "quartering" vector-vector extract expander for VLS modes. It helps with unnecessary spills in x264. Regtested on rv64gcv_zvfh_zvbb. Regards Robin gcc/ChangeLog: * config/riscv/autovec.md (vec_extract): Add quarter vec-vec extrac

Re: [PATCH] RISC-V: Fixed incorrect semantic description in DF to DI pattern in the Zfa extension on rv32.

2024-09-06 Thread Robin Dapp
> In the process of DF to SI, we generally use "unsigned_fix" rather than > "truncate" for conversion. Although this has no effect in general, > unexpected ICE often occurs when precise semantic analysis is required, > such as analysis in function "simplify_const_unary_operation" in > simplify-rtx.

[PATCH] vect: Do not try to duplicate_and_interleave one-element mode.

2024-09-06 Thread Robin Dapp
Hi, PR112694 shows that we try to create sub-vectors of single-element vectors because can_duplicate_and_interleave_p returns true. The problem resurfaced in PR116611. This patch makes can_duplicate_and_interleave_p return false if count / nvectors > 0 and removes the corresponding check in the r

Re: [PATCH] Try fixing RISC-V .SELECT_VL with SLP

2024-09-14 Thread Robin Dapp
> The following simply removes a seemingly bogus guard. > > * tree-vect-loop.cc (vect_analyze_loop_1): Remove SLP guard > from .SELECT_VL disabling. > --- > gcc/tree-vect-loop.cc | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-v

Re: CSE pass prevents loop-invariant motion

2015-09-23 Thread Robin Dapp
On 09/15/2015 05:25 PM, Jeff Law wrote: > On 09/15/2015 06:11 AM, Robin Dapp wrote: >> Hi, >> >> recently, I came across a problem that keeps a load instruction in a >> loop although it is loop-invariant. [..] > You might want to check your costing model -- cprop is

[Patch] S/390: Fix symbol ref alignment

2015-10-23 Thread Robin Dapp
always be generated. This patch uses separate flags for 2-, 4-, and 8-byte alignment to fix the problem. Bootstrapped, no regressions on s390. Regards Robin gcc/testsuite/ChangeLog: 2015-10-23 Robin Dapp * gcc.target/s390/load-relative-check.c: New test to check generation

[PATCH] MAINTAINERS: Change my email address.

2023-04-27 Thread Robin Dapp
Robin Dapp +Robin Dapp +Robin Dapp Simon Dardis Sudakshina Das Bud Davis @@ -

[PATCH] S/390: Add undef for MUSL_DYNAMIC_LINKERxx

2019-11-26 Thread Robin Dapp
Hi, I committed this patch (obvious). It fixes the s390 bootstrap by undefining existing defines before redefining them. Regards Robin -- gcc/ChangeLog: 2019-11-26 Robin Dapp * config/s390/linux.h: Add undef for MUSL_DYNAMIC_LINKERxx. commit

[PATCH] [dlang/phobos] S/390: Fix PR91628

2019-11-27 Thread Robin Dapp
Hi, in order to not use a glibc-internal symbol anymore, this patch adds separate .S files for s390x and s390 that allow obtaining the TLS offset. I bootstrapped on s390x -m64 and -m31 and tested on s390x and s390, seeing no new regressions. Regards Robin -- libphobos/ChangeLog: 2019-11-27 Robin

Re: [PATCH] [dlang/phobos] S/390: Fix PR91628

2019-11-27 Thread Robin Dapp
Hi Iain, > OK from me, what about earlier comments of using __asm__ in a C > source file? I don't mind too much either way but I gathered from the discussion in the bugzilla that .S was preferred for now. Regards Robin

Re: [PATCH] [dlang/phobos] S/390: Fix PR91628

2019-11-28 Thread Robin Dapp
> OK from me, what about earlier comments of using __asm__ in a C > source file? > > I wouldn't really object to converting all .S sources (in fact I can > do this myself) if it meant slightly better portability. Adding to yesterday's message: feel free to apply the current version if it's OK. Th

Re: [PATCH 0/5 v3] Vect peeling cost model

2017-06-07 Thread Robin Dapp
> http://gcc.gnu.org/ml/gcc-testresults/2017-06/msg00297.html What machine is this running on? power4 BE? The tests are compiled with --with-cpu-64=power4 apparently. I cannot reproduce this on power7 -m32. Is it possible to get more detailed logs or machine access to reproduce? Regards Robin

Re: [PATCH 2/3] Simplify wrapped binops

2017-06-20 Thread Robin Dapp
ue (min xor max overflow, split/anti range). Test suite on s390x has no regressions, bootstrap is ok, x86 running. Regards Robin -- gcc/ChangeLog: 2017-06-19 Robin Dapp * match.pd: Simplify wrapped binary operations. diff --git a/gcc/match.pd b/gcc/match.pd index 80a17ba..66

Re: [PATCH 2/3] Simplify wrapped binops

2017-06-21 Thread Robin Dapp
> use INTEGRAL_TYPE_P. Done. > but you do not actually _use_ vr_outer. Do you think that if > vr_outer is a VR_RANGE then the outer operation may not > possibly have wrapped? That's a false conclusion. These were remains of a previous version. vr_outer is indeed not needed anymore; removed.

Re: [PATCH 2/3] Simplify wrapped binops

2017-06-27 Thread Robin Dapp
Ping.

Re: [PATCH 2/3] Simplify wrapped binops

2017-06-28 Thread Robin Dapp
> ideally you'd use a wide-int here and defer the tree allocation to the result Did that in the attached version. > So I guess we never run into the outer_op == minus case as the above is > clearly wrong for that? Right, damn, not only was the treatment for this missing but it was bogus in the o

fwprop addressing costs

2018-05-31 Thread Robin Dapp
Hi, when investigating a regression, I realized that we create a superfluous load on S390. The snippet looks something like LA %r10, 0(%r8,%r9) LLH %r4, 0(%r10) meaning the address in r10 is computed by an LA even though LLH supports the addressing already. The same address is used multiple t

[PATCH, S390] Change mtune default

2018-06-04 Thread Robin Dapp
explicitly state -march=z13 -mtune=zEC12. Regards Robin -- gcc/ChangeLog: 2018-06-04 Robin Dapp * config/s390/s390.h (enum processor_flags): Do not use default tune parameter when -march was specified. diff --git a/gcc/config/s390/s390.h b/gcc/config/s390/s390.h index a372981ff3a

[PATCH 1/2] zEC12 pipeline

2018-09-06 Thread Robin Dapp
Hi, this patch increases the latency of some floating point instructions to better match the real machine's behavior. Regards Robin -- gcc/ChangeLog: 2018-09-06 Robin Dapp * config/s390/2827.md: Increase latencies for some FP instructions. --- gcc/config/s390/2827.md

[PATCH 2/2] z13 pipeline

2018-09-06 Thread Robin Dapp
Similar to zEC12, the change in latencies helps match the real machine's behavior better. -- gcc/ChangeLog: 2018-09-06 Robin Dapp * config/s390/2964.md: Increase latencies for some FP instructions. --- gcc/config/s390/2964.md | 80 ++--- 1

[S/390] Re: [PATCH 1/2] zEC12 pipeline

2018-09-06 Thread Robin Dapp
Sorry, forgot the [S/390] tag in the subject.

sched2 priorities and replacements

2018-10-04 Thread Robin Dapp
Hi, I'm working on some insn latency changes in the s390 backend and noticed a regression in the SPEC2006 bzip2 test case that was due to some insns being scheduled differently. The sequence in short form before my change is ;; | insn | prio | ;; | 823 |1 | %r1=%r1+0x1

Re: sched2 priorities and replacements

2018-10-08 Thread Robin Dapp
ping, any insight on this? Regards Robin

Re: [PATCH 1/2] zEC12 pipeline

2018-10-08 Thread Robin Dapp
Hi, committed only the zEC12 part for now. Performance behavior of z13 with the patch is still unclear and will be tackled separately. Regards Robin

[PATCH] Reset insn priority after inc/ref replacement in haifa sched

2018-10-10 Thread Robin Dapp
urrently running (current HEAD didn't bootstrap for me on x86). The actual code changes throughout SPEC2006 are minor and the performance impact is negligible provided we do not hit a fixable bad case as described in my last message. Regards Robin -- gcc/ChangeLog: 2018-10-10 Ro

Re: [RFC] SHIFT_COUNT_TRUNCATED and shift_truncation_mask

2019-06-04 Thread Robin Dapp
>> Now, in order to get rid of the subregs in the pattern combine creates, >> I would need to be able to do something like >> >> (define_subst "subreg_subst" >> [(set (match_operand:DI 0 "" "") >> (shift:DI (match_operand:DI 1 "" "") >>(subreg:SI (match_dup:DI 2)))] >> >>

Re: [PATCH] Testsuite: Add s390 exceptions for gen-vect

2019-06-05 Thread Robin Dapp
Ping. > gcc/testsuite/ChangeLog: > > 2019-05-15 Robin Dapp > > * gcc.dg/tree-ssa/gen-vect-26.c: Do not expect unaligned access > vectorization on s390. > * gcc.dg/tree-ssa/gen-vect-28.c: Likewise. > * gcc.dg/tree-ssa/gen-vect-32.c: Likewise. >

[PATCH 2/9] ifcvt: Use enum instead of transform_name string.

2019-08-02 Thread Robin Dapp
This patch introduces an enum for ifcvt's various noce transformations. As the transformation might be queried by the backend, I find it nicer to allow checking for a proper type instead of a string comparison. --- gcc/ifcvt.c | 46 ++-- gcc/ifcvt.h | 67 ++

[PATCH 1/9] ifcvt: Store the number of created cmovs.

2019-08-02 Thread Robin Dapp
This patch saves the number of created conditional moves by noce_convert_multiple_sets in the IF_INFO struct. This may be used by the backend to decide more easily whether to accept a generated sequence or not. --- gcc/ifcvt.c | 10 -- gcc/ifcvt.h | 4 2 files changed, 12 insertions(+),

[PATCH 0/9] Improve icvt "convert multiple"

2019-08-02 Thread Robin Dapp
later time. Regards Robin Robin Dapp (9): ifcvt: Store the number of created cmovs. ifcvt: Use enum instead of transform_name string. ifcvt: Only created temporaries as needed. ifcvt: Estimate original costs before convert_multiple. ifcvt: Allow constants operands in noce_convert_mu

[PATCH 6/9] ifcvt: Extract cc comparison from jump.

2019-08-02 Thread Robin Dapp
This patch extracts a cc comparison from the initial compare/jump insn and allows it to be passed to noce_emit_cmove and emit_conditional_move. --- gcc/ifcvt.c | 68 gcc/optabs.c | 7 -- gcc/optabs.h | 2 +- 3 files changed, 69 insertions

[PATCH 7/9] ifcvt: Emit two cmov variants and choose the less expensive one.

2019-08-02 Thread Robin Dapp
This patch duplicates the previous noce_emit_cmove logic. First it passes the canonical comparison, emits the sequence and costs it. Then, a second, separate sequence is created by passing the cc compare we extracted before. The costs of both sequences are compared and the cheaper one is emitted.

[PATCH 4/9] ifcvt: Estimate original costs before convert_multiple.

2019-08-02 Thread Robin Dapp
This patch extends bb_ok_for_noce_convert_multiple_sets by a temporary cost estimation that can be used by noce_convert_multiple_sets. --- gcc/ifcvt.c | 17 +++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/gcc/ifcvt.c b/gcc/ifcvt.c index 253b8a96c1a..55205cac153 10

[PATCH 3/9] ifcvt: Only created temporaries as needed.

2019-08-02 Thread Robin Dapp
noce_convert_multiple_sets creates temporaries for the destination of every emitted cmov and expects subsequent passes to get rid of them. This does not happen every time and even if the temporaries are removed, code generation can be affected adversely. In this patch, temporaries are only create

[PATCH 8/9] ifcvt: Handle swap-style idioms differently.

2019-08-02 Thread Robin Dapp
A swap-style idiom like tmp = a a = b b = tmp would be transformed like tmp_tmp = cond ? a : tmp tmp_a = cond ? b : a tmp_b = cond ? tmp_tmp : b [...] including rewiring the first source operand to previous writes (e.g. tmp -> tmp_tmp). The code would recognize this, though, and cha
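The naive select form described in this snippet can be written out in C to check that it matches the branchy swap. This is a sketch of the idiom only, not of the ifcvt code: the struct pair type, both function names and the placeholder initial value of tmp are assumptions for illustration.

```c
struct pair { int a; int b; };

/* Reference: the branchy swap as it looks before if-conversion. */
static struct pair swap_branch(int cond, int a, int b)
{
    if (cond) {
        int tmp = a;
        a = b;
        b = tmp;
    }
    struct pair result = { a, b };
    return result;
}

/* Select form following the snippet: one conditional move per set, with
   the source of the last set rewired from tmp to tmp_tmp. */
static struct pair swap_select(int cond, int a, int b)
{
    int tmp = 0; /* placeholder for whatever tmp held before the block */
    int tmp_tmp = cond ? a : tmp;
    int tmp_a = cond ? b : a;
    int tmp_b = cond ? tmp_tmp : b;
    struct pair result = { tmp_a, tmp_b };
    return result;
}
```

Both functions return the same pair for either value of cond; the select form needs three conditional moves, one of which only feeds the temporary.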

[PATCH 5/9] ifcvt: Allow constants operands in noce_convert_multiple_sets.

2019-08-02 Thread Robin Dapp
This patch allows immediate then/else operands for cmovs. We rely on emit_conditional_move returning NULL if something unsupported was generated. Also, minor refactoring is performed. -- gcc/ChangeLog: 2018-11-14 Robin Dapp * ifcvt.c (have_const_cmov): New function

[PATCH 9/9] ifcvt: Also pass reversed cc comparison.

2019-08-02 Thread Robin Dapp
When then and else are reversed, we would swap new_val and old_val. The same has to be done for our new code paths. Also, emit_conditional_move may perform swapping. In case we need to swap, the cc comparison also needs to be swapped and for this we pass the reversed cc comparison directly. An al

Re: [PATCH 3/9] ifcvt: Only created temporaries as needed.

2019-08-08 Thread Robin Dapp
Hi Richard, > Is the separate need_temps scan required for correctness? It looked > like we could test: > > if (reg_overlap_mentioned_p (dest, cond)) > ... > > on-the-fly during the main noce_convert_multiple_sets loop. right, I didn't re-check it but after changes during internal p

Re: [PATCH 5/9] ifcvt: Allow constants operands in noce_convert_multiple_sets.

2019-08-08 Thread Robin Dapp
> It seems like this is making noce_convert_multiple_sets overlap > a lot with cond_move_process_if_block (although that uses CONSTANT_P > instead of CONST_INT_P). How do they fit together after this patch, > i.e. which cases is each one meant to handle that the other doesn't? IMHO all of icvt is

[PATCH 0/3] Simplify wrapped binops.

2019-08-13 Thread Robin Dapp
manifests similarly to addr1,-1 extend r1,r1 addr1,1 where the adds could be avoided entirely. This is the tree part of the fix, it will still be necessary to correct rtl code generation in doloop later. Bootstrapped and regtested on s390x, x86 running. Regards Robin -- Robin Dapp (3

[PATCH 1/3] Perform fold when propagating.

2019-08-13 Thread Robin Dapp
This patch performs more aggressive folding in order for the match.pd changes to kick in later. Some test cases rely on VRP doing something which now already happens during CCP so adjust them accordingly. Also, the loop versioning pass was missing one case when deconstructing addresses that would

[PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-13 Thread Robin Dapp
We would like to simplify code like (larger_type)(var + const1) + const2 to (larger_type)(var + combined_const1_const2) when we know that no overflow happens. --- gcc/match.pd | 101 +++ 1 file changed, 101 insertions(+) diff --git a/gcc/match.pd
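Why the "no overflow" condition matters can be seen with a small C sketch. It is not the match.pd rule itself: the fixed-width types, the constants 1 and 2, and the function names are assumptions chosen for illustration, and unsigned types are used so that wrapping is well defined.

```c
#include <stdint.h>

/* Original shape: add in the narrow type, widen, then add again. */
static uint64_t narrow_then_wide(uint32_t var)
{
    return (uint64_t)(uint32_t)(var + 1u) + 2u;
}

/* Simplified shape: widen first and add the combined constant 1 + 2. */
static uint64_t widen_then_combined(uint32_t var)
{
    return (uint64_t)var + 3u;
}
```

For var = 5 both forms give 8. For var = UINT32_MAX the inner addition wraps to 0, so the first form gives 2 while the second gives 4294967298; the rewrite is therefore only valid when the narrow addition is known not to wrap.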

[PATCH 3/3] Add new test cases for wrapped binop simplification.

2019-08-13 Thread Robin Dapp
--- .../gcc.dg/tree-ssa/copy-headers-5.c | 2 +- .../gcc.dg/tree-ssa/copy-headers-7.c | 2 +- .../gcc.dg/wrapped-binop-simplify-run.c | 52 .../gcc.dg/wrapped-binop-simplify-signed-1.c | 60 +++ .../wrapped-binop-simplify-unsigned-1.c

Re: [PATCH 1/3] Perform fold when propagating.

2019-08-13 Thread Robin Dapp
> May I suggest to add a parameter to the substitute-and-fold engine > so we can do the folding on all stmts only when enabled and enable > it just for VRP? That also avoids the testsuite noise. Would something along these lines do? diff --git a/gcc/tree-ssa-propagate.c b/gcc/tree-ssa-propagate.

Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-13 Thread Robin Dapp
> I have become rather wary of INTEGRAL_TYPE_P recently because it > includes enum types, which with -fstrict-enum can have a surprising > behavior. If I have > enum E { A, B, C }; > and e has type enum E, with -fstrict-enum, do your tests manage to > prevent (long)e+1 from becoming (long)(e+1) wit

Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-13 Thread Robin Dapp
> +/* ((T)(A + CST1)) + CST2 -> (T)(A) + CST */ > Do you want to handle MINUS? What about POINTER_PLUS_EXPR? When I last attempted this patch I had the MINUS still in it but got confused easily by needing to think of too many cases at once leading to lots of stupid mistakes. Hence, I left it ou

Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-16 Thread Robin Dapp
> So - what are you really after? (sorry if I don't remember, testcase(s) > are missing > from this patch) > > To me it seems that 1) loses information if A + CST was done in a signed type > and we know that overflow doesn't happen because of that. For the reverse > transformation we don't. Btw,

Re: [PATCH 8/9] ifcvt: Handle swap-style idioms differently.

2019-08-16 Thread Robin Dapp
> Looks like a nice optimisation, but could we just test whether the > destination of a set isn't live on exit from the then block? I think > we could do that on the fly during the main noce_convert_multiple_sets > loop. I included this locally along with the rest of the remarks. Any comments on

Re: [PATCH 8/9] ifcvt: Handle swap-style idioms differently.

2019-08-17 Thread Robin Dapp
> I'm still a bit worried about the overlap between the expanded > noce_convert_multiple_sets and cond_move_process_if_block (5/9). > It seems like we're making noce_convert_multiple_set handle most of > the conditional move cases that cond_move_process_if_block can handle. > But like you say, noce

Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-20 Thread Robin Dapp
> So - which case is it? IIRC we want to handle small signed > constants but the code can end up unsigned. For the > above we could write (unsigned long)((int)a + 1 - 1) and thus > sign-extend? Or even avoid this if we know the range. > That is, it becomes the first case again (operation perform

Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-21 Thread Robin Dapp
I'm going to commit the attached two patches. Removed the redundant changes in test cases and added constructor initialization of fold_all_stmts. Regards Robin -- gcc/ChangeLog: 2019-08-21 Robin Dapp * gimple-loop-versioning.cc (loop_versioning::record_address_fra

[PATCH/RFC] Simplify wrapped RTL op

2019-08-27 Thread Robin Dapp
Hi, as announced in the wrapped-binop gimple patch mail, on s390 we still emit odd code in front of loops: void v1 (unsigned long *in, unsigned long *out, unsigned int n) { int i; for (i = 0; i < n; i++) { out[i] = in[i]; } } --> aghi%r1,-8 srlg%r1,

Re: [PATCH/RFC] Simplify wrapped RTL op

2019-08-29 Thread Robin Dapp
>> PR37451. Not clear what target that regressed on, btw. > > And PR55190 and PR67288 and probably more. Thanks for finding those. So the hope is to get this fixed or rather move towards a fix with the patch series that's currently reviewed which injects some doloop knowledge into ivopts? As s

Re: [PATCH] Reset insn priority after inc/ref replacement in haifa sched

2018-10-15 Thread Robin Dapp
ng. Regards Robin gcc/ChangeLog: 2018-10-15 Robin Dapp * haifa-sched.c (priority): Add force_recompute parameter. (apply_replacement): Call priority () with force_recompute = true. (restore_pattern): Likewise. diff --git a/gcc/haifa-sched.c b/gcc/haifa-s

Re: [PATCH] Reset insn priority after inc/ref replacement in haifa sched

2018-10-15 Thread Robin Dapp
> A C++ style nit/question: instead of adding a new overload > > priority (rtx_insn *, bool) > > you can add a parameter with a default value in the existing > static function > > priority (rtx_insn *insn, bool force_recompute = false) Sometimes I'm still stuck in C land with GCC :), thank

[PATCH] S/390: Allow immediates in loc expander

2018-10-17 Thread Robin Dapp
Hi, this allows immediates in the load-on-condition expander on z13 or later. Regtested on z14. Regards Robin -- gcc/ChangeLog: 2018-10-17 Robin Dapp * config/s390/predicates.md: Allow immediate operand in loc_operand for z13. * config/s390/s390.md: Use

Re: [PATCH] Reset insn priority after inc/ref replacement in haifa sched

2018-10-18 Thread Robin Dapp
/ChangeLog: 2018-10-16 Robin Dapp * haifa-sched.c (priority): Add force_recompute parameter. (apply_replacement): Call priority () with force_recompute = true. (restore_pattern): Likewise. diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c index 1fdc9df9fb2..2c84ce38143 100644

[PATCH] S/390: Add loc patterns for QImode and HImode

2018-10-18 Thread Robin Dapp
Hi, this enables QImode and HImode for load on condition. For SPEC2006 this reduces code size overall, performance impact is negligible. Regtested on s390x. Regards Robin -- gcc/ChangeLog: 2018-10-18 Robin Dapp * config/s390/s390.md: Add movcc for QImode and HImode. diff --git

Re: [PATCH] Reset insn priority after inc/ref replacement in haifa sched

2018-10-19 Thread Robin Dapp
> Still OK :-) Committed as r265304. Regards Robin

Re: [PATCH] S/390: Allow immediates in loc expander

2018-10-26 Thread Robin Dapp
/ChangeLog: 2018-10-26 Robin Dapp * config/s390/predicates.md: Fix typo. * config/s390/s390.md: Allow immediates for load on condition. gcc/testsuite/ChangeLog: 2018-10-26 Robin Dapp * gcc.dg/loop-8.c: On s390, always run the test with -march=zEC12. diff --git a/gcc

Re: [PATCH] S/390: Add loc patterns for QImode and HImode

2018-10-26 Thread Robin Dapp
Hi, this is v2 of the patch with less quirky pattern syntax and two tests. Regards Robin -- gcc/ChangeLog: 2018-10-26 Robin Dapp * config/s390/s390.md: QImode and HImode for load on condition. gcc/testsuite/ChangeLog: 2018-10-26 Robin Dapp * gcc.target/s390/ifcvt

[PATCH] S/390: Increase register move costs for CC_REGS

2018-11-05 Thread Robin Dapp
Hi, the attached patch increases the move costs for moves involving the CC register. This saves us some instructions in SPEC CPU2006. Regards Robin -- gcc/ChangeLog: 2018-11-05 Robin Dapp * config/s390/s390.c (s390_register_move_cost): Increase costs for moves involving

[PATCH 0/6] If conversion with multiple sets.

2018-11-14 Thread Robin Dapp
Hi, the following patch set was created in an attempt to allow multiple sets to be if converted. I was not able to make it work out of the box since I found the cost estimation for the newly created sequence to always be much higher than the sequence before. This is due to noce_convert_multiple_set

[PATCH 2/6] ifcvt: Allow constants operands in noce_convert_multiple_sets.

2018-11-14 Thread Robin Dapp
This patch checks whether the current target supports conditional moves with immediate then/else operands and allows noce_convert_multiple_sets to deal with constants subsequently. Also, minor refactoring is performed. -- gcc/ChangeLog: 2018-11-14 Robin Dapp * ifcvt.c

[PATCH 3/6] ifcvt: Use enum instead of transform_name string.

2018-11-14 Thread Robin Dapp
This patch introduces an enum for ifcvt's various noce transformations. As the transformation might be queried by the backend, I find it nicer to allow checking for a proper type instead of a string comparison. -- gcc/ChangeLog: 2018-11-14 Robin Dapp * ifcvt.c (noce_try_move)

[PATCH 5/6] ifcvt: Only created temporaries as needed.

2018-11-14 Thread Robin Dapp
created if the destination of a set is used in an emitted condition check. -- gcc/ChangeLog: 2018-11-14 Robin Dapp * ifcvt.c (check_need_temps): New function. (noce_convert_multiple_sets): Only created temporaries if needed. --- gcc/ifcvt.c | 54

[PATCH 6/6] S/390: Add test for noce_convert_multiple_sets.

2018-11-14 Thread Robin Dapp
New test. -- gcc/testsuite/ChangeLog: 2018-11-14 Robin Dapp * gcc.target/s390/ifcvt-two-insns-int.c: New test. --- .../gcc.target/s390/ifcvt-two-insns-int.c | 26 +++ 1 file changed, 26 insertions(+) create mode 100644 gcc/testsuite/gcc.target/s390/ifcvt-two

[PATCH 1/6] ifcvt: Store the number of created cmovs.

2018-11-14 Thread Robin Dapp
This patch saves the number of created conditional moves by noce_convert_multiple_sets in the IF_INFO struct. This may be used by the backend to decide more easily whether to accept a generated sequence or not. -- gcc/ChangeLog: 2018-11-14 Robin Dapp * ifcvt.c

[PATCH 4/6] S/390: Implement noce_conversion_profitable_p.

2018-11-14 Thread Robin Dapp
This patch implements noce_conversion_profitable_p by checking for the transformation ifcvt used and only return positively if noce_convert_multiple_sets created less than MAX_IFCVT_INSNS insns. -- gcc/ChangeLog: 2018-11-14 Robin Dapp * config/s390/s390.c (MAX_IFCVT_INSNS): Define

Re: [PATCH 2/6] ifcvt: Allow constants operands in noce_convert_multiple_sets.

2018-11-15 Thread Robin Dapp
> This may ultimately be too simplistic. There are targets where some > constants are OK, but others may not be. By checking the predicate > like this I think you can cause over-aggressive if-conversion if the > target allows a range of integers in the expander's operand predicate, > but allows

Re: [PATCH 5/6] ifcvt: Only created temporaries as needed.

2018-11-15 Thread Robin Dapp
> This looks pretty reasonable. ISTM it ought to be able to go forward if > it's tested independently. The test suite already passes, any other tests you have in mind? To be honest I suppose noce_convert_multiple_sets will currently never successfully return (due to the costing problems I descri

[PATCH 0/3] S/390: Shift count improvements.

2019-07-07 Thread Robin Dapp
second patch adds some tests. The third patch defines the shift_truncation_mask and adds a test for it. Bootstrapped and regtested. Regards Robin --- Robin Dapp (3): S/390: Rework shift count handling. S/390: Shift count tests. S/390: Define shift_truncation_mask. gcc/config/s390

[PATCH 3/3] S/390: Define shift_truncation_mask.

2019-07-07 Thread Robin Dapp
Define s390_shift_truncation_mask to allow the optabs optimization sh = (64 - sh) -> sh = -sh for a rotation operation. -- gcc/ChangeLog: 2019-07-05 Robin Dapp * config/s390/s390.c (s390_shift_truncation_mask): Define. (TARGET_SHIFT_TRUNCATION_MASK): Define.
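The sh = (64 - sh) -> sh = -sh rewrite relies on the rotate count being truncated to its low bits. A small C model, assuming a truncation mask of 63 for 64-bit rotates (the helper rotl64 is hypothetical and not part of the patch), shows why both counts select the same rotation:

```c
#include <stdint.h>

/* Rotate left using only the low 6 bits of the count, modelling a target
   whose shift truncation mask is 63 (an assumption for this sketch). */
static uint64_t rotl64(uint64_t x, unsigned count)
{
    count &= 63u;
    /* The second mask keeps the right shift in range when count is 0. */
    return (x << count) | (x >> ((64u - count) & 63u));
}
```

Since 64 - sh and -sh agree modulo 64, rotl64(x, 64u - sh) and rotl64(x, 0u - sh) return the same value for any sh, which is what lets the subtraction from 64 be replaced by a plain negation.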

[PATCH 2/3] S/390: Shift count tests.

2019-07-07 Thread Robin Dapp
Tests to check for the changed shift-count handling. -- gcc/testsuite/ChangeLog: 2019-07-05 Robin Dapp * gcc.target/s390/combine-rotate-modulo.c: New test. * gcc.target/s390/combine-shift-rotate-add-mod.c: New test. * gcc.target/s390/vector/combine-shift-vec.c: New

[PATCH 1/3] S/390: Rework shift count handling.

2019-07-07 Thread Robin Dapp
Add s390_valid_shift_count to determine the validity of a shift-count operand. This is used to replace increasingly complex substitutions that should have allowed address-style shift-count handling, an and mask as well as no-op subregs on the operand. -- gcc/ChangeLog: 2019-07-05 Robin Dapp

[PATCH 0/7] S/390: Rework instruction scheduling.

2019-03-11 Thread Robin Dapp
Hi, this patch set adds new pipeline descriptions for z13 and z14. Based on that, the scoring and some properties are handled differently in the scheduler hooks. Regards Robin Robin Dapp (7): S/390: Change z13 pipeline description. S/390: Add z14 pipeline description. S/390: Change

[PATCH 3/7] S/390: Change handling of long-running instructions.

2019-03-11 Thread Robin Dapp
This patch makes the detection of long-running instructions independent of their latency and checks the execution unit instead. --- gcc/config/s390/s390.c | 73 +++--- 1 file changed, 55 insertions(+), 18 deletions(-) diff --git a/gcc/config/s390/s390.c b/gcc/

[PATCH 1/7] S/390: Change z13 pipeline description.

2019-03-11 Thread Robin Dapp
This patch adapts the z13 pipeline description. --- gcc/config/s390/2964.md | 372 ++-- gcc/config/s390/s390.c | 39 ++--- 2 files changed, 226 insertions(+), 185 deletions(-) diff --git a/gcc/config/s390/2964.md b/gcc/config/s390/2964.md index 19e641bd252..

[PATCH 4/7] S/390: Change handling of group end.

2019-03-11 Thread Robin Dapp
This patch adds a scheduling state struct and changes the handling of end-group conditions. --- gcc/config/s390/s390.c | 158 ++--- 1 file changed, 68 insertions(+), 90 deletions(-) diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 15926ec88cd

[PATCH 2/7] S/390: Add z14 pipeline description.

2019-03-11 Thread Robin Dapp
This patch adds the z14 pipeline description. --- gcc/config/s390/3906.md | 282 gcc/config/s390/s390.c | 23 +++- gcc/config/s390/s390.h | 2 +- gcc/config/s390/s390.md | 3 + 4 files changed, 307 insertions(+), 3 deletions(-) create mode 100644 g

[PATCH 5/7] S/390: Add side to schedule-mix calculations.

2019-03-11 Thread Robin Dapp
This patch makes the scheduling score execution-side aware. --- gcc/config/s390/s390.c | 32 ++-- 1 file changed, 18 insertions(+), 14 deletions(-) diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 249df00268a..4dcf1be4445 100644 --- a/gcc/config/s390

[PATCH 7/7] S/390: Tune scheduling parameters.

2019-03-11 Thread Robin Dapp
This patch adapts some scheduling-related parameters. --- gcc/config/s390/s390.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 78a707267e8..901807e7833 100644 --- a/gcc/config/s390/s390.c +++ b/gcc/config/s390/

[PATCH 6/7] S/390: Add handling for group-of-two instructions.

2019-03-11 Thread Robin Dapp
This patch adds handling of group-of-two instructions. --- gcc/config/s390/s390.c | 36 +++- 1 file changed, 35 insertions(+), 1 deletion(-) diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c index 4dcf1be4445..78a707267e8 100644 --- a/gcc/config/s390/s3

Re: [PATCH 0/7] S/390: Rework instruction scheduling.

2019-03-12 Thread Robin Dapp
> Please adjust the year and the author in gcc/config/s390/3906.md. Ok with > that change. Changed that and also simplified the longrunning checks. gcc/ChangeLog: 2019-03-12 Robin Dapp * config/s390/s390.c (LONGRUNNING_THRESHOLD): Remove. (s390_is_fpd

Re: [PATCH 8/8] S/390: Change test case to reflect scheduling changes.

2019-03-12 Thread Robin Dapp
This fixes a newly introduced test failure. --- 2019-03-12 Robin Dapp * gcc.target/s390/memset-1.c: Do not require stcy. diff --git a/gcc/testsuite/gcc.target/s390/memset-1.c b/gcc/testsuite/gcc.target/s390/memset-1.c index 3e201df1aed..9463a77208b 100644 --- a/gcc/testsuite

[PATCH] S/390: Perform more aggressive inlining

2019-03-12 Thread Robin Dapp
Hi, this patch sets the inlining parameters for z13 and later to rather aggressive values in response to PR85103 that caused performance regressions in SPEC2006's sjeng and gobmk benchmarks. Regards Robin -- gcc/ChangeLog: 2019-03-12 Robin Dapp * config/s390/s

[PATCH] S/390: Fix tests that expect unquoted option names

2019-03-15 Thread Robin Dapp
Hi, r269586 puts single quotes around option names. This patch fixes tests that expect the old format. Regards Robin --- gcc/testsuite/ChangeLog: 2019-03-15 Robin Dapp * gcc.target/s390/target-attribute/tattr-1.c (htm0): -mhtm -> '-mhtm'. * gcc.targe

[RFC] D support for S/390

2019-03-15 Thread Robin Dapp
Hi, during the last few days I tried to get D running on s390x (apparently the first Big Endian platform to try it?). I did not yet go through the code systematically and add a version(SystemZ) in every place where it might be needed but rather tried to fix test failures as they arose. After en

Re: [RFC] D support for S/390

2019-03-19 Thread Robin Dapp
Hi, > Alignment is written to TypeInfo, I don't think it should ever be > zero. That would mean that it isn't being generated by the compiler, > or read by the library correctly, so something else is amiss. it took me a while to see that in libphobos/libdruntime/object.d override @property siz
