[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007 --- Comment #17 from iii at gcc dot gnu.org --- Author: iii Date: Mon Oct 28 10:04:31 2019 New Revision: 277507 URL: https://gcc.gnu.org/viewcvs?rev=277507&root=gcc&view=rev Log: Move jump threading before reload r266734 has introduced a new instance of jump threading pass in order to take advantage of opportunities that combine opens up. It was perceived back then that it was beneficial to delay it after reload, since that might produce even more such opportunities. Unfortunately jump threading interferes with hot/cold partitioning. In the code from PR92007, it converts the following

      +------------------------ 2/HOT ------------------------+
      |                                                       |
      v                                                       v
    3/HOT --> 5/HOT --> 8/HOT --> 11/COLD --> 6/HOT --EH--> 16/HOT
                |                               ^
                |                               |
                +-------------------------------+

into the following:

      +------------------- 2/HOT --------------------+
      |                                              |
      v                                              v
    3/HOT --> 8/HOT --> 11/COLD --> 6/COLD --EH--> 16/HOT

This makes hot bb 6 dominated by cold bb 11, and because of this fixup_partitions makes bb 6 cold as well, which in turn makes EH edge 6->16 a crossing one. Not only can't we have crossing EH edges, we are also not allowed to introduce new crossing edges after reload in general, since it might require extra registers on some targets. Therefore, move the jump threading pass between combine and hot/cold partitioning. Building SPEC 2006 and SPEC 2017 with the old and the new code indicates that: * When doing jump threading right after reload, 3889 edges are threaded. * When doing jump threading right after combine, 3918 edges are threaded. This means this change will not introduce performance regressions. gcc/ChangeLog: 2019-10-28 Ilya Leoshkevich PR rtl-optimization/92007 * cfgcleanup.c (thread_jump): Add an assertion that we don't call it after reload if hot/cold partitioning has been done. (class pass_postreload_jump): Rename to pass_jump_after_combine. (make_pass_postreload_jump): Rename to make_pass_jump_after_combine. * passes.def (pass_postreload_jump): Move before reload, rename to pass_jump_after_combine. 
* tree-pass.h (make_pass_postreload_jump): Rename to make_pass_jump_after_combine. gcc/testsuite/ChangeLog: 2019-10-28 Ilya Leoshkevich PR rtl-optimization/92007 * g++.dg/opt/pr92007.C: New test (from Arseny Solokha). Added: trunk/gcc/testsuite/g++.dg/opt/pr92007.C Modified: trunk/gcc/ChangeLog trunk/gcc/cfgcleanup.c trunk/gcc/passes.def trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-pass.h
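The way bb 6 ends up cold can be illustrated on a toy graph. This is a minimal sketch, not GCC's fixup_partitions: it approximates the dominance-based fixup with an "all predecessors are cold" rule. All names (ToyCfg, propagate_cold, crossing_eh_edges) are hypothetical. The edge list is an assumption read off the threaded diagram, in particular the 2->16 edge, which is what keeps bb 16 hot in this model.

```cpp
#include <cstddef>
#include <vector>

// Toy model only: these types and the edge list are illustrative, not GCC code.
struct Edge {
  int src;
  int dst;
  bool eh;  // true for an exception-handling edge
};

struct ToyCfg {
  std::vector<int> blocks;
  std::vector<bool> cold;  // indexed by basic block number
  std::vector<Edge> edges;
};

// Approximation of the fixup: a block whose predecessors are all cold can only
// be reached through cold code, so it is marked cold too. Iterate to a fixpoint.
void propagate_cold(ToyCfg &cfg) {
  bool changed = true;
  while (changed) {
    changed = false;
    for (int bb : cfg.blocks) {
      if (cfg.cold[bb])
        continue;
      bool has_pred = false;
      bool all_cold = true;
      for (const Edge &e : cfg.edges) {
        if (e.dst == bb) {
          has_pred = true;
          all_cold = all_cold && cfg.cold[e.src];
        }
      }
      if (has_pred && all_cold) {
        cfg.cold[bb] = true;
        changed = true;
      }
    }
  }
}

// EH edges whose endpoints live in different partitions.
std::vector<Edge> crossing_eh_edges(const ToyCfg &cfg) {
  std::vector<Edge> result;
  for (const Edge &e : cfg.edges)
    if (e.eh && cfg.cold[e.src] != cfg.cold[e.dst])
      result.push_back(e);
  return result;
}

// Assumed shape of the CFG after jump threading (bb 5 is gone).
ToyCfg threaded_cfg_from_pr92007() {
  ToyCfg cfg;
  cfg.blocks = {2, 3, 8, 11, 6, 16};
  cfg.cold.assign(17, false);
  cfg.cold[11] = true;  // the only block that starts out cold
  cfg.edges = {{2, 3, false},  {2, 16, false}, {3, 8, false},
               {8, 11, false}, {11, 6, false}, {6, 16, true}};
  return cfg;
}
```

Calling propagate_cold on the graph from threaded_cfg_from_pr92007 marks bb 6 cold while bb 16 stays hot, so crossing_eh_edges reports the single edge 6->16, which is the situation verify_flow_info rejects.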
[Bug rtl-optimization/92007] [9/10 Regression] ICE: verify_flow_info failed (error: EH edge crosses section boundary in bb 7)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92007 --- Comment #18 from iii at gcc dot gnu.org --- Author: iii Date: Mon Oct 28 13:09:54 2019 New Revision: 277515 URL: https://gcc.gnu.org/viewcvs?rev=277515&root=gcc&view=rev Log: Move jump threading before reload r266734 has introduced a new instance of jump threading pass in order to take advantage of opportunities that combine opens up. It was perceived back then that it was beneficial to delay it after reload, since that might produce even more such opportunities. Unfortunately jump threading interferes with hot/cold partitioning. In the code from PR92007, it converts the following

      +------------------------ 2/HOT ------------------------+
      |                                                       |
      v                                                       v
    3/HOT --> 5/HOT --> 8/HOT --> 11/COLD --> 6/HOT --EH--> 16/HOT
                |                               ^
                |                               |
                +-------------------------------+

into the following:

      +------------------- 2/HOT --------------------+
      |                                              |
      v                                              v
    3/HOT --> 8/HOT --> 11/COLD --> 6/COLD --EH--> 16/HOT

This makes hot bb 6 dominated by cold bb 11, and because of this fixup_partitions makes bb 6 cold as well, which in turn makes EH edge 6->16 a crossing one. Not only can't we have crossing EH edges, we are also not allowed to introduce new crossing edges after reload in general, since it might require extra registers on some targets. Therefore, move the jump threading pass between combine and hot/cold partitioning. Building SPEC 2006 and SPEC 2017 with the old and the new code indicates that: * When doing jump threading right after reload, 3889 edges are threaded. * When doing jump threading right after combine, 3918 edges are threaded. This means this change will not introduce performance regressions. gcc/ChangeLog: 2019-10-28 Ilya Leoshkevich Backport from mainline PR rtl-optimization/92007 * cfgcleanup.c (thread_jump): Add an assertion that we don't call it after reload if hot/cold partitioning has been done. (class pass_postreload_jump): Rename to pass_jump_after_combine. (make_pass_postreload_jump): Rename to make_pass_jump_after_combine. * passes.def (pass_postreload_jump): Move before reload, rename to pass_jump_after_combine. 
* tree-pass.h (make_pass_postreload_jump): Rename to make_pass_jump_after_combine. gcc/testsuite/ChangeLog: 2019-10-28 Ilya Leoshkevich Backport from mainline PR rtl-optimization/92007 * g++.dg/opt/pr92007.C: New test (from Arseny Solokha). Added: branches/gcc-9-branch/gcc/testsuite/g++.dg/opt/pr92007.C Modified: branches/gcc-9-branch/gcc/ChangeLog branches/gcc-9-branch/gcc/cfgcleanup.c branches/gcc-9-branch/gcc/passes.def branches/gcc-9-branch/gcc/testsuite/ChangeLog branches/gcc-9-branch/gcc/tree-pass.h
[Bug rtl-optimization/92430] [9/10 Regression] Compile-time hog w/ -Os -fno-if-conversion -fno-tree-dce -fno-tree-loop-optimize -fno-tree-vrp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92430 --- Comment #5 from iii at gcc dot gnu.org --- Author: iii Date: Tue Nov 12 14:24:35 2019 New Revision: 278095 URL: https://gcc.gnu.org/viewcvs?rev=278095&root=gcc&view=rev Log: Free dominance info at the beginning of pass_jump_after_combine try_forward_edges does not update dominance info, and merge_blocks relies on it being up-to-date. In PR92430 stale dominance info makes merge_blocks produce a loop in the dominator tree, which in turn makes delete_basic_block loop forever. Fix by freeing dominance info at the beginning of cleanup_cfg. gcc/ChangeLog: 2019-11-12 Ilya Leoshkevich PR rtl-optimization/92430 * cfgcleanup.c (pass_jump_after_combine::execute): Free dominance info at the beginning. gcc/testsuite/ChangeLog: 2019-11-12 Ilya Leoshkevich PR rtl-optimization/92430 * gcc.dg/pr92430.c: New test (from Arseny Solokha). Added: trunk/gcc/testsuite/gcc.dg/pr92430.c Modified: trunk/gcc/ChangeLog trunk/gcc/cfgcleanup.c trunk/gcc/testsuite/ChangeLog
[Bug rtl-optimization/92430] [9 Regression] Compile-time hog w/ -Os -fno-if-conversion -fno-tree-dce -fno-tree-loop-optimize -fno-tree-vrp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92430 --- Comment #7 from iii at gcc dot gnu.org --- Author: iii Date: Thu Nov 14 16:40:33 2019 New Revision: 278254 URL: https://gcc.gnu.org/viewcvs?rev=278254&root=gcc&view=rev Log: Make flag_thread_jumps a gate of pass_jump_after_combine This is a follow-up to https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00919.html (r278095). Dominance info is deleted even if we don't perform jump threading. Since the whole point of this pass is to perform jump threading (other cleanups are not valuable at this point), skip it completely when flag_thread_jumps is not set. gcc/ChangeLog: 2019-11-14 Ilya Leoshkevich PR rtl-optimization/92430 * cfgcleanup.c (pass_jump_after_combine::gate): New function. (pass_jump_after_combine::execute): Perform jump threading unconditionally. Modified: trunk/gcc/ChangeLog trunk/gcc/cfgcleanup.c
[Bug rtl-optimization/92430] [9 Regression] Compile-time hog w/ -Os -fno-if-conversion -fno-tree-dce -fno-tree-loop-optimize -fno-tree-vrp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92430 --- Comment #8 from iii at gcc dot gnu.org --- Author: iii Date: Fri Nov 15 12:55:05 2019 New Revision: 278291 URL: https://gcc.gnu.org/viewcvs?rev=278291&root=gcc&view=rev Log: Free dominance info at the beginning of pass_jump_after_combine try_forward_edges does not update dominance info, and merge_blocks relies on it being up-to-date. In PR92430 stale dominance info makes merge_blocks produce a loop in the dominator tree, which in turn makes delete_basic_block loop forever. Fix by freeing dominance info at the beginning of cleanup_cfg. Also, since the whole point of this pass is to perform jump threading (other cleanups are not valuable at this point), skip it completely when flag_thread_jumps is not set. gcc/ChangeLog: 2019-11-15 Ilya Leoshkevich Backport from mainline PR rtl-optimization/92430 * cfgcleanup.c (pass_jump_after_combine::gate): New function. (pass_jump_after_combine::execute): Free dominance info at the beginning. gcc/testsuite/ChangeLog: 2019-11-15 Ilya Leoshkevich Backport from mainline PR rtl-optimization/92430 * gcc.dg/pr92430.c: New test (from Arseny Solokha). Added: branches/gcc-9-branch/gcc/testsuite/gcc.dg/pr92430.c Modified: branches/gcc-9-branch/gcc/ChangeLog branches/gcc-9-branch/gcc/cfgcleanup.c branches/gcc-9-branch/gcc/testsuite/ChangeLog
[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918 --- Comment #3 from iii at gcc dot gnu.org --- Author: iii Date: Mon Sep 30 17:40:02 2019 New Revision: 276360 URL: https://gcc.gnu.org/viewcvs?rev=276360&root=gcc&view=rev Log: S/390: Remove code duplication in vec_unordered vec_unordered is vec_ordered plus a negation at the end. Reuse vec_ordered logic. gcc/ChangeLog: 2019-09-30 Ilya Leoshkevich PR target/77918 * config/s390/vector.md (vec_unordered): Call gen_vec_ordered. Modified: trunk/gcc/ChangeLog trunk/gcc/config/s390/vector.md
[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918 --- Comment #4 from iii at gcc dot gnu.org --- Author: iii Date: Tue Oct 1 14:03:08 2019 New Revision: 276408 URL: https://gcc.gnu.org/viewcvs?rev=276408&root=gcc&view=rev Log: S/390: Implement vcond expander for V1TI,V1TF Currently gcc does not emit wf{c,k}* instructions when comparing long double values. Middle-end actually adds them in the first place, but then veclower pass replaces them with floating point register pair operations, because the corresponding expander is missing. gcc/ChangeLog: 2019-10-01 Ilya Leoshkevich PR target/77918 * config/s390/vector.md (V_HW): Add V1TI in order to make vcond$a$b generate vcondv1tiv1tf. Modified: trunk/gcc/ChangeLog trunk/gcc/config/s390/vector.md
[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918 --- Comment #5 from iii at gcc dot gnu.org --- Author: iii Date: Tue Oct 1 14:04:08 2019 New Revision: 276409 URL: https://gcc.gnu.org/viewcvs?rev=276409&root=gcc&view=rev Log: S/390: Remove code duplication in vec_* comparison expanders s390.md uses a lot of near-identical expanders that perform dispatching to other expanders based on operand types. Since the following patch would require even more of these, avoid copy-pasting the code by generating these expanders using an iterator. gcc/ChangeLog: 2019-10-01 Ilya Leoshkevich PR target/77918 * config/s390/s390.c (s390_expand_vec_compare): Use gen_vec_cmpordered and gen_vec_cmpunordered. * config/s390/vector.md (vec_cmpuneq, vec_cmpltgt, vec_ordered, vec_unordered): Delete. (vec_ordered): Rename to vec_cmpordered. (vec_unordered): Rename to vec_cmpunordered. (VEC_CMP_EXPAND): New iterator for the generic dispatcher. (vec_cmp): Generic dispatcher. Modified: trunk/gcc/ChangeLog trunk/gcc/config/s390/s390.c trunk/gcc/config/s390/vector.md
[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918 --- Comment #6 from iii at gcc dot gnu.org --- Author: iii Date: Mon Oct 7 14:59:00 2019 New Revision: 276659 URL: https://gcc.gnu.org/viewcvs?rev=276659&root=gcc&view=rev Log: Allow COND_EXPR and VEC_COND_EXPR conditions to trap Right now the gimplifier does not allow VEC_COND_EXPR's condition to trap and introduces a temporary if this could happen, for example, generating _5 = _4 > { 2.0e+0, 2.0e+0, 2.0e+0, 2.0e+0 }; _6 = VEC_COND_EXPR <_5, { -1, -1, -1, -1 }, { 0, 0, 0, 0 }>; from GENERIC VEC_COND_EXPR < (*b > { 2.0e+0, 2.0e+0, 2.0e+0, 2.0e+0 }) , { -1, -1, -1, -1 } , { 0, 0, 0, 0 } > This is not necessary and makes the resulting GIMPLE harder to analyze. Change the gimplifier so as to allow COND_EXPR and VEC_COND_EXPR conditions to trap. This patch takes special care to avoid introducing trapping comparisons in GIMPLE_COND. They are not allowed, because they would require 3 outgoing edges (then, else and EH), which is awkward to say the least. Therefore, computations of such conditions should live in their own basic blocks. gcc/ChangeLog: 2019-10-07 Ilya Leoshkevich PR target/77918 * gimple-expr.c (gimple_cond_get_ops_from_tree): Assert that the caller passes a non-trapping condition. (is_gimple_condexpr): Allow trapping conditions. (is_gimple_condexpr_1): New helper function. (is_gimple_condexpr_for_cond): New function, acts like old is_gimple_condexpr. * gimple-expr.h (is_gimple_condexpr_for_cond): New function. * gimple.c (gimple_could_trap_p_1): Handle COND_EXPR and VEC_COND_EXPR. Fix an issue with statements like i = (fp < 1.). * gimplify.c (gimplify_cond_expr): Use is_gimple_condexpr_for_cond. (gimplify_expr): Allow is_gimple_condexpr_for_cond. * tree-eh.c (operation_could_trap_p): Assert on COND_EXPR and VEC_COND_EXPR. (tree_could_trap_p): Handle COND_EXPR and VEC_COND_EXPR. 
* tree-ssa-forwprop.c (forward_propagate_into_gimple_cond): Use is_gimple_condexpr_for_cond, remove pointless tmp check (forward_propagate_into_cond): Remove pointless tmp check. Modified: trunk/gcc/ChangeLog trunk/gcc/gimple-expr.c trunk/gcc/gimple-expr.h trunk/gcc/gimple.c trunk/gcc/gimplify.c trunk/gcc/tree-eh.c trunk/gcc/tree-ssa-forwprop.c
[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918 --- Comment #7 from iii at gcc dot gnu.org --- Author: iii Date: Mon Oct 7 15:01:15 2019 New Revision: 276660 URL: https://gcc.gnu.org/viewcvs?rev=276660&root=gcc&view=rev Log: Introduce can_vcond_compare_p function z13 supports only non-signaling vector comparisons. This means we cannot vectorize LT, LE, GT, GE and LTGT when compiling for z13. However, we cannot express this restriction today: the code only checks whether vcond$a$b optab exists, but this does not say anything about the operation. Introduce a function that checks whether back-end supports vector comparisons with individual rtx codes by matching vcond expander's third argument with a fake comparison with the corresponding rtx code. gcc/ChangeLog: 2019-10-07 Ilya Leoshkevich PR target/77918 * optabs-tree.c (vcond_icode_p): New function. (vcond_eq_icode_p): Likewise. (expand_vec_cond_expr_p): Use vcond_icode_p and vcond_eq_icode_p. * optabs.c (can_vcond_compare_p): New function. * optabs.h (can_vcond_compare_p): Likewise. Modified: trunk/gcc/ChangeLog trunk/gcc/optabs-tree.c trunk/gcc/optabs.c trunk/gcc/optabs.h
[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918 --- Comment #8 from iii at gcc dot gnu.org --- Author: iii Date: Thu Oct 10 17:00:29 2019 New Revision: 276842 URL: https://gcc.gnu.org/viewcvs?rev=276842&root=gcc&view=rev Log: [PATCH 1/3] S/390: Do not use signaling vector comparisons on z13 z13 supports only non-signaling vector comparisons. This means we cannot vectorize LT, LE, GT, GE and LTGT when compiling for z13. Notify middle-end about this by using more restrictive operator predicate in vcond. gcc/ChangeLog: 2019-10-10 Ilya Leoshkevich PR target/77918 * config/s390/vector.md (vcond_comparison_operator): New predicate. (vcond): Use vcond_comparison_operator. Modified: trunk/gcc/ChangeLog trunk/gcc/config/s390/vector.md
[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918 --- Comment #9 from iii at gcc dot gnu.org --- Author: iii Date: Fri Oct 11 09:00:26 2019 New Revision: 276871 URL: https://gcc.gnu.org/viewcvs?rev=276871&root=gcc&view=rev Log: S/390: Use signaling FP comparison instructions dg-torture.exp=inf-compare-1.c is failing, because (qNaN > +Inf) comparison is compiled to CDB instruction, which does not signal an invalid operation exception. KDB should have been used instead. This patch introduces a new CCmode and a new pattern in order to generate signaling instructions in this and similar cases. gcc/ChangeLog: 2019-10-11 Ilya Leoshkevich PR target/77918 * config/s390/2827.md: Add new opcodes. * config/s390/2964.md: Likewise. * config/s390/3906.md: Likewise. * config/s390/8561.md: Likewise. * config/s390/s390-builtins.def (s390_vfchesb): Use the new vec_cmpgev4sf_quiet_nocc. (s390_vfchedb): Use the new vec_cmpgev2df_quiet_nocc. (s390_vfchsb): Use the new vec_cmpgtv4sf_quiet_nocc. (s390_vfchdb): Use the new vec_cmpgtv2df_quiet_nocc. (vec_cmplev4sf): Use the new vec_cmplev4sf_quiet_nocc. (vec_cmplev2df): Use the new vec_cmplev2df_quiet_nocc. (vec_cmpltv4sf): Use the new vec_cmpltv4sf_quiet_nocc. (vec_cmpltv2df): Use the new vec_cmpltv2df_quiet_nocc. * config/s390/s390-modes.def (CCSFPS): New mode. * config/s390/s390.c (s390_match_ccmode_set): Support CCSFPS. (s390_select_ccmode): Return CCSFPS for LT, LE, GT, GE and LTGT. (s390_branch_condition_mask): Reuse CCS for CCSFPS. (s390_expand_vec_compare): Use non-signaling patterns where necessary. (s390_reverse_condition): Support CCSFPS. * config/s390/s390.md (*cmp_ccsfps): New pattern. * config/s390/vector.md: (VFCMP_HW_OP): Remove. (asm_fcmp_op): Likewise. (*smaxv2df3_vx): Use pattern for quiet comparison. (*sminv2df3_vx): Likewise. (*vec_cmp_nocc): Remove. (*vec_cmpeq_quiet_nocc): New pattern. (vec_cmpgt_quiet_nocc): Likewise. (vec_cmplt_quiet_nocc): New expander. (vec_cmpge_quiet_nocc): New pattern. 
(vec_cmple_quiet_nocc): New expander. (*vec_cmpeq_signaling_nocc): New pattern. (*vec_cmpgt_signaling_nocc): Likewise. (*vec_cmpgt_signaling_finite_nocc): Likewise. (*vec_cmpge_signaling_nocc): Likewise. (*vec_cmpge_signaling_finite_nocc): Likewise. (vec_cmpungt): New expander. (vec_cmpunge): Likewise. (vec_cmpuneq): Use quiet patterns. (vec_cmpltgt): Allow only on z14+. (vec_cmpordered): Use quiet patterns. (vec_cmpunordered): Likewise. (VEC_CMP_EXPAND): Add ungt and unge. gcc/testsuite/ChangeLog: 2019-10-11 Ilya Leoshkevich * gcc.target/s390/vector/vec-scalar-cmp-1.c: Adjust expectations. Modified: trunk/gcc/ChangeLog trunk/gcc/config/s390/2827.md trunk/gcc/config/s390/2964.md trunk/gcc/config/s390/3906.md trunk/gcc/config/s390/8561.md trunk/gcc/config/s390/s390-builtins.def trunk/gcc/config/s390/s390-modes.def trunk/gcc/config/s390/s390.c trunk/gcc/config/s390/s390.md trunk/gcc/config/s390/vector.md trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/s390/vector/vec-scalar-cmp-1.c
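The quiet versus signaling distinction behind this patch can be observed from user code. The following is a minimal sketch using only the standard <cfenv> and <cmath> facilities. The helper names (CompareResult, signaling_greater, quiet_greater) are hypothetical. The `>` operator is meant to be a signaling comparison, so with a quiet NaN operand it is expected to set FE_INVALID, while std::isgreater is the quiet counterpart and must not. Whether the flag is actually raised for `>` depends on the compiler and target, which is exactly what this PR is about, so the sketch only reports it.

```cpp
#include <cfenv>
#include <cmath>
#include <limits>

// Hypothetical helper type: the comparison result plus whether FE_INVALID
// was observed afterwards.
struct CompareResult {
  bool value;
  bool raised_invalid;
};

// Built-in `>`: a signaling comparison in IEEE 754 terms. With a quiet NaN
// operand it should raise the invalid exception, but this depends on the
// instructions the compiler selects.
CompareResult signaling_greater(double a, double b) {
  volatile double x = a;  // volatile keeps the comparison at run time
  volatile double y = b;
  std::feclearexcept(FE_INVALID);
  volatile bool r = x > y;
  bool raised = std::fetestexcept(FE_INVALID) != 0;
  return {r, raised};
}

// std::isgreater: the quiet counterpart, which does not raise the invalid
// exception for quiet NaN operands.
CompareResult quiet_greater(double a, double b) {
  volatile double x = a;
  volatile double y = b;
  std::feclearexcept(FE_INVALID);
  volatile bool r = std::isgreater(x, y);
  bool raised = std::fetestexcept(FE_INVALID) != 0;
  return {r, raised};
}
```

Comparing std::numeric_limits<double>::quiet_NaN() against infinity yields false in both helpers. On a toolchain that implements the signaling semantics, only signaling_greater reports raised_invalid, which mirrors the CDB versus KDB choice described above.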
[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918 --- Comment #11 from iii at gcc dot gnu.org --- Author: iii Date: Fri Oct 11 09:03:00 2019 New Revision: 276872 URL: https://gcc.gnu.org/viewcvs?rev=276872&root=gcc&view=rev Log: S/390: Test signaling FP comparison instructions gcc/testsuite/ChangeLog: 2019-10-11 Ilya Leoshkevich PR target/77918 * gcc.target/s390/s390.exp: Enable Fortran tests. * gcc.target/s390/zvector/autovec-double-quiet-eq.c: New test. * gcc.target/s390/zvector/autovec-double-quiet-ge.c: New test. * gcc.target/s390/zvector/autovec-double-quiet-gt.c: New test. * gcc.target/s390/zvector/autovec-double-quiet-le.c: New test. * gcc.target/s390/zvector/autovec-double-quiet-lt.c: New test. * gcc.target/s390/zvector/autovec-double-quiet-ordered.c: New test. * gcc.target/s390/zvector/autovec-double-quiet-uneq.c: New test. * gcc.target/s390/zvector/autovec-double-quiet-unordered.c: New test. * gcc.target/s390/zvector/autovec-double-signaling-eq-z13-finite.c: New test. * gcc.target/s390/zvector/autovec-double-signaling-eq-z13.c: New test. * gcc.target/s390/zvector/autovec-double-signaling-eq.c: New test. * gcc.target/s390/zvector/autovec-double-signaling-ge-z13-finite.c: New test. * gcc.target/s390/zvector/autovec-double-signaling-ge-z13.c: New test. * gcc.target/s390/zvector/autovec-double-signaling-ge.c: New test. * gcc.target/s390/zvector/autovec-double-signaling-gt-z13-finite.c: New test. * gcc.target/s390/zvector/autovec-double-signaling-gt-z13.c: New test. * gcc.target/s390/zvector/autovec-double-signaling-gt.c: New test. * gcc.target/s390/zvector/autovec-double-signaling-le-z13-finite.c: New test. * gcc.target/s390/zvector/autovec-double-signaling-le-z13.c: New test. * gcc.target/s390/zvector/autovec-double-signaling-le.c: New test. * gcc.target/s390/zvector/autovec-double-signaling-lt-z13-finite.c: New test. * gcc.target/s390/zvector/autovec-double-signaling-lt-z13.c: New test. * gcc.target/s390/zvector/autovec-double-signaling-lt.c: New test. 
* gcc.target/s390/zvector/autovec-double-signaling-ltgt-z13-finite.c: New test. * gcc.target/s390/zvector/autovec-double-signaling-ltgt-z13.c: New test. * gcc.target/s390/zvector/autovec-double-signaling-ltgt.c: New test. * gcc.target/s390/zvector/autovec-double-smax-z13.F90: New test. * gcc.target/s390/zvector/autovec-double-smax.F90: New test. * gcc.target/s390/zvector/autovec-double-smin-z13.F90: New test. * gcc.target/s390/zvector/autovec-double-smin.F90: New test. * gcc.target/s390/zvector/autovec-float-quiet-eq.c: New test. * gcc.target/s390/zvector/autovec-float-quiet-ge.c: New test. * gcc.target/s390/zvector/autovec-float-quiet-gt.c: New test. * gcc.target/s390/zvector/autovec-float-quiet-le.c: New test. * gcc.target/s390/zvector/autovec-float-quiet-lt.c: New test. * gcc.target/s390/zvector/autovec-float-quiet-ordered.c: New test. * gcc.target/s390/zvector/autovec-float-quiet-uneq.c: New test. * gcc.target/s390/zvector/autovec-float-quiet-unordered.c: New test. * gcc.target/s390/zvector/autovec-float-signaling-eq.c: New test. * gcc.target/s390/zvector/autovec-float-signaling-ge.c: New test. * gcc.target/s390/zvector/autovec-float-signaling-gt.c: New test. * gcc.target/s390/zvector/autovec-float-signaling-le.c: New test. * gcc.target/s390/zvector/autovec-float-signaling-lt.c: New test. * gcc.target/s390/zvector/autovec-float-signaling-ltgt.c: New test. * gcc.target/s390/zvector/autovec-fortran.h: New test. * gcc.target/s390/zvector/autovec-long-double-signaling-ge.c: New test. * gcc.target/s390/zvector/autovec-long-double-signaling-gt.c: New test. * gcc.target/s390/zvector/autovec-long-double-signaling-le.c: New test. * gcc.target/s390/zvector/autovec-long-double-signaling-lt.c: New test. * gcc.target/s390/zvector/autovec.h: New test. 
Added: trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-eq.c trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-ge.c trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-gt.c trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-le.c trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-lt.c trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-ordered.c trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-uneq.c trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-quiet-unordered.c trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-signaling-eq-z13-finite.c trunk/gcc/testsuite/gcc.target/s390/zvector/autovec-double-signaling-eq-z13.c t
[Bug target/89233] [9 Regression] ICE in change_address_1, at emit-rtl.c:2286
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89233 --- Comment #3 from iii at gcc dot gnu.org --- Author: iii Date: Tue Feb 12 14:51:39 2019 New Revision: 268798 URL: https://gcc.gnu.org/viewcvs?rev=268798&root=gcc&view=rev Log: S/390: Reject invalid Q/R/S/T addresses after LRA The following insn: (insn (set (reg:DI %r2) (sign_extend:DI (mem:SI (const:DI (plus:DI (symbol_ref:DI ("*.LC0")) (const_int 16))) is correctly recognized by LRA as RIL alternative of extendsidi2 define_insn. However, when recognition runs after LRA, it returns RXY alternative, which is incorrect, since the offset 16 points past the end of the *.LC0 literal pool entry. Such addresses are normally rejected by s390_decompose_address (). This inconsistency confuses annotate_constant_pool_refs: the selected alternative makes it proceed with annotation, only to find that the annotated address is invalid, causing an ICE. This patch fixes the root cause, namely, that s390_check_qrst_address () behaves differently during and after LRA. gcc/ChangeLog: 2019-02-12 Ilya Leoshkevich PR target/89233 * config/s390/s390.c (s390_decompose_address): Update comment. (s390_check_qrst_address): Reject invalid address forms after LRA. gcc/testsuite/ChangeLog: 2019-02-12 Ilya Leoshkevich PR target/89233 * gcc.target/s390/pr89233.c: New test. Added: trunk/gcc/testsuite/gcc.target/s390/pr89233.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/s390/s390.c trunk/gcc/testsuite/ChangeLog
[Bug target/80080] S390: Isses with emitted cs-instructions for __atomic builtins.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80080 --- Comment #15 from iii at gcc dot gnu.org --- Author: iii Date: Mon Sep 24 14:21:03 2018 New Revision: 264535 URL: https://gcc.gnu.org/viewcvs?rev=264535&root=gcc&view=rev Log: S/390: Fix conditional returns on z196+ S/390 epilogue ends with (parallel [(return) (use %r14)]) instead of the more usual (return) or (simple_return). This sequence is not recognized by the conditional return logic in try_optimize_cfg (). This was introduced for processors older than z196, where it is sometimes profitable to use call-clobbered register for returning instead of %r14. On newer processors we always return via %r14, for which the fact that it's used is already reflected by EPILOGUE_USES. In this case a simple (return) suffices. This patch changes return_use () to emit simple (return)s when returning via %r14. The resulting sequences are recognized by the conditional return logic in try_optimize_cfg (). gcc/ChangeLog: 2018-09-24 Ilya Leoshkevich PR target/80080 * config/s390/s390.c (s390_emit_epilogue): Do not use PARALLEL RETURN+USE when returning via %r14. gcc/testsuite/ChangeLog: 2018-09-24 Ilya Leoshkevich PR target/80080 * gcc.target/s390/risbg-ll-3.c: Expect conditional returns. * gcc.target/s390/zvector/vec-cmp-2.c: Likewise. Modified: trunk/gcc/ChangeLog trunk/gcc/config/s390/s390.c trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/s390/risbg-ll-3.c trunk/gcc/testsuite/gcc.target/s390/zvector/vec-cmp-2.c
[Bug bootstrap/87417] [9 regression] Internal error: abort in attr_alt_intersection, at genattrtab.c:2357
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87417 --- Comment #8 from iii at gcc dot gnu.org --- Author: iii Date: Tue Sep 25 06:38:20 2018 New Revision: 264556 URL: https://gcc.gnu.org/viewcvs?rev=264556&root=gcc&view=rev Log: Fix EQ_ATTR_ALT size calculation (PR bootstrap/87417) "r264537: Change EQ_ATTR_ALT to support up to 64 alternatives" changed the format of EQ_ATTR_ALT from ii to ww. This broke the bootstrap on 32-bit systems, because the formula for rtx_code_size assumed that only certain codes contain HOST_WIDE_INTs. This did not surface on 64-bit systems, because rtunion is 8 bytes anyway, but on 32-bit systems it's only 4 bytes. This resulted in out-of-bounds writes and memory corruptions in genattrtab. gcc/ChangeLog: 2018-09-25 Ilya Leoshkevich PR bootstrap/87417 * rtl.c (rtx_code_size): Take into account that EQ_ATTR_ALT contains HOST_WIDE_INTs when computing its size. Modified: trunk/gcc/ChangeLog trunk/gcc/rtl.c
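The size mismatch is plain arithmetic and can be sketched outside of GCC. This is a minimal illustration, not the real rtx_code_size formula: the header size, the function names and the per-operand accounting are all assumptions, with the generic slot standing in for rtunion and the wide integer for HOST_WIDE_INT.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical header size, chosen only to make the numbers concrete.
constexpr std::size_t kHeaderSize = 8;

// Formula that ignores the operand format: every operand is assumed to fit
// into one generic slot.
std::size_t size_assuming_generic_slots(const char *format,
                                        std::size_t slot_size) {
  return kHeaderSize + std::strlen(format) * slot_size;
}

// Format-aware formula: operands marked 'w' need a full wide integer, all
// other operands need one generic slot.
std::size_t size_with_wide_ints(const char *format, std::size_t slot_size,
                                std::size_t wide_int_size) {
  std::size_t size = kHeaderSize;
  for (const char *p = format; *p != '\0'; ++p)
    size += (*p == 'w') ? wide_int_size : slot_size;
  return size;
}
```

With an 8-byte slot, as on a 64-bit host, both formulas give 24 bytes for the format "ww", so the problem stays hidden. With a 4-byte slot the first formula gives 16 bytes while two 8-byte operands need 24, and the missing 8 bytes correspond to the out-of-bounds writes described above.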
[Bug rtl-optimization/87596] [9 Regression] ICE: Segmentation fault (in spill_hard_reg_in_range)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87596 --- Comment #6 from iii at gcc dot gnu.org --- Author: iii Date: Fri Oct 19 08:33:52 2018 New Revision: 265306 URL: https://gcc.gnu.org/viewcvs?rev=265306&root=gcc&view=rev Log: lra: fix spill_hard_reg_in_range clobber check FROM..TO range might contain NOTE_INSN_DELETED insns, for which the corresponding entries in lra_insn_recog_data[] are NULLs. Example from the problematic code from PR87596: (note 148 154 68 7 NOTE_INSN_DELETED) lra_insn_recog_data[] is used directly only when the insn in question is taken from insn_bitmap, which is not the case here. In other situations lra_get_insn_recog_data () guarded by INSN_P () or another stricter predicate is used. So we need to do this here as well. A tiny detail worth noting: I put the INSN_P () check before the insn_bitmap check, because I believe that insn_bitmap can contain only real insns anyway. gcc/ChangeLog: 2018-10-19 Ilya Leoshkevich PR rtl-optimization/87596 * lra-constraints.c (spill_hard_reg_in_range): Use INSN_P () + lra_get_insn_recog_data () instead of lra_insn_recog_data[] for instructions in FROM..TO range. gcc/testsuite/ChangeLog: 2018-10-19 Ilya Leoshkevich PR rtl-optimization/87596 * gcc.target/i386/pr87596.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr87596.c Modified: trunk/gcc/ChangeLog trunk/gcc/lra-constraints.c trunk/gcc/testsuite/ChangeLog
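The guard pattern described above can be shown on a toy side table. This is a sketch under stated assumptions, not the LRA code: ToyInsn, ToyRecogData and range_clobbers_hard_reg are hypothetical names, the is_real_insn flag plays the role of INSN_P (), and the vector of pointers stands in for lra_insn_recog_data[], with a null entry for a deleted note.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the real LRA structures.
struct ToyInsn {
  std::size_t uid;    // index into the side table
  bool is_real_insn;  // plays the role of INSN_P (): false for notes
};

struct ToyRecogData {
  bool clobbers_hard_reg;
};

// Returns true if any real insn in the range clobbers the register.
// Notes are skipped before the side table is consulted, because their
// entries may be null.
bool range_clobbers_hard_reg(const std::vector<ToyInsn> &range,
                             const std::vector<const ToyRecogData *> &table) {
  for (const ToyInsn &insn : range) {
    if (!insn.is_real_insn)
      continue;  // a note has no recog data, so never dereference its slot
    const ToyRecogData *data = table[insn.uid];
    if (data != nullptr && data->clobbers_hard_reg)
      return true;
  }
  return false;
}
```

A range that contains a note with a null table entry is walked without touching that entry, which is the behaviour the fix restores for the FROM..TO scan.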
[Bug bootstrap/87747] [9 regression] Bootstrap failure if using gcc-4.6 as stage1 compiler
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87747 --- Comment #3 from iii at gcc dot gnu.org --- Author: iii Date: Thu Oct 25 13:47:10 2018 New Revision: 265488 URL: https://gcc.gnu.org/viewcvs?rev=265488&root=gcc&view=rev Log: Fix rtx_code_size static initialization order fiasco r264556 and r264537 changed the format of EQ_ATTR_ALT RTXs to "ww", which also required adjusting rtx_code_size initializer. In order to simplify things, the list of rtx_codes known to use HOST_WIDE_INTs was replaced by the format string check. However, unlike the old one, this new check cannot be always performed at compile time, in which case a static constructor is generated. This may lead to a static initialization order fiasco with respect to other static constructors in the compiler, in case of PR87747, cselib's pool_allocator. gcc/ChangeLog: 2018-10-25 Ilya Leoshkevich PR bootstrap/87747 * rtl.c (RTX_CODE_HWINT_P_1): New helper macro. (RTX_CODE_HWINT_P): New macro. (rtx_code_size): Use RTX_CODE_HWINT_P (). Modified: trunk/gcc/ChangeLog trunk/gcc/rtl.c
[Bug target/87762] [9 Regression] extract_constrain_insn, at recog.c:2206 on s390x
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87762 --- Comment #3 from iii at gcc dot gnu.org --- Author: iii Date: Tue Nov 6 13:20:21 2018 New Revision: 265844 URL: https://gcc.gnu.org/viewcvs?rev=265844&root=gcc&view=rev Log: S/390: Introduce relative_long attribute In order to properly fix PR87762, we need to distinguish between instructions which support relative addressing and instructions which don't. We could check whether the existing "type" attribute is equal to "larl", but there are notable exceptions (lrl, for example), and changing them makes scheduling worse on z10. We could also check whether the existing "op_type" attribute is equal to "RIL-b" or "RIL-c". However, adding a new attribute provides more flexibility, since we don't depend on idiosyncrasies which might be introduced into PoP in the future. gcc/ChangeLog: 2018-11-06 Ilya Leoshkevich PR target/87762 * config/s390/s390.md: Add relative_long attribute. Modified: trunk/gcc/ChangeLog trunk/gcc/config/s390/s390.md
[Bug target/87762] [9 Regression] extract_constrain_insn, at recog.c:2206 on s390x
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87762 --- Comment #4 from iii at gcc dot gnu.org --- Author: iii Date: Fri Nov 9 20:33:19 2018 New Revision: 265991 URL: https://gcc.gnu.org/viewcvs?rev=265991&root=gcc&view=rev Log: S/390: Allow relative addressing of literal pool entries r265490 allowed the compiler to choose in a more flexible way whether to use load or load-address-relative-long (LARL) instruction. When it chose LARL for literal pool references, the latter ones were rewritten by pass_s390_early_mach to use UNSPEC_LTREF, which assumes base register usage, which in turn is not compatible with LARL. The end result was an ICE because of unrecognizable insn. UNSPEC_LTREF and friends are necessary in order to communicate the dependency on the base register to pass_sched2. When relative addressing is used, no base register is necessary, so in such cases the rewrite must be avoided. gcc/ChangeLog: 2018-11-09 Ilya Leoshkevich PR target/87762 * config/s390/s390.c (s390_safe_relative_long_p): New function. (annotate_constant_pool_refs): Skip insns which support relative addressing. (annotate_constant_pool_refs_1): New helper function. (find_constant_pool_ref): Skip insns which support relative addressing. (find_constant_pool_ref_1): New helper function. (replace_constant_pool_ref): Skip insns which support relative addressing. (replace_constant_pool_ref_1): New helper function. (s390_mainpool_start): Adapt to the new signature. (s390_mainpool_finish): Likewise. (s390_chunkify_start): Likewise. (s390_chunkify_finish): Likewise. (pass_s390_early_mach::execute): Likewise. (s390_prologue_plus_offset): Likewise. (s390_emit_prologue): Likewise. (s390_emit_epilogue): Likewise. Modified: trunk/gcc/ChangeLog trunk/gcc/config/s390/s390.c
[Bug target/88083] ICE in find_constant_pool_ref_1, at config/s390/s390.c:8231
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88083 --- Comment #1 from iii at gcc dot gnu.org --- Author: iii Date: Tue Nov 20 09:32:49 2018 New Revision: 266306 URL: https://gcc.gnu.org/viewcvs?rev=266306&root=gcc&view=rev Log: S/390: Skip LT(G) peephole when literal pool is involved By the time peephole optimizations run, we've already made up our mind whether to use base-register or relative addressing for literal pool entries. LT(G) supports only base-register addressing, and so it is too late to convert L(G)RL + compare to LT(G). This change should not make the code worse unless building with e.g. -fno-dce, since comparing literal pool entries to zero should be optimized away during earlier passes. gcc/ChangeLog: 2018-11-20 Ilya Leoshkevich PR target/88083 * config/s390/s390.md: Skip LT(G) peephole when literal pool is involved. * rtl.h (contains_constant_pool_address_p): New function. * rtlanal.c (contains_constant_pool_address_p): Likewise. gcc/testsuite/ChangeLog: 2018-11-20 Ilya Leoshkevich PR target/88083 * gcc.target/s390/pr88083.c: New test. Added: trunk/gcc/testsuite/gcc.target/s390/pr88083.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/s390/s390.md trunk/gcc/rtl.h trunk/gcc/rtlanal.c trunk/gcc/testsuite/ChangeLog
[Bug target/80080] S390: Isses with emitted cs-instructions for __atomic builtins.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80080 --- Comment #16 from iii at gcc dot gnu.org --- Author: iii Date: Mon Dec 3 09:49:02 2018 New Revision: 266734 URL: https://gcc.gnu.org/viewcvs?rev=266734&root=gcc&view=rev Log: Repeat jump threading after combine Consider the following RTL: (insn (set (reg 65) (if_then_else (eq %cc 0) 1 0))) (insn (parallel [(set %cc (compare (reg 65) 0)) (clobber %scratch)])) (jump_insn (set %pc (if_then_else (ne %cc 0) (label_ref 23) %pc))) Combine simplifies this into: (note NOTE_INSN_DELETED) (note NOTE_INSN_DELETED) (jump_insn (set %pc (if_then_else (eq %cc 0) (label_ref 23) %pc))) opening up the possibility to perform jump threading. gcc/ChangeLog: 2018-12-03 Ilya Leoshkevich PR target/80080 * cfgcleanup.c (class pass_postreload_jump): New pass. (pass_postreload_jump::execute): Likewise. (make_pass_postreload_jump): Likewise. * passes.def: Add pass_postreload_jump before pass_postreload_cse. * tree-pass.h (make_pass_postreload_jump): New pass. gcc/testsuite/ChangeLog: 2018-12-03 Ilya Leoshkevich PR target/80080 * gcc.target/s390/pr80080-4.c: New test. Added: trunk/gcc/testsuite/gcc.target/s390/pr80080-4.c Modified: trunk/gcc/ChangeLog trunk/gcc/cfgcleanup.c trunk/gcc/passes.def trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-pass.h
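The simplification that combine performs on the RTL above can be checked with a small model. This is only a sketch of the equivalence, with hypothetical function names, where a bool parameter stands in for the state of %cc. It does not model the CFG changes that jump threading makes afterwards.

```cpp
// Before combine: materialize the flag in a register, compare it with zero,
// then branch if the comparison result is "not equal".
bool branch_taken_before_combine(bool cc_is_zero) {
  int reg65 = cc_is_zero ? 1 : 0;  // (set (reg 65) (if_then_else (eq %cc 0) 1 0))
  bool cc_ne_zero = (reg65 != 0);  // (set %cc (compare (reg 65) 0)), tested with NE
  return cc_ne_zero;               // jump to label 23 when true
}

// After combine: the branch tests the original condition directly.
bool branch_taken_after_combine(bool cc_is_zero) {
  return cc_is_zero;  // (if_then_else (eq %cc 0) (label_ref 23) %pc)
}
```

Both functions agree for either input. Once the branch depends directly on the original condition, the jump threading pass has something to work with, which is the opportunity the log message refers to.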